Guidance to Data Mining in Python

Author: Aashish Mamgain

Python has become a leading programming language for data mining in recent years. Around 45% of data scientists use Python for data mining, which puts it ahead of other analytical tools such as R. Data mining is the technique of analyzing large datasets to extract predictive patterns and information. It is applied in domains such as marketing, medicine, and telecommunications. This paper presents classification algorithms such as Random Forest, Support Vector Machine, Decision Tree, and Logistic Regression, and serves as a guide to applying these data mining classification techniques in Python.
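As a rough illustration of the kind of classification workflow the abstract describes, the sketch below fits the four named algorithms and compares their test accuracy. It is not taken from the paper: the use of scikit-learn, the bundled Iris dataset, the 70/30 split, and all hyperparameters are assumptions chosen only to keep the example small and self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: the Iris dataset stands in for a real data mining task.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for evaluation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The four classifiers named in the abstract; hyperparameters are illustrative.
classifiers = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Support Vector Machine": SVC(kernel="rbf"),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each model on the training split and record its test accuracy.
accuracies = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    accuracies[name] = clf.score(X_test, y_test)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

Because all four estimators share the same `fit` and `score` interface, swapping in a different dataset or classifier only requires changing the data loading line or the dictionary entries.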

Authors and Affiliations

Aashish Mamgain
CSE, HMR Institute of Technology and Management, New Delhi, India

Keywords: Python, Data Mining, Classification, Random Forest, Support Vector Machine, Decision Tree, Logistic Regression

Publication Details

Published in : Volume 3 | Issue 6 | July-August 2018
Date of Publication : 2018-08-30
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 596-601
Manuscript Number : CSEIT1836128
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Aashish Mamgain, "Guidance to Data Mining in Python", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 3, Issue 6, pp.596-601, July-August-2018.