Data Mining Algorithm in Cloud Computing Using Map Reduce Framework

Authors: Achi Sandeep, K. Rammohan Goud

Cloud computing has emerged as a way to manage large data sets efficiently, and with the rapid growth of data, large-scale data processing is becoming a central concern of information technology. The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably and to stream them at high bandwidth to user applications. In a large cluster, hundreds of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the system can grow on demand while remaining economical at every size. Map Reduce is widely used for large-scale data analysis in the cloud. Hadoop, an open-source implementation of Map Reduce, can achieve better performance when more compute nodes are allocated from the cloud to speed up computation; however, this approach of 'renting more nodes' is not cost-effective in a pay-as-you-go environment.
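
The Map Reduce model mentioned above can be illustrated with a small, single-process sketch. This is not the authors' algorithm and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample transactions are hypothetical, chosen only to show how a map step emits key-value pairs, a shuffle step groups them by key, and a reduce step aggregates each group. On a real cluster, the map and reduce calls would run in parallel on different nodes over data stored in HDFS.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit an (item, 1) pair for every item in one transaction."""
    return [(item, 1) for item in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence (in-process stand-in for a cluster)."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input: each string is one transaction of space-separated items.
transactions = ["milk bread", "bread butter", "milk bread butter"]
counts = run_job(transactions)
```

Counting item occurrences in this way is a common first pass in frequent-itemset mining, which is one reason the map and reduce pattern fits data mining workloads.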

Authors and Affiliations

Achi Sandeep
Assistant Professor, CSE Department, Sri Indu College of Engineering and Technology, JNTU Hyderabad, Hyderabad, India
K. Rammohan Goud
Assistant Professor, CSE Department, Sri Indu College of Engineering and Technology, JNTU Hyderabad, Hyderabad, India

Keywords: Cloud Computing, Distributed Data Mining, Hadoop, Hadoop Distributed File System, Map Reduce

Publication Details

Published in : Volume 2 | Issue 1 | January-February 2017
Date of Publication : 2017-12-31
License : This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 291-297
Manuscript Number : CSEIT1726169
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Achi Sandeep, K. Rammohan Goud, "Data Mining Algorithm in Cloud Computing Using Map Reduce Framework", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 1, pp.291-297, January-February-2017.