Data Mining Algorithm in Cloud Computing Using Map Reduce Framework

Achi Sandeep; K. Rammohan Goud

doi:10.32628/CSEIT1726169

Authors

Achi Sandeep, Assistant Professor, CSE Department, Sri Indu College of Engineering and Technology, JNTU Hyderabad, Hyderabad, India
K. Rammohan Goud, Assistant Professor, CSE Department, Sri Indu College of Engineering and Technology, JNTU Hyderabad, Hyderabad, India

Keywords:

Cloud Computing, Distributed Data Mining, Hadoop, Hadoop Distributed File System, Map Reduce.

Abstract

Cloud computing has emerged as a way to manage large data sets efficiently, and with the rapid growth of data, large-scale data processing is becoming a central concern of information technology. The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably and to stream those data sets at high bandwidth to user applications. In a large cluster, hundreds of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow on demand while remaining economical at every size. Map Reduce has been widely used for large-scale data analysis in the cloud. Hadoop is an open-source implementation of Map Reduce that can improve performance by allocating more compute nodes from the cloud to speed up computation; however, this approach of 'renting more nodes' is not cost-effective in a pay-as-you-go environment.
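As a rough illustration of the Map Reduce programming model mentioned in the abstract, the following minimal sketch runs the map, shuffle, and reduce phases in a single Python process. It is not Hadoop's API and not the authors' algorithm: the word-count task and the names `map_phase`, `shuffle`, `reduce_phase`, `word_mapper`, `sum_reducer`, and `run_job` are all assumptions chosen for illustration. In a real cluster the map and reduce calls would run in parallel on the nodes that hold the HDFS blocks.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups, reducer):
    """Apply the reducer once per key to the list of grouped values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Hypothetical mapper: emit (word, 1) for each word in a line of text."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    """Hypothetical reducer: add up the counts collected for one key."""
    return sum(values)


def run_job(records, mapper, reducer):
    """Chain the three phases in one process (no distribution, no HDFS)."""
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


if __name__ == "__main__":
    lines = [
        "map reduce on hadoop",
        "hadoop stores data in hdfs",
        "map tasks read hdfs blocks",
    ]
    print(run_job(lines, word_mapper, sum_reducer))
```

Because each mapper call sees only one record and each reducer call sees only one key, the work can be split across as many nodes as are available, which is why adding nodes speeds up a job, at the rental cost the abstract points out.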
