Data Analysis Using R and Hadoop

Authors: Amit Rajbanshi, Birendra Kumar Sah, C. K. Raina

Analyzing and managing big data can be very difficult using classical means such as database management systems or desktop software packages for statistics and visualization. Instead, big data requires large clusters with hundreds or even thousands of computing nodes. Official statistics is increasingly considering big data for deriving new statistics, because big data sources could produce more relevant and timely statistics than traditional sources. One of the software tools successfully and widely used for the storage and processing of big data sets on clusters of commodity hardware is Hadoop. The Hadoop framework contains libraries, a distributed file system (HDFS), and a resource-management platform, and implements a version of the MapReduce programming model for large-scale data processing. In this paper we investigate the possibilities of integrating Hadoop with R, a popular software package used for statistical computing and data visualization. We present three ways of integrating them: R with Streaming, Rhipe, and RHadoop, and we emphasize the advantages and disadvantages of each solution.
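As an illustration of the MapReduce programming model and of the line-oriented convention that a Streaming integration relies on, the following is a minimal sketch in plain Python. It is not taken from the paper and does not call Hadoop: the function names `mapper`, `shuffle`, `reducer`, and `word_count` are hypothetical, and the `shuffle` function only stands in for the sort that the framework would perform between the map and reduce phases. In a Streaming setup, the mapper and reducer would be separate scripts (for example, written in R) that read records from standard input and write tab-separated key/value records to standard output.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit one 'word<TAB>1' record per word, as a Streaming mapper would print."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def shuffle(records):
    """Local stand-in for the framework's shuffle/sort phase: order records by key."""
    return sorted(records, key=lambda rec: rec.split("\t", 1)[0])


def reducer(sorted_records):
    """Reduce step: sum the counts of consecutive records that share a key."""
    pairs = (rec.split("\t", 1) for rec in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(int(count) for _, count in group)


def word_count(lines):
    """Run map, shuffle, and reduce in-process on an iterable of text lines."""
    return dict(reducer(shuffle(mapper(lines))))


if __name__ == "__main__":
    sample = ["big data needs big clusters", "R meets Hadoop"]
    for word, total in word_count(sample).items():
        print(f"{word}\t{total}")
```

The reducer depends on its input being sorted by key, which is why the sort step sits between the two phases; on a cluster that ordering would be provided by the framework rather than by user code.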

Authors and Affiliations

Amit Rajbanshi
Department of Computer Science and Engineering, Adesh College of Engineering & Technology, Chandigarh, Kharar, Punjab, India
Birendra Kumar Sah
Department of Computer Science and Engineering, Adesh College of Engineering & Technology, Chandigarh, Kharar, Punjab, India
C. K. Raina
Department of Computer Science and Engineering, Adesh College of Engineering & Technology, Chandigarh, Kharar, Punjab, India

Keywords: R, Big Data, Hadoop, Rhipe, RHadoop, Streaming


Publication Details

Published in : Volume 2 | Issue 6 | November-December 2017
Date of Publication : 2017-12-31
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 1093-1097
Manuscript Number : CSEIT1726297
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

Amit Rajbanshi, Birendra Kumar Sah, C. K. Raina, "Data Analysis Using R and Hadoop", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 2, Issue 6, pp.1093-1097, November-December 2017.
