Architecture Design for Hadoop No-SQL and Hive

Authors: A. Antony Prakash, Dr. A. Aloysius

Big data emerged when traditional relational database systems could no longer handle the unstructured data (weblogs, videos, photos, social updates, human behaviour) generated today by organisations, social media, and other data-generating sources. Data that is so large in volume, so diverse in variety, or moving with such velocity that traditional systems cannot process it is called Big Data. Analyzing Big Data is challenging because it involves large distributed file systems that must be fault tolerant, flexible and scalable. The technologies used by big data applications to handle massive data include Hadoop, MapReduce, Apache Hive, NoSQL, HPCC and Overflow. These technologies handle massive amounts of data, in units ranging from KB, MB and TB up to PB, ZB and YB. This paper discusses various technologies for handling big data, along with the advantages and disadvantages of each in addressing the problems of dealing with massive data.

Authors and Affiliations

A. Antony Prakash
Assistant Professor, Information Tech, St Joseph's College - Tiruchirappalli, Tamil Nadu, India
Dr. A. Aloysius
Assistant Professor, Computer Science, St Joseph's College - Tiruchirappalli, Tamil Nadu, India

Keywords: Big Data, Hadoop, MapReduce, Apache Hive, NoSQL, Overflow

Publication Details

Published in : Volume 3 | Issue 1 | January-February 2018
Date of Publication : 2018-02-28
License:  This work is licensed under a Creative Commons Attribution 4.0 International License.
Page(s) : 1069-1077
Manuscript Number : CSEIT1831245
Publisher : Technoscience Academy

ISSN : 2456-3307

Cite This Article :

A. Antony Prakash, Dr. A. Aloysius, "Architecture Design for Hadoop No-SQL and Hive", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 3, Issue 1, pp.1069-1077, January-February-2018.
Journal URL : http://ijsrcseit.com/CSEIT1831245
