Architecture Design for Hadoop No-SQL and Hive

A. Antony Prakash; Dr. A. Aloysius

doi:10.32628/CSEIT1831245

Authors

A. Antony Prakash, Assistant Professor, Information Tech, St Joseph's College - Tiruchirappalli, Tamil Nadu, India
Dr. A. Aloysius, Assistant Professor, Computer Science, St Joseph's College - Tiruchirappalli, Tamil Nadu, India

Keywords:

Big Data, Hadoop, MapReduce, Apache Hive, NoSQL, Overflow.

Abstract

Big data came into existence when traditional relational database systems could not handle the unstructured data (weblogs, videos, photos, social updates, human behaviour) generated today by organisations, social media, or any other data-generating source. Data that is so large in volume, so diverse in variety, or moving with such velocity that traditional systems cannot handle it is called big data. Analyzing big data is challenging because it involves large distributed file systems, which should be fault tolerant, flexible and scalable. The technologies used by big data applications to handle massive data are Hadoop, MapReduce, Apache Hive, NoSQL, HPCC and Overflow. These technologies handle massive amounts of data, measured in units ranging from KB, MB and TB to PB, ZB and YB. This research paper discusses various technologies for handling big data, along with the advantages and disadvantages of each technology for dealing with massive data.

