Optimal Common Job Block Table (CJBT) to improve the Performance in Hadoop framework

Authors

  • Pinjari Vali Basha, M. Tech Scholar, Computer Science and Engineering, JNTUA College of Engineering, Ananthapuramu, Andhra Pradesh, India

DOI:

https://doi.org/10.32628/CSEIT217689

Keywords:

Common Job Block Table, Least Recently Used, Improved Hadoop.

Abstract

With the rapid transformation of technology, a huge amount of data, both structured and unstructured, is generated every day. With the aid of 5G technology and IoT, the volume of data generated and processed daily is very large: approximately 2.5 quintillion bytes.
This data (Big Data) is stored and processed with the help of the Hadoop framework. The Hadoop framework has two parts for storing and retrieving data across the network:

  • Hadoop Distributed File System (HDFS)
  • MapReduce algorithm

In the native Hadoop framework, the MapReduce algorithm has some limitations. If the same job is submitted again, native Hadoop carries out every step from scratch and the user has to wait for the results once more, which wastes time and resources. Extending the capabilities of the NameNode so that it maintains a Common Job Block Table (CJBT) improves performance, at the cost of maintaining the table.
The Common Job Block Table contains the metadata of files used by repeated jobs. This avoids recomputation, which means fewer computations, resource savings, and faster processing. Because the table would otherwise keep growing, its size must be limited by an algorithm that keeps track of the jobs. The optimal Common Job Block Table is obtained by applying an optimal algorithm at the NameNode.
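
The idea of a size-limited job table can be illustrated with a minimal Python sketch. It assumes Least Recently Used (LRU) eviction, which the keywords name, as the policy that bounds the table; the abstract does not spell out the exact algorithm. The class name, the method names, the job key format, and the block identifiers are all hypothetical and are not part of any Hadoop API.

```python
from collections import OrderedDict


class CommonJobBlockTable:
    """Hypothetical bounded table: job key -> block metadata, LRU eviction."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        # Entries are ordered from least to most recently used.
        self._entries = OrderedDict()

    def lookup(self, job_key):
        """Return stored block metadata for a repeated job, or None on a miss."""
        if job_key not in self._entries:
            return None
        self._entries.move_to_end(job_key)  # mark as most recently used
        return self._entries[job_key]

    def record(self, job_key, block_ids):
        """Store metadata for a finished job; evict the LRU entry if over capacity."""
        self._entries[job_key] = list(block_ids)
        self._entries.move_to_end(job_key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently used job

    def keys(self):
        """Job keys ordered from least to most recently used."""
        return list(self._entries)


# Illustrative job keys (assumed format: job name plus input path).
table = CommonJobBlockTable(capacity=2)
table.record(("wordcount", "/logs/a.txt"), ["blk_1", "blk_2"])
table.record(("grep:error", "/logs/a.txt"), ["blk_2"])
hit = table.lookup(("wordcount", "/logs/a.txt"))         # repeated job: table hit
table.record(("wordcount", "/logs/b.txt"), ["blk_7"])    # table full: evicts the grep job
miss = table.lookup(("grep:error", "/logs/a.txt"))       # evicted, so a miss
```

In this sketch a hit stands for a repeated job that can reuse the recorded metadata instead of rerunning every step, while a miss falls back to the normal execution path. The capacity is the cost-versus-benefit knob the abstract alludes to: a larger table avoids more recomputation but costs more to maintain.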

Published

2021-12-30

Section

Research Articles

How to Cite

Pinjari Vali Basha, "Optimal Common Job Block Table (CJBT) to improve the Performance in Hadoop framework", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 7, Issue 6, pp. 346-350, November-December 2021. Available at doi: https://doi.org/10.32628/CSEIT217689