Sorted Positional Indexing Based Computation for Large Data

Authors

  • K. S. Vijaya Lakshmi  Assistant Professor, Computer Science Department, VR Siddhartha College, Vijayawada, India
  • K. Gayatri  Student, Computer Science Department, VR Siddhartha College, Vijayawada, India

Keywords:

Big data, Hadoop MapReduce, Skyline, SSPL

Abstract

The performance of Hadoop MapReduce depends largely on its configuration parameters. Tuning the job configuration parameters is an effective way to improve performance, reducing both execution time and disk utilization. Tuning is driven mainly by CPU usage, disk I/O rate, memory usage, and network traffic. In this work we discuss tuning techniques that improve the execution of MapReduce jobs. Existing algorithms cannot compute the skyline efficiently on big data. We therefore apply a novel skyline algorithm, Skyline Sorted Positional Index List (SSPL), to large data sets such as social data. SSPL uses sorted positional index lists, which require low space overhead, to reduce I/O cost significantly. Experimental results on synthetic and real data sets show that SSPL has a significant advantage over existing skyline algorithms.
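
For readers unfamiliar with the terms, the sketch below illustrates what a skyline is and what a sorted positional index list might look like. It is not the SSPL algorithm from the paper: the function names, the "smaller is better" convention, and the sample points are all assumptions made for illustration, and the skyline is computed with a naive quadratic baseline.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def sorted_positional_index_lists(points: List[Point]) -> List[List[int]]:
    """For each dimension, list the row positions ordered by ascending
    value in that dimension. Only positions are stored, not the tuples,
    which is why such lists are small."""
    dims = len(points[0]) if points else 0
    return [
        sorted(range(len(points)), key=lambda i, d=d: points[i][d])
        for d in range(dims)
    ]


def naive_skyline(points: List[Point]) -> List[int]:
    """Positions of points not dominated by any other point.
    O(n^2) baseline for illustration only, not SSPL itself."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical two-dimensional sample data.
points = [(1, 9), (3, 3), (2, 8), (9, 1), (5, 5), (4, 4)]
index_lists = sorted_positional_index_lists(points)
skyline = naive_skyline(points)
```

Here the points (5, 5) and (4, 4) are dominated by (3, 3), so the skyline consists of the first four points. Because each index list holds only integer positions, it stays small even when the tuples themselves are wide.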

Published

2017-10-31

Issue

Volume 2, Issue 5

Section

Research Articles

How to Cite

[1]
K. S. Vijaya Lakshmi, K. Gayatri, "Sorted Positional Indexing Based Computation for Large Data," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 2, Issue 5, pp. 380-385, September-October 2017.