Analysis of Data Performance that Reduces Resource Utilization Overheads and Increases the Efficiency

Authors

  • Anilkumar Ambore  Research Scholar, VTU, Department of CSE, REVA ITM, Bangalore, India
  • Udaya Rani V  Department of CSE, REVA ITM, Bangalore, India

DOI:

https://doi.org/10.32628/CSEIT228369

Keywords:

Big Data, Resource Utilization, Spark, Hadoop, Cloud Computing

Abstract

Data volumes are growing at an unpredictable rate, which creates the need for big data processing. This is especially true of business applications, where data volumes are huge and must nevertheless be processed efficiently. Traditional systems fail to process big data because most of it is unstructured. Resource utilization plays a vital role in improving the performance of distributed data processing. Resource gaps develop during execution, and they occur more frequently in heterogeneous environments. Previous techniques waste resources or use them inefficiently. Several platforms, such as Apache Hadoop and Apache Spark, are used to process data in distributed environments. We develop a new algorithm that reduces resource usage and increases performance. The algorithm is implemented in an Apache Spark distributed environment. The experimental results indicate efficient resource utilization and improved performance.

Published

2022-06-30

Section

Research Articles

How to Cite

[1]
Anilkumar Ambore, Udaya Rani V, "Analysis of Data Performance that Reduces Resource Utilization Overheads and Increases the Efficiency," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 8, Issue 3, pp. 191-195, May-June 2022. Available at doi: https://doi.org/10.32628/CSEIT228369