Big Data Analytics with Cloud Databases: Efficiency and Cost Optimization

Authors

  • N V Rama Sai Chalapathi Gupta Lakkimsetty, Independent Researcher, USA

Keywords:

Big Data Analytics, Efficiency and Scalability, Cloud Computing, Cost Analysis, Hadoop Platform Deployment, Platform-As-A-Service Mechanism, Performance Measurements, Physically Visiting Libraries.

Abstract

Combining big data analytics with cloud computing allows large datasets to be analyzed and processed with unprecedented scalability and efficiency. The cloud overcomes the drawbacks of conventional on-premises systems by offering a flexible and affordable infrastructure for big data management, storage, and analysis. With this combination, businesses can make full use of big data and extract insights that inform decisions and spur innovation. The combination also takes advantage of technologies such as artificial intelligence, machine learning, and distributed computing. Because online resources are kept constantly current, users increasingly rely on them instead of visiting libraries in person. Cloud computing lets users store data at large scale, maintain backups, and protect themselves against disasters, among many other advantages. Owing to the development of cloud computing, libraries can now store enormous volumes of data on their websites and in electronic databases, and this data can be stored securely. For libraries, cloud computing provides a virtual platform on the library website that manages all of the data made accessible over the internet. Cloud computing is also a popular IT strategy for meeting the requirements of many commercial and scientific big data applications. In this research, we present a technique for deploying the Hadoop platform on several cloud infrastructures using the Occopus cloud orchestrator tool. With the primary objective of preventing vendor lock-in, our automated solution offers a simple, portable, and scalable method of deploying the well-known Hadoop platform; that is, it does not rely on any cloud provider's prepared virtual machine image or "black-box" Platform-as-a-Service mechanism. The study also reports a cost analysis and encouraging performance evaluation results.

Published

2020-03-01

Section

Research Articles

How to Cite

[1]
N V Rama Sai Chalapathi Gupta Lakkimsetty, "Big Data Analytics with Cloud Databases: Efficiency and Cost Optimization," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 6, Issue 2, pp. 599-607, March-April 2020.