Storage Preservation Using Big Data Based Intelligent Compression Scheme

Authors

  • Ramya. S  ME-Computer Science and Engineering, Dhanalakshmi Srinivasan Engineering College, Perambalur, Tamil Nadu, India
  • Gokula Krishnan. V  Assistant Professor, Dhanalakshmi Srinivasan Engineering College, Perambalur, Tamil Nadu, India

DOI:

https://doi.org/10.32628/CSEIT19539

Keywords:

Data Chunks, Similarity Matching, Parallel Processing, Data Security, Data Compression

Abstract

Big data has matured into a productive phase: most of its main issues have been addressed to the point that storage is now attractive for full commercial exploitation. However, concerns over data compression still prevent many users from migrating data to remote storage. Client-side data compression in particular ensures that multiple uploads of the same content consume only the network bandwidth and storage space of a single upload, and it is actively used by a number of backup providers and other services. Unfortunately, compressed data is pseudorandom and thus cannot be deduplicated; as a consequence, current schemes have to sacrifice storage efficiency entirely. In this work, we present a scheme that permits a more fine-grained trade-off, built on a novel idea that differentiates data according to its popularity. Based on this idea, we design a compression scheme that guarantees semantic storage preservation for unpopular data and provides scalable storage and bandwidth benefits for popular data. We implement a variable data chunk similarity algorithm that analyzes the data chunks and stores the original data in compressed format, and we include an encryption algorithm to secure the data. Finally, a backup and recovery system is available at the time of blocking, and frequent login access is also analyzed.
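
The chunk-level idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the shift-and-add boundary hash, the constants MASK, MIN_CHUNK and MAX_CHUNK, and the names split_chunks and ChunkStore are all assumptions made for illustration. Similarity matching is simplified here to exact matching on a SHA-256 digest, and the popularity and encryption steps are left out.

```python
import hashlib
import zlib

# All constants below are illustrative assumptions, not values from the paper.
MASK = 0x3F       # cut when the low 6 bits of the hash are zero
MIN_CHUNK = 32    # smallest chunk length in bytes
MAX_CHUNK = 256   # forced cut if no boundary is found earlier


def split_chunks(data: bytes):
    """Yield variable-size chunks whose boundaries depend on the content."""
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        # Simple shift-and-add hash; older bytes shift out of the 32-bit value.
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        length = i - start + 1
        at_boundary = length >= MIN_CHUNK and (rolling & MASK) == 0
        if at_boundary or length >= MAX_CHUNK:
            yield data[start:i + 1]
            start = i + 1
            rolling = 0
    if start < len(data):
        yield data[start:]  # trailing partial chunk


class ChunkStore:
    """Stores each distinct chunk once, in compressed form."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> zlib-compressed chunk

    def put(self, data: bytes):
        """Store data and return its recipe (ordered list of chunk digests)."""
        recipe = []
        for chunk in split_chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # exact match stands in for similarity
                self.chunks[digest] = zlib.compress(chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Rebuild the original bytes from a recipe."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in recipe)
```

Storing the same content a second time adds no new entries to the store, because every chunk digest is already present; only the recipe is returned again.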

Published

2019-06-30

Section

Research Articles

How to Cite

[1]
Ramya. S, Gokula Krishnan. V, "Storage Preservation Using Big Data Based Intelligent Compression Scheme", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN : 2456-3307, Volume 5, Issue 3, pp. 92-100, May-June 2019. Available at doi : https://doi.org/10.32628/CSEIT19539