Big Data Anonymization in Cloud using k-Anonymity Algorithm using Map Reduce Framework

Authors

  • Anushree Raj  Department of M.Sc. Big Data Analytics, St Agnes Autonomous College, Mangalore, Karnataka, India
  • Rio G L D'Souza  Department of Computer Science and Engineering, St Joseph Engineering College, Mangalore, Karnataka, India

DOI:

https://doi.org/10.32628/CSEIT19516

Keywords:

Anonymization, Big Data in cloud, k-Anonymity, Map Reduce, Privacy Preserving

Abstract

Anonymization techniques are applied to protect the privacy of data published on the cloud. These techniques include various algorithms that generalize or suppress the data. Top-Down Specialization (TDS) in k-anonymity is the best generalization algorithm for data anonymization. As the volume of data on the cloud grows, data analysis becomes very tedious. The MapReduce framework can be adapted to process such huge amounts of Big Data. We implement a generalization method for data anonymization on the cloud, using a Map phase and a Reduce phase, in two different phases of Top-Down Specialization.
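
As a rough illustration of how a Map phase and a Reduce phase can support k-anonymity, the following minimal Python sketch runs both phases in a single process. It is not the authors' implementation and does not reproduce their top-down algorithm. The record fields (age, zip, disease), the age bands, the ZIP-prefix rule, and all function names are hypothetical and chosen only for the example. The Map phase generalizes each record's quasi-identifiers and emits a (key, 1) pair; the Reduce phase sums the pairs to obtain the size of each equivalence class; the data is k-anonymous when every class holds at least k records.

```python
from collections import defaultdict

# Hypothetical generalization rule for ages: (low, high, label).
AGE_BANDS = [(0, 29, "<30"), (30, 59, "30-59"), (60, 200, "60+")]


def generalize(record):
    """Replace the quasi-identifiers (age, zip) with coarser values."""
    age, zip_code = record["age"], record["zip"]
    band = next(label for low, high, label in AGE_BANDS if low <= age <= high)
    # Keep only the first three ZIP digits (an assumed generalization level).
    return (band, zip_code[:3] + "**")


def map_phase(records):
    """Emit (generalized quasi-identifier tuple, 1) for each record."""
    for record in records:
        yield generalize(record), 1


def reduce_phase(pairs):
    """Sum the counts per key, giving the size of each equivalence class."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def is_k_anonymous(class_sizes, k):
    """True when every equivalence class contains at least k records."""
    return all(size >= k for size in class_sizes.values())


# Made-up sample records, used only to exercise the sketch.
records = [
    {"age": 25, "zip": "57500", "disease": "flu"},
    {"age": 28, "zip": "57501", "disease": "cold"},
    {"age": 41, "zip": "57502", "disease": "flu"},
    {"age": 45, "zip": "57503", "disease": "asthma"},
]

class_sizes = reduce_phase(map_phase(records))
print(class_sizes)
print(is_k_anonymous(class_sizes, k=2))
```

On a real cluster the same two functions would be distributed by the framework, with the shuffle step grouping pairs by key before the reducers run. Here the grouping is done by a dictionary purely to keep the sketch self-contained.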

Published

2018-12-30

Section

Research Articles

How to Cite

[1]
Anushree Raj, Rio G L D'Souza, "Big Data Anonymization in Cloud using k-Anonymity Algorithm using Map Reduce Framework," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 5, Issue 1, pp. 50-56, January-February 2019. Available at doi: https://doi.org/10.32628/CSEIT19516