Privacy Preserving Parallel Distributed Data Stream Anonymization

Authors

  • Brinit Trivedi, Research Student, Department of Computer Engineering, Sigma Institute of Engineering, Vadodara, Gujarat, India
  • Sheshang Degadwala, Associate Professor, Department of Computer Engineering, Sigma Institute of Engineering, Vadodara, Gujarat, India
  • Dhairya Vyas, Managing Director, Shree Drashti Infotech LLP, Vadodara, Gujarat, India

DOI:

https://doi.org/10.32628/CSEIT228312

Keywords:

Data Stream, Anonymization, K-Anonymity, Generalization, L-Diversity, T-Closeness

Abstract

Sustainable stream processing algorithms have gained popularity in recent years. Flow control is a way of searching and modifying real-time data streams. Missing values are ubiquitous in real-world data streams, which makes data stream privacy challenging to safeguard. Most privacy preservation methods, however, were developed without taking missing values into account. They can anonymize such data in certain cases, but only at the cost of data loss. This research proposes a parallel distributed approach for protecting privacy in incomplete data streams. The method uses a production computational system to anonymize data streams continually, using clustering to construct each tuple. It clusters data in both partial and complete forms, using variable and array dimensions as similarity metrics. To prevent pollution from values and outliers, it uses a generalization approach that is based on more than matches. The experiments used real data to compare current systems under varied settings. The paper also covers several anonymization mechanisms, along with their advantages and drawbacks, and closes by exploring the future of continuous data anonymization research.
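
As a rough illustration of the cluster-then-generalize idea described in the abstract, the following Python sketch groups incoming tuples (some with missing values) and releases a cluster once it holds k members. It is not the authors' algorithm: the distance function, the threshold parameter, the penalty for missing values, and all names (distance, generalize, anonymize_stream) are assumptions made for illustration, and attribute values are assumed to be numeric and pre-scaled to [0, 1].

```python
from typing import Iterable, Iterator, List, Optional, Sequence, Tuple

Record = Sequence[Optional[float]]          # None marks a missing value
Generalized = List[Optional[Tuple[float, float]]]


def distance(a: Record, b: Record) -> float:
    """Mean per-attribute distance between two records.

    Assumption: an attribute missing in either record counts as
    maximally distant (1.0), since values are scaled to [0, 1].
    """
    total = 0.0
    for x, y in zip(a, b):
        total += 1.0 if x is None or y is None else abs(x - y)
    return total / len(a)


def generalize(cluster: Sequence[Record]) -> Generalized:
    """Replace each attribute by the (min, max) range of observed values.

    Missing values do not widen the range; an attribute that is missing
    in every record stays None.
    """
    out: Generalized = []
    for column in zip(*cluster):
        seen = [v for v in column if v is not None]
        out.append((min(seen), max(seen)) if seen else None)
    return out


def anonymize_stream(
    stream: Iterable[Record], k: int, threshold: float = 0.3
) -> Iterator[Generalized]:
    """Greedy clustering: emit one generalized tuple per record once the
    record's cluster reaches k members. Smaller clusters stay buffered."""
    clusters: List[List[Record]] = []
    for record in stream:
        # Nearest cluster, judged by its first member (a simple proxy).
        best = min(clusters, key=lambda c: distance(c[0], record), default=None)
        if best is not None and distance(best[0], record) <= threshold:
            best.append(record)
        else:
            best = [record]
            clusters.append(best)
        if len(best) >= k:
            # Drop the released cluster by identity, not by equality.
            clusters = [c for c in clusters if c is not best]
            released = generalize(best)
            for _ in best:
                yield released


if __name__ == "__main__":
    sample = [(0.1, 0.2), (0.15, None), (0.9, 0.8), (0.12, 0.25)]
    for row in anonymize_stream(sample, k=2, threshold=0.6):
        print(row)
```

In this sketch every released tuple is indistinguishable from at least k - 1 others on the generalized attributes, which is the k-anonymity property named in the keywords. Tuples in clusters that never reach k members remain buffered; a real stream anonymizer would also need a delay bound, which is omitted here.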

Published

2022-05-30

Section

Research Articles

How to Cite

[1]
Brinit Trivedi, Sheshang Degadwala, Dhairya Vyas, "Privacy Preserving Parallel Distributed Data Stream Anonymization", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 8, Issue 3, pp. 53-66, May-June 2022. Available at doi: https://doi.org/10.32628/CSEIT228312