Comparative Study of Datasets used in Cyber Security Intrusion Detection

Authors

  • Rahul Yadav, Department of Computer Science Application, ITM University, Gwalior, Madhya Pradesh, India
  • Phalguni Pathak, Department of Computer Science Application, ITM University, Gwalior, Madhya Pradesh, India
  • Saumya Saraswat, Department of Computer Science Application, ITM University, Gwalior, Madhya Pradesh, India

DOI:

https://doi.org/10.32628/CSEIT2063103

Keywords:

Architecture, Attack, Detection, IDS, Datasets, Prevention

Abstract

In recent years, deep learning frameworks have been applied in various domains and have shown promising performance in applications that include malware detection software, self-driving cars, and identity recognition cameras. At the same time, adversarial attacks have become a crucial security threat to several deep learning applications. Deep learning techniques have become a core part of several cyber security applications such as intrusion detection, Android malware detection, spam, malware classification, binary analysis, and phishing detection. One of the major research challenges in this field is the lack of a comprehensive data set that reflects contemporary network traffic scenarios, a broad range of low-footprint intrusions, and in-depth structured information about the network traffic. Many benchmark data sets for evaluating network intrusion detection systems were developed a decade ago. In this paper, we provide a focused literature survey of data sets used for network-based intrusion detection and characterize in detail the underlying packet-based and flow-based network data used for intrusion detection in cyber security. Because datasets play a vital role in intrusion detection, we describe the available cyber security datasets and provide a categorization of them.

Published

2020-10-30

Section

Research Articles

How to Cite

[1]
Rahul Yadav, Phalguni Pathak, Saumya Saraswat, "Comparative Study of Datasets used in Cyber Security Intrusion Detection," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 6, Issue 5, pp. 302-312, September-October 2020. Available at doi: https://doi.org/10.32628/CSEIT2063103