Detecting URL Phishing Attacks Using Machine Learning & NLP Techniques

Authors

  • Anitha R, Assistant Professor, Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India
  • Swathi S, B.E. Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India
  • Vasuhi R, B.E. Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India
  • Thenmozhi P, B.E. Computer Science and Engineering, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India

Keywords:

Phishing, Phishing detection system, Hacking, SVM, URL, NLP

Abstract

The internet plays a major role in people's day-to-day lives. URL phishing is carried out by attackers who aim to obtain information from internet users by making a URL appear to come from a trustworthy organization. Users need to be aware of such phishing attacks in order to safeguard sensitive information from being stolen. This paper focuses on detecting URL phishing. Features are extracted from the URL under test using various NLP (Natural Language Processing) techniques, such as tokenizing the URL, determining its popularity, and checking for the presence of an IP (Internet Protocol) address. The system uses a machine learning algorithm, the Support Vector Machine (SVM), which is suited to classification problems, to identify whether the given URL is phishing or safe. The SVM is trained on a dataset of URL features. SVM is considered well suited to this kind of feature-based classification and gives reliable results.
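
The feature-extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`tokenize`, `has_ip_host`, `extract_features`) and the exact feature set are assumptions made for illustration, and the popularity feature is left out because it would depend on an external ranking source that the abstract does not specify.

```python
import ipaddress
import re
from urllib.parse import urlsplit


def has_ip_host(host: str) -> bool:
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def tokenize(url: str) -> list[str]:
    """Split a URL into lowercase alphanumeric tokens."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", url) if t]


def extract_features(url: str) -> dict[str, int]:
    """Build a small, hypothetical feature dictionary for one URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is None when the URL has no netloc
    tokens = tokenize(url)
    return {
        "url_length": len(url),
        "num_tokens": len(tokens),
        "num_dots_in_host": host.count("."),
        "has_ip_host": int(has_ip_host(host)),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parts.scheme == "https"),
    }


if __name__ == "__main__":
    for example in ("http://192.168.0.1/login", "https://www.example.com/account"):
        print(example, extract_features(example))
```

A feature dictionary of this kind could be converted to a numeric vector and passed to an SVM classifier for training and prediction. The choice of SVM library and kernel is left open here, since the abstract does not name one.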

Published

2019-04-30

Section

Research Articles

How to Cite

[1]
Anitha R, Swathi S, Vasuhi R, Thenmozhi P, "Detecting URL Phishing Attacks Using Machine Learning & NLP Techniques", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 5, Issue 2, pp. 53-56, March-April 2019.