The Comparative study of Python Libraries for Natural Language Processing (NLP)

Authors

  • Dr. Dhara Ashish Darji, Faculty of Computer Applications, Ganpat University, Ganpat Vidyanagar, Mehsana-Gozaria Highway, Kherva, Gujarat, India. ORCID: https://orcid.org/0009-0000-6359-908X
  • Dr. Sachinkumar Anandpal Goswami, Faculty of Computer Applications, Ganpat University, Ganpat Vidyanagar, Mehsana-Gozaria Highway, Kherva, Gujarat, India

DOI:

https://doi.org/10.32628/CSEIT2410242

Keywords:

NLP, Libraries, NLU, NLG, NLTK

Abstract

Natural Language Processing (NLP) has advanced significantly in recent years, driven largely by the availability of powerful Python libraries. This comparative study analyzes the performance, language support, community support, and ease of use of several popular Python NLP libraries, including NLTK (Natural Language Toolkit), spaCy, TextBlob, Flair, Jina, and Gensim. It evaluates these libraries across NLP tasks such as tokenization, part-of-speech tagging, named entity recognition, sentiment analysis, and text summarization. The paper also discusses the strengths and weaknesses of each library and their suitability for different NLP applications. Through detailed experimentation and analysis, the study aims to guide researchers and practitioners in selecting the most appropriate library for their NLP projects.
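
To make the idea of comparing libraries on a task such as tokenization concrete, the following is a minimal sketch of a timing harness. It is not the methodology used in the paper, which this abstract does not describe. The function names (`regex_tokenize`, `whitespace_tokenize`, `compare_tokenizers`) and the sample sentence are hypothetical, and the two tokenizers are standard-library stand-ins rather than calls into any of the libraries named above.

```python
import re
import time
from typing import Callable, Dict, List

# A tokenizer is any callable that maps a string to a list of tokens.
Tokenizer = Callable[[str], List[str]]


def regex_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer: splits into word runs and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer: splits on whitespace only."""
    return text.split()


def compare_tokenizers(
    tokenizers: Dict[str, Tokenizer], text: str, repeats: int = 100
) -> Dict[str, Dict[str, float]]:
    """Run each tokenizer on the same text; report token count and mean time per call."""
    results: Dict[str, Dict[str, float]] = {}
    for name, tokenize in tokenizers.items():
        tokens = tokenize(text)  # warm-up call, also used for the token count
        start = time.perf_counter()
        for _ in range(repeats):
            tokenize(text)
        elapsed = time.perf_counter() - start
        results[name] = {
            "tokens": len(tokens),
            "seconds_per_call": elapsed / max(repeats, 1),
        }
    return results


# Hypothetical sample input, chosen only for illustration.
sample = "NLTK, spaCy and Gensim are popular."
report = compare_tokenizers(
    {"regex": regex_tokenize, "whitespace": whitespace_tokenize}, sample
)
for name, stats in report.items():
    print(name, stats["tokens"], f"{stats['seconds_per_call']:.2e}s")
```

Because the harness only assumes a callable from string to token list, a real library tokenizer (for example NLTK's `word_tokenize`, which needs its tokenizer data downloaded first) could be passed in the same dictionary. Differences in token counts, as between the two stand-ins here, are one reason a comparison should report output alongside speed.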

Published

16-03-2024

Section

Research Articles

How to Cite

[1]
Dr. Dhara Ashish Darji and Dr. Sachinkumar Anandpal Goswami, “The Comparative study of Python Libraries for Natural Language Processing (NLP)”, Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol., vol. 10, no. 2, pp. 499–512, Mar. 2024, doi: 10.32628/CSEIT2410242.