Text Summarization Using Machine Learning Algorithm

Authors

  • Dr. Vidyagouri B H, Assistant Professor, Dept. of Computer Science and Engineering, SDM College of Engineering and Technology, Dharwad, Karnataka, India
  • BibiSadiqa M D, M.Tech Student, Dept. of Computer Science and Engineering, SDM College of Engineering and Technology, Dharwad, Karnataka, India

DOI:

https://doi.org/10.32628/CSEIT228421

Keywords:

Text Summarization, Natural Language Processing, Extractive Summarization, Abstractive Summarization, ROUGE.

Abstract

In the age of technology, data is critical, yet much of the data on the internet is unstructured and poorly organized. Text summarization is introduced to condense such data into summaries: it is the process of extracting useful information from raw text without diluting its main theme. Today’s readers must contend with the task of reading comments, reviews, news articles, blogs, and other forms of informal and noisy communication, and it is difficult to retrieve the correct gist of the content, which all readers require. To achieve the benefits of both extractive and abstractive summarization, the proposed approach combines TF-IDF-TR (Term Frequency – Inverse Document Frequency – TextRank) as an unsupervised learning algorithm with the seq2seq (sequence-to-sequence) model as a supervised learning algorithm. In terms of ROUGE score, the proposed TFRSP approach outperforms existing text summarization methods, resulting in high summary accuracy.
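
The extractive half of the pipeline described above can be illustrated with a minimal sketch. It is not the authors' TFRSP implementation: it assumes each sentence is treated as a document for TF-IDF, that sentence similarity is cosine similarity, and that TextRank is run as a weighted PageRank with a damping factor of 0.85. The supervised seq2seq stage is omitted. All function names (`summarize`, `textrank`, `rouge1_recall`, and so on) are hypothetical, and `rouge1_recall` is a simplified unigram-overlap stand-in for a full ROUGE implementation.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase word tokens; no stemming or stop-word removal in this sketch.
    return re.findall(r"[a-z0-9']+", sentence.lower())


def tfidf_vectors(token_lists):
    # Each sentence counts as one "document" for the IDF statistics.
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def textrank(sim, damping=0.85, iterations=50):
    # Weighted PageRank over the sentence-similarity graph.
    # Sentences with no similar neighbours pass on no score.
    n = len(sim)
    scores = [1.0 / n] * n
    out_weight = [sum(row) for row in sim]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    # Return the k highest-ranked sentences in their original order.
    sentences = split_sentences(text)
    if len(sentences) <= k:
        return sentences
    vectors = tfidf_vectors([tokenize(s) for s in sentences])
    n = len(sentences)
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    scores = textrank(sim)
    top = sorted(sorted(range(n), key=lambda i: scores[i], reverse=True)[:k])
    return [sentences[i] for i in top]


def rouge1_recall(candidate, reference):
    # Fraction of reference unigrams that also appear in the candidate.
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum(min(count, cand[term]) for term, count in ref.items())
    return overlap / max(sum(ref.values()), 1)
```

Calling `summarize(review_text, k=2)` returns two sentences from the input, and `rouge1_recall(" ".join(summary), reference_summary)` gives a rough score against a human-written summary. In a hybrid setup like the one the abstract describes, the extracted sentences would presumably then be passed to the abstractive model; that hand-off is an assumption here, not something the abstract specifies.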

Published

2022-08-30

Section

Research Articles

How to Cite

Dr. Vidyagouri B H, BibiSadiqa M D, "Text Summarization Using Machine Learning Algorithm," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 8, Issue 4, pp. 167-173, July-August 2022. Available at doi: https://doi.org/10.32628/CSEIT228421