A Comprehensive Study on Some Web Mining Algorithms

Authors

  • Monimoy Ghosh  Department of Computer Science, St. Xavier's College (Autonomous), Kolkata, West Bengal, India
  • Asoke Nath  Department of Computer Science, St. Xavier's College (Autonomous), Kolkata, West Bengal, India

DOI:

https://doi.org/10.32628/CSEIT228325

Keywords:

Web mining, World Wide Web, Web search rank, PageRank, Weighted PageRank, HITS.

Abstract

The tremendous growth of Web technologies over the past three decades has made the Web the largest publicly accessible data source in the world. As the volume of data on the Web keeps increasing, discovering informative knowledge and patterns is becoming difficult and time-consuming. Finding meaningful, user-requested data in unstructured and inconsistent material on the internet is a difficult undertaking. Web mining is the application of data mining techniques to discover patterns and structures and to extract knowledge from the World Wide Web. It uses automated methods to extract structured and unstructured data from web pages, server logs, and link structures. A variety of Web mining techniques are commonly employed to categorize and rank search results, including PageRank, Weighted PageRank, and HITS. The aim of this paper is to present and analyze these currently important algorithms for ranking web pages.

Published

2022-04-30

Section

Research Articles

How to Cite

Monimoy Ghosh, Asoke Nath, "A Comprehensive Study on Some Web Mining Algorithms", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 8, Issue 2, pp. 319-326, March-April 2022. Available at DOI: https://doi.org/10.32628/CSEIT228325