A Comprehensive Study on Some Web Mining Algorithms
DOI:
https://doi.org/10.32628/CSEIT228325
Keywords:
Web mining, World Wide Web, Web search rank, Page rank and Weighted Page rank, HITS.
Abstract
The tremendous growth of Web technologies over the past three decades has made the Web the largest publicly accessible data source in the world. As the volume of data on the Web keeps increasing, discovering informative knowledge and patterns is becoming difficult and time-consuming. Finding useful, user-requested data in the unstructured and inconsistent material on the internet is a difficult undertaking. Web mining is the application of data mining techniques to discover patterns and structures and to extract knowledge from the World Wide Web. It uses automated methods to extract structured and unstructured data from web pages, server logs, and link structures. A variety of Web mining techniques are commonly employed to categorize and rank search results, including PageRank, Weighted PageRank, and HITS. The aim of this paper is to present and analyze currently important algorithms for ranking web pages, such as PageRank, Weighted PageRank, and HITS.
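To make the idea of link-based ranking concrete, the following is a minimal sketch of PageRank computed by power iteration. The abstract names the algorithm but does not give a formula, so this assumes the standard damped random-surfer formulation with a damping factor of 0.85. The function name `pagerank`, its parameters, the uniform handling of pages without out-links, and the four-page graph `toy_web` are all illustrative assumptions, not taken from the paper.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Sketch of PageRank by power iteration (assumed standard formulation).

    links: dict mapping each page to the list of pages it links to.
    Returns a dict mapping each page to its score; scores sum to 1.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iter):
        # Assumption: rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page passes an equal share of its rank along every out-link.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += damping * rank[p] / len(outs)

        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:  # stop once the scores have stabilised
            break
    return rank


# Hypothetical four-page web: C is linked from A, B and D; nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph, page C collects links from three pages and ends up with the highest score, while page D, which no page links to, receives only the baseline share. Weighted PageRank and HITS rest on the same link structure but distribute or split the scores differently, and are not shown here.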
License
Copyright (c) IJSRCSEIT

This work is licensed under a Creative Commons Attribution 4.0 International License.