A Novel Sequence Graph Representation for Searching and Retrieving Sequences of Long Text in the Domain of Information Retrieval
Keywords:
Search engine, Stop words, Graph database, Word Sequence Graph Model
Abstract
Long-tail queries are becoming the norm as users search for exactly what they intend, and keyword-based SEO tactics alone are not enough to serve them. Managing such lengthy queries requires a full-text, sequence-based approach to indexing documents. This paper presents a novel graph-based document representation, the Word Sequence Graph model, which supports efficient search and retrieval of text of any length, including stop words, by exploiting the distinctive features of a graph database. It is a one-for-all model in which document information and content information reside in the same place. The methodology is relevant to many real-world applications that involve searching large collections of documents. Examples are demonstrated using Bible texts.
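The idea of a word sequence graph can be illustrated with a minimal in-memory sketch. This is not the paper's implementation: the abstract does not specify the schema or the graph database used, so the class name `WordSequenceGraph`, its methods, and the choice to label each edge with a document identifier and word position are all assumptions made for illustration. Words are nodes, each pair of consecutive words is an edge, and stop words are kept, so a phrase of any length can be matched by following edges.

```python
import re
from collections import defaultdict


class WordSequenceGraph:
    """Hypothetical toy model: nodes are words, edges link consecutive words."""

    def __init__(self):
        # nodes[word] -> set of (doc_id, position) where the word occurs
        self.nodes = defaultdict(set)
        # edges[(word, next_word)] -> set of (doc_id, position of `word`)
        self.edges = defaultdict(set)

    @staticmethod
    def tokenize(text):
        # Lowercase and split on non-word characters; stop words are kept.
        return re.findall(r"[a-z0-9']+", text.lower())

    def add_document(self, doc_id, text):
        words = self.tokenize(text)
        for pos, word in enumerate(words):
            self.nodes[word].add((doc_id, pos))
            if pos + 1 < len(words):
                self.edges[(word, words[pos + 1])].add((doc_id, pos))

    def search(self, phrase):
        """Return sorted (doc_id, start_position) pairs where the phrase occurs."""
        words = self.tokenize(phrase)
        if not words:
            return []
        if len(words) == 1:
            return sorted(self.nodes.get(words[0], set()))
        # Start from every occurrence of the first edge, then keep only the
        # starts whose following edges continue the same sequence.
        candidates = set(self.edges.get((words[0], words[1]), set()))
        for offset in range(1, len(words) - 1):
            step = self.edges.get((words[offset], words[offset + 1]), set())
            candidates = {(d, p) for (d, p) in candidates if (d, p + offset) in step}
            if not candidates:
                break
        return sorted(candidates)


graph = WordSequenceGraph()
graph.add_document("gen_1_1", "In the beginning God created the heaven and the earth.")
graph.add_document(
    "john_1_1",
    "In the beginning was the Word, and the Word was with God, and the Word was God.",
)

print(graph.search("in the beginning"))  # [('gen_1_1', 0), ('john_1_1', 0)]
print(graph.search("the word was"))      # [('john_1_1', 7), ('john_1_1', 13)]
```

Because the edge labels carry both the document identifier and the position, document information and content information sit in the same structure, which is one plausible reading of the "one-for-all" property. A graph database would presumably store the same nodes and edges persistently and answer the phrase query as a path traversal.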
License
Copyright (c) IJSRCSEIT

This work is licensed under a Creative Commons Attribution 4.0 International License.