Frequent Pattern Mining over Unstructured Data using Semi-Structured Doc-Model and Pattern Ranking

Authors

  • Sudhir Tirumalasetty  Department of Computer Science & Engineering, Vasireddy Venkatadri Institute of Technology, Guntur, Andhra Pradesh, India
  • A. Divya  Department of Computer Science & Engineering, Vasireddy Venkatadri Institute of Technology, Guntur, Andhra Pradesh, India
  • D. Rahitya Lakshmi  Department of Computer Science & Engineering, Vasireddy Venkatadri Institute of Technology, Guntur, Andhra Pradesh, India
  • Ch. Durga Bhavani  Department of Computer Science & Engineering, Vasireddy Venkatadri Institute of Technology, Guntur, Andhra Pradesh, India
  • D. Anusha  Department of Computer Science & Engineering, Vasireddy Venkatadri Institute of Technology, Guntur, Andhra Pradesh, India

DOI:

https://doi.org/10.32628/CSEIT206216

Keywords:

Data Mining, Doc-Model, Frequent Pattern Mining, Pattern Rank, Unstructured Data

Abstract

Frequent pattern mining is an essential data-mining task whose goal is to discover knowledge in the form of repeated patterns. Many efficient pattern-mining algorithms have been developed over the last two decades, yet most do not scale to the data volumes encountered today, the so-called “Big Data”. Scalable parallel algorithms hold the key to solving the problem in this context. This paper reviews recent advances in parallel frequent pattern mining and analyses them through the Big Data lens. Load balancing and work partitioning are the major challenges to be overcome, and they continue to call for innovative methods as Big Data keeps growing without limit. A greater challenge still is finding frequent patterns in unstructured data. To address it, a Semi-Structured Doc-Model and ranking of patterns are used.
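
As a rough illustration only, the idea named in the abstract (mining frequent patterns from unstructured text held in a semi-structured document model, then ranking them) can be sketched as follows. This is not the authors' method: the record layout (`_id`, `text`), the function names, the brute-force itemset enumeration, and the ranking rule (support first, then pattern length) are all assumptions made for this sketch.

```python
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Return the set of distinct lowercase terms in a free-text field."""
    return {w.strip(".,;:!?").lower() for w in text.split()} - {""}


def frequent_patterns(docs, min_support=2, max_size=2):
    """Count term itemsets per document and rank the frequent ones.

    docs is a list of dict records (a hypothetical document model);
    support is the number of documents containing every term of an itemset.
    """
    counts = Counter()
    for doc in docs:
        terms = sorted(tokenize(doc.get("text", "")))
        # Brute-force enumeration; fine for a toy example, not for Big Data.
        for size in range(1, max_size + 1):
            for itemset in combinations(terms, size):
                counts[itemset] += 1
    frequent = [(p, c) for p, c in counts.items() if c >= min_support]
    # Assumed ranking: higher support first, then longer patterns, then A-Z.
    frequent.sort(key=lambda pc: (-pc[1], -len(pc[0]), pc[0]))
    return frequent


docs = [
    {"_id": 1, "text": "big data mining"},
    {"_id": 2, "text": "data mining patterns"},
    {"_id": 3, "text": "big data patterns"},
]
ranked = frequent_patterns(docs, min_support=2, max_size=2)
for pattern, support in ranked:
    print(support, pattern)
```

With these three toy records, the single term `data` ranks first because it appears in all three documents. A scalable version would replace the brute-force enumeration with a pruning or parallel algorithm of the kind the paper reviews.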

Published

2020-04-30

Section

Research Articles

How to Cite

[1]
Sudhir Tirumalasetty, A. Divya, D. Rahitya Lakshmi, Ch. Durga Bhavani, D. Anusha, "Frequent Pattern Mining over Unstructured Data using Semi-Structured Doc-Model and Pattern Ranking", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 6, Issue 2, pp. 36-42, March-April 2020. Available at doi: https://doi.org/10.32628/CSEIT206216