Fuzzy Document Clustering based on Frequent Features and Feature Length

Authors

  • U. S. Patki  Department of Computer Science, Science College Nanded, Maharashtra, India
  • Dr. S. B. Kishor  Department of Computer Science, S.P. College, Chandrapur, Maharashtra, India
  • Dr. P. G. Khot  Ex-Professor, Department of Statistics, RSTM Nagpur University, Nagpur, Maharashtra, India

Keywords:

Document Clustering, Soft Computing, Feature Selection, Feature Reduction, Fuzzy C-Means.

Abstract

Document clustering is a method of grouping similar documents into clusters. Fuzzy document clustering is a soft computing technique for this task that permits overlap, i.e. a single document may belong to multiple clusters. Feature selection and feature extraction are the most important phases of the clustering process, and different feature reduction methods have been proposed in the literature. In this paper we propose a feature reduction method based on feature frequency and feature length. Features are chosen according to their number of occurrences in a set of N documents, and the length of each feature is also taken into account. Finally, we apply the fuzzy C-Means clustering algorithm to cluster the N documents into K clusters.
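The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the frequency threshold, the length threshold, or how documents are vectorised, so everything specific here is an assumption for illustration only: the function names `select_features` and `fuzzy_c_means`, the parameters `min_count` and `min_len`, the use of raw term counts as document vectors, and the fuzzifier `m = 2.0` (a common default for fuzzy C-Means, not a value taken from the paper).

```python
import random
import re
from collections import Counter


def select_features(docs, min_count=2, min_len=4):
    """Keep terms that occur at least min_count times across all documents
    and are at least min_len characters long (both thresholds are assumed)."""
    tokens = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    counts = Counter(t for doc in tokens for t in doc)
    vocab = sorted(t for t, c in counts.items()
                   if c >= min_count and len(t) >= min_len)
    # Represent each document by raw term counts over the reduced vocabulary.
    vectors = [[doc.count(t) for t in vocab] for doc in tokens]
    return vocab, vectors


def fuzzy_c_means(X, k, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy C-Means; returns (memberships, centres).
    memberships[i][j] is the degree to which document i belongs to cluster j."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Random initial membership matrix whose rows sum to 1.
    U = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(k)]
        s = sum(row)
        U.append([u / s for u in row])
    centres = []
    for _ in range(iters):
        # Centre j is the mean of all points weighted by u_ij ** m.
        centres = []
        for j in range(k):
            w = [U[i][j] ** m for i in range(n)]
            total = sum(w)
            centres.append([sum(w[i] * X[i][f] for i in range(n)) / total
                            for f in range(d)])
        # u_ij = 1 / sum_l (d_ij / d_il) ** (2 / (m - 1))
        new_U = []
        for i in range(n):
            dist = [max(sum((X[i][f] - c[f]) ** 2 for f in range(d)) ** 0.5,
                        1e-12) for c in centres]
            new_U.append([
                1.0 / sum((dist[j] / dist[l]) ** (2.0 / (m - 1.0))
                          for l in range(k))
                for j in range(k)
            ])
        change = max(abs(new_U[i][j] - U[i][j])
                     for i in range(n) for j in range(k))
        U = new_U
        if change < tol:
            break
    return U, centres


# Toy corpus, invented for illustration only.
docs = [
    "fuzzy clustering groups similar documents",
    "document clustering with fuzzy membership",
    "football match score and football players",
    "players score goals in the football match",
]
vocab, X = select_features(docs, min_count=2, min_len=5)
U, centres = fuzzy_c_means(X, k=2)
print(vocab)
for doc, row in zip(docs, U):
    print([round(u, 3) for u in row], doc)
```

Each row of `U` sums to 1, so a document that shares vocabulary with two topics receives partial membership in both clusters instead of being forced into one. Counting occurrences per document rather than in total, or weighting features by length instead of filtering on it, would be equally plausible readings of the abstract.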

Published

2018-02-28

Section

Research Articles

How to Cite

[1]
U. S. Patki, Dr. S. B. Kishor, Dr. P. G. Khot, "Fuzzy Document Clustering based on Frequent Features and Feature Length", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 1, pp. 1418-1422, January-February 2018.