Complaint Classification Using Support Vector Machine for Indonesian Text Dataset

Authors

  • Desi Ramayanti  Faculty of Computer Science, Universitas Mercu Buana, Jakarta Barat, Indonesia
  • Umniy Salamah  

Keywords:

Complaint Classification, Indonesian Text, Support Vector Machine

Abstract

Text classification assigns text data to categories, for example to extract information from news stories and social media posts that data owners can use. Because manual classification is time-consuming and difficult, many studies have addressed this research area. However, most of them focus on English text classification. This research classifies an Indonesian text dataset using Support Vector Machine (SVM) classifiers, implemented in the Python programming language with the scikit-learn library. In the experiment without cross-validation or parameter tuning, the SVM classifier achieved an accuracy of 0.89473, with precision and recall of 0.90289 and 0.89473, respectively. The kappa value (K) for the SVM classifier was 0.78992, which places the strength of agreement in the "good" category. A further experiment used cross-validation with k = 5 and k = 10 folds and tuned the C constant and the gamma value. Cross-validation with k = 10 gave the best accuracy, 0.9648, but required 40.118 seconds of computation time. Finally, we ran an experiment to find the best kernel function among Sigmoid, Linear, and RBF; the Sigmoid kernel achieved the best accuracy and computation time.
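
The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the toy Indonesian sentences, the TF-IDF features, the weighted averaging of precision and recall, and the C and gamma grid are all hypothetical choices made for the example. Only the use of Python, scikit-learn, an SVM classifier, cross-validation, C and gamma tuning, and the three kernels comes from the abstract.

```python
import time

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             precision_score, recall_score)
from sklearn.model_selection import (GridSearchCV, cross_val_score,
                                     train_test_split)
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus (NOT the paper's dataset).
complaints = [
    "layanan internet sangat lambat dan sering putus",
    "saya kecewa karena pesanan belum dikirim",
    "petugas tidak ramah dan pelayanan buruk",
    "tagihan listrik salah dan terlalu mahal",
    "aplikasi sering error dan tidak bisa dibuka",
    "jalan rusak parah belum juga diperbaiki",
    "air mati sejak kemarin tanpa pemberitahuan",
    "sampah menumpuk dan tidak pernah diangkut",
    "antrian sangat lama dan tidak teratur",
    "barang yang diterima rusak dan cacat",
]
others = [
    "terima kasih atas pelayanan yang cepat",
    "jadwal kuliah semester baru sudah diumumkan",
    "cuaca hari ini cerah dan sejuk",
    "pendaftaran seminar dibuka mulai besok",
    "petugas sangat ramah dan membantu",
    "perpustakaan buka setiap hari kerja",
    "selamat kepada para pemenang lomba",
    "informasi beasiswa tersedia di situs kampus",
    "acara wisuda berlangsung dengan lancar",
    "kantin baru menyediakan menu sehat",
]
texts = complaints + others
labels = [1] * len(complaints) + [0] * len(others)  # 1 = complaint

# Experiment 1: single train/test split, default SVC parameters.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0)
model = make_pipeline(TfidfVectorizer(), SVC())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
# Weighted averaging is an assumption; the abstract does not state it.
precision = precision_score(y_test, y_pred, average="weighted", zero_division=0)
recall = recall_score(y_test, y_pred, average="weighted", zero_division=0)
kappa = cohen_kappa_score(y_test, y_pred)  # agreement beyond chance
print(accuracy, precision, recall, kappa)

# Experiment 2: k-fold cross-validation (k = 5 here; the toy corpus also
# allows k = 10) and a grid search over C and gamma (assumed grid values).
cv_scores = cross_val_score(model, texts, labels, cv=5)
search = GridSearchCV(
    model,
    param_grid={"svc__C": [0.1, 1, 10], "svc__gamma": [0.01, 0.1, 1]},
    cv=5,
)
search.fit(texts, labels)
print(cv_scores.mean(), search.best_params_)

# Experiment 3: compare kernels by mean CV accuracy and wall-clock time.
kernel_results = {}
for kernel in ("sigmoid", "linear", "rbf"):
    start = time.perf_counter()
    scores = cross_val_score(
        make_pipeline(TfidfVectorizer(), SVC(kernel=kernel)),
        texts, labels, cv=5)
    kernel_results[kernel] = (scores.mean(), time.perf_counter() - start)
print(kernel_results)
```

With weighted averaging, recall is mathematically equal to accuracy, which is consistent with the identical accuracy and recall values reported above, although the abstract does not say which averaging was used. Scores on this toy corpus are not meaningful and should not be compared with the reported results.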

Published

2018-09-30

Section

Research Articles

How to Cite

[1]
Desi Ramayanti, Umniy Salamah, "Complaint Classification Using Support Vector Machine for Indonesian Text Dataset," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 3, Issue 7, pp. 179-184, September-October 2018.