Using Topic Modelling Approach for Discovery of Anomalous Cluster in High Dimensional Discrete Data

Authors

  • Gajanan Patle  PG Scholar, Department of Computer Science & Engineering, Abha-Gaikwad Patil College of Engineering, Nagpur, Maharashtra, India
  • Ajinkya S. Gujarkar  Assistant Professor, Department of Computer Science & Engineering, Abha-Gaikwad Patil College of Engineering, Nagpur, Maharashtra, India
  • Ektaa Meshram  PG Scholar, Department of Computer Science & Engineering, Abha-Gaikwad Patil College of Engineering, Nagpur, Maharashtra, India

Keywords:

ATD, BTM, hopeful documents, Biterm Topic Modeling

Abstract

Anomaly detection is an important problem in many research areas. An anomaly is a pattern that does not conform to normal behavior; anomalies are also referred to as outliers, exceptions, surprises, and so on. They arise in real-time applications such as fraud detection and cyber-intrusion detection. Many anomaly detection methods have been proposed, but most are only capable of detecting individual anomalies. In this paper we propose the ATD algorithm to detect clusters of anomalies. Individual anomaly detection methods fail to detect atypical patterns that appear on only a salient subset of a very high-dimensional feature space. The proposed algorithm consists of two stages. The first is the training stage, in which we learn a BTM as our null model M0 for generating all documents in the test set. The second is the detection stage, in which we use a document-bootstrapping algorithm to cluster the candidate ("hopeful") documents (S) in the test set.
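The two-stage structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: a smoothed unigram word distribution stands in for the BTM null model M0, and a simple ranking by average log-likelihood stands in for the document-bootstrapping clustering step. The function names (`fit_null_model`, `avg_log_likelihood`, `candidate_documents`), the parameters `alpha` and `k`, and the toy documents are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def fit_null_model(train_docs, alpha=1.0):
    """Stage 1 (stand-in for BTM): smoothed unigram distribution over training words."""
    counts = Counter(w for doc in train_docs for w in doc)
    total = sum(counts.values())
    # Reserve one extra smoothing slot so unseen words get non-zero probability.
    denom = total + alpha * (len(counts) + 1)
    probs = {w: (c + alpha) / denom for w, c in counts.items()}
    unseen = alpha / denom
    return probs, unseen


def avg_log_likelihood(doc, model):
    """Average per-word log-likelihood of a non-empty document under the null model."""
    probs, unseen = model
    return sum(math.log(probs.get(w, unseen)) for w in doc) / len(doc)


def candidate_documents(test_docs, model, k):
    """Stage 2 (stand-in): indices of the k test documents least likely under the null."""
    scores = [avg_log_likelihood(d, model) for d in test_docs]
    return sorted(range(len(test_docs)), key=lambda i: scores[i])[:k]


# Toy data (hypothetical): the training set contains only "normal" finance documents.
train = [["stock", "market", "price"], ["market", "trade", "price"]]
test = [["market", "price"], ["virus", "outbreak", "fever"], ["trade", "stock"]]

model = fit_null_model(train)
S = candidate_documents(test, model, k=1)
print(S)
```

In this toy run the second test document consists entirely of words unseen in training, so it has the lowest likelihood under the null model and is the one selected as a candidate. A fuller treatment would replace both stand-ins with the BTM and bootstrapping procedures the paper names.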

Published

2019-04-30

Section

Research Articles

How to Cite

[1]
Gajanan Patle, Ajinkya S. Gujarkar, Ektaa Meshram, "Using Topic Modelling Approach for Discovery of Anomalous Cluster in High Dimensional Discrete Data", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 5, Issue 2, pp. 1242-1250, March-April 2019.