Resolving Incidents and Alerts in AIOps with Predictive Analytics

Authors

  • Satyanarayana Murthy Polisetty, Jawaharlal Nehru Technological University, Kakinada, India

DOI:

https://doi.org/10.32628/CSEIT23902182

Keywords:

IBM Cloud Pak, AIOps, Predictive Analytics, Incident Management

Abstract

Incident and alert management in IT operations has traditionally been reactive, but with the integration of AI, systems can now resolve issues before they escalate. This article explores the capabilities of IBM Cloud Pak for AIOps 4.4.0 for predicting, managing, and automating incident resolution. It discusses how predictive analytics identifies incidents from historical patterns, with detection triggered by anomalies such as changes in system metrics or event logs. The core focus is AI-powered incident management, which anticipates incidents before they occur using trend analysis and metrics. A novel aspect discussed is the automatic resolution of incidents through runbooks that apply predefined policies and actions, thereby reducing manual intervention and improving response times. The article also suggests incorporating AI-based feedback loops into incident resolution, in which each resolved incident feeds data back into the system to refine predictions for future incidents, enhancing the overall robustness of AIOps solutions.
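
As a rough illustration of the idea described in the abstract (an anomaly in a system metric triggering a predefined runbook action), the following minimal Python sketch flags metric samples that deviate sharply from their recent history and maps the affected metric to an action. It is not the IBM Cloud Pak for AIOps API or the article's method: the trailing-window z-score detector, the function names, the `RUNBOOKS` table, the action names, and the sample data are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Return indices of samples that deviate from the mean of the
    preceding `window` samples by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as an anomaly.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical policy table mapping a metric name to a runbook action.
RUNBOOKS = {
    "cpu_percent": "restart_service",
    "disk_used_percent": "purge_temp_files",
}


def select_runbook(metric_name):
    """Pick the predefined action for a metric, or escalate if none is defined."""
    return RUNBOOKS.get(metric_name, "escalate_to_operator")


# Made-up CPU utilisation samples with one spike at index 6.
cpu = [41, 43, 42, 44, 42, 43, 95, 44, 43]
anomalous = detect_anomalies(cpu)
actions = [(i, select_runbook("cpu_percent")) for i in anomalous]
print(actions)
```

In a fuller system, the outcome of each triggered action could be recorded and used to adjust `window` or `threshold`, which is one simple way to picture the feedback loop the abstract proposes.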

Published

2022-10-30

Section

Research Articles

How to Cite

[1]
Satyanarayana Murthy Polisetty, "Resolving Incidents and Alerts in AIOps with Predictive Analytics," International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 8, Issue 5, pp. 375-387, September-October 2022. Available at doi: https://doi.org/10.32628/CSEIT23902182