Development of Naïve Method to Analyse the Road Accidents Based on Data Mining Techniques

Authors

  • Atul Pandey  Sheat College of Engineering, Varanasi, Uttar Pradesh, India
  • Virendra Pratap Yadav  Sheat College of Engineering, Varanasi, Uttar Pradesh, India

Keywords:

Hierarchical Clustering, K-modes Clustering, Self-Organizing Map, BIRCH, Support Vector Machine, Latent Class Clustering, Naïve Bayes

Abstract

“Road accidents are a widespread calamity with a continually growing trend. According to the Indian Road Safety Campaign, a traffic accident occurs in India every minute, and about 17 people die in road accidents each hour. There are several types of car accidents, such as rear-end, head-on, and rollover accidents. State police reports, or First Information Reports (FIRs), are the records that provide information about the accidents. An event may be self-reported by the individuals involved or documented by the state police. In this research, Apriori and Naïve Bayesian approaches are used to predict recurrent patterns of road accidents. The government or non-profit organizations might use these patterns to enhance road safety and implement preventative measures in high-accident areas. From 2020 to 2021, a total of 11,574 accidents happened on the roads in the Dehradun district. Based on the variables accident type, road type, lighting on road, and road feature, K-modes clustering found six clusters (C1–C6). Association rule mining was used to construct rules for each cluster and for the entire data set (EDS). Only rules with high lift values are used in the study. The rules for each cluster make it possible to learn about the causes of accidents in that cluster. Comparing them with the EDS-generated rules reveals that the EDS-generated rules do not give relevant information that may be linked to an accident. If additional features linked with an accident are available, more information may be discovered. We also performed monthly and hourly trend analyses of all clusters and the EDS to reinforce our technique. The trends indicate that clustering before analysis helps locate better and more useful outcomes that would not otherwise be found.”
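
The two steps named in the abstract, K-modes clustering on categorical accident attributes followed by scoring association rules by lift, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the records in `RECORDS` are invented toy data, the function names `k_modes` and `lift` are hypothetical, the initial modes are hand-picked, and a single hand-written rule is scored instead of running Apriori over all candidate rules.

```python
from collections import Counter

# Hypothetical toy records (NOT the Dehradun data):
# (accident type, road type, lighting on road, road feature)
RECORDS = [
    ("rear-end", "highway", "daylight", "straight"),
    ("rear-end", "highway", "daylight", "curve"),
    ("rear-end", "highway", "dark", "straight"),
    ("head-on", "highway", "daylight", "straight"),
    ("rollover", "hill road", "dark", "curve"),
    ("rollover", "hill road", "dark", "curve"),
    ("head-on", "hill road", "dark", "curve"),
    ("rollover", "hill road", "daylight", "curve"),
]


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, initial_modes, max_iter=10):
    """Plain K-modes: assign each record to its nearest mode, then reset
    each mode to the most frequent value per attribute, until stable."""
    modes = list(initial_modes)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
            for r in records
        ]
        new_modes = []
        for j, old_mode in enumerate(modes):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if not members:
                new_modes.append(old_mode)  # keep the mode of an empty cluster
                continue
            new_modes.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_modes == modes:
            break
        modes = new_modes
    return labels, modes


def lift(records, antecedent, consequent):
    """lift(A -> B) = support(A and B) / (support(A) * support(B)).
    A and B are sets of (column index, value) pairs.
    Returns None when A or B never occurs, since lift is then undefined."""
    n = len(records)

    def support(items):
        return sum(all(r[i] == v for i, v in items) for r in records) / n

    denominator = support(antecedent) * support(consequent)
    if denominator == 0:
        return None
    return support(antecedent | consequent) / denominator


if __name__ == "__main__":
    labels, modes = k_modes(RECORDS, [RECORDS[0], RECORDS[4]])
    print("cluster labels:", labels)
    print("cluster modes:", modes)
    # Score one hand-written rule on the whole toy set:
    # {road type = hill road} -> {accident type = rollover}
    rule_lift = lift(RECORDS, {(1, "hill road")}, {(0, "rollover")})
    print("lift on the whole toy set:", rule_lift)
```

A lift above 1 means the antecedent and consequent occur together more often than they would if independent, which is why the study keeps only high-lift rules. To mirror the per-cluster analysis, the same `lift` function can be applied to the subset of records carrying a given cluster label.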

Published

2022-07-30

Section

Research Articles

How to Cite

[1]
Atul Pandey, Virendra Pratap Yadav, "Development of Naïve Method to Analyse the Road Accidents Based on Data Mining Techniques", International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), ISSN: 2456-3307, Volume 8, Issue 4, pp. 244-255, July-August 2022.