Optimizing Mental Health Detection in Indian Armed Forces Personnel through Feature Engineering Driven Dataset Reduction, Addressing Suicide, Depression, and Stress

Authors

  • Sudipto Roy, Research Scholar, Department of Computer Science, Shri Vaishnav Vidyapeeth Vishwavidyalaya, Indore, India
  • Jigyasu Dubey, Head of Department of Computer Science, Shri Vaishnav Vidyapeeth Vishwavidyalaya, Indore, India

Keywords:

Machine Learning, Psychometric Test, Feature Engineering, Exploratory Data Analysis, Dimensionality Reduction, Principal Component Analysis

Abstract

The quality of a dataset strongly influences the performance of machine learning models. This research offers a guide to improving the accuracy and efficiency of dataset construction by combining multivariate reduction techniques with feature engineering strategies, implemented in the Python ecosystem. As datasets grow more diverse and complex, optimizing precision becomes more critical. The study examines dimensionality reduction methods, such as Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE), alongside feature selection approaches, to streamline datasets while preserving vital information. In conjunction with these reduction techniques, it introduces novel feature engineering methods intended to increase the discriminative power of the remaining features, including polynomial feature creation, interaction term generation, and domain-specific transformation functions. Practical implementations of these techniques are demonstrated in Python. Empirical evaluations on real-world datasets indicate that the proposed methodology achieves higher accuracy and efficiency than conventional dataset construction approaches. The result is a generic framework for enhancing precision in datasets and a practical guide for researchers and practitioners seeking to optimize precision in machine learning applications.
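
As a rough illustration of two of the ideas named in the abstract, the sketch below drops an uninformative feature and then generates squared and interaction terms from the features that remain. It is not the authors' implementation: the function names, the variance-based selection rule, and the toy scores are all assumptions made for this example, and it uses only the Python standard library. In practice, PCA and t-SNE would more likely come from a library such as scikit-learn.

```python
from itertools import combinations_with_replacement
from statistics import pvariance


def drop_low_variance(rows, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold.

    Returns the reduced rows and the indices of the columns that were kept.
    """
    columns = list(zip(*rows))
    kept = [idx for idx, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[idx] for idx in kept] for row in rows]
    return reduced, kept


def expand_degree2(rows):
    """Append every squared term and pairwise interaction term to each row."""
    n_features = len(rows[0])
    pairs = list(combinations_with_replacement(range(n_features), 2))
    return [list(row) + [row[i] * row[j] for i, j in pairs] for row in rows]


# Hypothetical toy data: three respondents, three questionnaire scores.
# The third score is identical for everyone, so it carries no information.
scores = [
    [1.0, 2.0, 5.0],
    [2.0, 0.0, 5.0],
    [3.0, 1.0, 5.0],
]

reduced, kept = drop_low_variance(scores)   # removes the constant column
expanded = expand_degree2(reduced)          # adds x0*x0, x0*x1, x1*x1

print(kept)      # indices of the retained original columns
print(expanded)  # original retained features followed by degree-2 terms
```

Selecting before expanding keeps the number of generated terms small, since degree-2 expansion grows quadratically with the number of input features.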

Published

06-03-2024

Section

Research Articles

How to Cite

[1]
S. Roy and J. Dubey, “Optimizing Mental Health Detection in Indian Armed Forces Personnel through Feature Engineering Driven Dataset Reduction, Addressing Suicide, Depression, and Stress”, Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol, vol. 10, no. 2, pp. 70–81, Mar. 2024, Accessed: May 09, 2024. [Online]. Available: http://ijsrcseit.com/index.php/home/article/view/CSEIT241026
