Psychometric properties of the 18-item Indonesian Mental Toughness Questionnaire using the Rasch model and Machine Learning

Authors

Yudiarso, A., Ardhiani, I. W., Surya, R., Watimena, F. Y., & Kanzaki, M.

DOI:

https://doi.org/10.21580/pjpp.v10i1.23214

Keywords:

differential item functioning, gradient boosting classification, Mental Toughness Questionnaire, rating scale model, Wright Map

Abstract

The psychometric properties of the Indonesian version of the 18-item Mental Toughness Questionnaire (MTQ-18) remain unclear. This study uses the Rasch model to examine these properties. In addition, gradient boosting classification was used to assess how well the questionnaire predicts athletes’ achievement. The sample comprised 400 athletes. The Martin-Löf likelihood-ratio test (LR statistic = 482, p = 1.0) and a factor analysis of the Rasch residuals suggest that the questionnaire tends to meet the unidimensionality assumption. An MADaQ3 of 0.074 indicates an overall tendency toward local independence across all items, with the majority of items clustered at moderate to low levels of the measure. Items Q11, Q15, and Q18 were clearly identified as showing gender bias, with significant effect sizes. In the boosting classification, predictive performance was unsatisfactory both for national vs. no achievement (F1 = 0.70, AUC = 0.56) and for international vs. no achievement (F1 = 0.62, AUC = 0.58). In conclusion, the abridged questionnaire is not recommended for predicting an individual’s future performance or achievement. Future studies are needed to develop a version that is less affected by gender bias and to address the variability of the items.
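As a rough illustration of why an F1 score near 0.7 can coexist with an AUC close to 0.5, the sketch below computes both metrics on a small synthetic example. The labels, scores, class proportions, and the 0.5 decision threshold are hypothetical assumptions chosen for illustration; they are not data or settings from the study, and the helper functions are not the authors' implementation.

```python
# Illustrative only: synthetic labels and scores, not data from the study.

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, with ties counted as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical sample: 6 athletes with an achievement (1), 4 without (0).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

# An uninformative model: every athlete receives the same score.
scores = [0.6] * 10

# Assumed decision threshold of 0.5, so everyone is predicted as "achiever".
y_pred = [1 if s >= 0.5 else 0 for s in scores]

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"F1 = {f1:.2f}, AUC = {auc:.2f}")  # F1 = 0.75, AUC = 0.50
```

In this toy case the model carries no information about achievement, yet F1 reaches 0.75 because the positive class is the majority. This is one reason AUC values only slightly above 0.5, as reported in the abstract, are read as weak discrimination even when F1 looks moderate.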

Author Biography

Mami Kanzaki, Faculty of Education, Kyoto University of Education, Kyoto

Faculty of Psychology

Published

2025-05-08

How to Cite

Yudiarso, A., Ardhiani, I. W., Surya, R., Watimena, F. Y., & Kanzaki, M. (2025). Psychometric properties of the 18-item Indonesian Mental Toughness Questionnaire using the Rasch model and Machine Learning. Psikohumaniora: Jurnal Penelitian Psikologi, 10(1). https://doi.org/10.21580/pjpp.v10i1.23214

Section

Articles
