ISSN : 2233-6710(Print)
ISSN : 2384-2121(Online)
Journal of Asia Pacific Counseling Vol.10 No.2 pp.79-98

Machine Learning Analysis of the MMPI-2 Items for Gender Identity

Seong-Hyeon Kim1, Shant Rising1, Rachel Green1, Caleb Sin1
1 Fuller Theological Seminary
Corresponding Author
Seong-Hyeon Kim, Graduate School of Psychology, Fuller Theological Seminary, Pasadena, CA, 91101
Acknowledgement: The authors thank Drs. Alex Caldwell, Roger Greene, and David Nichols for sharing the Clinical Caldwell Dataset.


The authors applied machine learning (ML) to the item responses of 44,846 MMPI-2 (Butcher et al., 1989) profiles to identify important predictors of gender identity, drawing on the capacity of ML algorithms to learn and recognize structural relationships in data without being explicitly programmed or guided by prior hypotheses (Samuel, 1959). Several ML algorithms, including XGBoost and deep neural networks, were trained on a training set using 5-fold cross-validation to predict each profile’s gender from its item responses. The algorithms’ predictions were then compared with each profile’s reported gender, a proxy variable for gender identity. Prediction accuracy on the test set ranged from 96.09% to 97.06%. The majority of the 20 most important item responses for gender prediction identified by ML belonged to the seven-item Feminine Gender Identity scale in Martin and Finn (2010), who studied it using factor analysis and expert judgment. This convergence supports the validity and usefulness of ML for psychological research.
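The workflow summarized above (5-fold cross-validation on a training set, accuracy on a held-out test set, then a ranking of item importance) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the data are synthetic True/False responses rather than MMPI-2 profiles, a small Bernoulli naive Bayes classifier stands in for XGBoost and deep neural networks, and permutation importance stands in for whatever importance measure the authors used, which the abstract does not specify.

```python
import math
import random

def make_synthetic(n, m, informative, rng):
    """Hypothetical data: n profiles of m True/False items plus a binary label."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = []
        for j in range(m):
            # Informative items are endorsed more often when label == 1; the rest are noise.
            p = (0.8 if label else 0.2) if j in informative else 0.5
            row.append(1 if rng.random() < p else 0)
        X.append(row)
        y.append(label)
    return X, y

def fit_nb(X, y):
    """Fit a Bernoulli naive Bayes model with Laplace smoothing (stand-in classifier)."""
    m = len(X[0])
    model = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X, y) if t == c]
        log_prior = math.log((len(rows) + 1) / (len(X) + 2))
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2) for j in range(m)]
        model[c] = (log_prior, probs)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior score."""
    scores = {}
    for c, (log_prior, probs) in model.items():
        scores[c] = log_prior + sum(math.log(p if xj else 1 - p) for xj, p in zip(x, probs))
    return max(scores, key=scores.get)

def accuracy(model, X, y):
    return sum(predict(model, x) == t for x, t in zip(X, y)) / len(y)

def cross_validate(X, y, k=5):
    """Mean validation accuracy over k folds of the training set."""
    idx = list(range(len(X)))
    accs = []
    for i in range(k):
        held = set(idx[i::k])  # every k-th profile forms the validation fold
        model = fit_nb([X[j] for j in idx if j not in held],
                       [y[j] for j in idx if j not in held])
        accs.append(accuracy(model, [X[j] for j in held], [y[j] for j in held]))
    return sum(accs) / k

def permutation_importance(model, X, y, rng):
    """Accuracy drop when a single item's responses are shuffled across profiles."""
    base = accuracy(model, X, y)
    drops = []
    for j in range(len(X[0])):
        col = [x[j] for x in X]
        rng.shuffle(col)
        X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
        drops.append(base - accuracy(model, X_perm, y))
    return drops

rng = random.Random(0)
informative = {0, 1, 2, 3, 4}  # hypothetical indices of label-related items
X, y = make_synthetic(3000, 20, informative, rng)
X_train, y_train = X[:2400], y[:2400]  # training set used for cross-validation
X_test, y_test = X[2400:], y[2400:]    # held-out test set

cv_acc = cross_validate(X_train, y_train, k=5)
model = fit_nb(X_train, y_train)
test_acc = accuracy(model, X_test, y_test)
drops = permutation_importance(model, X_test, y_test, rng)
top_items = sorted(range(len(drops)), key=lambda j: drops[j], reverse=True)[:5]

print(f"5-fold CV accuracy: {cv_acc:.3f}")
print(f"Test accuracy: {test_acc:.3f}")
print("Most important items:", top_items)
```

In this toy setting the top-ranked items should largely coincide with the items that were built to carry signal. That mirrors the logic of the abstract's final comparison, in which the items ranked highest by ML are checked against a scale identified independently by other methods.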