Machine Learning Analysis of the MMPI-2 Items for Gender Identity

Seong-Hyeon Kim¹, Shant Rising¹, Rachel Green¹, Caleb Sin¹

¹ Fuller Theological Seminary

Email: shkim@fuller.edu

Corresponding Author
Seong-Hyeon Kim, Graduate School of Psychology, Fuller Theological Seminary, Pasadena, CA, 91101
Acknowledgement: The authors thank Drs. Alex Caldwell, Roger Greene, and David Nichols for sharing the Clinical Caldwell Dataset.

Abstract

The authors applied machine learning (ML) to the item responses of 44,846 MMPI-2 (Butcher et al., 1989) profiles to identify important predictors of gender identity, drawing on ML algorithms’ capacity to learn and recognize structural relationships in data without being explicitly programmed or guided by prior hypotheses (Samuel, 1959). Several ML algorithms, including XGBoost and deep neural networks, were trained on a training set with 5-fold cross-validation to predict each profile’s gender from its item responses. The predictions were then compared with each profile’s reported gender, which served as a proxy variable for gender identity. Prediction accuracy on the test set ranged from 96.09% to 97.06%. The majority of the 20 most important item responses for gender prediction identified by ML belonged to the seven-item Feminine Gender Identity scale in Martin and Finn (2010), who studied it using factor analysis and expert judgment. This convergence demonstrates the validity and usefulness of ML for psychological research.
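The workflow summarized in the abstract (a held-out test set, 5-fold cross-validation on the training set, test-set accuracy, and a ranking of the most predictive items) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic true/false responses rather than MMPI-2 items, the 80/20 split is arbitrary, the classifier is a small Bernoulli naive Bayes standing in for XGBoost and the deep neural networks, and the importance measure is a simple gap in endorsement rates rather than the importance scores used in the study.

```python
import math
import random


def make_synthetic(n=600, n_items=30, n_informative=5, seed=0):
    """Synthetic true/false item responses (hypothetical, NOT MMPI-2 data)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = []
        for j in range(n_items):
            # Only the first n_informative items depend on the label.
            if j < n_informative:
                p = 0.8 if label == 1 else 0.2
            else:
                p = 0.5
            row.append(1 if rng.random() < p else 0)
        X.append(row)
        y.append(label)
    return X, y


def fit_nb(X, y):
    """Bernoulli naive Bayes with Laplace smoothing (stand-in classifier)."""
    n_items = len(X[0])
    counts = {0: [1] * n_items, 1: [1] * n_items}  # smoothed endorsement counts
    totals = {0: 2, 1: 2}                          # smoothed class sizes
    for row, label in zip(X, y):
        totals[label] += 1
        for j, v in enumerate(row):
            counts[label][j] += v
    probs = {c: [counts[c][j] / totals[c] for j in range(n_items)] for c in (0, 1)}
    priors = {c: totals[c] / (totals[0] + totals[1]) for c in (0, 1)}
    return probs, priors


def predict_nb(model, row):
    """Return the class with the higher log posterior score."""
    probs, priors = model
    scores = {}
    for c in (0, 1):
        s = math.log(priors[c])
        for j, v in enumerate(row):
            p = probs[c][j]
            s += math.log(p if v else 1 - p)
        scores[c] = s
    return 1 if scores[1] > scores[0] else 0


def accuracy(model, X, y):
    return sum(predict_nb(model, r) == t for r, t in zip(X, y)) / len(y)


def kfold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


X, y = make_synthetic()

# Hold out 20% of profiles as a test set (the ratio is an assumption).
n_test = len(X) // 5
X_test, y_test = X[:n_test], y[:n_test]
X_train, y_train = X[n_test:], y[n_test:]

# 5-fold cross-validation on the training set only.
cv_scores = []
for fold in kfold_indices(len(X_train), k=5):
    held_out = set(fold)
    fit_idx = [i for i in range(len(X_train)) if i not in held_out]
    model = fit_nb([X_train[i] for i in fit_idx], [y_train[i] for i in fit_idx])
    cv_scores.append(
        accuracy(model, [X_train[i] for i in fold], [y_train[i] for i in fold])
    )

# Refit on the full training set, then score once on the untouched test set.
final_model = fit_nb(X_train, y_train)
test_acc = accuracy(final_model, X_test, y_test)

# Crude item importance: absolute gap in endorsement rates between groups.
probs, _ = final_model
ranked = sorted(
    range(len(probs[0])),
    key=lambda j: abs(probs[1][j] - probs[0][j]),
    reverse=True,
)
top_items = ranked[:5]

print("CV accuracy per fold:", [round(s, 3) for s in cv_scores])
print("Test accuracy:", round(test_acc, 3))
print("Top items by endorsement gap:", top_items)
```

In this sketch the test set plays no part in model fitting or cross-validation, so the test accuracy is an estimate on unseen profiles, which is the role the test set serves in the abstract.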
