Assessment and Prediction of Activities of Daily Living Using Machine Learning Methods and Their Actuarial Applications in Insurance

Authors

  • Pengyu Liu

DOI:

https://doi.org/10.55014/pij.v8i6.916

Keywords:

Long-Term Care Insurance; Activities of Daily Living (ADL); Machine Learning; Logistic Regression; XGBoost; Random Forest

Abstract

With the accelerating aging of the global population, precise pricing of Long-Term Care Insurance (LTCI) urgently requires accurate assessment of an individual's Activities of Daily Living (ADL). Traditional actuarial methods that rely on linear models struggle to capture complex nonlinear relationships. To address this issue, this study systematically compares the performance of three typical machine learning models (Logistic Regression, XGBoost, and Random Forest) in multiclass prediction of ADL and explores their feasibility for insurance premium rate calibration.

The research is based on four waves of cross-sectional data (2015–2020) from the China Health and Retirement Longitudinal Study (CHARLS). After rigorous data cleaning (final valid sample: 58,790 entries) and an 8:1:1 split into training, validation, and test sets, 12 independent variables, including age, mental health score, and household size, were selected to construct the models.
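
As an illustration of the 8:1:1 partition described above, the following is a minimal sketch, not the author's code. The function name `split_811`, the fixed seed, and the use of a plain random shuffle are assumptions; the abstract does not state whether the split was stratified or how it was seeded.

```python
import random


def split_811(n_rows, seed=0):
    """Return shuffled row-index lists for an 8:1:1 train/validation/test split.

    Hypothetical helper: the shuffle and seed are illustrative assumptions.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_train = n_rows * 8 // 10            # 80% of rows for training
    n_val = n_rows // 10                  # 10% of rows for validation
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remaining rows form the test set
    return train, val, test


# 58,790 is the final valid sample size reported in the abstract.
train_idx, val_idx, test_idx = split_811(58790)
print(len(train_idx), len(val_idx), len(test_idx))  # 47032 5879 5879
```

With 58,790 rows this gives 47,032 training rows and 5,879 rows each for validation and testing.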

Research methods included model construction, hyperparameter tuning, feature importance analysis (based on absolute coefficient values, weighted gain, and mean decrease in impurity), and feature quantity optimization. The results indicate:

(1) The XGBoost model demonstrated the best generalization capability (test set accuracy: 0.7997), significantly outperforming the severely overfitted Random Forest (test set accuracy: 0.7606) and the weakest-performing Logistic Regression (test set accuracy < 0.6);
(2) Feature importance analysis consistently identified age as the most critical predictor, with mental health score and self-rated health (particularly significant in XGBoost) also having substantial influence;
(3) After feature optimization, XGBoost achieved optimal performance and strong robustness with seven core features (including age, mental health, and self-rated health), while Logistic Regression and Random Forest required fewer and more features, respectively, with inferior results.

Accordingly, this study recommends prioritizing the XGBoost model for ADL risk assessment and premium rate calibration in LTCI actuarial practice. Its excellent predictive accuracy, generalization ability, and effective identification of key risk factors (age, mental health, self-rated health) can provide reliable data-driven support for developing fairer and more accurate insurance products.
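
The feature quantity optimization mentioned above can be pictured as scoring the top-k features of an importance ranking for increasing k and keeping the best k. The sketch below is one plausible reading, not the procedure confirmed by the abstract. The helper `best_feature_count`, the placeholder feature names `f4` to `f6`, and the toy scoring function are all hypothetical; in practice `score_fn` would train a model on the given features and return its validation accuracy.

```python
def best_feature_count(ranked_features, score_fn):
    """Score the top-k ranked features for k = 1..n and return (best_k, best_score).

    Hypothetical helper. Ties are resolved in favour of the smaller k.
    """
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked_features) + 1):
        score = score_fn(ranked_features[:k])  # e.g. validation accuracy
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


# Toy ranking: the first three names come from the abstract, the rest are
# placeholders. The scorer is made up and is NOT data from the study.
ranking = ["age", "mental_health", "self_rated_health", "f4", "f5", "f6"]


def toy_score(features):
    return -abs(len(features) - 3)  # artificial score that peaks at 3 features


best_k, best_score = best_feature_count(ranking, toy_score)
print(best_k, ranking[:best_k])
```

Preferring the smaller k on ties keeps the selected model as simple as the score allows.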

Published

2025-12-20

How to Cite

Liu, P. (2025). Assessment and Prediction of Activities of Daily Living Using Machine Learning Methods and Their Actuarial Applications in Insurance. Pacific International Journal, 8(6), 7–14. https://doi.org/10.55014/pij.v8i6.916

Section

Regular