Introduction

Physical and cognitive functions are fundamentally interconnected through shared neural circuits involving regions like the basal ganglia and cerebellum, integrating sensorimotor and cognitive processes1,2,3. This bidirectional relationship implies that impairments in one domain affect the other, especially in aging populations4. In geriatric populations, this decline manifests as “cognitive frailty (CF),” a syndrome that combines physical frailty (PF) and impaired cognition (IC) without dementia4,5. CF is considered an intermediary phase between normal aging and the onset of severe impairments, such as dementia or physical disability4. Therefore, CF has been increasingly considered as a precursor to accelerated physical and cognitive decline, emphasizing its role as a critical target for early diagnosis and intervention in older adults5,6.

Epidemiological studies and reviews7,8 report CF prevalence ranging from 1% to 50%, reflecting disparities in geographical regions, population characteristics, and assessment methods. Despite this variability, CF has been linked to high risks of adverse health outcomes, such as reduced physical activity, frequent falls, loss of independence, malnutrition, depressive symptoms, and worsening chronic conditions like hypertension and diabetes9,10,11,12. These factors collectively accelerate health decline, increasing healthcare costs and reducing quality of life13,14. Given these adverse outcomes, there is an urgent need to better understand the underlying risk factors contributing to CF and to identify individuals at risk before irreversible decline occurs.

CF can be reversed if detected early and appropriately addressed4. Identifying CF-associated risk factors is critical for developing diagnostic tools and preventive strategies15. Studies have identified key CF factors, including cardiovascular diseases, genetic factors, and systemic inflammation16,17,18,19. Recent systematic reviews and meta-analyses offer a comprehensive approach to identifying CF by assessing motor and cognitive capacities20,21,22. More recently, several studies have attempted to integrate multidomain predictors such as motor function, cognitive reserve, and psychological health to improve CF detection23,24,25. However, these studies have often been limited by narrow variable scopes and by reliance on conventional statistical methods that may not adequately capture the complexity of interacting health domains23,24,25. In addition, the lack of population-specific validation limits the generalizability of their findings to diverse aging groups.

Therefore, this study aimed to (1) identify key CF detection factors by analyzing comprehensive characteristics and (2) develop and validate a machine learning model to determine optimal factors for identifying CF. Machine learning offers a powerful framework for identifying complex, multidimensional patterns that may not be captured by traditional statistical methods26,27,28. In this study, we applied a machine learning-based logistic regression approach to select the most informative predictors of CF and enhance risk stratification.

Results

Participant classification

Table 1 summarizes the PF and IC data of the 2,404 participants. The participants were classified into two groups: CF (18.4%) and non-CF (81.6%). The CF group showed a significantly higher prevalence of PF phenotypes (slowness, weakness, exhaustion, unintentional weight loss, and low physical activity) than the non-CF group (all p < 0.01). The CF group also had significantly lower MMSE scores (p < 0.01).

Table 1 Physical frailty and impaired cognition information of the study participants.

Univariate logistic regression analysis of comprehensive characteristics

Table 2 presents the results of the univariate logistic regression comparing the CF and non-CF groups. The CF group consisted of significantly older participants and had a higher proportion of females than the non-CF group (all p < 0.01). Female sex (p < 0.01) and low education levels (p < 0.05) were significantly associated with CF risk. Among the clinical characteristics, the CF group had significantly higher prevalence rates of peripheral vascular disease, osteoarthritis, osteoporosis, and depression and a significantly greater number of participants on prescription medications than the non-CF group (all p < 0.05). Within the physical health domain of the health status characteristics, all factors, particularly a TUG test time ≥ 10 s (p < 0.01), were significantly associated with CF risk. Within the psychological and nutritional health domains, poor health-related quality of life, high depression scores, difficulties in chewing and pronunciation, low MNA scores, and malnutrition were significantly associated with CF risk (all p < 0.01).

Table 2 Results of the univariate logistic regression analysis comparing the comprehensive characteristics of the cognitive frailty and non-cognitive frailty groups.

Machine learning and optimal feature selection

Table 3 reports the ranking of the 24 features used for optimal feature selection in the machine learning process. Figure 1 shows the performance metrics, including AUC, sensitivity, specificity, and accuracy, as a function of the ranked features. The model’s performance stabilized and became effective when approximately five features were used with an AUC > 80%, while the sensitivity, specificity, and accuracy all remained at > 75%. Notably, further increasing the number of features did not yield significant improvements in model performance. Consequently, the optimal number of features was determined to be six, namely the TUG test time, education level, PF-M, MNA, ABC, and K-ADL. The logistic regression model with these features achieved an AUC of 84.34%, sensitivity of 75.12%, specificity of 80.87%, and accuracy of 79.51%. The logistic regression equation for the model with the optimal features was as follows:

$$\ln\left(\frac{p(X)}{1-p(X)}\right)=\beta_{0}+\beta_{1}X_{1}+\beta_{2}X_{2}+\beta_{3}X_{3}+\beta_{4}X_{4}+\beta_{5}X_{5}+\beta_{6}X_{6}$$

where p(X) represents the predicted probability for the binary outcome (robust = 0, frail = 1), ranging from 0 to 1, and X1, X2, X3, X4, X5, and X6 correspond to the TUG test time, education level, PF-M, MNA, ABC, and K-ADL, respectively. β0 is the intercept (3.7588), while β1–β6 are the corresponding coefficients (β1 = −0.7241, β2 = 0.2510, β3 = −0.0963, β4 = −0.0084, β5 = −0.1626, and β6 = 0.2053, respectively).
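As a minimal sketch of how the fitted equation maps a set of predictor values to a probability, the following Python snippet evaluates the logit and applies the logistic function. The intercept and coefficients are those listed above; the dictionary keys, the function name, and the example input values are hypothetical, and the units and coding of each predictor (for example, how education level is encoded) are assumptions because the text does not specify them.

```python
import math

# Intercept and coefficients as listed in the text (X1..X6 in order).
INTERCEPT = 3.7588
COEFFICIENTS = {
    "tug_time": -0.7241,         # X1: TUG test time
    "education_level": 0.2510,   # X2: education level
    "pf_m": -0.0963,             # X3: PF-M
    "mna": -0.0084,              # X4: MNA
    "abc": -0.1626,              # X5: ABC
    "k_adl": 0.2053,             # X6: K-ADL
}


def predicted_probability(features):
    """Return p(X) for a dict of predictor values keyed like COEFFICIENTS."""
    logit = INTERCEPT + sum(
        beta * features[name] for name, beta in COEFFICIENTS.items()
    )
    # Numerically stable logistic function for large positive or negative logits.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)


# Hypothetical input: the values and their scaling are illustrative assumptions.
example = {
    "tug_time": 9.5,
    "education_level": 2,
    "pf_m": 3,
    "mna": 25,
    "abc": 8,
    "k_adl": 7,
}
print(f"p(X) = {predicted_probability(example):.4f}")
```

Because the result depends entirely on how each predictor was coded when the model was fitted, the printed value is only meaningful once the inputs follow the same coding as the original dataset.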

Fig. 1 Results of the optimal feature selection based on recursive feature elimination. Error bars indicate 95% confidence intervals. Abbreviations: AUC, area under the receiver operating characteristic curve.

Table 3 Ranking of 24 features determined by a machine learning algorithm.

These coefficients can be interpreted using odds ratios to better understand their practical implications. Odds ratios (ORs) were derived by exponentiating the logistic regression coefficients (OR = exp(β)), allowing for intuitive interpretation of effect sizes. For instance, a one-second increase in TUG test time (β1 = −0.7241) was associated with a 51.5% decrease in the odds of being classified as frail (OR = 0.485), while a one-point increase in the MNA score (β4 = −0.0084) reduced the odds by approximately 0.8% (OR = 0.992). Such interpretations help contextualize the effect size of each variable and enhance the clinical relevance of the model.
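The conversion from a logistic regression coefficient to an odds ratio and a percentage change in odds can be sketched in a few lines of Python. The function names are hypothetical, and the only input used is the TUG coefficient quoted in the text; this is an illustration of the arithmetic, not the authors' analysis code.

```python
import math


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in the predictor: exp(beta)."""
    return math.exp(beta)


def percent_change_in_odds(beta):
    """Percentage change in the odds per one-unit increase in the predictor."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficient for TUG test time as reported in the text.
beta_tug = -0.7241
print(f"OR = {odds_ratio(beta_tug):.3f}")                          # OR = 0.485
print(f"change in odds = {percent_change_in_odds(beta_tug):.1f}%")  # change in odds = -51.5%
```

A negative coefficient gives an OR below 1 (lower odds per unit increase), and a positive coefficient gives an OR above 1.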

Discussion

We found significant associations between CF and older age, female sex, lower education level, certain clinical characteristics (i.e., peripheral vascular disease, osteoarthritis, osteoporosis, depression, and the number of prescription medications), ADL, PF-M for physical function limitations, SARC-F for sarcopenia, ABC for balance confidence, motor capacity (sit-to-stand, TUG test time, and TUG test time ≥10 seconds), fall characteristics (individual’s fall experience, number of falls, concerns about falling, and their ability to safely cross a street before the traffic light turns red), health-related QoL (EQ-5D and VAS), depression, discomfort in chewing and pronunciation, MNA score, and malnutritional status. The machine learning-based model, incorporating six optimal features (TUG test time, education level, PF-M, MNA score, ABC, and K-ADL score), exhibited robust predictive performance with an AUC >80% (an AUC of 80–90% is considered excellent29,30,31).

Age, sex, and education in CF

The significant associations of older age, sex, and education level with CF found in this study align with the results of previous research indicating that CF is a transitional stage between normal aging and dementia, often linked to PF4. The heightened CF risk likely results from the cumulative effects of cognitive and physical decline with age7. These findings emphasize the importance of early detection and targeted interventions. Consistent with previous studies, females were more affected by CF than males, with frailty and cognitive decline being more common among older women13, likely due to biological and social factors, including postmenopausal hormonal changes14. The association between education and CF supports the cognitive reserve hypothesis, where higher educational attainment increases resilience against cognitive decline32. Lower educational levels, often associated with poorer health outcomes, may reflect lower socioeconomic status, health literacy, and healthcare access33.

Although this study primarily focused on health-related predictors of CF, education level may also reflect broader socioeconomic context. Individuals with lower education levels may experience greater barriers in accessing health information, utilizing healthcare services, and maintaining healthy lifestyle behaviors. While our dataset did not include other indicators such as income or occupation, future research could explore these social and contextual pathways more directly to better understand their contribution to CF.

Clinical characteristics in CF

Peripheral vascular disease, osteoarthritis, osteoporosis, depression, and the number of prescription medications were found to be significantly associated with CF, consistent with the findings of previous literature. Peripheral vascular disease impairs cerebral blood flow, accelerating cognitive impairment (CI) by limiting the oxygen supply essential for neuronal function34. This vascular insufficiency exacerbates both CI and PF, supporting the vascular CI theory35, which links small vessel disease to cognitive decline and PF. Osteoarthritis, characterized by chronic pain and reduced mobility, contributes to PF and impairs cognition by reducing physical activity, increasing social isolation, and promoting systemic inflammation36,37. Osteoporosis further restricts mobility, increasing fall and fracture risk, and compounding PF by diminishing functional capacity36. Depression is also linked to cognitive decline and PF, likely due to neuroinflammatory pathways and impaired neuroplasticity, which are increasingly recognized as shared mechanisms underlying both depression and neurodegenerative processes38,39. Depression accelerates CF progression by promoting social withdrawal, reduced physical activity, and decreased cognitive engagement40. Furthermore, polypharmacy increases the risk of adverse drug reactions, drug interactions, and medication non-adherence, negatively impacting cognitive and physical functions41,42,43. Additionally, the use of five or more medications significantly increases CF risk42.

Health status characteristics in CF

ADL, physical function limitation, sarcopenia, balance confidence, motor capacity, fall characteristics, health-related QoL, depression, discomfort in chewing and pronunciation, and malnutritional status were found to be significantly associated with CF. A decline in ADL was closely related to CF, as CIs impair one’s ability to plan and complete daily tasks, while PF limits strength and mobility44. This reciprocal relationship between cognitive and physical decline is well-documented13. Limitations in physical function emphasize the interplay between motor and cognitive systems, with mobility impairments, such as slower gait speed, being strong detectors of cognitive decline45. Reduced physical activity diminishes neurotrophic stimulation, accelerating both physical and cognitive deterioration46.

Sarcopenia is linked to shared pathophysiological mechanisms such as inflammation and oxidative stress, which degrade both muscle and brain function47,48. Chronic inflammation and oxidative stress contribute to muscle loss and cognitive decline, leading to decreased physical activity and further exacerbation of PF and CI47,48,49. Balance confidence and motor capacity were also significantly associated with CF. Poor balance and reduced motor capacity, both key PF indicators, were strongly linked to CI13,50. Additionally, limitations in balance and motor skills restrict physical activity, reducing neurotrophic stimulation and further contributing to cognitive decline13,50. Fall characteristics, including fall experience, fear of falling, and the ability to cross traffic lights, were also significantly associated with CF. Fear of falling restricts physical activity, leading to muscle atrophy and worsening cognitive decline51.

CF was significantly associated with poor health-related QoL by limiting participation in meaningful activities and reducing perceived well-being9. Both PF and CIs independently contribute to reduced mobility, psychological distress, and social isolation, collectively affecting QoL52. Additionally, CF was significantly associated with poor mental health, emphasizing its psychological dimension. Psychological distress and physical inactivity intensify CF and PF53.

CF was also significantly associated with nutritional health, assessed by MNA. Poor nutritional status worsens CF by depriving individuals of essential nutrients needed for cognitive function and neuroprotection54. Chewing discomfort exacerbates malnutrition, accelerating cognitive decline. This finding aligns with evidence linking poor oral health, reduced dietary intake, and CF exacerbation in older adults55.

Machine learning-based model performance

The machine learning model developed in this study employed a logistic regression algorithm, enhanced by recursive feature elimination and bootstrapping, to improve predictive accuracy. Six key features, the TUG test time, education level, PF-M, MNA, ABC, and K-ADL score, were identified as being essential for optimal model performance. The model demonstrated strong discriminatory power, achieving an AUC of 84.34%, indicating excellent differentiation between individuals with and without CF. The model’s sensitivity reached 75.12%, reflecting its ability to correctly identify individuals with CF, while its specificity was 80.87%, indicating its accuracy in excluding individuals without CF. Overall, the model achieved 79.51% accuracy, meaning that it classified nearly 80% of cases correctly.

By integrating the motor capacity of TUG test time, education level, physical function limitations (PF-M), nutritional status (MNA), balance confidence (ABC), and ADL, the model captures the interplay between cognitive and physical health. These findings emphasize the importance of comprehensive health assessments in prediction models, providing valuable insights for early detection and timely intervention to slow or prevent CF progression in vulnerable older adults.

In addition to informing public health planning, the proposed model has practical utility in primary care and clinical settings. Its reliance on simple, easily measurable indicators allows for rapid screening and early identification of individuals at risk, thereby facilitating timely referrals and personalized intervention planning. Furthermore, the model may serve as a valuable tool for community-based health programs to prioritize outreach, and for health administrators to allocate resources based on population-level risk stratification. It also holds potential for integration into digital health platforms, empowering older adults and caregivers with accessible, evidence-based risk insights.

The six features selected for the final machine learning model were drawn from variables that showed significant group differences in the univariate logistic regression analysis. This sequential approach ensured that only statistically relevant predictors were considered for model development, enhancing both performance and interpretability.

Our findings complement previous studies that have highlighted the importance of multidomain risk assessment for CF23,24,25. By combining clinical, psychological, nutritional, and functional variables within a machine learning framework, this study contributes to building more context-specific predictive models for aging populations, particularly in underrepresented settings.

In summary, this study identified six key predictors of CF through a machine learning-based analysis of multidomain health characteristics. The resulting model demonstrated excellent predictive accuracy and practical utility, serving as a valuable tool for early detection and risk stratification among older adults. These findings demonstrate the potential for integrating simple, scalable assessments into routine clinical and community-based care to proactively and efficiently manage CF.

Study limitations and future work

While this study offers significant findings and a robust prediction model, four limitations should be acknowledged: First, the use of retrospective data from the KFACS may have introduced recall bias from self-reported measures. Additionally, this study’s cross-sectional design limited us from drawing causal inferences between the identified risk factors and CF; hence, longitudinal studies are needed to establish temporal relationships. Second, the study’s population, limited to older Korean adults, may affect generalizability due to demographic homogeneity. Cultural, environmental, and healthcare systems influence aging and CF; therefore, the model needs to be validated in more diverse populations. Cross-cultural studies could provide further insights into how socio-environmental factors and lifestyle behaviors shape CF. Third, this study did not include biological markers. Future research should include these variables to better explore the neurobiological mechanisms underlying CF and enhance the accuracy of predictive models. Fourth, cognitive impairment in this study was evaluated using the MMSE, a well-established and widely used tool in geriatric populations. While the MMSE has been validated in large-scale studies and provides strong clinical utility, its use alone may not fully capture the multifaceted nature of cognitive decline. Specifically, the MMSE has limited sensitivity in detecting impairments in executive function, attention, and visuospatial abilities, which are essential components of CF56,57. Given the complexity of CF, future research may benefit from incorporating multiple cognitive assessment tools to more comprehensively reflect the breadth of cognitive functioning, particularly in domains beyond memory and orientation.

Conclusion

This study provides novel insights into the identification of CF by analyzing comprehensive individual characteristics and developing a machine learning-based model. The approach and findings have significant clinical implications, as early CF detection provides a critical window for interventions aimed at preventing progression to severe cognitive and physical decline. Notably, CF is potentially reversible with timely management. Our findings underscore the importance of comprehensive individual health assessments and of including sociodemographic, clinical, and health status characteristics in routine clinical evaluations of older adults. These assessments should not only address physical function but also nutritional status, capturing CF’s multifaceted nature. By integrating machine learning-based models into clinical practice, healthcare providers can offer personalized and precise interventions. Our prediction model enables the stratification of high-risk individuals, facilitating targeted preventive measures to enhance motor and cognitive function, improve nutrition, and promote overall well-being in older adults. This technology-driven approach can transform geriatric care by promoting early interventions and ultimately reducing CF-related healthcare costs.

Methods

Data source and study participants

This retrospective cohort study utilized first-wave data from the nationwide Korean Frailty and Aging Cohort Study (KFACS)58, collected between 2016 and 2017. The KFACS included community-dwelling older adults aged 70–84 years. Of the 3,011 participants in the initial cohort, 607 were excluded due to incomplete PF and/or cognitive function data, leaving 2,404 participants for the final analysis (Fig. 2).

Fig. 2 STROBE diagram of participants’ eligibility determinations. Abbreviations: CF, cognitive frailty; MMSE, Mini-Mental State Examination; non-CF, non-cognitive frailty.

Ethical considerations

The use of the KFACS data and the study protocol were approved by the Institutional Review Board of the Clinical Research Ethics Committee of Kyung Hee University Hospital, which granted an exemption from additional informed consent (approval IRB no. 2024-05-028; date: 2024.05.27). Written informed consent for the KFACS was obtained from all participants and/or their legal guardian(s). Since this retrospective study utilized anonymized data, additional written informed consent from participants was not required. All methods in this study were performed in accordance with the guidelines and regulations of the Declaration of Helsinki.

Measurements

Cognitive frailty

CF was defined as a combination of PF and IC. PF was assessed using the Fried frailty phenotype, which includes five components: slowness, weakness, exhaustion, weight loss, and low physical activity58,59. Slowness and weakness were measured using the 4-m walking test and grip strength test, respectively, while exhaustion, weight loss, and low physical activity were self-reported through a frailty phenotype questionnaire58,59. Participants with no frailty phenotypes were classified as physically robust, and those with one or more phenotypes were classified as frail58,59. Cognitive function was evaluated using the Mini-Mental State Examination (MMSE)60. CF was defined as the presence of both a frailty phenotype and IC (MMSE score ≤ 24)5.
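The classification rule described above can be written as a short Python sketch. The function and constant names are hypothetical; the thresholds (one or more frailty phenotypes, MMSE score ≤ 24) come from the text, and the valid ranges (0 to 5 phenotypes, MMSE 0 to 30) follow from the five-component phenotype and the standard MMSE scale.

```python
MMSE_CUTOFF = 24          # IC is defined in the text as an MMSE score <= 24
MIN_FRAILTY_PHENOTYPES = 1  # one or more Fried phenotypes counts as frail


def has_cognitive_frailty(n_frailty_phenotypes, mmse_score):
    """Return True when both physical frailty and impaired cognition are present."""
    if not 0 <= n_frailty_phenotypes <= 5:
        raise ValueError("the Fried phenotype has five components (0 to 5)")
    if not 0 <= mmse_score <= 30:
        raise ValueError("MMSE scores range from 0 to 30")
    physically_frail = n_frailty_phenotypes >= MIN_FRAILTY_PHENOTYPES
    impaired_cognition = mmse_score <= MMSE_CUTOFF
    return physically_frail and impaired_cognition


# Hypothetical participants: (number of frailty phenotypes, MMSE score).
for phenotypes, mmse in [(0, 22), (2, 27), (1, 24)]:
    label = "CF" if has_cognitive_frailty(phenotypes, mmse) else "non-CF"
    print(phenotypes, mmse, label)
```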

Comprehensive characteristics

Candidate CF detectors were based on sociodemographic, clinical, and health status characteristics extracted from the KFACS58. Data were gathered through physical examinations, self-reported health assessments, and functional surveys58. Sociodemographic characteristics included age, sex, and education. Clinical characteristics included comorbidities of cardiovascular, musculoskeletal, respiratory, digestive, endocrine, nervous, urinary, oncological, and mental systems, along with current medications. Health status was categorized into three domains: physical, psychological, and nutritional health. (1) Physical health was assessed using the Korean Activities of Daily Living (K-ADL) scale61. The PF and Mobility (PF-M) test was utilized to assess physical function limitations, including walking speed, balance, and physical strength62. The five-component Strength, Assistance with walking, Rising from a chair, Climbing stairs, and Falls (SARC-F) questionnaire was used to identify individuals at risk of sarcopenia63. Balance confidence was evaluated using the Activities-specific Balance Confidence (ABC) scale64. Motor capacity was assessed using two common clinical tests: the sit-to-stand test65 and the Timed Up and Go (TUG) test66. Additionally, fall characteristics were assessed by examining the individual’s fall experience, number of falls, concerns about falling, and ability to safely cross a street before the traffic light turns red. (2) Psychological health was assessed using the EuroQol 5 Dimension (EQ-5D)67 and EuroQol Visual Analogue Scale (VAS)68. The Korean version of the Short Geriatric Depression Scale (SGDS-K), an enhanced version of the 15-item Geriatric Depression Scale (GDS), was used to assess depression69. Sleep quality was measured by evaluating three factors: time taken to fall asleep, sleep duration, and presence or absence of daytime naps. (3) Nutritional health included an assessment of dental health, focusing on two factors: discomfort while chewing and difficulties with pronunciation. Additionally, the Mini Nutritional Assessment (MNA), along with body mass index (BMI), was used to evaluate nutritional health.

Statistical analysis

Data and statistical analyses were conducted using MATLAB (MathWorks, Natick, MA, USA) and IBM SPSS Statistics (IBM Corp., Armonk, NY, USA). Continuous variables were reported as mean ± standard deviation, while categorical variables were presented as frequencies and percentages. The normality of continuous data was assessed using the Shapiro–Wilk test. For group comparisons (i.e., CF vs. non-CF), a one-way analysis of variance was utilized for normally distributed data, while the Mann–Whitney U test was applied for non-normally distributed data. Categorical variables were analyzed using the chi-square test to determine significant differences between the CF and non-CF groups. Univariate logistic regression was conducted to explore the associations between comprehensive characteristics of candidate detectors and CF. OR and 95% confidence intervals (95% CI) were calculated for each variable, adjusting for age and sex to account for potential confounding factors. The significance level was set at two-tailed p < 0.05 for all statistical analyses.
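To illustrate the categorical comparison between the CF and non-CF groups, the following Python sketch computes a Pearson chi-square statistic and its p-value for a 2×2 table using only the standard library. The analyses in this study were run in MATLAB and SPSS, so this is an illustration of the test rather than the original code; the function name and the cell counts are hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (1 degree of freedom) for a 2x2 contingency table.

    `table` is [[a, b], [c, d]] with rows as groups and columns as categories.
    Returns (statistic, p_value) without Yates' continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one count")

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows = CF / non-CF, columns = condition present / absent.
# Row totals match the group sizes reported in the text (443 and 1,961).
counts = [[100, 343], [250, 1711]]
statistic, p_value = chi_square_2x2(counts)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4g}")
```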

Machine learning analysis and optimal feature selection

To identify the optimal features for accurately detecting CF, a feature selection process was implemented using machine learning with logistic regression modeling. A total of 24 comprehensive characteristics were included as sociodemographic (n = 1), clinical (n = 5), and health status (n = 18) characteristics, used as independent variables in the logistic regression model. CF, categorized as robust (0) or frail (1), was the dependent variable in the model. Bootstrapping and recursive feature elimination were used to improve model performance30,31. As recommended in previous studies70,71, 500 bootstrapping iterations were performed to ensure model robustness with a sample size of 2,404 participants. To address the class imbalance between CF (n = 443) and non-CF (n = 1,961), the synthetic minority over-sampling technique (SMOTE) was applied72, generating 500 training-validation dataset pairs for bootstrapping. Recursive feature elimination then ranked and selected the most relevant features, progressively removing the least important ones to determine the optimal feature set for CF detection.
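The resampling steps described above can be sketched in plain Python. The snippet shows one bootstrap draw (sampling with replacement, with the unsampled rows kept for validation) and a SMOTE-style oversampler that creates synthetic minority rows by interpolating between a minority row and one of its nearest minority neighbours. The function names, the toy data, the choice of out-of-bag rows as the validation set, and the neighbour count `k` are all assumptions for illustration; the text does not specify these details, and the original analysis was performed in MATLAB.

```python
import random


def bootstrap_split(n_samples, rng):
    """Sample indices with replacement; indices never drawn form the validation set."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


def smote_like_oversample(minority, n_synthetic, k, rng):
    """Create synthetic minority rows by interpolating towards a near neighbour."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by squared Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


rng = random.Random(0)

# Hypothetical toy data: two features per row; the first three rows are the minority class.
minority_rows = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
majority_rows = [[5.0, 6.0], [5.5, 6.2], [6.0, 5.8], [5.2, 6.5], [6.1, 6.1], [5.8, 5.5]]

# Generate enough synthetic minority rows to balance the two classes.
n_needed = len(majority_rows) - len(minority_rows)
balanced_minority = minority_rows + smote_like_oversample(minority_rows, n_needed, k=2, rng=rng)
print(len(balanced_minority), len(majority_rows))

# One bootstrap draw over all rows; repeating this 500 times would give 500 dataset pairs.
train_idx, valid_idx = bootstrap_split(len(minority_rows) + len(majority_rows), rng)
print(len(train_idx), len(valid_idx))
```

In this sketch, oversampling would normally be applied to the training portion only, so that synthetic rows do not leak into the validation data; whether the original pipeline did this is not stated in the text.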

The performance of the logistic regression model was assessed using four key metrics: area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy30,31,73. These metrics were calculated from the 500 bootstrapped datasets generated using SMOTE across different samples. AUC measured the model’s ability to differentiate between CF and non-CF cases, while sensitivity and specificity reflected the model’s accuracy in identifying participants with CF and without CF, respectively. Accuracy reflected the model’s overall classification performance. After the optimal feature set was determined via recursive feature elimination, the model was re-evaluated using the 500 validation datasets to test its performance with the reduced feature set.
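The four metrics can be computed from true labels, predicted labels, and predicted scores as in the following Python sketch. The function names, the 0.5 decision threshold, and the toy inputs are assumptions for illustration; AUC is computed here through its pairwise-ranking interpretation (the proportion of CF/non-CF pairs in which the CF case receives the higher score, with ties counted as one half).

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = CF, 0 = non-CF)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present in y_true")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted probabilities for four participants.
labels = [0, 0, 1, 1]
probabilities = [0.1, 0.4, 0.35, 0.8]
predictions = [1 if p >= 0.5 else 0 for p in probabilities]  # assumed 0.5 threshold

print(classification_metrics(labels, predictions))
print(auc(labels, probabilities))
```

Averaging these values over the 500 validation datasets would give summary estimates of the kind reported in the Results.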