Abstract
The coronavirus disease 2019 (COVID-19) has had a significant impact on the global population, particularly on individuals with chronic kidney disease (CKD). COVID-19 patients with CKD face a considerably higher risk of mortality than the general population. This study developed a predictive model for assessing mortality in COVID-19-affected CKD patients, providing personalized risk prediction to optimize clinical management and reduce mortality rates. We developed machine learning algorithms to retrospectively analyze the clinical laboratory test data of 219 patients. The performance of each model was assessed using a calibration curve, decision curve analysis, and the receiver operating characteristic (ROC) curve. The LightGBM model showed the most satisfactory performance, with an area under the ROC curve of 0.833, a sensitivity of 0.952, and a specificity of 0.714. Prealbumin, neutrophil percent, respiratory index in arterial blood, half-saturated pressure of oxygen, carbon dioxide in serum, glucose, neutrophil count, and uric acid were the top 8 significant variables in the prediction model. Validation in 46 patients demonstrated acceptable accuracy. This model can serve as a tool for screening CKD patients at high risk of COVID-19-related mortality and for providing decision support to clinical staff, enabling efficient allocation of resources and facilitating timely and targeted management for those who urgently need intervention.
Introduction
Coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), was initially identified at the end of 2019 and subsequently spread internationally as a global pandemic. Individuals who were infected with SARS-CoV-2 can exhibit a broad spectrum of disease severity, ranging from asymptomatic or mild infection to critical illness with multi-organ failure1. By the end of November 2023, over 770 million confirmed cases and over 6.9 million deaths had been reported worldwide2. Initially, COVID-19 was recognized as a simple respiratory tract infection. However, it is now acknowledged as a systemic disease that profoundly affects various systems, including the hematopoietic and immune systems3.
Artificial intelligence has been widely applied in various aspects of human life. In recent years, many scholars have applied machine learning in medicine, particularly to improve the accuracy of medical event prediction. Early research mainly used traditional statistical analysis methods, employing univariate analysis to identify single risk factors, which resulted in relatively limited predictive accuracy. Machine learning can reduce the influence of interfering factors, improve prediction accuracy, and assist healthcare professionals in promptly identifying high-risk patients and predicting severity more accurately. Many scholars have developed machine learning models to predict the severity and in-hospital mortality of COVID-19 patients, helping to identify those at high risk of mortality4,5,6. Some of the models exhibited bias in the stratification of subgroups with varying mortality risks. Sharifi-Kia developed a machine learning-based mortality prediction model for COVID-19 patients with a history of smoking in the Iranian population7. Zhuang established a model to differentiate between severe and non-severe COVID-19 patients among the elderly8. In addition to older age, having underlying health conditions was also a risk factor for COVID-19-related mortality.
Patients with chronic kidney disease (CKD) were more susceptible to developing severe COVID-19 and had a higher risk of mortality. Compared to the general population, CKD patients were more likely to be affected by COVID-199,10. Additionally, compared to individuals with other chronic diseases such as diabetes, coronary heart disease, hypertension, and asthma, CKD patients have a considerably higher risk of mortality due to COVID-19 infection11,12,13,14. Therefore, timely identification and accurate assessment of disease severity in COVID-19-affected CKD patients facilitate proper clinical decisions. Unfortunately, COVID-19 prediction models specifically tailored to patients with CKD are not currently available. We therefore aimed to develop a machine learning-based prediction model for COVID-19 patients with CKD.
In this study, we retrospectively analyzed clinical laboratory data of CKD patients with COVID-19. The objective was to compare multiple machine learning algorithms and identify the optimal method for predicting the likelihood of COVID-19 mortality in CKD patients. Additionally, we used a limited dataset as the validation set to assess the model’s practicality using the same predictive variables and outcome endpoints. We aimed to construct a prediction model that could provide personalized early treatment recommendations for CKD patients at a higher risk of mortality when they contract COVID-19. Our results demonstrated that the final selected model was a valuable tool to assist physicians in their clinical decision-making process.
Methods
Research participants
The clinical data were collected from CKD patients hospitalized in The Second Hospital of Dalian Medical University due to COVID-19 from December 2022 to January 2023. Inclusion criteria: (1) age ≥ 50 years; (2) meeting the CKD diagnostic criteria; (3) first infection with COVID-19. Exclusion criteria: (1) acute kidney injury at the time of admission; (2) concurrent malignant tumors; (3) post-transplant status. Following the propensity score matching (PSM) method, 89 patients who died and 84 patients who survived were included, matched 1:1 based on age, sex, history of alcohol consumption and history of smoking. (PSM is a statistical method used in observational studies to enhance the comparability of clinical indicators among research subjects, thereby reducing bias from confounding factors. PSM uses covariates (such as age and gender) to estimate the propensity score for each patient, followed by matching between the mortality group and the survival group to ensure similarity in baseline characteristics. By comparing the matched groups, researchers can more accurately assess the factors influencing survival rates.) The flowchart of the study procedure is shown in Fig. 1.
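The matching step described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that propensity scores have already been estimated, and the function name match_one_to_one, the caliper of 0.05, the greedy nearest-neighbour strategy and all patient identifiers and scores are hypothetical.

```python
def match_one_to_one(cases, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    cases, controls: dicts mapping a patient id to a propensity score.
    Returns a list of (case_id, control_id) pairs; matching is without
    replacement, and pairs farther apart than the caliper are rejected.
    """
    available = dict(controls)  # controls that have not been used yet
    pairs = []
    # Process cases in score order so that the result is reproducible.
    for case_id, case_score in sorted(cases.items(), key=lambda kv: kv[1]):
        if not available:
            break
        nearest = min(available, key=lambda cid: abs(available[cid] - case_score))
        if abs(available[nearest] - case_score) <= caliper:
            pairs.append((case_id, nearest))
            del available[nearest]  # each control is used at most once
    return pairs


# Hypothetical propensity scores for the mortality and survival groups.
deaths = {"d1": 0.62, "d2": 0.35, "d3": 0.90}
survivors = {"s1": 0.60, "s2": 0.33, "s3": 0.48, "s4": 0.10}
pairs = match_one_to_one(deaths, survivors)
print(pairs)
```

In this toy input, patient d3 has no survivor within the caliper and is left unmatched, which is one way a matched cohort can end up smaller than the original groups.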
Definitions of CKD and COVID-19
The diagnosis and staging of CKD followed well-recognized criteria15,16. CKD was defined by an estimated glomerular filtration rate (eGFR) < 60 mL/min/1.73 m² or an albumin-to-creatinine ratio (ACR) ≥ 30 mg/g, sustained for more than three months17,18. eGFR was calculated using the 2021 Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation19. CKD was classified into five stages based on the revised Kidney Disease: Improving Global Outcomes (KDIGO) staging criteria20. COVID-19 was confirmed through virus detection by real-time reverse transcriptase-polymerase chain reaction (RT-PCR) analysis of nasal or pharyngeal swab samples, combined with chest computerized tomography if necessary21.
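As an illustration of the definitions above, the sketch below implements the published 2021 race-free CKD-EPI creatinine equation, the eGFR/ACR criterion and a five-stage grouping by eGFR. The function names and the example patient are hypothetical, serum creatinine is assumed to be in mg/dL, the three-month chronicity requirement is not modelled, and the KDIGO categories G3a and G3b are merged into a single stage 3 as an assumption about how the five stages were formed.

```python
def egfr_ckd_epi_2021(scr_mg_dl, age_years, female):
    """eGFR (mL/min/1.73 m2) from the 2021 CKD-EPI creatinine equation."""
    kappa = 0.7 if female else 0.9
    alpha = -0.241 if female else -0.302
    ratio = scr_mg_dl / kappa
    egfr = (142.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.200
            * 0.9938 ** age_years)
    return egfr * 1.012 if female else egfr


def meets_ckd_criteria(egfr, acr_mg_g):
    """Laboratory criterion only: eGFR < 60 or ACR >= 30 mg/g.

    Persistence for more than three months must be checked separately.
    """
    return egfr < 60 or acr_mg_g >= 30


def ckd_stage(egfr):
    """Five-stage grouping by eGFR (G3a and G3b merged into stage 3).

    Stages 1 and 2 additionally require evidence of kidney damage.
    """
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5


# Hypothetical patient: 70-year-old man, creatinine 2.0 mg/dL, ACR 10 mg/g.
example_egfr = egfr_ckd_epi_2021(2.0, 70, female=False)
print(round(example_egfr, 1), ckd_stage(example_egfr),
      meets_ckd_criteria(example_egfr, 10))
```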
Clinical and laboratory data
Patient demographics, clinical features, and routine laboratory test data at admission were retrieved from the laboratory information system. The clinical test data included a routine complete blood count (hemoglobin, leukocytes, platelets, and absolute neutrophil and lymphocyte counts). Serum biochemical tests (including renal and liver function), D-dimer, cardiac troponin I, C-reactive protein, glucose and arterial blood gas indicators were also collected22.
This retrospective study has obtained approval from the Ethics Committee of the Second Affiliated Hospital of Dalian Medical University. Informed consent was waived by the Ethics Committee of the Second Affiliated Hospital of Dalian Medical University. This study was conducted according to the guidelines outlined in the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) for reporting observational studies23. The endpoint event was the occurrence of death in the hospital. During the data collection and classification process, all the data were de-identified.
Feature selection
We used least absolute shrinkage and selection operator (LASSO) logistic regression analysis to rank the importance of the potential risk factors. LASSO regression functioned as a dimension-reduction tool by rejecting parameters with lower calculated coefficients24,25. It was useful for selecting important variables and reducing the number of variables used for model construction24. In LASSO regression, the beta coefficients of variables that are not strongly associated with the outcome approach zero, and these variables are automatically removed. The degree of variable reduction in LASSO regression is determined by the lambda value. A large lambda imposes a strong penalty on models with more variables, leaving a model with a relatively small number of variables. To set a rational cut-off value of lambda, ten-fold cross-validation was applied to find the lambda that minimized the mean squared error (MSE)26. By selecting the optimal lambda value corresponding to the minimal MSE of 0.066 (Fig. 2A), 13 variables were retained. The confidence of the parameter selection in the LASSO model was confirmed by 10-fold cross-validation, which yielded a minimum error (Fig. 2B).
The selection of potential risk factors by LASSO. (A) Plot of the LASSO coefficients. Each curve in the graph represented a coefficient change trajectory for an independent variable. The vertical axis represented the values of the coefficients. The bottom horizontal axis represented the log(λ), and the top horizontal axis represented the number of variables with non-zero coefficients. (B) Cross-validation error curve. The vertical axis represented Binomial Deviance, which could be understood as the magnitude of the model’s error. The bottom horizontal axis represented log(λ) as in (A), and the top horizontal axis showed the number of variables with non-zero coefficients in the model. The aim was to select a model with fewer variable features and minimal error. The left dotted line indicated the boundary of the acceptable errors and the right one indicated the boundary of acceptable feature numbers.
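The way a larger lambda removes variables can be sketched with the soft-thresholding operator. This is a simplified illustration rather than the LASSO logistic regression used in the study: it relies on the special case of a linear model with an orthonormal design, in which each LASSO coefficient equals the soft-thresholded unpenalized coefficient. The variable names, coefficients and lambda values are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(unpenalized, lam):
    """LASSO coefficients for an orthonormal design (illustrative case)."""
    return {name: soft_threshold(beta, lam) for name, beta in unpenalized.items()}


# Hypothetical unpenalized coefficients for three candidate variables.
unpenalized = {"prealbumin": -0.80, "glucose": 0.35, "uric_acid": 0.05}

counts = []
for lam in (0.0, 0.1, 0.5):
    coefs = lasso_orthonormal(unpenalized, lam)
    kept = [name for name, beta in coefs.items() if beta != 0.0]
    counts.append(len(kept))
    print(lam, kept)
```

As lambda grows, the weakest coefficients reach exactly zero first, which corresponds to the decreasing count of non-zero coefficients along the top axis of the plots described in the figure legend.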
Machine learning (ML) algorithms and model establishment
We employed five algorithms to construct the prediction models: XGBoost, logistic regression, light gradient-boosting machine (LightGBM), random forest and AdaBoost. The features confirmed by LASSO regression were fed to the machine learning algorithms to predict the occurrence of death. For each modeling process, we implemented five-fold cross-validation. The training set was randomly divided into five groups; four groups were used to train the model, and the remaining group served as the internal validation set to evaluate model performance.
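The five-fold split described above can be sketched as follows. The helper name k_fold_indices and the seed are hypothetical; 173 is the size of the model-construction cohort reported in the Results, and treating all of these patients as the pool that was split is an assumption.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment to folds
    folds = [indices[i::k] for i in range(k)]  # k nearly equal groups
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


splits = list(k_fold_indices(173, k=5))
for training, validation in splits:
    print(len(training), len(validation))
```

Each patient appears in exactly one validation fold, so every observation is used for validation once and for training four times.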
Model evaluation and optimization
For each model, the training-validation process was repeated 5 times to evaluate the robustness of the model. Each model’s performance was evaluated by the receiver operating characteristic (ROC) curve, calibration curve and decision curve analysis (DCA). The areas under the ROC curves (AUCs) were used to measure the discrimination ability of the corresponding prediction models. Calibration curves were plotted to assess how closely the model-predicted event probabilities agreed with the observed events. DCA was used to evaluate the utility and net benefit of each model for decision-making. To avoid model overfitting, once the best model was determined, we optimized variable selection based on the feature importance of the optimal model. Ultimately, the top eight ranked variables were selected for constructing the final model. Shapley Additive exPlanations (SHAP) analysis was subsequently employed to illustrate the significance of each retained parameter. Additionally, we compared the diagnostic performance of the predictive model with the traditional risk assessment tool, APACHE II.
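The three evaluation measures named above can be written out from their standard definitions. The sketch below is illustrative only: the outcome labels and predicted probabilities are hypothetical toy values, and the function names are not taken from the software used in the study.

```python
def roc_auc(y_true, y_prob):
    """AUC as the probability that a death is ranked above a survivor."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit of treating patients with p >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical outcomes (1 = death) and predicted probabilities.
y = [1, 1, 1, 0, 0, 0]
p = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

auc = roc_auc(y, p)
brier = brier_score(y, p)
nb = net_benefit(y, p, threshold=0.5)
print(round(auc, 3), round(brier, 3), round(nb, 3))
```

The "treat-none" strategy has a net benefit of zero at every threshold, so a model curve above zero indicates a gain over treating nobody.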
Statistical analysis
All statistical analyses were conducted using R software for Windows (Version 3.6.3, https://www.r-project.org/) and the Beckman Coulter DxAI platform (https://www.xsmartanalysis.com/beckman/login/). Categorical variables were presented as frequencies with percentages, and continuous variables were presented as mean ± standard deviation (SD) or median with interquartile range (IQR). Continuous data were tested for normality using the Kolmogorov-Smirnov test. The Mann-Whitney U test and t-test were adopted when appropriate to compare the means between continuous variables. Categorical variables were cross-tabulated to examine the independence among them. The χ2 test or Fisher’s exact test was used as necessary. All the reported p-values were two-tailed tested. P < 0.05 was regarded as statistically significant.
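For a 2 × 2 cross-tabulation, the χ2 test mentioned above can be sketched as below. This is an illustration of the Pearson statistic without continuity correction, not a reproduction of the R analysis; the cell counts are hypothetical, chosen only so that the row totals equal the reported group sizes of 89 and 84.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Returns (statistic, two-sided p-value) with 1 degree of freedom and
    no continuity correction.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the statistic is a squared standard normal
    # variable, so the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: a characteristic present/absent in 89 non-survivors
# (60/29) and in 84 survivors (40/44).
chi2, p_value = chi_square_2x2(60, 29, 40, 44)
print(round(chi2, 2), p_value < 0.05)
```

When any expected cell count is small, Fisher’s exact test is the usual alternative, as stated in the text.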
Results
Baseline characteristics
A total of 173 CKD patients with COVID-19, including 89 patients who died and 84 patients who were discharged, were included in the model construction process. Their demographic characteristics and clinical features are listed in Table S1. Parameters showing significant differences between the non-survival and survival groups were mostly biochemical items and blood cell count parameters (Table S1).
Model construction and validation
The potential risk factors obtained from LASSO regression were input into each ML algorithm (XGBoost, logistic regression, LightGBM, random forest, AdaBoost). The validation results of the five models are presented in Table 1. The ROC curves in Fig. 3A showed that LightGBM had outstanding discriminative ability, achieving the highest AUC among the five models. The forest plot indicated that the AUC of LightGBM was significantly higher than those of the other four models (Fig. 3B). In the calibration plots of the five models, LightGBM achieved the lowest Brier score (0.168) among the studied models (Fig. 3C). The Brier score, which ranges from 0 to 1, measures the difference between the predicted and actual probabilities, with a lower score indicating higher consistency. The decision curves of the five models, shown in Fig. 3D, positioned LightGBM almost consistently above the “treat-none” line. (DCA evaluates the net benefit of using a model for clinical decision-making compared to no treatment. If the model curve is above the baseline, using the model can provide higher net benefits.) Collectively, based on the comprehensive evaluation using multiple statistical methods, LightGBM exhibited excellent discrimination ability in predicting the risk of COVID-19-related death in CKD patients.
To avoid model overfitting, the top 8 features with the highest importance in the LightGBM model were selected for final refinement. The ROC curve in Fig. 3E displayed that the LightGBM model showed a good discriminative ability (AUC = 0.833, sensitivity = 0.952, specificity = 0.714) with the optimal threshold of 53.70%. The F1 score was 0.889, indicating that the LightGBM model was highly accurate27. This conclusion was further supported by the calibration curve, suggesting that the risk predicted by the model was close to the actual risk (Fig. 3F).
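The relationship between sensitivity, specificity and the F1 score can be made explicit with a small calculation. The confusion-matrix counts below are hypothetical: the underlying table is not given in the text, and these values were chosen only because they reproduce the three reported figures after rounding.

```python
def classification_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, precision and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # deaths correctly flagged
    specificity = tn / (tn + fp)  # survivors correctly cleared
    precision = tp / (tp + fp)    # flagged patients who actually died
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts (assumption, not taken from the study data).
metrics = classification_metrics(tp=20, fn=1, fp=4, tn=10)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Because F1 combines only precision and sensitivity, it does not reflect how well survivors are classified, so it is best read alongside specificity.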
Model interpretation
Figure 4 illustrates the ROC curves and AUC values for these biomarkers that show significant values in predicting severe COVID-19 among CKD patients. Among them, the prealbumin demonstrated the highest efficiency (AUC = 0.737), followed by neutrophil percent and count (AUC = 0.710 and 0.700). Through the statistical evaluation of AUC, the individual performance of each biomarker fell short when compared to the LightGBM machine learning model (Table 2).
We used SHAP scores and summary plots to visualize the significance of each feature in model prediction and to aid interpretability. According to the SHAP analysis of the LightGBM model, the serum prealbumin (PA) level was one of the main clinical parameters driving the model predictions. Other relevant parameters included neutrophil percent (NP), respiratory index in arterial blood (RI), P50, CO2, glucose, neutrophil count (NC) and uric acid (UA). The summary plot displayed the top eight features ranked in descending order of their contribution to the model (Fig. 5A). PA was negatively correlated with the death event. RI, P50, CO2 and glucose were positively linked to the death event. NC and UA did not show a significant correlation (Fig. 5B).
Contribution of the 8 parameters to the LightGBM model. (A) The relative contribution of each parameter. (B) Relationship of each parameter to the death event. On the y-axis, parameters were arranged in descending order of importance. On the x-axis, each patient was represented by a distinct dot, and the x-axis position of each dot reflected the corresponding feature’s SHAP value for that patient. Higher feature values were colored in red, and lower values were colored in blue. Red dots to the right of the zero point moved the model towards predictions of death, while blue dots to the left of the zero point moved the model away from predictions of death. For example, red dots for prealbumin lay on the left side, meaning that higher prealbumin levels moved the prediction away from death and lower levels drove it toward a death outcome. Similarly, with neutrophil percentage, RI in arterial blood, P50, CO2 and Glu being red on the right side, higher values drove the prediction towards a death outcome. The influence of neutrophil count and UA on outcome prediction could not be determined due to the lack of a clear correlation between their observed and predicted values. P50: half-saturation pressure of oxygen; CO2: carbon dioxide in serum; Glu: glucose; UA: uric acid.
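The quantity plotted in a SHAP summary plot is the Shapley value of each feature for each patient. The sketch below computes exact Shapley values by brute-force enumeration for a toy two-feature risk function; it is meant to show the definition, not the tree-based algorithm normally used with LightGBM. The risk function, its coefficients, the feature values and the baseline are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {m: (x[m] if m in coalition else baseline[m]) for m in names}
        return predict(mixed)

    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {name})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        phi[name] = total
    return phi


def toy_risk(features):
    """Hypothetical linear risk score: falls with prealbumin, rises with glucose."""
    return 0.3 - 0.01 * features["prealbumin"] + 0.02 * features["glucose"]


patient = {"prealbumin": 10.0, "glucose": 12.0}    # hypothetical values
reference = {"prealbumin": 20.0, "glucose": 6.0}   # hypothetical baseline
phi = shapley_values(toy_risk, patient, reference)
print({name: round(v, 3) for name, v in phi.items()})
```

In this toy case the below-baseline prealbumin receives a positive Shapley value, matching the direction described in the legend, and the values sum to the difference between the patient's predicted risk and the baseline prediction.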
Comparison of diagnostic performance between the LightGBM model and the traditional risk assessment tool APACHE II
Acute Physiology and Chronic Health Evaluation (APACHE) II was one of the most widely used risk scoring systems, particularly recognized as an effective clinical tool for predicting the mortality of patients with COVID-19. It has demonstrated excellent performance in predicting mortality among COVID-19 patients in the ICU28,29,30,31.
We evaluated the diagnostic accuracy of the LightGBM model in comparison to APACHE II, employing various evaluation metrics (Table 3). The AUC for APACHE II was 0.662. It was lower than the LightGBM model’s AUC of 0.833. Figure 6 illustrated the ROC curves of various models in the studied populations. Apparently, the performance of the LightGBM model was superior to APACHE II.
External validation
To further validate the prediction model, a cohort of 46 patients (including 19 CKD patients who died from COVID-19) who met the same inclusion criteria as the model construction cohort was collected from February to December 2023. Limited by the small number of specimens in the validation set, the LightGBM model yielded an AUC of 0.675, a sensitivity of 0.706, and a specificity of 0.655 (Fig. 7). The prediction accuracy was 63% with a positive likelihood ratio of 1.7, indicating that the prediction results were acceptable (Table 4).
Discussion
COVID-19 has spread globally, resulting in a pandemic and placing significant strain on healthcare systems, which highlights the need for predictive risk models to prioritize patients and facilitate early intervention. Individuals suffering from CKD have an elevated risk of contracting COVID-19 and are more prone to developing severe illness. The mortality rate for CKD patients is as high as 50.6%, which surpasses that of the general population22. Recent investigations have revealed unexpected adverse clinical outcomes and elevated mortality rates among COVID-19 patients with CKD22,32,33. Therefore, developing early prognostic tools to stratify the risk of poor outcomes is of utmost importance.
In this study, we developed a prediction model for COVID-19-related mortality in patients with CKD, which demonstrated high discrimination capability with an AUC of 0.833, an F1 score of 0.889, a sensitivity of 0.952 and an accuracy of 0.743. The calibration plot demonstrated strong consistency between the predicted and actual observed outcomes, as indicated by the low Brier score. This indicated that our prediction model could accurately forecast the mortality risk of CKD patients who contracted COVID-19 in real clinical practice. DCA was used to evaluate the clinical utility and advantages of the predictive model. The results indicated that the model had significant clinical benefits, potentially reducing unnecessary interventions and improving the accuracy of mortality risk prediction. This early prediction model enabled clinicians to rapidly identify CKD patients with a higher risk of COVID-19-related mortality, providing crucial insights into predicting disease severity and patient survival rates. A prediction probability threshold of 53.70% yielded optimal results for risk stratification. A probability exceeding this threshold indicated that the patient faced a higher risk of mortality. Therefore, proactive targeted interventions can be implemented in advance to effectively manage and mitigate the risks faced by these high-risk patients.
Some studies have reported that “the aggregate index of systemic inflammation (AISI), (neutrophils * monocytes * platelets/lymphocytes)”34, “the systemic immune inflammation index (SII) (platelet counts * neutrophil counts/lymphocyte counts)”22 and so on were able to indicate disease mortality in COVID-19 patients with CKD. In this study, 8 clinical parameters were proved to be closely related to mortality, and their visual representation was done through SHAP analysis. It showed that lower levels of PA, higher NP, RI in arterial blood, P50, CO2, and Glu pushed the model’s prediction toward death. Reports have indicated that SARS-CoV-2 infection leads to a significant decrease in serum PA levels35. It was thought to be the result of the activated inflammation response and the excessively released cytokines. Lower serum PA concentrations were significantly associated with the severity and mortality rate of COVID-1936. CKD patients suffer from protein loss through the kidneys, and COVID-19 further exacerbates the decrease in serum PA, which worsens the malnutrition status of CKD patients. Both lowered PA and malnutrition were negative prognostic factors for COVID-1937. Neutrophils act as the primary cellular defense against infections, and play a crucial role in safeguarding the airway. For severe SARS-CoV-2 infection, there was a notable involvement of neutrophils within the lungs to protect the airway epithelium38. Elevated levels of immature neutrophils were a distinctive feature of severe COVID-19 infection and were associated with aggravating systemic inflammation3,39,40. The initial neutrophil count offered valuable insights for evaluating the severity of COVID-19 in hospitalized patients39. The contribution of neutrophils to the LightGBM model could be ascribed to the presence of a pro-inflammatory state accompanied by impaired innate and adaptive immunity due to the uremic environment which was often found in CKD patients. 
Impaired renal function could increase the vulnerability of neutrophils to hyper-inflammation and cytokine storm caused by SARS-CoV-2 infection, resulting in severe illness and a higher risk of mortality13. This conclusion could be proved by what had been observed in patients with type 2 diabetes, affected concurrently by COVID-19. Increased white blood cell counts and lymphopenia in COVID-19 patients were frequently encountered41. As for CKD patients who unfortunately succumbed to COVID-19, elevated leukocyte and neutrophil counts, along with decreased lymphocytes were also evident34,42. Notably, nearly all COVID-19 patients experienced an elevation in neutrophil count and a reduction in lymphocytes during severe periods, leading to an increased percentage of neutrophils43. The oxygen tension at which 50% of hemoglobin is saturated with oxygen, is known as P50. Some researchers have pointed out that the median P50 value in patients who died showed a more pronounced right shift compared to survivors, indicating an impaired oxygen binding capacity44. Patients with a higher mortality risk tended to have a higher P50 value. Additionally, a lower bicarbonate level also contributed to a higher mortality rate in COVID-19 patients45. Generally, measurements of total CO2 using ion-selective electrodes or enzymatic methods typically yield similar results to those of venous HCO3- concentrations46. Recent investigations have revealed a correlation between elevated blood glucose levels and severe COVID-19 infection. Increased blood glucose was very common in COVID-19 patients and was an indicator of worse outcomes even for non-diabetes patients47. Intensive monitoring of blood glucose levels facilitated the pursuit of a good prognosis for patients with COVID-1948.
A notable advantage of our study was the prospective validation of the developed prediction model, which distinguishes it from other studies. Furthermore, our model used laboratory test results obtained at initial admission, which did not increase the burden on patients and was easy to implement. Most importantly, previous studies primarily focused on clinical features such as age, gender, and symptoms, using traditional statistical analysis methods. In contrast, our research focused on molecular biomarkers that reflect disease states and incorporated more advanced techniques such as machine learning algorithms (XGBoost, logistic regression, LightGBM, random forest, AdaBoost), improving the predictive accuracy of our models. Our model can serve as a decision support tool for clinicians, allowing them to assess a patient’s prognosis, adjust treatment plans in real time, and prevent and manage complications in advance, thereby improving patient survival, optimizing healthcare resource utilization, and ultimately improving patient outcomes.
Limitations
However, it is crucial to acknowledge that our study had some limitations. First, it was a single-center retrospective study with a relatively small sample size, which might restrict the generalizability of our findings. Second, the study population was limited to Chinese patients, so it remains unknown whether the model can be effectively extended to other populations. These limitations highlight the need for further research and validation in larger multicenter studies involving diverse patient populations. To assess the extrapolation and generalization of our model, larger sample sizes and independent datasets are needed for comprehensive validation.
In further research, we will expand the participation by incorporating data from additional sources and collecting sufficient external validation datasets to enhance the model’s applicability, robustness and reliability. Furthermore, we will optimize the model based on these findings, rectify its deficiencies, and continuously improve its predictive capabilities. Our study provided promising insights into prediction model development for COVID-19-related mortality in patients with CKD, and further research efforts are required to optimize its utility for clinical decision-making.
Conclusions
To our knowledge, this is the first attempt to create a predictive model to assess the mortality of CKD patients with COVID-19. We developed and calibrated five different prediction models and evaluated their performance using various statistical metrics such as ROC curve analysis, the Brier score, and decision curve analysis. Based on these evaluations, we identified the best-performing model with robust performance and practical utility. By providing personalized predictions for COVID-19-related mortality in CKD patients, this model enabled tailored risk assessment for each individual, making it an essential tool for screening those who were at elevated risk of COVID-19 death based on the findings of our study. By implementing this predictive approach, healthcare providers could prioritize interventions, allocate resources efficiently, and deliver timely and targeted care to those who need it the most.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
References
Healey, C., Simmons, S., Manivannan, C. & Ro, Y. Visual analytics for the coronavirus COVID-19 pandemic. Big Data 10, 95–114. https://doi.org/10.1089/big.2021.0023 (2022).
Liu, X. et al. Metabolomics acts as a powerful tool for comprehensively evaluating vaccines approved under emergency: A CoronaVac retrospective study. Front. Immunol. 14, 1168308. https://doi.org/10.3389/fimmu.2023.1168308 (2023).
Peng, J. et al. Diagnostic value of peripheral hematologic markers for coronavirus disease 2019 (COVID-19): A multicenter, cross-sectional study. J. Clin. Lab. Anal. 34, e23475. https://doi.org/10.1002/jcla.23475 (2020).
Mohammad Mehdi, B. et al. Unraveling complex relationships between COVID-19 risk factors using machine learning based models for predicting mortality of hospitalized patients and identification of high-risk group: A large retrospective study. Front. Med. 10, 1170331. https://doi.org/10.3389/fmed.2023.1170331 (2023).
Khadijeh, M., Mostafa, S., Zahra, M.-T. & Hadi, K.-A. Comparing machine learning algorithms for predicting COVID-19 mortality. BMC Med. Inform. Decis. Mak. 22, 2. https://doi.org/10.1186/s12911-021-01742-0 (2022).
Kai, Z. et al. Eleven routine clinical features predict COVID-19 severity uncovered by machine learning of longitudinal measurements. Comput. Struct. Biotechnol. J. 19, 3640–3649. https://doi.org/10.1016/j.csbj.2021.06.022 (2021).
Ali, S.-K., Azin, N. & Abbas, S. Machine learning-based mortality prediction models for smoker COVID-19 patients. BMC Med. Inform. Decis. Mak. 23, 129. https://doi.org/10.1186/s12911-023-02237-w (2023).
Zhenchao, Z., Yuxiang, Q., Yimin, Y. & Ying, Y. A predictive model for disease severity among COVID-19 elderly patients based on IgG subtypes and machine learning. Front. Immunol. 14, 1286380. https://doi.org/10.3389/fimmu.2023.1286380 (2023).
Ting, W. et al. Multi-organ dysfunction in patients with COVID-19: A systematic review and meta-analysis. Aging Dis. 11, 874. https://doi.org/10.14336/ad.2020.0520 (2020).
Sarnak, M. & Jaber, B. Pulmonary infectious mortality among patients with end-stage renal disease. Chest 120, 1883–1887. https://doi.org/10.1378/chest.120.6.1883 (2001).
Ge, J., Pletcher, M. J., Lai, J. C., N3C Consortium. Outcomes of SARS-CoV-2 infection in patients with chronic liver disease and cirrhosis: A national COVID cohort collaborative study. Gastroenterology 161, 1487–1501. https://doi.org/10.1053/j.gastro.2021.07.010 (2021).
Heald, A. H. et al. SARS-CoV-2, diabetes and mortality: month by month variation in mortality rate from June 2020 to June 2021. Cardiovasc. Endocrinol. Metab. 11, e0258. https://doi.org/10.1097/XCE.0000000000000258 (2022).
Williamson, E. et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature 584, 430–436. https://doi.org/10.1038/s41586-020-2521-4 (2020).
Ron, T. G. & Luuk, B. H. CKD is a key risk factor for COVID-19 mortality. Nat. Rev. Nephrol. 16, 705–706. https://doi.org/10.1038/s41581-020-00349-4 (2020).
Andrassy, K. M. Comments on ‘KDIGO 2012 clinical practice guideline for the evaluation and management of chronic kidney disease’. Kidney Int. 84, 622–623. https://doi.org/10.1038/ki.2013.243 (2013).
Charles, C. & Ferris, A. H. Chronic kidney disease. Prim. Care 47, 585–595. https://doi.org/10.1016/j.pop.2020.08.001 (2020).
Paul, E. S. & Adeera, L. Evaluation and management of chronic kidney disease: synopsis of the kidney disease: Improving global outcomes 2012 clinical practice guideline. Ann. Intern. Med. 158, 825–830. https://doi.org/10.7326/0003-4819-158-11-201306040-00007 (2013).
Edmund, J. L., Andrew, S. L. & Paul, E. S. The kidney disease improving global outcomes (KDIGO) guideline update for chronic kidney disease: Evolution not revolution. Clin. Chem. 59, 462–465. https://doi.org/10.1373/clinchem.2012.184259 (2013).
Andrew, S. L. et al. A new equation to estimate glomerular filtration rate. Ann. Intern. Med. 150, 604–612. https://doi.org/10.7326/0003-4819-150-9-200905050-00006 (2009).
Paul, M. et al. Executive summary of the KDIGO 2022 clinical practice guideline for the prevention, diagnosis, evaluation, and treatment of hepatitis C in chronic kidney disease. Kidney Int. 102, 1228–1237. https://doi.org/10.1016/j.kint.2022.07.012 (2022).
Chung, E. Y. M. et al. Incidence and outcomes of COVID-19 in people with CKD: A systematic review and meta-analysis. Am. J. Kidney Dis. 78, 804–815. https://doi.org/10.1053/j.ajkd.2021.07.003 (2021).
Ozdemir, A., Kocak, S., Karabela, S. & Yılmaz, M. Can systemic immune inflammation index at admission predict in-hospital mortality in chronic kidney disease patients with SARS-CoV-2 infection?. Nefrologia 42, 549–558. https://doi.org/10.1016/j.nefroe.2021.09.009 (2022).
von Elm, E. et al. The Strengthening the reporting of observational studies in epidemiology (STROBE) statement: Guidelines for reporting observational studies.
Frndak, S. et al. Reducing the complexity of high-dimensional environmental data: An analytical framework using LASSO with considerations of confounding for statistical inference. Int. J. Hyg. Environ. Health 249, 114116. https://doi.org/10.1016/j.ijheh.2023.114116 (2023).
Variable selection and regularization in quantile regression via minimum covariance determinant based weights. https://doi.org/10.3390/e23010033.
Hicks, S. A. et al. On evaluation metrics for medical applications of artificial intelligence. Sci. Rep. 12, 5979. https://doi.org/10.1038/s41598-022-09954-8 (2022).
Salomão, F., Rita, S., Patrícia, P. & Carlos, P. Validation of the acute physiology and chronic health evaluation (APACHE) II score in COVID-19 patients admitted to the intensive care unit in times of resource scarcity. Cureus 15, e34721. https://doi.org/10.7759/cureus.34721 (2023).
Erin, M. W. et al. Severity of illness scores at presentation predict ICU admission and mortality in COVID-19. J. Emerg. Crit. Care Med. 5, 7. https://doi.org/10.21037/jeccm-20-92 (2021).
Vaidas, V. et al. Comparison of mortality risk evaluation tools efficacy in critically ill COVID-19 patients. BMC Infect. Dis. 21, 1–7. https://doi.org/10.1186/s12879-021-06866-2 (2021).
Xiaojing, Z. et al. Acute physiology and chronic health evaluation II score as a predictor of hospital mortality in patients of coronavirus disease 2019. Crit. Care Med. 48, e657–e665 (2020).
Knight, S. R. et al. Risk stratification of patients admitted to hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: development and validation of the 4C Mortality Score. BMJ 371, m4334. https://doi.org/10.1136/bmj.m4334 (2020).
Yamada, T. et al. Patients with chronic kidney disease have a poorer prognosis of coronavirus disease 2019 (COVID-19): An experience in New York City. Int. Urol. Nephrol. 52, 1405–1406. https://doi.org/10.1007/s11255-020-02494-y (2020).
Ercan, Z. et al. The aggregate index of systemic inflammation may predict mortality in COVID-19 patients with chronic renal failure. Eur. Rev. Med. Pharmacol. Sci. 27, 3747–3752. https://doi.org/10.26355/eurrev_202304_32173 (2023).
Guo, X. et al. Serum prealbumin deserves more significance in the early triage of COVID-19 patients. Clin. Chem. Lab. Med. 58, e209–e211. https://doi.org/10.1515/cclm-2020-0663 (2020).
Zinellu, A. & Mangoni, A. Serum prealbumin concentrations, COVID-19 severity, and mortality: A systematic review and meta-analysis. Front. Med. 8, 638529. https://doi.org/10.3389/fmed.2021.638529 (2021).
Zinellu, A. & Mangoni, A. A. Serum prealbumin concentrations, COVID-19 severity, and mortality: A systematic review and meta-analysis. Front. Med. (Lausanne) 8, 638529. https://doi.org/10.3389/fmed.2021.638529 (2021).
Camp, J. & Jonsson, C. A role for neutrophils in viral respiratory disease. Front. Immunol. 8, 550. https://doi.org/10.3389/fimmu.2017.00550 (2017).
Kåsine, T. et al. Neutrophil count predicts clinical outcome in hospitalized COVID-19 patients: Results from the NOR-Solidarity trial. J. Intern. Med. 291, 241–243. https://doi.org/10.1111/joim.13377 (2022).
Qin, C. et al. Dysregulation of immune response in patients with coronavirus 2019 (COVID-19) in Wuhan, China. Clin. Infect. Dis. 71, 762–768. https://doi.org/10.1093/cid/ciaa248 (2020).
Alawadi, F. et al. Risk and predictors of severity and mortality in patients with type 2 diabetes and COVID-19 in Dubai. World J. Diabetes 14, 1259–1270. https://doi.org/10.4239/wjd.v14.i8.1259 (2023).
Menez, S. et al. Evaluation of plasma biomarkers to predict major adverse kidney events in hospitalized patients with COVID-19. Am. J. Kidney Dis. 82, 322-332.e321. https://doi.org/10.1053/j.ajkd.2023.03.010 (2023).
Borges, L., Pithon-Curi, T., Curi, R. & Hatanaka, E. COVID-19 and neutrophils: The relationship between hyperinflammation and neutrophil extracellular traps. Mediat. Inflamm. 2020, 8829674. https://doi.org/10.1155/2020/8829674 (2020).
Ceruti, S. et al. Reply to Böning et al. Comment on Ceruti et al. Temporal changes in the oxyhemoglobin dissociation curve of critically Ill COVID-19 patients. J. Clin. Med. 11, 788. https://doi.org/10.3390/jcm11154547 (2022).
Anushiravani, A. et al. A reliable prognostic marker for liver dysfunction in COVID-19 infection. Middle East J. Dig. Dis. 13, 193–199. https://doi.org/10.34172/mejdd.2021.225 (2021).
Kraut, J., Lew, V. & Madias, N. Re-evaluation of total CO2 concentration in apparently healthy younger adults. Am. J. Nephrol. 48, 15–20. https://doi.org/10.1159/000489965 (2018).
Khunti, K. et al. COVID-19, hyperglycemia, and new-onset diabetes. Diabetes Care 44, 2645–2655. https://doi.org/10.2337/dc21-1318 (2021).
Chen, J., Wu, C., Wang, X., Yu, J. & Sun, Z. The impact of COVID-19 on blood glucose: A systematic review and meta-analysis. Front. Endocrinol. 11, 574541. https://doi.org/10.3389/fendo.2020.574541 (2020).
Funding
This research was funded by the National Natural Science Foundation of China, Grant Number 82002211.
Author information
Contributions
LL conceived and designed the study. SY conducted the data collection. LL and SY wrote the manuscript. CY and PG critically reviewed the manuscript. All authors have read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
This retrospective study was approved by the Ethics Committee of the Second Affiliated Hospital of Dalian Medical University, which also waived the requirement for informed consent. The study was conducted in accordance with the relevant guidelines and regulations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Luo, L., Gao, P., Yang, C. et al. Predictive modeling of COVID-19 mortality risk in chronic kidney disease patients using multiple machine learning algorithms. Sci Rep 14, 26979 (2024). https://doi.org/10.1038/s41598-024-78498-w
DOI: https://doi.org/10.1038/s41598-024-78498-w