J Korean Acad Nurs : Journal of Korean Academy of Nursing

OPEN ACCESS

Research Paper
Development of a machine learning-based prediction model for early hospital readmission after kidney transplantation: a retrospective study
Hye Jin Chong1, Ji-hyun Yeom2

DOI: https://doi.org/10.4040/jkan.25030
Published online: November 21, 2025

1Department of Nursing, Sunchon National University, Suncheon, Korea

2Division of Nephrology, Jeonbuk National University Hospital, Jeonju, Korea

Corresponding author: Hye Jin Chong, Department of Nursing, Sunchon National University, 255 Jungang-ro, Suncheon 57922, Korea. E-mail: hyejin@scnu.ac.kr
• Received: March 11, 2025   • Revised: September 28, 2025   • Accepted: September 28, 2025

© 2025 Korean Society of Nursing Science

This is an Open Access article distributed under the terms of the Creative Commons Attribution NoDerivs License (http://creativecommons.org/licenses/by-nd/4.0). If the original work is properly cited and retained without any modification or reproduction, it can be used and re-distributed in any format and medium.

  • Purpose
    This study aimed to develop and validate a machine learning-based prediction model for early hospital readmission (EHR) post-kidney transplantation.
  • Methods
    The study was conducted at the organ transplantation center of a university hospital, utilizing data from 470 kidney transplant recipients. We built and trained four machine learning models and tested them to identify the strongest EHR predictors. Predictive performance was evaluated using confusion matrices and the area under the receiver operating characteristic curve (ROC AUC).
  • Results
    Among the 470 kidney transplant recipients with a mean age of 46.1±15.30 years, 322 (68.5%) were males, and 74 (15.7%) were readmitted within 30 days after kidney transplantation. After the random over-sampling examples method was applied, the resampled data contained 241 (51.2%) recipients with EHR. The random forest model achieved the best performance, with an ROC AUC of .87 (validation set) and .82 (test set). The 15 most important features were steroid pulse therapy (recipient), cerebrovascular accident (recipient), heart failure (recipient), male sex (donor), cardiovascular disease (recipient), weekend discharge (recipient), peritoneal dialysis (recipient), cerebrovascular accident as the cause of brain death (donor), current smoker (recipient), cardiac arrest (donor), previous kidney transplantation (recipient), age (donor), hypertension (donor), male sex (recipient), and dialysis duration (recipient).
  • Conclusion
    Our framework demonstrated strong predictive interpretability. It can support appropriate and effective clinical decision-making by assisting transplant professionals in stratifying recipients based on their risk of EHR, prioritizing post-discharge care and follow-up for high-risk individuals, and allocating targeted interventions such as closer monitoring or education.
Kidney transplantation (KT) is the optimal renal replacement therapy option for patients with end-stage renal disease (ESRD) compared with dialysis [1]. Successful KT reduces morbidity and mortality in these patients, improves quality of life, and is a cost-effective alternative to dialysis [2].
Early hospital readmission (EHR), defined as an unplanned hospitalization within 30 days post-KT discharge [3], is a prevalent and significant issue. Approximately 30% of KT recipients in the United States and Korea are readmitted within this period, with several European cohorts reporting rates of 20%–35% [4,5]. This is considerably higher than the 4%–15% observed for other surgical procedures. EHR in KT recipients is associated with markedly worse outcomes, including increased healthcare costs, a two‐fold higher risk of graft failure, a three‐fold rise in subsequent readmissions, and up to a 75% increase in mortality [6-11]. Consequently, reducing EHR remains a critical priority for transplant healthcare providers and systems.
Several factors contribute to the elevated readmission risk. KT is frequently performed on patients with compromised baseline health due to ESRD, and many recipients have pre‐existing comorbidities, including diabetes, hypertension, and cardiovascular disease (CVD) [12,13]. The long‐term burden of chronic illness and associated frailty heightens postoperative vulnerability. Post-discharge, KT recipients must follow complex immunosuppressive regimens, which increase the risks of infection, metabolic disturbances, and drug‐related toxicities [12]. Additionally, donor-related and procedural factors contribute to post-transplant complications and elevated EHR risk. Donor-specific factors include age, cause of death, and cerebrovascular history, while transplant process characteristics include cold ischemic time, induction therapy, and duration of intensive care unit (ICU) stay [7]. Policymakers and health insurance services generally use the 30-day EHR rate as an important proxy for evaluating hospital quality due to its strong correlation with mortality [14]. Common causes of EHR within this period include infections, acute rejection episodes, surgical complications, fluid imbalances, and adverse effects from immunosuppressive therapy, all typically occurring in the early postoperative phase post-KT [6,15].
Previous studies indicate that many EHRs are preventable through timely and coordinated interventions, including early outpatient recipient follow-up, medication reconciliation, and targeted recipient education [16]. They emphasize the importance of identifying high-risk recipients who could benefit from these interventions. However, specific risk factors contributing to EHR remain inconsistently reported, with many studies based on the US healthcare systems, limiting their applicability to Korea due to differences in clinical practice, healthcare access, and recipient management protocols [17]. Population-specific evidence on Korean KT recipients also remains limited. Therefore, identifying risk features within the Korean clinical setting is critical for guiding risk-based surveillance and improving recipient outcomes.
Prediction models have been employed for risk assessment in healthcare settings in the past [6]. These multivariable logistic regression models facilitate early identification of individuals at risk of illness or adverse events, enabling effective interventions for those who could benefit the most from identifying specific risk factors. However, only a few studies attempt to predict EHR post-KT, reporting a low accuracy of .61–.69 [18]. These data are frequently restricted to surveillance datasets of varying quality and limited granularity. Prior studies also focus exclusively on isolated features associated with EHR post-KT, thereby limiting their impact [19,20].
Notably, machine learning (ML) methods have gained traction in healthcare for outcome prediction and clinical decision support due to their ability to model complex, non-linear relationships in structured clinical data [21]. Compared with traditional linear models, including logistic regression, ML approaches provide enhanced predictive accuracy for hospital readmission by utilizing flexible algorithms that capture intricate interactions among features [22]. ML methods are particularly advantageous for two reasons. First, they automatically detect non-linear relationships and higher-order interactions without requiring a predefined model structure [23], which is vital for predicting EHR post-KT, where multiple recipient, donor, and perioperative factors interact. Second, they facilitate the incorporation of numerous features. In our study, these features were selected based on prior literature and clinical relevance. Although feature selection was not entirely automated, the ML models evaluated each feature’s importance and weighted the features according to their relative contributions to prediction performance. ML models typically scale to large datasets. Our dataset includes 470 transplant cases, which is smaller than national registry cohorts but remains statistically adequate. It includes comprehensive, multi-dimensional clinical features across recipient, donor, and transplant process domains. This sample size is comparable to prior ML-based transplantation studies [24] and supports predictive modeling with meaningful interpretation and internal validation.
Therefore, the study aimed to address previous research limitations [18,25] that primarily utilize linear models to identify EHR risk factors in KT recipients. We applied ML techniques to Korean kidney transplant data to predict EHR risk. Besides recipient-related features, this study incorporated donor-specific factors and the transplant process characteristics. Here, complex factors refer to the combined influence of donor, recipient, and procedural features on EHR. While ML methods can model interactions, our analysis focused on identifying key predictors using feature importance from algorithms including decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost), and support vector machine (SVM). The developed models may support risk stratification and early intervention for preventing EHR in KT recipients [26].
1. Data extraction and content
This retrospective observational cohort was derived from the electronic medical record (EMR) system of a single university transplant center in South Korea and reported following the Strengthening the Reporting of Observational Studies in Epidemiology guidelines. The inclusion criteria were (1) KT recipients aged ≥18 years; (2) those who underwent KT between January 1, 2000, and December 31, 2022; and (3) those who received follow-up care at the study institution. Recipients were excluded if they were lost to follow-up within 30 days.
Data were extracted from the EMR system between April 1 and August 31, 2024. A researcher with prior experience in transplant data abstraction and serving as a KT coordinator retrieved the data from the hospital’s transplant center. A board-certified transplant nephrologist reviewed all data to ensure clinical validity.
The primary outcome was unplanned rehospitalization within 30 days of discharge from the index hospitalization during which KT was performed. Only unplanned hospital readmissions were included, defined as the first unexpected inpatient admission, including those occurring through the emergency room. Planned readmissions—including protocol biopsies, scheduled follow-up procedures, or elective admissions—were excluded from the outcome’s definition to focus on clinically relevant, unanticipated events.
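The outcome definition above can be expressed as a small labeling rule. The following is a minimal sketch, not the authors' abstraction procedure: the `(admission_date, planned)` tuple layout and the function name are hypothetical, and treating day 30 as inside the window is an assumption, since the text does not state how the boundary was handled.

```python
from datetime import date


def is_early_readmission(discharge, admissions, window_days=30):
    """Flag EHR: first unplanned admission within `window_days` of discharge.

    `admissions` is a list of (admission_date, planned) tuples; this layout
    is hypothetical and only serves the illustration.
    """
    # Planned admissions (protocol biopsies, elective stays) never count.
    unplanned = sorted(d for d, planned in admissions if not planned and d >= discharge)
    if not unplanned:
        return False
    # Assumption: the window is inclusive of day `window_days`.
    return (unplanned[0] - discharge).days <= window_days


# Hypothetical recipient discharged on 1 March 2022.
example = [
    (date(2022, 3, 10), True),   # planned protocol biopsy, ignored
    (date(2022, 3, 25), False),  # emergency admission on day 24
]
flag = is_early_readmission(date(2022, 3, 1), example)
```

Only the first unplanned admission is inspected, which mirrors the statement that the outcome is the first unexpected inpatient admission regardless of how many admissions follow.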
Features were selected based on our clinical experience and prior research [3,6,27-29]. The features included in the model were initially selected for their clinical relevance and further refined through consultation with a multidisciplinary team comprising a transplant physician, nurse, and transplant coordinator. They were organized as follows: (1) donor‑related characteristics (e.g., age, sex, and cause of brain death), (2) recipient‑related characteristics (e.g., body mass index, number of human leukocyte antigen mismatches, and dialysis modality and duration), and (3) transplant process factors (e.g., cold ischemic time, induction immunosuppression, and delayed graft function) (Table 1).
2. Statistical analyses
Clinically relevant features were selected and statistically assessed for associations with EHR using appropriate parametric and non-parametric tests. All statistical analyses were performed using STATA ver. 18.0 (Stata Corp., 2023) and R software ver. 4.3.2 (R Core Team, 2023; https://www.R-project.org/). Descriptive statistics were generated using means and standard deviations for continuous features, and frequencies and percentages for categorical features. To compare the predictive features between recipients readmitted within 30 days post-KT and those not readmitted within 30 days, the t-test and chi-square or Fisher’s exact test were used for continuous and categorical features, respectively. A two-sided p-value <.05 was considered statistically significant for all tests.
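For a binary feature, the group comparison described above reduces to a test on a 2×2 table. The sketch below computes a Pearson chi-square statistic and its p-value with the Python standard library. It is illustrative only: the analyses were run in Stata and R, the absence of a continuity correction is an assumption, and the counts in the example are hypothetical rather than taken from Table 1.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = readmitted / not readmitted,
# columns = feature present / feature absent.
stat, p_value = chi_square_2x2(10, 20, 30, 40)
significant = p_value < 0.05  # two-sided threshold stated in the text
```

When expected cell counts are small, Fisher's exact test is the alternative named in the text; it is not implemented here.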
3. Preprocessing

1) Handling missing data

A notable proportion of cases were labeled as “unknown” for some categorical variables, such as diabetes mellitus, cardiac arrest, and panel reactive antibody (PRA). Rather than removing these observations or applying imputation, we retained “unknown” as a distinct category since it may represent clinically relevant uncertainty or an unrecorded status in real-world settings. This approach preserves sample integrity and reflects potential underlying patterns in clinical data. Tree-based models, including RF and XGBoost, are known to handle such categories robustly without requiring explicit imputation.
We excluded participants with >10% missing data. Among the 650 KT recipients, 470 were included in the primary analysis. Continuous features were imputed using cohort means, and categorical features using cohort modes, based on the following rationale: (1) residual missingness was low, (2) single‑value imputation preserves sample size, and (3) subsequent tree‑based algorithms are inherently robust to minor variance distortion introduced by mean/mode substitution.
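The exclusion and imputation rules in this paragraph can be sketched in a few lines. This is an illustration under stated assumptions, not the study code: records are represented as dictionaries with `None` for missing values, the column names are hypothetical, and the demonstration uses a toy threshold of 0.5 because three-column toy rows cannot show the 10% rule.

```python
from collections import Counter
from statistics import mean


def exclude_incomplete(rows, max_missing=0.10):
    """Drop records whose share of missing (None) fields exceeds `max_missing`."""
    return [dict(r) for r in rows
            if sum(v is None for v in r.values()) / len(r) <= max_missing]


def impute_mean_mode(rows, continuous, categorical):
    """Fill continuous gaps with the cohort mean and categorical gaps with the mode."""
    rows = [dict(r) for r in rows]
    for col in continuous:
        fill = mean(r[col] for r in rows if r[col] is not None)
        for r in rows:
            if r[col] is None:
                r[col] = fill
    for col in categorical:
        observed = [r[col] for r in rows if r[col] is not None]
        fill = Counter(observed).most_common(1)[0][0]
        for r in rows:
            if r[col] is None:
                r[col] = fill
    return rows


# Hypothetical toy records (column names are illustrative).
records = [
    {"age": 40, "dialysis_months": 24, "sex": "M"},
    {"age": None, "dialysis_months": 36, "sex": "M"},
    {"age": 60, "dialysis_months": 12, "sex": None},
    {"age": None, "dialysis_months": None, "sex": "F"},
]
kept = exclude_incomplete(records, max_missing=0.5)  # toy threshold; the study used 0.10
completed = impute_mean_mode(kept, ["age", "dialysis_months"], ["sex"])
```

Note that in a leakage-free pipeline the fill values would be computed on the training folds only and then reused for the hold-out sets, as the evaluation section describes.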

2) Data encoding and scaling

Categorical features were handled differently depending on each ML algorithm’s requirements. They were specified as factors in R for DT, RF, and SVM, enabling each model to process them without explicit encoding. For XGBoost, which does not directly support categorical features, they were one-hot encoded. For continuous features, neither standardization nor normalization was applied for the tree-based models (RF, DT, and XGBoost), which do not require feature scaling. ML algorithms were employed to predict EHR post-KT since they are robust against issues including overfitting and collinearity [30]. Random over-sampling examples (ROSE) resampling was applied only within the training folds during cross-validation (CV). Stratified sampling based on EHR status was used to preserve the original distribution of the target feature.
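One-hot encoding with “unknown” kept as its own level can be illustrated as follows. This is a minimal sketch in plain Python; the study performed encoding in R, and the `one_hot` helper and the `diabetes` column are hypothetical names used only for the example.

```python
def one_hot(rows, column):
    """Replace `column` with one 0/1 indicator per observed level."""
    levels = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        out = {k: v for k, v in r.items() if k != column}
        for level in levels:
            out[f"{column}={level}"] = 1 if r[column] == level else 0
        encoded.append(out)
    return encoded


# "unknown" is treated as a regular level rather than as missing data.
patients = [{"diabetes": "yes"}, {"diabetes": "no"}, {"diabetes": "unknown"}]
encoded = one_hot(patients, "diabetes")
```

Each record ends up with exactly one indicator set to 1, so the "unknown" status remains visible to the model instead of being dropped or imputed.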

3) Class balance

To address class imbalance—where only 15.7% (74/470) of recipients experienced EHR—we applied the ROSE method to the training set alone. ROSE synthetically increased the minority class prevalence to approximately 51%, while preserving the overall feature distribution [31,32]. This technique generates synthetic data points for the minority class using a smoothed resampling approach, which is more robust than simple duplication and helps prevent overfitting [30].
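The smoothed resampling idea behind ROSE can be sketched for continuous features as follows. This is a simplified illustration, not the R implementation used in the study: the Silverman-style bandwidth, the use of the population standard deviation, the restriction to numeric features and a binary outcome, and all names are assumptions made for the example.

```python
import random
from statistics import pstdev


def rose_sample(X, y, n_new, p_minority=0.5, minority=1, seed=0):
    """Draw `n_new` synthetic rows by smoothed bootstrap within each class."""
    rng = random.Random(seed)
    d = len(X[0])
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    # Per-class, per-feature Gaussian bandwidth (assumed Silverman-style rule).
    bandwidth = {}
    for label, rows in by_class.items():
        scale = (4 / ((d + 2) * len(rows))) ** (1 / (d + 4))
        bandwidth[label] = [scale * pstdev([r[j] for r in rows]) for j in range(d)]
    majority = next(label for label in by_class if label != minority)
    X_new, y_new = [], []
    for _ in range(n_new):
        # Pick the class first, then a seed row, then jitter around that row.
        label = minority if rng.random() < p_minority else majority
        centre = rng.choice(by_class[label])
        X_new.append([rng.gauss(c, h) for c, h in zip(centre, bandwidth[label])])
        y_new.append(label)
    return X_new, y_new


# Hypothetical imbalanced toy data: 8 non-readmitted, 2 readmitted.
X = [[40.0, 12.0], [45.0, 30.0], [52.0, 18.0], [60.0, 48.0], [38.0, 6.0],
     [55.0, 36.0], [47.0, 20.0], [63.0, 60.0], [58.0, 72.0], [66.0, 84.0]]
y = [0] * 8 + [1] * 2
X_new, y_new = rose_sample(X, y, n_new=1000)
minority_share = sum(y_new) / len(y_new)
```

Because every synthetic row is drawn near, rather than copied from, an observed row, the minority class is enlarged without exact duplicates, which is the property the paragraph contrasts with simple duplication.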
4. Model construction
We developed four ML models to predict EHR after KT—DT, RF, XGBoost, and SVM with a radial basis function kernel (SVM-RBF)—chosen for their ability to model non-linear relations in structured clinical data. Because SVM is distance-based, continuous variables were min–max scaled as described in preprocessing. To balance complexity and generalization, we searched the following hyperparameter ranges: DT cp 0.001–0.05, RF mtry 3–10, SVM-RBF C ∈ {0.1, 1, 10} with sigma fixed at 0.01, and for XGBoost η ∈ {0.3, 0.4}, max_depth 1–3, subsample {0.50, 0.75, 1.00}, up to 140 boosting rounds. Hyperparameters were tuned via grid search, with the cross-validation (CV) protocol, model-selection criteria, and performance results detailed in Model Evaluation. XGBoost grid outcomes are summarized in Supplementary Figure 1, and CV summaries in Supplementary Table 1.
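The search space described above can be enumerated explicitly to see how many candidate configurations a grid search visits. The sketch below covers only the grids whose values are fully stated; the DT `cp` grid and the boosting-round grid are omitted because their step sizes are not given, and treating `mtry` as the integers 3 through 10 is an assumption. The dictionary keys are illustrative labels, not a specific library's argument names.

```python
from itertools import product


def expand_grid(grid):
    """Return every combination of the listed hyperparameter values."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]


xgb_candidates = expand_grid({
    "eta": [0.3, 0.4],
    "max_depth": [1, 2, 3],
    "subsample": [0.50, 0.75, 1.00],
})
svm_candidates = expand_grid({"C": [0.1, 1, 10], "sigma": [0.01]})
rf_candidates = expand_grid({"mtry": list(range(3, 11))})  # assumed integer steps

grid_sizes = {
    "xgboost": len(xgb_candidates),  # per number of boosting rounds
    "svm_rbf": len(svm_candidates),
    "rf": len(rf_candidates),
}
```

Under these assumptions, each XGBoost candidate would be refit once per cross-validation fold for every boosting-round setting, which is why the grid is kept small relative to the sample size.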
5. Model evaluation
We adopted a leakage-free evaluation protocol following model construction. The dataset was split once into training/validation/test sets (80%/10%/10%) using stratified sampling by EHR status. All hyperparameter tuning and model selection were performed only on the training data via stratified 10-fold CV with grid search; all preprocessing and any class-imbalance handling were fit within the training folds and applied to the hold-out sets without refitting.
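A stratified 80%/10%/10% split can be sketched as below. This is an illustration rather than the authors' R code: the function name, the rounding rule, and the seed are assumptions, while the class counts (74 readmitted out of 470) come from the study.

```python
import random


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split indices into parts while keeping the class ratio in each part."""
    rng = random.Random(seed)
    parts = [[] for _ in fractions]
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        start = 0
        for k, frac in enumerate(fractions):
            # The last part takes whatever remains so no index is lost.
            last = k == len(fractions) - 1
            end = len(idx) if last else start + round(frac * len(idx))
            parts[k].extend(idx[start:end])
            start = end
    return parts


labels = [1] * 74 + [0] * 396  # 74 EHR cases among 470 recipients
train, valid, test = stratified_split(labels)
train_ehr_rate = sum(labels[i] for i in train) / len(train)
```

With these counts, each hold-out set contains roughly 47 recipients, so metrics estimated on the validation and test sets carry wide confidence intervals; resampling and tuning would be applied to the `train` indices only.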
The primary metric was the receiver operating characteristic curve (ROC AUC), and secondary metrics were the precision–recall curve (PRC AUC), F1-score, accuracy, sensitivity, and specificity. For thresholded metrics, a single probability threshold was fixed by maximizing the mean F1 across training–CV folds and then held constant for validation and test. Uncertainty for ROC/PRC AUCs was quantified with nonparametric bootstrap (1,000 resamples) on the validation and test sets. CV summaries are reported in Supplementary Table 1.
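The metrics in this paragraph can be illustrated with a short standard-library sketch. It makes several simplifying assumptions: the threshold is chosen on a single set of scores rather than by averaging F1 across cross-validation folds, the bootstrap uses a plain percentile interval, and the function names and toy scores are hypothetical.

```python
import random


def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ROC AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC is undefined unless both classes are present
            stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[round(alpha / 2 * n_boot)], stats[round((1 - alpha / 2) * n_boot) - 1]


def f1_at(y_true, scores, threshold):
    """F1-score when scores at or above `threshold` are classified as EHR."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def best_f1_threshold(y_true, scores):
    """Return the observed score that maximizes F1 when used as the cut-off."""
    return max(sorted(set(scores)), key=lambda t: f1_at(y_true, scores, t))


# Hypothetical predicted probabilities for four recipients.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
threshold = best_f1_threshold(y_true, scores)
ci_low, ci_high = bootstrap_ci(y_true, scores)
```

Fixing the threshold before looking at the validation and test sets, as the text describes, keeps sensitivity, specificity, and F1 on those sets from being optimistically tuned.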
For XGBoost, grid search evaluated learning rate (η ∈ {0.3, 0.4}), max_depth (1–3), subsample (0.50, 0.75, 1.00), and the number of boosting iterations (≥100). The highest cross-validated ROC AUC (approximately .78) was obtained at η=0.3, max_depth=2, and subsample=1.00 (see Supplementary Figure 1), which guided the final hyperparameters used in subsequent analyses. Across models, RF and XGBoost were retained as the top performers; on the validation/test sets they achieved accuracy around .70–.79, F1-score of .79, and ROC AUC between .79 and .87, as summarized in Table 2.
6. Feature importance analysis
Permutation feature importance was computed from RF and XGBoost to identify major predictors of EHR. Additionally, a shallow DT was fitted as an illustrative surrogate model for visualization (Figure 1). Each node within the model processes data based on specific features, while the leaf nodes present the final prediction outcomes, indicating EHR status with “yes.”
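Permutation feature importance measures how much a model's score falls when one feature's values are shuffled. The sketch below is a generic illustration, not the study's implementation: accuracy is used as the score (the score used in the study is not stated), and the toy classifier and feature names are hypothetical.

```python
import random


def accuracy(predict, X, y):
    """Share of rows for which the model's prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy classifier that depends only on feature 0 (hypothetical "steroid_pulse");
# feature 1 (hypothetical "donor_male") is ignored by the model.
def toy_model(row):
    return 1 if row[0] == 1 else 0


X = [[1, 0], [0, 1], [1, 1], [0, 0], [1, 0], [0, 1]]
y = [1, 0, 1, 0, 1, 0]
importances = permutation_importance(toy_model, X, y)
```

A feature the model never uses receives an importance of zero, which illustrates why these scores describe the fitted model's reliance on a feature rather than a causal effect of that feature.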
7. Ethical considerations
The Institutional Review Board of the National Hospital approved this study and granted a waiver of informed consent (No. 2024-01-057-002) since it involved the use of existing, de-identified retrospective data without direct interaction with or additional risk to the participants.
1. Demographic and clinical characteristics
Among the 470 identified KT recipients, 74 (15.7%) were readmitted within 30 days post-KT. The original imbalanced samples were upsampled using the ROSE method, resulting in 241 readmitted (51.2%) and 229 non-readmitted (48.8%) recipients.
The mean post-KT hospital stay was 19 days, with 92.3% of recipients discharged on weekdays (Table 1). Baseline characteristics were compared between the two groups based on the EHR status (Table 1). Differences were observed between the groups in terms of recipient discharge days (weekdays or weekends), duration of ICU stay post-KT surgery, and administration of steroid pulse therapy (all p<.05).
2. Model evaluation
Table 2 summarizes each ML model’s performance metrics based on validation and test datasets. Overall, the RF and XGBoost models demonstrated the highest predictive performance, while the SVM and DT models showed relatively lower but interpretable outcomes.
The RF model achieved the highest accuracy on the validation (.79; 95% CI, .63–.89) and test (.79; 95% CI, .65–.90) sets. It exhibited balanced sensitivity (.83) and specificity (.74) in the validation set, along with an ROC AUC of .87 and a high PRC AUC of .90 (95% CI, .78–.97), indicating strong discrimination and precision–recall balance.
The XGBoost model showed slightly lower performance in the validation set (accuracy, .70; 95% CI, .55–.83; ROC AUC=.79) but matched the RF model in test set accuracy (.79; 95% CI, .65–.90). Its test set ROC AUC was .81 and its PRC AUC was .82 (95% CI, .66–.92), suggesting that, despite potential sensitivity to overfitting during training, it generalized well to the held-out test set.
The SVM model demonstrated moderate classification performance [33]. Its validation and test accuracies were .66 (95% CI, .51–.79) and .67 (95% CI, .52–.80), respectively, which were lower than those of the RF and XGBoost models (both .79) but higher than that of the DT model (.60) in the test set. Its ROC AUC scores were .74 (validation) and .72 (test), and the PRC AUC was .76 (95% CI, .56–.90), indicating fair but suboptimal discriminative ability. According to prior benchmarks [34], ROC AUC values between .70 and .80 are typically interpreted as acceptable or moderate in classification tasks involving imbalanced datasets. The SVM provided stable but less discriminative performance than the RF and XGBoost models, supporting its classification as a moderately effective model for EHR prediction in this context.
While the DT model showed moderate performance in the validation set (accuracy, .72; 95% CI, .57–.84; ROC AUC=.73) [33], it experienced a notable performance drop on the test set (accuracy, .60; 95% CI, .45–.74; ROC AUC=.61). Sensitivity declined markedly to .40, although specificity remained relatively high (.83). The PRC AUC of .74 (95% CI, .53–.90) was consistent with its limited generalizability compared to the other models.
As shown in Table 2 and Supplementary Figure 1, model performance was compared across four classifiers. Among them, the RF model achieved the most stable and superior performance across validation and test datasets, recording the highest ROC AUC (validation=.87, test=.82) and F1-score (.79). These results formed the basis for selecting RF as the primary model for further interpretation and feature importance analysis.
3. Feature importance
Figure 1 depicts the structure of the DT model used to predict EHR post-KT. The root node begins with steroid pulse therapy, and each subsequent split is determined by key clinical features, including cerebrovascular accident (CVA) as the cause of donor brain death, donor sex, discharge day (weekend vs. weekday), heart failure, and dialysis modality. Each node presents the predicted outcome (EHR=1 or 0), the associated probability of readmission, and the proportion of the population represented. Blue and red nodes indicate predicted readmission (EHR=1) and no readmission (EHR=0), respectively. This tree structure illustrates how combinations of clinical conditions stratify risk in a stepwise manner.
The first criterion was whether the recipient had received steroid pulse therapy. If a recipient did not receive this therapy (value=0), the probability of not being readmitted was 44%, representing 85% of the sample. However, readmission probability increased to 90% if a recipient received steroid pulse therapy (value=1), representing 15% of the sample. The second criterion was whether the donor’s brain death was due to CVA. Among recipients who did not receive steroid pulse therapy, those whose donor’s brain death was not due to CVA (value=0) showed a 38% probability of avoiding readmission, representing 63% of the sample. Conversely, readmission probability increased to 64% if the donor’s brain death was due to CVA (value=1). The third criterion was donor sex. Among recipients whose donor’s brain death was due to CVA, the readmission probability was 83% if the donor was female (value=0), representing 13% of the sample. If the donor was male (value=1), the probability of not being readmitted was 34%. The fourth criterion was discharge day (weekend vs. weekday). Readmission probability was 34% if the recipient was discharged on a weekend (value=1) but increased to 69% if discharged on a weekday (value=0). For cases involving weekend discharge (value=1), the DT advanced to the next split based on heart failure status. Readmission probability was 61% if the recipient had heart failure (value=1). If the recipient did not have heart failure (value=0), the DT proceeded to another split, where readmission probability increased to 65% if the recipient had received peritoneal dialysis (value=1).
We analyzed the importance of various features using the RF and XGBoost models, both of which demonstrated relatively high accuracy in predicting EHR post-KT. Figure 2 illustrates feature importance for the best-performing RF and XGBoost models. Feature importance analysis in the RF model showed that steroid pulse therapy was the most influential feature, with a feature importance score of .12, greatly impacting the model’s predictions. Other critical features included CVA (donor), heart failure (recipient), male sex (donor), and CVD (recipient), all of which played a vital role in predicting readmission. Since RF assesses feature importance by evaluating numerous DTs, these features could be key decision points across multiple trees. Similarly, analysis of the XGBoost model identified steroid pulse therapy as the most influential feature, aligning with the findings of the RF model.
Steroid pulse therapy was even more prominent in the XGBoost model, with a feature importance score of .19 compared with .12 in the RF model, indicating that XGBoost placed relatively greater weight on this variable during prediction. This difference may reflect the boosting mechanism of XGBoost, which builds trees iteratively to correct the errors of previous iterations and can thereby concentrate importance on a few key features such as steroid pulse therapy. Additional highly ranked features included cold ischemic time (donor), the number of other comorbid conditions (recipient), age (recipient), and dialysis duration (recipient).
Collectively, the results of the feature importance analysis from both models suggested that key features, including steroid pulse therapy, CVA as the cause of donor brain death, recipient comorbid conditions, and dialysis-related factors, play a critical role in predicting EHR post-KT. Although RF and XGBoost assess feature importance differently, both models ranked similar features highly. A detailed analysis of specific features may further aid in improving the developed models’ performance.
We developed and validated an ML-based model to predict EHR post-KT using a dataset from Korean KT recipients. On the test set, the RF model demonstrated high predictive performance (ROC AUC=.82) and identified 15 key clinical predictors, including steroid pulse therapy, CVA as the cause of donor brain death, and other donor-related characteristics. These findings suggest that integrating this model into clinical workflows could enhance individualized risk assessment and support risk-based post-transplant care planning, particularly for high-risk populations. The early and data-driven identification of at-risk recipients can enable more precise allocation of follow-up resources, targeted nursing interventions, and closer clinical monitoring.
EHR occurred in 15.7% of KT recipients in this study, a rate slightly lower than those reported in studies from the United States (18%–47%) [12,15,18,25,35], Canada (22.4%) [24], and Korea (approximately 30%) [15,36]. Although direct comparisons are limited by differences in healthcare systems and recipient populations, variations in the operational definition of EHR likely contribute to these discrepancies. We strictly defined EHR as the first unplanned hospital readmission within 30 days of discharge from the index admission during which KT was performed, regardless of the number or type of admissions [24,25]. Previous studies have defined EHR more broadly or included planned readmissions (such as protocol biopsies or delayed procedures), leading to inflated reported rates [3]. Therefore, our more conservative definition may partly explain the lower EHR rate.
To our knowledge, this is the first study to develop and validate an ML-based prediction model for EHR using clinical data from Korean KT recipients. Our RF model achieved a test-set ROC AUC of .82, with an accuracy of .79 on both the validation and test datasets. Although our dataset (N=470) is modest compared to national registries, it represents one of the most comprehensive Korean datasets incorporating recipient- and donor-level features. These findings have important implications for guiding risk-based follow-up strategies and personalized post-transplant care in Korean KT recipients. ML algorithms, including DT, RF, XGBoost, and SVM, were employed because of their ability to handle complex and non-linear clinical data. Compared with traditional regression methods, RF and XGBoost aggregate multiple DTs to optimize classification and assess feature importance, an approach shown to be effective in predicting readmission in transplant-specific and general medical populations [18,37]. For example, a recent US-based study involving >2,000 KT recipients reported a 30.7% EHR rate within 30 days and demonstrated that ML-based models outperformed logistic regression in risk prediction [18]. Our model achieved slightly higher predictive performance in a Korean context despite using fewer input features. This suggests that even localized datasets can yield meaningful predictions to support early clinical decision-making.
Although traditional statistical approaches, including logistic regression, have been used to identify risk factors, they assume linear relationships and cannot effectively capture complex, non-linear interactions among multiple predictors [21,38]. In contrast, ML methods, including RF and XGBoost, provide several advantages [39]. First, they require no assumptions about feature distributions or linearity and are less sensitive to multicollinearity. Second, these methods can automatically detect and incorporate higher-order interactions that might go unnoticed in traditional models [38]. Third, they are generally more robust to noise and overfitting, particularly when combined with techniques such as CV and bootstrap aggregation [40]. Recent studies have reinforced ML’s clinical utility for predicting readmission risk in transplant populations. Arenson et al. [18] developed models based on clinical notes and electronic health record data in KT recipients that outperformed traditional models. Similarly, Orfanoudaki et al. [41] highlighted the role of metabolic features, including glucose variability, in readmission risk. Together, these studies support the use of ML for accurate prediction and for interpretable identification of the most influential clinical features.
Here, RF and XGBoost yielded more stable and accurate predictions than logistic regression, providing interpretable rankings of predictor importance that can facilitate practical clinical implementation. Therefore, ML-based models can enhance risk stratification for EHR by identifying high-risk KT recipients who might otherwise be overlooked, ultimately supporting the delivery of tailored interventions and improved post-transplant care. Although the DT model showed limited generalizability with lower test-set performance, the RF and XGBoost models demonstrated more consistent and robust results. In particular, RF achieved stable accuracy across validation and test datasets, with balanced sensitivity, specificity, and F1-scores. Both RF and XGBoost maintained superior ROC AUC values and relatively narrow confidence intervals, supporting their reliability for clinical application [33,42,43].
In our study, feature importance analysis identified the top 15 EHR predictors in the highest-performing RF model. This prediction model underscores the value of electronic medical record (EMR) data in risk prediction by integrating features related to transplant complications, including the administration of steroid pulse therapy. It also incorporates recipient-specific factors, including pre-existing conditions (e.g., heart failure and CVD), demographics (e.g., age and sex), smoking status, and aspects of hospital management such as discharge timing (e.g., weekend discharges). Donor-related characteristics (e.g., age and sex), cause of brain death (e.g., CVA), and underlying conditions (e.g., hypertension) were incorporated to improve predictive accuracy.
Notably, steroid pulse therapy consistently ranked highest in both models, emphasizing its strong association with EHR risk. This likely reflects the occurrence of acute rejection episodes, which require such therapy and contribute to post-transplant complications. Other high-ranking features included donor-related factors (e.g., donor age, CVA as the cause of brain death, and cold ischemic time) and recipient characteristics (comorbidity burden, dialysis duration, ICU stay, and weekend discharge). These results underscore the complex, multifactorial nature of EHR risk and highlight the potential of ML-based models to uncover subtle, non-linear relationships that traditional statistical methods may miss. Although many of these variables, including donor characteristics and pre-existing comorbidities, are non-modifiable, the model provides valuable support for early identification of KT recipients at higher EHR risk. These predictors are important to model performance; however, their inclusion does not imply causal relationships with EHR. For instance, variables such as steroid pulse therapy likely reflect underlying clinical conditions (e.g., acute rejection) that precede readmission, rather than independently causing it. Accordingly, the model’s utility lies in risk stratification and informed decision-making, rather than in directly guiding causal intervention. From a nursing perspective, the model may assist in prioritizing surveillance and education strategies for at-risk patients, although such applications should be interpreted cautiously and used in conjunction with clinical judgment.
Several previous studies predicted EHR post-KT using a limited set of features, typically restricted to recipient-related clinical factors or administrative data [15,24,25]. For instance, McAdams-DeMarco et al. [25] analyzed demographic and comorbidity data, while Lubetzky et al. [15] focused on discharge-level factors without fully integrating donor characteristics. By incorporating pre- and peri-transplant data, our broader feature set enabled the ML model to uncover complex interactions usually missed by traditional methods. Therefore, our study provides a more robust and generalizable predictive framework that can inform targeted interventions across different stages of transplant care.
EHR has been associated with various adverse post-transplant outcomes, including inferior graft function, increased mortality, and diminished quality of life [25]. Although our study focused on identifying EHR predictors rather than establishing causal pathways, acknowledging the downstream clinical consequences of EHR emphasizes the value of early risk stratification. For instance, while steroid pulse therapy is commonly administered for acute rejection, and acute rejection is frequently observed among readmitted recipients, these relationships should be interpreted as associative rather than causal [24,44,45]. A recent Canadian study reported a higher incidence of rejection among recipients who experienced EHR [24], suggesting that transplant-related complications coincide with early readmission events. Therefore, predictive models—when carefully interpreted—may support individualized monitoring and intervention strategies that could contribute to improved long-term outcomes. Donor characteristics, including donor age or type (e.g., living vs. deceased donor and donation after brain death vs. circulatory death), are strongly associated with rehospitalization [6]. However, the recipient characteristics were the most important predictors using the RF model in our study. These findings may not apply to other countries owing to disparities in organ acceptance policies and healthcare delivery systems [12,25,46].
Nevertheless, transplant programs worldwide, including in Korea, have recently expanded their recipient and donor pools to encompass more medically complex individuals, such as those classified as expanded criteria donors [47]. Consistent with previous regression studies conducted in the United States and Canada, our ML model identified the recipient’s age and pre-existing comorbidities (e.g., heart disease) as factors associated with EHR post-KT. This finding aligns with reports by Famure et al. [24] and McAdams-DeMarco et al. [25], who demonstrated that older age and a higher comorbidity burden were significantly associated with increased readmission risk in a Canadian and the US cohort, respectively. As the KT recipient population ages and presents with a higher prevalence of comorbid conditions, this aspect is likely to become increasingly significant [48]. This is an important factor to consider in the future, particularly in Korea, which is entering an era of super-aging.
Early risk prediction is essential for effectively preventing readmission. Ideally, preventive measures should be implemented pre-discharge. Our study models aimed to predict EHR risk in KT recipients at an early stage. Our results will contribute to individualized evaluations, education, and treatment of KT recipients. Notably, our study builds on existing predictive factors and provides new insights that can improve KT outcomes and enhance the quality of life of Korean KT recipients. The integration of EHR risk assessment systems into electronic health record platforms can be achieved by developing medical record-based prediction models. EHR alert systems can be designed for transplant care, enabling healthcare professionals, including transplant specialists, nurses, and physicians, to effectively allocate preventive interventions to KT recipients identified as high risk for EHR [49].
This study has some limitations. First, the predictive model was developed using retrospective data from a single university hospital without external validation, limiting its generalizability. External validation using multicenter datasets is necessary to ensure the model’s robustness, reproducibility, and broader applicability. Second, although the model outperformed traditional logistic regression, key predictors, including steroid pulse therapy, raise concerns regarding clinical interpretability. Since steroid pulse therapy is typically administered in response to complications, including acute rejection, it may reflect a consequence of clinical deterioration rather than a true pre-discharge risk factor, making it less actionable for early intervention or discharge planning. Third, the study did not differentiate between the types, causes, or timing of readmissions—including preventable vs. non-preventable events—limiting the ability to tailor interventions or evaluate their effectiveness. Important post-discharge determinants of readmission, including outpatient follow-up adherence, social support systems, socioeconomic status, and patient self-management behaviors, were not included in the model, potentially omitting critical predictors of early readmission. Therefore, future research should incorporate a broader range of pre- and post-discharge variables, validate the model across diverse clinical environments, and explore cause-specific readmission patterns to enhance ML-based prediction tools’ clinical utility and generalizability in KT care. Finally, one limitation of our approach is that categorical entries labeled as “unknown” (e.g., PRA and diabetes mellitus) were retained as separate levels in the model rather than being excluded or imputed. While this approach mirrors clinical uncertainty and enables the model to learn from missingness patterns, it may affect interpretability and model calibration. 
Future studies should explore alternative preprocessing strategies for unknown categories, including exclusion, multiple imputation, or sensitivity analyses to compare predictive performance and stability [50].
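The handling of "unknown" entries discussed above can be illustrated with a minimal sketch. It assumes a three-level categorical variable; the function name `one_hot_with_unknown` and the sample values are hypothetical and are not taken from the study's code or data. The sketch contrasts retaining "Unknown" as its own level with the simplest alternative, exclusion of incomplete cases.

```python
def one_hot_with_unknown(values, levels=("No", "Yes", "Unknown")):
    """Encode a categorical column, keeping 'Unknown' as its own level."""
    rows = []
    for value in values:
        # Anything outside the known levels (e.g., None) is mapped to 'Unknown'.
        label = value if value in levels else "Unknown"
        rows.append([1 if label == level else 0 for level in levels])
    return rows


# Hypothetical entries for a variable such as PRA >=50% (illustrative only).
pra = ["No", "Yes", None, "Unknown", "No"]

# Strategy used in the model: retain 'Unknown' as a separate level.
encoded = one_hot_with_unknown(pra)

# Alternative strategy: exclude cases whose status is not known.
complete_cases = [value for value in pra if value in ("No", "Yes")]
```

Retaining the level keeps all five hypothetical cases in the data, whereas exclusion leaves three; a sensitivity analysis would compare model performance under both choices.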
The models developed in this study can assist healthcare professionals, particularly transplant nurses and coordinators, in stratifying KT recipients by their risk of EHR, prioritizing post-discharge care and follow-up for high-risk individuals, and allocating targeted early interventions, such as closer monitoring or education, to prevent EHR. The algorithm can also facilitate the enrollment of recipients in EHR prevention programs and support targeted interventions for different cohorts of KT recipients.

Conflicts of Interest

No potential conflict of interest relevant to this article was reported.

Acknowledgements

The authors thank Professor Sik Lee (Nephrology, MD, PhD) at the Jeonbuk National University.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (No. RS-2023-00241842). The sponsor had no involvement in study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication.

Data Sharing Statement

The data that support the findings of this study are available on reasonable request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Supplementary Data

Supplementary data to this article can be found online at https://doi.org/10.4040/jkan.25030.

Supplementary Table 1. Cross-validation results for each model

jkan-25030-Supplementary-Table-1.pdf

Supplementary Figure 1. Cross-validation results for XGBoost hyperparameter tuning. The figure displays the cross-validated receiver operating characteristic (ROC) scores under various combinations of hyperparameters, including the number of boosting iterations, subsample ratios (0.50, 0.75, 1.00), learning rates (eta: 0.3, 0.4), and max tree depths (1, 2, 3). The x-axis represents the number of boosting iterations and the y-axis indicates the ROC score, which measures the ability of the model to distinguish between classes. Each panel corresponds to a specific combination of subsample and eta values, with lines representing different tree depths.

jkan-25030-Supplementary-Figure-1.pdf
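As a rough illustration of the tuning grid described in the caption of Supplementary Figure 1, the following sketch enumerates the listed combinations of subsample ratio, learning rate (eta), and maximum tree depth. The dictionary keys are illustrative names, and the numbers of boosting iterations are omitted because the caption does not list them.

```python
from itertools import product

# Values as listed in the caption of Supplementary Figure 1.
subsample_ratios = [0.50, 0.75, 1.00]
learning_rates = [0.3, 0.4]  # eta
max_depths = [1, 2, 3]

# One entry per hyperparameter combination (key names are illustrative).
grid = [
    {"subsample": s, "eta": e, "max_depth": d}
    for s, e, d in product(subsample_ratios, learning_rates, max_depths)
]

# Each figure panel corresponds to one (subsample, eta) pair.
panels = sorted({(g["subsample"], g["eta"]) for g in grid})

print(len(grid), "combinations across", len(panels), "panels")
```

Each combination would then be evaluated by cross-validation over a range of boosting iterations, with the ROC score recorded for every setting.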

Author Contributions

Conceptualization or/and Methodology: HJC. Data curation or/and Analysis: HJC. Funding acquisition: HJC. Investigation: HJC, JHY. Project administration or/and Supervision: HJC. Resources or/and Software: HJC, JHY. Validation: HJC. Visualization: HJC. Writing original draft or/and Review & Editing: HJC. Final approval of the manuscript: HJC.

Fig. 1.
Decision tree for predicting early hospital readmission (EHR) after kidney transplantation. A value of “yes” indicates patients who were readmitted within 30 days (EHR=1, blue nodes), while “no” indicates non-readmitted cases (EHR=0, red nodes). CVA, cerebrovascular accident; PD, peritoneal dialysis.
Fig. 2.
Comparison of feature importance between random forest and XGBoost models. Feature importance for early hospital readmission prediction using random forest (A) and extreme gradient boosting (XGBoost) (B). Steroid pulse therapy was the strongest predictor in both models, followed by cerebrovascular accident (CVA), comorbid conditions, and dialysis duration. Comorbid condition; the number of any other comorbid disease. CIT, cold ischemic time; CVA, cerebrovascular accident; BMI, body mass index; CVD, cardiovascular disease; d/t, due to; HLA, human leukocyte antigen; HTN, hypertension; ICU, intensive care unit; KT, kidney transplantation; PD, peritoneal dialysis; Tx, therapy.
Table 1.
Comparison of baseline characteristics between KT recipients with and without early hospital readmission (N=470)
Characteristic Total (N=470) Readmission within 30 days (n=74) No readmission within 30 days (n=396) p
Donor‑related characteristics
 Body mass index (kg/m2) 24.23±3.14 23.76±2.95 24.31±3.17 .144
 Age (yr) 46.1±15.30 48.5±15.80 45.7±15.20 .170
 Sex .482
  Male 246 (52.3) 42 (56.8) 204 (51.5)
  Female 224 (47.7) 32 (43.2) 192 (48.5)
 Cardiac arrest .343
  No 142 (30.2) 28 (37.8) 114 (28.8)
  Yes 72 (15.3) 11 (14.9) 61 (15.4)
  Unknown 256 (54.5) 35 (47.3) 221 (55.8)
 Cause of brain death .061
  Accident 121 (25.7) 18 (24.3) 103 (26.0)
  Cerebrovascular accident 104 (22.1) 25 (33.8) 79 (19.9)
  Others 245 (52.1) 31 (41.9) 214 (54.0)
 Diabetes mellitus .103
  No 422 (89.8) 66 (89.1) 356 (89.9)
  Yes 22 (4.7) 1 (1.4) 21 (5.3)
  Unknown 26 (5.5) 7 (9.5) 19 (4.8)
 Hypertension .232
  No 375 (79.8) 55 (74.3) 320 (80.8)
  Yes 69 (14.7) 12 (16.2) 57 (14.4)
  Unknown 26 (5.5) 7 (9.5) 19 (4.8)
 Creatinine (mg/dL) 1.07±0.70 1.13±0.70 1.06±0.70 .453
Recipient‑related characteristics
 Duration of dialysis (yr) 3.60±4.30 3.85±3.90 3.56±4.40 .562
 Human leukocyte antigen mismatch 3.56±1.40 3.63±1.40 3.55±1.40 .639
 Body mass index (kg/m2) 22.55±4.40 22.84±3.60 22.50±4.60 .471
 Post-KT admission day 19.00±7.80 19.35±10.60 18.93±7.20 .747
 Discharge .005
  Weekday 434 (92.3) 62 (83.8) 372 (93.9)
  Weekend 36 (7.7) 12 (16.2) 24 (6.1)
 Age (yr) 47.1±12.00 49.5±12.70 46.6±11.80 .074
 Sex .549
  Male 322 (68.5) 48 (64.9) 274 (69.2)
  Female 148 (31.5) 26 (35.1) 122 (30.8)
 KT type .867
  Deceased donor KT 312 (66.4) 48 (64.9) 264 (66.7)
  Living donor KT 158 (33.6) 26 (35.1) 132 (33.3)
 ABO-incompatible transplantation .291
  No 426 (90.6) 70 (94.6) 356 (89.9)
  Yes 44 (9.4) 4 (5.4) 40 (10.1)
 Heart failure .194
  No 412 (87.7) 61 (82.4) 351 (88.6)
  Yes 58 (12.3) 13 (17.6) 45 (11.4)
 Lung disease .805
  No 445 (94.7) 71 (95.9) 374 (94.4)
  Yes 25 (5.3) 3 (4.1) 22 (5.6)
 Cardiovascular disease .052a)
  No 445 (94.7) 74 (100) 371 (93.7)
  Yes 25 (5.3) 0 (0) 25 (6.3)
 Cerebrovascular accident .260a)
  No 421 (89.6) 69 (93.2) 352 (88.9)
  Yes 49 (10.4) 5 (6.8) 44 (11.1)
 Peripheral vascular disease .682a)
  No 443 (94.3) 71 (95.9) 372 (93.9)
  Yes 27 (5.7) 3 (4.1) 24 (6.1)
 History of orthopedic surgery >.999a)
  No 450 (95.7) 71 (95.9) 379 (95.7)
  Yes 20 (4.3) 3 (4.1) 17 (4.3)
 No. of any other comorbid condition .661
  0 340 (72.4) 49 (66.2) 291 (73.5)
  1 120 (25.5) 23 (31.1) 97 (24.4)
  2 8 (1.7) 2 (2.7) 6 (1.5)
  3 1 (0.2) 0 (0) 1 (0.3)
  >4 1 (0.2) 0 (0) 1 (0.3)
 Hypertension .117
  No 65 (13.8) 15 (20.3) 50 (12.6)
  Yes 405 (86.2) 59 (79.7) 346 (87.4)
 Smoking habit .629
  Nonsmoker 343 (73.0) 51 (68.9) 292 (73.7)
  Ex-smoker 78 (16.6) 15 (20.3) 63 (15.9)
  Current smoker 49 (10.4) 8 (10.8) 41 (10.4)
 Drinking habit .902
  Does not drink alcohol 371 (78.9) 57 (77.0) 314 (79.3)
  History of drinking 71 (15.1) 12 (16.2) 59 (14.9)
  Current drinking 28 (6.0) 5 (6.8) 23 (5.8)
 Diabetes mellitus .350
  No 305 (64.9) 44 (59.5) 261 (65.9)
  Yes 165 (35.1) 30 (40.5) 135 (34.1)
 Hepatitis B .652
  No 437 (93.0) 70 (94.6) 367 (92.7)
  Yes 29 (6.1) 4 (5.4) 25 (6.3)
  Unknown 4 (0.9) 0 (0) 4 (1.0)
 Cancer .972
  No 441 (93.8) 70 (94.6) 371 (93.7)
  Yes 29 (6.2) 4 (5.4) 25 (6.3)
 Dialysis type .358
  Hemodialysis 429 (91.3) 65 (87.8) 364 (91.9)
  Peritoneal dialysis 41 (8.7) 9 (12.2) 32 (8.1)
 Panel reactive antibody ≥50% .173
  No 311 (66.2) 42 (56.8) 269 (67.9)
  Yes 53 (11.3) 11 (14.9) 42 (10.6)
  Unknown 106 (22.5) 21 (28.4) 85 (21.5)
 Previous KT .748
  No 427 (90.9) 66 (89.2) 361 (91.2)
  Yes 43 (9.1) 8 (10.8) 35 (8.8)
Transplant process factors
 Cold ischemic time (min) 113.73±111.70 122.39±122.20 112.12±111.70
 Delayed graft function
  No 438 (93.2) 65 (87.8) 373 (94.2)
  Yes 32 (6.8) 9 (12.2) 23 (5.8)
 Induction immunosuppression >.999a)
  Basiliximab 411 (87.4) 65 (87.8) 346 (87.4)
  Anti-thymocyte globulin 59 (12.6) 9 (12.2) 50 (12.6)
 Intensive care unit stay (day) 4.65±2.30 4.06±2.40 4.76±2.20 .021
 Creatinine at discharge 1.51±1.30 1.54±1.10 1.51±1.30 .808
 Steroid pulse therapy .001
  No 436 (92.8) 57 (77.0) 379 (95.7)
  Yes 34 (7.2) 17 (23.0) 17 (4.3)

Values are presented as mean±standard deviation or number (%) unless otherwise stated.

KT, kidney transplantation.

a)By Fisher exact test.

Table 2.
Model evaluation via cross-validation (N=470)
Model                    Accuracy (95% CI)  F1-score  Sensitivity  Specificity  ROC AUC  PRC AUC (95% CI)a)
Decision tree            -                  -         -            -            -        0.74 (0.53–0.90)
 Validation set          0.72 (0.57–0.84)   0.51      0.63         0.83         0.73     -
 Test set                0.60 (0.45–0.74)   0.51      0.40         0.83         0.61     -
Random forest            -                  -         -            -            -        0.90 (0.78–0.97)
 Validation set          0.79 (0.63–0.89)   0.79      0.83         0.74         0.87     -
 Test set                0.79 (0.65–0.90)   0.79      0.76         0.83         0.82     -
XGBoost                  -                  -         -            -            -        0.82 (0.66–0.92)
 Validation set          0.70 (0.55–0.83)   0.79      0.79         0.61         0.79     -
 Test set                0.79 (0.65–0.90)   0.79      0.76         0.83         0.81     -
Support vector machine   -                  -         -            -            -        0.76 (0.56–0.90)
 Validation set          0.66 (0.51–0.79)   0.68      0.71         0.61         0.74     -
 Test set                0.67 (0.52–0.80)   0.68      0.68         0.65         0.72     -

AUC, area under the curve; CI, confidence interval; PRC, precision-recall curve; ROC, receiver operating characteristic curve; XGBoost, extreme gradient boosting.
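For readers less familiar with the threshold-based metrics in Table 2, the sketch below shows how accuracy, F1-score, sensitivity, and specificity follow from the four cells of a confusion matrix, using the standard definitions. The function name and the counts are hypothetical; they are chosen for illustration only and are not the study's confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (EHR = positive class)."""
    sensitivity = tp / (tp + fn)   # readmitted cases correctly flagged
    specificity = tn / (tn + fp)   # non-readmitted cases correctly cleared
    precision = tp / (tp + fp)     # flagged cases that were truly readmitted
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical counts for a 48-case evaluation set (illustrative only).
metrics = classification_metrics(tp=19, fp=4, tn=19, fn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The sketch does not guard against empty classes; a set with no positive or no negative cases would need explicit handling before these ratios are computed.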

a)PRC 95% CI results obtained from bootstrapping (1,000 repetitions).
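The bootstrapped interval mentioned in the footnote above could be obtained along the following lines. This is a minimal sketch under stated assumptions: PRC AUC is estimated by average precision, the interval is a percentile bootstrap, and the labels and scores are hypothetical. The table footnote specifies only that 1,000 bootstrap repetitions were used, so the estimator and interval type here are assumptions rather than the authors' procedure.

```python
import random


def average_precision(y_true, scores):
    """Estimate PRC AUC as the mean precision at each true positive."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / sum(y_true)


def bootstrap_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for average precision."""
    rng = random.Random(seed)
    n = len(y_true)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if sum(labels) == 0:
            continue  # a resample without positives has no defined precision
        estimates.append(average_precision(labels, [scores[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical labels (1 = readmitted) and predicted risks, illustrative only.
y = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
p = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]

point = average_precision(y, p)
lower, upper = bootstrap_ci(y, p)
print(f"PRC AUC {point:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With small evaluation sets the resulting interval is wide, which is consistent with the broad intervals reported in Table 2.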


