Detection of CKD Status using Feature Selection Approach based on Random Forest Classifier

Shibi Mathai, K.S. Thirunavukkarasu

Abstract

Chronic kidney disease (CKD) is a serious, potentially lifelong condition caused by either impaired kidney function or kidney cancer. In CKD, the kidneys cannot filter blood properly or have stopped functioning entirely, so toxins accumulate in the bloodstream and can ultimately kill the patient. Early detection of CKD is difficult, and saving a patient's life in the last stages of CKD is extremely challenging. High variance frequently impedes clinical decision-making in the prognosis of chronic disorders, resulting in ambiguity and unfavorable outcomes, particularly in conditions such as CKD. Early identification and suitable treatment can reduce this risk. Machine Learning (ML) techniques have become important instruments for improving clinical decision-making and lowering unpredictability. However, because they rely on a small number of biological characteristics, current approaches to CKD identification frequently lack accuracy. This study investigates creatinine levels by assessing the estimated Glomerular Filtration Rate (eGFR) and Blood Urea Nitrogen (BUN), derived from serum creatinine through indirect measurement using the dataset's available attributes. The dataset comprises 28 attributes, including "Gender," and a total of 523 records. To address common issues in medical datasets, this work proposes label encoding, one-hot encoding, standard scaling, and iterative imputation of missing values as preprocessing techniques. The Boruta method is used for feature selection, and ML algorithms are used to build the model. The computed eGFR and BUN values support an accurate classification of CKD and non-CKD status using the proposed Boruta Feature Selection (BFS) technique with an Ensemble Based Random Forest Classifier (ERFC).
Moreover, the performance of the proposed BFS-ERFC model is compared with that of Random Weighted Optimization (RWO) with a Neural Network (NN), RWO with Logistic Regression (LR), and ERFC alone on patients' medical records, to determine which approach gives the best classification of CKD status.
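
As an illustration of how eGFR can be derived indirectly from serum creatinine, age, and gender, the following is a minimal sketch. The abstract does not state which estimating equation the authors used; this sketch assumes the 2021 CKD-EPI creatinine equation, and the function name and argument names are hypothetical rather than taken from the study.

```python
def egfr_ckd_epi_2021(serum_creatinine_mg_dl, age_years, is_female):
    """Estimate GFR in mL/min/1.73 m^2 from serum creatinine.

    Assumption: the 2021 CKD-EPI creatinine equation. The study may
    have used a different estimating equation.
    """
    if serum_creatinine_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")

    # Sex-specific constants of the 2021 CKD-EPI equation.
    kappa = 0.7 if is_female else 0.9
    alpha = -0.241 if is_female else -0.302

    ratio = serum_creatinine_mg_dl / kappa
    egfr = (
        142.0
        * min(ratio, 1.0) ** alpha      # applies when creatinine is below kappa
        * max(ratio, 1.0) ** -1.200     # applies when creatinine is above kappa
        * 0.9938 ** age_years           # age-related decline
    )
    if is_female:
        egfr *= 1.012
    return egfr


# Hypothetical records: (creatinine in mg/dL, age in years, is_female).
records = [(0.9, 50, False), (2.4, 63, False), (0.7, 50, True)]
egfr_values = [egfr_ckd_epi_2021(*r) for r in records]
```

A value computed this way could be appended to each record as an additional attribute before feature selection; whether the study did so in exactly this form is not stated in the abstract.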