Evaluating the Impact of Genomic and Clinical Features on Breast Cancer Prognosis Using Deep Learning Approaches
Abstract
Breast cancer prognosis plays a crucial role in tailoring personalized treatment plans and improving survival rates. This research evaluates the impact of genomic and clinical features on breast cancer prognosis prediction using deep learning techniques, with a specific focus on the METABRIC dataset. The study uses three gene expression feature sets of 300, 350, and 400 features, each selected with the Minimum Redundancy Maximum Relevance (mRMR) algorithm to reduce feature redundancy while retaining high-relevance predictors. A deep neural network (DNN) model was employed to assess the predictive power of the selected features. The architecture was designed to integrate these genomic features efficiently, balancing performance and interpretability. The research also investigates how different pairings of optimizer and activation function affect the training stability, convergence, and accuracy of the model. Results were evaluated using accuracy, precision, recall, F1 score, and AUC-ROC, providing a comprehensive assessment of the model’s ability to predict breast cancer prognosis. Comparisons between the feature sets showed that the largest set (400 gene expression features) slightly improved model performance, though diminishing returns were observed beyond a certain threshold. The best-performing model used the Adam optimizer paired with the ReLU activation function, which offered superior convergence rates and robust performance across all metrics. This study highlights the potential of combining genomic data with advanced feature selection techniques and deep learning models to improve prognosis prediction in breast cancer.
The findings suggest that a careful balance of feature selection and model optimization is crucial for achieving reliable predictions, with important implications for integrating such models into clinical decision-making systems.
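To make the feature selection step concrete, the greedy idea behind mRMR can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used in the study. It assumes the difference form of the criterion (relevance minus mean redundancy) and uses the absolute Pearson correlation as a simple stand-in for mutual information. The function names `pearson` and `mrmr_select` and the toy data are hypothetical and are not drawn from METABRIC.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def mrmr_select(columns, target, k):
    """Greedily pick up to k feature indices by relevance minus mean redundancy.

    columns: list of feature columns, each a list of numbers.
    target:  list of class labels encoded as numbers.
    Assumption: |Pearson r| approximates both relevance and redundancy.
    """
    relevance = [abs(pearson(col, target)) for col in columns]
    redundancy_sum = [0.0] * len(columns)  # running sum of |r| with selected features
    selected = []
    remaining = set(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            return relevance[j] - redundancy_sum[j] / len(selected)

        # Iterate in index order so that ties resolve to the lowest index.
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
        for j in remaining:
            redundancy_sum[j] += abs(pearson(columns[j], columns[best]))
    return selected


# Toy data (hypothetical): eight samples, four candidate features.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 1, 1, 1, 1, 1],  # feature 0: strongly related to the label
    [0, 0, 0, 2, 2, 2, 2, 2],  # feature 1: rescaled copy of feature 0 (redundant)
    [0, 1, 1, 0, 1, 1, 1, 1],  # feature 2: weaker, but adds new information
    [1, 0, 1, 0, 1, 0, 1, 0],  # feature 3: unrelated to the label
]

selected = mrmr_select(columns, target, k=2)
print(selected)
```

In this toy case the redundant copy is passed over in favour of the weaker but less correlated feature, which is the behaviour the redundancy penalty is meant to produce. A production pipeline would more likely rely on a mutual information estimator and a vectorised implementation.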