Improved Methodology for Breast Cancer Prediction through Integration of Hard Voting Ensemble Classifier on WDBC Data Set

Archana Singh, Kuldeep Singh Kaswan

Abstract

Introduction: Breast cancer is a prevalent and highly fatal disease that affects many people worldwide. Reducing the fatality rate caused by breast cancer requires tools capable of early diagnosis and efficient treatment. Researchers and medical experts across the world have proposed several diagnostic techniques for this disease; however, existing methods still need improvement to deliver accurate and effective diagnosis.


Objective: This research aims to produce quick and precise predictions of breast cancer, which is estimated to rank as the second leading killer of women globally.


Methodology: This paper presents a methodology based on a hard voting ensemble classifier that combines three machine learning algorithms (logistic regression, support vector machine, and decision tree) to classify breast tumors as benign or malignant. The proposed model’s performance is evaluated on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, with random oversampling (ROS) used to balance the classes and Standard Scaler used for feature scaling.
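The pipeline described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the 80/20 stratified split, the random seeds, the default hyperparameters, the choice to oversample only the training portion, and the helper name `random_oversample` are all assumptions, since the abstract does not specify them. The copy of WDBC bundled with scikit-learn (`load_breast_cancer`) stands in for the dataset.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample


def random_oversample(X, y, seed=0):
    """Hypothetical helper: duplicate minority-class rows (sampling with
    replacement) until every class has as many rows as the largest class."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for cls, count in zip(classes, counts):
        X_cls = X[y == cls]
        if count < target:
            extra = resample(X_cls, replace=True,
                             n_samples=target - count, random_state=seed)
            X_cls = np.vstack([X_cls, extra])
        X_parts.append(X_cls)
        y_parts.append(np.full(len(X_cls), cls))
    return np.vstack(X_parts), np.concatenate(y_parts)


# WDBC as shipped with scikit-learn: 569 samples, 30 features.
X, y = load_breast_cancer(return_X_y=True)

# Assumed split; the abstract does not state the train/test proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Balance only the training data so the test set keeps its natural mix.
X_train, y_train = random_oversample(X_train, y_train)

# Fit the scaler on training data only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Hard voting: each base model casts one vote and the majority label wins.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("svm", SVC()),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    voting="hard",
)
ensemble.fit(X_train_s, y_train)
test_accuracy = ensemble.score(X_test_s, y_test)
print(f"test accuracy: {test_accuracy:.4f}")
```

With three base classifiers and two classes, a hard vote can never tie, which is one practical reason to combine an odd number of models. The score printed here depends on the assumed split and seeds and should not be expected to match the reported figures.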


Results: The proposed approach achieved an accuracy of 0.9825, a precision of 0.9859, a recall of 0.9859, an F1 score of 0.9859, and an AUC of 0.9813. Under 10-fold cross-validation, it obtained a mean accuracy of 0.9738.
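As a reading aid, the sketch below shows how these metrics relate to confusion-matrix counts. The counts are hypothetical: they assume a 114-sample test set (20% of the 569 WDBC samples), which the abstract does not state, and were chosen only because they reproduce the reported values to four decimals. The function name `classification_metrics` is likewise illustrative. For a classifier that outputs hard labels, the ROC curve has a single operating point, so its AUC reduces to the mean of sensitivity and specificity.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from raw counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # sensitivity / true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    # AUC of a single-threshold (hard-label) classifier.
    auc_hard = (recall + specificity) / 2
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": auc_hard,
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=70, fp=1, fn=1, tn=42)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is the harmonic mean of precision and recall, it equals both whenever the two are identical, which is why the three reported values coincide at 0.9859.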


Conclusions: In direct comparison, the proposed approach yielded superior results to the individual classifiers and to many established existing works.
