Predicting E-Commerce Revenue with SHAP Insights: A Comparative Study of SMOTE-Enhanced Machine Learning Models
Abstract
This study explores e-commerce revenue prediction by applying machine learning techniques and Explainable AI (XAI) frameworks. Class imbalance in the online_shoppers_intention dataset was treated using the Synthetic Minority Over-sampling Technique (SMOTE). The performance of several models, namely XGBoost (XGBst), Random Forest (RndF), Logistic Regression (L-Reg), Support Vector Machine (SupVM), Decision Tree (D-Tree), k-Nearest Neighbors (kNeigh), Gradient Boosting (GradBst), and a Voting Classifier (VotClf) ensemble, was evaluated using multiple performance metrics. GridSearchCV hyperparameter tuning was employed along with feature scaling and cross-validation to optimize model performance. Results are compared with and without the application of SMOTE. The RndF classifier with SMOTE gave the best accuracy of 92.45%, precision of 91.02%, recall of 94.22%, and F1-score of 92.59%, and an AUC-ROC of 97.88% was noted without SMOTE. SHAP, an XAI method, was employed to make the classification model transparent and to identify the features contributing to revenue generation.
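The with/without-SMOTE comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes a synthetic imbalanced dataset from `make_classification` in place of online_shoppers_intention, a hypothetical helper `smote_oversample` that implements a simplified SMOTE-style interpolation (a stand-in for a library implementation), and a Random Forest with arbitrary, untuned settings. GridSearchCV tuning, feature scaling, and SHAP analysis are omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors


def smote_oversample(X, y, minority=1, k=5, seed=0):
    """Hypothetical helper: simplified SMOTE-style oversampling.

    Creates synthetic minority rows by interpolating between a minority
    point and one of its k nearest minority neighbours until both
    classes have the same number of rows.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_new = int((y != minority).sum() - len(X_min))
    if n_new <= 0:
        return X, y
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    X_new = X_min[base] + gap * (X_min[neigh] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


# Assumption: synthetic data with roughly 15% positives, not the study's dataset.
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.85, 0.15], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Oversample the training split only; the test split is left untouched.
X_bal, y_bal = smote_oversample(X_tr, y_tr)

results = {}
for name, (X_fit, y_fit) in {"no_smote": (X_tr, y_tr),
                             "smote": (X_bal, y_bal)}.items():
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_fit, y_fit)
    proba = clf.predict_proba(X_te)[:, 1]
    results[name] = {"f1": f1_score(y_te, clf.predict(X_te)),
                     "auc": roc_auc_score(y_te, proba)}

print(results)
```

In this sketch the oversampling is applied after the train/test split, so the held-out data keeps its original class ratio. The abstract does not say at which stage the study applied SMOTE, so this ordering is a design choice of the example rather than a description of the authors' method.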