Knowledge Engineering on Spam Review Detection Using Selected Deductive Learning Approaches
Abstract
The rapidly growing digital marketplace continuously generates a substantial volume of online reviews, such as those found on Yelp and Amazon. Consumers increasingly read previous reviews before making purchasing decisions. This study addresses the critical challenge of detecting spam reviews on e-commerce platforms, a problem that significantly undermines consumer trust and business integrity. We present a novel comparative analysis of six machine learning models for spam review detection: Naive Bayes Multinomial Updatable (NBMU), Stochastic Gradient Descent (SGD), Classification Via Regression (CVR), PART, Random Tree (RT), and K-Star. Our research contributes a comprehensive evaluation of these models on a large-scale Amazon product review dataset of 26.7 million reviews, implementing a robust feature selection process to enhance model performance and efficiency. Through in-depth analysis using multiple metrics, including accuracy, precision, recall, F-measure, ROC, and PRC, we identify the CVR model as the top performer, achieving 82.70% accuracy, 0.83 precision, 0.82 recall, and 0.90 ROC area. Additionally, we provide a detailed error analysis comparing Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), Relative Absolute Error (RAE), and Root Relative Squared Error (RRSE) across all models. Our findings demonstrate the superiority of the CVR model for spam review detection, offering a promising approach for e-commerce platforms seeking to maintain review integrity. This research provides valuable insights for both academia and industry in combating online review fraud and enhancing the reliability of e-commerce ecosystems.
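The evaluation metrics named above follow standard definitions (the relative errors RAE and RRSE normalise against a trivial predictor that always outputs the mean of the target). A minimal Python sketch of these computations is given below; the toy labels and probabilities are illustrative placeholders, not the paper's data or results.

```python
import math

def error_metrics(y_true, y_prob):
    """MAE, RMSE, RAE, and RRSE for predicted spam probabilities
    against binary gold labels (1 = spam, 0 = genuine)."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_prob)]
    mean_t = sum(y_true) / n  # baseline predictor: the class mean
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Relative errors: model error divided by the mean-predictor's error.
    rae = sum(abs(e) for e in errors) / sum(abs(t - mean_t) for t in y_true)
    rrse = math.sqrt(sum(e * e for e in errors)
                     / sum((t - mean_t) ** 2 for t in y_true))
    return {"MAE": mae, "RMSE": rmse, "RAE": rae, "RRSE": rrse}

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, precision, recall, and F-measure after thresholding
    the predicted probabilities into hard spam/genuine labels."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "F-measure": f1}

if __name__ == "__main__":
    # Toy example only: 8 reviews, hypothetical model probabilities.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]
    print(error_metrics(y_true, y_prob))
    print(classification_metrics(y_true, y_prob))
```

The same per-model numbers, computed over cross-validated predictions, are what allow the head-to-head comparison of the six classifiers reported in the study.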