A Comparative Analysis of Machine Learning Imputation Techniques for MAR Missingness

Shweta Tiwaskar

doi:10.52783/anvi.v28.2848


Published: Dec 18, 2024


Keywords:

Multivariate Imputation, Machine Learning, MissForest, Data Quality, MAR Missingness.

Shweta Tiwaskar, Sandip Thite, Rashid Mamoon

Abstract

Electronic health records (EHR) are essential for making informed patient care decisions, but missing data can hinder decision-making. This study addresses the issue of missing data, specifically under the Missing at Random (MAR) mechanism, which is common in real-world datasets. While statistical methods are traditionally used for data imputation, machine learning (ML) approaches offer greater flexibility and can capture complex relationships within the data. The paper evaluates three prominent ML-based imputation techniques, namely K Nearest Neighbor Imputation (KNNI), Multivariate Imputation by Chained Equations (MICE), and MissForest, focusing on their performance in handling MAR missingness in multivariate configurations. The study simulates MAR missingness (5%-30% of the dataset) across multiple variables and imputes the missing values using these methods. The imputed datasets are evaluated against a complete subset of the original data using several performance metrics (e.g., accuracy, F1 score, MAE, RMSE, R-squared, Pearson correlation, and BIC), with particular attention to correlations between missing and observed values. To calculate these performance metrics, eighteen imputed datasets are compared with one complete subset of the original dataset. Compared with KNNI and MICE, the MissForest imputation method demonstrated reduced SD, MAE, and RMSE in 83.33% of MR cases and higher R-squared values in all (100%) MR cases. MissForest also performs better in 100% of MR cases on all five model performance metrics. This suggests that MissForest is a superior imputation method for handling MAR missingness in multivariate settings.
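
The evaluation loop described in the abstract (mask values under MAR, impute, then score the imputed cells against the known complete data) can be illustrated with a minimal sketch. This is not the authors' code or data. It assumes a synthetic two-column dataset, six missingness levels in 5% steps, and a simplified nearest-neighbour imputer with a mean-imputation baseline; MICE and MissForest are not implemented here. All function names are hypothetical.

```python
import math
import random


def simulate_mar(rows, rate, rng):
    """Mask y with a probability that depends only on the observed x (MAR)."""
    xs = sorted(x for x, _ in rows)
    median_x = xs[len(xs) // 2]
    masked = []
    for x, y in rows:
        # Assumption: rows with above-median x are three times as likely to
        # lose y, so the overall share of missing y is roughly `rate`.
        p = rate * 1.5 if x > median_x else rate * 0.5
        masked.append((x, None if rng.random() < p else y))
    return masked


def mean_impute(masked):
    """Baseline: fill every missing y with the mean of the observed y values."""
    observed = [y for _, y in masked if y is not None]
    mean_y = sum(observed) / len(observed)
    return [(x, mean_y if y is None else y) for x, y in masked]


def knn_impute(masked, k=5):
    """Fill each missing y with the mean y of the k complete rows closest in x."""
    complete = [(x, y) for x, y in masked if y is not None]
    filled = []
    for x, y in masked:
        if y is None:
            nearest = sorted(complete, key=lambda r: abs(r[0] - x))[:k]
            y = sum(r[1] for r in nearest) / len(nearest)
        filled.append((x, y))
    return filled


def mae_rmse(truth, masked, filled):
    """Score only the cells that were masked, against their true values."""
    errors = [f[1] - t[1] for t, m, f in zip(truth, masked, filled) if m[1] is None]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Synthetic "complete" data (an assumption): y depends linearly on x plus noise.
rng = random.Random(0)
truth = []
for _ in range(500):
    x = rng.gauss(0, 1)
    truth.append((x, 2.0 * x + rng.gauss(0, 0.5)))

results = {}
for rate in (0.05, 0.10, 0.15, 0.20, 0.25, 0.30):  # assumed 5% steps
    masked = simulate_mar(truth, rate, rng)
    results[rate] = {
        "mean": mae_rmse(truth, masked, mean_impute(masked)),
        "knn": mae_rmse(truth, masked, knn_impute(masked)),
    }
    print(rate, results[rate])
```

Scoring only the masked cells keeps the error metrics focused on imputation quality rather than on values that were never missing. A fuller comparison along the lines of the abstract would swap in real KNNI, MICE, and MissForest implementations and add the remaining metrics.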

Issue

Vol. 28 No. 3s (2025)

Section

Articles
