University of Twente Student Theses


Handling Missing Data: Traditional Techniques Versus Machine Learning

Cojocaru, Andrei (2019) Handling Missing Data: Traditional Techniques Versus Machine Learning.

Abstract: Missing data is a serious problem in data science and in other fields that rely on statistical inference. Improper handling of missing data often leads to biased or invalid conclusions. Given this risk, existing research compares many techniques, of varying quality, for the practical analysis of datasets with missing data. This paper examines and documents the methodological flaws that affect many of these studies and presents a comparison based on more realistic assumptions. Traditional statistical techniques are compared to machine learning algorithms, and the strengths and weaknesses of each category are described based on the observed results, with some prescriptions for when to apply machine learning to missing data problems.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science BSc (56964)