University of Twente Student Theses


Comparison of different types of auto-encoders for data cleaning

Alberts, K.J. (2021) Comparison of different types of auto-encoders for data cleaning.

Abstract: Machine learning techniques have considerable potential for data cleaning, for example in repairing corrupted data or restoring missing information. Previous research has produced many ways of applying machine learning to this task, one of which is the auto-encoder. Many types of auto-encoders have since emerged, but each is usually tested on a single dataset or compared to only one other type. This raises the question of which type performs best and whether auto-encoders can be used for data cleaning more generally. In this research, we propose an experimental comparison of five auto-encoders (basic, sparse, contractive, denoising and variational) to determine which types are the most suitable and most accurate for data cleaning on three datasets: CIFAR-10 and MNIST (images) and US Weather Data (tabular). We implement a testing framework that makes it easy to add different auto-encoders and datasets, and use it to test the five types of auto-encoders on two image datasets. We find that some types of auto-encoders perform similarly regardless of the type of dataset, whereas other types work considerably better on certain types of data.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science BSc (56964)
Awards: Best Paper (34th Twente Student Conference on IT)