University of Twente Student Theses
Comparison of different types of auto-encoders for data cleaning
Alberts, K.J. (2021) Comparison of different types of auto-encoders for data cleaning.
Abstract: | Machine learning techniques have considerable potential for data cleaning, for example in repairing corrupted data or restoring missing information. Previous research has produced many ways of applying machine learning to this task, one of which is the auto-encoder. Many types of auto-encoders have since emerged, but each is usually tested on a single dataset or compared to only one other type. This raises the question of which type performs best and whether auto-encoders can be used for data cleaning more generally. In this research, we propose to experimentally compare five types of auto-encoders (basic, sparse, contractive, denoising and variational) to determine which are the most suitable and most accurate for data cleaning on three datasets: CIFAR-10 and MNIST (images) and US Weather Data (tabular). We implement a testing framework that makes it easy to add different auto-encoders and datasets, and use this framework to test the five types of auto-encoders on two image datasets. We find that some types of auto-encoders perform similarly regardless of the type of dataset, whereas other types work considerably better on certain types of data. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Computer Science BSc (56964) |
Awards: | Best Paper (34th Twente Student Conference on IT) |
Link to this item: | https://purl.utwente.nl/essays/85686 |