University of Twente Student Theses


The potential of synthetic training data for training deep learning models

Vugt, M.M.P. van (2019) The potential of synthetic training data for training deep learning models.

Abstract: This report was written for the graduation project “The potential of synthetic training data for training deep learning models”. As the title suggests, it examines the potential of synthetic training data for training deep learning models. First, an overview is given of the problems in training deep learning models, such as data scarcity, and of the proposed solution, which is to train deep learning models on simulated data. The state-of-the-art chapter reviews the current methods of data simulation, and based on these a suitable method for the problem at hand is chosen. The ideation chapter describes the work process leading up to the ideation and realisation of the project. During the ideation phase, two different types of simulation are presented, along with the motivation for choosing them. The first objective is to simulate pictures of smoke resulting from forest fires. Few sources of such data exist, which makes it a scarce data type. Moreover, detecting smoke from forest fires has considerable potential for limiting and preventing natural disasters. Adding synthetic data makes the training dataset larger, which is expected to improve the accuracy of the model as well. When trained on synthetic data, the model reached a validation accuracy of 1.0, which is very promising and shows that synthetic data can be used for smoke detection. After the smoke vs forest scenario, a second scenario was tried: the use of simulated data for training a model on houses from satellite images. Unfortunately, the results of the houses scenario were not as promising as those of the smoke vs forest scenario. This could be due to limitations of the model that was used, or because the simulated data was not designed properly for training a deep neural network.
In conclusion, this report demonstrates that synthetic data has potential for training deep neural networks. It also shows that no two scenarios are the same and that each requires a different approach. In some cases this may be nearly impossible, as the houses scenario shows.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Creative Technology BSc (50447)

