University of Twente Student Theses

Feature extraction and selection on sparse, complex, sensor-based exhaled-breath data sets

Tintelen, B.F.M. van (2022) Feature extraction and selection on sparse, complex, sensor-based exhaled-breath data sets.

Abstract: The aeoNose device, developed by The eNose Company, is used to diagnose cancer by analysing the Volatile Organic Compounds (VOCs) in a person’s exhaled breath. These VOCs cause redox reactions at the surface of the device’s sensors, which influence the sensors’ conductivity readings. The conductivity readings are processed by peak shaving and by either transforming or rescaling the signal data. Singular Value Decomposition (SVD) is then applied to extract features by reducing the dimensionality of the data. Once these preprocessing steps have been applied, the data is ready for disease classification with Machine Learning algorithms. In this paper, we propose an optimization of this data processing pipeline that first limits the input data size and introduces a natural logarithmic rescaling function as a preprocessing step. This increases the area under the receiver operating characteristic curve (AUC) from 59.29% to 66.69%, averaged over 100 runs for five classifiers. For this setup, the Extra Trees Classifier performs best, with an average performance of 68.07%. Additionally, we replaced the SVD with a Fully Connected Autoencoder and a Convolutional Autoencoder, resulting in average performances of 67.34% and 64.69% respectively. For both autoencoders, the Random Forest Classifier performs best, averaging 70.21% and 67.80% respectively. With this, we show that the Fully Connected Autoencoder can surpass the SVD compression by 2.14 percentage points when comparing the best-performing models, while the Convolutional Autoencoder obtains results similar to those of the SVD compression.
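As an illustration of two steps named in the abstract, the following is a minimal sketch of natural logarithmic rescaling and of an AUC score averaged over repeated runs. It is not the thesis implementation: the function names, the synthetic labels and the weakly informative scores are assumptions made for this example, and no aeoNose data is used.

```python
import math
import random
from statistics import mean


def log_rescale(signal):
    """Natural-log rescaling; assumes strictly positive readings."""
    return [math.log(v) for v in signal]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample outscores
    a random negative sample (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_auc_over_runs(n_runs=100, n_samples=200, seed=0):
    """Average the AUC over repeated runs on synthetic data (assumption:
    Gaussian scores shifted by 0.5 for the positive class)."""
    rng = random.Random(seed)
    labels = [0, 1] * (n_samples // 2)  # balanced hypothetical labels
    aucs = []
    for _ in range(n_runs):
        scores = [rng.gauss(0.5 * y, 1.0) for y in labels]
        aucs.append(roc_auc(labels, scores))
    return mean(aucs)


if __name__ == "__main__":
    print(log_rescale([1.0, 10.0, 100.0]))
    print(f"mean AUC over 100 runs: {mean_auc_over_runs():.4f}")
```

Averaging over many runs, as the abstract reports, reduces the influence of any single random split or initialisation on the reported score. The pairwise AUC above is quadratic in the number of samples, which is acceptable for small data sets such as this sketch.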
Item Type: Essay (Master)
Clients: The eNose Company, Zutphen, Netherlands
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)
Link to this item: https://purl.utwente.nl/essays/92849