University of Twente Student Theses

Reducing labeled data usage in duplicate detection using deep belief networks

Janssen, Stefan C. (2016) Reducing labeled data usage in duplicate detection using deep belief networks.

Abstract: Modern duplicate detection systems typically use supervised machine learning algorithms to build their detection models. These algorithms require a large amount of manually labeled training data. Semi-supervised deep learning techniques would allow training to use not only labeled data but also unlabeled data, which is readily available. The expectation is that models trained this way on less manually labeled data will achieve accuracy similar to or better than that of traditional supervised algorithms.
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Human Media Interaction MSc (60030)
Link to this item: http://purl.utwente.nl/essays/70362

 
