University of Twente Student Theses

Reducing labeled data usage in duplicate detection using deep belief networks

Janssen, Stefan C. (2016) Reducing labeled data usage in duplicate detection using deep belief networks.

Abstract:	Modern duplicate detection systems typically use supervised machine learning algorithms to create duplicate detection models. These algorithms require a large amount of manually labeled data to train on. Semi-supervised deep learning techniques would allow training to use not only labeled data but also unlabeled data, which is readily available. The expectation is that models trained this way will need less manually labeled data to achieve accuracy similar to or better than that of traditional supervised algorithms.
Item Type:	Essay (Master)
Faculty:	EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject:	54 computer science
Programme:	Interaction Technology MSc (60030)
Link to this item:	https://purl.utwente.nl/essays/70362