University of Twente Student Theses
Reducing labeled data usage in duplicate detection using deep belief networks
Janssen, Stefan C. (2016) Reducing labeled data usage in duplicate detection using deep belief networks.
Abstract: | Modern duplicate detection systems typically use supervised machine learning algorithms to create duplicate detection models. These algorithms require a large amount of manually labeled data to train on. Semi-supervised deep learning techniques would allow training to use not only labeled data but also unlabeled data, which is readily available. The expectation is that models trained this way will need less manually labeled data to achieve accuracy similar to or better than that of traditional supervised algorithms. |
Item Type: | Essay (Master) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Interaction Technology MSc (60030) |
Link to this item: | https://purl.utwente.nl/essays/70362 |