University of Twente Student Theses


Fashion product entity matching

Jundt, O. (2017) Fashion product entity matching.

Abstract: Finding the same product at different webshops (entity matching) plays an important role for many product search engines such as Google Shopping. Knowing which products are identical is essential for deduplicating search results and for providing attractive price comparison features. In many product domains, the matching process is trivial because globally unique identifiers (e.g. ISBN or EAN) can be used. For fashion products such as clothing, shoes and accessories, however, globally unique identifiers are often missing or unreliable, which makes product entity matching much more challenging. This thesis presents an entity matching approach for fashion products that is independent of globally unique identifiers. The basic idea is to use a combination of description, color, shape and texture features instead to compare and classify product pairs between webshops. To be viable in practice, the approach has to be fast and scalable, robust against varying data quality, and achieve near-perfect accuracy. This research addresses these challenges using a real-world example dataset of 1.5 million products from 250+ webshops active in the Netherlands.
Item Type: Essay (Master)
Fashion Evolution
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)