University of Twente Student Theses
Fashion product entity matching
Jundt, O. (2017) Fashion product entity matching.
Abstract: | Finding the same product at different webshops (entity matching) plays an important role in many product search engines such as Google Shopping. Knowing which products are identical is essential for deduplicating search results and for providing attractive price comparison features. In many product domains, matching is trivial because globally unique identifiers (e.g. ISBN or EAN) can be used. However, for fashion products such as clothing, shoes and accessories, globally unique identifiers are often missing or unreliable, which makes product entity matching much more challenging. This thesis presents an entity matching approach for fashion products that is independent of globally unique identifiers. The basic idea is to use a combination of description, color, shape and texture features instead to compare and classify product pairs between webshops. To be viable in practice, however, the approach has to be fast and scalable, be robust against varying data quality, and achieve near-perfect accuracy. This research addresses these challenges using a real-world example dataset of 1.5 million products from 250+ webshops active in the Netherlands. |
Item Type: | Essay (Master) |
Clients: | Fashion Evolution |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Computer Science MSc (60300) |
Link to this item: | https://purl.utwente.nl/essays/72275 |