University of Twente Student Theses

Harvesting unstructured data in heterogenous business environments; exploring modern web scraping technologies

Vording, R.M. (2021) Harvesting unstructured data in heterogenous business environments; exploring modern web scraping technologies.

Abstract: Web scraping technology can retrieve data from multiple sources efficiently and effectively, but companies may find it difficult to adopt. This research includes a literature review of web scraping technology in general as well as its techniques and tools. Web scraping is identified as having a syntactic and a semantic level. APIs are discussed as an alternative, but there seems to be no valid alternative to web scraping. No adoption model specific to web scraping technology exists, which makes the adoption of the technology difficult to manage. Current adoption models are discussed and used as the foundation for a web scraping adoption model. To determine the applicability of the technology, a case study will be conducted at a Dutch Logistics Service Provider to identify use cases for web scraping technology, resulting in a recommendation for its use.
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Business & IT BSc (56066)
Link to this item: http://purl.utwente.nl/essays/85663