Harvesting unstructured data in heterogenous business environments; exploring modern web scraping technologies

Vording, R.M. (2021)

Web scraping technology can retrieve data from multiple sources efficiently and effectively, but companies may find it difficult to adopt. This research includes a literature review of web scraping technology in general, as well as its techniques and tools. Web scraping is identified as having a syntactic and a semantic level. APIs are discussed as a possible alternative, but there seems to be no valid substitute for web scraping. No adoption model specific to web scraping technology exists, which makes it difficult to manage the adoption of the technology. Existing adoption models are therefore discussed and used as the foundation for a web scraping adoption model. To determine the applicability of the technology, a case study will be conducted at a Dutch Logistics Service Provider to identify use cases for web scraping, resulting in a recommendation on the use of the technology.