Harvesting unstructured data in heterogenous business environments; exploring modern web scraping technologies

Author(s): Vording, R.M. (2021)

Abstract:
Web scraping technology can retrieve data from multiple sources efficiently and effectively, but companies may find it difficult to adopt. This research includes a literature review of web scraping technology in general, as well as its techniques and tools. Web scraping is identified as operating at both a syntactic and a semantic level. APIs are discussed as an alternative, but there seems to be no valid substitute for web scraping. No adoption model specific to web scraping technology exists, which makes adoption of the technology difficult to manage. Current adoption models are therefore discussed and used as the foundation for a web scraping adoption model. To determine the applicability of the technology, a case study will be conducted at a Dutch logistics service provider to identify use cases for web scraping, resulting in a recommendation on the use of the technology.

Document(s):

Vording_BA_EEMCS.pdf