University of Twente Student Theses

A QA-pair generation system for the incident tickets of a public ICT Shared Service Center

Lammers, Mick (2019) A QA-pair generation system for the incident tickets of a public ICT Shared Service Center.


Abstract:	The age of AI has begun. Artificial Intelligence is becoming a common term in our vocabulary, even though most of us know and understand little about it. It seems that only huge and elusive companies like IBM and Google fully understand its use and potential. In customer service, chatbots are emerging that answer customer questions based on data structures called Question-Answer pairs (QA-pairs), which are most often crafted manually, making the companies that deploy them look like part of that elite. But what about organizations that process so many questions that manual labeling is not an option? Should they remain old-fashioned, static servants that only react to their customers’ inquiries and see no way to serve them proactively? The large companies provide a solution, but with a price tag of millions of dollars. There must be something in between. TopDesk, which holds an 80% market share in the Dutch incident management sector (Datanyze, 2019), does not see how. In this study, we propose a low-threshold QA-pair generation system that uses state-of-the-art technologies to automatically identify unique problems and their solutions in a large, highly varied incident ticket dataset of the nationwide public IT Shared Service Center (SSC-ICT). To achieve this, we surveyed the components and techniques applied in related work and determined the best combination for SSC-ICT, based on identified characteristics of the dataset and the organizational context. Furthermore, we designed a set of component-based evaluation measures to compare the different techniques and determine the best solutions. We then provide a recommendation with a system architecture, its use cases, and potential further improvements. The result is a system consisting of four components: categorizational clustering, intent identification, action recommendation, and reinforcement learning.
For categorizational clustering, we determine categorizational keywords using an existing Latent Semantic Indexing (LSI) algorithm and allocate the tickets to these keywords using Levenshtein distance, which prevents misspelled tickets from being excluded. For the intent identification component, we compared two very different but state-of-the-art techniques: POS patterns and topic modeling (LDA). After applying the evaluation measure, topic modeling came out as the winner, with a slightly lower QA-pair quality score but higher improvement potential and a much higher ticket coverage rate. The actions are cleaned, clustered, and provided through a recommended application: a knowledge base application with reinforcement learning capabilities for use by the 40,000 customers of SSC-ICT. With enough feedback, the expected success rate of the system is about 50%. With further improvements, we believe this can rise to 70-80%. Other uses of the system’s QA-pairs are business intelligence, FAQ extraction, and anomaly detection.
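The allocation step mentioned in the abstract, assigning tickets to categorizational keywords by Levenshtein distance so that misspellings still match, can be illustrated with a minimal sketch. This is not the thesis implementation: the keyword list, the token-level matching, the `max_distance` threshold, and the function names are all assumptions made for illustration, and the keywords here are hard-coded rather than derived with LSI.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca with cb
            ))
        previous = current
    return previous[-1]


def allocate_ticket(text: str, keywords: list[str],
                    max_distance: int = 2) -> Optional[str]:
    """Return the keyword closest to any token of the ticket text.

    Hypothetical rule: a ticket is allocated to a keyword only if some
    token lies within max_distance edits of it; otherwise None.
    """
    best_keyword, best_distance = None, max_distance + 1
    for token in text.lower().split():
        for keyword in keywords:
            distance = levenshtein(token, keyword.lower())
            if distance < best_distance:
                best_keyword, best_distance = keyword, distance
    return best_keyword


# Hypothetical keywords; in the thesis these come from an LSI algorithm.
KEYWORDS = ["password", "printer", "email"]
print(allocate_ticket("my pasword is expired", KEYWORDS))
print(allocate_ticket("coffee machine broken", KEYWORDS))
```

In this sketch the misspelled token "pasword" is one edit away from "password", so the first ticket is still allocated, while the second ticket has no token within the threshold and stays unallocated. An exact string match would have excluded the first ticket.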
Item Type:	Essay (Master)
Clients:	Ministry of Interior and Kingdom Relations, Den Haag, Nederland
Faculty:	EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject:	54 computer science
Programme:	Business Information Technology MSc (60025)
Link to this item:	https://purl.utwente.nl/essays/77562