University of Twente Student Theses


Predicting user loyalty in an education support web application based on usage data

El Assal, Karim M. (2018) Predicting user loyalty in an education support web application based on usage data.

Abstract: Data use in education has been increasing over the last 20 years. School management and teachers are moving towards a data-driven policy and improvement process because of the potential benefits. Studies show that data use can increase student achievement to some extent, and that the use of information systems that facilitate data use has a positive impact on educational management. Many barriers limit the adoption of data use, one of which is the teachers themselves, specifically their data (il)literacy and their attitude towards data use. The proper adoption of data systems by teachers is paramount in this context: without appropriate use of the means, data cannot be used effectively. A few factors that influence the adoption of data systems are the data's availability, reliability, findability and interpretability. Knowing users' opinions on these factors is essential to improving such a data system. Gathering these opinions, however, is time and resource intensive. That is why Reichheld's Net-Promoter Score (NPS) survey is so popular. It asks the question "How likely is it that you would recommend our system to a friend or colleague?" and expects an answer between 0 (extremely unlikely) and 10 (extremely likely). Users scoring 9 or 10 are called Promoters, users scoring 7 or 8 are Passives and users scoring between 0 and 6 are Detractors. The score is considered a measure of loyalty: Reichheld argues that users who promote your system put their own reputation on the line. This research focuses on predicting a user's NPS response based on their system usage data. With this loyalty prediction, system owners can better target their system evaluation and improvement efforts. The use case is Somtoday, an exemplary Dutch high school administration system developed by Topicus Education. The research question is: "How reliably can teachers' usage data, generated by an education support data system, be used to predict user loyalty towards that system?"
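As an illustration of the NPS categories described above, the following minimal Python sketch maps a survey answer to its category. The function name `nps_category` is hypothetical and does not come from the thesis; only the score boundaries are taken from the text.

```python
def nps_category(score: int) -> str:
    """Map an NPS answer (0-10) to the category named in the abstract.

    Boundaries follow the text: 9-10 Promoter, 7-8 Passive, 0-6 Detractor.
    The function name is an illustrative assumption.
    """
    if not 0 <= score <= 10:
        raise ValueError("NPS answers must lie between 0 and 10")
    if score >= 9:
        return "Promoter"
    if score >= 7:
        return "Passive"
    return "Detractor"


if __name__ == "__main__":
    # Print the category for every possible answer.
    for s in range(11):
        print(s, nps_category(s))
```

This is the same reduction from the eleven-value scale to the three-value scale that the study uses as one of its prediction targets.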
There are 1085 NPS responses available, each consisting of a score and a reason for that score, with an average of 1408 relevant log entries per NPS response. Additionally, non-identifying profile data is available, such as a teacher's school and the education levels he or she is teaching. Efforts to determine what behavior might influence the NPS score have yielded data feature specifications about prevalence of functionality use, repetitive tasks, clickstreams, encountered downtimes, login frequencies, the amount of system usage and teacher profile data. Based on the expected impact on the NPS response, preparation time and generalizability to other systems, data features were selected about system usage, repetitive tasks and profile data. After applying a range of extraction, transformation and load (ETL) operations and a sequential pattern mining algorithm we created, Apriorirep1i, numerical or binary values for each data feature were extracted. The actual NPS prediction was done using machine learning. A brute force approach was applied to account for differences in models, model parameters, and preprocessing methods such as normalization and outlier removal. Additionally, the data type of the predicted NPS score was treated in three different ways: numerical (0-10), eleven-value categorical (0-10), and three-value categorical (Promoter, Passive, Detractor). The best models achieved a mean absolute error (MAE) of 1.727 with numerical prediction, an accuracy of 27.92% with eleven-value categorical prediction, and an accuracy of 54.30% with three-value categorical prediction. If one always predicts the dominant class or value, i.e. 7 or Passive, the MAE with numerical prediction is 1.733, the accuracy with eleven-value categorical prediction is 27.63%, and the accuracy with three-value categorical prediction is 44.07%. Validation was provided by the brute force approach and by looking at different performance metrics.
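The dominant-value comparison above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the study: the function name `dominant_value_baseline` and the list `toy_scores` are hypothetical, and the toy data is invented purely to make the example runnable.

```python
from collections import Counter


def dominant_value_baseline(scores):
    """Score a predictor that always outputs the most frequent value.

    Returns (dominant value, mean absolute error, accuracy).
    Name and signature are illustrative assumptions.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    # Most frequent score in the data; this is the constant prediction.
    dominant = Counter(scores).most_common(1)[0][0]
    n = len(scores)
    # MAE treats the scores as numbers on the 0-10 scale.
    mae = sum(abs(s - dominant) for s in scores) / n
    # Accuracy treats the scores as categories.
    accuracy = sum(s == dominant for s in scores) / n
    return dominant, mae, accuracy


if __name__ == "__main__":
    # Hypothetical toy data, NOT the study's 1085 responses.
    toy_scores = [7, 7, 8, 6, 9, 7, 3, 8, 10, 7]
    dominant, mae, accuracy = dominant_value_baseline(toy_scores)
    print(f"dominant={dominant} MAE={mae:.3f} accuracy={accuracy:.2%}")
```

A trained model is only informative to the extent that it beats such a baseline. For the figures reported above, the gaps are 0.006 in MAE (1.733 versus 1.727), 0.29 percentage points for eleven-value prediction (27.92% versus 27.63%), and 10.23 percentage points for three-value prediction (54.30% versus 44.07%).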
The small performance difference between the trained models and always predicting the dominant class or value shows that there is practically no predictive value in the dataset. The conclusion of this study is that the researched data features cannot reliably be used with machine learning models to predict NPS scores. This does not rule out the possibility that user loyalty towards a system can be predicted based on user behavior. Our main recommendation for Topicus is to focus on finding predictive value in other usage patterns. The recommendation from a more scientific perspective is the same, but with a preliminary step: to conduct a more in-depth feature discovery research project. A full-scale user experience study with the goal of usage data analysis gives the researcher quantitative and qualitative data about what users think of different aspects (e.g. navigation and layout) and functionalities, and about how their behavior is mapped onto the log entries dataset. Better insight leads not only to validated feature selection but also to refined knowledge about how data features can be measured and what the nuances are. With this approach, one or more directed studies into specific usage patterns can be set up, and the researcher has a better chance of finding predictive value in user behavior: the researcher is no longer looking in the proverbial dark.
Item Type: Essay (Master)
Topicus, Deventer, Netherlands
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science, 81 education, teaching
Programme: Computer Science MSc (60300)