University of Twente Student Theses
Implications of Detection of Moral Foundations in Written Text
Kleijn, T.F.C.L. de (2024) Implications of Detection of Moral Foundations in Written Text.
Abstract: | The Moral Foundation Theory is a way of classifying intuitive moral behaviour. Because the theory treats morals as intuitive and fundamental, it holds that people fall back on the morals they find important before reasoning logically. These morals also affect the language used, and understanding their presence can help in better understanding the underlying message. Detecting the presence of moral foundations in a piece of text can be interesting for the psychology domain, but such a task requires knowledge of the natural language processing domain. This thesis tries to bridge the gap from natural language processing to psychology and elaborates on steps taken for granted within the natural language processing community. In particular, this research compares and analyses text representation methods and classification algorithms, and tests their suitability for cross-data-set classification with the available data sets. We use two large data sets with annotations reflecting the Moral Foundation Theory: the Moral Foundation Twitter Corpus and the Moral Foundation Reddit Corpus. Based on majority voting, a single label is selected from all annotations for each post. All experiments are performed in two variations of classification: moral against non-moral and morals-only. For text representation methods, we compare a general Word2Vec embedding, GloVe’s pre-trained Twitter-200 model, against a dedicated dictionary based on the Moral Foundation Theory, the extended Moral Foundation Dictionary. The classification algorithms compared are Logistic Regression, Support Vector Machines and DistilBERT. Lastly, we test the performance of cross-data-set classification. The results show that moral against non-moral classification is successful regardless of text representation or classification method, whereas morals-only classification is only successful with GloVe’s representation. Comparing the classification algorithms, DistilBERT generally performs better, but does not strictly outclass Logistic Regression or Support Vector Machines. Unfortunately, cross-data-set classification is not successful with the data sets at hand. Future work should consider improving text embedding techniques, returning more classification outputs to cover the ambiguous nature of the Moral Foundation Theory, and aligning the theme of the training data set with that of the testing data set. |
Item Type: | Essay (Master) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science, 77 psychology |
Programme: | Interaction Technology MSc (60030) |
Link to this item: | https://purl.utwente.nl/essays/99784 |