University of Twente Student Theses
Author context extraction for interpreting the external validity of opinion mining results
Bloemen, O. (2013) Author context extraction for interpreting the external validity of opinion mining results.
Abstract: | Nurtured by the seemingly ever-growing amount of user-generated content publicly available on the Internet, opinion mining is a growing field of interest. Applications show promising results in brand monitoring, where the public sentiment towards product features is observed, in monitoring sentiment towards current political issues, and in predicting voting outcomes. Although opinion mining has existed for at least a decade, the validity of its results can be questioned. The majority of publications in this research area provide new techniques for text analysis and/or incremental innovations for sentiment classifiers. Using labeled corpora, these innovations are typically compared to a baseline method to demonstrate their superior accuracy in extracting features and determining the corresponding sentiment. For the results to be generalizable, however, it is important to have information about the context of the sample, i.e. information about the authors of the user-generated content. When this information is missing, the relation between the sample and the target population is unknown and conclusions cannot be drawn, possibly rendering opinion mining reports useless in practice. The purpose of this thesis is to investigate whether this problem of the external validity of opinion mining, or, more specifically, the generalizability of opinion mining results with respect to the review authors, indeed plays a role. Three types of threats to external validity are identified from a literature review and examined: (1) a mismatch in the demographic characteristics of the sample, (2) manipulation of online reviews, and (3) bias due to irrelevant experiences. Different methods are proposed and tested to analyze the influence of these biases on the sentiment report. Theoretical sampling confirmed the existence of both a demographic bias and an experience bias. |
Item Type: | Essay (Master) |
Faculty: | BMS: Behavioural, Management and Social Sciences |
Subject: | 85 business administration, organizational science |
Programme: | Business Administration MSc (60644) |
Link to this item: | https://purl.utwente.nl/essays/63600 |