University of Twente Student Theses


Improving the Informativeness of Abstractive Opinion Summarization

Schaik, Eric van (2022) Improving the Informativeness of Abstractive Opinion Summarization.

Abstract: Although state-of-the-art abstractive text summarization has improved significantly since the introduction of the transformer architecture, opinion summarization has lagged somewhat behind, partly due to the lack of labeled training data. Various unsupervised learning methods have been tried to solve this problem. Most seem to produce less informative text than regular text summarization, reflecting the lower quality of the opinion summarization datasets. This effect is exacerbated by an apparent lack of metrics that measure perceived informativeness, with the result that little attention has been paid to this aspect of opinion summarization in related research. Therefore, this research focuses on the following research question: How can the informativeness of opinion summarization be improved? First, some ways of measuring informativeness are proposed. Then, a new opinion summarization model is proposed in an attempt to improve on these metrics. The proposed model roughly consists of three parts: topic modeling, where a number of relevant review topics are generated; ranking, where all the sentences from the input reviews are ordered by importance; and filtering, where only the most important sentences are selected as input for the summarization model. Finally, the model is tested and compared with current opinion and text summarization models, namely the COOP [1] and PEGASUS [2] models, respectively. It is shown that the performance of our model is often closer to that of state-of-the-art text summarization models than to that of opinion summarization models, while retaining an accurate sentiment value. A small-scale human evaluation is included as well; its results partly support the conclusions drawn from the automatic evaluation. Therefore, the model proposed here can provide a good alternative to existing models, albeit dependent on the context.
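The three-part structure described in the abstract (topic modeling, ranking, filtering) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the frequency-based stand-in for topic modeling, and the keyword-overlap importance score are all assumptions made purely for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(review):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", review) if s.strip()]


def top_topic_words(reviews, stopwords, n_words=5):
    """Hypothetical stand-in for topic modeling: the most frequent non-stopwords."""
    counts = Counter(
        w for review in reviews for w in tokenize(review) if w not in stopwords
    )
    return {w for w, _ in counts.most_common(n_words)}


def rank_sentences(reviews, topic_words):
    """Order all review sentences by an assumed importance score:
    the number of distinct topic words each sentence contains."""
    sentences = [s for review in reviews for s in split_sentences(review)]
    scored = [
        (sum(1 for w in set(tokenize(s)) if w in topic_words), s)
        for s in sentences
    ]
    # sorted() is stable, so equally scored sentences keep their input order.
    return [s for _, s in sorted(scored, key=lambda pair: -pair[0])]


def filter_top(ranked_sentences, k):
    """Keep only the k highest-ranked sentences as summarizer input."""
    return ranked_sentences[:k]


reviews = [
    "The room was clean. Staff were friendly.",
    "Clean room and great breakfast. Parking was hard.",
]
topics = {"room", "clean", "breakfast"}  # assumed topic words for this example
selected = filter_top(rank_sentences(reviews, topics), k=2)
```

In a full pipeline, the selected sentences would then be passed to an abstractive summarization model; that step is omitted here.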
Item Type: Essay (Master)
Capgemini, Utrecht, Netherlands
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)