University of Twente Student Theses
Hierarchical deep neural networks for MeSH subject prediction
Sadananda Bhat, A. (2019) Hierarchical deep neural networks for MeSH subject prediction.
Abstract: | Extreme Multi-Label Text Classification (XMTC) problems attempt to assign a few relevant labels to text from an extremely large label space. XMTC label spaces generally follow a power-law distribution, resulting in data sparsity for tail labels and aggressive prediction of head labels. Deep learning methods for tackling such large-scale problems have recently gained attention and have reached state-of-the-art performance. Notably, XML-CNN is a deep learning architecture tailored specifically to XMTC problems. Assigning relevant labels to medical journals in the Medline dataset is an XMTC problem with a highly skewed label space and highly arcane terms. This project explored modifications to XML-CNN by implementing a hierarchical XML-CNN architecture that leverages inter-label relationships for training and classification. An automated hierarchy generated by Hierarchical Agglomerative Clustering and the expert-curated MeSH hierarchy for Medline were used to evaluate prediction performance. Borrowing from the concept of multi-task learning, the hierarchies were used to modify the XML-CNN architecture to function as a single model using hard parameter sharing, with a separate loss function for each level of the hierarchy. The experiments focused on testing the effect of the hierarchical approach, the effect of an automatically generated hierarchy, and the effect of a manually curated hierarchy. The use of hierarchies was found to be less suited to Medline label prediction than the original XML-CNN model. However, the performance of the hierarchical models was comparable to that of XML-CNN and high enough that the hierarchical models cannot be considered ineffective for subject prediction tasks. |
Item Type: | Essay (Master) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Interaction Technology MSc (60030) |
Link to this item: | https://purl.utwente.nl/essays/79361 |
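The abstract describes a single model that uses hard parameter sharing with a separate loss function for each level of the label hierarchy. The following is a minimal sketch of that general idea only, not the thesis implementation: a small fully connected layer stands in for the XML-CNN convolutional encoder, and the layer sizes, the two-level hierarchy, the label vectors, and the unweighted sum of the per-level losses are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, vec):
    # Matrix-vector product: one output value per weight row.
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def init(rows, cols, rng):
    # Small random weights; purely illustrative initialisation.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def bce(probs, targets, eps=1e-12):
    # Mean binary cross-entropy over the labels of one hierarchy level.
    return -sum(
        t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        for p, t in zip(probs, targets)
    ) / len(targets)


def forward(features, shared_w, heads):
    # Hard parameter sharing: every level reads the same hidden layer.
    hidden = [max(0.0, h) for h in linear(shared_w, features)]
    # One sigmoid output head per hierarchy level (multi-label outputs).
    return [[sigmoid(z) for z in linear(head, hidden)] for head in heads]


def hierarchical_loss(level_probs, level_targets):
    # Separate loss per level; summing them unweighted is an assumption.
    losses = [bce(p, t) for p, t in zip(level_probs, level_targets)]
    return sum(losses), losses


rng = random.Random(0)
n_features, n_hidden = 8, 4
level_sizes = [3, 6]  # hypothetical: 3 coarse labels, 6 fine labels

shared_w = init(n_hidden, n_features, rng)
heads = [init(n, n_hidden, rng) for n in level_sizes]

features = [rng.random() for _ in range(n_features)]
targets = [[1, 0, 0], [1, 1, 0, 0, 0, 0]]  # hypothetical label vectors

probs = forward(features, shared_w, heads)
total, per_level = hierarchical_loss(probs, targets)
print("per-level losses:", per_level)
print("total loss:", total)
```

In a trained model the gradient of the total loss would update the shared layer from every level at once, which is what lets the coarser levels inform the finer ones. Weighting the levels differently is a natural variation that this sketch does not explore.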