University of Twente Student Theses

ExpressTTS: Augmentation for Speech Recognition with Expressive Speech Synthesis

Kempen, Lindsay (2022) ExpressTTS: Augmentation for Speech Recognition with Expressive Speech Synthesis.

Abstract:	Current automatic speech recognition (ASR) systems perform markedly worse on expressive speech, yielding higher Word Error Rates (WER). Producing a large-scale training corpus of human expressive speech is very laborious. In a manner similar to data augmentation, we explore expressiveness within a text-to-speech (TTS) system to create a larger amount of speech data. Our speech synthesizer, ExpressTTS, aims to explore prosodic factors (pitch, energy, duration) and spectral tilt separately in a regularized latent space while conditioning on the text and speaker. This way, we find expressive patterns that are natural in these contexts. Our non-autoregressive model parallelizes inference, allowing us to generate a large-scale corpus. We focus on the TTS part of the augmentation pipeline. We train the system on a small-scale expressive corpus, padded with neutral speech data. Quantitative analysis shows that the resulting domain mismatch reduces the stability of both our model and the baseline. Nonetheless, the model generates speech with prosodic variation, and we find that ExpressTTS consistently generalizes better to unseen in-domain data than the baseline, Glow-TTS. The user study suggests that our model produces more diverse expressiveness and significantly more emotion and style than the baseline. We conclude with directions on how to use our model for exploring expressiveness.
Item Type:	Essay (Master)
Clients:	Sony, Stuttgart, Germany
Faculty:	EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject:	54 computer science
Programme:	Computer Science MSc (60300)
Link to this item:	https://purl.utwente.nl/essays/89721