University of Twente Student Theses


Clustered acoustic modelling in speech recognition

Veelen, P. van (2007) Clustered acoustic modelling in speech recognition.

Abstract: Speech recognition uses statistical models to compute the most likely spoken sentence from a given audio input. An acoustic model examines the audio input, and a grammar or language model uses knowledge about the language to determine which sequence of words is most likely to be said. This research focuses on possible ways to improve the acoustic model. An acoustic model is created by taking audio recordings of speech together with their text transcriptions and using software to build statistical knowledge about the sounds that make up each word. This process is called training the acoustic model. When an acoustic model is used on a domain other than the one it was trained for (i.e. other speakers or another acoustic environment), it can be enhanced through acoustic model adaptation, using adaptation data from the new domain. Different adaptation techniques exist, and the effect of adaptation depends on the amount of adaptation data. When the adaptation data comes from a single speaker, a general speaker-independent acoustic model can be transformed into a speaker-dependent acoustic model. When enough adaptation data is available in the new domain for every single speaker, it is possible to create a speaker-dependent acoustic model for each speaker. Because they focus on one speaker, speaker-dependent systems achieve better results than speaker-independent systems. Thus the logical thing to do seems to be to always use speaker-dependent models.
Item Type: Essay (Master)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Computer Science MSc (60300)