University of Twente Student Theses
Counting People in Simultaneous Speech using Support Vector Machines
Hogema, T.A.M. (2019) Counting People in Simultaneous Speech using Support Vector Machines.
Abstract: | This paper examines the feasibility of counting the number of simultaneous speakers in an audio clip. There are currently several ways to count people automatically; WiFi, Bluetooth, and video tracking are among the most common. Each of these techniques has drawbacks with regard to accuracy, usability, or practicality. Sound offers another way to determine the number of people in a room, and could serve as an addition or alternative to existing presence-detection solutions. To investigate this, we built a framework that generates scenarios with overlapping speech and evaluates different features, using a support vector machine (SVM) for prediction. The framework first generates the overlapping-speech scenarios; features are then extracted from these scenarios and used to train the SVM. The primary features we used were the Mel-frequency cepstral coefficients (MFCCs). Results show that, on average, we are able to estimate the number of speakers for up to 17 people with a mean error close to zero. |
Item Type: | Essay (Bachelor) |
Faculty: | EEMCS: Electrical Engineering, Mathematics and Computer Science |
Subject: | 54 computer science |
Programme: | Business & IT BSc (56066) |
Link to this item: | https://purl.utwente.nl/essays/78710 |
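The scenario-generation step described in the abstract can be illustrated with a minimal sketch. It is not the thesis framework: the synthetic tones standing in for speaker recordings, the 8 kHz sample rate, and the names `synthetic_speaker`, `mix_scenario`, and `build_dataset` are all assumptions made for illustration. It only shows the general idea of overlaying k single-speaker clips and labelling the mixture with k, which is the kind of labelled data a feature extractor and an SVM could then be trained on.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed, the abstract does not state a sample rate


def synthetic_speaker(pitch_hz, seconds=0.5, rate=SAMPLE_RATE):
    """Hypothetical stand-in for one speaker's clip: a tone with three harmonics."""
    n = int(seconds * rate)
    return [
        sum(math.sin(2 * math.pi * pitch_hz * h * t / rate) / h for h in (1, 2, 3))
        for t in range(n)
    ]


def mix_scenario(clips, k, rng):
    """Overlay k distinct clips sample by sample; return (mixture, label k)."""
    chosen = rng.sample(clips, k)
    length = min(len(c) for c in chosen)
    mixture = [sum(c[i] for c in chosen) for i in range(length)]
    # Normalise to a peak of 1.0 so loudness alone does not reveal the count.
    peak = max(abs(s) for s in mixture) or 1.0
    return [s / peak for s in mixture], k


def build_dataset(clips, max_speakers, per_count, seed=0):
    """Generate per_count labelled mixtures for each speaker count 1..max_speakers."""
    rng = random.Random(seed)
    return [
        mix_scenario(clips, k, rng)
        for k in range(1, max_speakers + 1)
        for _ in range(per_count)
    ]


# Six synthetic "speakers" with different pitches, mixed into 1- to 4-speaker scenarios.
clips = [synthetic_speaker(90 + 15 * i) for i in range(6)]
dataset = build_dataset(clips, max_speakers=4, per_count=3)
```

In a fuller pipeline, each mixture in `dataset` would be converted to a feature vector (the abstract names MFCCs as the primary feature) and paired with its label to train the classifier; those steps are omitted here because the abstract gives no further detail on them.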