University of Twente Student Theses


Counting People in Simultaneous Speech using Support Vector Machines

Hogema, T.A.M. (2019) Counting People in Simultaneous Speech using Support Vector Machines.

Abstract: This paper examines the feasibility of counting the number of simultaneous speakers in an audio clip. There are currently several ways to count people automatically; WiFi, Bluetooth, and video tracking are among the most common. Each of these techniques has downsides with regard to accuracy, usability, or practicality. Sound could serve as a complement or an alternative for presence detection by determining the number of people in a room. To investigate this, we built a framework that generates scenarios with overlapping speech and evaluates different features. Features are extracted from the generated scenarios and used to train a support vector machine (SVM), which performs the prediction. The primary features were Mel-frequency cepstral coefficients (MFCCs). Results show that, on average, we can estimate the number of speakers for up to 17 people with a mean error close to zero.
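The scenario-generation step described in the abstract can be illustrated with a minimal sketch. It is not the thesis framework: synthetic sine tones stand in for real speech recordings, and the names `synthetic_voice`, `mix_speakers`, `generate_scenarios`, and the 8 kHz sample rate are assumptions made for illustration. Each scenario overlays a known number of clips and keeps that number as the label a classifier such as an SVM would later be trained on; feature extraction (for example MFCCs) and the SVM itself are left out because they would need external libraries.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; assumed value, not taken from the thesis


def synthetic_voice(duration_s, pitch_hz, rng):
    """Stand-in for one speaker's recording: a sine tone with random phase."""
    n_samples = round(duration_s * SAMPLE_RATE)
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [
        math.sin(2.0 * math.pi * pitch_hz * t / SAMPLE_RATE + phase)
        for t in range(n_samples)
    ]


def mix_speakers(clips):
    """Overlay clips sample by sample, then scale the peak to at most 1.0."""
    length = min(len(clip) for clip in clips)
    mixed = [sum(clip[i] for clip in clips) for i in range(length)]
    peak = max(abs(sample) for sample in mixed) or 1.0  # avoid dividing by zero
    return [sample / peak for sample in mixed]


def generate_scenarios(max_speakers, per_count, seed=0):
    """Return (mixed_signal, speaker_count) pairs for counts 1..max_speakers."""
    rng = random.Random(seed)
    scenarios = []
    for count in range(1, max_speakers + 1):
        for _ in range(per_count):
            clips = [
                # 85-255 Hz roughly spans typical adult speaking pitch
                synthetic_voice(0.1, rng.uniform(85.0, 255.0), rng)
                for _ in range(count)
            ]
            scenarios.append((mix_speakers(clips), count))
    return scenarios


scenarios = generate_scenarios(max_speakers=3, per_count=2)
labels = [count for _, count in scenarios]
```

Here `labels` holds the ground-truth speaker count for each generated mixture. In a fuller pipeline, a feature vector computed from each mixture would be paired with its label to train and evaluate the classifier.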
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Business & IT BSc (56066)