University of Twente Student Theses

Counting People in Simultaneous Speech using Support Vector Machines

Hogema, T.A.M. (2019) Counting People in Simultaneous Speech using Support Vector Machines.

Abstract:	This paper examines the feasibility of counting the number of simultaneous speakers in an audio clip. There are several established ways to count people automatically; WiFi, Bluetooth, and video tracking are among the most common. Each of these techniques has downsides with regard to accuracy, usability, or practicality. Sound could serve as an addition or alternative for determining the number of people in a room, for example for presence detection. To investigate this, we built a framework that generates scenarios with overlapping speech and evaluates different features. Features are extracted from the generated scenarios and used to train a support vector machine (SVM), which makes the prediction. The primary feature we used was the Mel-frequency cepstral coefficients (MFCCs). Results show that, on average, we are able to estimate the number of speakers for up to 17 people with a mean error close to zero.
Item Type:	Essay (Bachelor)
Faculty:	EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject:	54 computer science
Programme:	Business & IT BSc (56066)
Link to this item:	https://purl.utwente.nl/essays/78710
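
The scenario-generation step described in the abstract can be illustrated with a minimal sketch. This is not the thesis framework: the function names (`synth_voice`, `mix_clips`, `make_scenario`), the sample rate, and the use of sine tones as stand-ins for recorded speech are all assumptions made for illustration. It only shows how overlapping clips might be summed into one labelled mixture; MFCC extraction and SVM training are left out because they would typically rely on third-party libraries.

```python
import math
import random


def synth_voice(n_samples, sample_rate, rng):
    """Hypothetical stand-in for one speaker's clip: a sine tone at a random pitch."""
    freq = rng.uniform(85.0, 255.0)          # rough range of speech fundamentals (assumption)
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [
        math.sin(2.0 * math.pi * freq * t / sample_rate + phase)
        for t in range(n_samples)
    ]


def mix_clips(clips):
    """Sum equal-length clips sample by sample and peak-normalise to [-1, 1]."""
    if not clips:
        raise ValueError("need at least one clip")
    length = len(clips[0])
    if any(len(clip) != length for clip in clips):
        raise ValueError("all clips must have the same length")
    mixed = [sum(samples) for samples in zip(*clips)]
    peak = max(abs(s) for s in mixed)
    # Leave an all-zero mixture untouched to avoid dividing by zero.
    return [s / peak for s in mixed] if peak > 0 else mixed


def make_scenario(n_speakers, n_samples=1600, sample_rate=16000, seed=0):
    """Build one training example: (mixture, label), where label is the speaker count."""
    rng = random.Random(seed)
    clips = [synth_voice(n_samples, sample_rate, rng) for _ in range(n_speakers)]
    return mix_clips(clips), n_speakers


if __name__ == "__main__":
    mixture, label = make_scenario(n_speakers=3)
    print(len(mixture), label)
```

In a fuller pipeline, each mixture would be turned into a feature vector (MFCCs in the thesis) and the speaker count would serve as the target for the SVM.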