Theodoros Giannakopoulos: Personal Web Page
SOFTWARE / MULTI-CLASS AUDIO SEGMENTATION TOOL

MultiClass Audio Segmentation V.1.0 is a DEMO tool for automatic segmentation and classification of audio streams.

Name: MultiClass Audio Segmentation Demo
Version: 1.0
Release Date: September 2007
Implemented by: Theodoros Giannakopoulos
Institution: Dept of Informatics and Telecommunications, University of Athens, Greece
Contact: tyiannak@di.uoa.gr

* This demo version is provided only for educational purposes without any warranty.
** The scientific background behind the provided interface has been studied and designed by Theodoros Giannakopoulos, Dr. Aggelos Pikrakis and Prof. Sergios Theodoridis.

General: The demo provides a user-friendly interface through which you can load audio streams stored in .wav files and then run an algorithm that breaks each stream into non-overlapping segments and classifies every segment into one of eight audio classes: Music, Speech, Others1 (low environmental sounds: wind, rain etc.), Others2 (sounds with abrupt changes, like a door closing), Others3 (louder sounds, mainly machines and cars), Gunshots, Fights and Screams. These classes have been defined so that the audio content found in movies is described in detail. Furthermore, we have focused on defining audio classes of violent content, so that the system can also be used as a detector of violence in audio information. Such tools can be used in systems for protecting sensitive groups of the population (e.g. children) from violent multimedia content.

The source code of the demo is not yet publicly available.

Downloads: You can download the MultiClass Segmentation here. If you do not have Matlab v. 7.0.0 (R) installed on your PC, you will have to download the Matlab Component Runtime installer here. After extracting the compressed folder, run Gui_segmentation.exe. Finally, we provide a wav file for testing the method, and the respective .mat file with the true class labels, here.

Instructions: The first time you run the executable, you will have to wait a few seconds while the .ctf file produces the necessary binary Matlab files. The following figure shows a screenshot of the provided interface. Click on the image to view an enlarged version and a brief user guide of the GUI.


Figure: Screenshot of Multiclass Audio Segmentation Demo

Algorithm: The algorithm behind the provided interface is partly described in "A MULTI-CLASS AUDIO CLASSIFICATION METHOD WITH RESPECT TO VIOLENT CONTENT IN MOVIES USING BAYESIAN NETWORKS", 2007 IEEE International Workshop on Multimedia Signal Processing, by Theodoros Giannakopoulos, Aggelos Pikrakis and Sergios Theodoridis. That paper proposes a multi-class classification scheme for audio segments, based on Bayesian Networks. In the current version of the DEMO, a very simple algorithm that uses fixed overlapping long-term windows has been implemented for segmenting the audio stream. The overall performance of the proposed segmentation scheme is above 60% when eight classes are used.
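The fixed-window idea can be illustrated with a short Python sketch. This is not the demo's code (which is written in Matlab and not public), only a minimal illustration under stated assumptions: the window length, the step, the rule that each window labels the step-long interval at its start, and the merging of adjacent equal labels are all choices made for this example. The names `window_starts`, `segment` and `toy_classifier` are hypothetical, and `toy_classifier` is a trivial loudness threshold standing in for the Bayesian Network classifier of the paper.

```python
from typing import Callable, List, Sequence, Tuple

# The eight class names listed on this page.
CLASSES = ["Music", "Speech", "Others1", "Others2",
           "Others3", "Gunshots", "Fights", "Screams"]

Segment = Tuple[float, float, str]  # (start_sec, end_sec, label)


def window_starts(n_samples: int, win: int, step: int) -> List[int]:
    """Start indices of fixed-length windows; they overlap when step < win."""
    if win <= 0 or step <= 0:
        raise ValueError("win and step must be positive")
    if n_samples < win:
        return []
    return list(range(0, n_samples - win + 1, step))


def toy_classifier(window: Sequence[float]) -> str:
    """Placeholder (assumption): loud windows are 'Music', quiet ones 'Others1'."""
    mean_abs = sum(abs(x) for x in window) / len(window)
    return "Music" if mean_abs > 0.5 else "Others1"


def segment(signal: Sequence[float], sample_rate: int,
            classify: Callable[[Sequence[float]], str],
            win_sec: float = 2.0, step_sec: float = 1.0) -> List[Segment]:
    """Classify overlapping windows, then merge them into non-overlapping segments."""
    win = int(round(win_sec * sample_rate))
    step = int(round(step_sec * sample_rate))
    labels = [classify(signal[s:s + win])
              for s in window_starts(len(signal), win, step)]

    segments: List[Segment] = []
    for i, label in enumerate(labels):
        start = i * step_sec
        end = start + step_sec
        if segments and segments[-1][2] == label:
            # Same class as the previous interval: extend that segment.
            segments[-1] = (segments[-1][0], end, label)
        else:
            segments.append((start, end, label))

    if segments:
        # Let the last segment run to the end of the final window.
        first, _, label = segments[-1]
        segments[-1] = (first, (len(labels) - 1) * step_sec + win_sec, label)
    return segments


if __name__ == "__main__":
    # Synthetic stream at 10 samples/sec: 3 s quiet followed by 3 s loud.
    stream = [0.0] * 30 + [1.0] * 30
    for seg in segment(stream, 10, toy_classifier):
        print(seg)
```

Any function that maps a window of samples to one of the names in `CLASSES` can be passed as `classify`; only the windowing and merging logic is shown here.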

For any problems with the demo, or for ideas and suggestions, please contact me at tyiannak@di.uoa.gr.



