Signal and Image Processing Lab
Dept. of Information and Telecommunications
University of Athens, Greece
Demo implemented by:
Theodoros Giannakopoulos, Dr. Aggelos Pikrakis and Prof. Sergios Theodoridis.
For more information on the algorithm, please visit our website at www.di.uoa.gr/~sp_mu
Algorithm Description
The algorithm consists of 3 stages:
Stage 1: A simple segmentation algorithm is applied to the original signal to detect speech and music segments with a high class probability. The algorithm uses chroma entropy as its feature, and its parameters are set so that the precision of the segmentation is maximized. Part of the audio stream (usually between 30% and 60%) is left unclassified, but more than 98% of the segments that are classified receive the correct label.
Stage 2: The unclassified segments are fed to a more sophisticated (and therefore more time-consuming) segmentation algorithm, which is based on a hybrid Hidden Markov Model and Bayesian Network architecture.
Stage 3: A number of post-processing procedures are executed to improve the final classification result.
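
The three stages above form a cascade: a cheap, high-precision first pass with a reject option, a costlier fallback for whatever the first pass rejects, and a final clean-up. The sketch below illustrates that control flow only and is not the demo's implementation. Everything in it is an assumption made for illustration: the function names, the two thresholds `low` and `high`, the mapping of low chroma entropy to music and high chroma entropy to speech, the `fallback` callable that stands in for the hybrid Hidden Markov Model and Bayesian Network classifier, and the majority-vote smoothing used as a stand-in for the post-processing procedures.

```python
import math
from typing import Callable, List, Optional, Sequence


def entropy(chroma: Sequence[float]) -> float:
    """Shannon entropy (in bits) of a non-negative vector, normalized to sum to 1."""
    total = sum(chroma)
    if total <= 0:
        return 0.0
    probs = [v / total for v in chroma if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def stage1_label(chroma: Sequence[float], low: float, high: float) -> Optional[str]:
    """Hypothetical Stage 1: label only the confident frames.

    Assumption (not from the source): entropy <= low means music,
    entropy >= high means speech. Frames in between stay unclassified (None).
    """
    e = entropy(chroma)
    if e <= low:
        return "music"
    if e >= high:
        return "speech"
    return None


def smooth(labels: List[str], window: int = 3) -> List[str]:
    """Hypothetical Stage 3: majority vote in a sliding window; ties keep the label."""
    half = window // 2
    out = []
    for i, current in enumerate(labels):
        hood = labels[max(0, i - half): i + half + 1]
        music, speech = hood.count("music"), hood.count("speech")
        if music > speech:
            out.append("music")
        elif speech > music:
            out.append("speech")
        else:
            out.append(current)
    return out


def classify_stream(
    frames: List[Sequence[float]],
    low: float,
    high: float,
    fallback: Callable[[Sequence[float]], str],
) -> List[str]:
    """Run the cascade: Stage 1, then the fallback on rejected frames, then smoothing."""
    first_pass = [stage1_label(f, low, high) for f in frames]
    # Stage 2 placeholder: only frames that Stage 1 left unclassified reach the fallback.
    filled = [lab if lab is not None else fallback(f) for f, lab in zip(frames, first_pass)]
    return smooth(filled)
```

Widening the gap between `low` and `high` makes the first pass reject more frames and commit only to the clearest ones, which mirrors the trade-off described in Stage 1: higher precision on the classified part in exchange for a larger unclassified part that the slower classifier must handle.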