Speech Signal Classification Using Support Vector Machines
Electronic Theses of Indian Institute of Science
Field | Value | |
Title |
Speech Signal Classification Using Support Vector Machines
|
|
Creator |
Sood, Gaurav
|
|
Subject |
Speech Recognition
Speech Signal Processing; Automatic Speech Recognition; Artificial Neural Networks; Support Vector Machine; Time Normalization; Hidden Markov Models (HMMs); Computer Science |
|
Description |
Hidden Markov Models (HMMs) are, undoubtedly, the most widely employed core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. Despite some achievements, however, the field still depends on Hidden Markov Models. During the last decade, a new tool has appeared in the field of machine learning that has proved able to cope with hard classification problems in several fields of application: the Support Vector Machine (SVM). SVMs are effective discriminative classifiers with several outstanding characteristics, namely: their solution is the one with maximum margin; they can deal with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed. In this work, a novel approach based on probabilistic kernels in support vector machines has been attempted for speech data classification. The classification accuracy of support vector classification depends on the kernel function used, which in turn depends on the data set at hand. As of now, however, there is no way to know a priori which kernel will give the best results. The kernel used in this work normalizes the time dimension inherent to speech signals by fitting a probability distribution over the data points of each utterance; this facilitates the use of support vector machines, which act on static data only. The divergence between the probability distributions fitted over individual speech utterances is used to form the kernel matrix.
Vowel classification and isolated word recognition (digit recognition) have been attempted, and the results are compared with state-of-the-art systems. |
|
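The kernel construction described in the abstract can be illustrated with a minimal sketch. The record does not say which distribution family or divergence the thesis uses, so everything specific here is an assumption made for illustration: one diagonal-covariance Gaussian per utterance, a symmetric Kullback-Leibler divergence, and an exponential mapping from divergence to similarity. The function names (`fit_diag_gaussian`, `kl_diag`, `symmetric_kl`, `kernel_matrix`) and the `scale` parameter are hypothetical, and the random frames stand in for real acoustic features.

```python
import math
import random


def fit_diag_gaussian(frames, var_floor=1e-6):
    """Fit one diagonal-covariance Gaussian to an utterance (a list of feature frames).

    The utterance length disappears here: any number of frames is reduced to a
    fixed-size (mean, variance) pair, which is the time normalization step.
    """
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [
        max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
        for d in range(dim)
    ]
    return mean, var


def kl_diag(p, q):
    """KL(p || q) between two diagonal Gaussians given as (mean, variance) pairs."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(p, q):
    """Symmetrized divergence, so that the resulting kernel matrix is symmetric."""
    return kl_diag(p, q) + kl_diag(q, p)


def kernel_matrix(utterances, scale=1.0):
    """Map pairwise divergences to similarities with exp(-scale * divergence)."""
    models = [fit_diag_gaussian(u) for u in utterances]
    return [
        [math.exp(-scale * symmetric_kl(p, q)) for q in models]
        for p in models
    ]


# Synthetic stand-in data: three "utterances" of different lengths,
# each frame a 3-dimensional feature vector.
random.seed(0)
utterances = [
    [[random.gauss(mu, 1.0) for _ in range(3)] for _ in range(n_frames)]
    for mu, n_frames in [(0.0, 40), (0.2, 55), (3.0, 70)]
]
K = kernel_matrix(utterances, scale=0.1)
```

A matrix built this way could be passed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. Note that a kernel of this exponentiated-divergence form is not guaranteed to be positive semidefinite, so this is a sketch of the idea and not a statement of the method used in the thesis.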
Contributor |
Balakrishnan, N
|
|
Date |
2011-03-16T04:41:08Z
2011-03-16T04:41:08Z 2011-03-16 2009-07 |
|
Type |
Thesis
|
|
Identifier |
http://etd.iisc.ernet.in/handle/2005/1094
|
|
Language |
en_US
|
|
Relation |
G23702
|
|