A Novel Method for Lip Movement Detection using Deep Neural Network
Online Publishing @ NISCAIR
Field | Value |
Authentication Code |
dc |
|
Title Statement |
A Novel Method for Lip Movement Detection using Deep Neural Network |
|
Added Entry - Uncontrolled Name |
Srilakshmi, Kanagala ; School of Electronics Engineering, Centre for Cyber Physical Systems and School of Electronics Engineering, Vellore Institute of Technology, Chennai 600 127, Tamil Nadu, India Karthik, R ; School of Electronics Engineering, Centre for Cyber Physical Systems and School of Electronics Engineering, Vellore Institute of Technology, Chennai 600 127, Tamil Nadu, India |
|
Uncontrolled Index Term |
Attention, CNN, EfficientNet, Lip movement, LSTM, MIRACL-VC1 |
|
Summary, etc. |
Recognition of lip movements has become one of the most challenging tasks in the field and has crucial present-day applications. The task is to recognize the speech uttered by an individual using visual cues alone. Visual interpretation of lip movement is especially useful in scenarios such as video surveillance, where audio signals are either unavailable or too noisy to interpret. It is also useful for hearing-impaired individuals, for whom the audio signal is of no use. Many developments have taken place in this nascent field using deep learning-based techniques. This research analyzes various state-of-the-art deep learning models on the MIRACL-VC1 dataset. The study also aims to identify the optimal baseline architecture for building a new, high-accuracy model for lip movement detection. The models are trained from scratch on the pre-processed MIRACL-VC1 dataset, which consists of small-size images. Experiments with state-of-the-art deep learning models indicate that the EfficientNet B0 architecture yielded an accuracy of 80.13%. EfficientNet B0 is therefore used as the baseline deep architecture to design a customized model for effective detection. This research proposes an attention-based deep learning model combined with a Long Short-Term Memory (LSTM) layer, with EfficientNet B0 as the backbone architecture. The proposed model yielded an accuracy of 91.13%. |
|
Publication, Distribution, Etc. |
Journal of Scientific and Industrial Research (JSIR) 2022-07-25 10:12:01 |
|
Electronic Location and Access |
application/pdf http://op.niscair.res.in/index.php/JSIR/article/view/53898 |
|
Data Source Entry |
Journal of Scientific and Industrial Research (JSIR); Vol 81, No 06 (2022): Journal of Scientific and Industrial Research |
|
Language Note |
en |
|