The Egyptian Journal of Language Engineering
Elmaghraby, E., Gody, A., Farouk, M. (2019). Speech Recognition Using Historian Multimodal Approach. The Egyptian Journal of Language Engineering, 6(2), 44-58. doi: 10.21608/ejle.2019.59164

Speech Recognition Using Historian Multimodal Approach

Article 4, Volume 6, Issue 2, September 2019, Pages 44-58
Document Type: Original Article
DOI: 10.21608/ejle.2019.59164
Authors
Eslam Eid Elmaghraby 1; Amr Refaat Gody 2; Mohamed Hashem Farouk 3
1 Communication and Electronics Engineering Department, Faculty of Engineering, Fayoum University
2 Faculty of Engineering, Fayoum University
3 Engineering Math. & Physics Dept., Faculty of Engineering, Cairo University
Abstract
This paper proposes an Audio-Visual Speech Recognition (AVSR) model that uses both audio and visual speech information
to improve recognition accuracy in clean and noisy environments. Mel-frequency cepstral coefficients (MFCC) and the Discrete
Cosine Transform (DCT) are used to extract features from the audio and visual speech signals, respectively.
Classification is performed on the combined feature vector using a Deep Neural Network (DNN)
architecture, Bidirectional Long Short-Term Memory (BiLSTM), in contrast to the traditional Hidden Markov Models (HMMs).
The effectiveness of the proposed model is demonstrated on GRID, a multi-speaker AVSR benchmark dataset. The
experimental results show that early integration of audio and visual features yields a clear improvement in
recognition accuracy, and that BiLSTM is a more effective classifier than HMM. With integrated audio-visual
features, the highest recognition accuracy on clean data is 99.07%, an enhancement of up to 9.28% over
audio-only recognition. On noisy data, the highest recognition accuracy with integrated audio-visual features
is 98.47%, an enhancement of up to 12.05% over audio-only recognition. The main reason for the effectiveness of
BiLSTM is that it takes into account the sequential characteristics of the speech signal. The results improve
on the highest audio-visual recognition accuracies previously reported on GRID and demonstrate the robustness of
the proposed AVSR model (BiLSTM-AVSR).
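
The early integration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`dct_ii`, `dct_2d`, `visual_features`, `early_fusion`), the 4x4 patch, the number of retained DCT coefficients, and the 3-value audio vector standing in for an MFCC frame are all assumptions chosen for illustration. The sketch also assumes the audio and visual streams have already been aligned to the same frame rate, which the abstract does not discuss.

```python
import math


def dct_ii(signal):
    """Orthonormal 1-D DCT-II of a list of floats."""
    n = len(signal)
    out = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(patch):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_ii(row) for row in patch]
    cols = [dct_ii(list(col)) for col in zip(*rows)]
    # cols[j][u] holds vertical frequency u of column j; transpose back
    # so that the result is indexed as coeffs[u][v].
    return [list(r) for r in zip(*cols)]


def visual_features(patch, keep=2):
    """Keep the keep x keep block of lowest-frequency DCT coefficients.

    The block size is a hypothetical choice, not taken from the paper.
    """
    coeffs = dct_2d(patch)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


def early_fusion(audio_frames, visual_frames):
    """Concatenate per-frame audio and visual vectors (frames assumed aligned)."""
    if len(audio_frames) != len(visual_frames):
        raise ValueError("audio and visual streams must have the same number of frames")
    return [list(a) + list(v) for a, v in zip(audio_frames, visual_frames)]


if __name__ == "__main__":
    # Hypothetical grayscale mouth-region patch for one video frame.
    mouth_patch = [
        [0.2, 0.3, 0.3, 0.2],
        [0.4, 0.8, 0.8, 0.4],
        [0.4, 0.8, 0.8, 0.4],
        [0.2, 0.3, 0.3, 0.2],
    ]
    # Placeholder for one frame of MFCC values; real MFCC vectors are longer.
    audio_frame = [0.1, 0.2, 0.3]
    fused = early_fusion([audio_frame], [visual_features(mouth_patch)])
    print(len(fused[0]))  # audio dimensions plus retained DCT coefficients
```

Under this scheme the classifier sees a single vector per frame, so a sequence model such as a BiLSTM would consume the fused sequence directly rather than combining separate audio and visual decisions afterwards.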
Keywords
DCT; MFCC; HMM; BiLSTM; GRID