The Egyptian Journal of Language Engineering
Gody, A., Emam, Y., Hussein, N. (2018). Novel Image Preprocessing Approach for Automatic Speech Recognition. The Egyptian Journal of Language Engineering, 5(2), 1-15. doi: 10.21608/ejle.2018.60081

Novel Image Preprocessing Approach for Automatic Speech Recognition

Article 1, Volume 5, Issue 2, September 2018, Pages 1-15
Document Type: Original Article
DOI: 10.21608/ejle.2018.60081
Authors
Amr M. Gody 1; Youssra Abdelmoniem Emam 2; Nashaat M. Hussein 3
1 Electrical Engineering Department, Faculty of Engineering, Fayoum University
2 Communications and Electronics Department, Faculty of Engineering, Fayoum University
3 Electronics & Communication Engineering, Faculty of Engineering, Fayoum University, Egypt
Abstract
This research presents a novel approach to automatic speech recognition (ASR) based on image recognition. It introduces a hybrid 2D-Image (2DI)-Hidden Markov Model (HMM) approach to handle the preprocessing classification task in an ASR system; the focus is on the classification task. Because the proposed approach is novel and addresses only one task within the whole ASR system, it is evaluated by relative comparison with other popular approaches performing the same task on the same database. Results from a hybrid Gaussian Mixture Model (GMM)-HMM with Mel Frequency Cepstral Coefficient (MFCC) features serve as the reference. This research introduces a new method of mapping the speech signal into a two-dimensional space. The speech stream is segmented, and the frequency content of each segment is projected into the frequency domain using a balanced tree-structured filter, implemented with the wavelet packets technique. The tree structure is captured as an image, and a database of encoded images is constructed. The images are then segregated into speech classes. Discrete Cosine Transform (DCT) based features are used for image encoding, and the resulting hybrid model, with HMM as the class model, is evaluated against MFCC-HMM on the same classification problem. The proposed hybrid model gives better-balanced results than MFCC-HMM across the different classes. The classes considered in this research are vowels, consonants, plosives, and silence.
The KED-TIMIT corpus is used in this research as the source of speech data. The approach shows promising results, especially in silence and vowel detection.
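The mapping described above (segment the speech, split each segment with a balanced wavelet packet tree, store the result as an image, encode the image with DCT features) can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar filter, the use of per-band energy as pixel values, the frame length, the tree depth, and all function names are assumptions chosen only to keep the example short and dependency-free.

```python
import math

def haar_split(x):
    """One Haar analysis step (assumed filter): return (lowpass, highpass) halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return low, high

def wavelet_packet_leaves(frame, depth):
    """Balanced tree: every node is split at every level, giving 2**depth leaves.
    Leaves come out in natural (filter-path) order, not sorted by frequency."""
    nodes = [list(frame)]
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            low, high = haar_split(node)
            next_nodes.extend([low, high])
        nodes = next_nodes
    return nodes

def band_energies(frame, depth):
    """Energy of each leaf band (assumed pixel value for the image)."""
    return [sum(c * c for c in leaf) for leaf in wavelet_packet_leaves(frame, depth)]

def speech_to_image(signal, frame_len, depth):
    """Hypothetical 2D mapping: rows are leaf bands, columns are frames."""
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    columns = [band_energies(signal[i:i + frame_len], depth) for i in starts]
    return [[col[b] for col in columns] for b in range(2 ** depth)]

def dct_1d(v):
    """Orthonormal DCT-II of a sequence."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def dct_2d(image):
    """Separable 2D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in image]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def dct_features(image, keep):
    """Keep the top-left keep x keep block of low-order DCT coefficients."""
    coeffs = dct_2d(image)
    return [coeffs[r][c] for r in range(keep) for c in range(keep)]

# Toy usage on a synthetic signal (not KED-TIMIT data).
signal = [math.sin(0.3 * i) for i in range(64)]
image = speech_to_image(signal, frame_len=16, depth=3)   # 8 bands x 4 frames
features = dct_features(image, keep=2)                   # 4 coefficients
```

In a full system, feature vectors like these would be passed to a per-class HMM; that stage is omitted here. The frame length must be divisible by `2 ** depth` so that every leaf band keeps at least one coefficient.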
Keywords
English Phone Recognition; Automatic Speech Recognition (ASR); Mel-Scale; DCT; Wavelet packets; HTK; BTE; MFCC