81879

Broad Phonetic Classification of ASR using Visual Based Features

Article

Last updated: 24 Dec 2024

Subjects

-

Tags

-

Abstract

This paper presents a novel method for classifying speech phonemes. Four hybrid techniques, combining the acoustic-phonetic approach with the pattern recognition approach, are used to demonstrate the principal idea of this research. The first hybrid model consists of a fixed-state, structured Hidden Markov Model, Gaussian Mixture, Mel-scaled Best Tree Image, Convolution Neural Network, and Vector Quantization (FS-HMM-GM-MBTI-CNN-VQ). The second hybrid model consists of a variable-state, dynamically structured Hidden Markov Model, Gaussian Mixture, Mel-scaled Best Tree Image, Convolution Neural Network, and Vector Quantization (VS-HMM-GM-MBTI-CNN-VQ). The third hybrid model consists of a fixed-state, structured Hidden Markov Model, Gaussian Mixture, Mel-scaled Best Tree Image, and Convolution Neural Network (FS-HMM-GM-MBTI-CNN). The fourth hybrid model consists of a variable-state, dynamically structured Hidden Markov Model, Gaussian Mixture, Mel-scaled Best Tree Image, and Convolution Neural Network (VS-HMM-GM-MBTI-CNN). The TIMIT database is used in this paper. All phones are grouped into five classes: Vowels, Plosives, Fricatives, Nasals, and Silences. The results show that VS-HMM-GM-MBTI-CNN-VQ is a viable method for phoneme classification, with potential for use in applications such as automatic speech recognition and automatic language identification. Competitive results are achieved, with higher success rates for nasals, plosives, and silences than for the other classes.
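The grouping of phones into five broad classes can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual phone-to-class table, so the table below is an assumption: it uses standard TIMIT phone codes and covers only phones whose broad class is unambiguous. Affricates, semivowels, glides, and stop closures are deliberately left unassigned, since the abstract does not say where they belong. The names `BROAD_CLASSES`, `PHONE_TO_CLASS`, and `broad_class` are hypothetical.

```python
from typing import Optional

# Assumed mapping from TIMIT phone codes to the five broad classes named
# in the abstract. This is an illustration, not the paper's own table.
BROAD_CLASSES = {
    "vowel": {"iy", "ih", "eh", "ey", "ae", "aa", "aw", "ay", "ah", "ao",
              "oy", "ow", "uh", "uw", "ux", "er", "ax", "ix", "axr", "ax-h"},
    "plosive": {"b", "d", "g", "p", "t", "k"},
    "fricative": {"s", "sh", "z", "zh", "f", "th", "v", "dh"},
    "nasal": {"m", "n", "ng", "em", "en", "eng", "nx"},
    "silence": {"h#", "pau", "epi"},
}

# Invert the table once so that each lookup is a single dictionary access.
PHONE_TO_CLASS = {
    phone: cls for cls, phones in BROAD_CLASSES.items() for phone in phones
}


def broad_class(phone: str) -> Optional[str]:
    """Return the broad class of a TIMIT phone code, or None if unassigned."""
    return PHONE_TO_CLASS.get(phone.lower())
```

Calling `broad_class("ng")` looks up a nasal, while a phone outside the assumed table, such as the affricate `jh`, yields `None` rather than a guessed class.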

DOI

10.21608/ejle.2020.24358.1003

Keywords

ASR, HTK, Convolution Neural Network, Vector Quantization, Hidden Markov Model

Authors

First Name

Doaa

Last Name

Lehabik

MiddleName

Ahmed

Affiliation

Department of Communication and Electronic, Faculty of Engineering, Fayoum University

Email

da1174@fayoum.edu.eg

City

-

Orcid

-

First Name

Mohamed

Last Name

Merzban

MiddleName

H.

Affiliation

Faculty of Engineering, Fayoum University

Email

mhm00@fayoum.edu.eg

City

Fayoum

Orcid

-

First Name

Sameh

Last Name

Saad

MiddleName

F.

Affiliation

Faculty of Engineering

Email

dr.sam.far@gmail.com

City

Cairo

Orcid

-

First Name

Amr

Last Name

Gody

MiddleName

M.

Affiliation

Faculty of Engineering, Fayoum University

Email

amg00@fayoum.edu.eg

City

Fayoum

Orcid

0000-0003-2079-9860

Volume

7

Article Issue

1

Related Issue

14114

Issue Date

2020-04-01

Receive Date

2020-01-06

Publish Date

2020-04-01

Page Start

14

Page End

26

Print ISSN

2356-8208

Online ISSN

2356-8216

Link

https://ejle.journals.ekb.eg/article_81879.html

Detail API

https://ejle.journals.ekb.eg/service?article_code=81879

Order

2

Type

Original Article

Type Code

1,039

Publication Type

Journal

Publication Title

The Egyptian Journal of Language Engineering

Publication Link

https://ejle.journals.ekb.eg/

MainTitle

Broad Phonetic Classification of ASR using Visual Based Features

Details

Type

Article

Created At

22 Jan 2023