160440

CONVOLUTIONAL NEURAL NETWORK FOR ARABIC SPEECH RECOGNITION

Article

Last updated: 24 Dec 2024

Subjects

-

Tags

-

Abstract

This work focuses on single-word Arabic automatic speech recognition (AASR). Two feature extraction techniques are used: log Mel-frequency spectral coefficients (MFSC) and gammatone-frequency cepstral coefficients (GFCC), each with their first- and second-order derivatives. A convolutional neural network (CNN) performs feature learning and classification. CNNs have improved performance in automatic speech recognition (ASR); local connectivity, weight sharing, and pooling are the key properties that give them this potential. We tested the CNN model on an Arabic speech corpus of isolated words. The corpus is synthetically augmented by applying transformations such as changing the pitch, speed, and dynamic range, adding noise, and shifting forward and backward in time. The maximum accuracy, obtained using GFCC with the CNN, is 99.77%. The results are compared with previous reports and indicate that the CNN achieves better performance in AASR.
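Some of the augmentation steps named in the abstract (time shifting, adding noise, and changing speed) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the function names, the SNR-based noise scaling, and the linear-interpolation resampling are all assumptions made for the example. Pitch and dynamic-range changes are omitted.

```python
import math
import random


def time_shift(signal, shift):
    """Shift samples forward (shift > 0) or backward (shift < 0).

    The vacated positions are zero-padded, so the length is unchanged.
    """
    n = len(signal)
    if shift == 0:
        return list(signal)
    if abs(shift) >= n:
        return [0.0] * n
    if shift > 0:
        return [0.0] * shift + list(signal[:n - shift])
    return list(signal[-shift:]) + [0.0] * (-shift)


def add_noise(signal, snr_db, rng):
    """Add white Gaussian noise scaled to an approximate target SNR (dB).

    `rng` is a random.Random instance so results can be made reproducible.
    """
    power = sum(x * x for x in signal) / len(signal)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def change_speed(signal, rate):
    """Resample by linear interpolation; rate > 1 shortens the signal.

    Assumption: naive resampling, which also shifts the pitch.
    """
    n_out = max(1, int(round(len(signal) / rate)))
    out = []
    for i in range(n_out):
        pos = min(i * rate, len(signal) - 1)  # clamp to the last sample
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out
```

In this sketch each transformation returns a new list of the same kind as its input, so the functions can be chained to produce several augmented copies of one recording.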

DOI

10.21608/ejle.2020.47685.1015

Keywords

Arabic automatic speech recognition (AASR), Log Mel-frequency spectral coefficients (MFSC), Gammatone-frequency cepstral coefficients (GFCC), Convolutional neural network (CNN), Isolated words

Authors

First Name

Engy

Last Name

Abdelmaksoud

MiddleName

Ragaei

Affiliation

Basic Science, Faculty of Computers and Information, Fayoum University

Email

era00@fayoum.edu.eg

City

Fayoum

Orcid

-

First Name

Arafa

Last Name

Hassen

MiddleName

-

Affiliation

Physics Department, Faculty of Science, Fayoum University

Email

ash02@fayoum.edu.eg

City

-

Orcid

-

First Name

Nabila

Last Name

Hassan

MiddleName

-

Affiliation

Basic Science Department, Faculty of Computers & Information, Fayoum University

Email

nmh00@fayoum.edu.eg

City

-

Orcid

-

First Name

Mohamed

Last Name

Hesham

MiddleName

-

Affiliation

Engineering Math & Physics Department, Faculty of Engineering, Cairo University

Email

mhesham@eng1.cu.edu.com

City

-

Orcid

-

Volume

8

Article Issue

1

Related Issue

23421

Issue Date

2021-04-01

Receive Date

2020-10-25

Publish Date

2021-04-01

Page Start

27

Page End

38

Print ISSN

2356-8208

Online ISSN

2356-8216

Link

https://ejle.journals.ekb.eg/article_160440.html

Detail API

https://ejle.journals.ekb.eg/service?article_code=160440

Order

3

Type

Original Article

Type Code

1,039

Publication Type

Journal

Publication Title

The Egyptian Journal of Language Engineering

Publication Link

https://ejle.journals.ekb.eg/

MainTitle

CONVOLUTIONAL NEURAL NETWORK FOR ARABIC SPEECH RECOGNITION

Details

Type

Article

Created At

22 Jan 2023