160437

Comparative Study of different types of RNN in Speech Classification

Article

Last updated: 24 Dec 2024

Subjects

-

Tags

-

Abstract

This paper compares several models for pre-processing classification and reports their performance in an Automatic Speech Recognition (ASR) system. Several Recurrent Neural Network (RNN) architectures were tested: standard RNN cells (RNN), bidirectional RNN (BRNN), Long Short-Term Memory (LSTM), and bidirectional LSTM (BLSTM). Two feature sets were considered. The first is Mel-frequency cepstral coefficients (MFCC) plus delta and delta-delta coefficients (39 parameters). The second is MFCC quantized using the Vector Quantization technique. All models were trained on the TIMIT database. Vowels, nasals, fricatives, plosives, and silences were chosen as the syllable classes for classification. Experimental results show that the BRNN-MFCC-5-{30,30,20,25,25} and BLSTM-MFCC-4-{30,30,25,20} systems, both using MFCC plus delta and delta-delta coefficients (39 parameters), give the highest accuracy, achieving 92.6% and 92.07%, respectively. Vowels, nasals, and silences are classified most accurately by the BLSTM-MFCC-4-{30,30,25,20} model, with 98.5%, 83.6%, and 93.7%, respectively. Fricatives and plosives are classified most accurately by the BRNN-MFCC-5-{30,30,20,25,25} model, with 89.7% and 66%, respectively.
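The two feature sets named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes 13 base MFCCs per frame (so that static, delta, and delta-delta coefficients total 39), a regression window of 2 frames, and squared Euclidean distance for the vector quantizer; none of these details are stated in the abstract. The function names `delta`, `stack_features`, and `quantize` are hypothetical.

```python
def delta(frames, width=2):
    """Regression-based delta coefficients for a list of per-frame vectors.

    Assumption: window width of 2 frames, with edge frames repeated
    (clamped) at the start and end of the utterance.
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(mfcc):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(mfcc)   # first-order differences (delta)
    d2 = delta(d1)     # second-order differences (delta-delta)
    return [c + a + b for c, a, b in zip(mfcc, d1, d2)]


def quantize(frame, codebook):
    """Return the index of the nearest codeword (squared Euclidean distance)."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(codebook)), key=lambda i: dist(frame, codebook[i]))


if __name__ == "__main__":
    # Toy input: 6 frames of 13 coefficients each (placeholder values, not real MFCCs).
    toy_mfcc = [[float(t)] * 13 for t in range(6)]
    features = stack_features(toy_mfcc)
    print(len(features), len(features[0]))
```

With 13 static coefficients per frame, each stacked vector has 13 + 13 + 13 = 39 entries, matching the "39 parameters" in the abstract. In the second feature set, each frame would instead be replaced by the codeword index returned by `quantize`, given a codebook trained beforehand.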

DOI

10.21608/ejle.2021.45203.1014

Keywords

ASR, classification technique, RNN, MFCC, Vector Quantization

Authors

First Name

Ayat

Last Name

Ragheb

MiddleName

N.

Affiliation

Electronics and Communication, Faculty of Engineering, Fayoum University, Fayoum, Egypt

Email

an1162@fayoum.edu.eg

City

Fayoum

Orcid

-

First Name

Amr

Last Name

Gody

MiddleName

-

Affiliation

Faculty of Engineering, Fayoum University

Email

amg00@fayoum.edu.eg

City

Fayoum

Orcid

0000-0003-2079-9860

First Name

Tarek

Last Name

Said

MiddleName

-

Affiliation

Electronics and Communication, Faculty of Engineering, Fayoum University, Fayoum, Egypt

Email

tms02@fayoum.edu.eg

City

Fayoum, Egypt

Orcid

-

Volume

8

Article Issue

1

Related Issue

23421

Issue Date

2021-04-01

Receive Date

2020-10-05

Publish Date

2021-04-01

Page Start

1

Page End

16

Print ISSN

2356-8208

Online ISSN

2356-8216

Link

https://ejle.journals.ekb.eg/article_160437.html

Detail API

https://ejle.journals.ekb.eg/service?article_code=160437

Order

1

Type

Original Article

Type Code

1,039

Publication Type

Journal

Publication Title

The Egyptian Journal of Language Engineering

Publication Link

https://ejle.journals.ekb.eg/

Details

Type

Article

Created At

22 Jan 2023