Using Mel-Mapped Best Tree Encoding for Baseline-Context-Independent-Mono-Phone Automatic Speech Recognition

Article

Last updated: 24 Dec 2024

Abstract

Best-Tree Encoding (BTE) was first introduced by Amr M. Gody [1] as a new feature set for the Automatic Speech Recognition (ASR) problem. BTE essentially acts as a spectrum analyzer: it relies on wavelet packets to project the signal power onto predefined filter banks. The feature components are encoded into digital form using a particular entropy method and a particular digital encoding procedure. In this research, BTE is developed further by adding two key factors to the BTE process: the Mel scale (MS) and baseband bandwidth mapping (BM). The research provides a baseline performance evaluation for context-independent mono-phone recognition of English (without grammar) using the Vid-TIMIT database. Vid-TIMIT comprises recordings of 43 speakers (19 female and 24 male) reciting short sentences. The database was recorded in a noisy environment (mostly computer fan noise) and is not hand-verified. A total of 15643 phone segments are used for testing and evaluating the newly proposed features. A hidden Markov model (HMM) serves as the recognition engine, run via the HTK toolkit, which was chosen for its popularity in ASR. The results are evaluated by comparison with MFCC on the same database. Although the proposed model gives the same recognition efficiency as MFCC on the same test database, its feature vector requires almost 66% less storage than the MFCC feature vector.

DOI

10.21608/ejle.2015.60254

Keywords

Automatic Speech Recognition (ASR), Arabic Phone Recognition, Wavelet Packets, Mel-Scale, WPBTE, MFCC, HTK, BTE

Authors

First Name

Amr

Last Name

Gody

MiddleName

-

Affiliation

Electronics and Communications Engineering Department, Faculty of Engineering, Fayoum University, Egypt

Email

amg00@fayoum.edu.eg

City

Fayoum, Egypt

Orcid

0000-0003-2079-9860

First Name

Rania

Last Name

Abul Seoud

MiddleName

-

Affiliation

Electronics and Communications Engineering Department, Faculty of Engineering, Fayoum University, Egypt

Email

raa00@fayoum.edu.eg

City

Cairo, Egypt

Orcid

-

First Name

Mai

Last Name

Ezz El-Din

MiddleName

-

Affiliation

Electronics and Communications Engineering Department, Faculty of Engineering, Fayoum University, Egypt

Email

mai.ezzeldin.89@gmail.com

City

Fayoum, Egypt

Orcid

-

Volume

2

Article Issue

1

Related Issue

9158

Issue Date

2015-04-01

Receive Date

2014-11-22

Publish Date

2015-04-23

Page Start

10

Page End

24

Print ISSN

2356-8208

Online ISSN

2356-8216

Link

https://ejle.journals.ekb.eg/article_60254.html

Detail API

https://ejle.journals.ekb.eg/service?article_code=60254

Order

2

Type

Original Article

Type Code

1,039

Publication Type

Journal

Publication Title

The Egyptian Journal of Language Engineering

Publication Link

https://ejle.journals.ekb.eg/

Details

Type

Article

Created At

22 Jan 2023