Deep Learning and Fourier Transform for Speaker Recognition (DLFSR)

Article

Last updated: 15 Feb 2025

Subjects

-

Tags

Electrical Engineering

Abstract

Automatic speaker recognition (ASR) and verification have gained visibility and significance as speech technology has spread through society. Deep learning techniques, specifically deep neural networks (DNNs), have revolutionized speaker recognition. Models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) can learn discriminative features directly from unprocessed speech signals, without manual feature extraction. End-to-end speaker recognition models are increasingly widely used because they perform well and map speech waveforms directly to speaker identities. Such systems recognize and authenticate people based on their distinct vocal traits. Applications of automatic speaker recognition include voice-based authentication on digital devices, forensic analysis of audio recordings, access control, and caller identification in phone-based customer support. In this study, we introduce a Deep Learning and Fourier Transform for Speaker Recognition (DLFSR) model based on the Short-Time Fourier Transform (STFT): the input speech is transformed into a spectrogram, and a convolutional neural network (CNN) is then applied to the spectrogram images to extract features and classify the speaker. Training and validation are performed on the speaker recognition dataset 16000pcm. The model achieves 98.8% correct identification and classification.
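The spectrogram step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length (`n_fft=512`), hop size (`hop=256`), Hann window, and `log1p` compression are assumptions chosen for illustration, the function name `stft_spectrogram` is hypothetical, and the 16 kHz sampling rate is inferred from the dataset name "16000pcm". A synthetic 1 kHz tone stands in for real speech, and the CNN classifier is not shown.

```python
import numpy as np


def stft_spectrogram(signal, n_fft=512, hop=256):
    """Return a log-magnitude spectrogram of shape (n_fft // 2 + 1, n_frames).

    n_fft and hop are illustrative defaults, not values taken from the article.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if signal.size < n_fft:
        raise ValueError("signal is shorter than one analysis frame")

    # A Hann window tapers each frame to reduce spectral leakage.
    window = np.hanning(n_fft)

    # Number of complete frames that fit in the signal (no padding).
    n_frames = 1 + (signal.size - n_fft) // hop

    # Slice the signal into overlapping frames: shape (n_frames, n_fft).
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] for i in range(n_frames)]
    )

    # Real-input FFT of each windowed frame: shape (n_frames, n_fft // 2 + 1).
    spectrum = np.fft.rfft(frames * window, axis=1)

    # Transpose to (frequency, time) and compress the dynamic range.
    return np.log1p(np.abs(spectrum).T)


# Synthetic one-second 1 kHz tone at an assumed 16 kHz sampling rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)

spec = stft_spectrogram(tone)
```

The resulting 2-D array can be treated as a single-channel image, which is the form of input a CNN classifier of the kind described in the abstract would consume.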

DOI

10.21608/fuje.2024.313518.1090

Keywords

ASR, STFT, CNN, RNN, DLFSR, PCM dataset

Authors

First Name

Taqwa

Last Name

Sayed

Middle Name

Mahmoud

Affiliation

Tamiyyah, Fayoum, Egypt

Email

taqwamahmoud92@gmail.com

City

-

Orcid

-

First Name

Amr

Last Name

Gody

Middle Name

-

Affiliation

Kyman Faryes, Faculty of Engineering

Email

amg00@fayoum.edu.eg

City

Fayoum

Orcid

0000-0003-2079-9860

First Name

Sayed

Last Name

Muhammad

Middle Name

T.

Affiliation

Computers and Systems Engineering Department, Faculty of Engineering, Fayoum University, Fayoum, Egypt

Email

stm11@fayoum.edu.eg

City

Fayoum

Orcid

-

Volume

8

Article Issue

1

Related Issue

53725

Issue Date

2025-01-01

Receive Date

2024-08-18

Publish Date

2025-01-01

Page Start

143

Page End

151

Print ISSN

2537-0626

Online ISSN

2537-0634

Link

https://fuje.journals.ekb.eg/article_411083.html

Detail API

http://journals.ekb.eg?_action=service&article_code=411083

Order

411083

Type

Original Article

Type Code

651

Publication Type

Journal

Publication Title

Fayoum University Journal of Engineering

Publication Link

https://fuje.journals.ekb.eg/

MainTitle

Deep Learning and Fourier Transform for Speaker Recognition (DLFSR)

Details

Type

Article

Created At

15 Feb 2025