319958

A Comparative Study for Different Resampling Techniques for Imbalanced datasets

Article

Last updated: 24 Dec 2024

Subjects

-

Tags

-

Abstract

Imbalanced data is a significant challenge for researchers in supervised machine learning. Current data mining algorithms are not effective at processing imbalanced data: the imbalance reduces classification accuracy because predictions for the minority classes are inaccurate. The classification of imbalanced data has therefore received significant attention, and the use of sampling techniques to improve classification performance has been an important consideration in related work. In this paper, a comparative study of six sampling algorithms is performed. The algorithms are drawn from three families of sampling techniques: two oversampling algorithms, two undersampling algorithms, and two algorithms that combine oversampling and undersampling. The oversampling techniques are random oversampling and SMOTE, the undersampling techniques are random undersampling and Near Miss, and the combined techniques are SMOTE Tomek and SMOTEEN. The study examines the impact of these sampling algorithms on the performance of three classifiers: SVM, KNN, and logistic regression. Cross-validation experiments on 12 standard datasets show that the SMOTEEN sampling algorithm achieves significant improvements compared with the other algorithms.
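To make the resampling ideas named in the abstract concrete, the following is a minimal, standard-library sketch of random oversampling, random undersampling, and a SMOTE-style interpolation. It is not the authors' implementation: the function names, the toy data, and the neighbour count `k` are illustrative assumptions, and the cleaning steps used by Near Miss, Tomek links, and the edited-nearest-neighbour variant are omitted.

```python
import random
from collections import Counter


def rows_of(X, y, label):
    """Return the feature rows that carry the given class label."""
    return [x for x, lab in zip(X, y) if lab == label]


def random_oversample(X, y, seed=0):
    """Duplicate random rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = rows_of(X, y, label)
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        for x in rng.sample(rows_of(X, y, label), target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=2, seed=0):
    """SMOTE-style sketch: add points interpolated between a minority row
    and one of its k nearest minority neighbours (squared Euclidean distance)."""
    rng = random.Random(seed)
    counts = Counter(y)
    pool = rows_of(X, y, minority)
    needed = max(counts.values()) - counts[minority]
    X_out, y_out = list(X), list(y)
    for _ in range(needed):
        i = rng.randrange(len(pool))
        base = pool[i]
        others = pool[:i] + pool[i + 1:]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
        y_out.append(minority)
    return X_out, y_out


# Toy data (hypothetical): six majority rows (label 0), three minority rows (label 1).
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4), (0.4, 0.2),
     (2.0, 2.0), (2.2, 2.1), (2.1, 2.3)]
y = [0] * 6 + [1] * 3

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
X_smote, y_smote = smote_like(X, y, minority=1)

print(Counter(y_over), Counter(y_under), Counter(y_smote))
```

In a cross-validation comparison of the kind the abstract describes, a resampler like these would normally be applied to the training folds only, so that the held-out fold keeps its original class distribution.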

DOI

10.21608/ijci.2023.236287.1136

Keywords

Imbalanced data, resampling techniques, SMOTE, SMOTEEN, SMOTE Tomek

Authors

First Name

Alaa

Last Name

Elsobky

MiddleName

Mahmoud

Affiliation

Menoufia

Email

alaa.elsobky90@gmail.com

City

Shebeen

Orcid

-

First Name

Arabi

Last Name

Keshk

MiddleName

Elsaid

Affiliation

Menoufia

Email

arabi77staff@ci.menofia.edu.eg

City

Shebeen

Orcid

-

First Name

Mohamed

Last Name

Malhat

MiddleName

Gaber

Affiliation

Menoufia

Email

mohamed.gaber@ci.menofia.edu.eg

City

Marinah

Orcid

-

Volume

10

Article Issue

3

Related Issue

43466

Issue Date

2023-11-01

Receive Date

2023-10-04

Publish Date

2023-11-01

Page Start

147

Page End

156

Print ISSN

1687-7853

Online ISSN

2735-3257

Link

https://ijci.journals.ekb.eg/article_319958.html

Detail API

https://ijci.journals.ekb.eg/service?article_code=319958

Order

21

Type

Original Article

Type Code

877

Publication Type

Journal

Publication Title

IJCI. International Journal of Computers and Information

Publication Link

https://ijci.journals.ekb.eg/

MainTitle

A Comparative Study for Different Resampling Techniques for Imbalanced datasets

Details

Type

Article

Created At

24 Dec 2024