406689

Efficient Email Spam Detection Using Machine Learning Techniques: A Comparative Analysis of Classification Models

Article

Last updated: 01 Feb 2025

Subjects

-

Tags

-

Abstract

Spam emails pose a significant challenge to digital communication by compromising user privacy and security. This study investigates the performance of classical machine learning and modern deep learning models for email spam detection using a publicly available Kaggle dataset consisting of over 5,000 emails. Among the machine learning classifiers, the Support Vector Machine (SVM) performed best, achieving an accuracy of 99.0% and an F1-score of 0.97, which underscores its robustness and its ability to generalize across diverse data. Logistic Regression also produced competitive results, with an accuracy of 98.4%, and its interpretability enabled a detailed analysis of feature importance. Additionally, transformer-based deep learning models, including BERT, DistilBERT, RoBERTa, and XLNet, were evaluated. BERT achieved the highest accuracy among these models at 98.8%, with an F1-score of 0.97, showcasing its ability to capture contextual nuances in text. Precision, recall, and specificity were also reported to give a fuller comparison of model performance. To facilitate practical deployment, a user-friendly interface was developed for real-time email classification. These findings highlight the efficacy of both classical and modern approaches to spam detection, offering insights for advancing email security and enabling the development of scalable, real-time applications.
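
The abstract compares models on accuracy, precision, recall, specificity, and F1-score. As a minimal sketch of how those metrics follow from binary predictions, the Python snippet below computes them from confusion-matrix counts. The function name `binary_metrics` and the toy labels are hypothetical and chosen for illustration only; they are not taken from the study or its dataset.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary classification metrics.

    Hypothetical helper for illustration; labels equal to `positive`
    are treated as spam, everything else as legitimate mail (ham).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy labels (1 = spam, 0 = ham); illustrative only, not from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Specificity (the true-negative rate) is worth reporting alongside recall in spam filtering because a false positive means a legitimate email is hidden from the user.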

DOI

10.21608/ijicis.2024.321043.1355

Keywords

Email Spam, Machine Learning, Deep Learning, Text Classification, Spam Filter

Authors

First Name

Md Nurul

Last Name

Raihen

MiddleName

-

Affiliation

Department of Mathematics and Computer Science, Fontbonne University, Saint Louis, MO, USA

Email

nurul.raihen@gmail.com

City

Saint Louis, USA

Orcid

0000-0003-2680-0658

First Name

Shivani

Last Name

Rana

MiddleName

-

Affiliation

Department of Statistics, Western Michigan University, Kalamazoo, MI, USA

Email

shivani.38.rana@wmich.edu

City

Kalamazoo

Orcid

-

First Name

Sultana

Last Name

Akter

MiddleName

-

Affiliation

Institute for Data Science and Informatics, University of Missouri, Columbia, MO, USA

Email

sa4kf@umsystem.edu

City

-

Orcid

-

First Name

Md Abdul

Last Name

Kadir

MiddleName

-

Affiliation

Department of Mathematics, University of Houston, Texas, USA

Email

kadir.w9@gmail.com

City

-

Orcid

-

Volume

24

Article Issue

4

Related Issue

52576

Issue Date

2024-12-01

Receive Date

2024-09-15

Publish Date

2024-12-01

Page Start

1

Page End

15

Print ISSN

1687-109X

Online ISSN

2535-1710

Link

https://ijicis.journals.ekb.eg/article_406689.html

Detail API

http://journals.ekb.eg?_action=service&article_code=406689

Order

406,689

Type

Original Article

Type Code

494

Publication Type

Journal

Publication Title

International Journal of Intelligent Computing and Information Sciences

Publication Link

https://ijicis.journals.ekb.eg/

Details

Type

Article

Created At

01 Feb 2025