Subjects
-
Abstract
Text classification is an important NLP task with applications ranging from movie review classification to market analysis. NLP makes it possible to process large amounts of text and draw conclusions from it. In this paper we investigate statistical machine learning for document classification in NLP. The target problem is sentiment analysis: we explore various techniques for text pre-processing, feature selection, and model selection to find a well-fitting model. This paper acts as both a system proposal and a primer for those who want to start practicing NLP; we try to provide insight and intuition about modelling choices for text classification that extend beyond the scope of this task to general NLP. We propose a feature-based approach to text sentiment analysis that relies heavily on the BoN (Bag of N-grams) model and uses these features with a statistical ML classifier. We use the IMDB movie review dataset (Maas et al. 2011) for benchmarking.
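As a rough illustration of the Bag of N-grams idea named in the abstract, the sketch below counts every word n-gram up to a maximum length in a piece of text. It is only a minimal example under stated assumptions: the function name `bag_of_ngrams`, the whitespace tokenizer, the lowercasing step, and the default `n_max=2` are hypothetical choices made here, not the pre-processing or feature selection used in the paper.

```python
from collections import Counter


def bag_of_ngrams(text, n_max=2):
    """Count all word n-grams of length 1..n_max in `text`.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; the paper's actual pre-processing may differ.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        # Slide a window of n tokens across the token sequence.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical review snippet; the bigrams "not good" and "not great"
# become features in their own right, alongside the unigram "not".
features = bag_of_ngrams("not good not great")
```

The resulting counts could then be turned into a feature vector and passed to a statistical classifier; which classifier and which n-gram range work best is the kind of modelling choice the paper investigates.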
DOI
10.21608/fuje.2022.124088.1013
Keywords
Machine Learning, Sentiment Analysis, Natural Language Processing, IMDB Sentiment Analysis, Text Classification
Authors
Affiliation
Teaching Assistant, Electrical Engineering Department, Faculty of Engineering, Fayoum University, Fayoum 63514, Egypt
Orcid
-
Affiliation
Professor of Electrical Engineering - Faculty of Engineering - Fayoum University - Fayoum 63514, Egypt
City
-
Orcid
-
Link
https://fuje.journals.ekb.eg/article_239705.html
Detail API
https://fuje.journals.ekb.eg/service?article_code=239705
Publication Title
Fayoum University Journal of Engineering
Publication Link
https://fuje.journals.ekb.eg/
MainTitle
-