Cybercrime and Authorship Detection in Very Short Texts: A Quantitative Morpho-lexical Approach

Article

Last updated: 03 Jan 2025

Abstract

 
The present study proposes an integrated framework that combines letter-pair frequencies and combinations with the lexical features of documents. Drawing on a quantitative morpho-lexical approach, it tests the hypothesis that letter-level information carries distinctive stylistic features, so that detecting stable word combinations and morphological patterns can improve authorship detection performance for very short texts. The data analyzed are a corpus of 12,240 tweets drawn from 87 Twitter accounts. A self-organizing map (SOM) model is used to group input patterns that share common features, on the assumption that tweets assigned to the same class were written by the same author. Results indicate that classification accuracy with the proposed system is around 76%. Up to 22% of this accuracy was lost, however, when only distinctive words were used, and 26% was lost when classification was based on letter combinations and morphological patterns alone. Integrating letter pairs and morphological patterns improved the accuracy of determining the author of a given tweet. This indicates that combining different linguistic variables in a single system leads to better classification of very short texts. The SOM also yielded better clustering performance because it can integrate two different linguistic levels of each author profile.
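The abstract does not give the exact feature set, normalization, or SOM configuration, so the following is only a minimal sketch of the general idea under stated assumptions: each tweet becomes one vector that concatenates relative letter-pair frequencies with relative word frequencies, and a small self-organizing map then places similar vectors on nearby grid units. The function names (`letter_pairs`, `tokens`, `vectorize`, `train_som`, `best_unit`), the 2x2 grid, the decay schedules, and the toy tweets are all hypothetical and are not taken from the study.

```python
import math
import random
from collections import Counter

def letter_pairs(text):
    """Adjacent letter pairs inside each word, lowercased (assumed feature)."""
    pairs = []
    for word in text.lower().split():
        letters = [c for c in word if c.isalpha()]
        pairs.extend(a + b for a, b in zip(letters, letters[1:]))
    return pairs

def tokens(text):
    """Lowercased words with non-letters stripped (assumed lexical feature)."""
    cleaned = ("".join(c for c in w if c.isalpha()) for w in text.lower().split())
    return [w for w in cleaned if w]

def build_vocab(texts, extractor):
    """Sorted list of every feature the extractor yields over the corpus."""
    return sorted({f for t in texts for f in extractor(t)})

def vectorize(text, pair_vocab, word_vocab):
    """Concatenate relative letter-pair and word frequencies into one vector."""
    vec = []
    for extractor, vocab in ((letter_pairs, pair_vocab), (tokens, word_vocab)):
        counts = Counter(extractor(text))
        total = sum(counts.values()) or 1  # avoid division by zero
        vec.extend(counts[f] / total for f in vocab)
    return vec

def best_unit(grid, x):
    """Grid coordinate whose weight vector is closest to x (squared Euclidean)."""
    return min(grid, key=lambda n: sum((a - b) ** 2 for a, b in zip(grid[n], x)))

def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a tiny SOM with linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    dim = len(data[0])
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 0.01)
        for x in data:
            bmu = best_unit(grid, x)
            for node, w in grid.items():
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return grid

# Toy, made-up tweets purely for illustration (not from the study's corpus).
tweets = ["gr8 day at the beach", "gr8 nite at the club",
          "The committee shall convene", "The council shall convene"]
pair_vocab = build_vocab(tweets, letter_pairs)
word_vocab = build_vocab(tweets, tokens)
vectors = [vectorize(t, pair_vocab, word_vocab) for t in tweets]
som = train_som(vectors)
units = [best_unit(som, v) for v in vectors]  # one grid unit per tweet
```

Tweets that land on the same unit in `units` are the ones this sketch treats as stylistically similar; whether that corresponds to shared authorship depends on the features and data, which is the question the study evaluates.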

DOI

10.21608/jssa.2019.38725

Authors

First Name

Abdulfattah

Last Name

Omar

Affiliation

Department of English, College of Science & Humanities, Prince Sattam Bin Abdulaziz University, Al-Kharj, Riyadh, 11942, Kingdom of Saudi Arabia.

Volume

20

Article Issue

Issue 20, Part 1

Related Issue

6207

Issue Date

2019-07-01

Receive Date

2019-07-03

Publish Date

2019-07-01

Page Start

1

Page End

25

Print ISSN

2356-8321

Online ISSN

2356-833X

Link

https://jssa.journals.ekb.eg/article_38725.html

Detail API

https://jssa.journals.ekb.eg/service?article_code=38725

Order

10

Type

Original Article

Type Code

653

Publication Type

Journal

Publication Title

مجلة البحث العلمي في الآداب (Journal of Scientific Research in Arts)

Publication Link

https://jssa.journals.ekb.eg/

MainTitle

Cybercrime and Authorship Detection in Very Short Texts: A Quantitative Morpho-lexical Approach

Details

Type

Article

Created At

22 Jan 2023