Arabic Text to Image Generation based on Generative Network of Fine-Grained Visual Descriptions

Article

Last updated: 28 Dec 2024

Abstract

Converting natural language text descriptions into images is a challenging problem in computer vision with many practical applications. Text-to-image generation resembles language translation: just as similar semantics can be encoded in two different languages, images and text are two different languages that encode related information. Nonetheless, the two problems differ substantially, because text-to-image and image-to-text conversions are highly multimodal. In this paper, we propose a model for Arabic text descriptions that performs multi-stage, attention-driven refinement for fine-grained Arabic text-to-image generation. Using an attentional generative network, the model synthesizes fine-grained details in different sub-regions of the image by paying attention to the relevant words in the natural-language Arabic description. We train the model from scratch on the Modified-Arabic dataset. A key term in our network is a word-level, fine-grained image-text matching loss computed by the Deep Attentional Multimodal Similarity Model (DAMSM). The DAMSM learns two neural networks that map sub-regions of the image and Arabic words of the sentence to a common semantic space. Our model achieves strong performance with its Arabic-text encoder and image encoder, and it describes images easily and accurately on the Caltech-UCSD Birds 200-2011 dataset.
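
The word-level matching idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the DAMSM formulas, so everything below is an assumption made for illustration: the function names, the smoothing parameters `gamma_attn` and `gamma_agg`, and the choice of cosine similarity with softmax attention are hypothetical and are not taken from the paper. The sketch assumes that word vectors and image-region vectors have already been mapped into a common semantic space of the same dimension.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)  # small epsilon avoids division by zero


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matching_score(word_vecs, region_vecs, gamma_attn=5.0, gamma_agg=5.0):
    """Hypothetical word-level image-text matching score (illustrative only).

    word_vecs:   one vector per word, already in the common space (assumed).
    region_vecs: one vector per image sub-region, same dimension (assumed).
    """
    relevances = []
    for w in word_vecs:
        # Each word attends over the image sub-regions.
        weights = softmax([gamma_attn * cosine(w, r) for r in region_vecs])
        # Region-context vector: attention-weighted sum of region vectors.
        context = [
            sum(a * r[d] for a, r in zip(weights, region_vecs))
            for d in range(len(w))
        ]
        # How well the word agrees with the regions it attends to.
        relevances.append(cosine(w, context))
    # Smooth maximum (log-sum-exp) over the per-word relevances.
    return math.log(sum(math.exp(gamma_agg * r) for r in relevances)) / gamma_agg


# Toy example with made-up 2-D vectors.
words = [[1.0, 0.0], [0.0, 1.0]]
matching_regions = [[1.0, 0.0], [0.0, 1.0]]
mismatched_regions = [[-1.0, 0.0], [0.0, -1.0]]
print(matching_score(words, matching_regions) > matching_score(words, mismatched_regions))
```

In a training setting, a score of this kind would typically be turned into a loss that rewards matching image-sentence pairs over mismatched ones. That step is omitted here because the abstract does not describe it.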

DOI

10.21608/bjas.2020.226901

Keywords

Machine Learning, Deep learning, Generative Adversarial Networks, Recurrent Neural Network, natural language processing, text analysis, Image Matching

Authors

First Name

S.M.

Last Name

Salem

Affiliation

Mathematics Dept., Faculty of Science, Benha University, Benha, Egypt

First Name

M.L.

Last Name

Ramadan

Affiliation

Computer Science Dept., Faculty of Computers and Artificial Intelligence, Benha University, Benha, Egypt

Volume

5

Article Issue

Issue 7 part (1) - (2)

Related Issue

20337

Issue Date

2020-10-01

Receive Date

2020-10-24

Publish Date

2020-10-01

Page Start

167

Page End

173

Print ISSN

2356-9751

Online ISSN

2356-976X

Link

https://bjas.journals.ekb.eg/article_226901.html

Detail API

https://bjas.journals.ekb.eg/service?article_code=226901

Order

28

Type

Original Research Papers

Type Code

1,647

Publication Type

Journal

Publication Title

Benha Journal of Applied Sciences

Publication Link

https://bjas.journals.ekb.eg/

Details

Type

Article

Created At

23 Jan 2023