128966

Pre-processing Steps for Genome-wide High-density NARAC Dataset Facilitates its Haplotype Block Partitioning

Article

Last updated: 04 Jan 2025

Subjects

-

Tags

Medical Engineering.

Abstract

Pre-processing is a crucial step in preparing any dataset for in-depth analysis. Genome-wide data is considered big data, and handling it remains a significant challenge. Genome-wide association studies (GWAS) rely on enormous, high-density, high-throughput data. This paper illustrates the main pre-processing steps applied to data from the North American Rheumatoid Arthritis Consortium (NARAC) to prepare it for haplotype block partitioning using different methods and platforms. Its main objective is to summarize the steps for pre-processing the raw genotyped dataset ahead of haplotype block partitioning and further analyses. We present each practical step in clear tables to aid visualization, elucidation, and interpretation of the workflow. We also aim to handle the missing data and normalize the output into a standardized format. Ultimately, this will improve the understanding of such data formats and lay the foundation for critical genome-wide experiments and studies. Thus, this work could serve as a guide for other researchers who use similar data. The pre-processed data will be applied to imputation, to BigLD block partitioning under R, and to Haploview. Our sequence of pre-processing steps begins with preparing the genotype characters in a form suitable for imputation. The next step is recoding the data in 0, 1, 2 format so that it is suitable for BigLD. Finally, we prepare the data for Haploview to provide clear haplotype block partitioning, association analysis, and further analyses.
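As an illustration of the 0, 1, 2 recoding step mentioned in the abstract, the following is a minimal Python sketch, not the authors' procedure. It assumes genotypes arrive as two-letter strings such as "AG", that a missing call is marked with the hypothetical token "??", and that the code counts copies of the minor allele; the function name `recode_additive` is likewise an assumption made for this example.

```python
from collections import Counter

MISSING = None  # placeholder kept for untyped genotypes (assumption)


def recode_additive(genotypes, missing_token="??"):
    """Recode two-letter genotypes (e.g. "AG") as minor-allele counts 0, 1, 2."""
    # Count alleles over all non-missing calls to identify the minor allele.
    counts = Counter()
    for g in genotypes:
        if g != missing_token:
            counts.update(g)
    if len(counts) < 2:
        # Monomorphic (or fully missing) SNP: observed calls carry no minor allele.
        return [MISSING if g == missing_token else 0 for g in genotypes]
    # Ties in allele frequency are broken alphabetically so the result is deterministic.
    minor = min(counts, key=lambda a: (counts[a], a))
    return [MISSING if g == missing_token else g.count(minor) for g in genotypes]


snp = ["AA", "AG", "GG", "??", "AA", "AG"]
coded = recode_additive(snp)
print(coded)
```

In this sketch the missing genotype is carried through as `None`, so that a later imputation step (or an earlier one, if imputation precedes recoding) can decide how to fill it.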

DOI

10.21608/jaet.2020.40032.1035

Keywords

haplotype blocks, single-nucleotide polymorphism, linkage disequilibrium, interval graph modeling of clusters, rheumatoid arthritis.

Authors

First Name

Fatma

Last Name

Ibrahim

MiddleName

Sayed

Affiliation

Biomedical Engineering Department, Faculty of Engineering, Minia University, Egypt

Email

fatmasayed93@mu.edu.eg

City

Minia

Orcid

https://orcid.org/00

First Name

Mohamed

Last Name

Saad

MiddleName

-

Affiliation

Biomedical Engineering Dept., Minia University, Minia, Egypt

Email

m.n.saad@ieee.org

City

-

Orcid

-

First Name

Ashraf

Last Name

Said

MiddleName

-

Affiliation

-

Email

ashraf.mahroos@mu.edu.eg

City

-

Orcid

0000-0001-7330-9234

First Name

Hesham

Last Name

Hamed

MiddleName

-

Affiliation

Electrical Engineering Dept., Faculty of Engineering, Minia University, Al-Minia, Egypt; Electrical Engineering Dept., Faculty of Engineering, Egyptian Russian University, Cairo, Egypt

Email

hesham.fathi@mu.edu.eg

City

-

Orcid

-

Volume

40

Article Issue

2

Related Issue

19169

Issue Date

2021-04-01

Receive Date

2020-08-21

Publish Date

2021-04-01

Page Start

61

Page End

69

Print ISSN

2682-2091

Online ISSN

2812-5487

Link

https://jaet.journals.ekb.eg/article_128966.html

Detail API

https://jaet.journals.ekb.eg/service?article_code=128966

Order

6

Publication Type

Journal

Publication Title

Journal of Advanced Engineering Trends

Publication Link

https://jaet.journals.ekb.eg/

MainTitle

Pre-processing Steps for Genome-wide High-density NARAC Dataset Facilitates its Haplotype Block Partitioning

Details

Type

Article

Created At

22 Jan 2023