314563

On the Performance of Some Methods for Handling Missing Values in Classification Analysis. By: D. Adel M. Zaher and D. Ahmed H. Haroun and Mahmoud A. Mahmoud

Article

Last updated: 28 Dec 2024

Abstract

Classification analysis (CA) is concerned with assigning a subject to one of several distinct groups on the basis of a set of measurements. For example, in business, a bank loan officer may wish to classify loan applicants as low-risk or high-risk credit customers on the basis of a set of variables. Missing values in the data set used to build a classification rule are a serious problem that an investigator may face when applying CA in practice. Many procedures have been developed to handle missing values in CA. The default method in many statistical packages (for example, SAS, Minitab and SPSS) is to omit all units containing missing values, so considerable information may be lost through the reduction in sample size. Many studies have dealt with this problem for two multivariate populations with equal covariance matrices, while only a few have treated the case of two multivariate populations with unequal covariance matrices. The present study uses simulation to examine classification analysis with missing values for two or three multivariate normal populations with equal and unequal covariance matrices. Three classification rules and five methods of handling missing values are considered. The objective is to compare the methods of handling missing values with respect to their ability to yield a "good" classification rule. Two patterns of missing values are considered, and the mechanism that leads to the missing values is assumed to be missing at random (MAR). Seven factors are taken into consideration, and the impact of each factor on the methods of handling missing values is studied. A Minitab macro was designed to run the necessary calculations.
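
The information loss from omitting every unit that contains a missing value can be illustrated with a minimal Python sketch. It is not taken from the article, whose calculations were run with a Minitab macro. The function names, the sample size, the number of variables and the 10% missingness rate are all hypothetical, and values are deleted independently of one another, which is a simplifying special case of the MAR assumption.

```python
import random


def listwise_delete(rows):
    """Keep only the units (rows) that have no missing (None) entries."""
    return [r for r in rows if all(v is not None for v in r)]


def expected_complete_fraction(p_missing, n_vars):
    """Expected share of complete units when each value is missing
    independently with probability p_missing (illustrative assumption)."""
    return (1.0 - p_missing) ** n_vars


def simulate(n_rows=1000, n_vars=5, p_missing=0.10, seed=0):
    """Draw standard normal measurements and blank some out at random."""
    rng = random.Random(seed)
    return [
        [None if rng.random() < p_missing else rng.gauss(0.0, 1.0)
         for _ in range(n_vars)]
        for _ in range(n_rows)
    ]


rows = simulate()
kept = listwise_delete(rows)
print("units before deletion:", len(rows))
print("units after deletion: ", len(kept))
print("expected share kept:  ", expected_complete_fraction(0.10, 5))
```

Under these assumptions, with five variables and 10% of values missing on each, only about 59% of units are expected to remain complete, which is the kind of sample-size reduction the abstract refers to.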

DOI

10.21608/esju.1998.314563

Keywords

Classification Analysis, Classification Rule, Covariance Matrices, Handling Missing Values, Missing at Random, Multivariate Populations

Volume

42

Article Issue

1

Related Issue

43141

Issue Date

1998-06-01

Publish Date

1998-06-01

Page Start

32

Page End

54

Print ISSN

0542-1748

Online ISSN

2786-0086

Link

https://esju.journals.ekb.eg/article_314563.html

Detail API

https://esju.journals.ekb.eg/service?article_code=314563

Order

4

Type

Original Article

Type Code

1,914

Publication Type

Journal

Publication Title

The Egyptian Statistical Journal

Publication Link

https://esju.journals.ekb.eg/

Details

Type

Article

Created At

28 Dec 2024