Abstract:
A novel metric that integrates the correlation
and reliability information between each feature
and each class obtained from Multiple
Correspondence Analysis (MCA) is currently a
popular solution for scoring features in feature
selection. However, it has the disadvantage that
the p-value, which examines the reliability, is
based on a conventional confidence interval. The main goal
of this paper is to introduce a new classifier-independent
(filter-based) feature selection
method, Modified Multiple Correspondence
Analysis (Modified-MCA), which is designed to
modify MCA to improve the reliability. The
efficiency and effectiveness of the proposed method
are demonstrated through extensive comparisons
with MCA and other feature selection methods,
using five benchmark datasets provided by
WEKA and the UCI repository. Naïve Bayes,
Decision Tree, and JRip are used as the
classifiers. The classification results, in terms of
classification accuracy and size of feature
subspace, show that the proposed Modified-MCA
outperforms three other feature selection
methods: MCA, Information Gain, and Relief.