Abstract:
Reducing the size of a feature set, without
altering the original representation, is an
essential data processing step prior to applying
a learning algorithm. The removal of irrelevant
and redundant information improves the
performance of machine learning algorithms. In
this paper, Modified-Multiple Correspondence
Analysis (Modified-MCA) is introduced. It
integrates the correlation and reliability
information between each feature and each
class. Moreover, the proposed method
determines the optimal p-value to improve the
reliability. To evaluate the performance of
the proposed method, experiments are carried out on
ten benchmark datasets. In the experiments,
three classifiers, namely AdaBoost, Decision
Table, and JRip, are used to verify that the
reduced feature set produced by the proposed
method yields better classification performance.
Three different classifiers are used so that the
averaged classification results are more reliable
than those of a single classifier. The proposed
Modified-MCA reduces the size of
the feature subspace while yielding promising
classification results. Moreover, the results
show that the proposed method outperforms other
well-known feature selection methods: MCA,
Information Gain, and Relief.