Abstract:
Several recent machine learning publications
demonstrate the utility of feature selection
algorithms in many learning tasks. Feature selection
gives a better understanding of the data by showing
which features are important and how they relate to
each other, and it can be applied to both supervised
and unsupervised learning. This paper aims
to find the best subset of features that not only
maximizes classification accuracy but also minimizes
the number of features. A further aim is to raise
awareness of the necessity and benefits of applying
feature selection methods. In this paper, a genetic
algorithm, which is a wrapper feature selection
method, is used to remove irrelevant attributes from
the data.
An embedded feature selection method (C4.5) is then
used to prune the features selected by the genetic
algorithm, which suffers from overfitting. By
combining the genetic algorithm with the decision
tree, this method enhances Bayesian classification by
eliminating unnecessary features and produces an
accurate classifier.
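
As a rough illustration of the wrapper stage only, the following sketch shows how a genetic algorithm might search over feature subsets while trading accuracy against subset size. It is a minimal sketch under stated assumptions, not the paper's implementation: the names `fitness`, `ga_select`, and `toy_accuracy`, the `penalty` weight, and all parameter values are hypothetical. In practice `accuracy_fn` would presumably return the cross-validated accuracy of the Bayesian classifier on the selected features; the C4.5 pruning stage is not shown.

```python
import random


def fitness(mask, accuracy_fn, penalty=0.01):
    """Reward accuracy and penalise subset size; an empty subset scores zero."""
    if not any(mask):
        return 0.0
    return accuracy_fn(mask) - penalty * sum(mask)


def ga_select(n_features, accuracy_fn, pop_size=20, generations=30,
              mutation_rate=0.1, penalty=0.01, seed=0):
    """Search for a good 0/1 feature mask with a simple elitist GA."""
    rng = random.Random(seed)

    def score(mask):
        return fitness(mask, accuracy_fn, penalty)

    # Start from the full feature set plus random subsets.
    population = [tuple([1] * n_features)]
    population += [tuple(rng.randint(0, 1) for _ in range(n_features))
                   for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry over the two best masks
        while len(next_gen) < pop_size:
            # Truncation selection: parents come from the better half.
            p1, p2 = rng.sample(population[:pop_size // 2], 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = tuple(b ^ 1 if rng.random() < mutation_rate else b
                          for b in child)
            next_gen.append(child)
        population = next_gen

    return max(population, key=score)


def toy_accuracy(mask):
    # Hypothetical stand-in for a real classifier evaluation:
    # features 0 and 2 are informative, the rest are noise.
    return 0.5 + 0.2 * mask[0] + 0.2 * mask[2]


best = ga_select(6, toy_accuracy)
print(best, fitness(best, toy_accuracy))
```

Because the two best masks are carried over unchanged in every generation, the returned subset never scores worse than the full feature set that seeds the initial population. The size penalty is one simple way to encode the second objective; a larger `penalty` favours smaller subsets at the cost of accuracy.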