Abstract:
Web page classification is the task of assigning Web
objects to a predefined semantic structure. Automatic
web page classification is one of the most essential
techniques for Web mining, because the Web is a huge
repository of varied information, including images
and videos, and web pages need to be categorized
to satisfy user needs. The
classification of web pages into categories
relies exclusively on manual labor, which costs
much time and effort. To alleviate this manual
classification problem, more researchers are focusing
on web page classification technology. In
this paper, we propose a Random Forest (RF)
classifier, based on the random forest method, for
multi-category web page classification. The proposed RF
classifier can classify web pages efficiently
according to their corresponding classes without using
additional feature selection methods. We compared the
accuracy of the proposed approach with that of a decision
tree classifier on the same Yahoo web pages. The
experiments have shown that the proposed approach
is suitable for the multi-category web page
classification.
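
As an illustration of the general random forest idea referred to above, the
following is a minimal sketch and not the implementation evaluated in this
paper. It assumes each web page is reduced to a set of words, grows each tree
on a bootstrap sample, and tests only a random subset of words at each split,
so no separate feature selection step is needed. The function names
(train_forest, predict), the parameter values, and the toy pages are all
hypothetical.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(docs, labels, vocab, rng, n_try, depth=0, max_depth=10):
    """Grow one tree; each split asks whether a word occurs in a page.

    A leaf is a class label (assumed to be a string); an internal node is
    a tuple (word, subtree_if_present, subtree_if_absent).
    """
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth >= max_depth:
        return majority
    best = None
    # Only a random subset of the vocabulary is tried at each split.
    for word in rng.sample(vocab, min(n_try, len(vocab))):
        yes = [i for i, d in enumerate(docs) if word in d]
        no = [i for i, d in enumerate(docs) if word not in d]
        if not yes or not no:
            continue  # this word does not separate the pages
        score = (len(yes) * gini([labels[i] for i in yes])
                 + len(no) * gini([labels[i] for i in no])) / len(docs)
        if best is None or score < best[0]:
            best = (score, word, yes, no)
    if best is None:
        return majority
    _, word, yes, no = best

    def grow(idx):
        return build_tree([docs[i] for i in idx], [labels[i] for i in idx],
                          vocab, rng, n_try, depth + 1, max_depth)

    return (word, grow(yes), grow(no))


def train_forest(docs, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the pages."""
    rng = random.Random(seed)
    vocab = sorted(set().union(*docs))
    n_try = max(1, int(math.sqrt(len(vocab))))  # common heuristic, assumed here
    forest = []
    for _ in range(n_trees):
        idx = rng.choices(range(len(docs)), k=len(docs))  # sample with replacement
        forest.append(build_tree([docs[i] for i in idx],
                                 [labels[i] for i in idx],
                                 vocab, rng, n_try))
    return forest


def predict(forest, doc):
    """Classify a page (a set of words) by majority vote over the trees."""
    votes = Counter()
    for node in forest:
        while isinstance(node, tuple):
            word, present, absent = node
            node = present if word in doc else absent
        votes[node] += 1
    return votes.most_common(1)[0][0]


# Hypothetical toy pages, for illustration only (not the Yahoo data).
docs = [{"match", "goal", "team"}, {"team", "league", "score"},
        {"stock", "market", "bank"}, {"bank", "loan", "rate"},
        {"hotel", "flight", "beach"}, {"flight", "tour", "visa"}]
labels = ["sports", "sports", "finance", "finance", "travel", "travel"]

forest = train_forest(docs, labels)
category = predict(forest, {"goal", "team"})
```

In this sketch, the multi-category case needs no special handling: each leaf
stores one of the category labels, and the forest returns the label that
receives the most votes.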