UCSY's Research Repository

Web Content Classification using Content Features and Ant Colony Optimization Algorithm


dc.contributor.author Aye, Nilar
dc.contributor.author San, Pan Ei
dc.date.accessioned 2019-07-03T08:26:26Z
dc.date.available 2019-07-03T08:26:26Z
dc.date.issued 2016-02-25
dc.identifier.uri http://onlineresource.ucsy.edu.mm/handle/123456789/343
dc.description.abstract The web content classification system separates noise from main content in HTML web pages. The system proposes a Content Extraction algorithm that uses content features to remove boilerplate and extract the main content from a web page. Observation of HTML tags shows that a single line may not contain a complete piece of information and that long texts are distributed across neighboring lines. The system therefore uses the Text-Block Concept to determine the distance between any two neighboring lines that contain text, and extracts features such as Text Density (TD), Anchor Link Density (ALD), and a new feature, Title Keywords Density (TKD), to distinguish noise from content. After extracting the features, the system uses the C4.8 decision tree method to classify each block as content or non-content. After extracting the main content, the system applies a new classification algorithm based on Ant Colony Optimization (ACO), which can solve discrete problems and suits the discrete nature of text document features. Texts are classified by the crawling of a population of class ants, each carrying class information, which search for an optimal matching path over the iterations of the algorithm. Finally, the system is of growing interest because the classifier improves its performance with experience. en_US
dc.language.iso en en_US
dc.publisher Fourteenth International Conference On Computer Applications (ICCA 2016) en_US
dc.title Web Content Classification using Content Features and Ant Colony Optimization Algorithm en_US
dc.type Article en_US
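
The abstract names three block-level features (TD, ALD, TKD) but does not define them. The Python sketch below shows one plausible way such features could be computed for a single text block. The formulas, the function name `block_features`, and its parameters are illustrative assumptions, not the authors' definitions: TD is taken as words per line, ALD as the share of words that sit inside anchor tags, and TKD as the share of words that also appear in the page title.

```python
import re


def block_features(text, anchor_text, num_lines, title):
    """Compute illustrative TD, ALD and TKD values for one text block.

    All three formulas are assumptions made for this sketch; the
    abstract does not state how the paper defines them.
    """
    words = re.findall(r"\w+", text.lower())
    anchor_words = re.findall(r"\w+", anchor_text.lower())
    title_words = set(re.findall(r"\w+", title.lower()))
    n = len(words)

    # Text Density (assumed): words per line of the block.
    td = n / num_lines if num_lines else 0.0
    # Anchor Link Density (assumed): fraction of words inside <a> tags.
    ald = len(anchor_words) / n if n else 0.0
    # Title Keywords Density (assumed): fraction of words found in the title.
    tkd = sum(w in title_words for w in words) / n if n else 0.0

    return {"TD": td, "ALD": ald, "TKD": tkd}


# Hypothetical blocks: one article-like sentence and one navigation menu.
content = block_features(
    "Ant colony optimization classifies web text", "", 1,
    "Ant Colony Optimization",
)
menu = block_features(
    "Home About Contact", "Home About Contact", 3,
    "Ant Colony Optimization",
)
```

Under these assumed definitions, a navigation menu tends toward high ALD and low TKD, while an article block shows the opposite pattern. A decision tree could split on that kind of contrast.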

