UCSY's Research Repository

Automatic Extraction of Data Record from Web Page based on Visual Features


dc.contributor.author Hlaing, Nwe Nwe
dc.contributor.author Nyunt, Thi Thi Soe
dc.date.accessioned 2019-07-03T03:32:00Z
dc.date.available 2019-07-03T03:32:00Z
dc.date.issued 2011-05-05
dc.identifier.uri http://onlineresource.ucsy.edu.mm/handle/123456789/149
dc.description.abstract The Web is increasingly becoming a very large information source. However, the information is visually structured such that it is easy for humans, but not for computers, to recognize data records and presentation patterns. As web sites become more complicated, constructing a web information extraction system becomes more troublesome and time-consuming. Hence, tools for mining data regions, data records and data items need to be developed in order to provide value-added services. A large number of techniques have been proposed to address this problem, but all of them have inherent limitations. In this paper, we propose a method for automatic data record extraction from web pages, which we call Vision based Extraction of data Record (VER). The approach is based on the observation that data records in a web document are visually similar to one another. First, we adopt the VIPS (Vision-based Page Segmentation) algorithm to partition a web page into semantic blocks. Then, the blocks are clustered by the proposed block clustering method according to their appearance similarity. Among these clusters, we identify the data region and finally extract the data records from it. en_US
dc.language.iso en en_US
dc.publisher Ninth International Conference On Computer Applications (ICCA 2011) en_US
dc.title Automatic Extraction of Data Record from Web Page based on Visual Features en_US
dc.type Article en_US
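
The pipeline outlined in the abstract (segment the page into blocks, cluster the blocks by appearance, pick a data region, extract its records) can be illustrated with a minimal sketch. This is not the authors' VER implementation: the `Block` fields, the similarity measure, the greedy clustering, the 0.8 threshold and the "largest cluster is the data region" rule are all assumptions made for illustration, and the blocks are assumed to have already been produced by a page segmenter such as VIPS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A visual block from a page segmenter (hypothetical fields)."""
    text: str
    width: int
    height: int
    font_size: int
    bg_color: str


def appearance_similarity(a: Block, b: Block) -> float:
    """Illustrative similarity in [0, 1]; not the measure used in the paper."""
    def ratio(x: int, y: int) -> float:
        return min(x, y) / max(x, y) if max(x, y) > 0 else 1.0

    scores = [
        ratio(a.width, b.width),
        ratio(a.height, b.height),
        1.0 if a.font_size == b.font_size else 0.0,
        1.0 if a.bg_color == b.bg_color else 0.0,
    ]
    return sum(scores) / len(scores)


def cluster_blocks(blocks, threshold=0.8):
    """Greedy clustering: a block joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for block in blocks:
        for cluster in clusters:
            if appearance_similarity(cluster[0], block) >= threshold:
                cluster.append(block)
                break
        else:
            clusters.append([block])
    return clusters


def extract_records(blocks, threshold=0.8):
    """Assumed heuristic: the largest cluster of look-alike blocks is the
    data region, and each of its blocks is one data record."""
    clusters = cluster_blocks(blocks, threshold)
    region = max(clusters, key=len, default=[])
    return region if len(region) >= 2 else []
```

Given a header block, a footer block and several product-listing blocks of near-identical size and styling, `extract_records` would return only the listing blocks, since they form the largest cluster of visually similar blocks.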

