Noise Elimination from Web Page in Web Content Mining

dc.contributor.author	Linn, Khaing Wah Wah
dc.contributor.author	Phyu, Sabai
dc.date.accessioned	2019-10-25T11:39:27Z
dc.date.available	2019-10-25T11:39:27Z
dc.date.issued	2015-02-05
dc.identifier.uri	http://onlineresource.ucsy.edu.mm/handle/123456789/2352
dc.description.abstract	Nowadays, the useful information on a large number of web pages is often accompanied by a large amount of noise, such as banner advertisements, navigation bars, and copyright notices. This noise can seriously harm web mining, because miners extract the whole document rather than the informative content and therefore retrieve non-relevant results. It is thus important to distinguish valuable information from noisy data within a single web page. Web pages are composed not only of main content, such as product information in a shopping domain or job information in a job domain, but also of advertisement bars and static content such as navigation panels and copyright sections. When web documents are processed, the main content is surrounded by noise in the retrieved data. To tackle these issues, a noise elimination process based on HTML tags is described, and the main content is retrieved using a Gomory-Hu tree.	en_US
dc.language.iso	en_US	en_US
dc.publisher	Thirteenth International Conference On Computer Applications (ICCA 2015)	en_US
dc.subject	noise elimination	en_US
dc.subject	block splitting	en_US
dc.title	Noise Elimination from Web Page in Web Content Mining	en_US
dc.type	Article	en_US