Web Document Clustering using Genetic Algorithm

Hein, Pyae Sandi; Khine, May Aye

dc.contributor.author	Hein, Pyae Sandi
dc.contributor.author	Khine, May Aye
dc.date.accessioned	2019-07-26T05:23:53Z
dc.date.available	2019-07-26T05:23:53Z
dc.date.issued	2011-12-29
dc.identifier.uri	http://onlineresource.ucsy.edu.mm/handle/123456789/1355
dc.description.abstract	Clustering (or cluster analysis) is one of the main data analysis techniques and deals with the organization of a set of objects in a multidimensional space into cohesive groups, called clusters. Each cluster contains objects that are very similar to each other and very dissimilar to objects in other clusters. Web page clustering is one of the major preprocessing steps in web mining analysis. Clustering is also useful for extracting salient features of related web documents in order to automatically formulate queries and search for other similar documents on the Web. Web page clustering faces many challenges due to the high dimensionality and the heterogeneous nature of web documents. Efficient and scalable algorithms are needed for web clustering. The genetic algorithm is one of the algorithms from evolutionary computing that can effectively search a large search space by simulating natural evolution. This paper presents a genetic algorithm for web page clustering that is scalable and efficient. A genetic algorithm with a medoid representation is used because it yields a shorter chromosome length and because medoid-based clustering is more tolerant of noisy data such as web documents. The approach also employs a supervised feature selection method to select appropriate feature terms.	en_US
dc.language.iso	en	en_US
dc.publisher	Sixth Local Conference on Parallel and Soft Computing	en_US
dc.title	Web Document Clustering using Genetic Algorithm	en_US
dc.type	Article	en_US
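
The abstract describes a genetic algorithm whose chromosomes encode cluster medoids rather than a label for every document, which keeps each chromosome at length k. The following is a minimal sketch of that idea, not the authors' implementation. The record does not specify the fitness function, the genetic operators, or the feature selection step, so everything here is an assumption made for illustration: cosine distance, a fitness equal to the total distance from each document to its nearest medoid, truncation selection, union-based crossover, replacement mutation, and all function and parameter names. Documents are assumed to be already reduced to numeric term-weight vectors.

```python
import random


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def fitness(chromosome, docs):
    """Total distance from each document to its nearest medoid (lower is better).

    Assumed objective; the record does not state the one used in the paper.
    """
    return sum(min(cosine_distance(d, docs[m]) for m in chromosome) for d in docs)


def crossover(parent1, parent2, k, rng):
    """Draw k distinct medoid indices from the union of both parents (assumed operator)."""
    pool = list(set(parent1) | set(parent2))
    return rng.sample(pool, k)


def mutate(chromosome, n_docs, rate, rng):
    """With probability `rate` per gene, swap a medoid for an unused document index."""
    child = list(chromosome)
    for i in range(len(child)):
        if rng.random() < rate:
            candidates = [j for j in range(n_docs) if j not in child]
            if candidates:
                child[i] = rng.choice(candidates)
    return child


def ga_medoid_clustering(docs, k, pop_size=20, generations=50,
                         mutation_rate=0.1, seed=0):
    """Evolve a chromosome of k medoid indices; return (medoids, labels)."""
    rng = random.Random(seed)
    n = len(docs)
    # Each chromosome is a list of k distinct document indices acting as medoids.
    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection with elitism: keep the better half unchanged.
        population.sort(key=lambda c: fitness(c, docs))
        survivors = population[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            child = crossover(p1, p2, k, rng)
            children.append(mutate(child, n, mutation_rate, rng))
        population = survivors + children
    best = min(population, key=lambda c: fitness(c, docs))
    # Assign every document to the position of its nearest medoid in `best`.
    labels = [min(range(k), key=lambda i: cosine_distance(d, docs[best[i]]))
              for d in docs]
    return best, labels


if __name__ == "__main__":
    # Toy term-weight vectors forming two obvious topical groups.
    toy_docs = [[1, 0, 0], [0.9, 0.1, 0], [1, 0.1, 0],
                [0, 0, 1], [0, 0.1, 0.9], [0, 0.1, 1]]
    medoids, labels = ga_medoid_clustering(toy_docs, k=2)
    print("medoid indices:", medoids)
    print("cluster labels:", labels)
```

Because a chromosome stores only k document indices, its length is independent of the collection size, which is the property the abstract cites as the reason for the medoid representation. A label-per-document encoding would instead grow with the number of web pages.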