UCSY's Research Repository

Implementation of a Distributed Web Crawler


dc.contributor.author Win, Seint Seint
dc.contributor.author Lwin, Nyein Nyein
dc.date.accessioned 2019-08-06T01:16:13Z
dc.date.available 2019-08-06T01:16:13Z
dc.date.issued 2009-12-30
dc.identifier.uri http://onlineresource.ucsy.edu.mm/handle/123456789/1820
dc.description.abstract Today’s search engines are equipped with specialized agents known as Web crawlers (download robots) dedicated to crawling large Web contents online. Crawlers interact with thousands of Web servers over periods extending from a few weeks to several years. Large-scale search engines such as Google use distributed crawlers to crawl the entire WWW. A distributed crawler harnesses the excess bandwidth and computing resources of clients to crawl the Web. This paper presents the design and implementation of a scalable distributed crawler using the distributed programming facilities provided by Java RMI. Hash-based partitioning is used to partition the URLs among the crawlers, and communication among crawlers is done by Remote Method Invocation. The crawler can run many crawler instances at the same time. en_US
dc.language.iso en en_US
dc.publisher Fourth Local Conference on Parallel and Soft Computing en_US
dc.subject Distributed Crawling en_US
dc.subject Java RMI en_US
dc.title Implementation of a Distributed Web Crawler en_US
dc.type Article en_US
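
As a rough illustration of the approach described in the abstract, the sketch below shows what hash-based URL partitioning and an RMI hand-off interface might look like in Java. The class and interface names, the host-based hash, and the fixed crawler count are assumptions made for illustration, not details taken from the paper.

import java.net.URI;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.List;

// Hypothetical remote interface: a crawler instance could hand off URLs to the
// node responsible for them via Java RMI (the interface name is an assumption).
interface CrawlerNode extends Remote {
    void submitUrls(List<String> urls) throws RemoteException;
}

// Minimal sketch of hash-based URL partitioning: each URL is mapped to one of
// the running crawler instances by hashing its host name.
public class UrlPartitioner {
    private final int crawlerCount;   // number of crawler instances (assumed known up front)

    public UrlPartitioner(int crawlerCount) {
        this.crawlerCount = crawlerCount;
    }

    /** Returns the index of the crawler instance responsible for this URL. */
    public int crawlerFor(String url) {
        String host = URI.create(url).getHost();   // hashing by host keeps one site on one crawler
        return Math.floorMod(host.hashCode(), crawlerCount);
    }

    public static void main(String[] args) {
        UrlPartitioner p = new UrlPartitioner(4);
        // URLs from the same host map to the same crawler instance.
        System.out.println(p.crawlerFor("http://example.com/a"));
        System.out.println(p.crawlerFor("http://example.com/b"));
    }
}

Hashing by host rather than by full URL is a common design choice in distributed crawlers because it lets per-site politeness rules be enforced by a single instance; the paper itself does not specify which key is hashed.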

