wosiu edited this page Oct 3, 2013 · 3 revisions

Universal crawler for the web or a local file system. The module lets you extend any location-related search easily: override the appropriate default functions of the Crawler class in your subclass. For example, you can change the default web filters or add some statistics. The "thread-version" branch contains a multi-threaded version. The "icm" branch contains an extended module for collecting statistics from the logs of Apache Hadoop jobs' tasks. The master branch contains Demo1 and Demo2, which are extensions of the crawler.
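
To illustrate the "override the defaults in a subclass" idea, here is a minimal, self-contained sketch in Python. It does not use this project's code: the `Crawler` base class below is a hypothetical stand-in written only for this example, and the hook names `should_visit` and `on_visit`, the `LogStatsCrawler` subclass, and the `demo` helper are all assumptions, not the module's real API. It walks a local directory, swaps the default filter for one that accepts only `.log` files, and adds a simple byte-count statistic.

```python
import os
import tempfile


class Crawler:
    """Hypothetical stand-in base class; NOT this project's actual API."""

    def should_visit(self, path):
        # Default filter: accept every file.
        return True

    def on_visit(self, path):
        # Default action: do nothing.
        pass

    def crawl(self, root):
        # Walk the tree under root and apply the two hooks to each file.
        visited = []
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                if self.should_visit(path):
                    self.on_visit(path)
                    visited.append(path)
        return visited


class LogStatsCrawler(Crawler):
    """Subclass that changes the filter and adds a simple statistic."""

    def __init__(self):
        self.total_bytes = 0

    def should_visit(self, path):
        # Overridden filter: only visit .log files.
        return path.endswith(".log")

    def on_visit(self, path):
        # Overridden action: accumulate the size of each visited file.
        self.total_bytes += os.path.getsize(path)


def demo():
    # Build a throwaway directory with three small files and crawl it.
    with tempfile.TemporaryDirectory() as root:
        for name, body in [("a.log", "12345"), ("b.txt", "ignored"), ("c.log", "678")]:
            with open(os.path.join(root, name), "w") as fh:
                fh.write(body)
        crawler = LogStatsCrawler()
        visited = crawler.crawl(root)
        return [os.path.basename(p) for p in visited], crawler.total_bytes
```

Calling `demo()` crawls the temporary directory with the subclass, so only the two `.log` files are visited and counted. The base class keeps the traversal logic in one place, while each subclass changes only the hooks it cares about.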

Let me know if you use my code, have any suggestions, or just find this helpful :)
