Crawling some test data to play with

cmsmerge (Searchdaimon) edited this page Aug 29, 2013 · 1 revision

Before you can begin developing, you need some data to crawl. Searchdaimon has collected several data sets that are useful for testing. More information about each one is available at http://www.opentestsearch.com/test-sets/ .

Use the web-based administrator interface on the ES to set up at least two different collections with data, for example by indexing the following with the Intranet connector:

Collection name    Url
Enwiki             http://datasets.opentestset.com/datasets/enwiki_2011/basic/
Enronsmall         http://datasets.opentestset.com/datasets/Enron_files/basic/lay-k/
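
Before adding the collections, it can be handy to confirm that the data set URLs respond. The following is only a minimal sketch in Python using the standard library; it is not part of the ES or its administrator interface. The helper names `is_well_formed` and `is_reachable` are made up for this illustration, and it assumes the data set server answers HEAD requests.

```python
from urllib.parse import urlparse
from urllib.request import Request, urlopen

# Collection names and URLs copied from the table above.
COLLECTIONS = {
    "Enwiki": "http://datasets.opentestset.com/datasets/enwiki_2011/basic/",
    "Enronsmall": "http://datasets.opentestset.com/datasets/Enron_files/basic/lay-k/",
}


def is_well_formed(url):
    """Return True if the URL has an http(s) scheme and a host name."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_reachable(url, timeout=10):
    """Return True if a HEAD request succeeds (assumes the server supports HEAD)."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as response:
            return response.status < 400
    except OSError:  # URLError and HTTPError are both subclasses of OSError
        return False


if __name__ == "__main__":
    # The network check only runs when the script is executed directly.
    for name, url in COLLECTIONS.items():
        print(name, url,
              "well-formed:", is_well_formed(url),
              "reachable:", is_reachable(url))
```

If a URL is reported as unreachable, check it in a browser before configuring the collection, since the check above only tells you whether the server answered.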

You may want to set the "delay" option to 0 when you add your collections so the crawler goes as fast as possible.