Example 1 with BloomFilterDuplicateRemover

Use of us.codecraft.webmagic.scheduler.BloomFilterDuplicateRemover in the project webmagic by code4craft.

From the class OschinaBlogPageProcesser, method main.

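public static void main(String[] args) throws JMException {
    // Queue-based scheduler whose duplicate check is a Bloom filter constructed with 2000.
    Spider spider = Spider.create(new OschinaBlogPageProcesser())
            .setScheduler(new QueueScheduler()
                    .setDuplicateRemover(new BloomFilterDuplicateRemover(2000)));
    // Register the spider with the monitor (JMX, hence JMException), then run it.
    SpiderMonitor.instance().register(spider);
    spider.run();
}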
Also used: BloomFilterDuplicateRemover (us.codecraft.webmagic.scheduler.BloomFilterDuplicateRemover), Spider (us.codecraft.webmagic.Spider), QueueScheduler (us.codecraft.webmagic.scheduler.QueueScheduler)
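
For readers unfamiliar with the idea behind a Bloom-filter duplicate remover, here is a minimal, dependency-free sketch of the technique. It is not webmagic's implementation: the class name SimpleBloomDuplicateRemover, the method isDuplicate, the FNV-style hashing and the 0.01 false-positive rate are all assumptions made for illustration, and the first constructor argument is assumed to play the same role as the 2000 above (an expected number of URLs).

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Illustrative Bloom-filter duplicate check (hypothetical; not webmagic's code). */
class SimpleBloomDuplicateRemover {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    SimpleBloomDuplicateRemover(int expectedInsertions, double falsePositiveRate) {
        // Standard Bloom filter sizing: m = -n ln p / (ln 2)^2, k = (m / n) ln 2.
        double ln2 = Math.log(2);
        this.numBits = Math.max(1,
                (int) Math.ceil(-expectedInsertions * Math.log(falsePositiveRate) / (ln2 * ln2)));
        this.numHashes = Math.max(1,
                (int) Math.round((double) numBits / expectedInsertions * ln2));
        this.bits = new BitSet(numBits);
    }

    /**
     * Returns true if the URL was probably seen before (false positives are possible).
     * Otherwise records the URL and returns false.
     */
    boolean isDuplicate(String url) {
        byte[] data = url.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0x811C9DC5);
        int h2 = fnv1a(data, 0x01000193) | 1; // force odd so the step is never zero
        boolean seen = true;
        // Double hashing: derive k bit positions from two base hashes.
        for (int i = 0; i < numHashes; i++) {
            int index = Math.floorMod(h1 + i * h2, numBits);
            if (!bits.get(index)) {
                seen = false;
                bits.set(index);
            }
        }
        return seen;
    }

    /** FNV-1a style 32-bit hash with a caller-supplied starting value. */
    private static int fnv1a(byte[] data, int seed) {
        int hash = seed;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }

    public static void main(String[] args) {
        SimpleBloomDuplicateRemover remover = new SimpleBloomDuplicateRemover(2000, 0.01);
        System.out.println(remover.isDuplicate("https://example.com/a")); // false: first time seen
        System.out.println(remover.isDuplicate("https://example.com/a")); // true: already recorded
    }
}
```

The trade-off this illustrates is that a Bloom filter uses a fixed amount of memory regardless of URL length, at the cost of occasionally reporting a new URL as a duplicate, so a crawler may skip a small fraction of pages.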

Aggregations

Spider (us.codecraft.webmagic.Spider): 1
BloomFilterDuplicateRemover (us.codecraft.webmagic.scheduler.BloomFilterDuplicateRemover): 1
QueueScheduler (us.codecraft.webmagic.scheduler.QueueScheduler): 1