Search in sources :

Example 1 with MockGithubDownloader

use of us.codecraft.webmagic.downloader.MockGithubDownloader in project webmagic by code4craft.

the class ConfigurablePageProcessorTest method test.

@Test
public void test() throws Exception {
    List<ExtractRule> extractRules = new ArrayList<ExtractRule>();
    ExtractRule extractRule = new ExtractRule();
    extractRule.setExpressionType(ExpressionType.XPath);
    extractRule.setExpressionValue("//title");
    extractRule.setFieldName("title");
    extractRules.add(extractRule);
    extractRule = new ExtractRule();
    extractRule.setExpressionType(ExpressionType.XPath);
    extractRule.setExpressionValue("//ul[@class='pagehead-actions']/li[1]//a[@class='social-count js-social-count']/text()");
    extractRule.setFieldName("star");
    extractRules.add(extractRule);
    ResultItems resultItems = Spider.create(new ConfigurablePageProcessor(Site.me(), extractRules)).setDownloader(new MockGithubDownloader()).get("https://github.com/code4craft/webmagic");
    assertThat(resultItems.getAll()).containsEntry("title", "<title>code4craft/webmagic ยท GitHub</title>");
    assertThat(resultItems.getAll()).containsEntry("star", " 86 ");
}
Also used : ResultItems(us.codecraft.webmagic.ResultItems) ArrayList(java.util.ArrayList) MockGithubDownloader(us.codecraft.webmagic.downloader.MockGithubDownloader) Test(org.junit.Test)

Aggregations

ArrayList (java.util.ArrayList)1 Test (org.junit.Test)1 ResultItems (us.codecraft.webmagic.ResultItems)1 MockGithubDownloader (us.codecraft.webmagic.downloader.MockGithubDownloader)1