Example 16 with Page

use of us.codecraft.webmagic.Page in project webmagic by code4craft.

the class AbstractDownloader method addToCycleRetry.

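// Re-queues a failed request for another download attempt, or returns null
// once the site's cycle-retry limit has been reached.
protected Page addToCycleRetry(Request request, Site site) {
    Page page = new Page();
    Object cycleTriedTimesObject = request.getExtra(Request.CYCLE_TRIED_TIMES);
    if (cycleTriedTimesObject == null) {
        // First failure: start the counter at 1 and re-queue the request.
        page.addTargetRequest(request.setPriority(0).putExtra(Request.CYCLE_TRIED_TIMES, 1));
    } else {
        int cycleTriedTimes = (Integer) cycleTriedTimesObject;
        cycleTriedTimes++;
        if (cycleTriedTimes >= site.getCycleRetryTimes()) {
            // Retry limit reached: give up on this request.
            return null;
        }
        page.addTargetRequest(request.setPriority(0).putExtra(Request.CYCLE_TRIED_TIMES, cycleTriedTimes));
    }
    // Mark the page as needing a cycle retry.
    page.setNeedCycleRetry(true);
    return page;
}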
Also used : Page(us.codecraft.webmagic.Page)
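
The counting logic in addToCycleRetry can be traced without the webmagic classes. The following is a minimal sketch, not the library's code: the class CycleRetrySketch, its methods, and the key string "cycleTriedTimes" are hypothetical stand-ins, and a plain Map plays the role of the request extras. It mirrors the branches above to show how many times a request that always fails would be re-queued for a given retry limit.

```java
import java.util.HashMap;
import java.util.Map;

public class CycleRetrySketch {
    // Hypothetical key, standing in for Request.CYCLE_TRIED_TIMES.
    static final String CYCLE_TRIED_TIMES = "cycleTriedTimes";

    /**
     * Mirrors the branches of addToCycleRetry: updates the counter stored in
     * the extras map and returns true if the request should be re-queued.
     */
    public static boolean shouldRetry(Map<String, Object> extras, int cycleRetryTimes) {
        Object tried = extras.get(CYCLE_TRIED_TIMES);
        if (tried == null) {
            // First failure: the counter starts at 1, with no limit check.
            extras.put(CYCLE_TRIED_TIMES, 1);
            return true;
        }
        int next = (Integer) tried + 1;
        if (next >= cycleRetryTimes) {
            // Limit reached: the original method returns null here.
            return false;
        }
        extras.put(CYCLE_TRIED_TIMES, next);
        return true;
    }

    /** Counts how often a request that fails every time is re-queued. */
    public static int countRequeues(int cycleRetryTimes) {
        Map<String, Object> extras = new HashMap<>();
        int requeues = 0;
        while (shouldRetry(extras, cycleRetryTimes)) {
            requeues++;
        }
        return requeues;
    }

    public static void main(String[] args) {
        System.out.println(countRequeues(5)); // prints 4
    }
}
```

Under these assumptions, a limit of 5 yields four re-queues, so the request is attempted five times in total. Because the first failure is re-queued without consulting the limit, a limit of 1 still produces one re-queue in this sketch.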

Example 17 with Page

use of us.codecraft.webmagic.Page in project webmagic by code4craft.

the class HttpClientDownloaderTest method testCycleTriedTimes.

@Test
public void testCycleTriedTimes() {
    HttpClientDownloader httpClientDownloader = new HttpClientDownloader();
    Task task = Site.me().setDomain("localhost").setCycleRetryTimes(5).toTask();
    Request request = new Request(PAGE_ALWAYS_NOT_EXISTS);
    Page page = httpClientDownloader.download(request, task);
    // Without a terminal assertion such as isTrue(), assertThat(...) checks nothing.
    assertThat(page.getTargetRequests().size() > 0).isTrue();
    assertThat((Integer) page.getTargetRequests().get(0).getExtra(Request.CYCLE_TRIED_TIMES)).isEqualTo(1);
    page = httpClientDownloader.download(page.getTargetRequests().get(0), task);
    assertThat((Integer) page.getTargetRequests().get(0).getExtra(Request.CYCLE_TRIED_TIMES)).isEqualTo(2);
}
Also used : Task(us.codecraft.webmagic.Task) Request(us.codecraft.webmagic.Request) Page(us.codecraft.webmagic.Page) Test(org.junit.Test)

Example 18 with Page

use of us.codecraft.webmagic.Page in project webmagic by code4craft.

the class HttpClientDownloaderTest method test_set_request_cookie.

@Test
public void test_set_request_cookie() throws Exception {
    HttpServer server = httpServer(13423);
    server.get(eq(cookie("cookie"), "cookie-webmagic")).response("ok");
    Runner.running(server, new Runnable() {

        @Override
        public void run() throws Exception {
            HttpClientDownloader httpClientDownloader = new HttpClientDownloader();
            Request request = new Request();
            request.setUrl("http://127.0.0.1:13423");
            request.addCookie("cookie", "cookie-webmagic");
            Page page = httpClientDownloader.download(request, Site.me().toTask());
            assertThat(page.getRawText()).isEqualTo("ok");
        }
    });
}
Also used : Runnable(com.github.dreamhead.moco.Runnable) HttpServer(com.github.dreamhead.moco.HttpServer) HttpUriRequest(org.apache.http.client.methods.HttpUriRequest) Request(us.codecraft.webmagic.Request) Page(us.codecraft.webmagic.Page) IOException(java.io.IOException) UnsupportedEncodingException(java.io.UnsupportedEncodingException) Test(org.junit.Test)

Example 19 with Page

use of us.codecraft.webmagic.Page in project webmagic by code4craft.

the class HttpClientDownloaderTest method test_set_site_cookie.

@Test
public void test_set_site_cookie() throws Exception {
    HttpServer server = httpServer(13423);
    server.get(eq(cookie("cookie"), "cookie-webmagic")).response("ok");
    Runner.running(server, new Runnable() {

        @Override
        public void run() throws Exception {
            HttpClientDownloader httpClientDownloader = new HttpClientDownloader();
            Request request = new Request();
            request.setUrl("http://127.0.0.1:13423");
            Site site = Site.me().addCookie("cookie", "cookie-webmagic").setDomain("127.0.0.1");
            Page page = httpClientDownloader.download(request, site.toTask());
            assertThat(page.getRawText()).isEqualTo("ok");
        }
    });
}
Also used : Site(us.codecraft.webmagic.Site) Runnable(com.github.dreamhead.moco.Runnable) HttpServer(com.github.dreamhead.moco.HttpServer) HttpUriRequest(org.apache.http.client.methods.HttpUriRequest) Request(us.codecraft.webmagic.Request) Page(us.codecraft.webmagic.Page) IOException(java.io.IOException) UnsupportedEncodingException(java.io.UnsupportedEncodingException) Test(org.junit.Test)

Example 20 with Page

use of us.codecraft.webmagic.Page in project webmagic by code4craft.

the class HttpClientDownloaderTest method test_disableCookieManagement.

@Test
public void test_disableCookieManagement() throws Exception {
    HttpServer server = httpServer(13423);
    server.get(not(eq(cookie("cookie"), "cookie-webmagic"))).response("ok");
    Runner.running(server, new Runnable() {

        @Override
        public void run() throws Exception {
            HttpClientDownloader httpClientDownloader = new HttpClientDownloader();
            Request request = new Request();
            request.setUrl("http://127.0.0.1:13423");
            request.addCookie("cookie", "cookie-webmagic");
            Page page = httpClientDownloader.download(request, Site.me().setDisableCookieManagement(true).toTask());
            assertThat(page.getRawText()).isEqualTo("ok");
        }
    });
}
Also used : Runnable(com.github.dreamhead.moco.Runnable) HttpServer(com.github.dreamhead.moco.HttpServer) HttpUriRequest(org.apache.http.client.methods.HttpUriRequest) Request(us.codecraft.webmagic.Request) Page(us.codecraft.webmagic.Page) IOException(java.io.IOException) UnsupportedEncodingException(java.io.UnsupportedEncodingException) Test(org.junit.Test)

Aggregations

Page (us.codecraft.webmagic.Page): 29
Request (us.codecraft.webmagic.Request): 22
Test (org.junit.Test): 19
IOException (java.io.IOException): 11
HttpUriRequest (org.apache.http.client.methods.HttpUriRequest): 11
HttpServer (com.github.dreamhead.moco.HttpServer): 10
Runnable (com.github.dreamhead.moco.Runnable): 10
UnsupportedEncodingException (java.io.UnsupportedEncodingException): 10
PlainText (us.codecraft.webmagic.selector.PlainText): 8
Site (us.codecraft.webmagic.Site): 5
Task (us.codecraft.webmagic.Task): 5
Ignore (org.junit.Ignore): 3
Proxy (us.codecraft.webmagic.proxy.Proxy): 2
Html (us.codecraft.webmagic.selector.Html): 2
ArrayList (java.util.ArrayList): 1
Map (java.util.Map): 1
CloseableHttpResponse (org.apache.http.client.methods.CloseableHttpResponse): 1
CloseableHttpClient (org.apache.http.impl.client.CloseableHttpClient): 1
Cookie (org.openqa.selenium.Cookie): 1
WebDriver (org.openqa.selenium.WebDriver): 1