
Example 31 with Request

Use of us.codecraft.webmagic.Request in project webmagic by code4craft.

From the class FilePipelineTest, method before.

@BeforeClass
public static void before() {
    resultItems = new ResultItems();
    // "爬虫工具" means "crawler tool"
    resultItems.put("content", "webmagic 爬虫工具");
    Request request = new Request("http://www.baidu.com");
    resultItems.setRequest(request);
    task = new Task() {

        @Override
        public String getUUID() {
            return UUID.randomUUID().toString();
        }

        @Override
        public Site getSite() {
            return null;
        }
    };
}
Also used: Site (us.codecraft.webmagic.Site), Task (us.codecraft.webmagic.Task), ResultItems (us.codecraft.webmagic.ResultItems), Request (us.codecraft.webmagic.Request), BeforeClass (org.junit.BeforeClass)

Example 32 with Request

Use of us.codecraft.webmagic.Request in project webmagic by code4craft.

From the class DuplicateRemovedSchedulerTest, method test_no_duplicate_removed_for_post_request.

@Test
public void test_no_duplicate_removed_for_post_request() throws Exception {
    DuplicateRemover duplicateRemover = Mockito.mock(DuplicateRemover.class);
    duplicateRemovedScheduler.setDuplicateRemover(duplicateRemover);
    Request request = new Request("https://www.google.com/");
    request.setMethod(HttpConstant.Method.POST);
    duplicateRemovedScheduler.push(request, null);
    verify(duplicateRemover, times(0)).isDuplicate(any(Request.class), any(Task.class));
}
Also used: Task (us.codecraft.webmagic.Task), Request (us.codecraft.webmagic.Request), DuplicateRemover (us.codecraft.webmagic.scheduler.component.DuplicateRemover), Test (org.junit.Test)

Example 33 with Request

Use of us.codecraft.webmagic.Request in project webmagic by code4craft.

From the class DuplicateRemovedSchedulerTest, method test_duplicate_removed_for_get_request.

@Test
public void test_duplicate_removed_for_get_request() throws Exception {
    DuplicateRemover duplicateRemover = Mockito.mock(DuplicateRemover.class);
    duplicateRemovedScheduler.setDuplicateRemover(duplicateRemover);
    Request request = new Request("https://www.google.com/");
    request.setMethod(HttpConstant.Method.GET);
    duplicateRemovedScheduler.push(request, null);
    verify(duplicateRemover, times(1)).isDuplicate(any(Request.class), any(Task.class));
}
Also used: Task (us.codecraft.webmagic.Task), Request (us.codecraft.webmagic.Request), DuplicateRemover (us.codecraft.webmagic.scheduler.component.DuplicateRemover), Test (org.junit.Test)
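
The two scheduler tests above check one rule: a POST request is pushed without consulting the duplicate remover, while a GET request is checked once. The following is a minimal stand-alone sketch of that rule, not webmagic's actual implementation. The classes SketchRequest and DedupSchedulerSketch, and the counter used in place of a Mockito mock, are hypothetical names introduced only for illustration.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical stand-in for a crawl request: only a URL and an HTTP method.
final class SketchRequest {
    final String url;
    final String method;

    SketchRequest(String url, String method) {
        this.url = url;
        this.method = method;
    }
}

// Hypothetical scheduler that consults its duplicate filter for non-POST requests only.
final class DedupSchedulerSketch {
    private final Set<String> seenUrls = new HashSet<>();
    private final Queue<SketchRequest> queue = new ArrayDeque<>();
    private int duplicateChecks = 0;

    void push(SketchRequest request) {
        // Assumption: POST requests bypass de-duplication, since the same URL
        // may be posted repeatedly with different bodies.
        if ("POST".equalsIgnoreCase(request.method)) {
            queue.add(request);
            return;
        }
        // Every other request is checked against the set of seen URLs.
        duplicateChecks++;
        if (seenUrls.add(request.url)) {
            queue.add(request);
        }
    }

    int queued() {
        return queue.size();
    }

    int duplicateChecks() {
        return duplicateChecks;
    }
}
```

Pushing the same URL once as POST and twice as GET would, under these assumptions, leave two requests in the queue and record two duplicate checks.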

Example 34 with Request

Use of us.codecraft.webmagic.Request in project webmagic by code4craft.

From the class HttpClientDownloaderTest, method testCycleTriedTimes.

@Test
public void testCycleTriedTimes() {
    HttpClientDownloader httpClientDownloader = new HttpClientDownloader();
    Task task = Site.me().setDomain("localhost").setCycleRetryTimes(5).toTask();
    Request request = new Request(PAGE_ALWAYS_NOT_EXISTS);
    Page page = httpClientDownloader.download(request, task);
    // A bare assertThat(boolean) verifies nothing; assert on the list itself.
    assertThat(page.getTargetRequests()).isNotEmpty();
    assertThat((Integer) page.getTargetRequests().get(0).getExtra(Request.CYCLE_TRIED_TIMES)).isEqualTo(1);
    page = httpClientDownloader.download(page.getTargetRequests().get(0), task);
    assertThat((Integer) page.getTargetRequests().get(0).getExtra(Request.CYCLE_TRIED_TIMES)).isEqualTo(2);
}
Also used: Task (us.codecraft.webmagic.Task), Request (us.codecraft.webmagic.Request), Page (us.codecraft.webmagic.Page), Test (org.junit.Test)
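
The test above expects a failed download to re-queue the request with a retry counter stored among its extras: 1 after the first failure, 2 after the second. The sketch below illustrates that bookkeeping in plain Java. It is not webmagic's code: RetryRequestSketch, CycleRetrySketch, the key string, and the exact cutoff rule (stop once the counter would exceed the configured maximum) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical request carrying a URL and a bag of extras.
final class RetryRequestSketch {
    // Hypothetical key; the real constant is Request.CYCLE_TRIED_TIMES.
    static final String TRIED_TIMES_KEY = "cycle_tried_times";

    final String url;
    final Map<String, Object> extras = new HashMap<>();

    RetryRequestSketch(String url) {
        this.url = url;
    }
}

final class CycleRetrySketch {
    // Returns the request to re-queue after a failed download, or empty when
    // the retry budget is exhausted (assumed rule: give up above maxRetries).
    static Optional<RetryRequestSketch> retryAfterFailure(RetryRequestSketch failed, int maxRetries) {
        Object tried = failed.extras.get(RetryRequestSketch.TRIED_TIMES_KEY);
        int next = (tried == null) ? 1 : ((Integer) tried) + 1;
        if (next > maxRetries) {
            return Optional.empty();
        }
        // Copy the extras so the failed request is left untouched.
        RetryRequestSketch retry = new RetryRequestSketch(failed.url);
        retry.extras.putAll(failed.extras);
        retry.extras.put(RetryRequestSketch.TRIED_TIMES_KEY, next);
        return Optional.of(retry);
    }
}
```

Calling retryAfterFailure on a fresh request and then on its result mirrors the two download calls in the test: the counter goes from absent to 1 to 2.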

Example 35 with Request

Use of us.codecraft.webmagic.Request in project webmagic by code4craft.

From the class HttpClientDownloaderTest, method test_set_request_cookie.

@Test
public void test_set_request_cookie() throws Exception {
    HttpServer server = httpServer(13423);
    server.get(eq(cookie("cookie"), "cookie-webmagic")).response("ok");
    Runner.running(server, new Runnable() {

        @Override
        public void run() throws Exception {
            HttpClientDownloader httpClientDownloader = new HttpClientDownloader();
            Request request = new Request();
            request.setUrl("http://127.0.0.1:13423");
            request.addCookie("cookie", "cookie-webmagic");
            Page page = httpClientDownloader.download(request, Site.me().toTask());
            assertThat(page.getRawText()).isEqualTo("ok");
        }
    });
}
Also used: Runnable (com.github.dreamhead.moco.Runnable), HttpServer (com.github.dreamhead.moco.HttpServer), HttpUriRequest (org.apache.http.client.methods.HttpUriRequest), Request (us.codecraft.webmagic.Request), Page (us.codecraft.webmagic.Page), IOException (java.io.IOException), UnsupportedEncodingException (java.io.UnsupportedEncodingException), Test (org.junit.Test)
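
The cookie test above passes only if the name/value pair added to the request reaches the server in the Cookie header. As a rough illustration of what goes over the wire, the sketch below joins cookie pairs into the "name=value; name2=value2" layout defined by RFC 6265. CookieHeaderSketch is a hypothetical helper; how HttpClientDownloader actually attaches cookies is not shown in the source and may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class CookieHeaderSketch {
    // Joins cookie pairs as "name=value; name2=value2" in map iteration order.
    // No escaping or validation is done; values are assumed to be header-safe.
    static String toCookieHeader(Map<String, String> cookies) {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> entry : cookies.entrySet()) {
            joiner.add(entry.getKey() + "=" + entry.getValue());
        }
        return joiner.toString();
    }
}
```

With the single pair from the test, name "cookie" and value "cookie-webmagic", the header value would be "cookie=cookie-webmagic".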

Aggregations

Request (us.codecraft.webmagic.Request): 45
Test (org.junit.Test): 32
Page (us.codecraft.webmagic.Page): 22
HttpUriRequest (org.apache.http.client.methods.HttpUriRequest): 13
HttpServer (com.github.dreamhead.moco.HttpServer): 12
Runnable (com.github.dreamhead.moco.Runnable): 12
IOException (java.io.IOException): 12
UnsupportedEncodingException (java.io.UnsupportedEncodingException): 11
Task (us.codecraft.webmagic.Task): 10
Ignore (org.junit.Ignore): 8
Site (us.codecraft.webmagic.Site): 6
PlainText (us.codecraft.webmagic.selector.PlainText): 6
DuplicateRemover (us.codecraft.webmagic.scheduler.component.DuplicateRemover): 4
Matcher (java.util.regex.Matcher): 2
ResultItems (us.codecraft.webmagic.ResultItems): 2
HashSetDuplicateRemover (us.codecraft.webmagic.scheduler.component.HashSetDuplicateRemover): 2
JSONObject (com.alibaba.fastjson.JSONObject): 1
URI (java.net.URI): 1
ArrayList (java.util.ArrayList): 1
Map (java.util.Map): 1