Example 1 with XPathEvaluator

Use of us.codecraft.xsoup.XPathEvaluator in the project webmagic by code4craft.

From the class XpathSelectorTest, method parserPerformanceTest.

@Ignore("take long time")
@Test
public void parserPerformanceTest() throws XPatherException {
    // Print the input size so the timings below can be put in context.
    System.out.println(html.length());
    // Parse once up front with each library; the query loops reuse these trees.
    HtmlCleaner htmlCleaner = new HtmlCleaner();
    TagNode tagNode = htmlCleaner.clean(html);
    Document document = Jsoup.parse(html);
    // HtmlCleaner: time 2000 parses of the page.
    long time = System.currentTimeMillis();
    for (int i = 0; i < 2000; i++) {
        htmlCleaner.clean(html);
    }
    System.out.println(System.currentTimeMillis() - time);
    // HtmlCleaner: time 2000 XPath queries against the already-parsed tree.
    time = System.currentTimeMillis();
    for (int i = 0; i < 2000; i++) {
        tagNode.evaluateXPath("//a");
    }
    System.out.println(System.currentTimeMillis() - time);
    System.out.println("=============");
    // Jsoup: time 2000 parses of the page.
    time = System.currentTimeMillis();
    for (int i = 0; i < 2000; i++) {
        Jsoup.parse(html);
    }
    System.out.println(System.currentTimeMillis() - time);
    // Jsoup: time 2000 CSS-selector queries against the already-parsed document.
    time = System.currentTimeMillis();
    for (int i = 0; i < 2000; i++) {
        document.select("a");
    }
    System.out.println(System.currentTimeMillis() - time);
    System.out.println("=============");
    // HtmlCleaner again: a second run of the same parse and query measurements.
    time = System.currentTimeMillis();
    for (int i = 0; i < 2000; i++) {
        htmlCleaner.clean(html);
    }
    System.out.println(System.currentTimeMillis() - time);
    time = System.currentTimeMillis();
    for (int i = 0; i < 2000; i++) {
        tagNode.evaluateXPath("//a");
    }
    System.out.println(System.currentTimeMillis() - time);
    System.out.println("=============");
    // Xsoup: compile the XPath once, then time 2000 evaluations on the Jsoup document.
    XPathEvaluator compile = Xsoup.compile("//a");
    time = System.currentTimeMillis();
    for (int i = 0; i < 2000; i++) {
        compile.evaluate(document);
    }
    System.out.println(System.currentTimeMillis() - time);
}
Also used: XPathEvaluator (us.codecraft.xsoup.XPathEvaluator), Document (org.jsoup.nodes.Document), HtmlCleaner (org.htmlcleaner.HtmlCleaner), TagNode (org.htmlcleaner.TagNode), Ignore (org.junit.Ignore), Test (org.junit.Test)
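
The last measurement in the test compiles the XPath once and then evaluates it repeatedly. The same pattern can be sketched with only the JDK, which may help when the third-party libraries are not on the classpath. This is a minimal sketch under stated assumptions, not the webmagic or Xsoup code: it swaps in javax.xml.xpath for Xsoup, and the class name CompiledXPathTiming, the method countLinks, and the sample markup are hypothetical. The JDK parser needs well-formed XML, so unlike Jsoup or HtmlCleaner it would likely reject typical real-world HTML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; uses the JDK XPath API, not Xsoup.
class CompiledXPathTiming {

    // Hypothetical sample input; it must be well-formed XML for the JDK parser.
    static final String XML =
        "<html><body><a href='/one'>one</a><p><a href='/two'>two</a></p></body></html>";

    /** Evaluates a precompiled "//a" the given number of times and returns the match count. */
    static int countLinks(int iterations) throws Exception {
        // Parse once, outside the timed loop.
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

        // Compile once, so the loop measures evaluation only.
        XPathExpression compiled = XPathFactory.newInstance().newXPath().compile("//a");

        int matches = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            NodeList nodes = (NodeList) compiled.evaluate(document, XPathConstants.NODESET);
            matches = nodes.getLength();
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(iterations + " evaluations took " + elapsedMs + " ms");
        return matches;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("matched <a> elements: " + countLinks(2000));
    }
}
```

As with the test above, a plain timing loop like this gives only a rough comparison, since JIT warm-up can skew the first iterations.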

Aggregations

HtmlCleaner (org.htmlcleaner.HtmlCleaner): 1, TagNode (org.htmlcleaner.TagNode): 1, Document (org.jsoup.nodes.Document): 1, Ignore (org.junit.Ignore): 1, Test (org.junit.Test): 1, XPathEvaluator (us.codecraft.xsoup.XPathEvaluator): 1