
Example 91 with Document

use of org.jsoup.nodes.Document in project jsoup by jhy.

the class UrlConnectTest method followsRelativeDotRedirect.

@Test
public void followsRelativeDotRedirect() throws IOException {
    // redirects to "./ok.html", which should resolve against the request URL
    Connection con = Jsoup.connect("");
    // the request call is missing in this listing; post() is assumed
    Document doc = con.post();
    assertEquals(doc.location(), "");
}
Also used : Connection(org.jsoup.Connection) Document(org.jsoup.nodes.Document) Test(org.junit.Test)
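
The test above depends on a relative `Location` value such as "./ok.html" being resolved against the URL of the request that returned the redirect. A minimal sketch of that resolution step using only `java.net.URI` follows. `RedirectResolver` is a hypothetical helper and the example.com URL is a placeholder; neither comes from jsoup, and jsoup's own implementation may differ.

```java
import java.net.URI;

// Hypothetical helper, not part of jsoup: resolves a redirect Location
// value against the URL of the request that produced it.
final class RedirectResolver {

    static String resolve(String requestUrl, String location) {
        // URI.resolve performs relative-reference resolution and
        // normalizes "." and ".." path segments in the result.
        return URI.create(requestUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        // Placeholder URL, chosen only for illustration.
        String resolved = resolve("http://example.com/tools/302-rel-dot.pl", "./ok.html");
        System.out.println(resolved); // prints http://example.com/tools/ok.html
    }
}
```

The last path segment of the request URL is dropped before the relative reference is merged, which is why "./ok.html" lands next to the redirecting script rather than below it.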

Example 92 with Document

use of org.jsoup.nodes.Document in project jsoup by jhy.

the class UrlConnectTest method multiCookieSet.

@Test
public void multiCookieSet() throws IOException {
    Connection con = Jsoup.connect("");
    Connection.Response res = con.execute();
    // test cookies set by redirect:
    Map<String, String> cookies = res.cookies();
    assertEquals("asdfg123", cookies.get("token"));
    assertEquals("jhy", cookies.get("uid"));
    // send those cookies into the echo URL by map:
    Document doc = Jsoup.connect(echoURL).cookies(cookies).get();
    assertEquals("token=asdfg123; uid=jhy", ihVal("HTTP_COOKIE", doc));
}
Also used : Connection(org.jsoup.Connection) Document(org.jsoup.nodes.Document) Test(org.junit.Test)
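
The final assertion above expects the cookie map to reach the server as a single header value of the form "name=value; name=value". A minimal sketch of that serialization with the standard library follows. `CookieHeaderSketch` and `toCookieHeader` are hypothetical names used for illustration; this is not jsoup's code, and it assumes a map that preserves insertion order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch, not jsoup's implementation: joins a cookie map
// into the value of an HTTP Cookie request header.
final class CookieHeaderSketch {

    static String toCookieHeader(Map<String, String> cookies) {
        // Each entry becomes "name=value"; pairs are separated by "; ".
        return cookies.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("; "));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the header order is stable.
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("token", "asdfg123");
        cookies.put("uid", "jhy");
        System.out.println(toCookieHeader(cookies)); // prints token=asdfg123; uid=jhy
    }
}
```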

Example 93 with Document

use of org.jsoup.nodes.Document in project webmagic by code4craft.

the class CharsetUtils method detectCharset.

public static String detectCharset(String contentType, byte[] contentBytes) throws IOException {
    String charset;
    // 1. charset declared in the HTTP Content-Type header
    charset = UrlUtils.getCharset(contentType);
    // return early only when the header actually names a charset
    if (StringUtils.isNotBlank(contentType) && StringUtils.isNotBlank(charset)) {
        logger.debug("Auto get charset: {}", charset);
        return charset;
    }
    // decode once with the default charset so the markup can be inspected
    Charset defaultCharset = Charset.defaultCharset();
    String content = new String(contentBytes, defaultCharset);
    // 2. charset declared in a <meta> tag
    if (StringUtils.isNotEmpty(content)) {
        Document document = Jsoup.parse(content);
        Elements links = document.select("meta");
        for (Element link : links) {
            String metaContent = link.attr("content");
            String metaCharset = link.attr("charset");
            if (metaContent.indexOf("charset") != -1) {
                // 2.1 HTML 4.01: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
                metaContent = metaContent.substring(metaContent.indexOf("charset"), metaContent.length());
                charset = metaContent.split("=")[1];
            } else if (StringUtils.isNotEmpty(metaCharset)) {
                // 2.2 HTML5: <meta charset="UTF-8" />
                charset = metaCharset;
            }
        }
    }
    logger.debug("Auto get charset: {}", charset);
    // 3. TODO: use a tool such as cpdetector to detect the charset from the content
    return charset;
}
Also used : Element(org.jsoup.nodes.Element) Charset(java.nio.charset.Charset) Document(org.jsoup.nodes.Document) Elements(
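
The lookup order in `detectCharset` (Content-Type header first, then `<meta>` declarations) can be tried without webmagic or jsoup on the classpath. The sketch below swaps the jsoup parse for regular expressions, so it is a simplification rather than the project's method: it can be fooled by a "charset=" string inside an unrelated `content` attribute. `CharsetSniffSketch` and its method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, regex-based stand-in for the header-then-meta lookup.
// It does not parse HTML and is only meant to illustrate the order of checks.
final class CharsetSniffSketch {

    // Matches charset=UTF-8, charset="UTF-8" and charset = 'utf-8'.
    private static final Pattern CHARSET =
            Pattern.compile("(?i)\\bcharset\\s*=\\s*[\"']?([\\w.:-]+)");

    // Matches a complete <meta ...> tag.
    private static final Pattern META = Pattern.compile("(?i)<meta\\b[^>]*>");

    // Returns the charset named in the given text, or null if there is none.
    static String findCharset(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = CHARSET.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    // Step 1: Content-Type header. Step 2: first <meta> tag that names a charset.
    static String detect(String contentType, String html) {
        String fromHeader = findCharset(contentType);
        if (fromHeader != null) {
            return fromHeader;
        }
        Matcher meta = META.matcher(html == null ? "" : html);
        while (meta.find()) {
            String fromMeta = findCharset(meta.group());
            if (fromMeta != null) {
                return fromMeta;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String html = "<html><head><meta charset=\"UTF-8\" /></head></html>";
        System.out.println(detect("text/html", html)); // prints UTF-8
    }
}
```

Unlike the listing above, this sketch stops at the first `<meta>` tag that names a charset; whether first or last should win is a design choice the source does not state.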

Example 94 with Document

use of org.jsoup.nodes.Document in project cucumber-jvm by cucumber.

the class HTMLFormatterTest method writes_index_html.

@Test
public void writes_index_html() throws IOException {
    URL indexHtml = new URL(outputDir, "index.html");
    // parse the generated report from disk as UTF-8
    Document document = Jsoup.parse(new File(indexHtml.getFile()), "UTF-8");
    Element reportElement = document.body().getElementsByClass("cucumber-report").first();
    assertEquals("", reportElement.text());
}
Also used : Element(org.jsoup.nodes.Element) Document(org.jsoup.nodes.Document) File( URL( Test(org.junit.Test)

Example 95 with Document

use of org.jsoup.nodes.Document in project opennms by OpenNMS.

the class HttpCollectionHandler method fillCollectionSet.

protected void fillCollectionSet(String urlString, Request request, CollectionAgent agent, CollectionSetBuilder builder, XmlSource source) throws Exception {
    Document doc = getJsoupDocument(urlString, request);
    for (XmlGroup group : source.getXmlGroups()) {
        LOG.debug("fillCollectionSet: getting resources for XML group {} using selector {}", group.getName(), group.getResourceXpath());
        Date timestamp = getTimeStamp(doc, group);
        // the selector call is missing in this listing; reconstructed from the surrounding log messages
        Elements elements = doc.select(group.getResourceXpath());
        LOG.debug("fillCollectionSet: {} => {}", group.getResourceXpath(), elements);
        String resourceName = getResourceName(elements, group);
        LOG.debug("fillCollectionSet: processing XML resource {}", resourceName);
        final Resource collectionResource = getCollectionResource(agent, resourceName, group.getResourceType(), timestamp);
        LOG.debug("fillCollectionSet: processing resource {}", collectionResource);
        for (XmlObject object : group.getXmlObjects()) {
            // missing in this listing; object.getXpath() is an assumed accessor
            Elements el = elements.select(object.getXpath());
            if (el == null) {
                // the logger call is missing in this listing; info level is assumed
                LOG.info("No value found for object named '{}'. Skipping.", object.getName());
                continue;
            }
            builder.withAttribute(collectionResource, group.getName(), object.getName(), el.html(), object.getDataType());
        }
        processXmlResource(builder, collectionResource, resourceName, group.getName());
    }
}
Also used : XmlGroup(org.opennms.protocols.xml.config.XmlGroup) Resource( XmlObject(org.opennms.protocols.xml.config.XmlObject) Document(org.jsoup.nodes.Document) Elements( Date(java.util.Date)

