
Example 21 with Document

use of org.jsoup.nodes.Document in project jsoup by jhy.

the class UrlConnectTest method inWildUtfRedirect2.

@Test
public void inWildUtfRedirect2() throws IOException {
    Connection.Response res = Jsoup.connect("https://ssl.souq.com/sa-en/2724288604627/s").execute();
    Document doc = res.parse();
    assertEquals("http://saudi.souq.com/sa-en/%D8%AE%D8%B2%D9%86%D8%A9-%D8%A2%D9%85%D9%86%D8%A9-3-%D8%B7%D8%A8%D9%82%D8%A7%D8%AA-%D8%A8%D9%86%D8%B8%D8%A7%D9%85-%D9%82%D9%81%D9%84-%D8%A5%D9%84%D9%83%D8%AA%D8%B1%D9%88%D9%86%D9%8A-bsd11523-6831477/i/?ctype=dsrch", doc.location());
}
Also used: Connection (org.jsoup.Connection), Document (org.jsoup.nodes.Document), Test (org.junit.Test)

Example 22 with Document

use of org.jsoup.nodes.Document in project jsoup by jhy.

the class UrlConnectTest method ignoresContentTypeIfSoConfigured.

@Test
public void ignoresContentTypeIfSoConfigured() throws IOException {
    Document doc = Jsoup.connect("https://jsoup.org/rez/osi_logo.png").ignoreContentType(true).get();
    // this will cause an ugly parse tree
    assertEquals("", doc.title());
}
Also used: Document (org.jsoup.nodes.Document), Test (org.junit.Test)

Example 23 with Document

use of org.jsoup.nodes.Document in project jsoup by jhy.

the class UrlConnectTest method sendsRequestBody.

@Test
public void sendsRequestBody() throws IOException {
    final String body = "{key:value}";
    Document doc = Jsoup.connect(echoURL).requestBody(body).header("Content-Type", "text/plain").userAgent(browserUa).post();
    assertEquals("POST", ihVal("REQUEST_METHOD", doc));
    assertEquals("text/plain", ihVal("CONTENT_TYPE", doc));
    assertEquals(body, doc.select("th:contains(POSTDATA) ~ td").text());
}
Also used: Document (org.jsoup.nodes.Document), Test (org.junit.Test)

Example 24 with Document

use of org.jsoup.nodes.Document in project jsoup by jhy.

the class UrlConnectTest method handlesUnescapedRedirects.

@Test
public void handlesUnescapedRedirects() throws IOException {
    // Location URLs should be URL-safe (ASCII) but often are not, so we have to guess the encoding.
    // In this case the Location header is UTF-8, although the spec defines header values as ISO-8859-1, so: detect, convert, encode.
    String url = "http://direct.infohound.net/tools/302-utf.pl";
    String urlEscaped = "http://direct.infohound.net/tools/test%F0%9F%92%A9.html";
    Connection.Response res = Jsoup.connect(url).execute();
    Document doc = res.parse();
    // JUnit's assertEquals takes (expected, actual)
    assertEquals("💩!", doc.body().text());
    assertEquals(urlEscaped, doc.location());
    Connection.Response res2 = Jsoup.connect(url).followRedirects(false).execute();
    assertEquals("/tools/test💩.html", res2.header("Location"));
    // if we didn't notice it was UTF-8, the Location value would be read as ISO-8859-1 and the emoji would show up as mojibake
}
Also used: Connection (org.jsoup.Connection), Document (org.jsoup.nodes.Document), Test (org.junit.Test)
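
The "detect, convert, encode" steps named in the comments of Example 24 can be sketched with the JDK alone. This is an illustration under assumptions, not jsoup's actual implementation: the class name LocationHeaderSketch and its methods are hypothetical, and the detection rule (strictly re-decoding the ISO-8859-1 bytes as UTF-8 and falling back on failure) is one simple heuristic among several possible.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of jsoup.
class LocationHeaderSketch {
    private static final String HEX = "0123456789ABCDEF";

    // Detect + convert: if the header was decoded as ISO-8859-1 but its bytes
    // form valid UTF-8, return the UTF-8 reading; otherwise return it unchanged.
    static String redecodeIfUtf8(String header) {
        for (int i = 0; i < header.length(); i++) {
            if (header.charAt(i) > 0xFF) return header; // not a single-byte decoding
        }
        byte[] raw = header.getBytes(StandardCharsets.ISO_8859_1);
        CharsetDecoder strict = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return strict.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            return header; // bytes are not valid UTF-8, keep the ISO-8859-1 reading
        }
    }

    // Encode: percent-encode every non-ASCII byte of the UTF-8 form.
    static String percentEncodeNonAscii(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80) {
                sb.append((char) v);
            } else {
                sb.append('%').append(HEX.charAt(v >> 4)).append(HEX.charAt(v & 0x0F));
            }
        }
        return sb.toString();
    }

    static String normalizeLocation(String header) {
        return percentEncodeNonAscii(redecodeIfUtf8(header));
    }

    public static void main(String[] args) {
        // The four chars below are the UTF-8 bytes F0 9F 92 A9 misread as ISO-8859-1.
        String misread = "/tools/test\u00F0\u009F\u0092\u00A9.html";
        System.out.println(normalizeLocation(misread)); // prints /tools/test%F0%9F%92%A9.html
    }
}
```

The output path matches the path part of urlEscaped in the test. The sketch leaves ASCII characters untouched, so it does not escape spaces or other reserved characters.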

Example 25 with Document

use of org.jsoup.nodes.Document in project jsoup by jhy.

the class UrlConnectTest method sendsRequestBodyWithUrlParams.

@Test
public void sendsRequestBodyWithUrlParams() throws IOException {
    final String body = "{key:value}";
    // todo - if user sets content-type, we should append postcharset
    Document doc = Jsoup.connect(echoURL).requestBody(body).data("uname", "Jsoup", "uname", "Jonathan", "百", "度一下").header("Content-Type", "text/plain").userAgent(browserUa).post();
    assertEquals("POST", ihVal("REQUEST_METHOD", doc));
    assertEquals("uname=Jsoup&uname=Jonathan&%E7%99%BE=%E5%BA%A6%E4%B8%80%E4%B8%8B", ihVal("QUERY_STRING", doc));
    assertEquals(body, ihVal("POSTDATA", doc));
}
Also used: Document (org.jsoup.nodes.Document), Test (org.junit.Test)
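
The QUERY_STRING expected in Example 25 is what you get by form-encoding each key and value as UTF-8 and joining the pairs with "&". A minimal JDK-only sketch of that encoding follows, assuming nothing about how jsoup builds the string internally; QueryStringSketch and encodeQuery are hypothetical names introduced for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of jsoup.
class QueryStringSketch {
    // Builds key=value pairs joined by '&' from alternating keys and values.
    // Repeated keys are kept in order, as in the test's two "uname" entries.
    static String encodeQuery(String... keyvals) {
        if (keyvals.length % 2 != 0) {
            throw new IllegalArgumentException("Expected an even number of key/value strings");
        }
        StringBuilder sb = new StringBuilder();
        try {
            for (int i = 0; i < keyvals.length; i += 2) {
                if (i > 0) sb.append('&');
                sb.append(URLEncoder.encode(keyvals[i], "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(keyvals[i + 1], "UTF-8"));
            }
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u767E" is 百 and "\u5EA6\u4E00\u4E0B" is 度一下, escaped to avoid source-encoding issues.
        System.out.println(encodeQuery("uname", "Jsoup", "uname", "Jonathan", "\u767E", "\u5EA6\u4E00\u4E0B"));
        // prints uname=Jsoup&uname=Jonathan&%E7%99%BE=%E5%BA%A6%E4%B8%80%E4%B8%8B
    }
}
```

Note that URLEncoder applies application/x-www-form-urlencoded rules, so a space becomes "+" rather than "%20"; that makes no difference for the values used here.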

Aggregations

Document (org.jsoup.nodes.Document): 391
Test (org.junit.Test): 194
Element (org.jsoup.nodes.Element): 153
IOException (java.io.IOException): 100
File (java.io.File): 81
Elements (org.jsoup.select.Elements): 70
ElementHandlerImpl (org.asqatasun.ruleimplementation.ElementHandlerImpl): 51
Connection (org.jsoup.Connection): 37
ArrayList (java.util.ArrayList): 36
URL (java.net.URL): 24
HashMap (java.util.HashMap): 16
InputStream (java.io.InputStream): 13
List (java.util.List): 9
MalformedURLException (java.net.MalformedURLException): 8
Matcher (java.util.regex.Matcher): 7
Logger (org.slf4j.Logger): 7
Pattern (java.util.regex.Pattern): 6
HttpGet (org.apache.http.client.methods.HttpGet): 6
Jsoup (org.jsoup.Jsoup): 6
LoggerFactory (org.slf4j.LoggerFactory): 6