Example 1 with ObjectInstantiationException

Use of com.gargoylesoftware.htmlunit.ObjectInstantiationException in project htmlunit by HtmlUnit.

The class HtmlUnitNekoHTMLErrorHandler, method parse.

/**
 * Parses the WebResponse into an object tree representation.
 *
 * @param webResponse the response data
 * @param page the HtmlPage to add the nodes
 * @param xhtml if true use the XHtml parser
 * @param createdByJavascript if true the (script) tag was created by javascript
 * @throws IOException if there is an IO error
 */
@Override
public void parse(final WebResponse webResponse, final HtmlPage page, final boolean xhtml, final boolean createdByJavascript) throws IOException {
    final URL url = webResponse.getWebRequest().getUrl();
    final HtmlUnitNekoDOMBuilder domBuilder = new HtmlUnitNekoDOMBuilder(this, page, url, null, createdByJavascript);
    Charset charset = webResponse.getContentCharsetOrNull();
    try {
        if (charset == null) {
            charset = StandardCharsets.ISO_8859_1;
        } else {
            domBuilder.setFeature(HTMLScanner.IGNORE_SPECIFIED_CHARSET, true);
        }
        // xml content is different
        if (xhtml) {
            domBuilder.setFeature(HTMLScanner.ALLOW_SELFCLOSING_TAGS, true);
            domBuilder.setFeature(HTMLScanner.SCRIPT_STRIP_CDATA_DELIMS, true);
            domBuilder.setFeature(HTMLScanner.STYLE_STRIP_CDATA_DELIMS, true);
        }
    } catch (final Exception e) {
        throw new ObjectInstantiationException("Error setting HTML parser feature", e);
    }
    try (InputStream content = webResponse.getContentAsStream()) {
        final String encoding = charset.name();
        final XMLInputSource in = new XMLInputSource(null, url.toString(), null, content, encoding);
        page.registerParsingStart();
        try {
            domBuilder.parse(in);
        } catch (final XNIException e) {
            // extract enclosed exception
            final Throwable origin = extractNestedException(e);
            throw new RuntimeException("Failed parsing content from " + url, origin);
        }
    } finally {
        page.registerParsingEnd();
    }
}
Also used: ObjectInstantiationException (com.gargoylesoftware.htmlunit.ObjectInstantiationException), XMLInputSource (org.apache.xerces.xni.parser.XMLInputSource), InputStream (java.io.InputStream), Charset (java.nio.charset.Charset), URL (java.net.URL), XNIException (org.apache.xerces.xni.XNIException), IOException (java.io.IOException), InvocationTargetException (java.lang.reflect.InvocationTargetException), SAXException (org.xml.sax.SAXException), XMLParseException (org.apache.xerces.xni.parser.XMLParseException)
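
The method above does two things that can be shown without HtmlUnit on the classpath: it falls back to ISO-8859-1 when the response declares no charset, and it converts any checked failure from the feature setters into an unchecked exception. The following is a minimal sketch of that pattern using only the JDK. The names ParseSketch, ConfigurationException, FeatureSink, and the feature strings are hypothetical stand-ins, not HtmlUnit or NekoHTML API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ParseSketch {

    /** Hypothetical stand-in for ObjectInstantiationException (unchecked). */
    static class ConfigurationException extends RuntimeException {
        ConfigurationException(final String message, final Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for a parser whose feature setters throw checked exceptions. */
    interface FeatureSink {
        void setFeature(String name, boolean value) throws Exception;
    }

    /**
     * Picks the charset to decode with and configures the sink.
     * Any checked failure is rethrown as an unchecked ConfigurationException,
     * keeping the original exception as the cause.
     */
    static Charset configure(final Charset declared, final boolean xhtml, final FeatureSink sink) {
        Charset charset = declared;
        try {
            if (charset == null) {
                // No charset declared: fall back to ISO-8859-1, as in the method above.
                charset = StandardCharsets.ISO_8859_1;
            } else {
                // Feature name is illustrative only.
                sink.setFeature("ignore-specified-charset", true);
            }
            if (xhtml) {
                sink.setFeature("allow-selfclosing-tags", true);
            }
        } catch (final Exception e) {
            throw new ConfigurationException("Error setting parser feature", e);
        }
        return charset;
    }

    /** Decodes the bytes with the chosen charset; the stream is closed by try-with-resources. */
    static String decode(final byte[] data, final Charset charset) throws IOException {
        try (InputStream content = new ByteArrayInputStream(data)) {
            return new String(content.readAllBytes(), charset);
        }
    }

    public static void main(final String[] args) throws IOException {
        final Charset charset = configure(null, true, (name, value) -> { });
        System.out.println(charset.name());
        System.out.println(decode("<p>hi</p>".getBytes(StandardCharsets.ISO_8859_1), charset));
    }
}
```

Wrapping in an unchecked exception keeps the method signature limited to IOException, while the original failure stays reachable through getCause().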

Aggregations

ObjectInstantiationException (com.gargoylesoftware.htmlunit.ObjectInstantiationException): 1
IOException (java.io.IOException): 1
InputStream (java.io.InputStream): 1
InvocationTargetException (java.lang.reflect.InvocationTargetException): 1
URL (java.net.URL): 1
Charset (java.nio.charset.Charset): 1
XNIException (org.apache.xerces.xni.XNIException): 1
XMLInputSource (org.apache.xerces.xni.parser.XMLInputSource): 1
XMLParseException (org.apache.xerces.xni.parser.XMLParseException): 1
SAXException (org.xml.sax.SAXException): 1