
Example 56 with Node

use of org.jsoup.nodes.Node in project jsoup by jhy.

the class TraversorTest method filterVisit.

// Note: NodeTraversor.traverse(new NodeVisitor) is tested in
// ElementsTest#traverse()
@Test
public void filterVisit() {
    Document doc = Jsoup.parse("<div><p>Hello</p></div><div>There</div>");
    final StringBuilder accum = new StringBuilder();
    NodeTraversor.filter(new NodeFilter() {

        @Override
        public FilterResult head(Node node, int depth) {
            accum.append("<").append(node.nodeName()).append(">");
            return FilterResult.CONTINUE;
        }

        @Override
        public FilterResult tail(Node node, int depth) {
            accum.append("</").append(node.nodeName()).append(">");
            return FilterResult.CONTINUE;
        }
    }, doc.select("div"));
    assertEquals("<div><p><#text></#text></p></div><div><#text></#text></div>", accum.toString());
}
Also used : Node(org.jsoup.nodes.Node) TextNode(org.jsoup.nodes.TextNode) Document(org.jsoup.nodes.Document) Test(org.junit.jupiter.api.Test)

Example 57 with Node

use of org.jsoup.nodes.Node in project jsoup by jhy.

the class TraversorTest method filterSkipChildren.

@Test
public void filterSkipChildren() {
    Document doc = Jsoup.parse("<div><p>Hello</p></div><div>There</div>");
    final StringBuilder accum = new StringBuilder();
    NodeTraversor.filter(new NodeFilter() {

        @Override
        public FilterResult head(Node node, int depth) {
            accum.append("<").append(node.nodeName()).append(">");
            // OMIT contents of p:
            return ("p".equals(node.nodeName())) ? FilterResult.SKIP_CHILDREN : FilterResult.CONTINUE;
        }

        @Override
        public FilterResult tail(Node node, int depth) {
            accum.append("</").append(node.nodeName()).append(">");
            return FilterResult.CONTINUE;
        }
    }, doc.select("div"));
    assertEquals("<div><p></p></div><div><#text></#text></div>", accum.toString());
}
Also used : Node(org.jsoup.nodes.Node) TextNode(org.jsoup.nodes.TextNode) Document(org.jsoup.nodes.Document) Test(org.junit.jupiter.api.Test)

Example 58 with Node

use of org.jsoup.nodes.Node in project jsoup by jhy.

the class Cleaner method isValidBodyHtml.

public boolean isValidBodyHtml(String bodyHtml) {
    Document clean = Document.createShell("");
    Document dirty = Document.createShell("");
    ParseErrorList errorList = ParseErrorList.tracking(1);
    List<Node> nodes = Parser.parseFragment(bodyHtml, dirty.body(), "", errorList);
    dirty.body().insertChildren(0, nodes);
    int numDiscarded = copySafeNodes(dirty.body(), clean.body());
    return numDiscarded == 0 && errorList.isEmpty();
}
Also used : TextNode(org.jsoup.nodes.TextNode) Node(org.jsoup.nodes.Node) DataNode(org.jsoup.nodes.DataNode) ParseErrorList(org.jsoup.parser.ParseErrorList) Document(org.jsoup.nodes.Document)

Example 59 with Node

use of org.jsoup.nodes.Node in project jsoup by jhy.

the class NodeTraversor method filter.

/**
 * Start a depth-first filtering of the root and all of its descendants.
 * @param filter the node filter whose head and tail callbacks decide how traversal proceeds.
 * @param root the root node point to traverse.
 * @return The filter result of the root node, or {@link FilterResult#STOP}.
 */
public static FilterResult filter(NodeFilter filter, Node root) {
    Node node = root;
    int depth = 0;
    while (node != null) {
        FilterResult result = filter.head(node, depth);
        if (result == FilterResult.STOP)
            return result;
        // Descend into child nodes:
        if (result == FilterResult.CONTINUE && node.childNodeSize() > 0) {
            node = node.childNode(0);
            ++depth;
            continue;
        }
        // No siblings, move upwards:
        while (true) {
            assert node != null; // depth > 0, so has parent
            if (!(node.nextSibling() == null && depth > 0))
                break;
            // 'tail' current node:
            if (result == FilterResult.CONTINUE || result == FilterResult.SKIP_CHILDREN) {
                result = filter.tail(node, depth);
                if (result == FilterResult.STOP)
                    return result;
            }
            // In case we need to remove it below.
            Node prev = node;
            node = node.parentNode();
            depth--;
            if (result == FilterResult.REMOVE)
                // Remove AFTER finding parent.
                prev.remove();
            // Parent was not pruned.
            result = FilterResult.CONTINUE;
        }
        // 'tail' current node, then proceed with siblings:
        if (result == FilterResult.CONTINUE || result == FilterResult.SKIP_CHILDREN) {
            result = filter.tail(node, depth);
            if (result == FilterResult.STOP)
                return result;
        }
        if (node == root)
            return result;
        // In case we need to remove it below.
        Node prev = node;
        node = node.nextSibling();
        if (result == FilterResult.REMOVE)
            // Remove AFTER finding sibling.
            prev.remove();
    }
    // Reached only when the loop ends with node == null, e.g. a null root.
    return FilterResult.CONTINUE;
}
Also used : Node(org.jsoup.nodes.Node) FilterResult(org.jsoup.select.NodeFilter.FilterResult)
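
The head/tail contract that the filter method above implements iteratively can be illustrated with a much smaller recursive sketch. This is not jsoup code: SimpleNode, Filter, Result, and FilterSketch are hypothetical stand-ins invented for illustration, and the REMOVE and SKIP_ENTIRELY results are left out. It only shows that head runs before a node's children, tail runs after them, and SKIP_CHILDREN suppresses the descent while still calling tail.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterSketch {
    // Simplified result set; jsoup's FilterResult has more values (assumption: omitted here).
    enum Result { CONTINUE, SKIP_CHILDREN, STOP }

    // Hypothetical minimal tree node standing in for org.jsoup.nodes.Node.
    static final class SimpleNode {
        final String name;
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String name, SimpleNode... kids) {
            this.name = name;
            for (SimpleNode kid : kids) children.add(kid);
        }
    }

    // Hypothetical counterpart of NodeFilter: one callback on entry, one on exit.
    interface Filter {
        Result head(SimpleNode node, int depth);
        Result tail(SimpleNode node, int depth);
    }

    // Recursive depth-first filtering: head, then children (unless skipped), then tail.
    static Result filter(Filter f, SimpleNode node, int depth) {
        Result r = f.head(node, depth);
        if (r == Result.STOP) return r;
        if (r == Result.CONTINUE) {
            for (SimpleNode child : node.children) {
                if (filter(f, child, depth + 1) == Result.STOP) return Result.STOP;
            }
        }
        return f.tail(node, depth);
    }

    // Records the visit order as pseudo-tags; skips the children of nodes named `skip`.
    static String trace(SimpleNode root, String skip) {
        StringBuilder accum = new StringBuilder();
        filter(new Filter() {
            @Override
            public Result head(SimpleNode n, int depth) {
                accum.append("<").append(n.name).append(">");
                return n.name.equals(skip) ? Result.SKIP_CHILDREN : Result.CONTINUE;
            }

            @Override
            public Result tail(SimpleNode n, int depth) {
                accum.append("</").append(n.name).append(">");
                return Result.CONTINUE;
            }
        }, root, 0);
        return accum.toString();
    }

    public static void main(String[] args) {
        SimpleNode root = new SimpleNode("div", new SimpleNode("p", new SimpleNode("#text")));
        System.out.println(trace(root, null)); // <div><p><#text></#text></p></div>
        System.out.println(trace(root, "p"));  // <div><p></p></div>
    }
}
```

The recursive form is easier to read but uses one stack frame per tree level; the iterative version in the listing avoids that by walking parent and sibling links.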

Example 60 with Node

use of org.jsoup.nodes.Node in project jsoup by jhy.

the class HtmlTreeBuilder method insert.

void insert(Token.Character characterToken) {
    final Node node;
    // will be doc if no current element; allows for whitespace to be inserted into the doc root object (not on the stack)
    Element el = currentElement();
    final String tagName = el.normalName();
    final String data = characterToken.getData();
    if (characterToken.isCData())
        node = new CDataNode(data);
    else if (isContentForTagData(tagName))
        node = new DataNode(data);
    else
        node = new TextNode(data);
    // doesn't use insertNode, because we don't foster these; and will always have a stack.
    el.appendChild(node);
}
Also used : DataNode(org.jsoup.nodes.DataNode) CDataNode(org.jsoup.nodes.CDataNode) TextNode(org.jsoup.nodes.TextNode) Node(org.jsoup.nodes.Node) Element(org.jsoup.nodes.Element) FormElement(org.jsoup.nodes.FormElement)

Aggregations

Node (org.jsoup.nodes.Node): 75
TextNode (org.jsoup.nodes.TextNode): 52
Element (org.jsoup.nodes.Element): 48
Document (org.jsoup.nodes.Document): 29
ArrayList (java.util.ArrayList): 19
Elements (org.jsoup.select.Elements): 13
Test (org.junit.jupiter.api.Test): 8
IOException (java.io.IOException): 7
Copy (de.geeksfactory.opacclient.objects.Copy): 5
DetailedItem (de.geeksfactory.opacclient.objects.DetailedItem): 5
HashMap (java.util.HashMap): 5
DateTimeFormatter (org.joda.time.format.DateTimeFormatter): 5
JSONException (org.json.JSONException): 5
NotReachableException (de.geeksfactory.opacclient.networking.NotReachableException): 4
Detail (de.geeksfactory.opacclient.objects.Detail): 4
UnsupportedEncodingException (java.io.UnsupportedEncodingException): 4
URI (java.net.URI): 4
Matcher (java.util.regex.Matcher): 4
NameValuePair (org.apache.http.NameValuePair): 4
BasicNameValuePair (org.apache.http.message.BasicNameValuePair): 4