Example 1 with Node

use of org.jsoup.nodes.Node in project jsoup by jhy.

the class ElementsTest method traverse.

@Test
public void traverse() {
    Document doc = Jsoup.parse("<div><p>Hello</p></div><div>There</div>");
    final StringBuilder accum = new StringBuilder();
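    // Visit every node under each selected <div>, recording entry and exit.
    doc.select("div").traverse(new NodeVisitor() {

        @Override
        public void head(Node node, int depth) {
            // Called when a node is first reached, before its children.
            accum.append("<" + node.nodeName() + ">");
        }

        @Override
        public void tail(Node node, int depth) {
            // Called after all of the node's children have been visited.
            accum.append("</" + node.nodeName() + ">");
        }
    });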
    assertEquals("<div><p><#text></#text></p></div><div><#text></#text></div>", accum.toString());
}
Also used : Node(org.jsoup.nodes.Node) Document(org.jsoup.nodes.Document) Test(org.junit.Test)

Example 2 with Node

use of org.jsoup.nodes.Node in project jsoup by jhy.

the class NodeTraversor method traverse.

/**
 * Start a depth-first traverse of the root and all of its descendants.
 * @param root the root node to start traversing from.
 */
public void traverse(Node root) {
    Node node = root;
    int depth = 0;
    while (node != null) {
        visitor.head(node, depth);
        if (node.childNodeSize() > 0) {
            node = node.childNode(0);
            depth++;
        } else {
            while (node.nextSibling() == null && depth > 0) {
                visitor.tail(node, depth);
                node = node.parentNode();
                depth--;
            }
            visitor.tail(node, depth);
            if (node == root)
                break;
            node = node.nextSibling();
        }
    }
}
Also used : Node(org.jsoup.nodes.Node)
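
The iterative walk above can be tried without jsoup on the classpath. The following is a minimal sketch, assuming a hypothetical `SketchNode` class (not part of jsoup) that offers only a parent link, a child list, and a `nextSibling()` lookup. It mirrors the same head/tail ordering and writes the markers into a string instead of calling a visitor.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a DOM node; this is NOT the jsoup Node class.
class SketchNode {
    final String name;
    final List<SketchNode> children = new ArrayList<>();
    SketchNode parent;

    SketchNode(String name) {
        this.name = name;
    }

    // Appends a child and returns this node so calls can be chained.
    SketchNode add(SketchNode child) {
        child.parent = this;
        children.add(child);
        return this;
    }

    // Next sibling under the same parent, or null for the last child or the root.
    SketchNode nextSibling() {
        if (parent == null) return null;
        int i = parent.children.indexOf(this);
        return i + 1 < parent.children.size() ? parent.children.get(i + 1) : null;
    }
}

class TraversalSketch {
    // Iterative depth-first walk: writes an opening marker when a node is
    // entered (head) and a closing marker once its subtree is done (tail).
    static String render(SketchNode root) {
        StringBuilder out = new StringBuilder();
        SketchNode node = root;
        int depth = 0;
        while (node != null) {
            out.append('<').append(node.name).append('>');           // head
            if (!node.children.isEmpty()) {
                node = node.children.get(0);                          // descend
                depth++;
            } else {
                // Climb back up while there is no sibling left at this level.
                while (node.nextSibling() == null && depth > 0) {
                    out.append("</").append(node.name).append('>');  // tail
                    node = node.parent;
                    depth--;
                }
                out.append("</").append(node.name).append('>');      // tail
                if (node == root) break;
                node = node.nextSibling();
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        SketchNode root = new SketchNode("div")
                .add(new SketchNode("p").add(new SketchNode("#text")))
                .add(new SketchNode("span"));
        // prints <div><p><#text></#text></p><span></span></div>
        System.out.println(render(root));
    }
}
```

Because the loop tracks `depth` and stops once it is back at `root`, it never walks into the root's own siblings, and it needs no recursion or explicit stack.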

Example 3 with Node

use of org.jsoup.nodes.Node in project sppanblog4springboot by whoismy8023.

the class HtmlFilter method truncateHTML.

/**
 * Generates a preview using Jsoup.
 *
 * @param source the element to be filtered
 * @param dest   the element that receives the filtered result
 * @param len    the number of characters to keep when truncating
 *               <p>
 *               Document dirtyDocument = Jsoup.parse(sb.toString());<br />
 *               Element source = dirtyDocument.body();<br />
 *               Document clean = Document.createShell(dirtyDocument.baseUri());<br />
 *               Element dest = clean.body();<br />
 *               int len = 6;<br />
 *               truncateHTML(source,dest,len);<br />
 *               System.out.println(dest.html());<br />
 */
private static void truncateHTML(Element source, Element dest, int len) {
    List<Node> sourceChildren = source.childNodes();
    for (Node sourceChild : sourceChildren) {
        if (sourceChild instanceof Element) {
            Element sourceEl = (Element) sourceChild;
            Element destChild = createSafeElement(sourceEl);
            int txt = dest.text().length();
            if (txt >= len) {
                break;
            } else {
                len = len - txt;
            }
            dest.appendChild(destChild);
            truncateHTML(sourceEl, destChild, len);
        } else if (sourceChild instanceof TextNode) {
            int destLeng = dest.text().length();
            if (destLeng >= len) {
                break;
            }
            TextNode sourceText = (TextNode) sourceChild;
            int txtLeng = sourceText.getWholeText().length();
            if ((destLeng + txtLeng) > len) {
                int tmp = len - destLeng;
                String txt = sourceText.getWholeText().substring(0, tmp);
                TextNode destText = new TextNode(txt, sourceChild.baseUri());
                dest.appendChild(destText);
                break;
            } else {
                TextNode destText = new TextNode(sourceText.getWholeText(), sourceChild.baseUri());
                dest.appendChild(destText);
            }
        }
    }
}
Also used : Node(org.jsoup.nodes.Node) TextNode(org.jsoup.nodes.TextNode) Element(org.jsoup.nodes.Element)
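
The idea of copying nested markup until a character limit is reached can also be sketched without jsoup. The version below is not the method above: it is a simplified variant, assuming a hypothetical `TruncNode` tree, that passes the remaining character budget back to the caller instead of re-measuring the destination text on every step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree: a node is either a text node (tag == null)
// or an element node (text == null). Not a jsoup type.
class TruncNode {
    final String tag;
    final String text;
    final List<TruncNode> children = new ArrayList<>();

    private TruncNode(String tag, String text) {
        this.tag = tag;
        this.text = text;
    }

    static TruncNode element(String tag) {
        return new TruncNode(tag, null);
    }

    static TruncNode text(String text) {
        return new TruncNode(null, text);
    }

    // Appends a child and returns this node so calls can be chained.
    TruncNode add(TruncNode child) {
        children.add(child);
        return this;
    }

    // Serializes the subtree; text is written as-is, without escaping.
    String html() {
        if (tag == null) return text;
        StringBuilder sb = new StringBuilder("<" + tag + ">");
        for (TruncNode c : children) sb.append(c.html());
        return sb.append("</").append(tag).append('>').toString();
    }
}

class TruncateSketch {
    // Copies the children of source into dest until `budget` text characters
    // have been used, and returns the budget that is still left.
    static int truncate(TruncNode source, TruncNode dest, int budget) {
        for (TruncNode child : source.children) {
            if (budget <= 0) break;                      // limit reached
            if (child.tag == null) {
                // Text node: keep it whole, or cut it at the remaining budget.
                String kept = child.text.length() > budget
                        ? child.text.substring(0, budget)
                        : child.text;
                dest.add(TruncNode.text(kept));
                budget -= kept.length();
            } else {
                // Element node: copy the tag, then recurse into its children.
                TruncNode copy = TruncNode.element(child.tag);
                dest.add(copy);
                budget = truncate(child, copy, budget);
            }
        }
        return budget;
    }

    public static void main(String[] args) {
        TruncNode source = TruncNode.element("body")
                .add(TruncNode.element("p").add(TruncNode.text("Hello ")))
                .add(TruncNode.element("b").add(TruncNode.text("World")))
                .add(TruncNode.element("p").add(TruncNode.text("!!!")));
        TruncNode dest = TruncNode.element("body");
        truncate(source, dest, 8);
        // prints <body><p>Hello </p><b>Wo</b></body>
        System.out.println(dest.html());
    }
}
```

Returning the leftover budget keeps each recursive call independent of how much text its ancestors have already accumulated, which is the main difference from measuring `dest.text()` at each level.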

Example 4 with Node

use of org.jsoup.nodes.Node in project flow by vaadin.

the class TemplateParser method collectIncludeNodes.

private static List<TextNode> collectIncludeNodes(Element element) {
    List<TextNode> includeNodes = new ArrayList<>();
    new NodeTraversor(new NodeVisitor() {

        @Override
        public void head(Node node, int depth) {
            // no-op
        }

        @Override
        public void tail(Node node, int depth) {
            if (node instanceof TextNode) {
                TextNode textNode = (TextNode) node;
                String text = textNode.getWholeText();
                if (text.contains(INCLUDE_PREFIX)) {
                    includeNodes.add(textNode);
                }
            }
        }
    }).traverse(element);
    return includeNodes;
}
Also used : TextNode(org.jsoup.nodes.TextNode) TemplateNode(com.vaadin.flow.template.angular.TemplateNode) Node(org.jsoup.nodes.Node) ArrayList(java.util.ArrayList) NodeTraversor(org.jsoup.select.NodeTraversor) NodeVisitor(org.jsoup.select.NodeVisitor)

Example 5 with Node

use of org.jsoup.nodes.Node in project flow by vaadin.

the class DefaultTemplateParser method removeCommentsRecursively.

private static void removeCommentsRecursively(Node node) {
    int i = 0;
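    while (i < node.childNodes().size()) {
        Node child = node.childNode(i);
        if (child instanceof Comment) {
            // Removing shifts the following siblings left, so i stays put.
            child.remove();
        } else {
            // Keep this child, clean its subtree, then move on.
            removeCommentsRecursively(child);
            i++;
        }
    }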
    }
}
Also used : Comment(org.jsoup.nodes.Comment) Node(org.jsoup.nodes.Node)
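
The index handling in the loop above is easy to get wrong, so here is a minimal sketch of the same remove-while-iterating pattern on a plain list-backed tree. `PruneNode` and its `comment` flag are hypothetical stand-ins for jsoup's `Node` and `Comment`, used only so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree; the comment flag stands in for "instanceof Comment".
class PruneNode {
    final String label;
    final boolean comment;
    final List<PruneNode> children = new ArrayList<>();

    PruneNode(String label, boolean comment) {
        this.label = label;
        this.comment = comment;
    }

    // Appends a child and returns this node so calls can be chained.
    PruneNode add(PruneNode child) {
        children.add(child);
        return this;
    }
}

class PruneSketch {
    // Removes every comment node in the subtree below `node`.
    static void removeComments(PruneNode node) {
        int i = 0;
        while (i < node.children.size()) {
            PruneNode child = node.children.get(i);
            if (child.comment) {
                // Removal by index shifts later children left,
                // so i already points at the next candidate.
                node.children.remove(i);
            } else {
                removeComments(child);
                i++;
            }
        }
    }

    // Space-separated labels in document order, for inspection.
    static String labels(PruneNode node) {
        StringBuilder sb = new StringBuilder(node.label);
        for (PruneNode c : node.children) sb.append(' ').append(labels(c));
        return sb.toString();
    }

    public static void main(String[] args) {
        PruneNode root = new PruneNode("body", false)
                .add(new PruneNode("c1", true))
                .add(new PruneNode("c2", true))
                .add(new PruneNode("p", false)
                        .add(new PruneNode("c3", true))
                        .add(new PruneNode("text", false)))
                .add(new PruneNode("c4", true));
        removeComments(root);
        // prints body p text
        System.out.println(labels(root));
    }
}
```

Incrementing `i` after a removal would skip the element that slid into the freed slot, which is why two adjacent comments (`c1`, `c2`) make a useful test case.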

Aggregations

Node (org.jsoup.nodes.Node) 75
TextNode (org.jsoup.nodes.TextNode) 52
Element (org.jsoup.nodes.Element) 48
Document (org.jsoup.nodes.Document) 29
ArrayList (java.util.ArrayList) 19
Elements (org.jsoup.select.Elements) 13
Test (org.junit.jupiter.api.Test) 8
IOException (java.io.IOException) 7
Copy (de.geeksfactory.opacclient.objects.Copy) 5
DetailedItem (de.geeksfactory.opacclient.objects.DetailedItem) 5
HashMap (java.util.HashMap) 5
DateTimeFormatter (org.joda.time.format.DateTimeFormatter) 5
JSONException (org.json.JSONException) 5
NotReachableException (de.geeksfactory.opacclient.networking.NotReachableException) 4
Detail (de.geeksfactory.opacclient.objects.Detail) 4
UnsupportedEncodingException (java.io.UnsupportedEncodingException) 4
URI (java.net.URI) 4
Matcher (java.util.regex.Matcher) 4
NameValuePair (org.apache.http.NameValuePair) 4
BasicNameValuePair (org.apache.http.message.BasicNameValuePair) 4