
Example 1 with SipNodes

use of com.virjar.sipsoup.model.SipNodes in project vscrawler by virjar.

the class AbstractSelectable method xpath.

/**
 * Extracts nodes with an XPath expression.
 *
 * @param xpathEvaluator the compiled XPath expression model
 * @return the result of the XPath extraction
 */
public XpathNode xpath(XpathEvaluator xpathEvaluator) {
    SipNodes sipNodes = xpathEvaluator.evaluate(covert(XpathNode.class).createOrGetModel());
    // TODO rawText
    XpathNode xpathNode = new XpathNode(getBaseUrl(), getRawText());
    xpathNode.setModel(sipNodes);
    return xpathNode;
}
Also used : XpathNode(com.virjar.vscrawler.core.selector.combine.selectables.XpathNode) SipNodes(com.virjar.sipsoup.model.SipNodes)

Example 2 with SipNodes

use of com.virjar.sipsoup.model.SipNodes in project vscrawler by virjar.

the class AbstractSelectable method css.

public XpathNode css(String css) {
    XpathNode xpathNode = new XpathNode(getBaseUrl(), (String) null);
    SipNodes newModels = new SipNodes();
    for (SIPNode sipNode : covert(XpathNode.class).createOrGetModel()) {
        if (sipNode.isText()) {
            continue;
        }
        for (Element el : sipNode.getElement().select(css)) {
            newModels.add(SIPNode.e(el));
        }
    }
    xpathNode.setModel(newModels);
    return xpathNode;
}
Also used : Element(org.jsoup.nodes.Element) XpathNode(com.virjar.vscrawler.core.selector.combine.selectables.XpathNode) SipNodes(com.virjar.sipsoup.model.SipNodes) SIPNode(com.virjar.sipsoup.model.SIPNode)

Example 3 with SipNodes

use of com.virjar.sipsoup.model.SipNodes in project vscrawler by virjar.

the class XpathNode method createOrGetModel.

@Override
public SipNodes createOrGetModel() {
    if (model == null) {
        try {
            Document document = Jsoup.parse(getRawText(), getBaseUrl());
            if (document == null) {
                throw new RuntimeException();
            }
            model = new SipNodes(SIPNode.e(document));
        } catch (Exception e) {
            model = new SipNodes(SIPNode.t(getRawText()));
        }
    }
    return model;
}
Also used : SipNodes(com.virjar.sipsoup.model.SipNodes) Document(org.jsoup.nodes.Document)
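
The create-or-get pattern in Example 3 (build the model once, cache it, and fall back to a plain-text node if parsing fails) can be sketched without jsoup or sipsoup. Everything below is hypothetical and JDK-only: Node stands in for SIPNode, LazyModel for XpathNode, and strictParse is a deliberately strict toy parser, not a substitute for Jsoup.parse.

```java
import java.util.function.Function;

// Hypothetical demo class; none of these names come from vscrawler or sipsoup.
final class LazyModelDemo {

    /** Stand-in for SIPNode: either a parsed element or a plain-text node. */
    static final class Node {
        final boolean text;
        final String value;

        private Node(boolean text, String value) {
            this.text = text;
            this.value = value;
        }

        static Node element(String markup) {
            return new Node(false, markup);
        }

        static Node text(String raw) {
            return new Node(true, raw);
        }
    }

    /** Builds the model on first access and caches it afterwards. */
    static final class LazyModel {
        private final String rawText;
        private final Function<String, Node> parser;
        private Node model;     // stays null until the first call
        int parseAttempts = 0;  // exposed only so the demo can show the caching

        LazyModel(String rawText, Function<String, Node> parser) {
            this.rawText = rawText;
            this.parser = parser;
        }

        Node createOrGetModel() {
            if (model == null) {
                parseAttempts++;
                try {
                    model = parser.apply(rawText);
                } catch (RuntimeException e) {
                    // Parsing failed: keep the raw text as a single text node.
                    model = Node.text(rawText);
                }
            }
            return model;
        }
    }

    /** Toy parser (an assumption): accepts only input that starts with '<'. */
    static Node strictParse(String raw) {
        if (raw == null || !raw.trim().startsWith("<")) {
            throw new IllegalArgumentException("not markup");
        }
        return Node.element(raw.trim());
    }

    public static void main(String[] args) {
        LazyModel html = new LazyModel("<p>hi</p>", LazyModelDemo::strictParse);
        LazyModel plain = new LazyModel("just text", LazyModelDemo::strictParse);
        System.out.println(html.createOrGetModel().text);   // false
        System.out.println(plain.createOrGetModel().text);  // true
        html.createOrGetModel();                             // served from the cache
        System.out.println(html.parseAttempts);              // 1
    }
}
```

Like the original method, this sketch is not thread-safe: two threads calling createOrGetModel at the same time could both run the parser.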

Example 4 with SipNodes

use of com.virjar.sipsoup.model.SipNodes in project vscrawler by virjar.

the class Converters method registerXpath.

private static void registerXpath() {
    register(XpathNode.class, XpathNode.class, new NodeConvert<XpathNode, XpathNode>() {

        @Override
        public XpathNode convert(XpathNode from) {
            return from;
        }
    });
    register(JsonNode.class, XpathNode.class, new NodeConvert<JsonNode, XpathNode>() {

        @Override
        public XpathNode convert(JsonNode from) {
            throw new UnsupportedOperationException("cannot convert JSON to XPath");
        }
    });
    register(RawNode.class, XpathNode.class, new NodeConvert<RawNode, XpathNode>() {

        @Override
        public XpathNode convert(RawNode from) {
            return new XpathNode(from.getBaseUrl(), from.getRawText());
        }
    });
    register(StringNode.class, XpathNode.class, new NodeConvert<StringNode, XpathNode>() {

        @Override
        public XpathNode convert(final StringNode from) {
            XpathNode ret = new XpathNode(from.getBaseUrl(), (String) null);
            ret.setModel(new SipNodes(Lists.newLinkedList(Iterables.transform(from.createOrGetModel(), new Function<String, SIPNode>() {

                @Override
                public SIPNode apply(String input) {
                    try {
                        Document document = Jsoup.parse(input, from.getBaseUrl());
                        if (document != null) {
                            return SIPNode.e(document);
                        }
                    } catch (Exception e) {
                        // parsing failed; fall through and wrap the input as a text node
                    }
                    return SIPNode.t(input);
                }
            }))));
            return ret;
        }
    });
}
Also used : XpathNode(com.virjar.vscrawler.core.selector.combine.selectables.XpathNode) JsonNode(com.virjar.vscrawler.core.selector.combine.selectables.JsonNode) Document(org.jsoup.nodes.Document) StringNode(com.virjar.vscrawler.core.selector.combine.selectables.StringNode) SipNodes(com.virjar.sipsoup.model.SipNodes) RawNode(com.virjar.vscrawler.core.selector.combine.selectables.RawNode) SIPNode(com.virjar.sipsoup.model.SIPNode)
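
The source of the register(from, to, converter) method used in Example 4 is not shown here. One plausible way to back such a call is a map keyed by source type and then by target type. The sketch below is an assumption, not vscrawler's implementation: ConverterRegistryDemo and its convert method are hypothetical, and java.util.function.Function replaces NodeConvert.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry; not the actual vscrawler Converters class.
final class ConverterRegistryDemo {

    // source type -> (target type -> converter)
    private static final Map<Class<?>, Map<Class<?>, Function<Object, Object>>> REGISTRY =
            new HashMap<>();

    /** Registers a converter for one (from, to) type pair, replacing any earlier one. */
    static <F, T> void register(Class<F> from, Class<T> to, Function<F, T> converter) {
        REGISTRY.computeIfAbsent(from, k -> new HashMap<>())
                .put(to, value -> converter.apply(from.cast(value)));
    }

    /** Looks up the converter by the value's exact runtime class and applies it. */
    static <T> T convert(Object value, Class<T> to) {
        Map<Class<?>, Function<Object, Object>> byTarget = REGISTRY.get(value.getClass());
        Function<Object, Object> converter = byTarget == null ? null : byTarget.get(to);
        if (converter == null) {
            throw new UnsupportedOperationException(
                    "no converter from " + value.getClass().getName() + " to " + to.getName());
        }
        return to.cast(converter.apply(value));
    }

    public static void main(String[] args) {
        register(Integer.class, Integer.class, x -> x);           // identity, like XpathNode -> XpathNode
        register(String.class, Integer.class, Integer::valueOf);  // a real conversion
        System.out.println(convert("42", Integer.class));         // prints 42
    }
}
```

The lookup uses the exact runtime class, so a subclass of a registered source type would not match. Whether the real registry walks the class hierarchy is not visible in the examples here.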

Example 5 with SipNodes

use of com.virjar.sipsoup.model.SipNodes in project vscrawler by virjar.

the class XpathNode method toMultiSelectable.

@Override
public List<AbstractSelectable> toMultiSelectable() {
    SipNodes sipNodes = createOrGetModel();
    List<AbstractSelectable> ret = Lists.newLinkedList();
    for (final SIPNode sipNode : sipNodes) {
        XpathNode xpathNode;
        if (sipNode.isText()) {
            xpathNode = new XpathNode(getBaseUrl(), sipNode.getTextVal());
        } else {
            xpathNode = new XpathNode(getBaseUrl(), new RawTextStringFactory() {

                @Override
                public String rawText() {
                    return sipNode.toString();
                }
            });
        }
        xpathNode.setModel(new SipNodes(sipNode));
        ret.add(xpathNode);
    }
    return ret;
}
Also used : RawTextStringFactory(com.virjar.vscrawler.core.selector.combine.RawTextStringFactory) SipNodes(com.virjar.sipsoup.model.SipNodes) AbstractSelectable(com.virjar.vscrawler.core.selector.combine.AbstractSelectable) SIPNode(com.virjar.sipsoup.model.SIPNode)

Aggregations

SipNodes (com.virjar.sipsoup.model.SipNodes) 5
SIPNode (com.virjar.sipsoup.model.SIPNode) 3
XpathNode (com.virjar.vscrawler.core.selector.combine.selectables.XpathNode) 3
Document (org.jsoup.nodes.Document) 2
AbstractSelectable (com.virjar.vscrawler.core.selector.combine.AbstractSelectable) 1
RawTextStringFactory (com.virjar.vscrawler.core.selector.combine.RawTextStringFactory) 1
JsonNode (com.virjar.vscrawler.core.selector.combine.selectables.JsonNode) 1
RawNode (com.virjar.vscrawler.core.selector.combine.selectables.RawNode) 1
StringNode (com.virjar.vscrawler.core.selector.combine.selectables.StringNode) 1
Element (org.jsoup.nodes.Element) 1