Example 6 with StanfordCoreNLP

Use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project CoreNLP by stanfordnlp.

Class ChineseHcorefDemo, method main.

public static void main(String[] args) throws Exception {
    long startTime = System.currentTimeMillis();
    String text = "俄罗斯 航空 公司 一 名 官员 在 9号 说 , " + "米洛舍维奇 的 儿子 马可·米洛舍维奇 9号 早上 持 外交 护照 从 俄国 首都 莫斯科 搭机 飞往 中国 大陆 北京 , " + "可是 就 在 稍后 就 返回 莫斯科 。 " + "这 名 俄国 航空 公司 官员 说 马可 是 因为 护照 问题 而 在 北京 机场 被 中共 遣返 莫斯科 。 " + "北京 机场 方面 的 这 项 举动 清楚 显示 中共 有意 放弃 在 总统 大选 落败 的 前 南斯拉夫 总统 米洛舍维奇 , " + "因此 他 在 南斯拉夫 受到 民众 厌恶 的 儿子 马可 才 会 在 北京 机场 被 中共 当局 送回 莫斯科 。 " + "马可 持 外交 护照 能够 顺利 搭机 离开 莫斯科 , 但是 却 在 北京 受阻 , 可 算是 踢到 了 铁板 。 " + "可是 这 项 消息 和 先前 外界 谣传 中共 当局 准备 提供 米洛舍维奇 和 他 的 家人 安全 庇护所 有 着 很 大 的 出入 ," + " 一般 认为 在 去年 米洛舍维奇 挥兵 攻打 科索沃 境内 阿尔巴尼亚 一 分离主义 分子 的 时候 , " + "强力 反对 北约 组织 攻击 南斯拉夫 的 中共 , 会 全力 保护 米洛舍维奇 和 他 的 家人 及 亲信 。 " + "可是 从 9号 马可 被 送回 莫斯科 一 事 看 起来 , 中共 很 可能 会 放弃 米洛舍维奇 。";
    // Ignore the command-line arguments and always load the default Chinese coreference properties.
    args = new String[] { "-props", "edu/stanford/nlp/hcoref/properties/zh-coref-default.properties" };
    Annotation document = new Annotation(text);
    Properties props = StringUtils.argsToProperties(args);
    // Build the pipeline described by the properties file and run it over the document.
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    pipeline.annotate(document);
    // Print every coreference chain found in the document.
    System.out.println("---");
    System.out.println("coref chains");
    for (CorefChain cc : document.get(CorefCoreAnnotations.CorefChainAnnotation.class).values()) {
        System.out.println("\t" + cc);
    }
    // Print the mentions detected in each sentence.
    for (CoreMap sentence : document.get(CoreAnnotations.SentencesAnnotation.class)) {
        System.out.println("---");
        System.out.println("mentions");
        for (Mention m : sentence.get(CorefCoreAnnotations.CorefMentionsAnnotation.class)) {
            System.out.println("\t" + m);
        }
    }
    long endTime = System.currentTimeMillis();
    long time = (endTime - startTime) / 1000;
    System.out.println("Running time " + time / 60 + "min " + time % 60 + "s");
}
Also used : CorefChain(edu.stanford.nlp.coref.data.CorefChain) Mention(edu.stanford.nlp.coref.data.Mention) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) CorefCoreAnnotations(edu.stanford.nlp.coref.CorefCoreAnnotations) Properties(java.util.Properties) CoreMap(edu.stanford.nlp.util.CoreMap) Annotation(edu.stanford.nlp.pipeline.Annotation) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP)

Example 7 with StanfordCoreNLP

Use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project CoreNLP by stanfordnlp.

Class MentionExtractor, method loadStanfordProcessor.

/** Loads the Stanford processor, skipping annotators that are not needed. */
protected static StanfordCoreNLP loadStanfordProcessor(Properties props) {
    boolean replicateCoNLL = Boolean.parseBoolean(props.getProperty(Constants.REPLICATECONLL_PROP, "false"));
    // The caller's properties serve as defaults; only "annotators" is overridden below.
    Properties pipelineProps = new Properties(props);
    StringBuilder annoSb = new StringBuilder();
    if (!Constants.USE_GOLD_POS && !replicateCoNLL) {
        annoSb.append("pos, lemma");
    } else {
        annoSb.append("lemma");
    }
    if (Constants.USE_TRUECASE) {
        annoSb.append(", truecase");
    }
    if (!Constants.USE_GOLD_NE && !replicateCoNLL) {
        annoSb.append(", ner");
    }
    if (!Constants.USE_GOLD_PARSES && !replicateCoNLL) {
        annoSb.append(", parse");
    }
    String annoStr = annoSb.toString();
    SieveCoreferenceSystem.logger.info("MentionExtractor ignores specified annotators, using annotators=" + annoStr);
    pipelineProps.setProperty("annotators", annoStr);
    // The second argument (false) turns off enforcement of annotator requirements.
    return new StanfordCoreNLP(pipelineProps, false);
}
Also used : Properties(java.util.Properties) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP)
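
The method above assembles the annotator list by appending fragments that carry their own ", " separators. The following is a minimal sketch of the same assembly using java.util.StringJoiner and only the standard library. The class name, the helper buildAnnotators, and its boolean parameters are hypothetical stand-ins for the Constants flags and the replicateCoNLL property; this is an illustration, not the CoreNLP implementation.

```java
import java.util.Properties;
import java.util.StringJoiner;

class AnnotatorListSketch {

    // Hypothetical helper: the boolean parameters stand in for the
    // Constants.USE_GOLD_* / USE_TRUECASE flags and for replicateCoNLL.
    public static String buildAnnotators(boolean goldPos, boolean trueCase,
                                         boolean goldNe, boolean goldParses,
                                         boolean replicateCoNLL) {
        // StringJoiner inserts the separator, so no fragment needs a leading comma.
        StringJoiner annotators = new StringJoiner(", ");
        if (!goldPos && !replicateCoNLL) {
            annotators.add("pos");
        }
        annotators.add("lemma");
        if (trueCase) {
            annotators.add("truecase");
        }
        if (!goldNe && !replicateCoNLL) {
            annotators.add("ner");
        }
        if (!goldParses && !replicateCoNLL) {
            annotators.add("parse");
        }
        return annotators.toString();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit");

        // props becomes the defaults of pipelineProps; setting a key on
        // pipelineProps shadows the default without modifying props.
        Properties pipelineProps = new Properties(props);
        pipelineProps.setProperty("annotators",
                buildAnnotators(false, false, false, false, false));

        System.out.println(pipelineProps.getProperty("annotators")); // pos, lemma, ner, parse
        System.out.println(props.getProperty("annotators"));         // tokenize, ssplit
    }
}
```

Keeping the separator in one place means that reordering or removing a branch cannot leave a stray or missing comma in the resulting list.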

Example 8 with StanfordCoreNLP

Use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project CoreNLP by stanfordnlp.

Class RothCONLL04Reader, method main.

public static void main(String[] args) throws Exception {
    // A simple smoke test to check that the reader and the pipeline work end to end.
    Properties props = StringUtils.argsToProperties(args);
    RothCONLL04Reader reader = new RothCONLL04Reader();
    reader.setLoggerLevel(Level.INFO);
    reader.setProcessor(new StanfordCoreNLP(props));
    Annotation doc = reader.parse("/u/nlp/data/RothCONLL04/conll04.corp");
    System.out.println(AnnotationUtils.datasetToString(doc));
}
Also used : Properties(java.util.Properties) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP) Annotation(edu.stanford.nlp.pipeline.Annotation)

Example 9 with StanfordCoreNLP

Use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project CoreNLP by stanfordnlp.

Class OpenIEServlet, method init.

/**
   * Set the properties to the paths they appear at on the servlet.
   * See build.xml for where these paths get copied.
   * @throws ServletException Thrown by the implementation
   */
public void init() throws ServletException {
    Properties commonProps = new Properties() {

        {
            setProperty("depparse.extradependencies", "ref_only_uncollapsed");
            setProperty("parse.extradependencies", "ref_only_uncollapsed");
            setProperty("openie.splitter.threshold", "0.10");
            setProperty("openie.optimze_for", "GENERAL");
            setProperty("openie.ignoreaffinity", "false");
            setProperty("openie.max_entailments_per_clause", "1000");
            setProperty("openie.triple.strict", "true");
        }
    };
    try {
        String dataDir = getServletContext().getRealPath("/WEB-INF/data");
        System.setProperty("de.jollyday.config", getServletContext().getRealPath("/WEB-INF/classes/holidays/jollyday.properties"));
        commonProps.setProperty("pos.model", dataDir + "/english-left3words-distsim.tagger");
        commonProps.setProperty("ner.model", dataDir + "/english.all.3class.distsim.crf.ser.gz," + dataDir + "/english.conll.4class.distsim.crf.ser.gz," + dataDir + "/english.muc.7class.distsim.crf.ser.gz");
        commonProps.setProperty("depparse.model", dataDir + "/english_SD.gz");
        commonProps.setProperty("parse.model", dataDir + "/englishPCFG.ser.gz");
        commonProps.setProperty("sutime.rules", dataDir + "/defs.sutime.txt," + dataDir + "/english.sutime.txt," + dataDir + "/english.hollidays.sutime.txt");
        commonProps.setProperty("openie.splitter.model", dataDir + "/clauseSplitterModel.ser.gz");
        commonProps.setProperty("openie.affinity_models", dataDir);
    } catch (NullPointerException e) {
        log.info("Could not load servlet context. Are you on the command line?");
    }
    if (this.pipeline == null) {
        Properties fullProps = new Properties(commonProps);
        fullProps.setProperty("annotators", "tokenize,ssplit,pos,lemma,depparse,ner,natlog,openie");
        this.pipeline = new StanfordCoreNLP(fullProps);
    }
    if (this.backoff == null) {
        Properties backoffProps = new Properties(commonProps);
        backoffProps.setProperty("annotators", "parse,natlog,openie");
        backoffProps.setProperty("enforceRequirements", "false");
        this.backoff = new StanfordCoreNLP(backoffProps);
    }
}
Also used : StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP)
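
The init method above builds two pipelines from one shared configuration by passing commonProps as the defaults of a new Properties object. The following is a minimal sketch of that java.util.Properties behaviour using only the standard library. The class name and the helper layer are hypothetical, and the property values are copied from the example purely for illustration.

```java
import java.util.Properties;

class LayeredPropertiesSketch {

    // Hypothetical helper: returns a child whose lookups fall back to `defaults`.
    public static Properties layer(Properties defaults, String annotators) {
        Properties child = new Properties(defaults);
        child.setProperty("annotators", annotators);
        return child;
    }

    public static void main(String[] args) {
        Properties common = new Properties();
        common.setProperty("openie.triple.strict", "true");

        Properties full = layer(common, "tokenize,ssplit,pos,lemma,depparse,ner,natlog,openie");
        Properties backoff = layer(common, "parse,natlog,openie");

        // getProperty consults the defaults when the key is not set locally.
        System.out.println(full.getProperty("openie.triple.strict")); // true
        // Each child keeps its own annotator list; the shared defaults are untouched.
        System.out.println(backoff.getProperty("annotators"));        // parse,natlog,openie
        System.out.println(common.getProperty("annotators"));         // null
        // Map-level methods such as size() ignore the defaults...
        System.out.println(full.size());                               // 1
        // ...while stringPropertyNames() includes them.
        System.out.println(full.stringPropertyNames().size());         // 2
    }
}
```

One consequence of this layering is that code which copies or iterates such an object through Map methods (for example putAll or keySet) sees only the locally set keys, not the defaults.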

Example 10 with StanfordCoreNLP

Use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project CoreNLP by stanfordnlp.

Class GetPatternsFromDataMultiClass, method runPOSNERParseOnTokens.

public static Map<String, DataInstance> runPOSNERParseOnTokens(Map<String, DataInstance> sents, Properties propsoriginal) {
    PatternFactory.PatternType type = PatternFactory.PatternType.valueOf(propsoriginal.getProperty(Flags.patternType));
    Properties props = new Properties();
    List<String> anns = new ArrayList<>();
    anns.add("pos");
    anns.add("lemma");
    boolean useTargetParserParentRestriction = Boolean.parseBoolean(propsoriginal.getProperty(Flags.useTargetParserParentRestriction));
    boolean useTargetNERRestriction = Boolean.parseBoolean(propsoriginal.getProperty(Flags.useTargetNERRestriction));
    // Read the POS model path from the caller's properties; the local props object is still empty here.
    String posModelPath = propsoriginal.getProperty(Flags.posModelPath);
    String numThreads = propsoriginal.getProperty(Flags.numThreads);
    if (useTargetParserParentRestriction) {
        anns.add("parse");
    } else if (type.equals(PatternFactory.PatternType.DEP)) {
        anns.add("depparse");
    }
    if (useTargetNERRestriction) {
        anns.add("ner");
    }
    props.setProperty("annotators", StringUtils.join(anns, ","));
    props.setProperty("parse.maxlen", "80");
    // Properties.setProperty rejects null values, so copy the thread count only when it is set.
    if (numThreads != null) {
        props.setProperty("nthreads", numThreads);
        props.setProperty("threads", numThreads);
    }
    if (posModelPath != null) {
        props.setProperty("pos.model", posModelPath);
    }
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props, false);
    Redwood.log(Redwood.DBG, "Annotating text");
    for (Map.Entry<String, DataInstance> en : sents.entrySet()) {
        List<CoreMap> temp = new ArrayList<>();
        CoreMap s = new ArrayCoreMap();
        s.set(CoreAnnotations.TokensAnnotation.class, en.getValue().getTokens());
        temp.add(s);
        Annotation doc = new Annotation(temp);
        try {
            pipeline.annotate(doc);
            if (useTargetParserParentRestriction) {
                inferParentParseTag(s.get(TreeAnnotation.class));
            }
        } catch (Exception e) {
            log.warn("Ignoring error for sentence: " + StringUtils.joinWords(en.getValue().getTokens(), " "));
            log.warn(e);
        }
    }
    Redwood.log(Redwood.DBG, "Done annotating text");
    return sents;
}
Also used : StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP) TreeAnnotation(edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation) Annotation(edu.stanford.nlp.pipeline.Annotation) GoldAnswerAnnotation(edu.stanford.nlp.ling.CoreAnnotations.GoldAnswerAnnotation) SQLException(java.sql.SQLException) InvocationTargetException(java.lang.reflect.InvocationTargetException) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations)
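
The method above copies optional settings (the thread count and the POS model path) from the caller's properties into a fresh Properties object. Because java.util.Properties.setProperty throws a NullPointerException when given a null value, optional settings need a guard. The following is a minimal sketch of such a guard using only the standard library. The class name, the helper copyIfPresent, and the source key names "numThreads" and "posModelPath" are assumptions made for illustration; they are not taken from the Flags class.

```java
import java.util.Properties;

class OptionalPropertySketch {

    // Hypothetical helper: copies source[fromKey] to target[toKey] only when
    // the source value is set. Returns true if a value was copied.
    public static boolean copyIfPresent(Properties source, String fromKey,
                                        Properties target, String toKey) {
        String value = source.getProperty(fromKey);
        if (value == null) {
            return false;
        }
        target.setProperty(toKey, value);
        return true;
    }

    public static void main(String[] args) {
        Properties original = new Properties();
        original.setProperty("numThreads", "4"); // key name is an assumption

        Properties props = new Properties();
        copyIfPresent(original, "numThreads", props, "nthreads");
        copyIfPresent(original, "posModelPath", props, "pos.model"); // not set, so skipped

        System.out.println(props.getProperty("nthreads"));  // 4
        System.out.println(props.containsKey("pos.model")); // false

        // Without the guard, a missing value fails immediately.
        try {
            props.setProperty("pos.model", original.getProperty("posModelPath"));
        } catch (NullPointerException e) {
            System.out.println("setProperty rejected a null value");
        }
    }
}
```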

Aggregations

StanfordCoreNLP (edu.stanford.nlp.pipeline.StanfordCoreNLP): 42
Properties (java.util.Properties): 31
Annotation (edu.stanford.nlp.pipeline.Annotation): 26
CoreMap (edu.stanford.nlp.util.CoreMap): 19
CoreAnnotations (edu.stanford.nlp.ling.CoreAnnotations): 16
SemanticGraph (edu.stanford.nlp.semgraph.SemanticGraph): 7
CoreLabel (edu.stanford.nlp.ling.CoreLabel): 6
SemanticGraphCoreAnnotations (edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations): 5
Test (org.junit.Test): 5
SentencesAnnotation (edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation): 4
CollapsedDependenciesAnnotation (edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedDependenciesAnnotation): 4
GoldAnswerAnnotation (edu.stanford.nlp.ling.CoreAnnotations.GoldAnswerAnnotation): 3
IndexedWord (edu.stanford.nlp.ling.IndexedWord): 3
SemanticGraphEdge (edu.stanford.nlp.semgraph.SemanticGraphEdge): 3
TreeAnnotation (edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation): 3
PrintWriter (java.io.PrintWriter): 3
ArrayList (java.util.ArrayList): 3
CorefCoreAnnotations (edu.stanford.nlp.coref.CorefCoreAnnotations): 2
CorefChain (edu.stanford.nlp.coref.data.CorefChain): 2
RelationTriple (edu.stanford.nlp.ie.util.RelationTriple): 2