Search in sources :

Example 31 with CoreMap

use of edu.stanford.nlp.util.CoreMap in project CoreNLP by stanfordnlp.

the class SimpleSentiment method classify.

/**
   * @see SimpleSentiment#classify(CoreMap)
   */
public SentimentClass classify(String text) {
    Annotation ann = new Annotation(text);
    pipeline.get().annotate(ann);
    CoreMap sentence = ann.get(CoreAnnotations.SentencesAnnotation.class).get(0);
    Counter<String> features = featurize(sentence);
    RVFDatum<SentimentClass, String> datum = new RVFDatum<>(features);
    return impl.classOf(datum);
}
Also used : SentimentClass(edu.stanford.nlp.simple.SentimentClass) RVFDatum(edu.stanford.nlp.ling.RVFDatum) CoreMap(edu.stanford.nlp.util.CoreMap) Annotation(edu.stanford.nlp.pipeline.Annotation)

Example 32 with CoreMap

use of edu.stanford.nlp.util.CoreMap in project CoreNLP by stanfordnlp.

the class SentenceAlgorithms method unescapeHTML.

/**
   * A funky little helper method to interpret each token of the sentence as an HTML string, and translate it back to text.
   * Note that this is <b>in place</b>.
   */
public void unescapeHTML() {
    // Change in the protobuf
    for (int i = 0; i < sentence.length(); ++i) {
        CoreNLPProtos.Token.Builder token = sentence.rawToken(i);
        token.setWord(StringUtils.unescapeHtml3(token.getWord()));
        token.setLemma(StringUtils.unescapeHtml3(token.getLemma()));
    }
    // Change in the annotation
    CoreMap cm = sentence.document.asAnnotation().get(CoreAnnotations.SentencesAnnotation.class).get(sentence.sentenceIndex());
    for (CoreLabel token : cm.get(CoreAnnotations.TokensAnnotation.class)) {
        token.setWord(StringUtils.unescapeHtml3(token.word()));
        token.setLemma(StringUtils.unescapeHtml3(token.lemma()));
    }
}
Also used : CoreLabel(edu.stanford.nlp.ling.CoreLabel) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) CoreMap(edu.stanford.nlp.util.CoreMap)

Example 33 with CoreMap

use of edu.stanford.nlp.util.CoreMap in project CoreNLP by stanfordnlp.

the class CorefMentionFinder method removeSpuriousMentionsEn.

protected void removeSpuriousMentionsEn(Annotation doc, List<List<Mention>> predictedMentions, Dictionaries dict) {
    List<CoreMap> sentences = doc.get(CoreAnnotations.SentencesAnnotation.class);
    for (int i = 0; i < predictedMentions.size(); i++) {
        CoreMap s = sentences.get(i);
        List<Mention> mentions = predictedMentions.get(i);
        List<CoreLabel> sent = s.get(CoreAnnotations.TokensAnnotation.class);
        Set<Mention> remove = Generics.newHashSet();
        for (Mention m : mentions) {
            String headPOS = m.headWord.get(CoreAnnotations.PartOfSpeechAnnotation.class);
            // non word such as 'hmm'
            if (dict.nonWords.contains(m.headString))
                remove.add(m);
            // check if the mention is noun and the next word is not noun
            if (dict.isAdjectivalDemonym(m.spanToString())) {
                if (!headPOS.startsWith("N") || (m.endIndex < sent.size() && sent.get(m.endIndex).tag().startsWith("N"))) {
                    remove.add(m);
                }
            }
            // stop list (e.g., U.S., there)
            if (inStopList(m))
                remove.add(m);
        }
        mentions.removeAll(remove);
    }
}
Also used : CoreLabel(edu.stanford.nlp.ling.CoreLabel) Mention(edu.stanford.nlp.coref.data.Mention) SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) TreeCoreAnnotations(edu.stanford.nlp.trees.TreeCoreAnnotations) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) CoreMap(edu.stanford.nlp.util.CoreMap) ParserConstraint(edu.stanford.nlp.parser.common.ParserConstraint)

Example 34 with CoreMap

use of edu.stanford.nlp.util.CoreMap in project CoreNLP by stanfordnlp.

the class CorefMentionFinder method addGoldMentions.

// temporary for debug
protected static void addGoldMentions(List<CoreMap> sentences, List<Set<IntPair>> mentionSpanSetList, List<List<Mention>> predictedMentions, List<List<Mention>> allGoldMentions) {
    for (int i = 0, sz = sentences.size(); i < sz; i++) {
        List<Mention> mentions = predictedMentions.get(i);
        CoreMap sent = sentences.get(i);
        List<CoreLabel> tokens = sent.get(TokensAnnotation.class);
        Set<IntPair> mentionSpanSet = mentionSpanSetList.get(i);
        List<Mention> golds = allGoldMentions.get(i);
        for (Mention g : golds) {
            IntPair pair = new IntPair(g.startIndex, g.endIndex);
            if (!mentionSpanSet.contains(pair)) {
                int dummyMentionId = -1;
                Mention m = new Mention(dummyMentionId, g.startIndex, g.endIndex, tokens, sent.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class), sent.get(SemanticGraphCoreAnnotations.EnhancedDependenciesAnnotation.class) != null ? sent.get(SemanticGraphCoreAnnotations.EnhancedDependenciesAnnotation.class) : sent.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class), new ArrayList<>(tokens.subList(g.startIndex, g.endIndex)));
                mentions.add(m);
                mentionSpanSet.add(pair);
            }
        }
    }
}
Also used : CoreLabel(edu.stanford.nlp.ling.CoreLabel) Mention(edu.stanford.nlp.coref.data.Mention) SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) CoreMap(edu.stanford.nlp.util.CoreMap) IntPair(edu.stanford.nlp.util.IntPair) ParserConstraint(edu.stanford.nlp.parser.common.ParserConstraint)

Example 35 with CoreMap

use of edu.stanford.nlp.util.CoreMap in project CoreNLP by stanfordnlp.

the class CorefMentionFinder method extractEnumerations.

protected static void extractEnumerations(CoreMap s, List<Mention> mentions, Set<IntPair> mentionSpanSet, Set<IntPair> namedEntitySpanSet) {
    List<CoreLabel> sent = s.get(CoreAnnotations.TokensAnnotation.class);
    Tree tree = s.get(TreeCoreAnnotations.TreeAnnotation.class);
    SemanticGraph basicDependency = s.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class);
    SemanticGraph enhancedDependency = s.get(SemanticGraphCoreAnnotations.EnhancedDependenciesAnnotation.class);
    if (enhancedDependency == null) {
        enhancedDependency = s.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class);
    }
    TregexPattern tgrepPattern = enumerationsMentionPattern;
    TregexMatcher matcher = tgrepPattern.matcher(tree);
    Map<IntPair, Tree> spanToMentionSubTree = Generics.newHashMap();
    while (matcher.find()) {
        matcher.getMatch();
        Tree m1 = matcher.getNode("m1");
        Tree m2 = matcher.getNode("m2");
        List<Tree> mLeaves = m1.getLeaves();
        int beginIdx = ((CoreLabel) mLeaves.get(0).label()).get(CoreAnnotations.IndexAnnotation.class) - 1;
        int endIdx = ((CoreLabel) mLeaves.get(mLeaves.size() - 1).label()).get(CoreAnnotations.IndexAnnotation.class);
        spanToMentionSubTree.put(new IntPair(beginIdx, endIdx), m1);
        mLeaves = m2.getLeaves();
        beginIdx = ((CoreLabel) mLeaves.get(0).label()).get(CoreAnnotations.IndexAnnotation.class) - 1;
        endIdx = ((CoreLabel) mLeaves.get(mLeaves.size() - 1).label()).get(CoreAnnotations.IndexAnnotation.class);
        spanToMentionSubTree.put(new IntPair(beginIdx, endIdx), m2);
    }
    for (Map.Entry<IntPair, Tree> spanMention : spanToMentionSubTree.entrySet()) {
        IntPair span = spanMention.getKey();
        if (!mentionSpanSet.contains(span) && !insideNE(span, namedEntitySpanSet)) {
            int dummyMentionId = -1;
            Mention m = new Mention(dummyMentionId, span.get(0), span.get(1), sent, basicDependency, enhancedDependency, new ArrayList<>(sent.subList(span.get(0), span.get(1))), spanMention.getValue());
            mentions.add(m);
            mentionSpanSet.add(span);
        }
    }
}
Also used : TregexPattern(edu.stanford.nlp.trees.tregex.TregexPattern) SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) TreeCoreAnnotations(edu.stanford.nlp.trees.TreeCoreAnnotations) IntPair(edu.stanford.nlp.util.IntPair) ParserConstraint(edu.stanford.nlp.parser.common.ParserConstraint) CoreLabel(edu.stanford.nlp.ling.CoreLabel) Mention(edu.stanford.nlp.coref.data.Mention) SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) TreeCoreAnnotations(edu.stanford.nlp.trees.TreeCoreAnnotations) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) Tree(edu.stanford.nlp.trees.Tree) SemanticGraph(edu.stanford.nlp.semgraph.SemanticGraph) TregexMatcher(edu.stanford.nlp.trees.tregex.TregexMatcher) Map(java.util.Map) CoreMap(edu.stanford.nlp.util.CoreMap)

Aggregations

CoreMap (edu.stanford.nlp.util.CoreMap)253 CoreAnnotations (edu.stanford.nlp.ling.CoreAnnotations)172 CoreLabel (edu.stanford.nlp.ling.CoreLabel)102 SemanticGraphCoreAnnotations (edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations)61 TreeCoreAnnotations (edu.stanford.nlp.trees.TreeCoreAnnotations)53 ArrayList (java.util.ArrayList)53 Annotation (edu.stanford.nlp.pipeline.Annotation)49 Tree (edu.stanford.nlp.trees.Tree)28 Properties (java.util.Properties)23 StanfordCoreNLP (edu.stanford.nlp.pipeline.StanfordCoreNLP)20 SemanticGraph (edu.stanford.nlp.semgraph.SemanticGraph)20 List (java.util.List)20 Mention (edu.stanford.nlp.coref.data.Mention)17 ArrayCoreMap (edu.stanford.nlp.util.ArrayCoreMap)17 CorefCoreAnnotations (edu.stanford.nlp.coref.CorefCoreAnnotations)13 ParserConstraint (edu.stanford.nlp.parser.common.ParserConstraint)12 SentencesAnnotation (edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation)11 MachineReadingAnnotations (edu.stanford.nlp.ie.machinereading.structure.MachineReadingAnnotations)9 IndexedWord (edu.stanford.nlp.ling.IndexedWord)9 IntPair (edu.stanford.nlp.util.IntPair)9