Example 6 with GrammaticalRelation

use of edu.stanford.nlp.trees.GrammaticalRelation in project CoreNLP by stanfordnlp.

the class EnglishGrammaticalRelations method getPrep.

/**
   * The "prep" grammatical relation, used to collapse prepositions.<p>
   * Each one is turned into prep_word, where "word" is the preposition.
   *
   * @param prepositionString The preposition to make a GrammaticalRelation out of
   * @return A grammatical relation for this preposition
   */
public static GrammaticalRelation getPrep(String prepositionString) {
    GrammaticalRelation result = preps.get(prepositionString);
    if (result == null) {
        synchronized (preps) {
            result = preps.get(prepositionString);
            if (result == null) {
                result = new GrammaticalRelation(Language.English, "prep", "prep_collapsed", PREPOSITIONAL_MODIFIER, prepositionString);
                preps.put(prepositionString, result);
                threadSafeAddRelation(result);
            }
        }
    }
    return result;
}
Also used : GrammaticalRelation(edu.stanford.nlp.trees.GrammaticalRelation)
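
The check, lock, check-again shape of getPrep can be tried in isolation. The following is only a sketch under assumptions: `Relation` is a hypothetical stand-in for GrammaticalRelation, the cache is assumed to be a concurrent map (the listing does not show how `preps` is declared), and the `"prep_" + word` name follows the Javadoc above rather than the real constructor.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PrepCacheSketch {
    // Hypothetical stand-in for GrammaticalRelation: it only carries a name.
    static final class Relation {
        final String name;
        Relation(String name) { this.name = name; }
    }

    // Assumption: a concurrent map, so the unsynchronized first read is safe.
    private static final Map<String, Relation> PREPS = new ConcurrentHashMap<>();

    // Same shape as getPrep: read, lock, read again, then create and publish.
    static Relation getPrep(String preposition) {
        Relation result = PREPS.get(preposition);
        if (result == null) {
            synchronized (PREPS) {
                result = PREPS.get(preposition);
                if (result == null) {
                    result = new Relation("prep_" + preposition);
                    PREPS.put(preposition, result);
                }
            }
        }
        return result;
    }

    // Alternative that lets the map do the locking; suitable only when
    // creating the value has no side effects outside the map.
    static Relation getPrepAtomic(String preposition) {
        return PREPS.computeIfAbsent(preposition, p -> new Relation("prep_" + p));
    }

    public static void main(String[] args) {
        Relation a = getPrep("in");
        Relation b = getPrep("in");
        System.out.println(a == b);   // prints true: the second call hits the cache
        System.out.println(a.name);   // prints prep_in
    }
}
```

The original also calls threadSafeAddRelation inside the lock, which is a side effect outside the map, so the `computeIfAbsent` variant may not be a drop-in replacement there.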

Example 7 with GrammaticalRelation

use of edu.stanford.nlp.trees.GrammaticalRelation in project CoreNLP by stanfordnlp.

the class EnglishGrammaticalStructure method treatCC.

private static void treatCC(Collection<TypedDependency> list) {
    // Construct a map from tree nodes to the set of typed
    // dependencies in which the node appears as dependent.
    Map<IndexedWord, Set<TypedDependency>> map = Generics.newHashMap();
    // Construct a map of tree nodes being governor of a subject grammatical
    // relation to that relation
    Map<IndexedWord, TypedDependency> subjectMap = Generics.newHashMap();
    // Construct a set of TreeGraphNodes with a passive auxiliary on them
    Set<IndexedWord> withPassiveAuxiliary = Generics.newHashSet();
    // Construct a map of tree nodes being governor of an object grammatical
    // relation to that relation
    // Map<TreeGraphNode, TypedDependency> objectMap = new
    // HashMap<TreeGraphNode, TypedDependency>();
    List<IndexedWord> rcmodHeads = Generics.newArrayList();
    List<IndexedWord> prepcDep = Generics.newArrayList();
    for (TypedDependency typedDep : list) {
        if (!map.containsKey(typedDep.dep())) {
            // NB: Here and in other places below, we use a TreeSet (which implements
            // SortedSet) to guarantee that results are deterministic
            map.put(typedDep.dep(), new TreeSet<>());
        }
        map.get(typedDep.dep()).add(typedDep);
        if (typedDep.reln().equals(AUX_PASSIVE_MODIFIER)) {
            withPassiveAuxiliary.add(typedDep.gov());
        }
        // look for subjects
        if (typedDep.reln().getParent() == NOMINAL_SUBJECT || typedDep.reln().getParent() == SUBJECT || typedDep.reln().getParent() == CLAUSAL_SUBJECT) {
            if (!subjectMap.containsKey(typedDep.gov())) {
                subjectMap.put(typedDep.gov(), typedDep);
            }
        }
        // look for rcmod relations
        if (typedDep.reln() == RELATIVE_CLAUSE_MODIFIER) {
            rcmodHeads.add(typedDep.gov());
        }
        // to avoid wrong propagation of dobj
        if (typedDep.reln().toString().startsWith("prepc")) {
            prepcDep.add(typedDep.dep());
        }
    }
    // log.info(map);
    // if (DEBUG) log.info("Subject map: " + subjectMap);
    // if (DEBUG) log.info("Object map: " + objectMap);
    // log.info(rcmodHeads);
    // create a new list of typed dependencies
    Collection<TypedDependency> newTypedDeps = new ArrayList<>(list);
    // find typed deps of form conj(gov,dep)
    for (TypedDependency td : list) {
        if (EnglishGrammaticalRelations.getConjs().contains(td.reln())) {
            IndexedWord gov = td.gov();
            IndexedWord dep = td.dep();
            // look at the dep in the conjunct
            Set<TypedDependency> gov_relations = map.get(gov);
            // log.info("gov " + gov);
            if (gov_relations != null) {
                for (TypedDependency td1 : gov_relations) {
                    // log.info("gov rel " + td1);
                    IndexedWord newGov = td1.gov();
                    // it is possible for newGov and dep to overlap
                    if (newGov.equals(dep)) {
                        continue;
                    }
                    GrammaticalRelation newRel = td1.reln();
                    if (newRel != ROOT) {
                        if (rcmodHeads.contains(gov) && rcmodHeads.contains(dep)) {
                            // to prevent wrong propagation in the case of long dependencies in relative clauses
                            if (newRel != DIRECT_OBJECT && newRel != NOMINAL_SUBJECT) {
                                if (DEBUG) {
                                    log.info("Adding new " + newRel + " dependency from " + newGov + " to " + dep + " (subj/obj case)");
                                }
                                newTypedDeps.add(new TypedDependency(newRel, newGov, dep));
                            }
                        } else {
                            if (DEBUG) {
                                log.info("Adding new " + newRel + " dependency from " + newGov + " to " + dep);
                            }
                            newTypedDeps.add(new TypedDependency(newRel, newGov, dep));
                        }
                    }
                }
            }
            // propagate subjects
            // look at the gov in the conjunct: if it has a subject relation,
            // the dep is a verb and the dep doesn't have a subject relation
            // then we want to add a subject relation for the dep.
            // (By testing for the dep to be a verb, we are going to miss subjects of
            // copula verbs! But is it safe to relax this assumption, i.e., just
            // test for the subject part?)
            // CDM 2008: I also added in JJ, since participial verbs are often
            // tagged JJ
            String tag = dep.tag();
            if (subjectMap.containsKey(gov) && (tag.startsWith("VB") || tag.startsWith("JJ")) && !subjectMap.containsKey(dep)) {
                TypedDependency tdsubj = subjectMap.get(gov);
                // check for wrong nsubjpass: if the new verb is VB or VBZ or VBP or JJ, then
                // add nsubj (if it is tagged correctly, should do this for VBD too, but we don't)
                GrammaticalRelation relation = tdsubj.reln();
                if (relation == NOMINAL_PASSIVE_SUBJECT) {
                    if (isDefinitelyActive(tag)) {
                        relation = NOMINAL_SUBJECT;
                    }
                } else if (relation == CLAUSAL_PASSIVE_SUBJECT) {
                    if (isDefinitelyActive(tag)) {
                        relation = CLAUSAL_SUBJECT;
                    }
                } else if (relation == NOMINAL_SUBJECT) {
                    if (withPassiveAuxiliary.contains(dep)) {
                        relation = NOMINAL_PASSIVE_SUBJECT;
                    }
                } else if (relation == CLAUSAL_SUBJECT) {
                    if (withPassiveAuxiliary.contains(dep)) {
                        relation = CLAUSAL_PASSIVE_SUBJECT;
                    }
                }
                if (DEBUG) {
                    log.info("Adding new " + relation + " dependency from " + dep + " to " + tdsubj.dep() + " (subj propagation case)");
                }
                newTypedDeps.add(new TypedDependency(relation, dep, tdsubj.dep()));
            }
        // propagate objects
        // cdm july 2010: This bit of code would copy a dobj from the first
        // clause to a later conjoined clause if it didn't
        // contain its own dobj or prepc. But this is too aggressive and wrong
        // if the later clause is intransitive
        // (including passivized cases) and so I think we have to not have this
        // done always, and see no good "sometimes" heuristic.
        // IF WE WERE TO REINSTATE, SHOULD ALSO NOT ADD OBJ IF THERE IS A ccomp
        // (SBAR).
        // if (objectMap.containsKey(gov) &&
        // dep.tag().startsWith("VB") && ! objectMap.containsKey(dep)
        // && ! prepcDep.contains(gov)) {
        // TypedDependency tdobj = objectMap.get(gov);
        // if (DEBUG) {
        // log.info("Adding new " + tdobj.reln() + " dependency from "
        // + dep + " to " + tdobj.dep() + " (obj propagation case)");
        // }
        // newTypedDeps.add(new TypedDependency(tdobj.reln(), dep,
        // tdobj.dep()));
        // }
        }
    }
    list.clear();
    list.addAll(newTypedDeps);
}
Also used : GrammaticalRelation(edu.stanford.nlp.trees.GrammaticalRelation) IndexedWord(edu.stanford.nlp.ling.IndexedWord)
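
The subject-propagation step of treatCC can be illustrated on its own. This is a minimal sketch, not the CoreNLP implementation: `Dep` is a hypothetical string-based stand-in for TypedDependency, and the POS-tag test, the passive/active relabelling, and the relative-clause guard from the method above are all left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SubjectPropagationSketch {
    // Hypothetical, simplified stand-in for TypedDependency: words are plain strings.
    static final class Dep {
        final String reln, gov, dep;
        Dep(String reln, String gov, String dep) {
            this.reln = reln;
            this.gov = gov;
            this.dep = dep;
        }
        @Override public String toString() { return reln + "(" + gov + ", " + dep + ")"; }
    }

    // For every conj*(gov, dep): if gov has a subject and dep has none, copy it to dep.
    static List<Dep> propagateSubjects(List<Dep> deps) {
        Map<String, Dep> subjectOf = new HashMap<>();
        for (Dep d : deps) {
            if (d.reln.equals("nsubj")) {
                subjectOf.putIfAbsent(d.gov, d);   // keep the first subject seen per governor
            }
        }
        List<Dep> result = new ArrayList<>(deps);
        for (Dep d : deps) {
            if (d.reln.startsWith("conj")
                    && subjectOf.containsKey(d.gov)
                    && !subjectOf.containsKey(d.dep)) {
                result.add(new Dep("nsubj", d.dep, subjectOf.get(d.gov).dep));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "Bill eats and drinks"
        List<Dep> deps = new ArrayList<>();
        deps.add(new Dep("nsubj", "eats", "Bill"));
        deps.add(new Dep("conj_and", "eats", "drinks"));
        // prints [nsubj(eats, Bill), conj_and(eats, drinks), nsubj(drinks, Bill)]
        System.out.println(propagateSubjects(deps));
    }
}
```

As in the original, new dependencies are collected in a copy of the list, so the loop never modifies the collection it is iterating over.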

Example 8 with GrammaticalRelation

use of edu.stanford.nlp.trees.GrammaticalRelation in project CoreNLP by stanfordnlp.

the class EnglishGrammaticalStructure method determinePrepRelation.

/** Work out the prep relation name. pc is the dependency whose dep() is the
   *  preposition to find a name for. topPrep may be the same or different.
   *  The daughters of its gov are where to look for an auxpass.
   */
private static GrammaticalRelation determinePrepRelation(Map<IndexedWord, ? extends Set<TypedDependency>> map, List<IndexedWord> vmod, TypedDependency pc, TypedDependency topPrep, boolean pobj) {
    // handling the case of an "agent":
    // the governor of a "by" preposition must have an "auxpass" dependency
    // or be the dependent of a "vmod" relation
    // if it is the case, the "agent" variable becomes true
    boolean agent = false;
    String preposition = pc.dep().value().toLowerCase();
    if (preposition.equals("by")) {
        // look if we have an auxpass
        Set<TypedDependency> aux_pass_poss = map.get(topPrep.gov());
        if (aux_pass_poss != null) {
            for (TypedDependency td_pass : aux_pass_poss) {
                if (td_pass.reln() == AUX_PASSIVE_MODIFIER) {
                    agent = true;
                }
            }
        }
        // look if we have a vmod
        if (!vmod.isEmpty() && vmod.contains(topPrep.gov())) {
            agent = true;
        }
    }
    GrammaticalRelation reln;
    if (agent) {
        reln = AGENT;
    } else {
        // for pobj: we collapse into "prep"; for pcomp: we collapse into "prepc"
        if (pobj) {
            reln = EnglishGrammaticalRelations.getPrep(preposition);
        } else {
            reln = EnglishGrammaticalRelations.getPrepC(preposition);
        }
    }
    return reln;
}
Also used : GrammaticalRelation(edu.stanford.nlp.trees.GrammaticalRelation)
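
The decision made by determinePrepRelation reduces to a small truth table. The sketch below assumes a hypothetical helper that returns the relation name as a string and takes the two "agent" conditions as precomputed booleans; the `prep_`/`prepc_` prefixes follow the comments in the listing, not a verified CoreNLP API.

```java
import java.util.Locale;

class PrepRelationNameSketch {
    // Hypothetical helper: govHasAuxpass and govIsVmodDep stand in for the
    // map lookup and the vmod list check done in the original method.
    static String relationName(String prepositionWord, boolean govHasAuxpass,
                               boolean govIsVmodDep, boolean pobj) {
        String preposition = prepositionWord.toLowerCase(Locale.ROOT);
        // "by" names an agent only when its governor is passive or a vmod dependent
        boolean agent = preposition.equals("by") && (govHasAuxpass || govIsVmodDep);
        if (agent) {
            return "agent";
        }
        // pobj collapses into prep_*, pcomp into prepc_*
        return (pobj ? "prep_" : "prepc_") + preposition;
    }

    public static void main(String[] args) {
        System.out.println(relationName("By", true, false, true));        // prints agent
        System.out.println(relationName("by", false, false, true));       // prints prep_by
        System.out.println(relationName("without", false, false, false)); // prints prepc_without
    }
}
```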

Example 9 with GrammaticalRelation

use of edu.stanford.nlp.trees.GrammaticalRelation in project CoreNLP by stanfordnlp.

the class EnglishGrammaticalStructure method collapseMultiWordPrep.

/**
   * Collapse multiword preposition of the following format:
   * prep|advmod|dep|amod(gov, mwp0) dep(mwp0,mwp1) pobj|pcomp(mwp1, compl) or
   * pobj|pcomp(mwp0, compl) -&gt; prep_mwp0_mwp1(gov, compl)
   * <p/>
   *
   * @param list List of typedDependencies to work on,
   * @param newTypedDeps List of typedDependencies that we construct
   * @param str_mwp0 First part of the multiword preposition to construct the collapsed
   *          preposition
   * @param str_mwp1 Second part of the multiword preposition to construct the
   *          collapsed preposition
   * @param w_mwp0 First part of the multiword preposition that we look for
   * @param w_mwp1 Second part of the multiword preposition that we look for
   */
private static void collapseMultiWordPrep(Collection<TypedDependency> list, Collection<TypedDependency> newTypedDeps, String str_mwp0, String str_mwp1, String w_mwp0, String w_mwp1) {
    // first find the multiword_preposition: dep(mwp[0], mwp[1])
    // the two words should be next to one another in the sentence (difference of
    // indexes = 1)
    IndexedWord mwp0 = null;
    IndexedWord mwp1 = null;
    TypedDependency dep = null;
    for (TypedDependency td : list) {
        if (td.gov().value().equalsIgnoreCase(w_mwp0) && td.dep().value().equalsIgnoreCase(w_mwp1) && Math.abs(td.gov().index() - td.dep().index()) == 1) {
            mwp0 = td.gov();
            mwp1 = td.dep();
            dep = td;
        }
    }
    if (mwp0 == null) {
        return;
    }
    // now search for prep|advmod|amod|dep|mwe(gov, mwp0)
    IndexedWord governor = null;
    TypedDependency prep = null;
    for (TypedDependency td1 : list) {
        if ((td1.reln() == PREPOSITIONAL_MODIFIER || td1.reln() == ADVERBIAL_MODIFIER || td1.reln() == ADJECTIVAL_MODIFIER || td1.reln() == DEPENDENT || td1.reln() == MULTI_WORD_EXPRESSION) && td1.dep().equals(mwp0)) {
            // we found prep|advmod|amod|dep|mwe(gov, mwp0)
            prep = td1;
            governor = prep.gov();
        }
    }
    if (prep == null) {
        return;
    }
    // search for the complement: pobj|pcomp(mwp1,X)
    // or for pobj|pcomp(mwp0,X)
    // There may be more than one in weird constructions; if there are several,
    // take the one with the LOWEST index!
    TypedDependency pobj = null;
    TypedDependency newtd = null;
    for (TypedDependency td2 : list) {
        if ((td2.reln() == PREPOSITIONAL_OBJECT || td2.reln() == PREPOSITIONAL_COMPLEMENT) && (td2.gov().equals(mwp1) || td2.gov().equals(mwp0))) {
            if (pobj == null || pobj.dep().index() > td2.dep().index()) {
                pobj = td2;
                // create the new gr relation
                GrammaticalRelation gr;
                if (td2.reln() == PREPOSITIONAL_COMPLEMENT) {
                    gr = EnglishGrammaticalRelations.getPrepC(str_mwp0 + '_' + str_mwp1);
                } else {
                    gr = EnglishGrammaticalRelations.getPrep(str_mwp0 + '_' + str_mwp1);
                }
                if (governor != null) {
                    newtd = new TypedDependency(gr, governor, pobj.dep());
                }
            }
        }
    }
    if (pobj == null || newtd == null) {
        return;
    }
    if (DEBUG) {
        log.info("Removing " + prep + ", " + dep + ", and " + pobj);
        log.info("  and adding " + newtd);
    }
    prep.setReln(KILL);
    dep.setReln(KILL);
    pobj.setReln(KILL);
    newTypedDeps.add(newtd);
    // and promote possible orphans
    for (TypedDependency td1 : list) {
        if (td1.reln() != KILL) {
            if (td1.gov().equals(mwp0) || td1.gov().equals(mwp1)) {
                // one?
                if (td1.reln() == TEMPORAL_MODIFIER) {
                    // special case when an extra NP-TMP is buried in a PP for
                    // "during the same period last year"
                    td1.setGov(pobj.dep());
                } else {
                    td1.setGov(governor);
                }
            }
            if (!newTypedDeps.contains(td1)) {
                newTypedDeps.add(td1);
            }
        }
    }
    list.clear();
    list.addAll(newTypedDeps);
}
Also used : GrammaticalRelation(edu.stanford.nlp.trees.GrammaticalRelation) IndexedWord(edu.stanford.nlp.ling.IndexedWord)
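
The rewrite described in the Javadoc of collapseMultiWordPrep can be sketched on a toy dependency list. Everything here is an assumption for illustration: `Dep` is a hypothetical string-based stand-in for TypedDependency, the input triples are hand-written to match the Javadoc pattern rather than taken from a parser, and the sketch skips the adjacency check, the KILL marking, the complement attached to mwp0, and the orphan promotion.

```java
import java.util.ArrayList;
import java.util.List;

class MultiWordPrepSketch {
    // Hypothetical, simplified stand-in for TypedDependency.
    static final class Dep {
        final String reln, gov, dep;
        Dep(String reln, String gov, String dep) {
            this.reln = reln;
            this.gov = gov;
            this.dep = dep;
        }
        @Override public String toString() { return reln + "(" + gov + ", " + dep + ")"; }
    }

    // prep(gov, w0) dep(w0, w1) pobj|pcomp(w1, compl) -> prep[c]_w0_w1(gov, compl)
    static List<Dep> collapse(List<Dep> deps, String w0, String w1) {
        Dep link = null;
        for (Dep d : deps) {
            if (d.gov.equalsIgnoreCase(w0) && d.dep.equalsIgnoreCase(w1)) {
                link = d;   // the dependency joining the two preposition words
            }
        }
        if (link == null) {
            return deps;
        }
        Dep prep = null;
        Dep pobj = null;
        for (Dep d : deps) {
            if (d.reln.equals("prep") && d.dep.equals(link.gov)) {
                prep = d;
            }
            if ((d.reln.equals("pobj") || d.reln.equals("pcomp")) && d.gov.equals(link.dep)) {
                pobj = d;
            }
        }
        if (prep == null || pobj == null) {
            return deps;
        }
        String name = (pobj.reln.equals("pcomp") ? "prepc_" : "prep_") + w0 + "_" + w1;
        List<Dep> result = new ArrayList<>();
        result.add(new Dep(name, prep.gov, pobj.dep));
        for (Dep d : deps) {
            if (d != link && d != prep && d != pobj) {
                result.add(d);   // keep everything that was not collapsed
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Dep> deps = new ArrayList<>();
        deps.add(new Dep("nsubj", "said", "He"));
        deps.add(new Dep("prep", "said", "according"));
        deps.add(new Dep("dep", "according", "to"));
        deps.add(new Dep("pobj", "to", "report"));
        // prints [prep_according_to(said, report), nsubj(said, He)]
        System.out.println(collapse(deps, "according", "to"));
    }
}
```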

Example 10 with GrammaticalRelation

use of edu.stanford.nlp.trees.GrammaticalRelation in project CoreNLP by stanfordnlp.

the class EnglishGrammaticalStructure method collapse3WP.

/**
   * Collapse 3-word preposition of the following format: <br/>
   * This will be the case when the preposition is analyzed as a NP <br/>
   * prep(gov, mwp0) <br/>
   * X(mwp0,mwp1) <br/>
   * X(mwp1,mwp2) <br/>
   * pobj|pcomp(mwp2, compl) <br/>
   * -&gt; prep_mwp[0]_mwp[1]_mwp[2](gov, compl)
   * <p/>
   *
   * It also takes flat annotation into account: <br/>
   * prep(gov,mwp0) <br/>
   * X(mwp0,mwp1) <br/>
   * X(mwp0,mwp2) <br/>
   * pobj|pcomp(mwp0, compl) <br/>
   * -&gt; prep_mwp[0]_mwp[1]_mwp[2](gov, compl)
   * <p/>
   *
   *
   * @param list List of typedDependencies to work on
   */
private static void collapse3WP(Collection<TypedDependency> list) {
    Collection<TypedDependency> newTypedDeps = new ArrayList<>();
    // first, loop over the prepositions for NP annotation
    for (String[] mwp : THREEWORD_PREPS) {
        newTypedDeps.clear();
        IndexedWord mwp0 = null;
        IndexedWord mwp1 = null;
        IndexedWord mwp2 = null;
        TypedDependency dep1 = null;
        TypedDependency dep2 = null;
        for (TypedDependency td : list) {
            if (td.gov().value().equalsIgnoreCase(mwp[0]) && td.dep().value().equalsIgnoreCase(mwp[1]) && Math.abs(td.gov().index() - td.dep().index()) == 1) {
                mwp0 = td.gov();
                mwp1 = td.dep();
                dep1 = td;
            }
        }
        for (TypedDependency td : list) {
            if (td.gov().equals(mwp1) && td.dep().value().equalsIgnoreCase(mwp[2]) && Math.abs(td.gov().index() - td.dep().index()) == 1) {
                mwp2 = td.dep();
                dep2 = td;
            }
        }
        if (dep1 != null && dep2 != null) {
            // now search for prep(gov, mwp0)
            IndexedWord governor = null;
            TypedDependency prep = null;
            for (TypedDependency td1 : list) {
                if (td1.reln() == PREPOSITIONAL_MODIFIER && td1.dep().equals(mwp0)) {
                    // we found prep(gov, mwp0)
                    prep = td1;
                    governor = prep.gov();
                }
            }
            // search for the complement: pobj|pcomp(mwp2,X)
            TypedDependency pobj = null;
            TypedDependency newtd = null;
            for (TypedDependency td2 : list) {
                if (td2.reln() == PREPOSITIONAL_OBJECT && td2.gov().equals(mwp2)) {
                    pobj = td2;
                    // create the new gr relation
                    GrammaticalRelation gr = EnglishGrammaticalRelations.getPrep(mwp[0] + '_' + mwp[1] + '_' + mwp[2]);
                    if (governor != null) {
                        newtd = new TypedDependency(gr, governor, pobj.dep());
                    }
                }
                if (td2.reln() == PREPOSITIONAL_COMPLEMENT && td2.gov().equals(mwp2)) {
                    pobj = td2;
                    // create the new gr relation
                    GrammaticalRelation gr = EnglishGrammaticalRelations.getPrepC(mwp[0] + '_' + mwp[1] + '_' + mwp[2]);
                    if (governor != null) {
                        newtd = new TypedDependency(gr, governor, pobj.dep());
                    }
                }
            }
            // and add the new one
            if (prep != null && pobj != null && newtd != null) {
                prep.setReln(KILL);
                dep1.setReln(KILL);
                dep2.setReln(KILL);
                pobj.setReln(KILL);
                newTypedDeps.add(newtd);
                // and promote possible orphans
                for (TypedDependency td1 : list) {
                    if (td1.reln() != KILL) {
                        if (td1.gov().equals(mwp0) || td1.gov().equals(mwp1) || td1.gov().equals(mwp2)) {
                            td1.setGov(governor);
                        }
                        if (!newTypedDeps.contains(td1)) {
                            newTypedDeps.add(td1);
                        }
                    }
                }
                list.clear();
                list.addAll(newTypedDeps);
            }
        }
    }
    // second, loop again looking at flat annotation
    for (String[] mwp : THREEWORD_PREPS) {
        newTypedDeps.clear();
        IndexedWord mwp0 = null;
        IndexedWord mwp1 = null;
        IndexedWord mwp2 = null;
        TypedDependency dep1 = null;
        TypedDependency dep2 = null;
        // first find dep(mwp[0], mwp[1]); the two words should be adjacent
        // (difference of indexes = 1)
        for (TypedDependency td : list) {
            if (td.gov().value().equalsIgnoreCase(mwp[0]) && td.dep().value().equalsIgnoreCase(mwp[1]) && Math.abs(td.gov().index() - td.dep().index()) == 1) {
                mwp0 = td.gov();
                mwp1 = td.dep();
                dep1 = td;
            }
        }
        // then find X(mwp[0], mwp[2]); in flat annotation mwp[2] hangs off mwp[0]
        // (difference of indexes = 2)
        for (TypedDependency td : list) {
            if (td.gov().equals(mwp0) && td.dep().value().equalsIgnoreCase(mwp[2]) && Math.abs(td.gov().index() - td.dep().index()) == 2) {
                mwp2 = td.dep();
                dep2 = td;
            }
        }
        if (dep1 != null && dep2 != null) {
            // now search for prep(gov, mwp0)
            IndexedWord governor = null;
            TypedDependency prep = null;
            for (TypedDependency td1 : list) {
                if (td1.dep().equals(mwp0) && td1.reln() == PREPOSITIONAL_MODIFIER) {
                    // we found prep(gov, mwp0)
                    prep = td1;
                    governor = prep.gov();
                }
            }
            // search for the complement: pobj|pcomp(mwp0,X)
            TypedDependency pobj = null;
            TypedDependency newtd = null;
            for (TypedDependency td2 : list) {
                if (td2.gov().equals(mwp0) && td2.reln() == PREPOSITIONAL_OBJECT) {
                    pobj = td2;
                    // create the new gr relation
                    GrammaticalRelation gr = EnglishGrammaticalRelations.getPrep(mwp[0] + '_' + mwp[1] + '_' + mwp[2]);
                    if (governor != null) {
                        newtd = new TypedDependency(gr, governor, pobj.dep());
                    }
                }
                if (td2.gov().equals(mwp0) && td2.reln() == PREPOSITIONAL_COMPLEMENT) {
                    pobj = td2;
                    // create the new gr relation
                    GrammaticalRelation gr = EnglishGrammaticalRelations.getPrepC(mwp[0] + '_' + mwp[1] + '_' + mwp[2]);
                    if (governor != null) {
                        newtd = new TypedDependency(gr, governor, pobj.dep());
                    }
                }
            }
            // and add the new one
            if (prep != null && pobj != null && newtd != null) {
                prep.setReln(KILL);
                dep1.setReln(KILL);
                dep2.setReln(KILL);
                pobj.setReln(KILL);
                newTypedDeps.add(newtd);
                // and promote possible orphans
                for (TypedDependency td1 : list) {
                    if (td1.reln() != KILL) {
                        if (td1.gov().equals(mwp0) || td1.gov().equals(mwp1) || td1.gov().equals(mwp2)) {
                            td1.setGov(governor);
                        }
                        if (!newTypedDeps.contains(td1)) {
                            newTypedDeps.add(td1);
                        }
                    }
                }
                list.clear();
                list.addAll(newTypedDeps);
            }
        }
    }
}
Also used : GrammaticalRelation(edu.stanford.nlp.trees.GrammaticalRelation) IndexedWord(edu.stanford.nlp.ling.IndexedWord)

Aggregations

GrammaticalRelation (edu.stanford.nlp.trees.GrammaticalRelation): 49
IndexedWord (edu.stanford.nlp.ling.IndexedWord): 38
SemanticGraph (edu.stanford.nlp.semgraph.SemanticGraph): 13
SemanticGraphEdge (edu.stanford.nlp.semgraph.SemanticGraphEdge): 13
CoreAnnotations (edu.stanford.nlp.ling.CoreAnnotations): 11
CoreLabel (edu.stanford.nlp.ling.CoreLabel): 11
SemanticGraphCoreAnnotations (edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations): 9
ArrayList (java.util.ArrayList): 5
SemgrexMatcher (edu.stanford.nlp.semgraph.semgrex.SemgrexMatcher): 4
IntTuple (edu.stanford.nlp.util.IntTuple): 4
Tree (edu.stanford.nlp.trees.Tree): 3
Word (edu.stanford.nlp.ling.Word): 2
ClassicCounter (edu.stanford.nlp.stats.ClassicCounter): 2
TypedDependency (edu.stanford.nlp.trees.TypedDependency): 2
CoreMap (edu.stanford.nlp.util.CoreMap): 2
CorefCoreAnnotations (edu.stanford.nlp.coref.CorefCoreAnnotations): 1
CorefChain (edu.stanford.nlp.coref.data.CorefChain): 1
Dictionaries (edu.stanford.nlp.coref.data.Dictionaries): 1
Mention (edu.stanford.nlp.coref.data.Mention): 1
SpeakerInfo (edu.stanford.nlp.coref.data.SpeakerInfo): 1