Example 6 with IndexWord

Use of net.didion.jwnl.data.IndexWord in project lucida by claritylab.

From the class FocusFinder, method getHeadWordOrPhrase.

/**
 * Extracts the head word or phrase from the given Tree node, which is assumed
 * to be an NP. Whether a phrase should be the head is determined by looking up
 * in WordNet all the possible phrases that can be constructed from the immediate
 * children of the input Tree and which include the right-most child.
 *
 * @param tree the Tree node from which to extract the head word or phrase
 * @return the single matching preterminal, a new NP node wrapping the matched
 *         phrase, or the head node of the tree if no phrase is found in WordNet
 */
public static Tree getHeadWordOrPhrase(Tree tree) {
    TreeHelper.markHeadNode(tree);
    //Tree headChild = tree.getChild(tree.getHeadNodeChildIndex()); // can return null
    Tree headChild = tree.getHeadNode();
    if (!headChild.isPreterminal())
        return getHeadWordOrPhrase(headChild);
    List<Tree> pretermChildren = new ArrayList<Tree>();
    for (Tree child : tree.getChildren()) {
        if (child.isPreterminal() && !child.getLabel().equals("DT"))
            pretermChildren.add(child);
    }
    for (ListIterator<Tree> it = pretermChildren.listIterator(); it.hasNext(); ) {
        Tree t = it.next();
        StringBuilder phrase = new StringBuilder();
        List<Tree> nodes = new ArrayList<Tree>();
        nodes.add(t);
        phrase.append(t.getHeadWord() + " ");
        for (ListIterator<Tree> it2 = pretermChildren.listIterator(it.nextIndex()); it2.hasNext(); ) {
            Tree t2 = it2.next();
            phrase.append(t2.getHeadWord() + " ");
            nodes.add(t2);
        }
        String phr = phrase.toString().trim();
        // Count the spaces in the candidate phrase (number of words minus one).
        int phrSpaces = 0;
        Matcher m = Pattern.compile(" ").matcher(phr);
        while (m.find()) phrSpaces++;
        try {
            IndexWord indexWord = Dictionary.getInstance().lookupIndexWord(POS.NOUN, phr);
            // No WordNet entry for this candidate: try the next, shorter phrase.
            if (indexWord == null)
                continue;
            // Accept the entry only if its lemma has as many words as the candidate.
            int wrdSpaces = 0;
            Matcher m2 = Pattern.compile(" ").matcher(indexWord.getLemma());
            while (m2.find()) wrdSpaces++;
            if (wrdSpaces != phrSpaces)
                continue;
        } catch (Exception e) {
            // A failed lookup is treated like a missing entry.
            continue;
        }
        if (nodes.size() == 1)
            return nodes.get(0);
        else
            return Tree.newNode("NP", nodes);
    }
    return tree.getHeadNode();
}
Also used: Matcher (java.util.regex.Matcher), ArrayList (java.util.ArrayList), Tree (edu.cmu.lti.chineseNLP.util.Tree), IndexWord (net.didion.jwnl.data.IndexWord)
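
The phrase search in getHeadWordOrPhrase can be hard to see through the Tree and JWNL calls, so here is a minimal sketch of the same idea on plain strings. It tries each suffix of the word list, longest first, and returns the first one found in a dictionary. The class HeadPhraseSketch, the method longestKnownSuffix, and the in-memory set of lemmas are all hypothetical stand-ins for illustration; they are not part of lucida or JWNL. The fallback to the last word is also an assumption (the original returns the tree's head node), and the lemma word-count check is omitted because the stand-in dictionary matches exactly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HeadPhraseSketch {

    /**
     * Returns the longest suffix of words that appears in knownLemmas,
     * or the last word if no suffix matches (assumed fallback).
     */
    static String longestKnownSuffix(List<String> words, Set<String> knownLemmas) {
        // Start at the left-most word so the longest candidate is tried first;
        // every candidate runs to the end, so it includes the right-most word.
        for (int start = 0; start < words.size(); start++) {
            String phrase = String.join(" ", words.subList(start, words.size()));
            if (knownLemmas.contains(phrase)) {
                return phrase;
            }
        }
        return words.isEmpty() ? "" : words.get(words.size() - 1);
    }

    public static void main(String[] args) {
        // Hypothetical dictionary standing in for the WordNet noun index.
        Set<String> lemmas = new HashSet<>(Arrays.asList("prime minister", "minister"));
        List<String> words = Arrays.asList("former", "prime", "minister");
        System.out.println(longestKnownSuffix(words, lemmas)); // prints "prime minister"
    }
}
```

Because the longest candidate is tried first, a multi-word entry such as "prime minister" wins over its single-word head "minister".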

Example 7 with IndexWord

Use of net.didion.jwnl.data.IndexWord in project lucida by claritylab.

From the class WordNetAnswerTypeMapping, method getAnswerType.

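/**
 * Maps a focus term to an answer type by looking up its noun senses in WordNet.
 * Falls back to the lower-cased focus text, with spaces replaced by underscores,
 * when no sense maps to an answer type.
 *
 * @param focusTerm the focus term to map; may be null
 * @return the first answer type in atypeComparator order, the fallback string,
 *         or null if focusTerm is null
 */
public static String getAnswerType(Term focusTerm) {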
    if (focusTerm == null) {
        return null;
    }
    String focusText = focusTerm.getText();
    List<AnswerType> focusTypes = new ArrayList<AnswerType>();
    try {
        IndexWord indexWord = Dictionary.getInstance().lookupIndexWord(POS.NOUN, focusText);
        if (indexWord == null)
            throw new Exception("Failed to get index word");
        Synset[] senses = indexWord.getSenses();
        if (senses == null)
            throw new Exception("Failed to get synsets");
        for (Synset sense : senses) {
            AnswerType type = findWnMapMatch(sense, 0);
            if (type != null) {
                focusTypes.add(type);
            }
        }
    } catch (Exception e) {
        log.warn("Failed to get hypernyms for noun '" + focusText + "'");
    }
    if (focusTypes.size() == 0)
        return focusText.toLowerCase().replaceAll(" ", "_");
    Collections.sort(focusTypes, atypeComparator);
    return focusTypes.get(0).getType();
}
Also used: Synset (net.didion.jwnl.data.Synset), ArrayList (java.util.ArrayList), IndexWord (net.didion.jwnl.data.IndexWord)
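
The selection step at the end of getAnswerType (sort the candidates, take the first, or fall back to a normalized form of the focus text) can be sketched without WordNet. In the sketch below, the Candidate class, its score field, the BY_SCORE_DESC ordering, and the type labels are hypothetical: the listing does not show AnswerType's fields or how atypeComparator orders them, so "higher score first" is only an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class AnswerTypeSketch {

    /** Hypothetical stand-in for AnswerType: a label plus a score. */
    static final class Candidate {
        final String type;
        final double score;

        Candidate(String type, double score) {
            this.type = type;
            this.score = score;
        }
    }

    /** Assumed ordering (higher score first); the real atypeComparator is not shown. */
    static final Comparator<Candidate> BY_SCORE_DESC =
            (a, b) -> Double.compare(b.score, a.score);

    static String pickAnswerType(String focusText, List<Candidate> candidates) {
        if (focusText == null) {
            return null;
        }
        // No mapped sense: fall back to the normalized focus text, as in the listing.
        if (candidates.isEmpty()) {
            return focusText.toLowerCase().replaceAll(" ", "_");
        }
        // Sort a copy so the caller's list is left untouched.
        List<Candidate> sorted = new ArrayList<>(candidates);
        Collections.sort(sorted, BY_SCORE_DESC);
        return sorted.get(0).type;
    }

    public static void main(String[] args) {
        List<Candidate> none = Collections.emptyList();
        System.out.println(pickAnswerType("Prime Minister", none)); // prints "prime_minister"

        // Made-up labels and scores, for illustration only.
        List<Candidate> some = Arrays.asList(
                new Candidate("LOCATION", 0.4),
                new Candidate("LOCATION.CITY", 0.9));
        System.out.println(pickAnswerType("city", some)); // prints "LOCATION.CITY"
    }
}
```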

Aggregations

IndexWord (net.didion.jwnl.data.IndexWord): 7, JWNLException (net.didion.jwnl.JWNLException): 4, Tree (edu.cmu.lti.chineseNLP.util.Tree): 2, ArrayList (java.util.ArrayList): 2, Matcher (java.util.regex.Matcher): 2, Synset (net.didion.jwnl.data.Synset): 2, Term (edu.cmu.lti.javelin.qa.Term): 1, Feature (edu.cmu.minorthird.classify.Feature): 1, IndexWordSet (net.didion.jwnl.data.IndexWordSet): 1