Example 6 with Morphology

Use of edu.stanford.nlp.process.Morphology in the CoreNLP project by stanfordnlp.

Class MaxentTagger, method runTagger.

public <X extends HasWord> void runTagger(Iterable<List<X>> document, BufferedWriter writer, OutputStyle outputStyle) throws IOException {
    Timing t = new Timing();
    // Running totals: numWords feeds the words-per-second report, numSentences numbers each output sentence
    int numWords = 0;
    int numSentences = 0;
    boolean outputVerbosity = config.getOutputVerbosity();
    boolean outputLemmas = config.getOutputLemmas();
    if (outputStyle == OutputStyle.XML || outputStyle == OutputStyle.INLINE_XML) {
        writer.write("<?xml version=\"1.0\" encoding=\"" + config.getEncoding() + "\"?>\n");
        writer.write("<pos>\n");
    }
    if (config.getNThreads() != 1) {
        MulticoreWrapper<List<? extends HasWord>, List<? extends HasWord>> wrapper = new MulticoreWrapper<>(config.getNThreads(), new SentenceTaggingProcessor(this, outputLemmas));
        for (List<X> sentence : document) {
            wrapper.put(sentence);
            while (wrapper.peek()) {
                List<? extends HasWord> taggedSentence = wrapper.poll();
                numWords += taggedSentence.size();
                outputTaggedSentence(taggedSentence, outputLemmas, outputStyle, outputVerbosity, numSentences, "\n", writer);
                numSentences++;
            }
        }
        wrapper.join();
        while (wrapper.peek()) {
            List<? extends HasWord> taggedSentence = wrapper.poll();
            numWords += taggedSentence.size();
            outputTaggedSentence(taggedSentence, outputLemmas, outputStyle, outputVerbosity, numSentences, "\n", writer);
            numSentences++;
        }
    } else {
        // Single-threaded path: build a Morphology only when lemmas were requested
        Morphology morpha = (outputLemmas) ? new Morphology() : null;
        for (List<X> sentence : document) {
            numWords += sentence.size();
            tagAndOutputSentence(sentence, outputLemmas, morpha, outputStyle, outputVerbosity, numSentences, "\n", writer);
            numSentences++;
        }
    }
    if (outputStyle == OutputStyle.XML || outputStyle == OutputStyle.INLINE_XML) {
        writer.write("</pos>\n");
    }
    writer.flush();
    long millis = t.stop();
    printErrWordsPerSec(millis, numWords);
}
Also used: MulticoreWrapper (edu.stanford.nlp.util.concurrent.MulticoreWrapper), Morphology (edu.stanford.nlp.process.Morphology), Timing (edu.stanford.nlp.util.Timing)
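
The multi-threaded branch of runTagger above submits each sentence with put, drains finished results with a peek/poll loop, then calls join and drains once more. The following is a minimal sketch of that submit-and-drain shape using only java.util.concurrent. It is not MulticoreWrapper's implementation: the class OrderedDrainSketch and the method processInOrder are hypothetical names introduced for illustration, and the sketch assumes results should come back in input order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical stand-in for the put / peek / poll / join pattern; not CoreNLP code.
public class OrderedDrainSketch {

    /** Applies fn to every input on a thread pool and returns the results in input order. */
    public static <I, O> List<O> processInOrder(List<I> inputs, Function<I, O> fn, int nThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        Deque<Future<O>> pending = new ArrayDeque<>();
        List<O> results = new ArrayList<>();
        try {
            for (I input : inputs) {
                // Submit one unit of work (the role played by put)
                Callable<O> task = () -> fn.apply(input);
                pending.add(pool.submit(task));
                // Drain results already finished at the head of the queue (peek / poll)
                while (!pending.isEmpty() && pending.peek().isDone()) {
                    results.add(pending.poll().get());
                }
            }
            // Everything is submitted: wait for the remainder (join plus the final drain loop)
            while (!pending.isEmpty()) {
                results.add(pending.poll().get());
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<String> sentences = Arrays.asList("a b c", "d e", "f");
        // Count whitespace-separated tokens per sentence on two threads
        List<Integer> lengths = processInOrder(sentences, s -> s.split(" ").length, 2);
        System.out.println(lengths); // prints [3, 2, 1]
    }
}
```

Draining inside the submission loop keeps the number of buffered results small on long documents, which is presumably why runTagger polls after every put instead of waiting until the end.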

Example 7 with Morphology

Use of edu.stanford.nlp.process.Morphology in the CoreNLP project by stanfordnlp.

Class TreeLemmatizer, method transformTree.

@Override
public Tree transformTree(Tree t) {
    Morphology morphology = new Morphology();
    List<TaggedWord> tagged = null;
    int index = 0;
    for (Tree leaf : t.getLeaves()) {
        Label label = leaf.label();
        if (label == null) {
            continue;
        }
        String tag;
        if (!(label instanceof HasTag) || ((HasTag) label).tag() == null) {
            // No tag on the label itself: fall back to the tree's tagged yield, computed once on first use
            if (tagged == null) {
                tagged = t.taggedYield();
            }
            tag = tagged.get(index).tag();
        } else {
            tag = ((HasTag) label).tag();
        }
        if (!(label instanceof HasLemma)) {
            throw new IllegalArgumentException("Got a tree with labels which do not support lemma");
        }
        ((HasLemma) label).setLemma(morphology.lemma(label.value(), tag, true));
        ++index;
    }
    return t;
}
Also used: HasLemma (edu.stanford.nlp.ling.HasLemma), TaggedWord (edu.stanford.nlp.ling.TaggedWord), Morphology (edu.stanford.nlp.process.Morphology), Label (edu.stanford.nlp.ling.Label), HasTag (edu.stanford.nlp.ling.HasTag)

Aggregations

Morphology (edu.stanford.nlp.process.Morphology): 7
CoreLabel (edu.stanford.nlp.ling.CoreLabel): 2
TaggedWord (edu.stanford.nlp.ling.TaggedWord): 2
Timing (edu.stanford.nlp.util.Timing): 2
CoreAnnotations (edu.stanford.nlp.ling.CoreAnnotations): 1
HasLemma (edu.stanford.nlp.ling.HasLemma): 1
HasTag (edu.stanford.nlp.ling.HasTag): 1
HasWord (edu.stanford.nlp.ling.HasWord): 1
Label (edu.stanford.nlp.ling.Label): 1
DocumentPreprocessor (edu.stanford.nlp.process.DocumentPreprocessor): 1
Tree (edu.stanford.nlp.trees.Tree): 1
CoreMap (edu.stanford.nlp.util.CoreMap): 1
MulticoreWrapper (edu.stanford.nlp.util.concurrent.MulticoreWrapper): 1
List (java.util.List): 1