Example 6 with MaxentTagger

Use of edu.stanford.nlp.tagger.maxent.MaxentTagger in project CoreNLP by stanfordnlp.

From the class TaggerDemo, method main.

public static void main(String[] args) throws Exception {
    if (args.length != 2) {
        log.info("usage: java TaggerDemo modelFile fileToTag");
        return;
    }
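    // Load the tagger model from the path given as the first argument.
    MaxentTagger tagger = new MaxentTagger(args[0]);
    List<List<HasWord>> sentences;
    // Tokenize the whole input file, then close it.
    try (BufferedReader reader = new BufferedReader(new FileReader(args[1]))) {
        sentences = MaxentTagger.tokenizeText(reader);
    }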
    for (List<HasWord> sentence : sentences) {
        List<TaggedWord> tSentence = tagger.tagSentence(sentence);
        System.out.println(SentenceUtils.listToString(tSentence, false));
    }
}
Also used: HasWord (edu.stanford.nlp.ling.HasWord), TaggedWord (edu.stanford.nlp.ling.TaggedWord), MaxentTagger (edu.stanford.nlp.tagger.maxent.MaxentTagger), BufferedReader (java.io.BufferedReader), FileReader (java.io.FileReader), List (java.util.List)

Example 7 with MaxentTagger

Use of edu.stanford.nlp.tagger.maxent.MaxentTagger in project CoreNLP by stanfordnlp.

From the class TaggerDemo2, method main.

public static void main(String[] args) throws Exception {
    if (args.length != 2) {
        log.info("usage: java TaggerDemo2 modelFile fileToTag");
        return;
    }
    MaxentTagger tagger = new MaxentTagger(args[0]);
    TokenizerFactory<CoreLabel> ptbTokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(), "untokenizable=noneKeep");
    BufferedReader r = new BufferedReader(new InputStreamReader(new FileInputStream(args[1]), "utf-8"));
    PrintWriter pw = new PrintWriter(new OutputStreamWriter(System.out, "utf-8"));
    DocumentPreprocessor documentPreprocessor = new DocumentPreprocessor(r);
    documentPreprocessor.setTokenizerFactory(ptbTokenizerFactory);
    for (List<HasWord> sentence : documentPreprocessor) {
        List<TaggedWord> tSentence = tagger.tagSentence(sentence);
        pw.println(SentenceUtils.listToString(tSentence, false));
    }
    // Print the adjectives in one more sentence. This shows how to access the words and tags of a tagged sentence.
    List<HasWord> sent = SentenceUtils.toWordList("The", "slimy", "slug", "crawled", "over", "the", "long", ",", "green", "grass", ".");
    List<TaggedWord> taggedSent = tagger.tagSentence(sent);
    for (TaggedWord tw : taggedSent) {
        if (tw.tag().startsWith("JJ")) {
            pw.println(tw.word());
        }
    }
    pw.close();
}
Also used: HasWord (edu.stanford.nlp.ling.HasWord), CoreLabelTokenFactory (edu.stanford.nlp.process.CoreLabelTokenFactory), InputStreamReader (java.io.InputStreamReader), FileInputStream (java.io.FileInputStream), CoreLabel (edu.stanford.nlp.ling.CoreLabel), TaggedWord (edu.stanford.nlp.ling.TaggedWord), MaxentTagger (edu.stanford.nlp.tagger.maxent.MaxentTagger), BufferedReader (java.io.BufferedReader), OutputStreamWriter (java.io.OutputStreamWriter), DocumentPreprocessor (edu.stanford.nlp.process.DocumentPreprocessor), PrintWriter (java.io.PrintWriter)
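
The last loop of Example 7 picks adjectives out of TaggedWord objects. The same filter can be applied to saved text output, assuming each printed line consists of space-separated word/TAG tokens (which is how TaggedWord values are usually rendered). The following is a minimal sketch under that assumption; the class TaggedOutputFilter, the method wordsWithTagPrefix, and the sample line are hypothetical and not part of CoreNLP, and the tags in the sample were written by hand rather than produced by a tagger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CoreNLP.
public class TaggedOutputFilter {

    // Returns the words whose tag starts with tagPrefix, given one line of
    // space-separated word/TAG tokens (assumed output format).
    public static List<String> wordsWithTagPrefix(String taggedLine, String tagPrefix) {
        List<String> words = new ArrayList<>();
        for (String token : taggedLine.trim().split("\\s+")) {
            // Split on the last slash so that words containing '/' stay intact.
            int slash = token.lastIndexOf('/');
            if (slash <= 0 || slash == token.length() - 1) {
                continue; // not a word/TAG token, skip it
            }
            String word = token.substring(0, slash);
            String tag = token.substring(slash + 1);
            if (tag.startsWith(tagPrefix)) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Hand-written sample line; the tags are illustrative only.
        String line = "The/DT slimy/JJ slug/NN crawled/VBD over/IN the/DT long/JJ ,/, green/JJ grass/NN ./.";
        System.out.println(wordsWithTagPrefix(line, "JJ")); // prints [slimy, long, green]
    }
}
```

Matching on the prefix "JJ" mirrors the startsWith("JJ") test in the example, so comparative and superlative adjective tags (JJR, JJS) are included as well.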

Example 8 with MaxentTagger

Use of edu.stanford.nlp.tagger.maxent.MaxentTagger in project CoreNLP by stanfordnlp.

From the class AddTaggerToParser, method main.

public static void main(String[] args) throws IOException, ClassNotFoundException {
    String taggerFile = null;
    String inputFile = null;
    String outputFile = null;
    double weight = 1.0;
    for (int argIndex = 0; argIndex < args.length; ) {
        if (args[argIndex].equalsIgnoreCase("-tagger")) {
            taggerFile = args[argIndex + 1];
            argIndex += 2;
        } else if (args[argIndex].equalsIgnoreCase("-input")) {
            inputFile = args[argIndex + 1];
            argIndex += 2;
        } else if (args[argIndex].equalsIgnoreCase("-output")) {
            outputFile = args[argIndex + 1];
            argIndex += 2;
        } else if (args[argIndex].equalsIgnoreCase("-weight")) {
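            // parseDouble returns a primitive and avoids the boxing done by valueOf.
            // Note: weight is parsed here but not used further in this snippet.
            weight = Double.parseDouble(args[argIndex + 1]);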
            argIndex += 2;
        } else {
            throw new IllegalArgumentException("Unknown argument: " + args[argIndex]);
        }
    }
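    // Fail early with a clear message instead of passing null paths to the loaders.
    if (taggerFile == null || inputFile == null || outputFile == null) {
        throw new IllegalArgumentException("Required arguments: -tagger <file> -input <file> -output <file>");
    }
    LexicalizedParser parser = LexicalizedParser.loadModel(inputFile);
    MaxentTagger tagger = new MaxentTagger(taggerFile);
    // Attach a tagger-based reranker to the parser, then save the combined model.
    parser.reranker = new TaggerReranker(tagger, parser.getOp());
    parser.saveParserToSerialized(outputFile);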
}
Also used: MaxentTagger (edu.stanford.nlp.tagger.maxent.MaxentTagger)
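
The argument loop in Example 8 reads args[argIndex + 1] without checking that a value follows the flag, so a trailing flag such as a lone -weight ends in an ArrayIndexOutOfBoundsException. A minimal sketch of the same flag/value pattern with that check added is shown here; the class FlagArgs and its parse method are hypothetical illustrations and not part of CoreNLP.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of CoreNLP.
public class FlagArgs {

    // Parses "-flag value" pairs. Flags are matched case-insensitively and
    // stored in lower case; unknown flags and missing values are rejected.
    public static Map<String, String> parse(String[] args, Set<String> knownFlags) {
        Map<String, String> values = new HashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            String flag = args[i].toLowerCase(Locale.ROOT);
            if (!knownFlags.contains(flag)) {
                throw new IllegalArgumentException("Unknown argument: " + args[i]);
            }
            if (i + 1 >= args.length) {
                throw new IllegalArgumentException("Missing value for argument: " + args[i]);
            }
            values.put(flag, args[i + 1]);
        }
        return values;
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("-tagger", "-input", "-output", "-weight"));
        Map<String, String> parsed = parse(args, known);
        // Fall back to 1.0 when no weight is given, as in the example above.
        double weight = Double.parseDouble(parsed.getOrDefault("-weight", "1.0"));
        System.out.println("tagger=" + parsed.get("-tagger") + " weight=" + weight);
    }
}
```

Collecting the values in a map keeps the validation in one place; the caller can then check that the required flags are present before loading any model.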

Example 9 with MaxentTagger

Use of edu.stanford.nlp.tagger.maxent.MaxentTagger in project CoreNLP by stanfordnlp.

From the class ShiftReduceDemo, method main.

public static void main(String[] args) {
    String modelPath = "edu/stanford/nlp/models/srparser/englishSR.ser.gz";
    String taggerPath = "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";
    for (int argIndex = 0; argIndex < args.length; ) {
        switch(args[argIndex]) {
            case "-tagger":
                taggerPath = args[argIndex + 1];
                argIndex += 2;
                break;
            case "-model":
                modelPath = args[argIndex + 1];
                argIndex += 2;
                break;
            default:
                throw new RuntimeException("Unknown argument " + args[argIndex]);
        }
    }
    String text = "My dog likes to shake his stuffed chickadee toy.";
    MaxentTagger tagger = new MaxentTagger(taggerPath);
    ShiftReduceParser model = ShiftReduceParser.loadModel(modelPath);
    DocumentPreprocessor tokenizer = new DocumentPreprocessor(new StringReader(text));
    for (List<HasWord> sentence : tokenizer) {
        List<TaggedWord> tagged = tagger.tagSentence(sentence);
        Tree tree = model.apply(tagged);
        log.info(tree);
    }
}
Also used: HasWord (edu.stanford.nlp.ling.HasWord), TaggedWord (edu.stanford.nlp.ling.TaggedWord), MaxentTagger (edu.stanford.nlp.tagger.maxent.MaxentTagger), ShiftReduceParser (edu.stanford.nlp.parser.shiftreduce.ShiftReduceParser), StringReader (java.io.StringReader), Tree (edu.stanford.nlp.trees.Tree), DocumentPreprocessor (edu.stanford.nlp.process.DocumentPreprocessor)

Example 10 with MaxentTagger

Use of edu.stanford.nlp.tagger.maxent.MaxentTagger in project CoreNLP by stanfordnlp.

From the class POSTaggerAnnotator, method loadModel.

private static MaxentTagger loadModel(String loc, boolean verbose) {
    Timing timer = null;
    if (verbose) {
        timer = new Timing();
        timer.doing("Loading POS Model [" + loc + ']');
    }
    MaxentTagger tagger = new MaxentTagger(loc);
    if (verbose) {
        timer.done();
    }
    return tagger;
}
Also used: MaxentTagger (edu.stanford.nlp.tagger.maxent.MaxentTagger)

Aggregations

MaxentTagger (edu.stanford.nlp.tagger.maxent.MaxentTagger): 10
HasWord (edu.stanford.nlp.ling.HasWord): 5
TaggedWord (edu.stanford.nlp.ling.TaggedWord): 5
DocumentPreprocessor (edu.stanford.nlp.process.DocumentPreprocessor): 4
BufferedReader (java.io.BufferedReader): 3
DependencyParser (edu.stanford.nlp.parser.nndep.DependencyParser): 2
ShiftReduceParser (edu.stanford.nlp.parser.shiftreduce.ShiftReduceParser): 2
GrammaticalStructure (edu.stanford.nlp.trees.GrammaticalStructure): 2
InputStreamReader (java.io.InputStreamReader): 2
StringReader (java.io.StringReader): 2
CoreLabel (edu.stanford.nlp.ling.CoreLabel): 1
LexicalizedParser (edu.stanford.nlp.parser.lexparser.LexicalizedParser): 1
CoreLabelTokenFactory (edu.stanford.nlp.process.CoreLabelTokenFactory): 1
EnglishGrammaticalStructure (edu.stanford.nlp.trees.EnglishGrammaticalStructure): 1
Tree (edu.stanford.nlp.trees.Tree): 1
TypedDependency (edu.stanford.nlp.trees.TypedDependency): 1
UniversalEnglishGrammaticalStructure (edu.stanford.nlp.trees.UniversalEnglishGrammaticalStructure): 1
ChineseGrammaticalStructure (edu.stanford.nlp.trees.international.pennchinese.ChineseGrammaticalStructure): 1
Timing (edu.stanford.nlp.util.Timing): 1
MulticoreWrapper (edu.stanford.nlp.util.concurrent.MulticoreWrapper): 1