Example 66 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project CoreNLP by stanfordnlp.

the class StanfordCoreNlpDemoChinese method main.

public static void main(String[] args) throws IOException {
    // set up optional output files
    PrintWriter out;
    if (args.length > 1) {
        out = new PrintWriter(args[1]);
    } else {
        out = new PrintWriter(System.out);
    }
    Properties props = new Properties();
    props.load(IOUtils.readerFromString("StanfordCoreNLP-chinese.properties"));
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    Annotation document;
    if (args.length > 0) {
        document = new Annotation(IOUtils.slurpFileNoExceptions(args[0]));
    } else {
        document = new Annotation("克林顿说,华盛顿将逐步落实对韩国的经济援助。金大中对克林顿的讲话报以掌声:克林顿总统在会谈中重申,他坚定地支持韩国摆脱经济危机。");
    }
    pipeline.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    int sentNo = 1;
    for (CoreMap sentence : sentences) {
        out.println("Sentence #" + sentNo + " tokens are:");
        for (CoreMap token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
            out.println(token.toShorterString("Text", "CharacterOffsetBegin", "CharacterOffsetEnd", "Index", "PartOfSpeech", "NamedEntityTag"));
        }
        out.println("Sentence #" + sentNo + " basic dependencies are:");
        out.println(sentence.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class).toString(SemanticGraph.OutputFormat.LIST));
        sentNo++;
    }
    // Access coreference.
    out.println("Coreference information");
    Map<Integer, CorefChain> corefChains = document.get(CorefCoreAnnotations.CorefChainAnnotation.class);
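    if (corefChains == null) {
        // No coreference annotations available: close (and thereby flush) the writer so buffered output is not lost.
        IOUtils.closeIgnoringExceptions(out);
        return;
    }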
    for (Map.Entry<Integer, CorefChain> entry : corefChains.entrySet()) {
        out.println("Chain " + entry.getKey());
        for (CorefChain.CorefMention m : entry.getValue().getMentionsInTextualOrder()) {
            // We need to subtract one since the indices count from 1 but the Lists start from 0
            List<CoreLabel> tokens = sentences.get(m.sentNum - 1).get(CoreAnnotations.TokensAnnotation.class);
            // We subtract two for end: one for 0-based indexing, and one because we want last token of mention not one following.
            out.println("  " + m + ":[" + tokens.get(m.startIndex - 1).beginPosition() + ", " + tokens.get(m.endIndex - 2).endPosition() + ')');
        }
    }
    IOUtils.closeIgnoringExceptions(out);
}
Also used : SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) Properties(java.util.Properties) CorefCoreAnnotations(edu.stanford.nlp.coref.CorefCoreAnnotations) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP) Annotation(edu.stanford.nlp.pipeline.Annotation) CoreLabel(edu.stanford.nlp.ling.CoreLabel) CorefChain(edu.stanford.nlp.coref.data.CorefChain) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) CoreMap(edu.stanford.nlp.util.CoreMap) Map(java.util.Map) PrintWriter(java.io.PrintWriter)
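
The index arithmetic in the coreference loop above is easy to get wrong, so here is a minimal sketch of it in isolation. It assumes, as the comments in the demo state, that mention token indices count from 1 and that the end index points one past the last token of the mention. The `Token` class and the `charSpan` helper are hypothetical stand-ins written for this illustration; they are not part of CoreNLP.

```java
import java.util.Arrays;
import java.util.List;

public class MentionSpanSketch {

    /** Hypothetical stand-in for a token that only carries character offsets. */
    static final class Token {
        final String text;
        final int begin; // offset of the first character
        final int end;   // offset one past the last character

        Token(String text, int begin, int end) {
            this.text = text;
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Converts 1-based token indices (start inclusive, end exclusive)
     * into a character span over a 0-based token list.
     */
    static int[] charSpan(List<Token> tokens, int startIndex, int endIndex) {
        // Subtract one: indices count from 1, the list counts from 0.
        Token first = tokens.get(startIndex - 1);
        // Subtract two: one for 0-based indexing, one because endIndex
        // points at the token following the mention.
        Token last = tokens.get(endIndex - 2);
        return new int[] {first.begin, last.end};
    }

    public static void main(String[] args) {
        // Tokens of the text "Bill Clinton spoke".
        List<Token> tokens = Arrays.asList(
                new Token("Bill", 0, 4),
                new Token("Clinton", 5, 12),
                new Token("spoke", 13, 18));
        // The mention "Bill Clinton" covers tokens 1 and 2, so endIndex is 3.
        int[] span = charSpan(tokens, 1, 3);
        System.out.println("[" + span[0] + ", " + span[1] + ")"); // prints "[0, 12)"
    }
}
```

A single-token mention has `endIndex == startIndex + 1`, so both lookups land on the same token.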

Example 67 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project neo4j-nlp-stanfordnlp by graphaware.

the class DependencyParserTest method testStanfordNLPWithPredefinedProcessors.

@Test
public void testStanfordNLPWithPredefinedProcessors() throws Exception {
    StanfordCoreNLP pipeline = ((StanfordTextProcessor) textProcessor).getPipeline("default");
    String text = "Donald Trump flew yesterday to New York City";
    AnnotatedText at = textProcessor.annotateText(text, "en", PIPELINE_DEFAULT);
    Annotation document = new Annotation(text);
    pipeline.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    CoreMap sentence = sentences.get(0);
    System.out.println(sentence.toString());
    SemanticGraph graph = sentence.get(SemanticGraphCoreAnnotations.EnhancedDependenciesAnnotation.class);
    System.out.println(graph);
    List<SemanticGraphEdge> edges = graph.edgeListSorted();
    for (SemanticGraphEdge edge : edges) {
        System.out.println(edge.getRelation().getShortName());
        System.out.println(String.format("Source is : %s - Target is : %s - Relation is : %s", edge.getSource(), edge.getTarget(), edge.getRelation()));
    }
}
Also used : AnnotatedText(com.graphaware.nlp.domain.AnnotatedText) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) SemanticGraph(edu.stanford.nlp.semgraph.SemanticGraph) StanfordTextProcessor(com.graphaware.nlp.processor.stanford.StanfordTextProcessor) CoreMap(edu.stanford.nlp.util.CoreMap) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP) Annotation(edu.stanford.nlp.pipeline.Annotation) SemanticGraphEdge(edu.stanford.nlp.semgraph.SemanticGraphEdge) Test(org.junit.Test)

Example 68 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project neo4j-nlp-stanfordnlp by graphaware.

the class DependencyParserTest method testEnhancedDependencyParsingWithQuestion.

@Test
public void testEnhancedDependencyParsingWithQuestion() throws Exception {
    String text = "In what area was Frederic born in";
    StanfordCoreNLP pipeline = ((StanfordTextProcessor) textProcessor).getPipeline("default");
    Map<String, Object> customPipeline = new HashMap<>();
    customPipeline.put("textProcessor", "com.graphaware.nlp.processor.stanford.StanfordTextProcessor");
    customPipeline.put("name", "custom");
    customPipeline.put("stopWords", "start,starts");
    customPipeline.put("processingSteps", Collections.singletonMap("dependency", true));
    PipelineSpecification pipelineSpecification = PipelineSpecification.fromMap(customPipeline);
    ((StanfordTextProcessor) textProcessor).createPipeline(pipelineSpecification);
    textProcessor.annotateText(text, "en", pipelineSpecification);
    Annotation document = new Annotation(text);
    pipeline.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        System.out.println(sentence.toString());
        SemanticGraph graph = sentence.get(SemanticGraphCoreAnnotations.EnhancedDependenciesAnnotation.class);
        graph.getRoots().forEach(System.out::println);
        System.out.println(graph);
        for (SemanticGraphEdge edge : graph.edgeListSorted()) {
            System.out.println(String.format("Source is : %s - Target is : %s - Relation is : %s", edge.getSource(), edge.getTarget(), edge.getRelation()));
        }
    }
}
Also used : SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) StanfordTextProcessor(com.graphaware.nlp.processor.stanford.StanfordTextProcessor) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP) Annotation(edu.stanford.nlp.pipeline.Annotation) SemanticGraphEdge(edu.stanford.nlp.semgraph.SemanticGraphEdge) PipelineSpecification(com.graphaware.nlp.dsl.request.PipelineSpecification) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) SemanticGraph(edu.stanford.nlp.semgraph.SemanticGraph) CoreMap(edu.stanford.nlp.util.CoreMap) Test(org.junit.Test)

Example 69 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project neo4j-nlp-stanfordnlp by graphaware.

the class DependencyParserTest method testEnhancedDependencyParsingWithComplexTest.

@Test
public void testEnhancedDependencyParsingWithComplexTest() throws Exception {
    String text = "Softfoot and Small Paul would kill the Old Beard, Dirk would do Blane, and Lark and his cousins would silence Bannen and old Dywen, to keep them from sniffing after their trail.";
    StanfordCoreNLP pipeline = ((StanfordTextProcessor) textProcessor).getPipeline("default");
    AnnotatedText at = textProcessor.annotateText(text, "en", PIPELINE_DEFAULT);
    Annotation document = new Annotation(text);
    pipeline.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        System.out.println(sentence.toString());
        SemanticGraph graph = sentence.get(SemanticGraphCoreAnnotations.EnhancedDependenciesAnnotation.class);
        System.out.println(graph);
        for (SemanticGraphEdge edge : graph.edgeListSorted()) {
            System.out.println(String.format("Source is : %s - Target is : %s - Relation is : %s", edge.getSource(), edge.getTarget(), edge.getRelation()));
        }
    }
}
Also used : AnnotatedText(com.graphaware.nlp.domain.AnnotatedText) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) SemanticGraph(edu.stanford.nlp.semgraph.SemanticGraph) StanfordTextProcessor(com.graphaware.nlp.processor.stanford.StanfordTextProcessor) CoreMap(edu.stanford.nlp.util.CoreMap) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP) Annotation(edu.stanford.nlp.pipeline.Annotation) SemanticGraphEdge(edu.stanford.nlp.semgraph.SemanticGraphEdge) Test(org.junit.Test)

Example 70 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project neo4j-nlp-stanfordnlp by graphaware.

the class StanfordTextProcessor method annotateText.

@Override
public AnnotatedText annotateText(String text, String lang, PipelineSpecification pipelineSpecification) {
    checkPipelineExistOrCreate(pipelineSpecification);
    AnnotatedText result = new AnnotatedText();
    Annotation document = new Annotation(text);
    StanfordCoreNLP pipeline = pipelines.get(pipelineSpecification.getName());
    // Add custom NER models
    if (pipelineSpecification.hasProcessingStep("customNER")) {
        String modelPath = getCustomModelsPaths(pipelineSpecification.getProcessingStepAsString("customNER"));
        pipeline = new PipelineBuilder(pipelineSpecification.getName()).tokenize().extractNEs(modelPath).defaultStopWordAnnotator().build();
        pipeline.getProperties().setProperty("ner.model", modelPath);
        LOG.info("Custom NER(s) set to: " + pipeline.getProperties().getProperty("ner.model"));
    }
    pipeline.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    final AtomicInteger sentenceSequence = new AtomicInteger(0);
    sentences.forEach((sentence) -> {
        int sentenceNumber = sentenceSequence.getAndIncrement();
        final Sentence newSentence = new Sentence(sentence.toString(), sentenceNumber);
        if (pipelineSpecification.hasProcessingStep(STEP_NER, true) || pipelineSpecification.hasProcessingStep("customNER")) {
            extractTokens(lang, sentence, newSentence, pipelineSpecification.getExcludedNER(), pipelineSpecification);
        }
        if (pipelineSpecification.hasProcessingStep(STEP_SENTIMENT, false)) {
            extractSentiment(sentence, newSentence);
        }
        if (pipelineSpecification.hasProcessingStep(STEP_PHRASE, false)) {
            extractPhrases(sentence, newSentence);
        }
        if (pipelineSpecification.hasProcessingStep(STEP_DEPENDENCY, false)) {
            extractDependencies(sentence, newSentence);
        }
        if (pipelineSpecification.hasProcessingStep(STEP_RELATIONS, false)) {
            extractRelationship(result, sentences, document);
        }
        result.addSentence(newSentence);
    });
    return result;
}
Also used : AtomicInteger(java.util.concurrent.atomic.AtomicInteger) SemanticGraphCoreAnnotations(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations) RNNCoreAnnotations(edu.stanford.nlp.neural.rnn.RNNCoreAnnotations) TreeCoreAnnotations(edu.stanford.nlp.trees.TreeCoreAnnotations) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) SentimentCoreAnnotations(edu.stanford.nlp.sentiment.SentimentCoreAnnotations) CorefCoreAnnotations(edu.stanford.nlp.coref.CorefCoreAnnotations) CoreMap(edu.stanford.nlp.util.CoreMap) Annotation(edu.stanford.nlp.pipeline.Annotation) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP)
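
The `annotateText` method above numbers sentences with an `AtomicInteger` rather than a plain `int`. A likely reason is that a local variable captured by a lambda must be effectively final, so a plain counter could not be incremented inside `forEach`. The sketch below shows that pattern on its own using only the standard library; the class name `SentenceNumberingSketch` and the `number` helper are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class SentenceNumberingSketch {

    /** Prefixes each sentence with a 0-based sequence number assigned inside a lambda. */
    static List<String> number(List<String> sentences) {
        List<String> numbered = new ArrayList<>();
        // The reference is final; only the value held inside it changes,
        // which is what allows the lambda below to capture it.
        final AtomicInteger sequence = new AtomicInteger(0);
        sentences.forEach(sentence -> {
            // getAndIncrement returns the current value, then adds one.
            int sentenceNumber = sequence.getAndIncrement();
            numbered.add(sentenceNumber + ": " + sentence);
        });
        return numbered;
    }

    public static void main(String[] args) {
        number(Arrays.asList("First sentence.", "Second sentence."))
                .forEach(System.out::println);
        // prints:
        // 0: First sentence.
        // 1: Second sentence.
    }
}
```

An indexed `for` loop would avoid the counter object entirely; the `AtomicInteger` form is mainly useful when the iteration has to stay inside a lambda.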

Aggregations

StanfordCoreNLP (edu.stanford.nlp.pipeline.StanfordCoreNLP): 71
Properties (java.util.Properties): 44
Annotation (edu.stanford.nlp.pipeline.Annotation): 40
CoreAnnotations (edu.stanford.nlp.ling.CoreAnnotations): 33
CoreMap (edu.stanford.nlp.util.CoreMap): 33
Test (org.junit.Test): 15
CoreLabel (edu.stanford.nlp.ling.CoreLabel): 12
SemanticGraphCoreAnnotations (edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations): 12
SemanticGraph (edu.stanford.nlp.semgraph.SemanticGraph): 10
CorefCoreAnnotations (edu.stanford.nlp.coref.CorefCoreAnnotations): 6
SemanticGraphEdge (edu.stanford.nlp.semgraph.SemanticGraphEdge): 6
StanfordTextProcessor (com.graphaware.nlp.processor.stanford.StanfordTextProcessor): 5
TreeCoreAnnotations (edu.stanford.nlp.trees.TreeCoreAnnotations): 5
PrintWriter (java.io.PrintWriter): 5
ArrayList (java.util.ArrayList): 5
AnnotatedText (com.graphaware.nlp.domain.AnnotatedText): 3
CorefChain (edu.stanford.nlp.coref.data.CorefChain): 3
GoldAnswerAnnotation (edu.stanford.nlp.ling.CoreAnnotations.GoldAnswerAnnotation): 3
SentencesAnnotation (edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation): 3
TokenSequencePattern (edu.stanford.nlp.ling.tokensregex.TokenSequencePattern): 3