Search in sources :

Example 16 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project Info-Evaluation by TechnionYP5777.

the class AnalyzeParagragh method InteractiveReasonFinding.

//This function makes the analyze process interactive with the user - he gets reasons to choose from and chooses the most fitiing one.
public LinkedList<ReasonPair> InteractiveReasonFinding() {
    LinkedList<ReasonPair> $ = new LinkedList<ReasonPair>();
    final Properties props = new Properties();
    props.put("annotators", "tokenize,ssplit, pos, regexner, parse,lemma,natlog,openie");
    final StanfordCoreNLP pipeLine = new StanfordCoreNLP(props);
    // inputText will be the text to evaluate in this example
    final String inputText = input + "";
    final Annotation document = new Annotation(inputText);
    // Finally we use the pipeline to annotate the document we created
    pipeLine.annotate(document);
    for (final CoreMap sentence : document.get(SentencesAnnotation.class)) for (RelationTriple ¢ : sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class)) $.add(new ReasonPair(¢.confidence, ¢.relationGloss() + " " + ¢.objectGloss()));
    return $;
}
Also used : RelationTriple(edu.stanford.nlp.ie.util.RelationTriple) ReasonPair(main.database.ReasonPair) Properties(java.util.Properties) CoreMap(edu.stanford.nlp.util.CoreMap) LinkedList(java.util.LinkedList) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP) SentencesAnnotation(edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation) Annotation(edu.stanford.nlp.pipeline.Annotation) CollapsedDependenciesAnnotation(edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedDependenciesAnnotation)

Example 17 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project textdb by TextDB.

the class NlpEntityOperator method extractNlpSpans.

/**
     * @param iField
     * @param attributeName
     * @return
     * @about This function takes an IField(TextField) and a String (the field's
     *        name) as input and uses the Stanford NLP package to process the
     *        field based on the input token type and nlpTypeIndicator. In the
     *        result spans, value represents the word itself and key represents
     *        the recognized token type
     * @overview First set up a pipeline of Annotators based on the
     *           nlpTypeIndicator. If the nlpTypeIndicator is "NE_ALL", we set
     *           up the NamedEntityTagAnnotator, if it's "POS", then only
     *           PartOfSpeechAnnotator is needed.
     *           <p>
     *           The pipeline has to be this order: TokenizerAnnotator,
     *           SentencesAnnotator, PartOfSpeechAnnotator, LemmaAnnotator and
     *           NamedEntityTagAnnotator.
     *           <p>
     *           In the pipeline, each token is wrapped as a CoreLabel and each
     *           sentence is wrapped as CoreMap. Each annotator adds its
     *           annotation to the CoreMap(sentence) or CoreLabel(token) object.
     *           <p>
     *           After the pipeline, scan each CoreLabel(token) for its
     *           NamedEntityAnnotation or PartOfSpeechAnnotator depends on the
     *           nlpTypeIndicator
     *           <p>
     *           For each Stanford NLP annotation, get it's corresponding
     *           inputnlpEntityType that used in this package, then check if it
     *           equals to the input token type. If yes, makes it a span and add
     *           to the return list.
     *           <p>
     *           The NLP package has annotations for the start and end position
     *           of a token and it perfectly matches the span design so we just
     *           use them.
     *           <p>
     *           For Example: With TextField value: "Microsoft, Google and
     *           Facebook are organizations while Donald Trump and Barack Obama
     *           are persons", with attributeName: Sentence1 and inputTokenType is
     *           Organization. Since the inputTokenType require us to use
     *           NamedEntity Annotator in the Stanford NLP package, the
     *           nlpTypeIndicator would be set to "NE". The pipeline would set
     *           up to cover the Named Entity Recognizer. Then get the value of
     *           NamedEntityTagAnnotation for each CoreLabel(token).If the value
     *           is the token type "Organization", then it meets the
     *           requirement. In this case "Microsoft","Google" and "Facebook"
     *           will satisfy the requirement. "Donald Trump" and "Barack Obama"
     *           would have token type "Person" and do not meet the requirement.
     *           For each qualified token, create a span accordingly and add it
     *           to the returned list. In this case, token "Microsoft" would be
     *           span: ["Sentence1", 0, 9, Organization, "Microsoft"]
     */
private List<Span> extractNlpSpans(IField iField, String attributeName) {
    List<Span> spanList = new ArrayList<>();
    String text = (String) iField.getValue();
    Properties props = new Properties();
    // Setup Stanford NLP pipeline based on nlpTypeIndicator
    StanfordCoreNLP pipeline = null;
    if (getNlpTypeIndicator(predicate.getNlpEntityType()).equals("POS")) {
        props.setProperty("annotators", "tokenize, ssplit, pos");
        if (posPipeline == null) {
            posPipeline = new StanfordCoreNLP(props);
        }
        pipeline = posPipeline;
    } else {
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, " + "ner");
        if (nerPipeline == null) {
            nerPipeline = new StanfordCoreNLP(props);
        }
        pipeline = nerPipeline;
    }
    Annotation documentAnnotation = new Annotation(text);
    pipeline.annotate(documentAnnotation);
    List<CoreMap> sentences = documentAnnotation.get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        for (CoreLabel token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
            String stanfordNlpConstant;
            // Extract annotations based on nlpTypeIndicator
            if (getNlpTypeIndicator(predicate.getNlpEntityType()).equals("POS")) {
                stanfordNlpConstant = token.get(CoreAnnotations.PartOfSpeechAnnotation.class);
            } else {
                stanfordNlpConstant = token.get(CoreAnnotations.NamedEntityTagAnnotation.class);
            }
            NlpEntityType nlpEntityType = mapNlpEntityType(stanfordNlpConstant);
            if (nlpEntityType == null) {
                continue;
            }
            if (predicate.getNlpEntityType().equals(NlpEntityType.NE_ALL) || predicate.getNlpEntityType().equals(nlpEntityType)) {
                int start = token.get(CoreAnnotations.CharacterOffsetBeginAnnotation.class);
                int end = token.get(CoreAnnotations.CharacterOffsetEndAnnotation.class);
                String word = token.get(CoreAnnotations.TextAnnotation.class);
                Span span = new Span(attributeName, start, end, nlpEntityType.toString(), word);
                if (spanList.size() >= 1 && (getNlpTypeIndicator(predicate.getNlpEntityType()).equals("NE_ALL"))) {
                    Span previousSpan = spanList.get(spanList.size() - 1);
                    if (previousSpan.getAttributeName().equals(span.getAttributeName()) && (span.getStart() - previousSpan.getEnd() <= 1) && previousSpan.getKey().equals(span.getKey())) {
                        Span newSpan = mergeTwoSpans(previousSpan, span);
                        span = newSpan;
                        spanList.remove(spanList.size() - 1);
                    }
                }
                spanList.add(span);
            }
        }
    }
    return spanList;
}
Also used : ArrayList(java.util.ArrayList) Properties(java.util.Properties) Span(edu.uci.ics.textdb.api.span.Span) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP) Annotation(edu.stanford.nlp.pipeline.Annotation) CoreLabel(edu.stanford.nlp.ling.CoreLabel) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) CoreMap(edu.stanford.nlp.util.CoreMap)

Example 18 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project textdb by TextDB.

the class NlpSentimentOperator method open.

@Override
public void open() throws TextDBException {
    if (cursor != CLOSED) {
        return;
    }
    if (inputOperator == null) {
        throw new DataFlowException(ErrorMessages.INPUT_OPERATOR_NOT_SPECIFIED);
    }
    inputOperator.open();
    Schema inputSchema = inputOperator.getOutputSchema();
    // check if input schema is present
    if (!inputSchema.containsField(predicate.getInputAttributeName())) {
        throw new RuntimeException(String.format("input attribute %s is not in the input schema %s", predicate.getInputAttributeName(), inputSchema.getAttributeNames()));
    }
    // check if attribute type is valid
    AttributeType inputAttributeType = inputSchema.getAttribute(predicate.getInputAttributeName()).getAttributeType();
    boolean isValidType = inputAttributeType.equals(AttributeType.STRING) || inputAttributeType.equals(AttributeType.TEXT);
    if (!isValidType) {
        throw new RuntimeException(String.format("input attribute %s must have type String or Text, its actual type is %s", predicate.getInputAttributeName(), inputAttributeType));
    }
    // generate output schema by transforming the input schema
    outputSchema = transformSchema(inputOperator.getOutputSchema());
    cursor = OPENED;
    // setup NLP sentiment analysis pipeline
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
    sentimentPipeline = new StanfordCoreNLP(props);
}
Also used : AttributeType(edu.uci.ics.textdb.api.schema.AttributeType) Schema(edu.uci.ics.textdb.api.schema.Schema) DataFlowException(edu.uci.ics.textdb.api.exception.DataFlowException) Properties(java.util.Properties) StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP)

Example 19 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project cogcomp-nlp by CogComp.

the class StanfordOpenIEHandler method initialize.

@Override
public void initialize(ResourceManager rm) {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,depparse,natlog,openie");
    this.pipeline = new StanfordCoreNLP(props);
}
Also used : StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP)

Example 20 with StanfordCoreNLP

use of edu.stanford.nlp.pipeline.StanfordCoreNLP in project cogcomp-nlp by CogComp.

the class StanfordTrueCaseHandler method initialize.

@Override
public void initialize(ResourceManager rm) {
    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma, truecase");
    props.put("truecase.model", DefaultPaths.DEFAULT_TRUECASE_MODEL);
    props.put("truecase.bias", TrueCaseAnnotator.DEFAULT_MODEL_BIAS);
    props.put("truecase.mixedcasefile", DefaultPaths.DEFAULT_TRUECASE_DISAMBIGUATION_LIST);
    this.pipeline = new StanfordCoreNLP(props);
}
Also used : StanfordCoreNLP(edu.stanford.nlp.pipeline.StanfordCoreNLP)

Aggregations

StanfordCoreNLP (edu.stanford.nlp.pipeline.StanfordCoreNLP)42 Properties (java.util.Properties)31 Annotation (edu.stanford.nlp.pipeline.Annotation)26 CoreMap (edu.stanford.nlp.util.CoreMap)19 CoreAnnotations (edu.stanford.nlp.ling.CoreAnnotations)16 SemanticGraph (edu.stanford.nlp.semgraph.SemanticGraph)7 CoreLabel (edu.stanford.nlp.ling.CoreLabel)6 SemanticGraphCoreAnnotations (edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations)5 Test (org.junit.Test)5 SentencesAnnotation (edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation)4 CollapsedDependenciesAnnotation (edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations.CollapsedDependenciesAnnotation)4 GoldAnswerAnnotation (edu.stanford.nlp.ling.CoreAnnotations.GoldAnswerAnnotation)3 IndexedWord (edu.stanford.nlp.ling.IndexedWord)3 SemanticGraphEdge (edu.stanford.nlp.semgraph.SemanticGraphEdge)3 TreeAnnotation (edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation)3 PrintWriter (java.io.PrintWriter)3 ArrayList (java.util.ArrayList)3 CorefCoreAnnotations (edu.stanford.nlp.coref.CorefCoreAnnotations)2 CorefChain (edu.stanford.nlp.coref.data.CorefChain)2 RelationTriple (edu.stanford.nlp.ie.util.RelationTriple)2