Example 6 with RelationMention

use of edu.stanford.nlp.ie.machinereading.structure.RelationMention in project CoreNLP by stanfordnlp.

the class RelationExtractorAnnotator method annotate.

@Override
public void annotate(Annotation annotation) {
    // extract entities and relations
    Annotation output = mr.annotate(annotation);
    // transfer entities/relations back to the original annotation
    List<CoreMap> outputSentences = output.get(SentencesAnnotation.class);
    List<CoreMap> origSentences = annotation.get(SentencesAnnotation.class);
    for (int i = 0; i < outputSentences.size(); i++) {
        CoreMap outSent = outputSentences.get(i);
        CoreMap origSent = origSentences.get(i);
        // set entities
        List<EntityMention> entities = outSent.get(MachineReadingAnnotations.EntityMentionsAnnotation.class);
        origSent.set(MachineReadingAnnotations.EntityMentionsAnnotation.class, entities);
        if (verbose && entities != null) {
            log.info("Extracted the following entities:");
            for (EntityMention e : entities) {
                log.info("\t" + e);
            }
        }
        // set relations
        List<RelationMention> relations = outSent.get(MachineReadingAnnotations.RelationMentionsAnnotation.class);
        origSent.set(MachineReadingAnnotations.RelationMentionsAnnotation.class, relations);
        if (verbose && relations != null) {
            log.info("Extracted the following relations:");
            for (RelationMention r : relations) {
                if (!r.getType().equals(RelationMention.UNRELATED)) {
                    log.info(r);
                }
            }
        }
    }
}
Also used : MachineReadingAnnotations(edu.stanford.nlp.ie.machinereading.structure.MachineReadingAnnotations) EntityMention(edu.stanford.nlp.ie.machinereading.structure.EntityMention) RelationMention(edu.stanford.nlp.ie.machinereading.structure.RelationMention) CoreMap(edu.stanford.nlp.util.CoreMap) SentencesAnnotation(edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation) CoreAnnotation(edu.stanford.nlp.ling.CoreAnnotation) RelationMentionsAnnotation(edu.stanford.nlp.ie.machinereading.structure.MachineReadingAnnotations.RelationMentionsAnnotation)
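
The verbose branch above logs only relations whose type differs from the "unrelated" sentinel. A minimal standalone sketch of that filter follows, using plain strings instead of `RelationMention` objects. The class `RelatedFilterSketch`, the method `relatedOnly`, and the sentinel value are hypothetical and chosen for illustration; the real value of `RelationMention.UNRELATED` is not shown in the snippet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RelatedFilterSketch {
    // Placeholder sentinel for "no relation". This is an assumption for the
    // sketch, not the actual value of RelationMention.UNRELATED.
    static final String UNRELATED = "UNRELATED_PLACEHOLDER";

    // Returns the relation types that are not the sentinel; tolerates a null list,
    // mirroring the null check in the annotator's verbose branch.
    static List<String> relatedOnly(List<String> relationTypes) {
        List<String> kept = new ArrayList<>();
        if (relationTypes == null) {
            return kept;
        }
        for (String type : relationTypes) {
            if (!UNRELATED.equals(type)) {
                kept.add(type);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // prints "[Work_For, Live_In]"
        System.out.println(relatedOnly(Arrays.asList("Work_For", UNRELATED, "Live_In")));
    }
}
```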

Example 7 with RelationMention

use of edu.stanford.nlp.ie.machinereading.structure.RelationMention in project CoreNLP by stanfordnlp.

the class StanfordCoreNLPITest method testRelationExtractor.

public void testRelationExtractor() throws Exception {
    // Check that the relation extractor is integrated with StanfordCoreNLP
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,relation");
    // props.setProperty("sup.relation.model", "/home/sonalg/javanlp/tmp/roth_relation_model_pipeline.ser");
    String text = "Barack Obama, a Yale professor, is president.";
    Annotation document = new Annotation(text);
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    pipeline.annotate(document);
    CoreMap sentence = document.get(CoreAnnotations.SentencesAnnotation.class).get(0);
    List<RelationMention> rel = sentence.get(MachineReadingAnnotations.RelationMentionsAnnotation.class);
    // JUnit convention: expected value first, actual value second
    assertEquals("Work_For", rel.get(0).getType());
    // StringWriter stringWriter = new StringWriter();
    // pipeline.prettyPrint(document, new PrintWriter(stringWriter));
    // String result = stringWriter.getBuffer().toString();
    // System.out.println(result);
}
Also used : MachineReadingAnnotations(edu.stanford.nlp.ie.machinereading.structure.MachineReadingAnnotations) RelationMention(edu.stanford.nlp.ie.machinereading.structure.RelationMention) Properties(java.util.Properties) CoreMap(edu.stanford.nlp.util.CoreMap)

Example 8 with RelationMention

use of edu.stanford.nlp.ie.machinereading.structure.RelationMention in project CoreNLP by stanfordnlp.

the class AceReader method readDocument.

/**
 * Reads in a single ACE*.apf.xml file and converts it to RelationSentence
 * objects. However, you probably should call parse() instead.
 *
 * @param prefix prefix of ACE filename to read (e.g.
 *          "/u/mcclosky/scr/data/ACE2005/english_test/bc/CNN_CF_20030827.1630.01"
 *          ) (no ".apf.xml" extension)
 * @return list of RelationSentence objects
 */
private List<CoreMap> readDocument(String prefix, Annotation corpus) throws IOException, SAXException, ParserConfigurationException {
    logger.info("Reading document: " + prefix);
    List<CoreMap> results = new ArrayList<>();
    AceDocument aceDocument;
    if (aceVersion.equals("ACE2004")) {
        aceDocument = AceDocument.parseDocument(prefix, false, aceVersion);
    } else {
        aceDocument = AceDocument.parseDocument(prefix, false);
    }
    String docId = aceDocument.getId();
    // map entity mention ID strings to their EntityMention counterparts
    Map<String, EntityMention> entityMentionMap = Generics.newHashMap();
    /*
    for (int sentenceIndex = 0; sentenceIndex < aceDocument.getSentenceCount(); sentenceIndex++) {
      List<AceToken> tokens = aceDocument.getSentence(sentenceIndex);
      StringBuilder b = new StringBuilder();
      for(AceToken t: tokens) b.append(t.getLiteral() + " " );
      logger.info("SENTENCE: " + b.toString());
    }
    */
    int tokenOffset = 0;
    for (int sentenceIndex = 0; sentenceIndex < aceDocument.getSentenceCount(); sentenceIndex++) {
        List<AceToken> tokens = aceDocument.getSentence(sentenceIndex);
        List<CoreLabel> words = new ArrayList<>();
        StringBuilder textContent = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            CoreLabel l = new CoreLabel();
            l.setWord(tokens.get(i).getLiteral());
            l.set(CoreAnnotations.ValueAnnotation.class, l.word());
            l.set(CoreAnnotations.CharacterOffsetBeginAnnotation.class, tokens.get(i).getByteStart());
            l.set(CoreAnnotations.CharacterOffsetEndAnnotation.class, tokens.get(i).getByteEnd());
            words.add(l);
            if (i > 0)
                textContent.append(" ");
            textContent.append(tokens.get(i).getLiteral());
        }
        // skip "sentences" that are really just SGML tags (which come from using the RobustTokenizer)
        if (words.size() == 1) {
            String word = words.get(0).word();
            if (word.startsWith("<") && word.endsWith(">")) {
                tokenOffset += tokens.size();
                continue;
            }
        }
        CoreMap sentence = new Annotation(textContent.toString());
        sentence.set(CoreAnnotations.DocIDAnnotation.class, docId);
        sentence.set(CoreAnnotations.TokensAnnotation.class, words);
        logger.info("Reading sentence: \"" + textContent + "\"");
        List<AceEntityMention> entityMentions = aceDocument.getEntityMentions(sentenceIndex);
        List<AceRelationMention> relationMentions = aceDocument.getRelationMentions(sentenceIndex);
        List<AceEventMention> eventMentions = aceDocument.getEventMentions(sentenceIndex);
        // convert entity mentions
        for (AceEntityMention aceEntityMention : entityMentions) {
            String corefID = "";
            for (String entityID : aceDocument.getKeySetEntities()) {
                AceEntity e = aceDocument.getEntity(entityID);
                if (e.getMentions().contains(aceEntityMention)) {
                    corefID = entityID;
                    break;
                }
            }
            EntityMention convertedMention = convertAceEntityMention(aceEntityMention, docId, sentence, tokenOffset, corefID);
            // EntityMention convertedMention = convertAceEntityMention(aceEntityMention, docId, sentence, tokenOffset);
            entityCounts.incrementCount(convertedMention.getType());
            logger.info("CONVERTED MENTION HEAD SPAN: " + convertedMention.getHead());
            logger.info("CONVERTED ENTITY MENTION: " + convertedMention);
            AnnotationUtils.addEntityMention(sentence, convertedMention);
            entityMentionMap.put(aceEntityMention.getId(), convertedMention);
            // TODO: make Entity objects as needed
        }
        // convert relation mentions
        for (AceRelationMention aceRelationMention : relationMentions) {
            RelationMention convertedMention = convertAceRelationMention(aceRelationMention, docId, sentence, entityMentionMap);
            if (convertedMention != null) {
                relationCounts.incrementCount(convertedMention.getType());
                logger.info("CONVERTED RELATION MENTION: " + convertedMention);
                AnnotationUtils.addRelationMention(sentence, convertedMention);
            }
            // TODO: make Relation objects
        }
        // convert EventMentions
        for (AceEventMention aceEventMention : eventMentions) {
            EventMention convertedMention = convertAceEventMention(aceEventMention, docId, sentence, entityMentionMap, tokenOffset);
            if (convertedMention != null) {
                eventCounts.incrementCount(convertedMention.getType());
                logger.info("CONVERTED EVENT MENTION: " + convertedMention);
                AnnotationUtils.addEventMention(sentence, convertedMention);
            }
            // TODO: make Event objects
        }
        results.add(sentence);
        tokenOffset += tokens.size();
    }
    return results;
}
Also used : EventMention(edu.stanford.nlp.ie.machinereading.structure.EventMention) AceEventMention(edu.stanford.nlp.ie.machinereading.domains.ace.reader.AceEventMention) AceRelationMention(edu.stanford.nlp.ie.machinereading.domains.ace.reader.AceRelationMention) RelationMention(edu.stanford.nlp.ie.machinereading.structure.RelationMention) ArrayList(java.util.ArrayList) AceEntity(edu.stanford.nlp.ie.machinereading.domains.ace.reader.AceEntity) Annotation(edu.stanford.nlp.pipeline.Annotation) AceDocument(edu.stanford.nlp.ie.machinereading.domains.ace.reader.AceDocument) CoreLabel(edu.stanford.nlp.ling.CoreLabel) EntityMention(edu.stanford.nlp.ie.machinereading.structure.EntityMention) AceEntityMention(edu.stanford.nlp.ie.machinereading.domains.ace.reader.AceEntityMention) AceToken(edu.stanford.nlp.ie.machinereading.domains.ace.reader.AceToken) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations) CoreMap(edu.stanford.nlp.util.CoreMap)
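
In readDocument, a "sentence" consisting of a single SGML tag is skipped, yet its tokens are still added to `tokenOffset` so that later sentences keep their document-level token positions. The sketch below isolates that bookkeeping with plain string lists. The class `TokenOffsetSketch` and the method `keptSentenceOffsets` are hypothetical names for illustration and do not exist in CoreNLP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TokenOffsetSketch {
    // Returns the document-level offset of the first token of every kept sentence.
    // Sentences that are a single "<...>" token are skipped but still counted.
    static List<Integer> keptSentenceOffsets(List<List<String>> sentences) {
        List<Integer> offsets = new ArrayList<>();
        int tokenOffset = 0;
        for (List<String> tokens : sentences) {
            boolean sgmlOnly = tokens.size() == 1
                    && tokens.get(0).startsWith("<")
                    && tokens.get(0).endsWith(">");
            if (!sgmlOnly) {
                offsets.add(tokenOffset);
            }
            // Advance the offset for kept and skipped sentences alike.
            tokenOffset += tokens.size();
        }
        return offsets;
    }

    public static void main(String[] args) {
        List<List<String>> doc = Arrays.asList(
                Arrays.asList("<DOC>"),
                Arrays.asList("Hello", "world"),
                Arrays.asList("</DOC>"),
                Arrays.asList("Bye"));
        // prints "[1, 4]"
        System.out.println(keptSentenceOffsets(doc));
    }
}
```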

Example 9 with RelationMention

use of edu.stanford.nlp.ie.machinereading.structure.RelationMention in project CoreNLP by stanfordnlp.

the class AceReader method convertAceRelationMention.

private RelationMention convertAceRelationMention(AceRelationMention aceRelationMention, String docId, CoreMap sentence, Map<String, EntityMention> entityMap) {
    List<AceRelationMentionArgument> args = Arrays.asList(aceRelationMention.getArgs());
    List<ExtractionObject> convertedArgs = new ArrayList<>();
    List<String> argNames = new ArrayList<>();
    // the arguments are already stored in semantic order. Make sure we preserve the same ordering!
    int left = Integer.MAX_VALUE;
    int right = Integer.MIN_VALUE;
    for (AceRelationMentionArgument arg : args) {
        ExtractionObject o = entityMap.get(arg.getContent().getId());
        if (o == null) {
            logger.severe("READER ERROR: Failed to find relation argument with id " + arg.getContent().getId());
            logger.severe("This happens because a few relation mentions illegally span multiple sentences. Will ignore this mention.");
            return null;
        }
        convertedArgs.add(o);
        argNames.add(arg.getRole());
        if (o.getExtentTokenStart() < left)
            left = o.getExtentTokenStart();
        if (o.getExtentTokenEnd() > right)
            right = o.getExtentTokenEnd();
    }
    if (argNames.size() != 2 || !argNames.get(0).equalsIgnoreCase("arg-1") || !argNames.get(1).equalsIgnoreCase("arg-2")) {
        logger.severe("READER ERROR: Invalid succession of arguments in relation mention: " + argNames);
        logger.severe("ACE relations must have two arguments. Will ignore this mention.");
        return null;
    }
    RelationMention relation = new RelationMention(aceRelationMention.getId(), sentence, new Span(left, right), aceRelationMention.getParent().getType(), aceRelationMention.getParent().getSubtype(), convertedArgs, null);
    return relation;
}
Also used : AceRelationMentionArgument(edu.stanford.nlp.ie.machinereading.domains.ace.reader.AceRelationMentionArgument) AceRelationMention(edu.stanford.nlp.ie.machinereading.domains.ace.reader.AceRelationMention) RelationMention(edu.stanford.nlp.ie.machinereading.structure.RelationMention) ExtractionObject(edu.stanford.nlp.ie.machinereading.structure.ExtractionObject) ArrayList(java.util.ArrayList) Span(edu.stanford.nlp.ie.machinereading.structure.Span)
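
The loop in convertAceRelationMention widens `left` and `right` until they cover the extent of every argument, and the resulting pair becomes the relation's span. A minimal standalone sketch of that min/max computation follows. It uses `int[]` pairs in place of CoreNLP's `ExtractionObject` and `Span`, and the names `EnclosingSpanSketch` and `enclose` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class EnclosingSpanSketch {
    // Each extent is {startToken, endToken}. Returns the smallest {start, end}
    // pair that covers all of them.
    static int[] enclose(List<int[]> extents) {
        int left = Integer.MAX_VALUE;
        int right = Integer.MIN_VALUE;
        for (int[] extent : extents) {
            left = Math.min(left, extent[0]);
            right = Math.max(right, extent[1]);
        }
        return new int[] {left, right};
    }

    public static void main(String[] args) {
        // The second argument precedes the first in the sentence; order does not matter.
        int[] span = enclose(Arrays.asList(new int[] {4, 6}, new int[] {0, 2}));
        // prints "0-6"
        System.out.println(span[0] + "-" + span[1]);
    }
}
```

Because the bounds are taken over all arguments, the span stays valid even when the second argument appears before the first in the sentence.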

Example 10 with RelationMention

use of edu.stanford.nlp.ie.machinereading.structure.RelationMention in project CoreNLP by stanfordnlp.

the class RothCONLL04Reader method readSentence.

private Annotation readSentence(String docId, Iterator<String> lineIterator) {
    Annotation sentence = new Annotation("");
    sentence.set(CoreAnnotations.DocIDAnnotation.class, docId);
    sentence.set(MachineReadingAnnotations.EntityMentionsAnnotation.class, new ArrayList<>());
    // we'll need to set things like the tokens and textContent after we've
    // fully read the sentence
    // contains the full text that we've read so far
    StringBuilder textContent = new StringBuilder();
    // how many tokens we've seen so far
    int tokenCount = 0;
    List<CoreLabel> tokens = new ArrayList<>();
    // when we've seen two blank lines in a row, this sentence is over (one
    // blank line separates the sentence and the relations)
    int numBlankLinesSeen = 0;
    String sentenceID = null;
    // keeps track of entities we've seen so far for use by relations
    Map<String, EntityMention> indexToEntityMention = new HashMap<>();
    while (lineIterator.hasNext() && numBlankLinesSeen < 2) {
        String currentLine = lineIterator.next();
        currentLine = currentLine.replace("COMMA", ",");
        List<String> pieces = StringUtils.split(currentLine);
        String identifier;
        int size = pieces.size();
        switch(size) {
            case 1:
                // blank line between sentences or relations
                numBlankLinesSeen++;
                break;
            case 3:
                // relation
                String type = pieces.get(2);
                List<ExtractionObject> args = new ArrayList<>();
                EntityMention entity1 = indexToEntityMention.get(pieces.get(0));
                EntityMention entity2 = indexToEntityMention.get(pieces.get(1));
                if (entity1 == null || entity2 == null) {
                    throw new NullPointerException("Error: a relation was marked between two words where one of the words was not a named entity.  Line causing this error: '" + currentLine + "'");
                }
                args.add(entity1);
                args.add(entity2);
                Span span = new Span(entity1.getExtentTokenStart(), entity2.getExtentTokenEnd());
                // identifier = "relation" + sentenceID + "-" + sentence.getAllRelations().size();
                identifier = RelationMention.makeUniqueId();
                RelationMention relationMention = new RelationMention(identifier, sentence, span, type, null, args);
                AnnotationUtils.addRelationMention(sentence, relationMention);
                break;
            case 9:
                // token
                /*
                 * Roth token lines look like this:
                 *
                 * 19 Peop 9 O NNP/NNP Jamal/Ghosheh O O O
                 */
                // Entities may be multiple words joined by '/'; we split these up
                List<String> words = StringUtils.split(pieces.get(5), "/");
                // List<String> postags = StringUtils.split(pieces.get(4),"/");
                String text = StringUtils.join(words, " ");
                identifier = "entity" + pieces.get(0) + '-' + pieces.get(2);
                // entity type of the word/expression
                String nerTag = getNormalizedNERTag(pieces.get(1));
                if (sentenceID == null)
                    sentenceID = pieces.get(0);
                if (!nerTag.equals("O")) {
                    Span extentSpan = new Span(tokenCount, tokenCount + words.size());
                    // Temporarily sets the head span to equal the extent span.
                    // This is so the entity has a head (in particular, getValue() works) even if preprocessSentences isn't called.
                    // The head span is later modified if preprocessSentences is called.
                    EntityMention entity = new EntityMention(identifier, sentence, extentSpan, extentSpan, nerTag, null, null);
                    AnnotationUtils.addEntityMention(sentence, entity);
                    // we can get by using these indices as strings since we only use them
                    // as a hash key
                    String index = pieces.get(2);
                    indexToEntityMention.put(index, entity);
                }
                // int i =0;
                for (String word : words) {
                    CoreLabel label = new CoreLabel();
                    label.setWord(word);
                    // label.setTag(postags.get(i));
                    label.set(CoreAnnotations.TextAnnotation.class, word);
                    label.set(CoreAnnotations.ValueAnnotation.class, word);
                    // we don't set TokenBeginAnnotation or TokenEndAnnotation since we're
                    // not keeping track of character offsets
                    tokens.add(label);
                    // i++;
                }
                textContent.append(text);
                textContent.append(' ');
                tokenCount += words.size();
                break;
        }
    }
    sentence.set(CoreAnnotations.TextAnnotation.class, textContent.toString());
    sentence.set(CoreAnnotations.ValueAnnotation.class, textContent.toString());
    sentence.set(CoreAnnotations.TokensAnnotation.class, tokens);
    sentence.set(CoreAnnotations.SentenceIDAnnotation.class, sentenceID);
    return sentence;
}
Also used : RelationMention(edu.stanford.nlp.ie.machinereading.structure.RelationMention) HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) Span(edu.stanford.nlp.ie.machinereading.structure.Span) Annotation(edu.stanford.nlp.pipeline.Annotation) MachineReadingAnnotations(edu.stanford.nlp.ie.machinereading.structure.MachineReadingAnnotations) CoreLabel(edu.stanford.nlp.ling.CoreLabel) EntityMention(edu.stanford.nlp.ie.machinereading.structure.EntityMention) ExtractionObject(edu.stanford.nlp.ie.machinereading.structure.ExtractionObject) CoreAnnotations(edu.stanford.nlp.ling.CoreAnnotations)
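
The token branch of readSentence reads nine whitespace-separated columns, splits the word column (index 5) on '/', and builds the entity identifier from columns 0 and 2. The following sketch reproduces just that parsing with the standard library. It uses `String.split` rather than CoreNLP's `StringUtils`, and the class `RothTokenLineSketch` with its methods `words` and `entityId` is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class RothTokenLineSketch {
    // Splits a line on whitespace and checks that it has the nine columns of a token line.
    private static String[] columns(String line) {
        String[] pieces = line.trim().split("\\s+");
        if (pieces.length != 9) {
            throw new IllegalArgumentException("not a token line: " + line);
        }
        return pieces;
    }

    // Multi-word entities are joined by '/' in column 5; split them into single words.
    static List<String> words(String line) {
        return Arrays.asList(columns(line)[5].split("/"));
    }

    // Same identifier scheme as the snippet: "entity" + column 0 + '-' + column 2.
    static String entityId(String line) {
        String[] pieces = columns(line);
        return "entity" + pieces[0] + '-' + pieces[2];
    }

    public static void main(String[] args) {
        String line = "19 Peop 9 O NNP/NNP Jamal/Ghosheh O O O";
        System.out.println(words(line));    // prints "[Jamal, Ghosheh]"
        System.out.println(entityId(line)); // prints "entity19-9"
    }
}
```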
