Example 16 with Sentence

Use of com.graphaware.nlp.domain.Sentence in project neo4j-nlp by graphaware.

From the class AnnotatedTextPersistenceTest, method createAnnotatedTextFor.

private AnnotatedText createAnnotatedTextFor(String text, String expectedTokenForPOS, String expectedPOS) {
    AnnotatedText annotatedText = new AnnotatedText();
    annotatedText.setText(text);
    AtomicInteger inc = new AtomicInteger();
    for (String s : text.split("\\.")) {
        Sentence sentence = new Sentence(s, inc.get());
        for (String token : s.split(" ")) {
            Tag tag = new Tag(token, "en");
            if (token.equals(expectedTokenForPOS)) {
                tag.setPos(Collections.singletonList(expectedPOS));
            }
            // Offsets 0 and 20 are fixed placeholders: this helper does not track real token positions.
            sentence.addTagOccurrence(0, 20, token, sentence.addTag(tag));
        }
        inc.incrementAndGet();
        annotatedText.addSentence(sentence);
    }
    return annotatedText;
}
Also used: AnnotatedText (com.graphaware.nlp.domain.AnnotatedText), AtomicInteger (java.util.concurrent.atomic.AtomicInteger), Tag (com.graphaware.nlp.domain.Tag), Sentence (com.graphaware.nlp.domain.Sentence)

Example 17 with Sentence

Use of com.graphaware.nlp.domain.Sentence in project neo4j-nlp by graphaware.

From the class AnnotatedTextTest, method testFilter.

@Test
public void testFilter() {
    AnnotatedText annotatedText = new AnnotatedText();
    Sentence sentence = new Sentence(SHORT_TEXT_1, 0);
    sentence.addTag(getTag("BBC", null));
    sentence.addTag(getTag("China", "LOCATION"));
    annotatedText.addSentence(sentence);
    assertTrue(annotatedText.filter("BBC"));
    assertTrue(annotatedText.filter("China/LOCATION"));
}
Also used: AnnotatedText (com.graphaware.nlp.domain.AnnotatedText), Sentence (com.graphaware.nlp.domain.Sentence), Test (org.junit.Test)

Example 18 with Sentence

Use of com.graphaware.nlp.domain.Sentence in project neo4j-nlp by graphaware.

From the class StubTextProcessor, method annotateText.

@Override
public AnnotatedText annotateText(String text, PipelineSpecification pipelineSpecification) {
    this.lastPipelineUsed = pipelineSpecification.getName();
    AnnotatedText annotatedText = new AnnotatedText();
    String[] sentencesSplit = text.split("\\.");
    int sentenceNumber = 0;
    for (String stext : sentencesSplit) {
        String[] parts = stext.split(" ");
        int pos = 0;
        final Sentence sentence = new Sentence(stext, sentenceNumber);
        for (String token : parts) {
            Tag tag = new Tag(token, pipelineSpecification.getLanguage());
            if (!pipelineSpecification.getExcludedNER().contains("test")) {
                tag.setNe(Collections.singletonList("test"));
            }
            tag.setPos(Collections.singletonList("TESTVB"));
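            int begin = pos;
            // Advance past the token and one separator character, so the end offset below is begin + token.length() + 1.
            pos += token.length() + 1;
            sentence.addTagOccurrence(begin, pos, token, sentence.addTag(tag));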
        }
        if (pipelineSpecification.hasProcessingStep("phrase")) {
            Phrase phrase = new Phrase(stext);
            sentence.addPhraseOccurrence(0, stext.length(), phrase);
        }
        annotatedText.addSentence(sentence);
        sentenceNumber++;
    }
    return annotatedText;
}
Also used: AnnotatedText (com.graphaware.nlp.domain.AnnotatedText), Tag (com.graphaware.nlp.domain.Tag), Phrase (com.graphaware.nlp.domain.Phrase), Sentence (com.graphaware.nlp.domain.Sentence)
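
The offset bookkeeping in annotateText can be tried without the neo4j-nlp classes. The following is a minimal sketch, not code from the project: StubTokenizer and Token are hypothetical names introduced here for illustration. It splits on periods and single spaces as the stub does, but, as a deliberate variation, it records an exclusive end offset that stops at the token's last character and skips the empty token that String.split produces when a sentence starts with a space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the stub processor's splitting logic; not part of neo4j-nlp.
final class StubTokenizer {

    // Hypothetical holder for one token occurrence within a sentence.
    static final class Token {
        final int sentenceNumber;
        final int begin;   // inclusive offset, relative to the sentence text
        final int end;     // exclusive offset, relative to the sentence text
        final String value;

        Token(int sentenceNumber, int begin, int end, String value) {
            this.sentenceNumber = sentenceNumber;
            this.begin = begin;
            this.end = end;
            this.value = value;
        }
    }

    // Splits text into sentences on '.', then into tokens on single spaces.
    static List<Token> tokenize(String text) {
        List<Token> tokens = new ArrayList<>();
        String[] sentences = text.split("\\.");
        for (int n = 0; n < sentences.length; n++) {
            int pos = 0;
            for (String part : sentences[n].split(" ")) {
                int begin = pos;
                pos += part.length() + 1; // move past the token and one separator
                if (part.isEmpty()) {
                    continue; // a leading space yields an empty first element
                }
                tokens.add(new Token(n, begin, begin + part.length(), part));
            }
        }
        return tokens;
    }
}
```

Calling StubTokenizer.tokenize("Hello world. Bye now") yields four tokens. The second sentence is " Bye now" with its leading space kept, so "Bye" starts at offset 1 within that sentence rather than 0.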

Aggregations

Sentence (com.graphaware.nlp.domain.Sentence): 18
AnnotatedText (com.graphaware.nlp.domain.AnnotatedText): 14
Tag (com.graphaware.nlp.domain.Tag): 9
Test (org.junit.Test): 9
TestAnnotatedText (com.graphaware.nlp.util.TestAnnotatedText): 7
TagUtils.newTag (com.graphaware.nlp.util.TagUtils.newTag): 5
AtomicInteger (java.util.concurrent.atomic.AtomicInteger): 2
Phrase (com.graphaware.nlp.domain.Phrase): 1
AtomicReference (java.util.concurrent.atomic.AtomicReference): 1
Node (org.neo4j.graphdb.Node): 1