Search in sources :

Example 1 with NewlineEndOfSentenceMarker

use of com.joliciel.talismane.rawText.NewlineEndOfSentenceMarker in project talismane by joliciel-informatique.

the class NewlineEndOfSentenceMarkerTest method testApply.

@Test
public void testApply() throws Exception {
    NewlineEndOfSentenceMarker filter = new NewlineEndOfSentenceMarker(1000);
    AnnotatedText text = new AnnotatedText("1\r\n2\r\n");
    filter.annotate(text);
    LOG.debug(text.getAnnotations().toString());
    List<Annotation<RawTextSentenceBreakMarker>> sentenceBreaks = text.getAnnotations(RawTextSentenceBreakMarker.class);
    assertEquals(2, sentenceBreaks.size());
    List<Annotation<RawTextSkipMarker>> skips = text.getAnnotations(RawTextSkipMarker.class);
    assertEquals(2, skips.size());
    assertEquals(1, sentenceBreaks.get(0).getStart());
    assertEquals(3, sentenceBreaks.get(0).getEnd());
    assertEquals(1, skips.get(0).getStart());
    assertEquals(3, skips.get(0).getEnd());
    assertEquals(4, sentenceBreaks.get(1).getStart());
    assertEquals(6, sentenceBreaks.get(1).getEnd());
    assertEquals(4, skips.get(1).getStart());
    assertEquals(6, skips.get(1).getEnd());
    text = new AnnotatedText("1\r2\r");
    filter.annotate(text);
    LOG.debug(text.getAnnotations().toString());
    sentenceBreaks = text.getAnnotations(RawTextSentenceBreakMarker.class);
    assertEquals(2, sentenceBreaks.size());
    skips = text.getAnnotations(RawTextSkipMarker.class);
    assertEquals(2, skips.size());
    assertEquals(1, sentenceBreaks.get(0).getStart());
    assertEquals(2, sentenceBreaks.get(0).getEnd());
    assertEquals(1, skips.get(0).getStart());
    assertEquals(2, skips.get(0).getEnd());
    assertEquals(3, sentenceBreaks.get(1).getStart());
    assertEquals(4, sentenceBreaks.get(1).getEnd());
    assertEquals(3, skips.get(1).getStart());
    assertEquals(4, skips.get(1).getEnd());
    text = new AnnotatedText("1\r2\r");
    filter.annotate(text);
    LOG.debug(text.getAnnotations().toString());
    sentenceBreaks = text.getAnnotations(RawTextSentenceBreakMarker.class);
    assertEquals(2, sentenceBreaks.size());
    skips = text.getAnnotations(RawTextSkipMarker.class);
    assertEquals(2, skips.size());
    assertEquals(1, sentenceBreaks.get(0).getStart());
    assertEquals(2, sentenceBreaks.get(0).getEnd());
    assertEquals(1, skips.get(0).getStart());
    assertEquals(2, skips.get(0).getEnd());
    assertEquals(3, sentenceBreaks.get(1).getStart());
    assertEquals(4, sentenceBreaks.get(1).getEnd());
    assertEquals(3, skips.get(1).getStart());
    assertEquals(4, skips.get(1).getEnd());
}
Also used : NewlineEndOfSentenceMarker(com.joliciel.talismane.rawText.NewlineEndOfSentenceMarker) AnnotatedText(com.joliciel.talismane.AnnotatedText) RawTextSkipMarker(com.joliciel.talismane.rawText.RawTextMarker.RawTextSkipMarker) Annotation(com.joliciel.talismane.Annotation) RawTextSentenceBreakMarker(com.joliciel.talismane.rawText.RawTextMarker.RawTextSentenceBreakMarker) TalismaneTest(com.joliciel.talismane.TalismaneTest) Test(org.junit.Test)

Aggregations

AnnotatedText (com.joliciel.talismane.AnnotatedText)1 Annotation (com.joliciel.talismane.Annotation)1 TalismaneTest (com.joliciel.talismane.TalismaneTest)1 NewlineEndOfSentenceMarker (com.joliciel.talismane.rawText.NewlineEndOfSentenceMarker)1 RawTextSentenceBreakMarker (com.joliciel.talismane.rawText.RawTextMarker.RawTextSentenceBreakMarker)1 RawTextSkipMarker (com.joliciel.talismane.rawText.RawTextMarker.RawTextSkipMarker)1 Test (org.junit.Test)1