Example 51 with BreakIterator

Use of java.text.BreakIterator in project lucene-solr by apache.

Class TestWholeBreakIterator, method testSliceStart.

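public void testSliceStart() throws Exception {
    // Reference behavior: the JDK sentence iterator; under test: WholeBreakIterator.
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new WholeBreakIterator();
    // Judging by the test name, the two ints are the slice offset and length,
    // so the leading "000" lies outside the slice being compared.
    assertSameBreaks("000a", 3, 1, expected, actual);
    assertSameBreaks("000ab", 3, 2, expected, actual);
    assertSameBreaks("000abc", 3, 3, expected, actual);
    // Empty slice positioned at the end of the text.
    assertSameBreaks("000", 3, 0, expected, actual);
}
Also used: BreakIterator (java.text.BreakIterator)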
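
The slice arguments in the test above can be tried with the JDK alone. The following is a minimal sketch, assuming the slice is passed to the iterator as a StringCharacterIterator restricted to [offset, offset + length). The names SliceBreaks and boundaries are hypothetical and are not part of lucene-solr, and WholeBreakIterator is not used here; only the JDK sentence iterator is exercised.

```java
import java.text.BreakIterator;
import java.text.StringCharacterIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SliceBreaks {
    // Hypothetical helper: collect every boundary the iterator reports
    // for the slice text[offset, offset + length).
    static List<Integer> boundaries(BreakIterator bi, String text, int offset, int length) {
        // The CharacterIterator limits the iterator's view to the slice.
        bi.setText(new StringCharacterIterator(text, offset, offset + length, offset));
        List<Integer> out = new ArrayList<>();
        for (int b = bi.first(); b != BreakIterator.DONE; b = bi.next()) {
            out.add(b);
        }
        return out;
    }

    public static void main(String[] args) {
        BreakIterator sentence = BreakIterator.getSentenceInstance(Locale.ROOT);
        // Boundaries are reported as indices into the full text, not the slice.
        System.out.println(boundaries(sentence, "000abc", 3, 3));
        System.out.println(boundaries(sentence, "000", 3, 0));
    }
}
```

For a slice holding a single unterminated sentence, the only boundaries are the slice's start and end, which is the behavior the test expects WholeBreakIterator to match.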

Example 52 with BreakIterator

Use of java.text.BreakIterator in project lucene-solr by apache.

Class TestWholeBreakIterator, method testSliceEnd.

public void testSliceEnd() throws Exception {
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new WholeBreakIterator();
    // Here the slice starts at offset 0, so the trailing "000" lies outside it.
    assertSameBreaks("a000", 0, 1, expected, actual);
    assertSameBreaks("ab000", 0, 1, expected, actual);
    assertSameBreaks("abc000", 0, 1, expected, actual);
    // Empty slice positioned at the start of the text.
    assertSameBreaks("000", 0, 0, expected, actual);
}
Also used: BreakIterator (java.text.BreakIterator)

Example 53 with BreakIterator

Use of java.text.BreakIterator in project lucene-solr by apache.

Class TestWholeBreakIterator, method testSliceMiddle.

public void testSliceMiddle() throws Exception {
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new WholeBreakIterator();
    // Here the slice sits between a leading and a trailing "000".
    assertSameBreaks("000a000", 3, 1, expected, actual);
    assertSameBreaks("000ab000", 3, 2, expected, actual);
    assertSameBreaks("000abc000", 3, 3, expected, actual);
    // Empty slice positioned in the middle of the text.
    assertSameBreaks("000000", 3, 0, expected, actual);
}
Also used: BreakIterator (java.text.BreakIterator)

Example 54 with BreakIterator

Use of java.text.BreakIterator in project lucene-solr by apache.

Class TestWholeBreakIterator, method testSingleSentences.

/** For single sentences, we know WholeBreakIterator should break the same as a sentence iterator */
public void testSingleSentences() throws Exception {
    BreakIterator expected = BreakIterator.getSentenceInstance(Locale.ROOT);
    BreakIterator actual = new WholeBreakIterator();
    assertSameBreaks("a", expected, actual);
    assertSameBreaks("ab", expected, actual);
    assertSameBreaks("abc", expected, actual);
    assertSameBreaks("", expected, actual);
}
Also used: BreakIterator (java.text.BreakIterator)

Example 55 with BreakIterator

Use of java.text.BreakIterator in project lucene-solr by apache.

Class TestCharArrayIterator, method testConsumeWordInstance.

public void testConsumeWordInstance() {
    // we use the default locale, as it's randomized by LuceneTestCase
    BreakIterator bi = BreakIterator.getWordInstance(Locale.getDefault());
    CharArrayIterator ci = CharArrayIterator.newWordInstance();
    for (int i = 0; i < 10000; i++) {
        char[] text = TestUtil.randomUnicodeString(random()).toCharArray();
        ci.setText(text, 0, text.length);
        consume(bi, ci);
    }
}
Also used: BreakIterator (java.text.BreakIterator)
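
The loop above passes each random text to a consume helper that the snippet does not show. The following is a minimal JDK-only sketch of what such a helper might do, assuming it simply attaches the text and walks every boundary until the iterator is exhausted. WordConsume and countBoundaries are hypothetical names, StringCharacterIterator stands in for Lucene's CharArrayIterator, and the count is returned only to make the walk observable.

```java
import java.text.BreakIterator;
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;
import java.util.Locale;

final class WordConsume {
    // Hypothetical helper: attach the text, then advance until DONE,
    // counting every boundary reported after the starting position.
    static int countBoundaries(BreakIterator bi, CharacterIterator ci) {
        bi.setText(ci);
        int count = 0;
        while (bi.next() != BreakIterator.DONE) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Locale.ROOT keeps this sketch deterministic; the test above
        // deliberately uses the (randomized) default locale instead.
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        System.out.println(countBoundaries(words, new StringCharacterIterator("ab cd")));
    }
}
```

The same BreakIterator instance can be reused across texts, as in the test, because setText resets its position each time.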

Aggregations

BreakIterator (java.text.BreakIterator) 59
ArrayList (java.util.ArrayList) 10
Locale (java.util.Locale) 6
IntPair (edu.illinois.cs.cogcomp.core.datastructures.IntPair) 3
BytesRef (org.apache.lucene.util.BytesRef) 3
Snippet (org.apache.lucene.search.highlight.Snippet) 2
Intent (android.content.Intent) 1
TagElement (com.google.devtools.j2objc.ast.TagElement) 1
Pair (edu.illinois.cs.cogcomp.core.datastructures.Pair) 1
TextAnnotation (edu.illinois.cs.cogcomp.core.datastructures.textannotation.TextAnnotation) 1
IOException (java.io.IOException) 1
Iterator (java.util.Iterator) 1
PriorityQueue (java.util.PriorityQueue) 1
JComponent (javax.swing.JComponent) 1
Text (org.apache.hadoop.io.Text) 1
Analyzer (org.apache.lucene.analysis.Analyzer) 1
IndexSearcher (org.apache.lucene.search.IndexSearcher) 1
Encoder (org.apache.lucene.search.highlight.Encoder) 1
CustomSeparatorBreakIterator (org.apache.lucene.search.postingshighlight.CustomSeparatorBreakIterator) 1
CustomPassageFormatter (org.apache.lucene.search.uhighlight.CustomPassageFormatter) 1