Example 1 with JFlexSymbolMatcher

Use of org.opensolaris.opengrok.analysis.JFlexSymbolMatcher in project OpenGrok by OpenGrok.

From the class PerlSymbolTokenizerTest, method testOffsetAttribute.

/**
 * Helper method for {@link #testOffsetAttribute()} that runs the test on
 * one single implementation class with the specified input text and
 * expected tokens.
 */
private void testOffsetAttribute(Class<? extends JFlexSymbolMatcher> klass, String inputText, String[] expectedTokens) throws Exception {
    JFlexSymbolMatcher matcher = klass.getConstructor(Reader.class).newInstance(new StringReader(inputText));
    JFlexTokenizer tokenizer = new JFlexTokenizer(matcher);
    CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
    OffsetAttribute offset = tokenizer.addAttribute(OffsetAttribute.class);
    int count = 0;
    while (tokenizer.incrementToken()) {
        assertTrue("too many tokens", count < expectedTokens.length);
        String expected = expectedTokens[count];
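        assertEquals("term" + count, expected, term.toString());
        // Offsets are 0-based character positions. The expected start is the
        // first occurrence of the token in inputText, so an input that repeats
        // a token would need a different lookup.
        int expectedStart = inputText.indexOf(expected);
        assertEquals("start" + count, expectedStart, offset.startOffset());
        assertEquals("end" + count, expectedStart + expected.length(), offset.endOffset());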
        count++;
    }
    assertEquals("wrong number of tokens", expectedTokens.length, count);
}
Also used: JFlexTokenizer (org.opensolaris.opengrok.analysis.JFlexTokenizer), CharTermAttribute (org.apache.lucene.analysis.tokenattributes.CharTermAttribute), JFlexSymbolMatcher (org.opensolaris.opengrok.analysis.JFlexSymbolMatcher), StringReader (java.io.StringReader), OffsetAttribute (org.apache.lucene.analysis.tokenattributes.OffsetAttribute), Reader (java.io.Reader), InputStreamReader (java.io.InputStreamReader), BufferedReader (java.io.BufferedReader)
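
The offset check in the helper above can be tried without OpenGrok or Lucene on the classpath. The following is only a minimal sketch under stated assumptions: the class OffsetCheckSketch, its Token type, and the regex-based tokenize method are hypothetical stand-ins for JFlexTokenizer and are not part of OpenGrok. It reproduces the same expectation the helper encodes: the start offset equals inputText.indexOf(expected) and the end offset equals the start plus the token length.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from OpenGrok.
class OffsetCheckSketch {

    /** A token with its text and its [start, end) character offsets in the input. */
    static final class Token {
        final String text;
        final int start;
        final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Stand-in tokenizer (an assumption): emits each run of word characters. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("\\w+").matcher(input);
        while (m.find()) {
            tokens.add(new Token(m.group(), m.start(), m.end()));
        }
        return tokens;
    }

    /**
     * Mirrors the helper's assertions: same token count, same term text, and
     * offsets equal to the first occurrence of the expected token in the input.
     */
    static boolean offsetsMatch(String input, String[] expectedTokens) {
        List<Token> tokens = tokenize(input);
        if (tokens.size() != expectedTokens.length) {
            return false;
        }
        for (int i = 0; i < expectedTokens.length; i++) {
            String expected = expectedTokens[i];
            int expectedStart = input.indexOf(expected);
            int expectedEnd = expectedStart + expected.length();
            Token actual = tokens.get(i);
            if (!actual.text.equals(expected)
                    || actual.start != expectedStart
                    || actual.end != expectedEnd) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Each token occurs once, so the indexOf-based expectation holds.
        System.out.println(offsetsMatch("my $foo = $bar;", new String[] {"my", "foo", "bar"}));
        // "a" occurs twice; indexOf always finds the first one, so the check fails.
        System.out.println(offsetsMatch("$a = $a;", new String[] {"a", "a"}));
    }
}
```

The second call shows the limit of an indexOf-based expectation: it only works for inputs in which every expected token's first occurrence is the one the tokenizer emits.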

Aggregations

BufferedReader (java.io.BufferedReader): 1
InputStreamReader (java.io.InputStreamReader): 1
Reader (java.io.Reader): 1
StringReader (java.io.StringReader): 1
CharTermAttribute (org.apache.lucene.analysis.tokenattributes.CharTermAttribute): 1
OffsetAttribute (org.apache.lucene.analysis.tokenattributes.OffsetAttribute): 1
JFlexSymbolMatcher (org.opensolaris.opengrok.analysis.JFlexSymbolMatcher): 1
JFlexTokenizer (org.opensolaris.opengrok.analysis.JFlexTokenizer): 1