Example 1 with JFlexSymbolMatcher

Use of org.opengrok.indexer.analysis.JFlexSymbolMatcher in project OpenGrok by OpenGrok.

From the class PerlSymbolTokenizerTest, method testOffsetAttribute.

/**
 * Helper method for {@link #testOffsetAttribute()} that runs the test on
 * one single implementation class with the specified input text and
 * expected tokens.
 */
private void testOffsetAttribute(Class<? extends JFlexSymbolMatcher> klass, String inputText, String[] expectedTokens) throws Exception {
    JFlexSymbolMatcher matcher = klass.getConstructor(Reader.class).newInstance(new StringReader(inputText));
    JFlexTokenizer tokenizer = new JFlexTokenizer(matcher);
    CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
    OffsetAttribute offset = tokenizer.addAttribute(OffsetAttribute.class);
    int count = 0;
    while (tokenizer.incrementToken()) {
        assertTrue(count < expectedTokens.length, "too many tokens");
        String expected = expectedTokens[count];
        // 0-based offset to accord with String[]
        assertEquals(expected, term.toString(), "term" + count);
        assertEquals(inputText.indexOf(expected), offset.startOffset(), "start" + count);
        assertEquals(inputText.indexOf(expected) + expected.length(), offset.endOffset(), "end" + count);
        count++;
    }
    assertEquals(expectedTokens.length, count, "wrong number of tokens");
}
Also used: JFlexTokenizer (org.opengrok.indexer.analysis.JFlexTokenizer), CharTermAttribute (org.apache.lucene.analysis.tokenattributes.CharTermAttribute), JFlexSymbolMatcher (org.opengrok.indexer.analysis.JFlexSymbolMatcher), OffsetAttribute (org.apache.lucene.analysis.tokenattributes.OffsetAttribute), Reader (java.io.Reader), StringReader (java.io.StringReader)
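
The helper above derives each expected offset from inputText.indexOf(expected), which always reports the first occurrence. The check therefore appears to rely on every expected token occurring only once in the input, or on only its first occurrence being tested. The following is a minimal, stdlib-only sketch of how expected offsets could be computed when a token repeats, by searching forward from the end of the previous match. The class ExpectedOffsets, its compute method, and the sample input are hypothetical illustrations and are not part of OpenGrok.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from OpenGrok): computes expected {start, end}
// offsets for a sequence of tokens, tolerating repeated tokens.
public class ExpectedOffsets {

    /**
     * Returns one {start, end} pair per expected token. Each search begins
     * where the previous token ended, so a repeated token maps to its next
     * occurrence rather than to the first one.
     */
    static List<int[]> compute(String inputText, String[] expectedTokens) {
        List<int[]> spans = new ArrayList<>();
        int cursor = 0;
        for (String expected : expectedTokens) {
            int start = inputText.indexOf(expected, cursor);
            if (start < 0) {
                throw new IllegalArgumentException(
                        "token not found at or after offset " + cursor + ": " + expected);
            }
            int end = start + expected.length();
            spans.add(new int[] {start, end});
            cursor = end;
        }
        return spans;
    }

    public static void main(String[] args) {
        // Assumed sample input; "foo" appears twice on purpose.
        String input = "$foo = $bar + $foo;";
        String[] tokens = {"foo", "bar", "foo"};
        for (int[] span : compute(input, tokens)) {
            System.out.println(span[0] + "-" + span[1]);
        }
        // prints 1-4, 8-11 and 15-18 on separate lines
    }
}
```

With plain indexOf(expected), the second "foo" in this sample would resolve to offset 1 instead of 15, so a test comparing against the tokenizer's real offsets would fail on the repeated token.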
