
Example 6 with TokenizerChain

Use of org.apache.jackrabbit.oak.plugins.index.lucene.util.TokenizerChain in the project jackrabbit-oak by apache.

From the class NodeStateAnalyzerFactoryTest, method analyzerByComposition_CharFilter.

@Test
public void analyzerByComposition_CharFilter() throws Exception {
    NodeBuilder nb = EMPTY_NODE.builder();
    nb.child(ANL_TOKENIZER).setProperty(ANL_NAME, "whitespace");
    NodeBuilder filters = nb.child(ANL_CHAR_FILTERS);
    filters.setProperty(OAK_CHILD_ORDER, ImmutableList.of("htmlStrip", "mapping"), NAMES);
    filters.child("mapping").setProperty(ANL_NAME, "mapping");
    // ANL_NAME is optional; when it is not set, the name is derived from the node name
    filters.child("htmlStrip");
    TokenizerChain analyzer = (TokenizerChain) factory.createInstance(nb.getNodeState());
    assertEquals(2, analyzer.getCharFilters().length);
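    // the char filters must follow OAK_CHILD_ORDER (htmlStrip, then mapping),
    // not the order in which the child nodes were created above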
    assertEquals(HTMLStripCharFilterFactory.class.getName(), analyzer.getCharFilters()[0].getClassArg());
    assertEquals(MappingCharFilterFactory.class.getName(), analyzer.getCharFilters()[1].getClassArg());
}
Also used:
TokenizerChain (org.apache.jackrabbit.oak.plugins.index.lucene.util.TokenizerChain)
HTMLStripCharFilterFactory (org.apache.lucene.analysis.charfilter.HTMLStripCharFilterFactory)
MappingCharFilterFactory (org.apache.lucene.analysis.charfilter.MappingCharFilterFactory)
NodeBuilder (org.apache.jackrabbit.oak.spi.state.NodeBuilder)
Test (org.junit.Test)
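
The test above relies on two rules: char filters are taken in the order given by OAK_CHILD_ORDER, and a child node without ANL_NAME falls back to its node name. The following is a minimal, self-contained sketch of those two rules using plain Java collections. It does not use the Oak or Lucene APIs; the class CharFilterOrderSketch and the method resolveFilterNames are hypothetical names introduced only for illustration, and the handling of order entries that have no matching child is an assumption, not something the test confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not how Oak implements its analyzer factory.
final class CharFilterOrderSketch {

    /**
     * Resolves filter names in the explicitly declared child order.
     *
     * @param children   node name mapped to its explicit name property, or null when unset
     * @param childOrder the declared order of child node names
     */
    static List<String> resolveFilterNames(Map<String, String> children, List<String> childOrder) {
        List<String> resolved = new ArrayList<>();
        for (String nodeName : childOrder) {
            if (!children.containsKey(nodeName)) {
                continue; // assumption: entries without a matching child are skipped
            }
            String explicitName = children.get(nodeName);
            // fall back to the node name when no explicit name was set
            resolved.add(explicitName != null ? explicitName : nodeName);
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> children = new LinkedHashMap<>();
        children.put("mapping", "mapping"); // created first, with an explicit name
        children.put("htmlStrip", null);    // created second, without a name
        List<String> order = Arrays.asList("htmlStrip", "mapping");
        System.out.println(resolveFilterNames(children, order)); // prints [htmlStrip, mapping]
    }
}
```

As in the test, the child created second ends up first, because the declared order wins over creation order.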

Aggregations

TokenizerChain (org.apache.jackrabbit.oak.plugins.index.lucene.util.TokenizerChain): 6
NodeBuilder (org.apache.jackrabbit.oak.spi.state.NodeBuilder): 4
Test (org.junit.Test): 4
StopFilterFactory (org.apache.lucene.analysis.core.StopFilterFactory): 2
PathHierarchyTokenizerFactory (org.apache.lucene.analysis.path.PathHierarchyTokenizerFactory): 2
Analyzer (org.apache.lucene.analysis.Analyzer): 1
HTMLStripCharFilterFactory (org.apache.lucene.analysis.charfilter.HTMLStripCharFilterFactory): 1
MappingCharFilterFactory (org.apache.lucene.analysis.charfilter.MappingCharFilterFactory): 1
LowerCaseFilterFactory (org.apache.lucene.analysis.core.LowerCaseFilterFactory): 1
WhitespaceTokenizerFactory (org.apache.lucene.analysis.core.WhitespaceTokenizerFactory): 1
LimitTokenCountAnalyzer (org.apache.lucene.analysis.miscellaneous.LimitTokenCountAnalyzer): 1
PerFieldAnalyzerWrapper (org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper): 1
CharFilterFactory (org.apache.lucene.analysis.util.CharFilterFactory): 1
TokenFilterFactory (org.apache.lucene.analysis.util.TokenFilterFactory): 1
TokenizerFactory (org.apache.lucene.analysis.util.TokenizerFactory): 1