
Example 6 with Automaton

use of org.apache.lucene.util.automaton.Automaton in project lucene-solr by apache.

the class TestGraphTokenizers method testSynOverMultipleHoles.

public void testSynOverMultipleHoles() throws Exception {
    final TokenStream ts = new CannedTokenStream(new Token[] { token("a", 1, 1), token("x", 0, 3), token("b", 3, 1) });
    final Automaton a1 = join(s2a("a"), SEP_A, HOLE_A, SEP_A, HOLE_A, SEP_A, s2a("b"));
    final Automaton a2 = join(s2a("x"), SEP_A, s2a("b"));
    assertSameLanguage(Operations.union(a1, a2), ts);
}
Also used : Automaton(org.apache.lucene.util.automaton.Automaton)

Example 7 with Automaton

use of org.apache.lucene.util.automaton.Automaton in project lucene-solr by apache.

the class TestGraphTokenizers method testSynOverHole.

public void testSynOverHole() throws Exception {
    final TokenStream ts = new CannedTokenStream(new Token[] { token("a", 1, 1), token("X", 0, 2), token("b", 2, 1) });
    final Automaton a1 = Operations.union(join(s2a("a"), SEP_A, HOLE_A), s2a("X"));
    final Automaton expected = Operations.concatenate(a1, join(SEP_A, s2a("b")));
    assertSameLanguage(expected, ts);
}
Also used : Automaton(org.apache.lucene.util.automaton.Automaton)

Example 8 with Automaton

use of org.apache.lucene.util.automaton.Automaton in project lucene-solr by apache.

the class PrefixQuery method toAutomaton.

/** Build an automaton accepting all terms with the specified prefix. */
public static Automaton toAutomaton(BytesRef prefix) {
    final int numStatesAndTransitions = prefix.length + 1;
    final Automaton automaton = new Automaton(numStatesAndTransitions, numStatesAndTransitions);
    int lastState = automaton.createState();
    for (int i = 0; i < prefix.length; i++) {
        int state = automaton.createState();
        automaton.addTransition(lastState, state, prefix.bytes[prefix.offset + i] & 0xff);
        lastState = state;
    }
    automaton.setAccept(lastState, true);
    automaton.addTransition(lastState, lastState, 0, 255);
    automaton.finishState();
    assert automaton.isDeterministic();
    return automaton;
}
Also used : Automaton(org.apache.lucene.util.automaton.Automaton)
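The shape of the automaton that toAutomaton builds (one state per prefix byte, then an accepting state that loops on every byte value 0..255) can be sketched without Lucene. The class below is a hypothetical, standard-library-only stand-in written for illustration; PrefixDfaSketch and its accepts method are assumptions of this sketch, not part of the Lucene API.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the prefix automaton above; not a Lucene class. */
public class PrefixDfaSketch {
    private final byte[] prefix;

    public PrefixDfaSketch(byte[] prefix) {
        this.prefix = prefix.clone();
    }

    /**
     * Walks states 0..prefix.length. State i expects prefix[i]; the final
     * state is accepting and loops on any byte, mirroring
     * addTransition(lastState, lastState, 0, 255) in the original.
     */
    public boolean accepts(byte[] term) {
        int state = 0;
        for (byte b : term) {
            if (state == prefix.length) {
                continue; // accepting state: self-loop on every byte value
            }
            if ((b & 0xff) != (prefix[state] & 0xff)) {
                return false; // no transition for this byte
            }
            state++;
        }
        return state == prefix.length;
    }

    public static void main(String[] args) {
        PrefixDfaSketch dfa = new PrefixDfaSketch("lu".getBytes(StandardCharsets.UTF_8));
        System.out.println(dfa.accepts("lucene".getBytes(StandardCharsets.UTF_8))); // true
        System.out.println(dfa.accepts("solr".getBytes(StandardCharsets.UTF_8)));   // false
        System.out.println(dfa.accepts("l".getBytes(StandardCharsets.UTF_8)));      // false
    }
}
```

Because every state has at most one outgoing transition per byte, the walk never has to track more than one current state, which is the property the original checks with assert automaton.isDeterministic().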

Example 9 with Automaton

use of org.apache.lucene.util.automaton.Automaton in project lucene-solr by apache.

the class TestAutomatonQuery method testNFA.

/**
 * Test that a nondeterministic automaton works correctly. (It will be
 * determinized.)
 */
public void testNFA() throws IOException {
    // accept this or three, the union is an NFA (two transitions for 't' from
    // initial state)
    Automaton nfa = Operations.union(Automata.makeString("this"), Automata.makeString("three"));
    assertAutomatonHits(2, nfa);
}
Also used : Automaton(org.apache.lucene.util.automaton.Automaton)
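The comment in testNFA notes that the union of "this" and "three" has two transitions for 't' from the initial state. A rough way to see what that means is to simulate the union by tracking the set of branches that are still alive after each character. The class below is a hypothetical, standard-library-only sketch; UnionNfaSketch, run and accepts are names invented here and do not reflect how Lucene determinizes automata internally.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical simulation of a union-of-strings NFA; not a Lucene class. */
public class UnionNfaSketch {
    private final String[] words;

    public UnionNfaSketch(String... words) {
        this.words = words.clone();
    }

    /** Returns the indices of the branches (words) still alive after consuming input. */
    public Set<Integer> run(String input) {
        Set<Integer> alive = new HashSet<>();
        for (int w = 0; w < words.length; w++) {
            alive.add(w);
        }
        for (int pos = 0; pos < input.length(); pos++) {
            final int p = pos;
            final char c = input.charAt(pos);
            // Drop every branch that has no transition on c at this position.
            alive.removeIf(w -> p >= words[w].length() || words[w].charAt(p) != c);
        }
        return alive;
    }

    /** Accepts if some surviving branch has consumed its whole word. */
    public boolean accepts(String input) {
        for (int w : run(input)) {
            if (words[w].length() == input.length()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UnionNfaSketch nfa = new UnionNfaSketch("this", "three");
        System.out.println(nfa.run("t").size());  // 2: both branches follow 't'
        System.out.println(nfa.run("thi").size()); // 1: only "this" survives
        System.out.println(nfa.accepts("three"));  // true
        System.out.println(nfa.accepts("these"));  // false
    }
}
```

Tracking a set of live branches is the same idea that determinization makes explicit: each distinct set of live NFA states becomes a single DFA state.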

Example 10 with Automaton

use of org.apache.lucene.util.automaton.Automaton in project lucene-solr by apache.

the class TestAutomatonQueryUnicode method testSortOrder.

/**
 * Test that AutomatonQuery interacts with Lucene's sort order correctly.
 *
 * This expression matches anything that starts with either a character from
 * the Arabic Presentation Forms block or a supplementary character.
 */
public void testSortOrder() throws IOException {
    Automaton a = new RegExp("((𩬅)|ﮔ).*").toAutomaton();
    assertAutomatonHits(2, a);
}
Also used : Automaton(org.apache.lucene.util.automaton.Automaton), RegExp(org.apache.lucene.util.automaton.RegExp)

Aggregations

Automaton (org.apache.lucene.util.automaton.Automaton): 57
TokenStreamToAutomaton (org.apache.lucene.analysis.TokenStreamToAutomaton): 17
IntsRef (org.apache.lucene.util.IntsRef): 13
BytesRef (org.apache.lucene.util.BytesRef): 12
ArrayList (java.util.ArrayList): 11
Directory (org.apache.lucene.store.Directory): 8
HashSet (java.util.HashSet): 7
MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer): 7
Document (org.apache.lucene.document.Document): 6
CompiledAutomaton (org.apache.lucene.util.automaton.CompiledAutomaton): 6
Transition (org.apache.lucene.util.automaton.Transition): 6
TokenStream (org.apache.lucene.analysis.TokenStream): 5
BytesRefBuilder (org.apache.lucene.util.BytesRefBuilder): 5
CharsRefBuilder (org.apache.lucene.util.CharsRefBuilder): 5
CharacterRunAutomaton (org.apache.lucene.util.automaton.CharacterRunAutomaton): 5
Analyzer (org.apache.lucene.analysis.Analyzer): 4
IntsRefBuilder (org.apache.lucene.util.IntsRefBuilder): 4
FiniteStringsIterator (org.apache.lucene.util.automaton.FiniteStringsIterator): 4
LevenshteinAutomata (org.apache.lucene.util.automaton.LevenshteinAutomata): 4
RegExp (org.apache.lucene.util.automaton.RegExp): 4