Example 1 with SpecialTokenRegistry

Use of com.yahoo.prelude.query.parser.SpecialTokenRegistry in project vespa by vespa-engine.

From the class ParserEnvironment, method fromExecutionContext.

// Builds a ParserEnvironment from an execution context, copying only the parts that are present.
public static ParserEnvironment fromExecutionContext(Execution.Context context) {
    ParserEnvironment env = new ParserEnvironment();
    // Without a context there is nothing to copy, so return the empty environment.
    if (context == null) {
        return env;
    }
    if (context.getIndexFacts() != null) {
        env.setIndexFacts(context.getIndexFacts());
    }
    if (context.getLinguistics() != null) {
        env.setLinguistics(context.getLinguistics());
    }
    // Special tokens are taken from the token list named "default".
    SpecialTokenRegistry registry = context.getTokenRegistry();
    if (registry != null) {
        env.setSpecialTokens(registry.getSpecialTokens("default"));
    }
    return env;
}
Also used : SpecialTokenRegistry(com.yahoo.prelude.query.parser.SpecialTokenRegistry)

Example 2 with SpecialTokenRegistry

Use of com.yahoo.prelude.query.parser.SpecialTokenRegistry in project vespa by vespa-engine.

From the class RewriterFeaturesTestCase, method testConvertStringToQTree.

@Test
public final void testConvertStringToQTree() {
    Execution placeholder = new Execution(Context.createContextStub());
    // Register ASCII_ELLIPSIS as the only special token in the "default" token list.
    SpecialTokenRegistry tokenRegistry = new SpecialTokenRegistry(
            new SpecialtokensConfig(
                    new SpecialtokensConfig.Builder().tokenlist(
                            new Tokenlist.Builder().name("default").tokens(
                                    new Tokens.Builder().token(ASCII_ELLIPSIS)))));
    placeholder.context().setTokenRegistry(tokenRegistry);
    Query query = new Query();
    query.getModel().setExecution(placeholder);
    Item parsed = RewriterFeatures.convertStringToQTree(query, "a b c " + ASCII_ELLIPSIS);
    assertSame(AndItem.class, parsed.getClass());
    assertEquals(4, ((CompositeItem) parsed).getItemCount());
    assertEquals(ASCII_ELLIPSIS, ((CompositeItem) parsed).getItem(3).toString());
}
Also used : CompositeItem(com.yahoo.prelude.query.CompositeItem) Item(com.yahoo.prelude.query.Item) AndItem(com.yahoo.prelude.query.AndItem) Execution(com.yahoo.search.searchchain.Execution) SpecialTokenRegistry(com.yahoo.prelude.query.parser.SpecialTokenRegistry) Query(com.yahoo.search.Query) SpecialtokensConfig(com.yahoo.vespa.configdefinition.SpecialtokensConfig) Tokenlist(com.yahoo.vespa.configdefinition.SpecialtokensConfig.Tokenlist) Tokens(com.yahoo.vespa.configdefinition.SpecialtokensConfig.Tokenlist.Tokens) Test(org.junit.Test)

Example 3 with SpecialTokenRegistry

Use of com.yahoo.prelude.query.parser.SpecialTokenRegistry in project vespa by vespa-engine.

From the class TokenizerTestCase, method testSpecialTokenConfigurationOther.

@Test
public void testSpecialTokenConfigurationOther() {
    String tokenFile = "file:src/test/java/com/yahoo/prelude/query/parser/test/specialtokens.cfg";
    SpecialTokenRegistry r = new SpecialTokenRegistry(tokenFile);
    assertEquals("Special tokens configured", 6, r.getSpecialTokens("default").size());
    assertEquals("Special tokens configured", 4, r.getSpecialTokens("other").size());
    Tokenizer tokenizer = new Tokenizer(new SimpleLinguistics());
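    // With the "other" list, "!!!***", "[huh]", "------" and "&&&%%%" stay single tokens, while "b.s.d." is split.
    tokenizer.setSpecialTokens(r.getSpecialTokens("other"));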
    List<?> tokens = tokenizer.tokenize("with space,!!!*** [huh] or ------ " + "know, &&&%%% b.s.d.");
    assertEquals(new Token(WORD, "with"), tokens.get(0));
    assertEquals(new Token(SPACE, " "), tokens.get(1));
    assertEquals(new Token(WORD, "space"), tokens.get(2));
    assertEquals(new Token(COMMA, ","), tokens.get(3));
    assertEquals(new Token(WORD, "!!!***"), tokens.get(4));
    assertEquals(new Token(SPACE, " "), tokens.get(5));
    assertEquals(new Token(WORD, "[huh]"), tokens.get(6));
    assertEquals(new Token(SPACE, " "), tokens.get(7));
    assertEquals(new Token(WORD, "or"), tokens.get(8));
    assertEquals(new Token(SPACE, " "), tokens.get(9));
    assertEquals(new Token(WORD, "------"), tokens.get(10));
    assertEquals(new Token(SPACE, " "), tokens.get(11));
    assertEquals(new Token(WORD, "know"), tokens.get(12));
    assertEquals(new Token(COMMA, ","), tokens.get(13));
    assertEquals(new Token(SPACE, " "), tokens.get(14));
    assertEquals(new Token(WORD, "&&&%%%"), tokens.get(15));
    assertEquals(new Token(SPACE, " "), tokens.get(16));
    assertEquals(new Token(WORD, "b"), tokens.get(17));
    assertEquals(new Token(DOT, "."), tokens.get(18));
    assertEquals(new Token(WORD, "s"), tokens.get(19));
    assertEquals(new Token(DOT, "."), tokens.get(20));
    assertEquals(new Token(WORD, "d"), tokens.get(21));
    assertEquals(new Token(DOT, "."), tokens.get(22));
    assertTrue(((Token) tokens.get(10)).isSpecial());
}
Also used : SimpleLinguistics(com.yahoo.language.simple.SimpleLinguistics) SpecialTokenRegistry(com.yahoo.prelude.query.parser.SpecialTokenRegistry) Token(com.yahoo.prelude.query.parser.Token) Tokenizer(com.yahoo.prelude.query.parser.Tokenizer) Test(org.junit.Test)

Example 4 with SpecialTokenRegistry

Use of com.yahoo.prelude.query.parser.SpecialTokenRegistry in project vespa by vespa-engine.

From the class TokenizerTestCase, method testSpecialTokenConfigurationDefault.

@Test
public void testSpecialTokenConfigurationDefault() {
    String tokenFile = "file:src/test/java/com/yahoo/prelude/query/parser/test/specialtokens.cfg";
    SpecialTokenRegistry r = new SpecialTokenRegistry(tokenFile);
    assertEquals("Special tokens configured", 6, r.getSpecialTokens("default").size());
    assertEquals("Special tokens configured", 4, r.getSpecialTokens("other").size());
    Tokenizer tokenizer = new Tokenizer(new SimpleLinguistics());
    // With the "default" list, "with space", "c++", "...." and "b.s.d." each stay a single token.
    tokenizer.setSpecialTokens(r.getSpecialTokens("default"));
    List<?> tokens = tokenizer.tokenize("with space, c++ or .... know, not b.s.d.");
    assertEquals(new Token(WORD, "with space"), tokens.get(0));
    assertEquals(new Token(COMMA, ","), tokens.get(1));
    assertEquals(new Token(SPACE, " "), tokens.get(2));
    assertEquals(new Token(WORD, "c++"), tokens.get(3));
    assertEquals(new Token(SPACE, " "), tokens.get(4));
    assertEquals(new Token(WORD, "or"), tokens.get(5));
    assertEquals(new Token(SPACE, " "), tokens.get(6));
    assertEquals(new Token(WORD, "...."), tokens.get(7));
    assertEquals(new Token(SPACE, " "), tokens.get(8));
    assertEquals(new Token(WORD, "know"), tokens.get(9));
    assertEquals(new Token(COMMA, ","), tokens.get(10));
    assertEquals(new Token(SPACE, " "), tokens.get(11));
    assertEquals(new Token(WORD, "not"), tokens.get(12));
    assertEquals(new Token(SPACE, " "), tokens.get(13));
    assertEquals(new Token(WORD, "b.s.d."), tokens.get(14));
}
Also used : SimpleLinguistics(com.yahoo.language.simple.SimpleLinguistics) SpecialTokenRegistry(com.yahoo.prelude.query.parser.SpecialTokenRegistry) Token(com.yahoo.prelude.query.parser.Token) Tokenizer(com.yahoo.prelude.query.parser.Tokenizer) Test(org.junit.Test)

Example 5 with SpecialTokenRegistry

Use of com.yahoo.prelude.query.parser.SpecialTokenRegistry in project vespa by vespa-engine.

From the class TokenizerTestCase, method testSpecialTokenConfigurationMissing.

@Test
public void testSpecialTokenConfigurationMissing() {
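    // The configuration file does not exist, so no special tokens apply and "c++" is split into "c", "+", "+".
    String tokenFile = "file:source/bogus/specialtokens.cfg";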
    SpecialTokenRegistry r = new SpecialTokenRegistry(tokenFile);
    Tokenizer tokenizer = new Tokenizer(new SimpleLinguistics());
    tokenizer.setSpecialTokens(r.getSpecialTokens("other"));
    List<?> tokens = tokenizer.tokenize("c++");
    assertEquals(new Token(WORD, "c"), tokens.get(0));
    assertEquals(new Token(PLUS, "+"), tokens.get(1));
    assertEquals(new Token(PLUS, "+"), tokens.get(2));
}
Also used : SimpleLinguistics(com.yahoo.language.simple.SimpleLinguistics) SpecialTokenRegistry(com.yahoo.prelude.query.parser.SpecialTokenRegistry) Token(com.yahoo.prelude.query.parser.Token) Tokenizer(com.yahoo.prelude.query.parser.Tokenizer) Test(org.junit.Test)
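
Examples 4 and 5 differ only in whether a special token list is available: with the "default" list "c++" stays one token, and without configuration it splits into "c", "+", "+". The following is a minimal, self-contained sketch of that idea and not Vespa's implementation. The class SpecialTokenSketch, its register and specialTokens methods, and the longest-match rule are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration only; hypothetical names, not Vespa's Tokenizer or SpecialTokenRegistry. */
public class SpecialTokenSketch {

    // Hypothetical registry: maps a list name such as "default" to its special tokens.
    private final Map<String, List<String>> lists = new HashMap<>();

    public void register(String name, List<String> tokens) {
        lists.put(name, new ArrayList<>(tokens));
    }

    // Unknown names yield an empty list, so callers never need a null check.
    public List<String> specialTokens(String name) {
        return lists.getOrDefault(name, Collections.<String>emptyList());
    }

    // Splits input into tokens. At each position the longest matching special token wins;
    // otherwise a run of letters/digits, or a single other character, becomes a token.
    public static List<String> tokenize(String input, List<String> special) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            String best = null;
            for (String s : special) {
                if (!s.isEmpty() && input.startsWith(s, i)
                        && (best == null || s.length() > best.length())) {
                    best = s;
                }
            }
            if (best != null) {
                out.add(best);
                i += best.length();
            } else if (Character.isLetterOrDigit(input.charAt(i))) {
                int start = i;
                while (i < input.length() && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                out.add(input.substring(start, i));
            } else {
                out.add(String.valueOf(input.charAt(i)));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SpecialTokenSketch registry = new SpecialTokenSketch();
        registry.register("default", Arrays.asList("c++", "...."));
        // Configured list: "c++" is kept whole.
        System.out.println(tokenize("c++", registry.specialTokens("default")));
        // Unknown list: falls back to plain splitting.
        System.out.println(tokenize("c++", registry.specialTokens("other")));
    }
}
```

Returning an empty list for an unknown name mirrors what Example 5 relies on: the tokenizer still works when nothing is configured, it just recognizes no special tokens.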

Aggregations

SpecialTokenRegistry (com.yahoo.prelude.query.parser.SpecialTokenRegistry): 5
Test (org.junit.Test): 4
SimpleLinguistics (com.yahoo.language.simple.SimpleLinguistics): 3
Token (com.yahoo.prelude.query.parser.Token): 3
Tokenizer (com.yahoo.prelude.query.parser.Tokenizer): 3
AndItem (com.yahoo.prelude.query.AndItem): 1
CompositeItem (com.yahoo.prelude.query.CompositeItem): 1
Item (com.yahoo.prelude.query.Item): 1
Query (com.yahoo.search.Query): 1
Execution (com.yahoo.search.searchchain.Execution): 1
SpecialtokensConfig (com.yahoo.vespa.configdefinition.SpecialtokensConfig): 1
Tokenlist (com.yahoo.vespa.configdefinition.SpecialtokensConfig.Tokenlist): 1
Tokens (com.yahoo.vespa.configdefinition.SpecialtokensConfig.Tokenlist.Tokens): 1