
Example 31 with SimpleLinguistics

Use of com.yahoo.language.simple.SimpleLinguistics in project vespa by vespa-engine.

From the class DefaultFieldNameTestCase, method requireThatDefaultFieldNameIsAppliedWhenArgumentIsMissing:

@Test
public void requireThatDefaultFieldNameIsAppliedWhenArgumentIsMissing() throws ParseException {
    IndexingInput input = new IndexingInput("input");
    InputExpression exp = (InputExpression) Expression.newInstance(new ScriptParserContext(new SimpleLinguistics()).setInputStream(input).setDefaultFieldName("foo"));
    assertEquals("foo", exp.getFieldName());
}
Also used : SimpleLinguistics(com.yahoo.language.simple.SimpleLinguistics) InputExpression(com.yahoo.vespa.indexinglanguage.expressions.InputExpression) ScriptParserContext(com.yahoo.vespa.indexinglanguage.ScriptParserContext) Test(org.junit.Test)

Example 32 with SimpleLinguistics

Use of com.yahoo.language.simple.SimpleLinguistics in project vespa by vespa-engine.

From the class NGramTestCase, method requireThatHashCodeAndEqualsAreImplemented:

@Test
public void requireThatHashCodeAndEqualsAreImplemented() {
    Linguistics linguistics = new SimpleLinguistics();
    NGramExpression exp = new NGramExpression(linguistics, 69);
    assertFalse(exp.equals(new Object()));
    assertFalse(exp.equals(new NGramExpression(Mockito.mock(Linguistics.class), 96)));
    assertFalse(exp.equals(new NGramExpression(linguistics, 96)));
    assertEquals(exp, new NGramExpression(linguistics, 69));
    assertEquals(exp.hashCode(), new NGramExpression(new SimpleLinguistics(), 69).hashCode());
}
Also used : SimpleLinguistics(com.yahoo.language.simple.SimpleLinguistics) Linguistics(com.yahoo.language.Linguistics) Test(org.junit.Test)

Example 33 with SimpleLinguistics

Use of com.yahoo.language.simple.SimpleLinguistics in project vespa by vespa-engine.

From the class NGramTestCase, method testNGrams:

@Test
public void testNGrams() {
    ExecutionContext context = new ExecutionContext(new SimpleTestAdapter());
    context.setValue(new StringFieldValue("en gul Bille sang... "));
    new NGramExpression(new SimpleLinguistics(), 3).execute(context);
    StringFieldValue value = (StringFieldValue) context.getValue();
    assertEquals("Grams are pure annotations - field value is unchanged", "en gul Bille sang... ", value.getString());
    SpanTree gramTree = value.getSpanTree(SpanTrees.LINGUISTICS);
    assertNotNull(gramTree);
    SpanList grams = (SpanList) gramTree.getRoot();
    Iterator<SpanNode> i = grams.childIterator();
    // en
    assertSpan(0, 2, true, i, gramTree);
    // <space>
    assertSpan(2, 1, false, i, gramTree);
    // gul
    assertSpan(3, 3, true, i, gramTree);
    // <space>
    assertSpan(6, 1, false, i, gramTree);
    // Bil
    assertSpan(7, 3, true, i, gramTree, "bil");
    // ill
    assertSpan(8, 3, true, i, gramTree);
    // lle
    assertSpan(9, 3, true, i, gramTree);
    // <space>
    assertSpan(12, 1, false, i, gramTree);
    // san
    assertSpan(13, 3, true, i, gramTree);
    // ang
    assertSpan(14, 3, true, i, gramTree);
    // <...space>
    assertSpan(17, 4, false, i, gramTree);
    assertFalse(i.hasNext());
}
Also used : SimpleLinguistics(com.yahoo.language.simple.SimpleLinguistics) SimpleTestAdapter(com.yahoo.vespa.indexinglanguage.SimpleTestAdapter) StringFieldValue(com.yahoo.document.datatypes.StringFieldValue) Test(org.junit.Test)
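
The span offsets asserted in testNGrams can be reproduced with a small standalone sketch. It is not the Vespa NGramExpression implementation; the names NGramSketch, Span and split are hypothetical, and the splitting rule (runs of letters or digits are words, everything else is a separator run) is an assumption inferred from the assertions above. The "bil" argument passed to assertSpan in the test is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Vespa.
final class NGramSketch {

    /** A span of the input: start offset, length, and whether it is a gram or a separator run. */
    static final class Span {
        final int from;
        final int length;
        final boolean isGram;

        Span(int from, int length, boolean isGram) {
            this.from = from;
            this.length = length;
            this.isGram = isGram;
        }

        @Override
        public String toString() {
            return "(" + from + ", " + length + ", " + isGram + ")";
        }
    }

    /**
     * Splits text into spans. A word run of at most n characters becomes a single gram,
     * a longer word run becomes overlapping grams of exactly n characters,
     * and each separator run becomes a single non-gram span.
     * Assumption: works on UTF-16 chars, so characters outside the BMP are not handled.
     */
    static List<Span> split(String text, int n) {
        List<Span> spans = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            boolean word = Character.isLetterOrDigit(text.charAt(i));
            int end = i;
            // Extend the run while characters are of the same kind (word or separator).
            while (end < text.length() && Character.isLetterOrDigit(text.charAt(end)) == word) {
                end++;
            }
            int runLength = end - i;
            if (!word || runLength <= n) {
                spans.add(new Span(i, runLength, word));
            } else {
                // Sliding window of size n over the word run.
                for (int start = i; start + n <= end; start++) {
                    spans.add(new Span(start, n, true));
                }
            }
            i = end;
        }
        return spans;
    }

    public static void main(String[] args) {
        for (Span span : split("en gul Bille sang... ", 3)) {
            System.out.println(span);
        }
    }
}
```

With gram size 3, the word "Bille" at offset 7 yields grams starting at offsets 7, 8 and 9, which lines up with the three consecutive assertSpan calls in the test.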

Example 34 with SimpleLinguistics

Use of com.yahoo.language.simple.SimpleLinguistics in project vespa by vespa-engine.

From the class TokenizerTestCase, method testSpecialTokenConfigurationOther:

@Test
public void testSpecialTokenConfigurationOther() {
    String tokenFile = "file:src/test/java/com/yahoo/prelude/query/parser/test/specialtokens.cfg";
    SpecialTokenRegistry r = new SpecialTokenRegistry(tokenFile);
    assertEquals("Special tokens configured", 6, r.getSpecialTokens("default").size());
    assertEquals("Special tokens configured", 4, r.getSpecialTokens("other").size());
    Tokenizer tokenizer = new Tokenizer(new SimpleLinguistics());
    tokenizer.setSpecialTokens(r.getSpecialTokens("other"));
    List<?> tokens = tokenizer.tokenize("with space,!!!*** [huh] or ------ " + "know, &&&%%% b.s.d.");
    assertEquals(new Token(WORD, "with"), tokens.get(0));
    assertEquals(new Token(SPACE, " "), tokens.get(1));
    assertEquals(new Token(WORD, "space"), tokens.get(2));
    assertEquals(new Token(COMMA, ","), tokens.get(3));
    assertEquals(new Token(WORD, "!!!***"), tokens.get(4));
    assertEquals(new Token(SPACE, " "), tokens.get(5));
    assertEquals(new Token(WORD, "[huh]"), tokens.get(6));
    assertEquals(new Token(SPACE, " "), tokens.get(7));
    assertEquals(new Token(WORD, "or"), tokens.get(8));
    assertEquals(new Token(SPACE, " "), tokens.get(9));
    assertEquals(new Token(WORD, "------"), tokens.get(10));
    assertEquals(new Token(SPACE, " "), tokens.get(11));
    assertEquals(new Token(WORD, "know"), tokens.get(12));
    assertEquals(new Token(COMMA, ","), tokens.get(13));
    assertEquals(new Token(SPACE, " "), tokens.get(14));
    assertEquals(new Token(WORD, "&&&%%%"), tokens.get(15));
    assertEquals(new Token(SPACE, " "), tokens.get(16));
    assertEquals(new Token(WORD, "b"), tokens.get(17));
    assertEquals(new Token(DOT, "."), tokens.get(18));
    assertEquals(new Token(WORD, "s"), tokens.get(19));
    assertEquals(new Token(DOT, "."), tokens.get(20));
    assertEquals(new Token(WORD, "d"), tokens.get(21));
    assertEquals(new Token(DOT, "."), tokens.get(22));
    assertTrue(((Token) tokens.get(10)).isSpecial());
}
Also used : SimpleLinguistics(com.yahoo.language.simple.SimpleLinguistics) SpecialTokenRegistry(com.yahoo.prelude.query.parser.SpecialTokenRegistry) Token(com.yahoo.prelude.query.parser.Token) Tokenizer(com.yahoo.prelude.query.parser.Tokenizer) Test(org.junit.Test)

Example 35 with SimpleLinguistics

Use of com.yahoo.language.simple.SimpleLinguistics in project vespa by vespa-engine.

From the class TokenizerTestCase, method testSpecialTokenCombination:

@Test
public void testSpecialTokenCombination() {
    Tokenizer tokenizer = new Tokenizer(new SimpleLinguistics());
    tokenizer.setSpecialTokens(createSpecialTokens());
    List<?> tokens = tokenizer.tokenize("c#, c++ or .net know, not tcp/ip");
    assertEquals(new Token(WORD, "c#"), tokens.get(0));
    assertEquals(new Token(COMMA, ","), tokens.get(1));
    assertEquals(new Token(SPACE, " "), tokens.get(2));
    assertEquals(new Token(WORD, "c++"), tokens.get(3));
    assertEquals(new Token(SPACE, " "), tokens.get(4));
    assertEquals(new Token(WORD, "or"), tokens.get(5));
    assertEquals(new Token(SPACE, " "), tokens.get(6));
    assertEquals(new Token(WORD, ".net"), tokens.get(7));
    assertEquals(new Token(SPACE, " "), tokens.get(8));
    assertEquals(new Token(WORD, "know"), tokens.get(9));
    assertEquals(new Token(COMMA, ","), tokens.get(10));
    assertEquals(new Token(SPACE, " "), tokens.get(11));
    assertEquals(new Token(WORD, "not"), tokens.get(12));
    assertEquals(new Token(SPACE, " "), tokens.get(13));
    assertEquals(new Token(WORD, "tcp/ip"), tokens.get(14));
}
Also used : SimpleLinguistics(com.yahoo.language.simple.SimpleLinguistics) Token(com.yahoo.prelude.query.parser.Token) Tokenizer(com.yahoo.prelude.query.parser.Tokenizer) Test(org.junit.Test)
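
The token sequence expected by testSpecialTokenCombination can be illustrated with a minimal longest-match sketch. This is not the Vespa Tokenizer; SpecialTokenSketch and its methods are hypothetical names, token kinds (WORD, SPACE, COMMA) are left out, and the special-token list is an assumption, since createSpecialTokens() is not shown in the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not part of Vespa.
final class SpecialTokenSketch {

    /** Returns the longest special token that starts at offset i, or null if none does. */
    static String longestSpecialAt(String text, int i, List<String> specialTokens) {
        String best = null;
        for (String candidate : specialTokens) {
            if (text.startsWith(candidate, i) && (best == null || candidate.length() > best.length())) {
                best = candidate;
            }
        }
        return best;
    }

    /**
     * Tokenizes text into strings. At each position a special token wins if one matches;
     * otherwise a run of letters or digits becomes one token, and any other character
     * becomes a single-character token.
     * Simplification: special tokens are only looked up at token boundaries,
     * and no check is made that a special token ends at a word boundary.
     */
    static List<String> tokenize(String text, List<String> specialTokens) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            String special = longestSpecialAt(text, i, specialTokens);
            if (special != null) {
                tokens.add(special);
                i += special.length();
            } else if (Character.isLetterOrDigit(text.charAt(i))) {
                int end = i;
                while (end < text.length() && Character.isLetterOrDigit(text.charAt(end))) {
                    end++;
                }
                tokens.add(text.substring(i, end));
                i = end;
            } else {
                tokens.add(String.valueOf(text.charAt(i)));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed special tokens; the real list comes from createSpecialTokens() in the test class.
        List<String> special = Arrays.asList("c#", "c++", ".net", "tcp/ip");
        System.out.println(tokenize("c#, c++ or .net know, not tcp/ip", special));
    }
}
```

Without the special-token lookup, "c#" would fall apart into a word and a punctuation character, which is the behavior the test guards against.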

Aggregations

SimpleLinguistics (com.yahoo.language.simple.SimpleLinguistics): 42
Test (org.junit.Test): 37
Token (com.yahoo.prelude.query.parser.Token): 17
Tokenizer (com.yahoo.prelude.query.parser.Tokenizer): 17
Linguistics (com.yahoo.language.Linguistics): 10
Index (com.yahoo.prelude.Index): 7
IndexFacts (com.yahoo.prelude.IndexFacts): 7
StringFieldValue (com.yahoo.document.datatypes.StringFieldValue): 6
AnnotatorConfig (com.yahoo.vespa.indexinglanguage.linguistics.AnnotatorConfig): 5
SpecialTokenRegistry (com.yahoo.prelude.query.parser.SpecialTokenRegistry): 3
Query (com.yahoo.search.Query): 3
Execution (com.yahoo.search.searchchain.Execution): 3
SimpleTestAdapter (com.yahoo.vespa.indexinglanguage.SimpleTestAdapter): 3
InputExpression (com.yahoo.vespa.indexinglanguage.expressions.InputExpression): 3
Pair (com.yahoo.collections.Pair): 2
FieldValue (com.yahoo.document.datatypes.FieldValue): 2
IntegerFieldValue (com.yahoo.document.datatypes.IntegerFieldValue): 2
RendererRegistry (com.yahoo.search.rendering.RendererRegistry): 2
ArithmeticExpression (com.yahoo.vespa.indexinglanguage.expressions.ArithmeticExpression): 2
AttributeExpression (com.yahoo.vespa.indexinglanguage.expressions.AttributeExpression): 2