Example 1 with DelimitedUTF8StringBinaryTokenizerFactory

Use of org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizerFactory in the asterixdb project by apache.

From the class LSMInvertedIndexTestUtils, method createWordInvIndexTestContext.

public static LSMInvertedIndexTestContext createWordInvIndexTestContext(LSMInvertedIndexTestHarness harness,
        InvertedIndexType invIndexType) throws IOException, HyracksDataException {
    ISerializerDeserializer[] fieldSerdes = getNonHashedIndexFieldSerdes(invIndexType);
    // Plain (non-hashed) word tokens, paired with the non-hashed field serdes above.
    ITokenFactory tokenFactory = new UTF8WordTokenFactory();
    IBinaryTokenizerFactory tokenizerFactory =
            new DelimitedUTF8StringBinaryTokenizerFactory(true, false, tokenFactory);
    // The trailing null arguments are passed through to create() unchanged.
    return LSMInvertedIndexTestContext.create(harness, fieldSerdes, fieldSerdes.length - 1, tokenizerFactory,
            invIndexType, null, null, null, null, null, null);
}
Also used:
    DelimitedUTF8StringBinaryTokenizerFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizerFactory)
    IBinaryTokenizerFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.IBinaryTokenizerFactory)
    ITokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.ITokenFactory)
    ISerializerDeserializer (org.apache.hyracks.api.dataflow.value.ISerializerDeserializer)
    UTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.UTF8WordTokenFactory)
    HashedUTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.HashedUTF8WordTokenFactory)

Example 2 with DelimitedUTF8StringBinaryTokenizerFactory

Use of org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizerFactory in the asterixdb project by apache.

From the class LSMInvertedIndexTestUtils, method createHashedWordInvIndexTestContext.

public static LSMInvertedIndexTestContext createHashedWordInvIndexTestContext(LSMInvertedIndexTestHarness harness,
        InvertedIndexType invIndexType) throws IOException, HyracksDataException {
    ISerializerDeserializer[] fieldSerdes = getHashedIndexFieldSerdes(invIndexType);
    // Hashed word tokens, paired with the hashed field serdes above.
    ITokenFactory tokenFactory = new HashedUTF8WordTokenFactory();
    IBinaryTokenizerFactory tokenizerFactory =
            new DelimitedUTF8StringBinaryTokenizerFactory(true, false, tokenFactory);
    // The trailing null arguments are passed through to create() unchanged.
    return LSMInvertedIndexTestContext.create(harness, fieldSerdes, fieldSerdes.length - 1, tokenizerFactory,
            invIndexType, null, null, null, null, null, null);
}
Also used:
    DelimitedUTF8StringBinaryTokenizerFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizerFactory)
    HashedUTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.HashedUTF8WordTokenFactory)
    IBinaryTokenizerFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.IBinaryTokenizerFactory)
    ITokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.ITokenFactory)
    ISerializerDeserializer (org.apache.hyracks.api.dataflow.value.ISerializerDeserializer)
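
The two examples above differ only in the token factory and in the field serdes helper they call. As a rough illustration of what that choice could mean, the following is a conceptual sketch in plain Java, not Hyracks code. It assumes, based only on the class names, that a delimited tokenizer splits a string on delimiters and that the hashed variant stores an integer hash per word instead of the word itself. The class DelimitedTokenizerSketch, its methods, the whitespace-only delimiter, and the use of String.hashCode() are all hypothetical choices made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch only: plain Java, no Hyracks classes involved.
// All names here are hypothetical and exist just for illustration.
class DelimitedTokenizerSketch {

    // Split the input on runs of whitespace and keep each word as a token.
    // Assumption: whitespace is the only delimiter in this sketch.
    static List<String> wordTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.trim().split("\\s+")) {
            if (!part.isEmpty()) { // an empty input yields one empty part
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Replace every word token with an integer hash of that word.
    // Assumption: String.hashCode() stands in for whatever hash a real
    // implementation would use.
    static List<Integer> hashedWordTokens(String text) {
        List<Integer> hashes = new ArrayList<>();
        for (String token : wordTokens(text)) {
            hashes.add(token.hashCode());
        }
        return hashes;
    }

    public static void main(String[] args) {
        String text = "LSM inverted  index";
        System.out.println(wordTokens(text));
        System.out.println(hashedWordTokens(text));
    }
}
```

In this sketch the hashed form gives fixed-size keys at the cost of possible collisions between distinct words, which is one plausible reason for an index to offer both variants.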

Aggregations

ISerializerDeserializer (org.apache.hyracks.api.dataflow.value.ISerializerDeserializer): 2
DelimitedUTF8StringBinaryTokenizerFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizerFactory): 2
HashedUTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.HashedUTF8WordTokenFactory): 2
IBinaryTokenizerFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.IBinaryTokenizerFactory): 2
ITokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.ITokenFactory): 2
UTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.UTF8WordTokenFactory): 1