
Example 1 with DelimitedUTF8StringBinaryTokenizer

Use of org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizer in the Apache AsterixDB project.

From class HashedWordTokensDescriptor, method createEvaluatorFactory.

@Override
public IScalarEvaluatorFactory createEvaluatorFactory(final IScalarEvaluatorFactory[] args) {
    return new IScalarEvaluatorFactory() {

        private static final long serialVersionUID = 1L;

        @Override
        public IScalarEvaluator createScalarEvaluator(IHyracksTaskContext ctx) throws HyracksDataException {
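            // Emit each word as a hashed token.
            ITokenFactory tokenFactory = new HashedUTF8WordTokenFactory();
            // Arguments: ignoreTokenCount = true, sourceHasTypeTag = true.
            IBinaryTokenizer tokenizer = new DelimitedUTF8StringBinaryTokenizer(true, true, tokenFactory);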
            return new WordTokensEvaluator(args, ctx, tokenizer, BuiltinType.AINT32);
        }
    };
}
Also used: HashedUTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.HashedUTF8WordTokenFactory), DelimitedUTF8StringBinaryTokenizer (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizer), IHyracksTaskContext (org.apache.hyracks.api.context.IHyracksTaskContext), IBinaryTokenizer (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.IBinaryTokenizer), WordTokensEvaluator (org.apache.asterix.runtime.evaluators.common.WordTokensEvaluator), ITokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.ITokenFactory), IScalarEvaluatorFactory (org.apache.hyracks.algebricks.runtime.base.IScalarEvaluatorFactory)

Example 2 with DelimitedUTF8StringBinaryTokenizer

Use of org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizer in the Apache AsterixDB project.

From class WordTokensDescriptor, method createEvaluatorFactory.

@Override
public IScalarEvaluatorFactory createEvaluatorFactory(final IScalarEvaluatorFactory[] args) {
    return new IScalarEvaluatorFactory() {

        private static final long serialVersionUID = 1L;

        @Override
        public IScalarEvaluator createScalarEvaluator(IHyracksTaskContext ctx) throws HyracksDataException {
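            // Emit each word as a plain (unhashed) UTF-8 token.
            ITokenFactory tokenFactory = new UTF8WordTokenFactory();
            // Arguments: ignoreTokenCount = true, sourceHasTypeTag = true.
            IBinaryTokenizer tokenizer = new DelimitedUTF8StringBinaryTokenizer(true, true, tokenFactory);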
            return new WordTokensEvaluator(args, ctx, tokenizer, BuiltinType.ASTRING);
        }
    };
}
Also used: DelimitedUTF8StringBinaryTokenizer (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizer), IHyracksTaskContext (org.apache.hyracks.api.context.IHyracksTaskContext), IBinaryTokenizer (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.IBinaryTokenizer), WordTokensEvaluator (org.apache.asterix.runtime.evaluators.common.WordTokensEvaluator), ITokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.ITokenFactory), IScalarEvaluatorFactory (org.apache.hyracks.algebricks.runtime.base.IScalarEvaluatorFactory), UTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.UTF8WordTokenFactory)
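
Examples 1 and 2 differ only in the token factory and the declared item type: hashed tokens are returned as AINT32, plain tokens as ASTRING. The following is a minimal, self-contained sketch of that distinction in plain Java. It does not use the Hyracks classes. WordTokenSketch, wordTokens, and hashedWordTokens are hypothetical names, and the delimiter rule (split on anything that is not a letter or digit), the lower-casing, and the use of String.hashCode are assumptions made for illustration only, not the library's actual behavior.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not part of Hyracks or AsterixDB. */
final class WordTokenSketch {

    /** Plain word tokens: one lower-cased string per word (compare ASTRING). */
    static List<String> wordTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // Assumed rule: letters and digits belong to a token.
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // Any other character acts as a delimiter and closes the token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    /** Hashed word tokens: one 32-bit integer per word (compare AINT32). */
    static List<Integer> hashedWordTokens(String text) {
        List<Integer> hashes = new ArrayList<>();
        for (String token : wordTokens(text)) {
            // String.hashCode is a placeholder for whatever hash the real factory uses.
            hashes.add(token.hashCode());
        }
        return hashes;
    }

    public static void main(String[] args) {
        System.out.println(wordTokens("Hello, big World"));
        System.out.println(hashedWordTokens("Hello, big World"));
    }
}
```

Both methods produce one entry per word; only the representation of each entry changes, which mirrors how the two descriptors share the same tokenizer and evaluator and swap only the factory and item type.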

Example 3 with DelimitedUTF8StringBinaryTokenizer

Use of org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizer in the Apache AsterixDB project.

From class CountHashedWordTokensDescriptor, method createEvaluatorFactory.

@Override
public IScalarEvaluatorFactory createEvaluatorFactory(final IScalarEvaluatorFactory[] args) {
    return new IScalarEvaluatorFactory() {

        private static final long serialVersionUID = 1L;

        @Override
        public IScalarEvaluator createScalarEvaluator(IHyracksTaskContext ctx) throws HyracksDataException {
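            // Emit each word as a hashed token.
            ITokenFactory tokenFactory = new HashedUTF8WordTokenFactory();
            // Arguments: ignoreTokenCount = false (token counts are kept), sourceHasTypeTag = true.
            IBinaryTokenizer tokenizer = new DelimitedUTF8StringBinaryTokenizer(false, true, tokenFactory);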
            return new WordTokensEvaluator(args, ctx, tokenizer, BuiltinType.AINT32);
        }
    };
}
Also used: HashedUTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.HashedUTF8WordTokenFactory), DelimitedUTF8StringBinaryTokenizer (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizer), IHyracksTaskContext (org.apache.hyracks.api.context.IHyracksTaskContext), IBinaryTokenizer (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.IBinaryTokenizer), WordTokensEvaluator (org.apache.asterix.runtime.evaluators.common.WordTokensEvaluator), ITokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.ITokenFactory), IScalarEvaluatorFactory (org.apache.hyracks.algebricks.runtime.base.IScalarEvaluatorFactory)
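
Example 3 passes false as the first constructor argument where Examples 1 and 2 pass true. Assuming that argument decides whether repeated-token counts are ignored, the sketch below shows one way such a flag could change hashed output: when counts are kept, each repeat of a word gets a different hash, so duplicates stay distinguishable. CountedTokenSketch and hashedTokens are hypothetical names, the whitespace split is a simplification, and the mixing formula is an illustrative assumption rather than the scheme Hyracks uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for illustration; not part of Hyracks or AsterixDB. */
final class CountedTokenSketch {

    /**
     * Hashes each whitespace-separated word. When ignoreTokenCount is false,
     * the number of earlier occurrences of the word is mixed into the hash.
     */
    static List<Integer> hashedTokens(String text, boolean ignoreTokenCount) {
        Map<String, Integer> seen = new HashMap<>();
        List<Integer> out = new ArrayList<>();
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields a single empty string from split
            }
            // merge returns the updated count, so subtract 1 for prior occurrences.
            int priorOccurrences = seen.merge(word, 1, Integer::sum) - 1;
            int hash = word.hashCode();
            if (!ignoreTokenCount) {
                // Assumed mixing step: fold the occurrence index into the hash.
                hash = 31 * hash + priorOccurrences;
            }
            out.add(hash);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(hashedTokens("a b a", true));
        System.out.println(hashedTokens("a b a", false));
    }
}
```

With the flag set to true, both occurrences of "a" hash to the same value; with it set to false, they differ, which is the property a counted variant needs when duplicates must be treated as separate items.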

Aggregations

WordTokensEvaluator (org.apache.asterix.runtime.evaluators.common.WordTokensEvaluator): 3
IScalarEvaluatorFactory (org.apache.hyracks.algebricks.runtime.base.IScalarEvaluatorFactory): 3
IHyracksTaskContext (org.apache.hyracks.api.context.IHyracksTaskContext): 3
DelimitedUTF8StringBinaryTokenizer (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.DelimitedUTF8StringBinaryTokenizer): 3
IBinaryTokenizer (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.IBinaryTokenizer): 3
ITokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.ITokenFactory): 3
HashedUTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.HashedUTF8WordTokenFactory): 2
UTF8WordTokenFactory (org.apache.hyracks.storage.am.lsm.invertedindex.tokenizers.UTF8WordTokenFactory): 1