Example 11 with KuduPredicate

use of org.apache.kudu.client.KuduPredicate in project hive by apache.

the class TestKuduPredicateHandler method testNullablePredicates.

@Test
public void testNullablePredicates() throws Exception {
    PrimitiveTypeInfo typeInfo = toHiveType(Type.STRING, null);
    ExprNodeDesc colExpr = new ExprNodeColumnDesc(typeInfo, "null", null, false);
    List<ExprNodeDesc> children = Lists.newArrayList();
    children.add(colExpr);
    for (GenericUDF udf : NULLABLE_UDFS) {
        ExprNodeGenericFuncDesc predicateExpr = new ExprNodeGenericFuncDesc(typeInfo, udf, children);
        // Verify KuduPredicateHandler.decompose
        HiveStoragePredicateHandler.DecomposedPredicate decompose = KuduPredicateHandler.decompose(predicateExpr, SCHEMA);
        // See note in KuduPredicateHandler.newAnalyzer.
        assertNull(decompose);
        List<KuduPredicate> predicates = expressionToPredicates(predicateExpr);
        assertEquals(1, predicates.size());
        scanWithPredicates(predicates);
    }
}
Also used : HiveStoragePredicateHandler(org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler) GenericUDF(org.apache.hadoop.hive.ql.udf.generic.GenericUDF) ExprNodeColumnDesc(org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc) ExprNodeGenericFuncDesc(org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc) ExprNodeDesc(org.apache.hadoop.hive.ql.plan.ExprNodeDesc) PrimitiveTypeInfo(org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo) KuduPredicate(org.apache.kudu.client.KuduPredicate) Test(org.junit.Test)

Example 12 with KuduPredicate

use of org.apache.kudu.client.KuduPredicate in project hive by apache.

the class KuduInputFormat method computeSplits.

private List<KuduInputSplit> computeSplits(Configuration conf) throws IOException {
    try (KuduClient client = KuduHiveUtils.getKuduClient(conf)) {
        // Hive depends on FileSplits so we get the dummy Path for the Splits.
        Job job = Job.getInstance(conf);
        JobContext jobContext = ShimLoader.getHadoopShims().newJobContext(job);
        Path[] paths = FileInputFormat.getInputPaths(jobContext);
        Path dummyPath = paths[0];
        String tableName = conf.get(KUDU_TABLE_NAME_KEY);
        if (StringUtils.isEmpty(tableName)) {
            throw new IllegalArgumentException(KUDU_TABLE_NAME_KEY + " is not set.");
        }
        if (!client.tableExists(tableName)) {
            throw new IllegalArgumentException("Kudu table does not exist: " + tableName);
        }
        KuduTable table = client.openTable(tableName);
        List<KuduPredicate> predicates = KuduPredicateHandler.getPredicates(conf, table.getSchema());
        KuduScanToken.KuduScanTokenBuilder tokenBuilder = client.newScanTokenBuilder(table).setProjectedColumnNames(getProjectedColumns(conf));
        for (KuduPredicate predicate : predicates) {
            tokenBuilder.addPredicate(predicate);
        }
        List<KuduScanToken> tokens = tokenBuilder.build();
        List<KuduInputSplit> splits = new ArrayList<>(tokens.size());
        for (KuduScanToken token : tokens) {
            List<String> locations = new ArrayList<>(token.getTablet().getReplicas().size());
            for (LocatedTablet.Replica replica : token.getTablet().getReplicas()) {
                locations.add(replica.getRpcHost());
            }
            splits.add(new KuduInputSplit(token, dummyPath, locations.toArray(new String[0])));
        }
        return splits;
    }
}
Also used : Path(org.apache.hadoop.fs.Path) KuduScanToken(org.apache.kudu.client.KuduScanToken) ArrayList(java.util.ArrayList) KuduTable(org.apache.kudu.client.KuduTable) LocatedTablet(org.apache.kudu.client.LocatedTablet) KuduPredicate(org.apache.kudu.client.KuduPredicate) KuduClient(org.apache.kudu.client.KuduClient) JobContext(org.apache.hadoop.mapreduce.JobContext) Job(org.apache.hadoop.mapreduce.Job)
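The split assembly in `computeSplits` validates the table name and then turns each scan token into one input split located at its tablet's replica hosts. A minimal stdlib sketch of that shape, with each token stood in by its replica-host list (a hypothetical stand-in for `KuduScanToken`, since the real class needs a live cluster):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stdlib sketch of the validation and locality mapping in computeSplits above.
// Real code reads hosts from token.getTablet().getReplicas(); here a "token"
// is just its replica host list.
public class SplitAssembly {

    static List<String[]> toSplits(String tableName, List<List<String>> tokenReplicaHosts) {
        if (tableName == null || tableName.isEmpty()) {
            // mirrors the KUDU_TABLE_NAME_KEY check in computeSplits
            throw new IllegalArgumentException("table name is not set.");
        }
        List<String[]> splits = new ArrayList<>(tokenReplicaHosts.size());
        for (List<String> hosts : tokenReplicaHosts) {
            // one split per scan token, located at that tablet's replicas
            splits.add(hosts.toArray(new String[0]));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String[]> splits = toSplits("db.table",
            Arrays.asList(Arrays.asList("ts1:7050", "ts2:7050"), Arrays.asList("ts3:7050")));
        System.out.println(splits.size() + " splits, first at " + Arrays.toString(splits.get(0)));
    }
}
```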

Example 13 with KuduPredicate

use of org.apache.kudu.client.KuduPredicate in project hive by apache.

the class TestKuduPredicateHandler method scanWithPredicates.

private void scanWithPredicates(List<KuduPredicate> predicates) throws KuduException {
    // Scan the table with the predicate to be sure there are no exceptions.
    KuduClient client = harness.getClient();
    KuduTable table = client.openTable(TABLE_NAME);
    KuduScanner.KuduScannerBuilder builder = client.newScannerBuilder(table);
    for (KuduPredicate predicate : predicates) {
        builder.addPredicate(predicate);
    }
    KuduScanner scanner = builder.build();
    while (scanner.hasMoreRows()) {
        scanner.nextRows();
    }
}
Also used : KuduScanner(org.apache.kudu.client.KuduScanner) KuduClient(org.apache.kudu.client.KuduClient) KuduTable(org.apache.kudu.client.KuduTable) KuduPredicate(org.apache.kudu.client.KuduPredicate)
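The scan loop above follows the paged-cursor pattern of `KuduScanner`: `hasMoreRows()` / `nextRows()` pull one page of results at a time until the scan is exhausted. A self-contained sketch of that pattern, with `PagedScanner` as a hypothetical stand-in for the Kudu scanner:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Stdlib sketch of the scanner-drain pattern in scanWithPredicates above.
public class ScanDrain {

    // Hypothetical stand-in for KuduScanner's paged cursor.
    static class PagedScanner {
        private final Deque<List<Integer>> pages = new ArrayDeque<>();
        PagedScanner(List<List<Integer>> allPages) { pages.addAll(allPages); }
        boolean hasMoreRows() { return !pages.isEmpty(); }
        List<Integer> nextRows() { return pages.poll(); }
    }

    // Drain every page, counting rows, analogous to the while loop above.
    static int drain(PagedScanner scanner) {
        int rows = 0;
        while (scanner.hasMoreRows()) {
            rows += scanner.nextRows().size();
        }
        return rows;
    }

    public static void main(String[] args) {
        PagedScanner s = new PagedScanner(Arrays.asList(
            Arrays.asList(1, 2, 3), Arrays.asList(4, 5)));
        System.out.println(drain(s)); // prints 5
    }
}
```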

Example 14 with KuduPredicate

use of org.apache.kudu.client.KuduPredicate in project apex-malhar by apache.

the class AbstractKuduPartitionScanner method preparePlanForScanners.

/**
 * The main logic that takes the parsed query and builds the Kudu scan tokens specific to it.
 * Because the Kudu scan token builder generates scan tokens for the whole query and does not
 * differentiate between a distributed system and a single-instance system, this method takes
 * the plan as generated by the builder, sorts the scan tokens, and then shortlists only those
 * segments that were assigned to this operator at partitioning time.
 * @param parsedQuery The parsed query instance
 * @return A list of partition scan metadata objects applicable to this instance of the
 * physical operator, i.e. the operator owning this instance of the scanner.
 * @throws IOException If the scan assignment cannot be serialized
 */
public List<KuduPartitionScanAssignmentMeta> preparePlanForScanners(SQLToKuduPredicatesTranslator parsedQuery) throws IOException {
    List<KuduPredicate> predicateList = parsedQuery.getKuduSQLParseTreeListener().getKuduPredicateList();
    // we will have at least one connection
    ApexKuduConnection apexKuduConnection = verifyConnectionStaleness(0);
    KuduScanToken.KuduScanTokenBuilder builder = apexKuduConnection.getKuduClient().newScanTokenBuilder(apexKuduConnection.getKuduTable());
    builder = builder.setProjectedColumnNames(new ArrayList<>(parsedQuery.getKuduSQLParseTreeListener().getListOfColumnsUsed()));
    for (KuduPredicate aPredicate : predicateList) {
        builder = builder.addPredicate(aPredicate);
    }
    builder.setFaultTolerant(parentOperator.isFaultTolerantScanner());
    Map<String, String> optionsUsedForThisQuery = parentOperator.getOptionsEnabledForCurrentQuery();
    if (optionsUsedForThisQuery.containsKey(KuduSQLParseTreeListener.READ_SNAPSHOT_TIME)) {
        try {
            long readSnapShotTime = Long.parseLong(optionsUsedForThisQuery.get(KuduSQLParseTreeListener.READ_SNAPSHOT_TIME));
            builder = builder.readMode(AsyncKuduScanner.ReadMode.READ_AT_SNAPSHOT);
            builder = builder.snapshotTimestampMicros(readSnapShotTime);
            LOG.info("Using read snapshot for this query as " + readSnapShotTime);
        } catch (Exception ex) {
            LOG.error("Cannot parse the read snapshot time " + ex.getMessage(), ex);
        }
    }
    List<KuduScanToken> allPossibleScanTokens = builder.build();
    // Make sure we deal with a sorted list of scan tokens
    Collections.sort(allPossibleScanTokens, new Comparator<KuduScanToken>() {

        @Override
        public int compare(KuduScanToken left, KuduScanToken right) {
            return left.compareTo(right);
        }
    });
    LOG.info(" Query will scan " + allPossibleScanTokens.size() + " tablets");
    if (LOG.isDebugEnabled()) {
        LOG.debug(" Predicates scheduled for this query are " + predicateList.size());
        for (int i = 0; i < allPossibleScanTokens.size(); i++) {
            LOG.debug("A tablet scheduled for all operators scanning is " + allPossibleScanTokens.get(i).getTablet());
        }
    }
    List<KuduPartitionScanAssignmentMeta> partitionPieForThisOperator = parentOperator.getPartitionPieAssignment();
    List<KuduPartitionScanAssignmentMeta> returnOfAssignments = new ArrayList<>();
    int totalScansForThisQuery = allPossibleScanTokens.size();
    int counterForPartAssignments = 0;
    for (KuduPartitionScanAssignmentMeta aPartofThePie : partitionPieForThisOperator) {
        if (aPartofThePie.getOrdinal() < totalScansForThisQuery) {
            // a given query plan might have less scantokens
            KuduPartitionScanAssignmentMeta aMetaForThisQuery = new KuduPartitionScanAssignmentMeta();
            aMetaForThisQuery.setTotalSize(totalScansForThisQuery);
            aMetaForThisQuery.setOrdinal(counterForPartAssignments);
            counterForPartAssignments += 1;
            aMetaForThisQuery.setCurrentQuery(parsedQuery.getSqlExpresssion());
            // we pick up only those ordinals that are part of the original partition pie assignment
            KuduScanToken aTokenForThisOperator = allPossibleScanTokens.get(aPartofThePie.getOrdinal());
            aMetaForThisQuery.setSerializedKuduScanToken(aTokenForThisOperator.serialize());
            returnOfAssignments.add(aMetaForThisQuery);
            LOG.debug("Added query scan for this operator " + aMetaForThisQuery + " with scan tablet as " + allPossibleScanTokens.get(aPartofThePie.getOrdinal()).getTablet());
        }
    }
    LOG.info(" A total of " + returnOfAssignments.size() + " scan tokens have been scheduled for this operator");
    return returnOfAssignments;
}
Also used : KuduScanToken(org.apache.kudu.client.KuduScanToken) ApexKuduConnection(org.apache.apex.malhar.kudu.ApexKuduConnection) ArrayList(java.util.ArrayList) KuduPredicate(org.apache.kudu.client.KuduPredicate) IOException(java.io.IOException)
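The shortlist loop at the end of `preparePlanForScanners` keeps, from the full sorted token list, only the ordinals this operator owns from the partition pie, re-numbering its picks from zero, and skips pie ordinals beyond the token count since a query may yield fewer scan tokens than the pie anticipated. A stdlib sketch of just that logic, with tokens stood in by their ordinal ints (hypothetical stand-ins):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stdlib sketch of the partition-pie shortlisting in preparePlanForScanners.
public class PartitionPie {

    // assignedOrdinals: ordinals this operator owns from the partition pie.
    // totalScans: number of scan tokens the current query actually produced.
    // Returns {newOrdinal, tokenOrdinal} pairs for the tokens this operator runs.
    static List<int[]> shortlist(List<Integer> assignedOrdinals, int totalScans) {
        List<int[]> picks = new ArrayList<>();
        int counter = 0;
        for (int ordinal : assignedOrdinals) {
            if (ordinal < totalScans) { // a given query plan might have fewer scan tokens
                picks.add(new int[] { counter++, ordinal });
            }
        }
        return picks;
    }

    public static void main(String[] args) {
        // Operator owns pie ordinals 1 and 3, but this query produced only
        // 3 scan tokens, so ordinal 3 is skipped and ordinal 1 is re-numbered 0.
        for (int[] pick : shortlist(Arrays.asList(1, 3), 3)) {
            System.out.println(Arrays.toString(pick)); // prints [0, 1]
        }
    }
}
```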

Example 15 with KuduPredicate

use of org.apache.kudu.client.KuduPredicate in project presto by prestodb.

the class KuduClientSession method addConstraintPredicates.

/**
 * translates TupleDomain to KuduPredicates.
 *
 * @return false if TupleDomain or one of its domains is none
 */
private boolean addConstraintPredicates(KuduTable table, KuduScanToken.KuduScanTokenBuilder builder, TupleDomain<ColumnHandle> constraintSummary) {
    if (constraintSummary.isNone()) {
        return false;
    } else if (!constraintSummary.isAll()) {
        Schema schema = table.getSchema();
        for (TupleDomain.ColumnDomain<ColumnHandle> columnDomain : constraintSummary.getColumnDomains().get()) {
            int position = ((KuduColumnHandle) columnDomain.getColumn()).getOrdinalPosition();
            ColumnSchema columnSchema = schema.getColumnByIndex(position);
            Domain domain = columnDomain.getDomain();
            if (domain.isNone()) {
                return false;
            } else if (domain.isAll()) {
            // no restriction
            } else if (domain.isOnlyNull()) {
                builder.addPredicate(KuduPredicate.newIsNullPredicate(columnSchema));
            } else if (domain.getValues().isAll() && domain.isNullAllowed()) {
                builder.addPredicate(KuduPredicate.newIsNotNullPredicate(columnSchema));
            } else if (domain.isSingleValue()) {
                KuduPredicate predicate = createEqualsPredicate(columnSchema, domain.getSingleValue());
                builder.addPredicate(predicate);
            } else {
                ValueSet valueSet = domain.getValues();
                if (valueSet instanceof EquatableValueSet) {
                    DiscreteValues discreteValues = valueSet.getDiscreteValues();
                    KuduPredicate predicate = createInListPredicate(columnSchema, discreteValues);
                    builder.addPredicate(predicate);
                } else if (valueSet instanceof SortedRangeSet) {
                    Ranges ranges = ((SortedRangeSet) valueSet).getRanges();
                    Range span = ranges.getSpan();
                    Marker low = span.getLow();
                    if (!low.isLowerUnbounded()) {
                        KuduPredicate.ComparisonOp op = (low.getBound() == Marker.Bound.ABOVE) ? KuduPredicate.ComparisonOp.GREATER : KuduPredicate.ComparisonOp.GREATER_EQUAL;
                        KuduPredicate predicate = createComparisonPredicate(columnSchema, op, low.getValue());
                        builder.addPredicate(predicate);
                    }
                    Marker high = span.getHigh();
                    if (!high.isUpperUnbounded()) {
                        KuduPredicate.ComparisonOp op = (high.getBound() == Marker.Bound.BELOW) ? KuduPredicate.ComparisonOp.LESS : KuduPredicate.ComparisonOp.LESS_EQUAL;
                        KuduPredicate predicate = createComparisonPredicate(columnSchema, op, high.getValue());
                        builder.addPredicate(predicate);
                    }
                } else {
                    throw new IllegalStateException("Unexpected domain: " + domain);
                }
            }
        }
    }
    return true;
}
Also used : Ranges(com.facebook.presto.common.predicate.Ranges) Schema(org.apache.kudu.Schema) ColumnSchema(org.apache.kudu.ColumnSchema) EquatableValueSet(com.facebook.presto.common.predicate.EquatableValueSet) ColumnSchema(org.apache.kudu.ColumnSchema) Marker(com.facebook.presto.common.predicate.Marker) Range(com.facebook.presto.common.predicate.Range) KuduPredicate(org.apache.kudu.client.KuduPredicate) SortedRangeSet(com.facebook.presto.common.predicate.SortedRangeSet) DiscreteValues(com.facebook.presto.common.predicate.DiscreteValues) Domain(com.facebook.presto.common.predicate.Domain) TupleDomain(com.facebook.presto.common.predicate.TupleDomain) EquatableValueSet(com.facebook.presto.common.predicate.EquatableValueSet) ValueSet(com.facebook.presto.common.predicate.ValueSet)
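The bound-to-operator mapping in `addConstraintPredicates` is the core of the range translation: an open lower bound (`Marker.Bound.ABOVE`) becomes `GREATER` and a closed one `GREATER_EQUAL`; an open upper bound (`BELOW`) becomes `LESS` and a closed one `LESS_EQUAL`. A self-contained sketch of that mapping, with `Bound` and `Op` as stand-ins for the Presto and Kudu enums:

```java
// Stdlib sketch of the bound-to-comparison-op mapping above.
public class BoundMapping {
    // Stand-ins for com.facebook.presto.common.predicate.Marker.Bound
    // and org.apache.kudu.client.KuduPredicate.ComparisonOp.
    enum Bound { ABOVE, EXACTLY, BELOW }
    enum Op { GREATER, GREATER_EQUAL, LESS, LESS_EQUAL }

    // Open lower bound (x > v) vs closed lower bound (x >= v).
    static Op lowerOp(Bound b) {
        return b == Bound.ABOVE ? Op.GREATER : Op.GREATER_EQUAL;
    }

    // Open upper bound (x < v) vs closed upper bound (x <= v).
    static Op upperOp(Bound b) {
        return b == Bound.BELOW ? Op.LESS : Op.LESS_EQUAL;
    }

    public static void main(String[] args) {
        // The range (5, 9], i.e. x > 5 AND x <= 9:
        System.out.println(lowerOp(Bound.ABOVE) + " " + upperOp(Bound.EXACTLY)); // prints GREATER LESS_EQUAL
    }
}
```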

Aggregations

KuduPredicate (org.apache.kudu.client.KuduPredicate): 16
ColumnSchema (org.apache.kudu.ColumnSchema): 10
HiveStoragePredicateHandler (org.apache.hadoop.hive.ql.metadata.HiveStoragePredicateHandler): 7
ExprNodeColumnDesc (org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc): 7
ExprNodeDesc (org.apache.hadoop.hive.ql.plan.ExprNodeDesc): 7
ExprNodeGenericFuncDesc (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc): 7
PrimitiveTypeInfo (org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo): 7
Test (org.junit.Test): 7
ArrayList (java.util.ArrayList): 6
ExprNodeConstantDesc (org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc): 6
KuduScanner (org.apache.kudu.client.KuduScanner): 6
IOException (java.io.IOException): 5
GoraException (org.apache.gora.util.GoraException): 4
GenericUDFOPNot (org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPNot): 4
KuduException (org.apache.kudu.client.KuduException): 4
GenericUDF (org.apache.hadoop.hive.ql.udf.generic.GenericUDF): 3
GenericUDFOPEqualOrGreaterThan (org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrGreaterThan): 3
RowResult (org.apache.kudu.client.RowResult): 3
GenericUDFOPAnd (org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPAnd): 2
GenericUDFOPEqualOrLessThan (org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPEqualOrLessThan): 2