
Example 11 with ExpressionTypeManager

use of io.confluent.ksql.execution.util.ExpressionTypeManager in project ksql by confluentinc.

the class SelectionUtil method buildProjectionSchema.

/*
   * The algorithm behind this method feels unnecessarily complicated and is begging
   * for someone to come along and improve it, but until that time here is
   * a description of what's going on.
   *
   * Essentially, we need to build a logical schema that mirrors the physical
   * schema until https://github.com/confluentinc/ksql/issues/6374 is addressed.
   * That means that the keys must be ordered in the same way as the parent schema
   * (e.g. if the source schema was K1 INT KEY, K2 INT KEY and the projection is
   * SELECT K2, K1 this method will produce an output schema that is K1, K2
   * despite the way that the keys were ordered in the projection) - see
   * https://github.com/confluentinc/ksql/pull/7477 for context on the bug.
   *
   * But we cannot simply select all the keys and then the values, we must maintain
   * the interleaving of key and values because transient queries return all columns
   * to the user as "value columns". If someone issues a SELECT VALUE, * FROM FOO
   * it is expected that VALUE shows up _before_ the key fields. This means we need to
   * reorder the key columns within the list of projections without affecting the
   * relative order of the keys/values.
   *
   * To spice things up even further, there's the possibility that the same key is
   * aliased multiple times (SELECT K1 AS X, K1 AS Y FROM ...), which is not supported
   * but is verified later when building the final projection - so we maintain it here.
   *
   * Now on to the algorithm itself: we make two passes through the list of projections.
   * The first pass builds a mapping from source key to all the projections for that key.
   * We will use this mapping to sort the keys in the second pass. This mapping is two
   * dimensional to address the possibility of the same key with multiple aliases.
   *
   * The second pass goes through the list of projections again and builds the logical schema,
   * but this time if we encounter a projection that references a key column, we instead take
   * it from the list we built in the first pass (in order defined by the parent schema).
   */
public static LogicalSchema buildProjectionSchema(final LogicalSchema parentSchema, final List<SelectExpression> projection, final FunctionRegistry functionRegistry) {
    final ExpressionTypeManager expressionTypeManager = new ExpressionTypeManager(parentSchema, functionRegistry);
    // keyExpressions[i] represents the expressions found in projection
    // that are associated with parentSchema's key at index i
    final List<List<SelectExpression>> keyExpressions = new ArrayList<>(parentSchema.key().size());
    for (int i = 0; i < parentSchema.key().size(); i++) {
        keyExpressions.add(new ArrayList<>());
    }
    // first pass to construct keyExpressions, keyExpressionMembership
    // is just a convenience data structure so that we don't have to do
    // the isKey check in the second iteration below
    final Set<SelectExpression> keyExpressionMembership = new HashSet<>();
    for (final SelectExpression select : projection) {
        final Expression expression = select.getExpression();
        if (expression instanceof ColumnReferenceExp) {
            final ColumnName name = ((ColumnReferenceExp) expression).getColumnName();
            parentSchema.findColumn(name).filter(c -> c.namespace() == Namespace.KEY).ifPresent(c -> {
                keyExpressions.get(c.index()).add(select);
                keyExpressionMembership.add(select);
            });
        }
    }
    // second pass, which iterates the projections but ignores any key expressions,
    // instead taking them from the ordered keyExpressions list
    final Builder builder = LogicalSchema.builder();
    int currKeyIdx = 0;
    for (final SelectExpression select : projection) {
        if (keyExpressionMembership.contains(select)) {
            while (keyExpressions.get(currKeyIdx).isEmpty()) {
                currKeyIdx++;
            }
            final SelectExpression keyExp = keyExpressions.get(currKeyIdx).remove(0);
            final SqlType type = expressionTypeManager.getExpressionSqlType(keyExp.getExpression());
            builder.keyColumn(keyExp.getAlias(), type);
        } else {
            final Expression expression = select.getExpression();
            final SqlType type = expressionTypeManager.getExpressionSqlType(expression);
            if (type == null) {
                throw new IllegalArgumentException("Can't infer a type of null. Please explicitly cast " + "it to a required type, e.g. CAST(null AS VARCHAR).");
            }
            builder.valueColumn(select.getAlias(), type);
        }
    }
    return builder.build();
}
Also used : IntStream(java.util.stream.IntStream) Expression(io.confluent.ksql.execution.expression.tree.Expression) ColumnName(io.confluent.ksql.name.ColumnName) FunctionRegistry(io.confluent.ksql.function.FunctionRegistry) UnqualifiedColumnReferenceExp(io.confluent.ksql.execution.expression.tree.UnqualifiedColumnReferenceExp) Set(java.util.Set) LogicalSchema(io.confluent.ksql.schema.ksql.LogicalSchema) Collectors(java.util.stream.Collectors) SelectExpression(io.confluent.ksql.execution.plan.SelectExpression) SelectItem(io.confluent.ksql.parser.tree.SelectItem) Namespace(io.confluent.ksql.schema.ksql.Column.Namespace) Builder(io.confluent.ksql.schema.ksql.LogicalSchema.Builder) ArrayList(java.util.ArrayList) HashSet(java.util.HashSet) List(java.util.List) SingleColumn(io.confluent.ksql.parser.tree.SingleColumn) Stream(java.util.stream.Stream) ExpressionTypeManager(io.confluent.ksql.execution.util.ExpressionTypeManager) Optional(java.util.Optional) AllColumns(io.confluent.ksql.parser.tree.AllColumns) ColumnReferenceExp(io.confluent.ksql.execution.expression.tree.ColumnReferenceExp) Column(io.confluent.ksql.schema.ksql.Column) SqlType(io.confluent.ksql.schema.ksql.types.SqlType)
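
The two-pass reordering described in the comment on buildProjectionSchema can be tried out without any ksql classes. The following is a minimal sketch, not the project's code: the class KeyOrderSketch, the Item type, and the reorder method are hypothetical stand-ins that use plain strings for column names and skip type resolution entirely. It only illustrates how key slots in the projection are refilled in parent-schema order while value columns keep their positions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class KeyOrderSketch {

    // Hypothetical stand-in for SelectExpression: a source column name and its alias.
    static final class Item {
        final String source;
        final String alias;

        Item(final String source, final String alias) {
            this.source = source;
            this.alias = alias;
        }
    }

    // Returns the output columns, with key columns ordered as in parentKeys.
    static List<String> reorder(final List<String> parentKeys, final List<Item> projection) {
        // buckets.get(i) holds the projection items that reference parentKeys.get(i)
        final List<List<Item>> buckets = new ArrayList<>();
        for (int i = 0; i < parentKeys.size(); i++) {
            buckets.add(new ArrayList<>());
        }
        // First pass: group key items by their index in the parent key list.
        final Set<Item> keyItems = new HashSet<>();
        for (final Item item : projection) {
            final int idx = parentKeys.indexOf(item.source);
            if (idx >= 0) {
                buckets.get(idx).add(item);
                keyItems.add(item);
            }
        }
        // Second pass: every slot occupied by a key is filled from the buckets,
        // in parent order; value items stay where they are.
        final List<String> out = new ArrayList<>();
        int curr = 0;
        for (final Item item : projection) {
            if (keyItems.contains(item)) {
                while (buckets.get(curr).isEmpty()) {
                    curr++;
                }
                out.add("KEY " + buckets.get(curr).remove(0).alias);
            } else {
                out.add("VALUE " + item.alias);
            }
        }
        return out;
    }

    public static void main(final String[] args) {
        // Parent schema keys K1, K2; projection is SELECT V, K2, K1.
        final List<Item> projection = List.of(
            new Item("V", "V"), new Item("K2", "K2"), new Item("K1", "K1"));
        // prints [VALUE V, KEY K1, KEY K2]
        System.out.println(reorder(List.of("K1", "K2"), projection));
    }
}
```

The value column V stays first, as the comment requires for transient queries, while the two key slots are swapped back into parent order.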

Example 12 with ExpressionTypeManager

use of io.confluent.ksql.execution.util.ExpressionTypeManager in project ksql by confluentinc.

the class PartitionByParamsFactory method buildSchema.

private static LogicalSchema buildSchema(final LogicalSchema sourceSchema, final List<Expression> partitionBys, final FunctionRegistry functionRegistry, final List<PartitionByColumn> partitionByCols) {
    final ExpressionTypeManager expressionTypeManager = new ExpressionTypeManager(sourceSchema, functionRegistry);
    final List<SqlType> keyTypes = partitionBys.stream().map(expressionTypeManager::getExpressionSqlType).collect(Collectors.toList());
    if (isPartitionByNull(partitionBys)) {
        final Builder builder = LogicalSchema.builder();
        builder.valueColumns(sourceSchema.value());
        return builder.build();
    } else {
        final Builder builder = LogicalSchema.builder();
        for (int i = 0; i < partitionBys.size(); i++) {
            builder.keyColumn(partitionByCols.get(i).name, keyTypes.get(i));
        }
        builder.valueColumns(sourceSchema.value());
        for (int i = 0; i < partitionBys.size(); i++) {
            if (partitionByCols.get(i).shouldAppend) {
                // New key column added, so copy it into the value schema:
                builder.valueColumn(partitionByCols.get(i).name, keyTypes.get(i));
            }
        }
        return builder.build();
    }
}
Also used : ExpressionTypeManager(io.confluent.ksql.execution.util.ExpressionTypeManager) Builder(io.confluent.ksql.schema.ksql.LogicalSchema.Builder) SqlType(io.confluent.ksql.schema.ksql.types.SqlType)

Example 13 with ExpressionTypeManager

use of io.confluent.ksql.execution.util.ExpressionTypeManager in project ksql by confluentinc.

the class StreamFlatMapBuilder method buildSchema.

public static LogicalSchema buildSchema(final LogicalSchema inputSchema, final List<FunctionCall> tableFunctions, final FunctionRegistry functionRegistry) {
    final LogicalSchema.Builder schemaBuilder = LogicalSchema.builder();
    final List<Column> cols = inputSchema.value();
    // We copy all the original columns to the output schema
    schemaBuilder.keyColumns(inputSchema.key());
    for (final Column col : cols) {
        schemaBuilder.valueColumn(col);
    }
    final ExpressionTypeManager expressionTypeManager = new ExpressionTypeManager(inputSchema, functionRegistry);
    // And add new columns representing the exploded values at the end
    for (int i = 0; i < tableFunctions.size(); i++) {
        final FunctionCall functionCall = tableFunctions.get(i);
        final ColumnName colName = ColumnNames.synthesisedSchemaColumn(i);
        final SqlType fieldType = expressionTypeManager.getExpressionSqlType(functionCall);
        schemaBuilder.valueColumn(colName, fieldType);
    }
    return schemaBuilder.build();
}
Also used : ExpressionTypeManager(io.confluent.ksql.execution.util.ExpressionTypeManager) ColumnName(io.confluent.ksql.name.ColumnName) Column(io.confluent.ksql.schema.ksql.Column) LogicalSchema(io.confluent.ksql.schema.ksql.LogicalSchema) SqlType(io.confluent.ksql.schema.ksql.types.SqlType) FunctionCall(io.confluent.ksql.execution.expression.tree.FunctionCall)
