Example 1 with NestedColumn

Use of org.apache.flink.table.planner.plan.utils.NestedColumn in the Apache Flink project.

From the class PushProjectIntoTableSourceScanRule, method onMatch.

@Override
public void onMatch(RelOptRuleCall call) {
    final LogicalProject project = call.rel(0);
    final LogicalTableScan scan = call.rel(1);
    final TableSourceTable sourceTable = scan.getTable().unwrap(TableSourceTable.class);
    final boolean supportsNestedProjection = supportsNestedProjection(sourceTable.tableSource());
    final int[] refFields = RexNodeExtractor.extractRefInputFields(project.getProjects());
    if (!supportsNestedProjection && refFields.length == scan.getRowType().getFieldCount()) {
        // There is no top-level projection and nested projections aren't supported.
        return;
    }
    final FlinkTypeFactory typeFactory = unwrapTypeFactory(scan);
    final ResolvedSchema schema = sourceTable.contextResolvedTable().getResolvedSchema();
    final RowType producedType = createProducedType(schema, sourceTable.tableSource());
    final NestedSchema projectedSchema = NestedProjectionUtil.build(getProjections(project, scan), typeFactory.buildRelNodeRowType(producedType));
    if (!supportsNestedProjection) {
        for (NestedColumn column : projectedSchema.columns().values()) {
            column.markLeaf();
        }
    }
    final List<SourceAbilitySpec> abilitySpecs = new ArrayList<>();
    final RowType newProducedType = performPushDown(sourceTable, projectedSchema, producedType, abilitySpecs);
    final DynamicTableSource newTableSource = sourceTable.tableSource().copy();
    final SourceAbilityContext context = SourceAbilityContext.from(scan);
    abilitySpecs.forEach(spec -> spec.apply(newTableSource, context));
    final RelDataType newRowType = typeFactory.buildRelNodeRowType(newProducedType);
    final TableSourceTable newSource = sourceTable.copy(newTableSource, newRowType, abilitySpecs.toArray(new SourceAbilitySpec[0]));
    final LogicalTableScan newScan = new LogicalTableScan(scan.getCluster(), scan.getTraitSet(), scan.getHints(), newSource);
    final LogicalProject newProject = project.copy(project.getTraitSet(), newScan, rewriteProjections(call, newSource, projectedSchema), project.getRowType());
    if (ProjectRemoveRule.isTrivial(newProject)) {
        call.transformTo(newScan);
    } else {
        call.transformTo(newProject);
    }
}
Also used : SourceAbilitySpec(org.apache.flink.table.planner.plan.abilities.source.SourceAbilitySpec) ArrayList(java.util.ArrayList) RowType(org.apache.flink.table.types.logical.RowType) NestedColumn(org.apache.flink.table.planner.plan.utils.NestedColumn) RelDataType(org.apache.calcite.rel.type.RelDataType) LogicalTableScan(org.apache.calcite.rel.logical.LogicalTableScan) FlinkTypeFactory(org.apache.flink.table.planner.calcite.FlinkTypeFactory) SourceAbilityContext(org.apache.flink.table.planner.plan.abilities.source.SourceAbilityContext) LogicalProject(org.apache.calcite.rel.logical.LogicalProject) TableSourceTable(org.apache.flink.table.planner.plan.schema.TableSourceTable) ResolvedSchema(org.apache.flink.table.catalog.ResolvedSchema) DynamicTableSource(org.apache.flink.table.connector.source.DynamicTableSource) NestedSchema(org.apache.flink.table.planner.plan.utils.NestedSchema)
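
The guard at the top of onMatch and the final isTrivial check can be modelled without Calcite or Flink types. The sketch below is a simplified illustration, not the planner's implementation: nothingToPush and isIdentity are hypothetical helpers, plain int arrays stand in for the RexNode projections, and it assumes that refFields holds distinct top-level indices and that a trivial project is essentially an identity mapping over the fields of the new scan.

```java
class PushDownGuardSketch {

    /**
     * Hypothetical helper mirroring the early return in onMatch: there is nothing
     * to push when every top-level field is referenced and the source cannot
     * prune nested fields. Assumes refFields holds distinct top-level indices.
     */
    static boolean nothingToPush(boolean supportsNestedProjection, int[] refFields, int fieldCount) {
        return !supportsNestedProjection && refFields.length == fieldCount;
    }

    /**
     * Hypothetical identity check: a projection that selects input field i at
     * output position i, for every input field, adds nothing on top of the scan.
     */
    static boolean isIdentity(int[] projectedIndices, int inputFieldCount) {
        if (projectedIndices.length != inputFieldCount) {
            return false;
        }
        for (int i = 0; i < projectedIndices.length; i++) {
            if (projectedIndices[i] != i) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(nothingToPush(false, new int[] {0, 1, 2}, 3)); // true
        System.out.println(nothingToPush(false, new int[] {0, 2}, 3));    // false
        System.out.println(isIdentity(new int[] {0, 1}, 2));              // true
        System.out.println(isIdentity(new int[] {1, 0}, 2));              // false
    }
}
```

In the rule itself, the second decision picks between call.transformTo(newScan) and call.transformTo(newProject).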

Example 2 with NestedColumn

Use of org.apache.flink.table.planner.plan.utils.NestedColumn in the Apache Flink project.

From the class PushProjectIntoTableSourceScanRule, method performPushDown.

private RowType performPushDown(TableSourceTable source, NestedSchema projectedSchema, RowType producedType, List<SourceAbilitySpec> abilitySpecs) {
    final int numPhysicalColumns;
    final List<NestedColumn> projectedMetadataColumns;
    if (supportsMetadata(source.tableSource())) {
        final List<String> declaredMetadataKeys = createRequiredMetadataKeys(source.contextResolvedTable().getResolvedSchema(), source.tableSource());
        numPhysicalColumns = producedType.getFieldCount() - declaredMetadataKeys.size();
        projectedMetadataColumns = IntStream.range(0, declaredMetadataKeys.size())
                .mapToObj(i -> producedType.getFieldNames().get(numPhysicalColumns + i))
                .map(fieldName -> projectedSchema.columns().get(fieldName))
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    } else {
        numPhysicalColumns = producedType.getFieldCount();
        projectedMetadataColumns = Collections.emptyList();
    }
    final int[][] physicalProjections;
    if (supportsProjectionPushDown(source.tableSource())) {
        projectedMetadataColumns.forEach(metaColumn -> projectedSchema.columns().remove(metaColumn.name()));
        physicalProjections = NestedProjectionUtil.convertToIndexArray(projectedSchema);
        projectedMetadataColumns.forEach(metaColumn -> projectedSchema.columns().put(metaColumn.name(), metaColumn));
    } else {
        physicalProjections = IntStream.range(0, numPhysicalColumns).mapToObj(columnIndex -> new int[] { columnIndex }).toArray(int[][]::new);
    }
    final int[][] projectedFields = Stream.concat(
                    Stream.of(physicalProjections),
                    projectedMetadataColumns.stream()
                            .map(NestedColumn::indexInOriginSchema)
                            .map(columnIndex -> new int[] { columnIndex }))
            .toArray(int[][]::new);
    int newIndex = physicalProjections.length;
    for (NestedColumn metaColumn : projectedMetadataColumns) {
        metaColumn.setIndexOfLeafInNewSchema(newIndex++);
    }
    if (supportsProjectionPushDown(source.tableSource())) {
        final RowType projectedPhysicalType = (RowType) Projection.of(physicalProjections).project(producedType);
        abilitySpecs.add(new ProjectPushDownSpec(physicalProjections, projectedPhysicalType));
    }
    final RowType newProducedType = (RowType) Projection.of(projectedFields).project(producedType);
    if (supportsMetadata(source.tableSource())) {
        final List<String> projectedMetadataKeys = projectedMetadataColumns.stream().map(NestedColumn::name).collect(Collectors.toList());
        abilitySpecs.add(new ReadingMetadataSpec(projectedMetadataKeys, newProducedType));
    }
    return newProducedType;
}
Also used : IntStream(java.util.stream.IntStream) NestedProjectionUtil(org.apache.flink.table.planner.plan.utils.NestedProjectionUtil) Arrays(java.util.Arrays) ShortcutUtils.unwrapTypeFactory(org.apache.flink.table.planner.utils.ShortcutUtils.unwrapTypeFactory) SourceAbilityContext(org.apache.flink.table.planner.plan.abilities.source.SourceAbilityContext) Column(org.apache.flink.table.catalog.Column) ResolvedSchema(org.apache.flink.table.catalog.ResolvedSchema) RexNodeExtractor(org.apache.flink.table.planner.plan.utils.RexNodeExtractor) FlinkTypeFactory(org.apache.flink.table.planner.calcite.FlinkTypeFactory) RowType(org.apache.flink.table.types.logical.RowType) SupportsProjectionPushDown(org.apache.flink.table.connector.source.abilities.SupportsProjectionPushDown) ArrayList(java.util.ArrayList) RexNode(org.apache.calcite.rex.RexNode) NestedSchema(org.apache.flink.table.planner.plan.utils.NestedSchema) Projection(org.apache.flink.table.connector.Projection) ProjectRemoveRule(org.apache.calcite.rel.rules.ProjectRemoveRule) DynamicSourceUtils.createProducedType(org.apache.flink.table.planner.connectors.DynamicSourceUtils.createProducedType) RelDataType(org.apache.calcite.rel.type.RelDataType) DynamicTableSource(org.apache.flink.table.connector.source.DynamicTableSource) TableConfig(org.apache.flink.table.api.TableConfig) LogicalProject(org.apache.calcite.rel.logical.LogicalProject) ProjectPushDownSpec(org.apache.flink.table.planner.plan.abilities.source.ProjectPushDownSpec) TableException(org.apache.flink.table.api.TableException) ShortcutUtils.unwrapContext(org.apache.flink.table.planner.utils.ShortcutUtils.unwrapContext) RelRule(org.apache.calcite.plan.RelRule) NestedColumn(org.apache.flink.table.planner.plan.utils.NestedColumn) Collectors(java.util.stream.Collectors) DynamicSourceUtils.createRequiredMetadataKeys(org.apache.flink.table.planner.connectors.DynamicSourceUtils.createRequiredMetadataKeys) SourceAbilitySpec(org.apache.flink.table.planner.plan.abilities.source.SourceAbilitySpec) TableSourceTable(org.apache.flink.table.planner.plan.schema.TableSourceTable) RelOptRuleCall(org.apache.calcite.plan.RelOptRuleCall) RexInputRef(org.apache.calcite.rex.RexInputRef) Objects(java.util.Objects) DynamicSourceUtils(org.apache.flink.table.planner.connectors.DynamicSourceUtils) RelOptRule(org.apache.calcite.plan.RelOptRule) List(java.util.List) Stream(java.util.stream.Stream) UniqueConstraint(org.apache.flink.table.catalog.UniqueConstraint) SupportsReadingMetadata(org.apache.flink.table.connector.source.abilities.SupportsReadingMetadata) ReadingMetadataSpec(org.apache.flink.table.planner.plan.abilities.source.ReadingMetadataSpec) Internal(org.apache.flink.annotation.Internal) Collections(java.util.Collections) LogicalTableScan(org.apache.calcite.rel.logical.LogicalTableScan)
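
The index bookkeeping in performPushDown is easier to follow on plain arrays. The following is a minimal sketch under stated assumptions, not Flink code: MetaColumn is a hypothetical stand-in for NestedColumn, and combine is a hypothetical helper that reproduces only two steps of the method, namely appending one single-element path per projected metadata column after the physical projection paths, and numbering those metadata columns starting at physicalProjections.length.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class ProjectedFieldsSketch {

    /** Hypothetical stand-in for NestedColumn: a name, its original index, and its new index. */
    static final class MetaColumn {
        final String name;
        final int indexInOriginSchema;
        int indexInNewSchema = -1; // assigned by combine()

        MetaColumn(String name, int indexInOriginSchema) {
            this.name = name;
            this.indexInOriginSchema = indexInOriginSchema;
        }
    }

    /**
     * Appends one single-element index path per metadata column after the physical
     * projection paths, then numbers the metadata columns in the new schema.
     */
    static int[][] combine(int[][] physicalProjections, List<MetaColumn> metaColumns) {
        int[][] projectedFields = Stream.concat(
                        Stream.of(physicalProjections),
                        metaColumns.stream().map(c -> new int[] {c.indexInOriginSchema}))
                .toArray(int[][]::new);
        // Metadata columns come right after the physical columns in the new schema.
        int newIndex = physicalProjections.length;
        for (MetaColumn c : metaColumns) {
            c.indexInNewSchema = newIndex++;
        }
        return projectedFields;
    }

    public static void main(String[] args) {
        // Two physical paths: top-level field 0, and nested field 1 inside field 2.
        int[][] physical = {{0}, {2, 1}};
        List<MetaColumn> meta = List.of(new MetaColumn("offset", 4));
        int[][] projected = combine(physical, meta);
        System.out.println(Arrays.deepToString(projected)); // [[0], [2, 1], [4]]
        System.out.println(meta.get(0).indexInNewSchema);   // 2
    }
}
```

In the original method, the physical paths alone feed ProjectPushDownSpec, while the combined array determines newProducedType.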

Example 3 with NestedColumn

Use of org.apache.flink.table.planner.plan.utils.NestedColumn in the Apache Flink project.

From the class ProjectWatermarkAssignerTransposeRule, method onMatch.

@Override
public void onMatch(RelOptRuleCall call) {
    LogicalProject project = call.rel(0);
    LogicalWatermarkAssigner watermarkAssigner = call.rel(1);
    // NOTES: DON'T use the nestedSchema datatype to build the transposed project.
    NestedSchema nestedSchema = getUsedFieldsInTopLevelProjectAndWatermarkAssigner(project, watermarkAssigner);
    FlinkRelBuilder builder = (FlinkRelBuilder) call.builder().push(watermarkAssigner.getInput());
    List<RexInputRef> transposedProjects = new LinkedList<>();
    List<String> usedNames = new LinkedList<>();
    // add the used column RexInputRef and names into list
    for (NestedColumn column : nestedSchema.columns().values()) {
        // mark by hand
        column.setIndexOfLeafInNewSchema(transposedProjects.size());
        column.markLeaf();
        usedNames.add(column.name());
        transposedProjects.add(builder.field(column.indexInOriginSchema()));
    }
    // get the rowtime field index in the transposed project
    String rowTimeName = watermarkAssigner.getRowType().getFieldNames().get(watermarkAssigner.rowtimeFieldIndex());
    int indexOfRowTimeInTransposedProject;
    if (nestedSchema.columns().get(rowTimeName) == null) {
        // push the RexInputRef of the rowtime into the list
        int rowTimeIndexInInput = watermarkAssigner.rowtimeFieldIndex();
        indexOfRowTimeInTransposedProject = transposedProjects.size();
        transposedProjects.add(builder.field(rowTimeIndexInInput));
        usedNames.add(rowTimeName);
    } else {
        // find rowtime ref in the list and mark the location
        indexOfRowTimeInTransposedProject = nestedSchema.columns().get(rowTimeName).indexOfLeafInNewSchema();
    }
    // the rowtime column has no rowtime indicator
    builder.project(transposedProjects, usedNames);
    // rewrite the top level field reference
    RexNode newWatermarkExpr = watermarkAssigner.watermarkExpr().accept(new RexShuttle() {

        @Override
        public RexNode visitInputRef(RexInputRef inputRef) {
            String fieldName = watermarkAssigner.getRowType().getFieldNames().get(inputRef.getIndex());
            return builder.field(nestedSchema.columns().get(fieldName).indexOfLeafInNewSchema());
        }
    });
    builder.watermark(indexOfRowTimeInTransposedProject, newWatermarkExpr);
    List<RexNode> newProjects = NestedProjectionUtil.rewrite(project.getProjects(), nestedSchema, call.builder().getRexBuilder());
    RelNode newProject = builder.project(newProjects, project.getRowType().getFieldNames()).build();
    call.transformTo(newProject);
}
Also used : RexShuttle(org.apache.calcite.rex.RexShuttle) NestedColumn(org.apache.flink.table.planner.plan.utils.NestedColumn) LinkedList(java.util.LinkedList) RelNode(org.apache.calcite.rel.RelNode) LogicalWatermarkAssigner(org.apache.flink.table.planner.plan.nodes.calcite.LogicalWatermarkAssigner) FlinkRelBuilder(org.apache.flink.table.planner.calcite.FlinkRelBuilder) RexInputRef(org.apache.calcite.rex.RexInputRef) LogicalProject(org.apache.calcite.rel.logical.LogicalProject) NestedSchema(org.apache.flink.table.planner.plan.utils.NestedSchema) RexNode(org.apache.calcite.rex.RexNode)
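
The rowtime handling in the middle of this method comes down to a "reuse or append" lookup. The sketch below illustrates that step on plain lists, as a simplification rather than the rule's actual code: resolveRowtimeIndex is a hypothetical helper, and the position of a name in usedNames stands in for indexOfLeafInNewSchema.

```java
import java.util.ArrayList;
import java.util.List;

class RowtimeIndexSketch {

    /**
     * Returns the position of the rowtime field in the transposed projection.
     * If the field is not among the used names yet, it is appended (usedNames is
     * mutated), because the watermark assigner needs it even when the top-level
     * projection does not reference it.
     */
    static int resolveRowtimeIndex(
            List<String> inputFieldNames, int rowtimeIndexInInput, List<String> usedNames) {
        String rowtimeName = inputFieldNames.get(rowtimeIndexInInput);
        int position = usedNames.indexOf(rowtimeName);
        if (position < 0) {
            usedNames.add(rowtimeName);
            return usedNames.size() - 1;
        }
        return position;
    }

    public static void main(String[] args) {
        List<String> input = List.of("id", "payload", "ts");

        // The projection does not use "ts", so it is appended at the end.
        List<String> used = new ArrayList<>(List.of("id"));
        System.out.println(resolveRowtimeIndex(input, 2, used)); // 1
        System.out.println(used);                                // [id, ts]

        // The projection already uses "ts", so its existing position is reused.
        List<String> usedWithTs = new ArrayList<>(List.of("ts", "id"));
        System.out.println(resolveRowtimeIndex(input, 2, usedWithTs)); // 0
    }
}
```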

Aggregations

LogicalProject (org.apache.calcite.rel.logical.LogicalProject) 3
NestedColumn (org.apache.flink.table.planner.plan.utils.NestedColumn) 3
NestedSchema (org.apache.flink.table.planner.plan.utils.NestedSchema) 3
ArrayList (java.util.ArrayList) 2
LogicalTableScan (org.apache.calcite.rel.logical.LogicalTableScan) 2
RelDataType (org.apache.calcite.rel.type.RelDataType) 2
RexInputRef (org.apache.calcite.rex.RexInputRef) 2
RexNode (org.apache.calcite.rex.RexNode) 2
ResolvedSchema (org.apache.flink.table.catalog.ResolvedSchema) 2
DynamicTableSource (org.apache.flink.table.connector.source.DynamicTableSource) 2
FlinkTypeFactory (org.apache.flink.table.planner.calcite.FlinkTypeFactory) 2
SourceAbilityContext (org.apache.flink.table.planner.plan.abilities.source.SourceAbilityContext) 2
SourceAbilitySpec (org.apache.flink.table.planner.plan.abilities.source.SourceAbilitySpec) 2
TableSourceTable (org.apache.flink.table.planner.plan.schema.TableSourceTable) 2
RowType (org.apache.flink.table.types.logical.RowType) 2
Arrays (java.util.Arrays) 1
Collections (java.util.Collections) 1
LinkedList (java.util.LinkedList) 1
List (java.util.List) 1
Objects (java.util.Objects) 1