Example 46 with DualInputPlanNode

Use of org.apache.flink.optimizer.plan.DualInputPlanNode in project flink by apache.

In class AdditionalOperatorsTest, method testCrossWithSmall.

@Test
public void testCrossWithSmall() {
    // construct the plan
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(DEFAULT_PARALLELISM);
    DataSet<Long> set1 = env.generateSequence(0, 1);
    DataSet<Long> set2 = env.generateSequence(0, 1);
    set1.crossWithTiny(set2).name("Cross").output(new DiscardingOutputFormat<Tuple2<Long, Long>>());
    try {
        Plan plan = env.createProgramPlan();
        OptimizedPlan oPlan = compileWithStats(plan);
        OptimizerPlanNodeResolver resolver = new OptimizerPlanNodeResolver(oPlan);
        DualInputPlanNode crossPlanNode = resolver.getNode("Cross");
        Channel in1 = crossPlanNode.getInput1();
        Channel in2 = crossPlanNode.getInput2();
        assertEquals(ShipStrategyType.FORWARD, in1.getShipStrategy());
        assertEquals(ShipStrategyType.BROADCAST, in2.getShipStrategy());
    } catch (CompilerException ce) {
        ce.printStackTrace();
        fail("The Flink optimizer is unable to compile this plan correctly.");
    }
}
Also used : DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Tuple2(org.apache.flink.api.java.tuple.Tuple2) Channel(org.apache.flink.optimizer.plan.Channel) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) Test(org.junit.Test)
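
For contrast, the symmetric size hint crossWithHuge should flip the two ship strategies, broadcasting the first input and forwarding the second. The following is a minimal sketch under the assumption that the same test harness is available (DEFAULT_PARALLELISM, compileWithStats, and OptimizerPlanNodeResolver come from the surrounding compiler test base class):

@Test
public void testCrossWithLarge() {
    // same plan as above, but the second input is hinted to be the larger side
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(DEFAULT_PARALLELISM);
    DataSet<Long> set1 = env.generateSequence(0, 1);
    DataSet<Long> set2 = env.generateSequence(0, 1);
    set1.crossWithHuge(set2).name("Cross").output(new DiscardingOutputFormat<Tuple2<Long, Long>>());
    try {
        Plan plan = env.createProgramPlan();
        OptimizedPlan oPlan = compileWithStats(plan);
        OptimizerPlanNodeResolver resolver = new OptimizerPlanNodeResolver(oPlan);
        DualInputPlanNode crossPlanNode = resolver.getNode("Cross");
        // with the "huge" hint on the second input the roles flip:
        // the first input is broadcast, the second stays local
        assertEquals(ShipStrategyType.BROADCAST, crossPlanNode.getInput1().getShipStrategy());
        assertEquals(ShipStrategyType.FORWARD, crossPlanNode.getInput2().getShipStrategy());
    } catch (CompilerException ce) {
        ce.printStackTrace();
        fail("The Flink optimizer is unable to compile this plan correctly.");
    }
}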

Example 47 with DualInputPlanNode

Use of org.apache.flink.optimizer.plan.DualInputPlanNode in project flink by apache.

In class TwoInputNode, method instantiate.

protected void instantiate(OperatorDescriptorDual operator, Channel in1, Channel in2, List<Set<? extends NamedChannel>> broadcastPlanChannels, List<PlanNode> target, CostEstimator estimator, RequestedGlobalProperties globPropsReq1, RequestedGlobalProperties globPropsReq2, RequestedLocalProperties locPropsReq1, RequestedLocalProperties locPropsReq2) {
    final PlanNode inputSource1 = in1.getSource();
    final PlanNode inputSource2 = in2.getSource();
    for (List<NamedChannel> broadcastChannelsCombination : Sets.cartesianProduct(broadcastPlanChannels)) {
        boolean validCombination = true;
        // check whether the broadcast inputs use the same plan candidate at the branching point
        for (int i = 0; i < broadcastChannelsCombination.size(); i++) {
            NamedChannel nc = broadcastChannelsCombination.get(i);
            PlanNode bcSource = nc.getSource();
            if (!(areBranchCompatible(bcSource, inputSource1) || areBranchCompatible(bcSource, inputSource2))) {
                validCombination = false;
                break;
            }
            // check branch compatibility against all other broadcast variables
            for (int k = 0; k < i; k++) {
                PlanNode otherBcSource = broadcastChannelsCombination.get(k).getSource();
                if (!areBranchCompatible(bcSource, otherBcSource)) {
                    validCombination = false;
                    break;
                }
            }
        }
        if (!validCombination) {
            continue;
        }
        placePipelineBreakersIfNecessary(operator.getStrategy(), in1, in2);
        DualInputPlanNode node = operator.instantiate(in1, in2, this);
        node.setBroadcastInputs(broadcastChannelsCombination);
        SemanticProperties semPropsGlobalPropFiltering = getSemanticPropertiesForGlobalPropertyFiltering();
        GlobalProperties gp1 = in1.getGlobalProperties().clone().filterBySemanticProperties(semPropsGlobalPropFiltering, 0);
        GlobalProperties gp2 = in2.getGlobalProperties().clone().filterBySemanticProperties(semPropsGlobalPropFiltering, 1);
        GlobalProperties combined = operator.computeGlobalProperties(gp1, gp2);
        SemanticProperties semPropsLocalPropFiltering = getSemanticPropertiesForLocalPropertyFiltering();
        LocalProperties lp1 = in1.getLocalProperties().clone().filterBySemanticProperties(semPropsLocalPropFiltering, 0);
        LocalProperties lp2 = in2.getLocalProperties().clone().filterBySemanticProperties(semPropsLocalPropFiltering, 1);
        LocalProperties locals = operator.computeLocalProperties(lp1, lp2);
        node.initProperties(combined, locals);
        node.updatePropertiesWithUniqueSets(getUniqueFields());
        target.add(node);
    }
}
Also used : DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) SemanticProperties(org.apache.flink.api.common.operators.SemanticProperties) PlanNode(org.apache.flink.optimizer.plan.PlanNode) RequestedGlobalProperties(org.apache.flink.optimizer.dataproperties.RequestedGlobalProperties) GlobalProperties(org.apache.flink.optimizer.dataproperties.GlobalProperties) NamedChannel(org.apache.flink.optimizer.plan.NamedChannel) RequestedLocalProperties(org.apache.flink.optimizer.dataproperties.RequestedLocalProperties) LocalProperties(org.apache.flink.optimizer.dataproperties.LocalProperties)
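
The combination loop above is driven by Guava's Sets.cartesianProduct, which enumerates every way of picking exactly one candidate channel per broadcast input; each resulting list is then screened for branch compatibility. Flink ships a shaded copy of Guava, but the behavior is the same as in the standalone sketch below, where strings are hypothetical stand-ins for NamedChannel candidates:

import com.google.common.collect.ImmutableSet;
import com.google.common.collect.Sets;
import java.util.List;
import java.util.Set;

public class CartesianProductDemo {
    public static void main(String[] args) {
        // candidate channels for two broadcast inputs
        Set<String> bcInput1 = ImmutableSet.of("bc1-candidateA", "bc1-candidateB");
        Set<String> bcInput2 = ImmutableSet.of("bc2-candidateA");
        // prints [bc1-candidateA, bc2-candidateA] and [bc1-candidateB, bc2-candidateA]:
        // one combination per way of choosing a candidate for each broadcast input
        for (List<String> combination : Sets.cartesianProduct(bcInput1, bcInput2)) {
            System.out.println(combination);
        }
    }
}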

Example 48 with DualInputPlanNode

Use of org.apache.flink.optimizer.plan.DualInputPlanNode in project flink by apache.

In class GenericFlatTypePostPass, method traverse.

@SuppressWarnings("unchecked")
protected void traverse(PlanNode node, T parentSchema, boolean createUtilities) {
    // distinguish the node types
    if (node instanceof SinkPlanNode) {
        SinkPlanNode sn = (SinkPlanNode) node;
        Channel inchannel = sn.getInput();
        T schema = createEmptySchema();
        sn.postPassHelper = schema;
        // add the sinks information to the schema
        try {
            getSinkSchema(sn, schema);
        } catch (ConflictingFieldTypeInfoException e) {
            throw new CompilerPostPassException("Conflicting type infomation for the data sink '" + sn.getSinkNode().getOperator().getName() + "'.");
        }
        // descend to the input channel
        try {
            propagateToChannel(schema, inchannel, createUtilities);
        } catch (MissingFieldTypeInfoException ex) {
            throw new CompilerPostPassException("Missing type infomation for the channel that inputs to the data sink '" + sn.getSinkNode().getOperator().getName() + "'.");
        }
    } else if (node instanceof SourcePlanNode) {
        if (createUtilities) {
            ((SourcePlanNode) node).setSerializer(createSerializer(parentSchema, node));
        // nothing else to be done here. the source has no input and no strategy itself
        }
    } else if (node instanceof BulkIterationPlanNode) {
        BulkIterationPlanNode iterationNode = (BulkIterationPlanNode) node;
        // get the nodes current schema
        T schema;
        if (iterationNode.postPassHelper == null) {
            schema = createEmptySchema();
            iterationNode.postPassHelper = schema;
        } else {
            schema = (T) iterationNode.postPassHelper;
        }
        schema.increaseNumConnectionsThatContributed();
        // add the parent schema to the schema
        if (propagateParentSchemaDown) {
            addSchemaToSchema(parentSchema, schema, iterationNode.getProgramOperator().getName());
        }
        // check whether all outgoing channels have already contributed; if not, come back later
        if (schema.getNumConnectionsThatContributed() < iterationNode.getOutgoingChannels().size()) {
            return;
        }
        if (iterationNode.getRootOfStepFunction() instanceof NAryUnionPlanNode) {
            throw new CompilerException("Optimizer cannot compile an iteration step function where next partial solution is created by a Union node.");
        }
        // traverse the termination criterion for the first time. create schema only, no utilities. Needed in case of intermediate termination criterion
        if (iterationNode.getRootOfTerminationCriterion() != null) {
            SingleInputPlanNode addMapper = (SingleInputPlanNode) iterationNode.getRootOfTerminationCriterion();
            traverse(addMapper.getInput().getSource(), createEmptySchema(), false);
            try {
                addMapper.getInput().setSerializer(createSerializer(createEmptySchema()));
            } catch (MissingFieldTypeInfoException e) {
                throw new RuntimeException(e);
            }
        }
        // traverse the step function for the first time. create schema only, no utilities
        traverse(iterationNode.getRootOfStepFunction(), schema, false);
        T pss = (T) iterationNode.getPartialSolutionPlanNode().postPassHelper;
        if (pss == null) {
            throw new CompilerException("Error in Optimizer Post Pass: Partial solution schema is null after first traversal of the step function.");
        }
        // traverse the step function for the second time, taking the schema of the partial solution
        traverse(iterationNode.getRootOfStepFunction(), pss, createUtilities);
        if (iterationNode.getRootOfTerminationCriterion() != null) {
            SingleInputPlanNode addMapper = (SingleInputPlanNode) iterationNode.getRootOfTerminationCriterion();
            traverse(addMapper.getInput().getSource(), createEmptySchema(), createUtilities);
            try {
                addMapper.getInput().setSerializer(createSerializer(createEmptySchema()));
            } catch (MissingFieldTypeInfoException e) {
                throw new RuntimeException(e);
            }
        }
        // take the schema from the partial solution node and add its fields to the iteration result schema.
        // input and output schema need to be identical, so this is essentially a sanity check
        addSchemaToSchema(pss, schema, iterationNode.getProgramOperator().getName());
        // set the serializer
        if (createUtilities) {
            iterationNode.setSerializerForIterationChannel(createSerializer(pss, iterationNode.getPartialSolutionPlanNode()));
        }
        // done, we can now propagate our info down
        try {
            propagateToChannel(schema, iterationNode.getInput(), createUtilities);
        } catch (MissingFieldTypeInfoException e) {
            throw new CompilerPostPassException("Could not set up runtime strategy for input channel to node '" + iterationNode.getProgramOperator().getName() + "'. Missing type information for key field " + e.getFieldNumber());
        }
    } else if (node instanceof WorksetIterationPlanNode) {
        WorksetIterationPlanNode iterationNode = (WorksetIterationPlanNode) node;
        // get the nodes current schema
        T schema;
        if (iterationNode.postPassHelper == null) {
            schema = createEmptySchema();
            iterationNode.postPassHelper = schema;
        } else {
            schema = (T) iterationNode.postPassHelper;
        }
        schema.increaseNumConnectionsThatContributed();
        // add the parent schema to the schema (which refers to the solution set schema)
        if (propagateParentSchemaDown) {
            addSchemaToSchema(parentSchema, schema, iterationNode.getProgramOperator().getName());
        }
        // check whether all outgoing channels have already contributed; if not, come back later
        if (schema.getNumConnectionsThatContributed() < iterationNode.getOutgoingChannels().size()) {
            return;
        }
        if (iterationNode.getNextWorkSetPlanNode() instanceof NAryUnionPlanNode) {
            throw new CompilerException("Optimizer cannot compile a workset iteration step function where the next workset is produced by a Union node.");
        }
        if (iterationNode.getSolutionSetDeltaPlanNode() instanceof NAryUnionPlanNode) {
            throw new CompilerException("Optimizer cannot compile a workset iteration step function where the solution set delta is produced by a Union node.");
        }
        // traverse the step function
        // pass an empty schema to the next workset and the parent schema to the solution set delta
        // these first traversals are schema only
        traverse(iterationNode.getNextWorkSetPlanNode(), createEmptySchema(), false);
        traverse(iterationNode.getSolutionSetDeltaPlanNode(), schema, false);
        T wss = (T) iterationNode.getWorksetPlanNode().postPassHelper;
        T sss = (T) iterationNode.getSolutionSetPlanNode().postPassHelper;
        if (wss == null) {
            throw new CompilerException("Error in Optimizer Post Pass: Workset schema is null after first traversal of the step function.");
        }
        if (sss == null) {
            throw new CompilerException("Error in Optimizer Post Pass: Solution set schema is null after first traversal of the step function.");
        }
        // make the second pass and instantiate the utilities
        traverse(iterationNode.getNextWorkSetPlanNode(), wss, createUtilities);
        traverse(iterationNode.getSolutionSetDeltaPlanNode(), sss, createUtilities);
        // the solution set input and the result must have the same schema; merging them acts as a sanity check
        try {
            for (Map.Entry<Integer, X> entry : sss) {
                Integer pos = entry.getKey();
                schema.addType(pos, entry.getValue());
            }
        } catch (ConflictingFieldTypeInfoException e) {
            throw new CompilerPostPassException("Conflicting type information for field " + e.getFieldNumber() + " in node '" + iterationNode.getProgramOperator().getName() + "'. Contradicting types between the " + "result of the iteration and the solution set schema: " + e.getPreviousType() + " and " + e.getNewType() + ". Most probable cause: Invalid constant field annotations.");
        }
        // set the serializers and comparators
        if (createUtilities) {
            WorksetIterationNode optNode = iterationNode.getIterationNode();
            iterationNode.setWorksetSerializer(createSerializer(wss, iterationNode.getWorksetPlanNode()));
            iterationNode.setSolutionSetSerializer(createSerializer(sss, iterationNode.getSolutionSetPlanNode()));
            try {
                iterationNode.setSolutionSetComparator(createComparator(optNode.getSolutionSetKeyFields(), null, sss));
            } catch (MissingFieldTypeInfoException ex) {
                throw new CompilerPostPassException("Could not set up the solution set for workset iteration '" + optNode.getOperator().getName() + "'. Missing type information for key field " + ex.getFieldNumber() + '.');
            }
        }
        // done, we can now propagate our info down
        try {
            propagateToChannel(schema, iterationNode.getInitialSolutionSetInput(), createUtilities);
            propagateToChannel(wss, iterationNode.getInitialWorksetInput(), createUtilities);
        } catch (MissingFieldTypeInfoException ex) {
            throw new CompilerPostPassException("Could not set up runtime strategy for input channel to node '" + iterationNode.getProgramOperator().getName() + "'. Missing type information for key field " + ex.getFieldNumber());
        }
    } else if (node instanceof SingleInputPlanNode) {
        SingleInputPlanNode sn = (SingleInputPlanNode) node;
        // get the nodes current schema
        T schema;
        if (sn.postPassHelper == null) {
            schema = createEmptySchema();
            sn.postPassHelper = schema;
        } else {
            schema = (T) sn.postPassHelper;
        }
        schema.increaseNumConnectionsThatContributed();
        SingleInputNode optNode = sn.getSingleInputNode();
        // add the parent schema to the schema
        if (propagateParentSchemaDown) {
            addSchemaToSchema(parentSchema, schema, optNode, 0);
        }
        // check whether all outgoing channels have already contributed; if not, come back later
        if (schema.getNumConnectionsThatContributed() < sn.getOutgoingChannels().size()) {
            return;
        }
        // add the nodes local information
        try {
            getSingleInputNodeSchema(sn, schema);
        } catch (ConflictingFieldTypeInfoException e) {
            throw new CompilerPostPassException(getConflictingTypeErrorMessage(e, optNode.getOperator().getName()));
        }
        if (createUtilities) {
            // parameterize the node's driver strategy
            for (int i = 0; i < sn.getDriverStrategy().getNumRequiredComparators(); i++) {
                try {
                    sn.setComparator(createComparator(sn.getKeys(i), sn.getSortOrders(i), schema), i);
                } catch (MissingFieldTypeInfoException e) {
                    throw new CompilerPostPassException("Could not set up runtime strategy for node '" + optNode.getOperator().getName() + "'. Missing type information for key field " + e.getFieldNumber());
                }
            }
        }
        // done, we can now propagate our info down
        try {
            propagateToChannel(schema, sn.getInput(), createUtilities);
        } catch (MissingFieldTypeInfoException e) {
            throw new CompilerPostPassException("Could not set up runtime strategy for input channel to node '" + optNode.getOperator().getName() + "'. Missing type information for field " + e.getFieldNumber());
        }
        // don't forget the broadcast inputs
        for (Channel c : sn.getBroadcastInputs()) {
            try {
                propagateToChannel(createEmptySchema(), c, createUtilities);
            } catch (MissingFieldTypeInfoException e) {
                throw new CompilerPostPassException("Could not set up runtime strategy for broadcast channel in node '" + optNode.getOperator().getName() + "'. Missing type information for field " + e.getFieldNumber());
            }
        }
    } else if (node instanceof DualInputPlanNode) {
        DualInputPlanNode dn = (DualInputPlanNode) node;
        // get the nodes current schema
        T schema1;
        T schema2;
        if (dn.postPassHelper1 == null) {
            schema1 = createEmptySchema();
            schema2 = createEmptySchema();
            dn.postPassHelper1 = schema1;
            dn.postPassHelper2 = schema2;
        } else {
            schema1 = (T) dn.postPassHelper1;
            schema2 = (T) dn.postPassHelper2;
        }
        schema1.increaseNumConnectionsThatContributed();
        schema2.increaseNumConnectionsThatContributed();
        TwoInputNode optNode = dn.getTwoInputNode();
        // add the parent schema to the schema
        if (propagateParentSchemaDown) {
            addSchemaToSchema(parentSchema, schema1, optNode, 0);
            addSchemaToSchema(parentSchema, schema2, optNode, 1);
        }
        // check whether all outgoing channels have already contributed; if not, come back later
        if (schema1.getNumConnectionsThatContributed() < dn.getOutgoingChannels().size()) {
            return;
        }
        // add the nodes local information
        try {
            getDualInputNodeSchema(dn, schema1, schema2);
        } catch (ConflictingFieldTypeInfoException e) {
            throw new CompilerPostPassException(getConflictingTypeErrorMessage(e, optNode.getOperator().getName()));
        }
        // parameterize the node's driver strategy
        if (createUtilities) {
            if (dn.getDriverStrategy().getNumRequiredComparators() > 0) {
                // set the individual comparators
                try {
                    dn.setComparator1(createComparator(dn.getKeysForInput1(), dn.getSortOrders(), schema1));
                    dn.setComparator2(createComparator(dn.getKeysForInput2(), dn.getSortOrders(), schema2));
                } catch (MissingFieldTypeInfoException e) {
                    throw new CompilerPostPassException("Could not set up runtime strategy for node '" + optNode.getOperator().getName() + "'. Missing type information for field " + e.getFieldNumber());
                }
                // set the pair comparator
                try {
                    dn.setPairComparator(createPairComparator(dn.getKeysForInput1(), dn.getKeysForInput2(), dn.getSortOrders(), schema1, schema2));
                } catch (MissingFieldTypeInfoException e) {
                    throw new CompilerPostPassException("Could not set up runtime strategy for node '" + optNode.getOperator().getName() + "'. Missing type information for field " + e.getFieldNumber());
                }
            }
        }
        // done, we can now propagate our info down
        try {
            propagateToChannel(schema1, dn.getInput1(), createUtilities);
        } catch (MissingFieldTypeInfoException e) {
            throw new CompilerPostPassException("Could not set up runtime strategy for the first input channel to node '" + optNode.getOperator().getName() + "'. Missing type information for field " + e.getFieldNumber());
        }
        try {
            propagateToChannel(schema2, dn.getInput2(), createUtilities);
        } catch (MissingFieldTypeInfoException e) {
            throw new CompilerPostPassException("Could not set up runtime strategy for the second input channel to node '" + optNode.getOperator().getName() + "'. Missing type information for field " + e.getFieldNumber());
        }
        // don't forget the broadcast inputs
        for (Channel c : dn.getBroadcastInputs()) {
            try {
                propagateToChannel(createEmptySchema(), c, createUtilities);
            } catch (MissingFieldTypeInfoException e) {
                throw new CompilerPostPassException("Could not set up runtime strategy for broadcast channel in node '" + optNode.getOperator().getName() + "'. Missing type information for field " + e.getFieldNumber());
            }
        }
    } else if (node instanceof NAryUnionPlanNode) {
        // only propagate the info down
        try {
            for (Channel channel : node.getInputs()) {
                propagateToChannel(parentSchema, channel, createUtilities);
            }
        } catch (MissingFieldTypeInfoException ex) {
            throw new CompilerPostPassException("Could not set up runtime strategy for the input channel to " + " a union node. Missing type information for field " + ex.getFieldNumber());
        }
    } else // catch the sources of the iterative step functions
    if (node instanceof BulkPartialSolutionPlanNode || node instanceof SolutionSetPlanNode || node instanceof WorksetPlanNode) {
        // get the nodes current schema
        T schema;
        String name;
        if (node instanceof BulkPartialSolutionPlanNode) {
            BulkPartialSolutionPlanNode psn = (BulkPartialSolutionPlanNode) node;
            if (psn.postPassHelper == null) {
                schema = createEmptySchema();
                psn.postPassHelper = schema;
            } else {
                schema = (T) psn.postPassHelper;
            }
            name = "partial solution of bulk iteration '" + psn.getPartialSolutionNode().getIterationNode().getOperator().getName() + "'";
        } else if (node instanceof SolutionSetPlanNode) {
            SolutionSetPlanNode ssn = (SolutionSetPlanNode) node;
            if (ssn.postPassHelper == null) {
                schema = createEmptySchema();
                ssn.postPassHelper = schema;
            } else {
                schema = (T) ssn.postPassHelper;
            }
            name = "solution set of workset iteration '" + ssn.getSolutionSetNode().getIterationNode().getOperator().getName() + "'";
        } else if (node instanceof WorksetPlanNode) {
            WorksetPlanNode wsn = (WorksetPlanNode) node;
            if (wsn.postPassHelper == null) {
                schema = createEmptySchema();
                wsn.postPassHelper = schema;
            } else {
                schema = (T) wsn.postPassHelper;
            }
            name = "workset of workset iteration '" + wsn.getWorksetNode().getIterationNode().getOperator().getName() + "'";
        } else {
            throw new CompilerException();
        }
        schema.increaseNumConnectionsThatContributed();
        // add the parent schema to the schema
        addSchemaToSchema(parentSchema, schema, name);
    } else {
        throw new CompilerPostPassException("Unknown node type encountered: " + node.getClass().getName());
    }
}
Also used : SingleInputNode(org.apache.flink.optimizer.dag.SingleInputNode) SolutionSetPlanNode(org.apache.flink.optimizer.plan.SolutionSetPlanNode) WorksetIterationPlanNode(org.apache.flink.optimizer.plan.WorksetIterationPlanNode) BulkPartialSolutionPlanNode(org.apache.flink.optimizer.plan.BulkPartialSolutionPlanNode) Channel(org.apache.flink.optimizer.plan.Channel) NAryUnionPlanNode(org.apache.flink.optimizer.plan.NAryUnionPlanNode) SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) WorksetIterationNode(org.apache.flink.optimizer.dag.WorksetIterationNode) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) SourcePlanNode(org.apache.flink.optimizer.plan.SourcePlanNode) CompilerException(org.apache.flink.optimizer.CompilerException) WorksetPlanNode(org.apache.flink.optimizer.plan.WorksetPlanNode) CompilerPostPassException(org.apache.flink.optimizer.CompilerPostPassException) BulkIterationPlanNode(org.apache.flink.optimizer.plan.BulkIterationPlanNode) TwoInputNode(org.apache.flink.optimizer.dag.TwoInputNode)
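
Every branch of the traversal above follows the same reference-counting discipline: a node's schema is created on the first visit, each visiting outgoing channel merges in its requirements and bumps a counter, and the traversal descends to the inputs only once the last consumer has checked in. The self-contained sketch below isolates that pattern; DemoNode and its fields are hypothetical stand-ins, not Flink API:

import java.util.ArrayList;
import java.util.List;

final class DemoNode {
    final List<DemoNode> inputs = new ArrayList<>();
    final List<DemoNode> outgoingChannels = new ArrayList<>();
    List<String> schema;   // plays the role of postPassHelper
    int contributions;     // plays the role of getNumConnectionsThatContributed()
}

final class PostPassSketch {
    static void traverse(DemoNode node, List<String> parentSchema) {
        if (node.schema == null) {
            node.schema = new ArrayList<>();  // created once, on the first visit
        }
        node.contributions++;
        node.schema.addAll(parentSchema);     // merge this consumer's requirements
        if (node.contributions < node.outgoingChannels.size()) {
            return;                           // come back once the remaining consumers have visited
        }
        // only now is the union over all consumers complete, so it is safe to descend
        for (DemoNode input : node.inputs) {
            traverse(input, node.schema);
        }
    }
}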

Example 49 with DualInputPlanNode

Use of org.apache.flink.optimizer.plan.DualInputPlanNode in project flink by apache.

In class FeedbackPropertiesMatchTest, method testNoPartialSolutionFoundTwoInputOperator.

@Test
public void testNoPartialSolutionFoundTwoInputOperator() {
    try {
        SourcePlanNode target = new SourcePlanNode(getSourceNode(), "Partial Solution");
        SourcePlanNode source1 = new SourcePlanNode(getSourceNode(), "Source 1");
        SourcePlanNode source2 = new SourcePlanNode(getSourceNode(), "Source 2");
        Channel toMap1 = new Channel(source1);
        toMap1.setShipStrategy(ShipStrategyType.FORWARD, DataExchangeMode.PIPELINED);
        toMap1.setLocalStrategy(LocalStrategy.NONE);
        SingleInputPlanNode map1 = new SingleInputPlanNode(getMapNode(), "Mapper 1", toMap1, DriverStrategy.MAP);
        Channel toMap2 = new Channel(source2);
        toMap2.setShipStrategy(ShipStrategyType.FORWARD, DataExchangeMode.PIPELINED);
        toMap2.setLocalStrategy(LocalStrategy.NONE);
        SingleInputPlanNode map2 = new SingleInputPlanNode(getMapNode(), "Mapper 2", toMap2, DriverStrategy.MAP);
        Channel toJoin1 = new Channel(map1);
        Channel toJoin2 = new Channel(map2);
        toJoin1.setShipStrategy(ShipStrategyType.FORWARD, DataExchangeMode.PIPELINED);
        toJoin1.setLocalStrategy(LocalStrategy.NONE);
        toJoin2.setShipStrategy(ShipStrategyType.FORWARD, DataExchangeMode.PIPELINED);
        toJoin2.setLocalStrategy(LocalStrategy.NONE);
        DualInputPlanNode join = new DualInputPlanNode(getJoinNode(), "Join", toJoin1, toJoin2, DriverStrategy.HYBRIDHASH_BUILD_FIRST);
        FeedbackPropertiesMeetRequirementsReport report = join.checkPartialSolutionPropertiesMet(target, new GlobalProperties(), new LocalProperties());
        assertEquals(NO_PARTIAL_SOLUTION, report);
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used : SingleInputPlanNode(org.apache.flink.optimizer.plan.SingleInputPlanNode) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) FeedbackPropertiesMeetRequirementsReport(org.apache.flink.optimizer.plan.PlanNode.FeedbackPropertiesMeetRequirementsReport) RequestedGlobalProperties(org.apache.flink.optimizer.dataproperties.RequestedGlobalProperties) GlobalProperties(org.apache.flink.optimizer.dataproperties.GlobalProperties) Channel(org.apache.flink.optimizer.plan.Channel) SourcePlanNode(org.apache.flink.optimizer.plan.SourcePlanNode) RequestedLocalProperties(org.apache.flink.optimizer.dataproperties.RequestedLocalProperties) LocalProperties(org.apache.flink.optimizer.dataproperties.LocalProperties) Test(org.junit.Test)
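
The NO_PARTIAL_SOLUTION verdict above follows from the plan shape: neither join input can reach the "Partial Solution" node. For contrast, the sketch below (same helpers as the test above) wires the target directly into the first join input; the concrete report then depends on the properties the join requests, so only the negative assertion is made here, using JUnit's assertNotEquals:

SourcePlanNode target = new SourcePlanNode(getSourceNode(), "Partial Solution");
SourcePlanNode source2 = new SourcePlanNode(getSourceNode(), "Source 2");
// the partial solution now feeds the first join input directly
Channel toJoin1 = new Channel(target);
toJoin1.setShipStrategy(ShipStrategyType.FORWARD, DataExchangeMode.PIPELINED);
toJoin1.setLocalStrategy(LocalStrategy.NONE);
Channel toJoin2 = new Channel(source2);
toJoin2.setShipStrategy(ShipStrategyType.FORWARD, DataExchangeMode.PIPELINED);
toJoin2.setLocalStrategy(LocalStrategy.NONE);
DualInputPlanNode join = new DualInputPlanNode(getJoinNode(), "Join", toJoin1, toJoin2, DriverStrategy.HYBRIDHASH_BUILD_FIRST);
FeedbackPropertiesMeetRequirementsReport report = join.checkPartialSolutionPropertiesMet(target, new GlobalProperties(), new LocalProperties());
// the target is now on a path to the join, so the walk yields a concrete verdict
assertNotEquals(NO_PARTIAL_SOLUTION, report);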

Example 50 with DualInputPlanNode

Use of org.apache.flink.optimizer.plan.DualInputPlanNode in project flink by apache.

In class GroupOrderTest, method testCoGroupWithGroupOrder.

@Test
public void testCoGroupWithGroupOrder() {
    // construct the plan
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.setParallelism(DEFAULT_PARALLELISM);
    DataSet<Tuple7<Long, Long, Long, Long, Long, Long, Long>> set1 = env.readCsvFile("/tmp/fake1.csv").types(Long.class, Long.class, Long.class, Long.class, Long.class, Long.class, Long.class);
    DataSet<Tuple7<Long, Long, Long, Long, Long, Long, Long>> set2 = env.readCsvFile("/tmp/fake2.csv").types(Long.class, Long.class, Long.class, Long.class, Long.class, Long.class, Long.class);
    set1.coGroup(set2).where(3, 0).equalTo(6, 0).sortFirstGroup(5, Order.DESCENDING).sortSecondGroup(1, Order.DESCENDING).sortSecondGroup(4, Order.ASCENDING).with(new IdentityCoGrouper<Tuple7<Long, Long, Long, Long, Long, Long, Long>>()).name("CoGroup").output(new DiscardingOutputFormat<Tuple7<Long, Long, Long, Long, Long, Long, Long>>()).name("Sink");
    Plan plan = env.createProgramPlan();
    OptimizedPlan oPlan;
    try {
        oPlan = compileNoStats(plan);
    } catch (CompilerException ce) {
        ce.printStackTrace();
        fail("The pact compiler is unable to compile this plan correctly.");
        // silence the compiler
        return;
    }
    OptimizerPlanNodeResolver resolver = getOptimizerPlanNodeResolver(oPlan);
    SinkPlanNode sinkNode = resolver.getNode("Sink");
    DualInputPlanNode coGroupNode = resolver.getNode("CoGroup");
    // verify the strategies
    Assert.assertEquals(ShipStrategyType.FORWARD, sinkNode.getInput().getShipStrategy());
    Assert.assertEquals(ShipStrategyType.PARTITION_HASH, coGroupNode.getInput1().getShipStrategy());
    Assert.assertEquals(ShipStrategyType.PARTITION_HASH, coGroupNode.getInput2().getShipStrategy());
    Channel c1 = coGroupNode.getInput1();
    Channel c2 = coGroupNode.getInput2();
    Assert.assertEquals(LocalStrategy.SORT, c1.getLocalStrategy());
    Assert.assertEquals(LocalStrategy.SORT, c2.getLocalStrategy());
    FieldList ship1 = new FieldList(3, 0);
    FieldList ship2 = new FieldList(6, 0);
    FieldList local1 = new FieldList(3, 0, 5);
    FieldList local2 = new FieldList(6, 0, 1, 4);
    Assert.assertEquals(ship1, c1.getShipStrategyKeys());
    Assert.assertEquals(ship2, c2.getShipStrategyKeys());
    Assert.assertEquals(local1, c1.getLocalStrategyKeys());
    Assert.assertEquals(local2, c2.getLocalStrategyKeys());
    Assert.assertTrue(c1.getLocalStrategySortOrder()[0] == coGroupNode.getSortOrders()[0]);
    Assert.assertTrue(c1.getLocalStrategySortOrder()[1] == coGroupNode.getSortOrders()[1]);
    Assert.assertTrue(c2.getLocalStrategySortOrder()[0] == coGroupNode.getSortOrders()[0]);
    Assert.assertTrue(c2.getLocalStrategySortOrder()[1] == coGroupNode.getSortOrders()[1]);
    // check that the local group orderings are correct
    Assert.assertEquals(false, c1.getLocalStrategySortOrder()[2]);
    Assert.assertEquals(false, c2.getLocalStrategySortOrder()[2]);
    Assert.assertEquals(true, c2.getLocalStrategySortOrder()[3]);
}
Also used : ExecutionEnvironment(org.apache.flink.api.java.ExecutionEnvironment) Channel(org.apache.flink.optimizer.plan.Channel) Plan(org.apache.flink.api.common.Plan) OptimizedPlan(org.apache.flink.optimizer.plan.OptimizedPlan) DiscardingOutputFormat(org.apache.flink.api.java.io.DiscardingOutputFormat) FieldList(org.apache.flink.api.common.operators.util.FieldList) DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) Tuple7(org.apache.flink.api.java.tuple.Tuple7) SinkPlanNode(org.apache.flink.optimizer.plan.SinkPlanNode) IdentityCoGrouper(org.apache.flink.optimizer.testfunctions.IdentityCoGrouper) Test(org.junit.Test)
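
The expected key lists in this test follow a simple composition rule: the ship-strategy keys are exactly the coGroup keys, and the local sort keys append the group-order fields in call order. The sketch below restates that rule, under the assumption that FieldList.addField appends a field and returns the extended list:

FieldList coGroupKeys1 = new FieldList(3, 0);             // where(3, 0)                          -> ship1
FieldList local1 = coGroupKeys1.addField(5);              // + sortFirstGroup(5, DESCENDING)      -> (3, 0, 5)
FieldList coGroupKeys2 = new FieldList(6, 0);             // equalTo(6, 0)                        -> ship2
FieldList local2 = coGroupKeys2.addField(1).addField(4);  // + sortSecondGroup(1, DESC), (4, ASC) -> (6, 0, 1, 4)
// in the boolean sort-order arrays, ASCENDING maps to true and DESCENDING to false,
// which is why the test asserts false at index 2 on both channels and true at index 3 on c2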

Aggregations

DualInputPlanNode (org.apache.flink.optimizer.plan.DualInputPlanNode): 96
Test (org.junit.Test): 86
OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan): 81
Plan (org.apache.flink.api.common.Plan): 76
SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode): 67
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 65
Tuple3 (org.apache.flink.api.java.tuple.Tuple3): 36
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 31
SingleInputPlanNode (org.apache.flink.optimizer.plan.SingleInputPlanNode): 27
JobGraphGenerator (org.apache.flink.optimizer.plantranslate.JobGraphGenerator): 19
Channel (org.apache.flink.optimizer.plan.Channel): 14
WorksetIterationPlanNode (org.apache.flink.optimizer.plan.WorksetIterationPlanNode): 13
FieldList (org.apache.flink.api.common.operators.util.FieldList): 12
InvalidProgramException (org.apache.flink.api.common.InvalidProgramException): 11
DiscardingOutputFormat (org.apache.flink.api.java.io.DiscardingOutputFormat): 11
PlanNode (org.apache.flink.optimizer.plan.PlanNode): 11
Tuple1 (org.apache.flink.api.java.tuple.Tuple1): 10
SourcePlanNode (org.apache.flink.optimizer.plan.SourcePlanNode): 10
ShipStrategyType (org.apache.flink.runtime.operators.shipping.ShipStrategyType): 10
ReplicatingInputFormat (org.apache.flink.api.common.io.ReplicatingInputFormat): 8