Example 21 with CompilerException

Use of org.apache.flink.optimizer.CompilerException in project flink by apache.

The class HashJoinBuildFirstProperties, method instantiate.

@Override
public DualInputPlanNode instantiate(Channel in1, Channel in2, TwoInputNode node) {
    DriverStrategy strategy;
    if (!in1.isOnDynamicPath() && in2.isOnDynamicPath()) {
        // sanity check that the first input is cached and remove that cache
        if (!in1.getTempMode().isCached()) {
            throw new CompilerException("No cache at point where static and dynamic parts meet.");
        }
        in1.setTempMode(in1.getTempMode().makeNonCached());
        strategy = DriverStrategy.HYBRIDHASH_BUILD_FIRST_CACHED;
    } else {
        strategy = DriverStrategy.HYBRIDHASH_BUILD_FIRST;
    }
    return new DualInputPlanNode(node, "Join(" + node.getOperator().getName() + ")", in1, in2, strategy, this.keys1, this.keys2);
}
Also used : DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) CompilerException(org.apache.flink.optimizer.CompilerException) DriverStrategy(org.apache.flink.runtime.operators.DriverStrategy)
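
CompilerException is an unchecked exception the optimizer uses to flag inconsistent plan states, such as the missing cache above. Below is a minimal sketch of catching it around plan compilation, assuming a batch Plan named plan has already been built (the Optimizer setup mirrors Example 24 below; this is an illustration, not code from the project):

import org.apache.flink.api.common.Plan;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.optimizer.CompilerException;
import org.apache.flink.optimizer.DataStatistics;
import org.apache.flink.optimizer.Optimizer;
import org.apache.flink.optimizer.costs.DefaultCostEstimator;
import org.apache.flink.optimizer.plan.OptimizedPlan;

public static void compileOrReport(Plan plan) {
    try {
        Optimizer optimizer = new Optimizer(new DataStatistics(), new DefaultCostEstimator(), new Configuration());
        OptimizedPlan optimized = optimizer.compile(plan);
        // use the optimized plan, e.g. hand it to a job graph generator
    } catch (CompilerException e) {
        // CompilerException extends RuntimeException, so catching it is optional;
        // here we surface the optimizer's diagnostic instead of propagating it
        System.err.println("Plan rejected by the optimizer: " + e.getMessage());
    }
}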

Example 22 with CompilerException

Use of org.apache.flink.optimizer.CompilerException in project flink by apache.

The class HashJoinBuildSecondProperties, method instantiate.

@Override
public DualInputPlanNode instantiate(Channel in1, Channel in2, TwoInputNode node) {
    DriverStrategy strategy;
    if (!in2.isOnDynamicPath() && in1.isOnDynamicPath()) {
        // sanity check that the second input is cached and remove that cache
        if (!in2.getTempMode().isCached()) {
            throw new CompilerException("No cache at point where static and dynamic parts meet.");
        }
        in2.setTempMode(in2.getTempMode().makeNonCached());
        strategy = DriverStrategy.HYBRIDHASH_BUILD_SECOND_CACHED;
    } else {
        strategy = DriverStrategy.HYBRIDHASH_BUILD_SECOND;
    }
    return new DualInputPlanNode(node, "Join (" + node.getOperator().getName() + ")", in1, in2, strategy, this.keys1, this.keys2);
}
Also used : DualInputPlanNode(org.apache.flink.optimizer.plan.DualInputPlanNode) CompilerException(org.apache.flink.optimizer.CompilerException) DriverStrategy(org.apache.flink.runtime.operators.DriverStrategy)
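
The cached-to-non-cached transition performed above comes from the optimizer's TempMode; here is a minimal standalone sketch of that state change (TempMode lives in org.apache.flink.optimizer.dag; the starting value is invented):

import org.apache.flink.optimizer.dag.TempMode;

TempMode mode = TempMode.CACHED;
if (mode.isCached()) {
    // makeNonCached() strips the caching flag; the channel keeps any
    // remaining temp behavior but is no longer treated as a cache
    mode = mode.makeNonCached();
}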

Example 23 with CompilerException

Use of org.apache.flink.optimizer.CompilerException in project flink by apache.

The class OperatorDescriptorDual, method checkSameOrdering.

protected boolean checkSameOrdering(GlobalProperties produced1, GlobalProperties produced2, int numRelevantFields) {
    Ordering prod1 = produced1.getPartitioningOrdering();
    Ordering prod2 = produced2.getPartitioningOrdering();
    if (prod1 == null || prod2 == null) {
        throw new CompilerException("The given properties do not meet this operators requirements.");
    }
    // check that order of fields is equivalent
    if (!checkEquivalentFieldPositionsInKeyFields(prod1.getInvolvedIndexes(), prod2.getInvolvedIndexes(), numRelevantFields)) {
        return false;
    }
    // check that both inputs have the same directions of order
    for (int i = 0; i < numRelevantFields; i++) {
        if (prod1.getOrder(i) != prod2.getOrder(i)) {
            return false;
        }
    }
    return true;
}
Also used : Ordering(org.apache.flink.api.common.operators.Ordering) CompilerException(org.apache.flink.optimizer.CompilerException)
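
A hedged illustration of the direction check above: two orderings over the same field index that differ only in direction pass the field-position check but fail the loop, so checkSameOrdering returns false (the field index and directions here are made up):

import org.apache.flink.api.common.operators.Order;
import org.apache.flink.api.common.operators.Ordering;

Ordering prod1 = new Ordering(0, null, Order.ASCENDING);
Ordering prod2 = new Ordering(0, null, Order.DESCENDING);
// Both involve field 0, but prod1.getOrder(0) != prod2.getOrder(0),
// which is exactly the mismatch the loop above rejects.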

Example 24 with CompilerException

Use of org.apache.flink.optimizer.CompilerException in project flink by apache.

The class CliFrontendPackageProgramTest, method testPlanWithExternalClass.

/**
	 * Ensure that we will never have the following error.
	 *
	 * <pre>
	 * 	org.apache.flink.client.program.ProgramInvocationException: The main method caused an error.
	 *		at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:398)
	 *		at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:301)
	 *		at org.apache.flink.client.program.Client.getOptimizedPlan(Client.java:140)
	 *		at org.apache.flink.client.program.Client.getOptimizedPlanAsJson(Client.java:125)
	 *		at org.apache.flink.client.CliFrontend.info(CliFrontend.java:439)
	 *		at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:931)
	 *		at org.apache.flink.client.CliFrontend.main(CliFrontend.java:951)
	 *	Caused by: java.io.IOException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.io.RCFileInputFormat
	 *		at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:102)
	 *		at org.apache.hcatalog.mapreduce.HCatInputFormat.setInput(HCatInputFormat.java:54)
	 *		at tlabs.CDR_In_Report.createHCatInputFormat(CDR_In_Report.java:322)
	 *		at tlabs.CDR_Out_Report.main(CDR_Out_Report.java:380)
	 *		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	 *		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	 *		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	 *		at java.lang.reflect.Method.invoke(Method.java:622)
	 *		at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:383)
	 * </pre>
	 *
	 * The test works as follows:
	 *
	 * <ul>
	 *   <li> Use the CliFrontend to invoke a jar file that loads a class which is only available
	 * 	      in the jarfile itself (via a custom classloader)
	 *   <li> Change the Usercode classloader of the PackagedProgram to a special classloader for this test
	 *   <li> the classloader will accept the special class (and return a String.class)
	 * </ul>
	 */
@Test
public void testPlanWithExternalClass() throws CompilerException, ProgramInvocationException {
    // create a final object reference, to be able to change its value later
    final boolean[] callme = { false };
    try {
        String[] arguments = { "--classpath", "file:///tmp/foo", "--classpath", "file:///tmp/bar", "-c", TEST_JAR_CLASSLOADERTEST_CLASS, getTestJarPath(), "true", "arg1", "arg2" };
        URL[] classpath = new URL[] { new URL("file:///tmp/foo"), new URL("file:///tmp/bar") };
        String[] reducedArguments = { "true", "arg1", "arg2" };
        RunOptions options = CliFrontendParser.parseRunCommand(arguments);
        assertEquals(getTestJarPath(), options.getJarFilePath());
        assertArrayEquals(classpath, options.getClasspaths().toArray());
        assertEquals(TEST_JAR_CLASSLOADERTEST_CLASS, options.getEntryPointClassName());
        assertArrayEquals(reducedArguments, options.getProgramArgs());
        CliFrontend frontend = new CliFrontend(CliFrontendTestUtils.getConfigDir());
        PackagedProgram prog = spy(frontend.buildProgram(options));
        ClassLoader testClassLoader = new ClassLoader(prog.getUserCodeClassLoader()) {

            @Override
            public Class<?> loadClass(String name) throws ClassNotFoundException {
                if ("org.apache.hadoop.hive.ql.io.RCFileInputFormat".equals(name)) {
                    callme[0] = true;
                    // Intentionally return the wrong class.
                    return String.class;
                } else {
                    return super.loadClass(name);
                }
            }
        };
        when(prog.getUserCodeClassLoader()).thenReturn(testClassLoader);
        assertEquals(TEST_JAR_CLASSLOADERTEST_CLASS, prog.getMainClassName());
        assertArrayEquals(reducedArguments, prog.getArguments());
        Configuration c = new Configuration();
        Optimizer compiler = new Optimizer(new DataStatistics(), new DefaultCostEstimator(), c);
        // we expect this to fail with a "ClassNotFoundException"
        ClusterClient.getOptimizedPlanAsJson(compiler, prog, 666);
        fail("Should have failed with a ClassNotFoundException");
    } catch (ProgramInvocationException e) {
        if (!(e.getCause() instanceof ClassNotFoundException)) {
            e.printStackTrace();
            fail("Program didn't throw ClassNotFoundException");
        }
        assertTrue("Classloader was not called", callme[0]);
    } catch (Exception e) {
        e.printStackTrace();
        fail("Program failed with the wrong exception: " + e.getClass().getName());
    }
}
Also used : Configuration(org.apache.flink.configuration.Configuration) Optimizer(org.apache.flink.optimizer.Optimizer) DataStatistics(org.apache.flink.optimizer.DataStatistics) URL(java.net.URL) ProgramInvocationException(org.apache.flink.client.program.ProgramInvocationException) FileNotFoundException(java.io.FileNotFoundException) CompilerException(org.apache.flink.optimizer.CompilerException) PackagedProgram(org.apache.flink.client.program.PackagedProgram) DefaultCostEstimator(org.apache.flink.optimizer.costs.DefaultCostEstimator) RunOptions(org.apache.flink.client.cli.RunOptions) Test(org.junit.Test)
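
The anonymous loader in the test is the standard parent-delegation override pattern; a standalone sketch of the same idea follows (the intercepted class name is taken from the test, everything else is illustrative):

// A classloader that intercepts one class name and delegates all others
// to its parent, mirroring the anonymous loader in the test above.
class InterceptingClassLoader extends ClassLoader {

    InterceptingClassLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
        if ("org.apache.hadoop.hive.ql.io.RCFileInputFormat".equals(name)) {
            // intentionally the wrong class, as in the test
            return String.class;
        }
        return super.loadClass(name);
    }
}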

Example 25 with CompilerException

Use of org.apache.flink.optimizer.CompilerException in project flink by apache.

The class CostEstimator, method costOperator.

// ------------------------------------------------------------------------	
/**
	 * This method computes the cost of an operator. The cost is composed of the cost for shipping
	 * the inputs, the cost for processing each input locally, and the cost of running the operator itself.
	 * 
	 * It requires that all inputs are set and carry a ship strategy other than <tt>NONE</tt>.
	 * 
	 * @param n The node to compute the costs for.
	 */
public void costOperator(PlanNode n) {
    // initialize costs objects with no costs
    final Costs totalCosts = new Costs();
    final long availableMemory = n.getGuaranteedAvailableMemory();
    // add the shipping strategy costs
    for (Channel channel : n.getInputs()) {
        final Costs costs = new Costs();
        switch(channel.getShipStrategy()) {
            case NONE:
                throw new CompilerException("Cannot determine costs: Shipping strategy has not been set for an input.");
            case FORWARD:
                //				costs.addHeuristicNetworkCost(channel.getMaxDepth());
                break;
            case PARTITION_RANDOM:
                addRandomPartitioningCost(channel, costs);
                break;
            case PARTITION_HASH:
            case PARTITION_CUSTOM:
                addHashPartitioningCost(channel, costs);
                break;
            case PARTITION_RANGE:
                addRangePartitionCost(channel, costs);
                break;
            case BROADCAST:
                addBroadcastCost(channel, channel.getReplicationFactor(), costs);
                break;
            case PARTITION_FORCED_REBALANCE:
                addRandomPartitioningCost(channel, costs);
                break;
            default:
                throw new CompilerException("Unknown shipping strategy for input: " + channel.getShipStrategy());
        }
        switch(channel.getLocalStrategy()) {
            case NONE:
                break;
            case SORT:
            case COMBININGSORT:
                addLocalSortCost(channel, costs);
                break;
            default:
                throw new CompilerException("Unsupported local strategy for input: " + channel.getLocalStrategy());
        }
        if (channel.getTempMode() != null && channel.getTempMode() != TempMode.NONE) {
            addArtificialDamCost(channel, 0, costs);
        }
        // adjust with the cost weight factor
        if (channel.isOnDynamicPath()) {
            costs.multiplyWith(channel.getCostWeight());
        }
        totalCosts.addCosts(costs);
    }
    Channel firstInput = null;
    Channel secondInput = null;
    Costs driverCosts = new Costs();
    int costWeight = 1;
    // adjust with the cost weight factor
    if (n.isOnDynamicPath()) {
        costWeight = n.getCostWeight();
    }
    // get the inputs, if we have some
    {
        Iterator<Channel> channels = n.getInputs().iterator();
        if (channels.hasNext()) {
            firstInput = channels.next();
        }
        if (channels.hasNext()) {
            secondInput = channels.next();
        }
    }
    // determine the local costs
    switch(n.getDriverStrategy()) {
        case NONE:
        case UNARY_NO_OP:
        case BINARY_NO_OP:
        case MAP:
        case MAP_PARTITION:
        case FLAT_MAP:
        case ALL_GROUP_REDUCE:
        case ALL_REDUCE:
        case CO_GROUP:
        case CO_GROUP_RAW:
        case SORTED_GROUP_REDUCE:
        case SORTED_REDUCE:
        case SORTED_GROUP_COMBINE:
        // partial grouping is always local and main memory resident. we should add a relative cpu cost at some point
        case ALL_GROUP_COMBINE:
        case UNION:
            break;
        case INNER_MERGE:
        case FULL_OUTER_MERGE:
        case LEFT_OUTER_MERGE:
        case RIGHT_OUTER_MERGE:
            addLocalMergeCost(firstInput, secondInput, driverCosts, costWeight);
            break;
        case HYBRIDHASH_BUILD_FIRST:
        case RIGHT_HYBRIDHASH_BUILD_FIRST:
        case LEFT_HYBRIDHASH_BUILD_FIRST:
        case FULL_OUTER_HYBRIDHASH_BUILD_FIRST:
            addHybridHashCosts(firstInput, secondInput, driverCosts, costWeight);
            break;
        case HYBRIDHASH_BUILD_SECOND:
        case LEFT_HYBRIDHASH_BUILD_SECOND:
        case RIGHT_HYBRIDHASH_BUILD_SECOND:
        case FULL_OUTER_HYBRIDHASH_BUILD_SECOND:
            addHybridHashCosts(secondInput, firstInput, driverCosts, costWeight);
            break;
        case HYBRIDHASH_BUILD_FIRST_CACHED:
            addCachedHybridHashCosts(firstInput, secondInput, driverCosts, costWeight);
            break;
        case HYBRIDHASH_BUILD_SECOND_CACHED:
            addCachedHybridHashCosts(secondInput, firstInput, driverCosts, costWeight);
            break;
        case NESTEDLOOP_BLOCKED_OUTER_FIRST:
            addBlockNestedLoopsCosts(firstInput, secondInput, availableMemory, driverCosts, costWeight);
            break;
        case NESTEDLOOP_BLOCKED_OUTER_SECOND:
            addBlockNestedLoopsCosts(secondInput, firstInput, availableMemory, driverCosts, costWeight);
            break;
        case NESTEDLOOP_STREAMED_OUTER_FIRST:
            addStreamedNestedLoopsCosts(firstInput, secondInput, availableMemory, driverCosts, costWeight);
            break;
        case NESTEDLOOP_STREAMED_OUTER_SECOND:
            addStreamedNestedLoopsCosts(secondInput, firstInput, availableMemory, driverCosts, costWeight);
            break;
        default:
            throw new CompilerException("Unknown local strategy: " + n.getDriverStrategy().name());
    }
    totalCosts.addCosts(driverCosts);
    n.setCosts(totalCosts);
}
Also used : Channel(org.apache.flink.optimizer.plan.Channel) Iterator(java.util.Iterator) CompilerException(org.apache.flink.optimizer.CompilerException)
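
The Costs objects that costOperator accumulates support simple addition and integer scaling; here is a minimal sketch of that accumulation pattern (the byte count and weight are invented):

import org.apache.flink.optimizer.costs.Costs;

Costs totalCosts = new Costs();
Costs channelCosts = new Costs();
// hypothetical network volume for one input channel
channelCosts.addNetworkCost(4096);
// scale by the cost weight, as done above for channels on a dynamic path
channelCosts.multiplyWith(3);
totalCosts.addCosts(channelCosts);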

Aggregations

CompilerException (org.apache.flink.optimizer.CompilerException): 48
PlanNode (org.apache.flink.optimizer.plan.PlanNode): 16
DualInputPlanNode (org.apache.flink.optimizer.plan.DualInputPlanNode): 15
Channel (org.apache.flink.optimizer.plan.Channel): 14
SingleInputPlanNode (org.apache.flink.optimizer.plan.SingleInputPlanNode): 14
WorksetIterationPlanNode (org.apache.flink.optimizer.plan.WorksetIterationPlanNode): 13
BulkIterationPlanNode (org.apache.flink.optimizer.plan.BulkIterationPlanNode): 12
SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode): 12
SolutionSetPlanNode (org.apache.flink.optimizer.plan.SolutionSetPlanNode): 12
WorksetPlanNode (org.apache.flink.optimizer.plan.WorksetPlanNode): 12
BulkPartialSolutionPlanNode (org.apache.flink.optimizer.plan.BulkPartialSolutionPlanNode): 11
SourcePlanNode (org.apache.flink.optimizer.plan.SourcePlanNode): 11
NAryUnionPlanNode (org.apache.flink.optimizer.plan.NAryUnionPlanNode): 10
NamedChannel (org.apache.flink.optimizer.plan.NamedChannel): 10
IterationPlanNode (org.apache.flink.optimizer.plan.IterationPlanNode): 9
ArrayList (java.util.ArrayList): 8
Configuration (org.apache.flink.configuration.Configuration): 8
JobVertex (org.apache.flink.runtime.jobgraph.JobVertex): 8
ShipStrategyType (org.apache.flink.runtime.operators.shipping.ShipStrategyType): 8
TaskConfig (org.apache.flink.runtime.operators.util.TaskConfig): 8