Example 1 with EpsilonFilter

Use of org.apache.flink.examples.java.graph.PageRank.EpsilonFilter in project flink by apache.

From the class PageRankCompilerTest, method testPageRank.

@Test
public void testPageRank() {
    try {
        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        // get input data
        DataSet<Long> pagesInput = env.fromElements(1L);
        @SuppressWarnings("unchecked")
        DataSet<Tuple2<Long, Long>> linksInput = env.fromElements(new Tuple2<Long, Long>(1L, 2L));
        // assign initial rank to pages
        DataSet<Tuple2<Long, Double>> pagesWithRanks = pagesInput.map(new RankAssigner((1.0d / 10)));
        // build adjacency list from link input
        DataSet<Tuple2<Long, Long[]>> adjacencyListInput = linksInput.groupBy(0).reduceGroup(new BuildOutgoingEdgeList());
        // set iterative data set
        IterativeDataSet<Tuple2<Long, Double>> iteration = pagesWithRanks.iterate(10);
        Configuration cfg = new Configuration();
        cfg.setString(Optimizer.HINT_LOCAL_STRATEGY, Optimizer.HINT_LOCAL_STRATEGY_HASH_BUILD_SECOND);
        // join pages with outgoing edges, distribute rank, sum per page, and apply dampening
        DataSet<Tuple2<Long, Double>> newRanks = iteration
                .join(adjacencyListInput).where(0).equalTo(0).withParameters(cfg)
                .flatMap(new JoinVertexWithEdgesMatch())
                .groupBy(0).aggregate(SUM, 1)
                .map(new Dampener(0.85, 10));
        // close the loop; the second argument is the termination criterion
        DataSet<Tuple2<Long, Double>> finalPageRanks = iteration.closeWith(
                newRanks,
                newRanks.join(iteration).where(0).equalTo(0).filter(new EpsilonFilter()));
        finalPageRanks.output(new DiscardingOutputFormat<Tuple2<Long, Double>>());
        // get the plan and compile it
        Plan p = env.createProgramPlan();
        OptimizedPlan op = compileNoStats(p);
        SinkPlanNode sinkPlanNode = (SinkPlanNode) op.getDataSinks().iterator().next();
        BulkIterationPlanNode iterPlanNode = (BulkIterationPlanNode) sinkPlanNode.getInput().getSource();
        // check that the partitioning is pushed out of the first loop
        Assert.assertEquals(ShipStrategyType.PARTITION_HASH, iterPlanNode.getInput().getShipStrategy());
        Assert.assertEquals(LocalStrategy.NONE, iterPlanNode.getInput().getLocalStrategy());
        BulkPartialSolutionPlanNode partSolPlanNode = iterPlanNode.getPartialSolutionPlanNode();
        Assert.assertEquals(ShipStrategyType.FORWARD, partSolPlanNode.getOutgoingChannels().get(0).getShipStrategy());
    } catch (Exception e) {
        e.printStackTrace();
        fail(e.getMessage());
    }
}
Also used:
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment)
EpsilonFilter (org.apache.flink.examples.java.graph.PageRank.EpsilonFilter)
Configuration (org.apache.flink.configuration.Configuration)
BulkPartialSolutionPlanNode (org.apache.flink.optimizer.plan.BulkPartialSolutionPlanNode)
BuildOutgoingEdgeList (org.apache.flink.examples.java.graph.PageRank.BuildOutgoingEdgeList)
Plan (org.apache.flink.api.common.Plan)
OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan)
RankAssigner (org.apache.flink.examples.java.graph.PageRank.RankAssigner)
Dampener (org.apache.flink.examples.java.graph.PageRank.Dampener)
JoinVertexWithEdgesMatch (org.apache.flink.examples.java.graph.PageRank.JoinVertexWithEdgesMatch)
Tuple2 (org.apache.flink.api.java.tuple.Tuple2)
SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode)
BulkIterationPlanNode (org.apache.flink.optimizer.plan.BulkIterationPlanNode)
Test (org.junit.Test)
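
In the test above, the second argument to closeWith is a termination criterion: the bulk iteration stops early once that data set is empty. The snippet does not show the body of EpsilonFilter, so the following is only a minimal plain-Java sketch of the idea, assuming the filter keeps a page when its new and old rank differ by more than a small threshold. The class EpsilonConvergenceSketch, its methods, and the EPSILON value are hypothetical names and numbers chosen for illustration, and no Flink classes are used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EpsilonConvergenceSketch {
    // Hypothetical threshold, chosen for illustration only.
    static final double EPSILON = 0.0001;

    /** Returns true when a rank moved by more than EPSILON between two steps. */
    static boolean rankChanged(double newRank, double oldRank) {
        return Math.abs(newRank - oldRank) > EPSILON;
    }

    /** Counts pages whose rank is still changing; zero means the loop may stop. */
    static long countChanged(Map<Long, Double> newRanks, Map<Long, Double> oldRanks) {
        long changed = 0;
        for (Map.Entry<Long, Double> e : newRanks.entrySet()) {
            Double old = oldRanks.get(e.getKey());
            // Like an inner join on the page id: pages missing on one side are skipped.
            if (old != null && rankChanged(e.getValue(), old)) {
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<Long, Double> oldRanks = new LinkedHashMap<>();
        oldRanks.put(1L, 0.10);
        oldRanks.put(2L, 0.20);

        Map<Long, Double> newRanks = new LinkedHashMap<>();
        newRanks.put(1L, 0.10005); // moved by about 0.00005, below the threshold
        newRanks.put(2L, 0.25);    // moved by 0.05, above the threshold

        // A non-zero count corresponds to a non-empty termination data set.
        System.out.println(countChanged(newRanks, oldRanks));
    }
}
```

Under this reading, the iteration in the test would end either after the configured maximum of 10 supersteps or as soon as no page passes the filter, whichever comes first.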

Aggregations

Plan (org.apache.flink.api.common.Plan): 1
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 1
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 1
Configuration (org.apache.flink.configuration.Configuration): 1
BuildOutgoingEdgeList (org.apache.flink.examples.java.graph.PageRank.BuildOutgoingEdgeList): 1
Dampener (org.apache.flink.examples.java.graph.PageRank.Dampener): 1
EpsilonFilter (org.apache.flink.examples.java.graph.PageRank.EpsilonFilter): 1
JoinVertexWithEdgesMatch (org.apache.flink.examples.java.graph.PageRank.JoinVertexWithEdgesMatch): 1
RankAssigner (org.apache.flink.examples.java.graph.PageRank.RankAssigner): 1
BulkIterationPlanNode (org.apache.flink.optimizer.plan.BulkIterationPlanNode): 1
BulkPartialSolutionPlanNode (org.apache.flink.optimizer.plan.BulkPartialSolutionPlanNode): 1
OptimizedPlan (org.apache.flink.optimizer.plan.OptimizedPlan): 1
SinkPlanNode (org.apache.flink.optimizer.plan.SinkPlanNode): 1
Test (org.junit.Test): 1