Example 1 with ExecType

use of org.apache.sysml.lops.LopProperties.ExecType in project incubator-systemml by apache.

the class GDFEnumOptimizer method enumHopNodePlans.

private static void enumHopNodePlans(GDFNode node, ArrayList<Plan> plans) {
    ExecType CLUSTER = OptimizerUtils.isSparkExecutionMode() ? ExecType.SPARK : ExecType.MR;
    //create cp plan, if allowed (note: most interesting properties are irrelevant for CP)
    if (node.getHop().getMemEstimate() < OptimizerUtils.getLocalMemBudget()) {
        int[] bstmp = ENUM_CP_BLOCKSIZES ? BLOCK_SIZES : new int[] { BLOCK_SIZES[0] };
        for (Integer bs : bstmp) {
            RewriteConfig rccp = new RewriteConfig(ExecType.CP, bs, FileFormatTypes.BINARY);
            InterestingProperties ipscp = rccp.deriveInterestingProperties();
            Plan cpplan = new Plan(node, ipscp, rccp, null);
            plans.add(cpplan);
        }
    }
    //create cluster (MR or Spark) plans, if required
    if (node.requiresMREnumeration()) {
        for (Integer bs : BLOCK_SIZES) {
            RewriteConfig rcmr = new RewriteConfig(CLUSTER, bs, FileFormatTypes.BINARY);
            InterestingProperties ipsmr = rcmr.deriveInterestingProperties();
            Plan mrplan = new Plan(node, ipsmr, rcmr, null);
            plans.add(mrplan);
        }
    }
}
Also used : ExecType(org.apache.sysml.lops.LopProperties.ExecType)
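
The enumeration pattern in enumHopNodePlans can be tried out in isolation. The following is only a minimal sketch, not SystemML code: PlanEnumerationSketch, PlanChoice, the method parameters, and the block size values are hypothetical stand-ins for GDFNode, RewriteConfig, Plan, and the optimizer configuration.

```java
import java.util.ArrayList;
import java.util.List;

final class PlanEnumerationSketch {
    enum ExecType { CP, MR, SPARK }

    // Hypothetical stand-in for a (RewriteConfig, Plan) pair.
    static final class PlanChoice {
        final ExecType execType;
        final int blockSize;
        PlanChoice(ExecType execType, int blockSize) {
            this.execType = execType;
            this.blockSize = blockSize;
        }
    }

    // Illustrative values only; the real BLOCK_SIZES are not shown in the excerpt.
    static final int[] BLOCK_SIZES = { 1024, 2048, 4096 };

    static List<PlanChoice> enumerate(double memEstimate, double localBudget,
            boolean sparkMode, boolean enumCpBlockSizes, boolean requiresClusterEnumeration) {
        // The cluster backend is Spark or MR, depending on the execution mode.
        ExecType cluster = sparkMode ? ExecType.SPARK : ExecType.MR;
        List<PlanChoice> plans = new ArrayList<>();
        // CP plans are only candidates if the operation fits into the local budget.
        if (memEstimate < localBudget) {
            int[] sizes = enumCpBlockSizes ? BLOCK_SIZES : new int[] { BLOCK_SIZES[0] };
            for (int bs : sizes)
                plans.add(new PlanChoice(ExecType.CP, bs));
        }
        // Cluster plans are enumerated for every block size.
        if (requiresClusterEnumeration) {
            for (int bs : BLOCK_SIZES)
                plans.add(new PlanChoice(cluster, bs));
        }
        return plans;
    }
}
```

With a memory estimate below the local budget and CP block size enumeration disabled, the sketch yields one CP candidate followed by one cluster candidate per block size.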

Example 2 with ExecType

use of org.apache.sysml.lops.LopProperties.ExecType in project incubator-systemml by apache.

the class GDFEnumOptimizer method enumNodePlans.

private static PlanSet enumNodePlans(GDFNode node, MemoStructure memo, double maxCosts) throws DMLRuntimeException {
    ArrayList<Plan> plans = new ArrayList<Plan>();
    ExecType CLUSTER = OptimizerUtils.isSparkExecutionMode() ? ExecType.SPARK : ExecType.MR;
    // CASE 1: core hop enumeration (other than persistent/transient read/write)
    if (node.getNodeType() == NodeType.HOP_NODE && !(node.getHop() instanceof DataOp)) {
        //core rewrite enumeration for cp and mr
        enumHopNodePlans(node, plans);
    }
    //CASE 2: dataop hop enumeration
    else if (node.getHop() instanceof DataOp) {
        DataOp dhop = (DataOp) node.getHop();
        if (dhop.getDataOpType() == DataOpTypes.PERSISTENTREAD) {
            //for persistent read the interesting properties are fixed by the input
            //but we can decide on output properties
            ExecType et = (dhop.getMemEstimate() > OptimizerUtils.getLocalMemBudget() || HopRewriteUtils.alwaysRequiresReblock(dhop)) ? CLUSTER : ExecType.CP;
            int[] blocksizes = (et == CLUSTER) ? BLOCK_SIZES : new int[] { BLOCK_SIZES[0] };
            for (Integer bs : blocksizes) {
                RewriteConfig rcmr = new RewriteConfig(et, bs, FileFormatTypes.BINARY);
                InterestingProperties ipsmr = rcmr.deriveInterestingProperties();
                Plan mrplan = new Plan(node, ipsmr, rcmr, null);
                plans.add(mrplan);
            }
        } else if (dhop.getDataOpType() == DataOpTypes.PERSISTENTWRITE) {
            //for persistent write the interesting properties are fixed by the given
            //write specification
            ExecType et = (dhop.getMemEstimate() > OptimizerUtils.getLocalMemBudget()) ? CLUSTER : ExecType.CP;
            RewriteConfig rcmr = new RewriteConfig(et, (int) dhop.getRowsInBlock(), dhop.getInputFormatType());
            InterestingProperties ipsmr = rcmr.deriveInterestingProperties();
            Plan mrplan = new Plan(node, ipsmr, rcmr, null);
            plans.add(mrplan);
        } else if (dhop.getDataOpType() == DataOpTypes.TRANSIENTREAD || dhop.getDataOpType() == DataOpTypes.TRANSIENTWRITE) {
            //note: full enumeration for transient read and write; otherwise the properties
            //of these hops are never set because pass-through plans refer to different hops
            enumHopNodePlans(node, plans);
        }
    }
    if (node.getNodeType() == NodeType.LOOP_NODE) {
        //TODO consistency checks inputs and outputs (updated vars)
        GDFLoopNode lnode = (GDFLoopNode) node;
        //no additional pruning (validity, optimality) required
        for (GDFNode in : lnode.getLoopInputs().values()) enumOpt(in, memo, maxCosts);
        //step 1: enumerate loop plan, incl partitioning/checkpoints/reblock for inputs
        RewriteConfig rc = new RewriteConfig(ExecType.CP, -1, null);
        InterestingProperties ips = rc.deriveInterestingProperties();
        Plan lplan = new Plan(node, ips, rc, null);
        plans.add(lplan);
        //(predicate might be null if single variable)
        if (lnode.getLoopPredicate() != null)
            enumOpt(lnode.getLoopPredicate(), memo, maxCosts);
        //step 3: recursive call optimize on outputs
        //(return union of all output plans, later selected by output var)
        PlanSet Pout = new PlanSet();
        for (GDFNode out : lnode.getLoopOutputs().values()) Pout = Pout.union(enumOpt(out, memo, maxCosts));
        //note: global pruning later done when returning to enumOpt
        //for the entire loop node
    }
    if (node.getNodeType() == NodeType.CROSS_BLOCK_NODE) {
        //do nothing (leads to pass-through on crossProductChild)
    }
    return new PlanSet(plans);
}
Also used : GDFLoopNode(org.apache.sysml.hops.globalopt.gdfgraph.GDFLoopNode) ArrayList(java.util.ArrayList) GDFNode(org.apache.sysml.hops.globalopt.gdfgraph.GDFNode) ExecType(org.apache.sysml.lops.LopProperties.ExecType) DataOp(org.apache.sysml.hops.DataOp)
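
For DataOp nodes, the method above picks the execution type from the memory estimate and then narrows the block sizes for CP. Below is a hedged sketch of just that decision. DataOpPlanSketch and its parameters are hypothetical, and the reblock flag stands in for HopRewriteUtils.alwaysRequiresReblock.

```java
final class DataOpPlanSketch {
    enum ExecType { CP, MR, SPARK }

    // Persistent read: go to the cluster if the data exceeds the local
    // memory budget or a reblock is always required (assumed flag).
    static ExecType chooseReadExecType(double memEstimate, double localBudget,
            boolean alwaysRequiresReblock, ExecType cluster) {
        return (memEstimate > localBudget || alwaysRequiresReblock) ? cluster : ExecType.CP;
    }

    // Cluster execution enumerates all block sizes; CP uses only the first.
    static int[] blockSizesFor(ExecType et, ExecType cluster, int[] allBlockSizes) {
        return (et == cluster) ? allBlockSizes : new int[] { allBlockSizes[0] };
    }
}
```

A persistent write follows the same threshold test but takes its single block size and format from the write specification, so no block size enumeration is needed there.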

Example 3 with ExecType

use of org.apache.sysml.lops.LopProperties.ExecType in project incubator-systemml by apache.

the class Plan method checkValidBlocksizesInMR.

/**
 * If operation is executed in MR, all input blocksizes need to match.
 * Note that the output blocksize can be different since we would add
 * additional reblocks after that operation.
 *
 * @return true if valid blocksizes in MR
 */
public boolean checkValidBlocksizesInMR() {
    boolean ret = true;
    ExecType CLUSTER = OptimizerUtils.isSparkExecutionMode() ? ExecType.SPARK : ExecType.MR;
    if (_conf.getExecType() == CLUSTER && _childs != null && _childs.size() > 1) {
        int size0 = _childs.get(0)._conf.getBlockSize();
        if (size0 > 0) {
            //-1 compatible with everything
            for (Plan c : _childs) ret &= (c._conf.getBlockSize() == size0 || c._conf.getBlockSize() <= 0);
        }
    }
    return ret;
}
Also used : ExecType(org.apache.sysml.lops.LopProperties.ExecType)
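
The block size rule can be illustrated without the Plan class. This is a sketch under assumptions: BlockSizeCheckSketch and inputsCompatible are hypothetical names, and the child plans are reduced to a plain list of block sizes. It mirrors the original in one notable way: if the first child's block size is unknown (<= 0), no comparison is made at all.

```java
import java.util.List;

final class BlockSizeCheckSketch {
    // Returns true if all known (> 0) block sizes agree with the first one.
    static boolean inputsCompatible(List<Integer> childBlockSizes) {
        if (childBlockSizes == null || childBlockSizes.size() <= 1)
            return true;
        int size0 = childBlockSizes.get(0);
        if (size0 <= 0)
            return true; // as in the original, an unknown first size skips the check
        boolean ret = true;
        for (int bs : childBlockSizes)
            ret &= (bs == size0 || bs <= 0); // non-positive sizes are compatible with everything
        return ret;
    }
}
```

For example, block sizes (1000, -1) are compatible, while (1000, 2000) are not.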

Example 4 with ExecType

use of org.apache.sysml.lops.LopProperties.ExecType in project incubator-systemml by apache.

the class ParameterizedBuiltinOp method constructLopsGroupedAggregate.

private void constructLopsGroupedAggregate(HashMap<String, Lop> inputlops, ExecType et) throws HopsException, LopsException {
    //reset reblock requirement (see MR aggregate / construct lops)
    //determine output dimensions
    long outputDim1 = -1, outputDim2 = -1;
    Lop numGroups = inputlops.get(Statement.GAGG_NUM_GROUPS);
    if (!dimsKnown() && numGroups != null && numGroups instanceof Data && ((Data) numGroups).isLiteral()) {
        long ngroups = ((Data) numGroups).getLongValue();
        Lop input = inputlops.get(GroupedAggregate.COMBINEDINPUT);
        long inDim1 = input.getOutputParameters().getNumRows();
        long inDim2 = input.getOutputParameters().getNumCols();
        boolean rowwise = (inDim1 == 1 && inDim2 > 1);
        if (rowwise) {
            outputDim1 = ngroups;
            outputDim2 = 1;
        } else {
            //vector or matrix
            outputDim1 = inDim2;
            outputDim2 = ngroups;
        }
    }
    //construct lops
    if (et == ExecType.MR) {
        Lop grp_agg = null;
        // construct necessary lops: combineBinary/combineTertiary and groupedAgg
        boolean isWeighted = (_paramIndexMap.get(Statement.GAGG_WEIGHTS) != null);
        if (isWeighted) {
            Lop append = BinaryOp.constructAppendLopChain(getInput().get(_paramIndexMap.get(Statement.GAGG_TARGET)), getInput().get(_paramIndexMap.get(Statement.GAGG_GROUPS)), getInput().get(_paramIndexMap.get(Statement.GAGG_WEIGHTS)), DataType.MATRIX, getValueType(), true, getInput().get(_paramIndexMap.get(Statement.GAGG_TARGET)));
            // add the combine lop to parameter list, with a new name "combinedinput"
            inputlops.put(GroupedAggregate.COMBINEDINPUT, append);
            grp_agg = new GroupedAggregate(inputlops, isWeighted, getDataType(), getValueType());
            grp_agg.getOutputParameters().setDimensions(outputDim1, outputDim2, getRowsInBlock(), getColsInBlock(), -1);
        } else {
            Hop target = getInput().get(_paramIndexMap.get(Statement.GAGG_TARGET));
            Hop groups = getInput().get(_paramIndexMap.get(Statement.GAGG_GROUPS));
            Lop append = null;
            //physical operator selection
            double groupsSizeP = OptimizerUtils.estimatePartitionedSizeExactSparsity(groups.getDim1(), groups.getDim2(), groups.getRowsInBlock(), groups.getColsInBlock(), groups.getNnz());
            //mapgroupedagg
            if (groupsSizeP < OptimizerUtils.getRemoteMemBudgetMap(true) && getInput().get(_paramIndexMap.get(Statement.GAGG_FN)) instanceof LiteralOp && ((LiteralOp) getInput().get(_paramIndexMap.get(Statement.GAGG_FN))).getStringValue().equals("sum") && inputlops.get(Statement.GAGG_NUM_GROUPS) != null) {
                //pre partitioning
                boolean needPart = (groups.dimsKnown() && groups.getDim1() * groups.getDim2() > DistributedCacheInput.PARTITION_SIZE);
                if (needPart) {
                    //operator selection
                    ExecType etPart = (OptimizerUtils.estimateSizeExactSparsity(groups.getDim1(), groups.getDim2(), 1.0) < OptimizerUtils.getLocalMemBudget()) ? ExecType.CP : ExecType.MR;
                    Lop dcinput = new DataPartition(groups.constructLops(), DataType.MATRIX, ValueType.DOUBLE, etPart, PDataPartitionFormat.ROW_BLOCK_WISE_N);
                    dcinput.getOutputParameters().setDimensions(groups.getDim1(), groups.getDim2(), target.getRowsInBlock(), target.getColsInBlock(), groups.getNnz());
                    inputlops.put(Statement.GAGG_GROUPS, dcinput);
                }
                Lop grp_agg_m = new GroupedAggregateM(inputlops, getDataType(), getValueType(), needPart, ExecType.MR);
                grp_agg_m.getOutputParameters().setDimensions(outputDim1, outputDim2, target.getRowsInBlock(), target.getColsInBlock(), -1);
                //post aggregation
                Group grp = new Group(grp_agg_m, Group.OperationTypes.Sort, getDataType(), getValueType());
                grp.getOutputParameters().setDimensions(outputDim1, outputDim2, target.getRowsInBlock(), target.getColsInBlock(), -1);
                Aggregate agg1 = new Aggregate(grp, HopsAgg2Lops.get(AggOp.SUM), getDataType(), getValueType(), ExecType.MR);
                agg1.getOutputParameters().setDimensions(outputDim1, outputDim2, target.getRowsInBlock(), target.getColsInBlock(), -1);
                grp_agg = agg1;
                //note: no reblock required
            }
            //general case: groupedagg
            else {
                //multi-column-block result matrix, or unknown number of columns
                if (target.getDim2() >= target.getColsInBlock() || target.getDim2() <= 0) {
                    long m1_dim1 = target.getDim1();
                    long m1_dim2 = target.getDim2();
                    long m2_dim1 = groups.getDim1();
                    long m2_dim2 = groups.getDim2();
                    long m3_dim1 = m1_dim1;
                    long m3_dim2 = ((m1_dim2 > 0 && m2_dim2 > 0) ? (m1_dim2 + m2_dim2) : -1);
                    long m3_nnz = (target.getNnz() > 0 && groups.getNnz() > 0) ? (target.getNnz() + groups.getNnz()) : -1;
                    long brlen = target.getRowsInBlock();
                    long bclen = target.getColsInBlock();
                    Lop offset = createOffsetLop(target, true);
                    Lop rep = new RepMat(groups.constructLops(), offset, true, groups.getDataType(), groups.getValueType());
                    Group group1 = new Group(target.constructLops(), Group.OperationTypes.Sort, DataType.MATRIX, target.getValueType());
                    group1.getOutputParameters().setDimensions(m1_dim1, m1_dim2, brlen, bclen, target.getNnz());
                    Group group2 = new Group(rep, Group.OperationTypes.Sort, DataType.MATRIX, groups.getValueType());
                    group2.getOutputParameters().setDimensions(m2_dim1, m2_dim2, brlen, bclen, groups.getNnz());
                    append = new AppendR(group1, group2, DataType.MATRIX, ValueType.DOUBLE, true, ExecType.MR);
                    append.getOutputParameters().setDimensions(m3_dim1, m3_dim2, brlen, bclen, m3_nnz);
                }
                //single-column-block vector or matrix
                else {
                    append = BinaryOp.constructMRAppendLop(target, groups, DataType.MATRIX, getValueType(), true, target);
                }
                // add the combine lop to parameter list, with a new name "combinedinput"
                inputlops.put(GroupedAggregate.COMBINEDINPUT, append);
                grp_agg = new GroupedAggregate(inputlops, isWeighted, getDataType(), getValueType());
                grp_agg.getOutputParameters().setDimensions(outputDim1, outputDim2, getRowsInBlock(), getColsInBlock(), -1);
            }
        }
    }
    //CP/Spark
    else {
        Lop grp_agg = null;
        if (et == ExecType.CP) {
            int k = OptimizerUtils.getConstrainedNumThreads(_maxNumThreads);
            grp_agg = new GroupedAggregate(inputlops, getDataType(), getValueType(), et, k);
            grp_agg.getOutputParameters().setDimensions(outputDim1, outputDim2, getRowsInBlock(), getColsInBlock(), -1);
        } else if (et == ExecType.SPARK) {
            //physical operator selection
            Hop groups = getInput().get(_paramIndexMap.get(Statement.GAGG_GROUPS));
            boolean broadcastGroups = (_paramIndexMap.get(Statement.GAGG_WEIGHTS) == null && OptimizerUtils.checkSparkBroadcastMemoryBudget(groups.getDim1(), groups.getDim2(), groups.getRowsInBlock(), groups.getColsInBlock(), groups.getNnz()));
            //mapgroupedagg
            if (broadcastGroups && getInput().get(_paramIndexMap.get(Statement.GAGG_FN)) instanceof LiteralOp && ((LiteralOp) getInput().get(_paramIndexMap.get(Statement.GAGG_FN))).getStringValue().equals("sum") && inputlops.get(Statement.GAGG_NUM_GROUPS) != null) {
                Hop target = getInput().get(_paramIndexMap.get(Statement.GAGG_TARGET));
                grp_agg = new GroupedAggregateM(inputlops, getDataType(), getValueType(), true, ExecType.SPARK);
                grp_agg.getOutputParameters().setDimensions(outputDim1, outputDim2, target.getRowsInBlock(), target.getColsInBlock(), -1);
                //no reblock required (directly output binary block)
            }
            //groupedagg (w/ or w/o broadcast)
            else {
                grp_agg = new GroupedAggregate(inputlops, getDataType(), getValueType(), et, broadcastGroups);
                grp_agg.getOutputParameters().setDimensions(outputDim1, outputDim2, -1, -1, -1);
            }
        }
    }
}
Also used : Group(org.apache.sysml.lops.Group) MultiThreadedHop(org.apache.sysml.hops.Hop.MultiThreadedHop) Data(org.apache.sysml.lops.Data) Lop(org.apache.sysml.lops.Lop) RepMat(org.apache.sysml.lops.RepMat) AppendR(org.apache.sysml.lops.AppendR) ExecType(org.apache.sysml.lops.LopProperties.ExecType) GroupedAggregate(org.apache.sysml.lops.GroupedAggregate) Aggregate(org.apache.sysml.lops.Aggregate) DataPartition(org.apache.sysml.lops.DataPartition) GroupedAggregateM(org.apache.sysml.lops.GroupedAggregateM)
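
Two decisions in constructLopsGroupedAggregate are easy to lose in the long method: how the output dimensions follow from the input shape, and when the map-side operator (GroupedAggregateM) is chosen. The sketch below restates both with hypothetical names (GroupedAggSketch, outputDims, useMapSide) and plain parameters in place of the Hop and Lop objects. It is an illustration, not the project's API.

```java
final class GroupedAggSketch {
    // Output shape when the number of groups is a known literal:
    // a row vector input (1 x n, n > 1) yields ngroups x 1,
    // any other input yields inDim2 x ngroups.
    static long[] outputDims(long inDim1, long inDim2, long ngroups) {
        boolean rowwise = (inDim1 == 1 && inDim2 > 1);
        return rowwise ? new long[] { ngroups, 1 } : new long[] { inDim2, ngroups };
    }

    // The map-side variant is only used if the groups fit into the given
    // memory budget, the aggregation function is the literal "sum", and
    // the number of groups was provided.
    static boolean useMapSide(double groupsSize, double memBudget,
            String aggFunction, boolean numGroupsGiven) {
        return groupsSize < memBudget && "sum".equals(aggFunction) && numGroupsGiven;
    }
}
```

In the original, the budget is OptimizerUtils.getRemoteMemBudgetMap(true) for MR and a broadcast budget check for Spark. The Spark path additionally requires that no weights are given.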

Example 5 with ExecType

use of org.apache.sysml.lops.LopProperties.ExecType in project incubator-systemml by apache.

the class ParameterizedBuiltinOp method constructLops.

public Lop constructLops() throws HopsException, LopsException {
    //return already created lops
    if (getLops() != null)
        return getLops();
    // construct lops for all input parameters
    HashMap<String, Lop> inputlops = new HashMap<String, Lop>();
    for (Entry<String, Integer> cur : _paramIndexMap.entrySet()) {
        inputlops.put(cur.getKey(), getInput().get(cur.getValue()).constructLops());
    }
    switch(_op) {
        case GROUPEDAGG: {
            ExecType et = optFindExecType();
            constructLopsGroupedAggregate(inputlops, et);
            break;
        }
        case RMEMPTY: {
            ExecType et = optFindExecType();
            et = (et == ExecType.MR && !COMPILE_PARALLEL_REMOVEEMPTY) ? ExecType.CP_FILE : et;
            constructLopsRemoveEmpty(inputlops, et);
            break;
        }
        case REXPAND: {
            ExecType et = optFindExecType();
            constructLopsRExpand(inputlops, et);
            break;
        }
        case TRANSFORM: {
            ExecType et = optFindExecType();
            ParameterizedBuiltin pbilop = new ParameterizedBuiltin(inputlops, HopsParameterizedBuiltinLops.get(_op), getDataType(), getValueType(), et);
            // output of transform is always in CSV format
            // to produce a blocked output, this lop must be
            // fed into CSV Reblock lop.
            break;
        }
        case CDF:
        case INVCDF:
        case REPLACE:
        case TRANSFORMAPPLY:
        case TRANSFORMMETA:
        case TOSTRING: {
            ExecType et = optFindExecType();
            ParameterizedBuiltin pbilop = new ParameterizedBuiltin(inputlops, HopsParameterizedBuiltinLops.get(_op), getDataType(), getValueType(), et);
            break;
        }
        default:
            throw new HopsException("Unknown ParamBuiltinOp: " + _op);
    }
    //add reblock/checkpoint lops if necessary
    return getLops();
}
Also used : ParameterizedBuiltin(org.apache.sysml.lops.ParameterizedBuiltin) HashMap(java.util.HashMap) ExecType(org.apache.sysml.lops.LopProperties.ExecType) Lop(org.apache.sysml.lops.Lop)
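
The RMEMPTY case adjusts the execution type before constructing lops: an MR choice is replaced by CP_FILE unless parallel remove-empty compilation is enabled. A minimal sketch of that rule follows. RemoveEmptyExecTypeSketch and adjust are hypothetical names, and the boolean parameter stands in for the COMPILE_PARALLEL_REMOVEEMPTY flag.

```java
final class RemoveEmptyExecTypeSketch {
    enum ExecType { CP, CP_FILE, MR, SPARK }

    // Fall back from MR to CP_FILE if parallel remove-empty is disabled;
    // every other execution type is passed through unchanged.
    static ExecType adjust(ExecType et, boolean compileParallelRemoveEmpty) {
        return (et == ExecType.MR && !compileParallelRemoveEmpty) ? ExecType.CP_FILE : et;
    }
}
```

Only the MR case is affected, so CP and SPARK come back unchanged regardless of the flag.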

