Example 6 with Fields

Use of cascading.tuple.Fields in project SpyGlass by ParallelAI.

From the class HBaseScheme, method columns.

private String[] columns(String[] familyNames, Fields[] fieldsArray) {
    if (columns != null) {
        return columns;
    }
    int size = 0;
    for (Fields fields : fieldsArray) {
        size += fields.size();
    }
    columns = new String[size];
    int count = 0;
    for (int i = 0; i < fieldsArray.length; i++) {
        Fields fields = fieldsArray[i];
        for (int j = 0; j < fields.size(); j++) {
            if (familyNames == null) {
                columns[count++] = hbaseColumn((String) fields.get(j));
            } else {
                columns[count++] = hbaseColumn(familyNames[i]) + (String) fields.get(j);
            }
        }
    }
    return columns;
}
Also used : Fields(cascading.tuple.Fields)
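
The columns method above relies on an hbaseColumn helper that this excerpt does not show. As a rough sketch of the naming scheme it appears to produce, the following stand-alone Java uses plain string arrays in place of Fields. The class name ColumnNameSketch and the body of hbaseColumn (append ':' when the name has no family separator) are assumptions for illustration, not SpyGlass code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ColumnNameSketch {
    // Assumed behavior: a name without ':' is turned into "name:";
    // the real hbaseColumn helper is not part of the excerpt above.
    static String hbaseColumn(String column) {
        return column.indexOf(':') < 0 ? column + ":" : column;
    }

    // Plain-array stand-in for columns(String[], Fields[]):
    // fieldGroups[i] plays the role of fieldsArray[i].
    static String[] columns(String[] familyNames, String[][] fieldGroups) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < fieldGroups.length; i++) {
            for (String field : fieldGroups[i]) {
                if (familyNames == null) {
                    // No families given: the field name itself is normalized.
                    result.add(hbaseColumn(field));
                } else {
                    // Family i prefixes every field of group i.
                    result.add(hbaseColumn(familyNames[i]) + field);
                }
            }
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] families = {"cf1", "cf2"};
        String[][] groups = {{"a", "b"}, {"c"}};
        // prints [cf1:a, cf1:b, cf2:c]
        System.out.println(Arrays.toString(columns(families, groups)));
        // prints [cf1:a, b:]
        System.out.println(Arrays.toString(columns(null, new String[][] {{"cf1:a", "b"}})));
    }
}
```

Unlike the original method, this sketch does not cache its result in a field; it only illustrates how the family and field names are combined.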

Example 7 with Fields

Use of cascading.tuple.Fields in project SpyGlass by ParallelAI.

From the class HBaseScheme, method setSourceSink.

private void setSourceSink(Fields keyFields, Fields[] columnFields) {
    Fields allFields = keyFields;
    if (columnFields.length != 0) {
        // prepend the key fields to the joined column fields
        allFields = Fields.join(keyFields, Fields.join(columnFields));
    }
    setSourceFields(allFields);
    setSinkFields(allFields);
}
Also used : Fields(cascading.tuple.Fields)

Example 8 with Fields

Use of cascading.tuple.Fields in project SpyGlass by ParallelAI.

From the class HBaseRawScheme, method sink.

@SuppressWarnings("unchecked")
@Override
public void sink(FlowProcess<JobConf> flowProcess, SinkCall<Object[], OutputCollector> sinkCall) throws IOException {
    TupleEntry tupleEntry = sinkCall.getOutgoingEntry();
    OutputCollector outputCollector = sinkCall.getOutput();
    Tuple key = tupleEntry.selectTuple(RowKeyField);
    Object okey = key.getObject(0);
    ImmutableBytesWritable keyBytes = getBytes(okey);
    Put put = new Put(keyBytes.get());
    Fields outFields = tupleEntry.getFields().subtract(RowKeyField);
    if (null != outFields) {
        TupleEntry values = tupleEntry.selectEntry(outFields);
        for (int n = 0; n < values.getFields().size(); n++) {
            Object o = values.get(n);
            ImmutableBytesWritable valueBytes = getBytes(o);
            Comparable field = outFields.get(n);
            ColumnName cn = parseColumn((String) field);
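            if (null == cn.family) {
                // No family was parsed for this field: pick one by position,
                // reusing the last family once n runs past the end of familyNames.
                if (n >= familyNames.length) {
                    cn.family = familyNames[familyNames.length - 1];
                } else {
                    cn.family = familyNames[n];
                }
            }
            // Null values are written only when writeNulls is set.
            if (null != o || writeNulls) {
                put.add(Bytes.toBytes(cn.family), Bytes.toBytes(cn.name), valueBytes.get());
            }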
        }
    }
    outputCollector.collect(null, put);
}
Also used : OutputCollector(org.apache.hadoop.mapred.OutputCollector), ImmutableBytesWritable(org.apache.hadoop.hbase.io.ImmutableBytesWritable), Fields(cascading.tuple.Fields), TupleEntry(cascading.tuple.TupleEntry), Tuple(cascading.tuple.Tuple), Put(org.apache.hadoop.hbase.client.Put)
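
The family fallback inside sink can be looked at in isolation. The sketch below is a minimal stand-alone version of that one rule, under the assumption that familyNames is a non-empty array. FamilyFallbackSketch and familyFor are hypothetical names introduced here and do not exist in SpyGlass.

```java
final class FamilyFallbackSketch {
    // Returns the family for the n-th value field. Positions beyond the end
    // of familyNames reuse the last family, matching the if/else in sink.
    // Like the original, this throws on an empty familyNames array.
    static String familyFor(int n, String[] familyNames) {
        int index = Math.min(n, familyNames.length - 1);
        return familyNames[index];
    }

    public static void main(String[] args) {
        String[] families = {"cf1", "cf2"};
        System.out.println(familyFor(0, families)); // prints cf1
        System.out.println(familyFor(1, families)); // prints cf2
        System.out.println(familyFor(5, families)); // prints cf2
    }
}
```

With a single configured family, every field without an explicit family therefore lands in that family, whatever its position.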

Example 9 with Fields

Use of cascading.tuple.Fields in project SpyGlass by ParallelAI.

From the class HBaseRawScheme, method setSourceFields.

private void setSourceFields() {
    Fields sourceFields = Fields.join(RowKeyField, RowField);
    setSourceFields(sourceFields);
}
Also used : Fields(cascading.tuple.Fields)

Aggregations

Fields (cascading.tuple.Fields) 9
Tuple (cascading.tuple.Tuple) 4
ImmutableBytesWritable (org.apache.hadoop.hbase.io.ImmutableBytesWritable) 3
Flow (cascading.flow.Flow) 2
FlowDef (cascading.flow.FlowDef) 2
Insert (cascading.operation.Insert) 2
ExpressionFunction (cascading.operation.expression.ExpressionFunction) 2
RegexFilter (cascading.operation.regex.RegexFilter) 2
RegexSplitGenerator (cascading.operation.regex.RegexSplitGenerator) 2
CoGroup (cascading.pipe.CoGroup) 2
Each (cascading.pipe.Each) 2
GroupBy (cascading.pipe.GroupBy) 2
HashJoin (cascading.pipe.HashJoin) 2
Pipe (cascading.pipe.Pipe) 2
CountBy (cascading.pipe.assembly.CountBy) 2
Rename (cascading.pipe.assembly.Rename) 2
Retain (cascading.pipe.assembly.Retain) 2
SumBy (cascading.pipe.assembly.SumBy) 2
Unique (cascading.pipe.assembly.Unique) 2
LeftJoin (cascading.pipe.joiner.LeftJoin) 2