Example 11 with Tuple

Use of cascading.tuple.Tuple in project SpyGlass by ParallelAI.

The source method of class HBaseScheme:

@Override
public boolean source(FlowProcess<JobConf> flowProcess, SourceCall<Object[], RecordReader> sourceCall) throws IOException {
    Tuple result = new Tuple();
    Object key = sourceCall.getContext()[0];
    Object value = sourceCall.getContext()[1];
    boolean hasNext = sourceCall.getInput().next(key, value);
    if (!hasNext) {
        return false;
    }
    // Skip null keys or values: report success without populating a new tuple
    if (key == null || value == null) {
        return true;
    }
    ImmutableBytesWritable keyWritable = (ImmutableBytesWritable) key;
    Result row = (Result) value;
    result.add(keyWritable);
    for (int i = 0; i < this.familyNames.length; i++) {
        String familyName = this.familyNames[i];
        byte[] familyNameBytes = Bytes.toBytes(familyName);
        Fields fields = this.valueFields[i];
        for (int k = 0; k < fields.size(); k++) {
            String fieldName = (String) fields.get(k);
            byte[] fieldNameBytes = Bytes.toBytes(fieldName);
            byte[] cellValue = row.getValue(familyNameBytes, fieldNameBytes);
            result.add(cellValue != null ? new ImmutableBytesWritable(cellValue) : null);
        }
    }
    sourceCall.getIncomingEntry().setTuple(result);
    return true;
}
Also used: ImmutableBytesWritable (org.apache.hadoop.hbase.io.ImmutableBytesWritable), Fields (cascading.tuple.Fields), Tuple (cascading.tuple.Tuple), Result (org.apache.hadoop.hbase.client.Result)

Example 12 with Tuple

Use of cascading.tuple.Tuple in project elephant-bird by twitter.

The source method of class LzoBinaryScheme:

@Override
public boolean source(FlowProcess<? extends Configuration> flowProcess, SourceCall<Object[], RecordReader> sourceCall) throws IOException {
    Object[] context = sourceCall.getContext();
    while (sourceCall.getInput().next(context[0], context[1])) {
        Object out = ((T) context[1]).get();
        if (out != null) {
            sourceCall.getIncomingEntry().setTuple(new Tuple(out));
            return true;
        }
        LOG.warn("failed to decode record");
    }
    return false;
}
Also used: Tuple (cascading.tuple.Tuple)

Example 13 with Tuple

Use of cascading.tuple.Tuple in project elephant-bird by twitter.

The source method of class LzoBinaryScheme (this variant takes FlowProcess<JobConf>):

@Override
public boolean source(FlowProcess<JobConf> flowProcess, SourceCall<Object[], RecordReader> sourceCall) throws IOException {
    Object[] context = sourceCall.getContext();
    while (sourceCall.getInput().next(context[0], context[1])) {
        Object out = ((T) context[1]).get();
        if (out != null) {
            sourceCall.getIncomingEntry().setTuple(new Tuple(out));
            return true;
        }
        LOG.warn("failed to decode record");
    }
    return false;
}
Also used: Tuple (cascading.tuple.Tuple)
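
The two LzoBinaryScheme examples above share one control-flow pattern: keep reading until a record decodes, emit it, and report end of input only when the reader is exhausted. The following is a minimal sketch of that pattern using only the Java standard library. The class SkipUndecodableSketch, its methods, and the use of an Iterator with null standing in for an undecodable record are all hypothetical stand-ins for illustration; they are not part of Cascading or elephant-bird.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the skip-until-decodable loop; not Cascading API.
public class SkipUndecodableSketch {

    // Returns the next decodable record, or null once the reader is exhausted.
    // A null element from the iterator models a record that failed to decode.
    static String nextDecodable(Iterator<String> reader) {
        while (reader.hasNext()) {
            String out = reader.next();
            if (out != null) {
                // Plays the role of setTuple(new Tuple(out)); return true;
                return out;
            }
            // Undecodable record: skip it and keep reading.
        }
        // Plays the role of "return false": no more input.
        return null;
    }

    // Drains the reader the way a caller would invoke source() repeatedly.
    static List<String> readAll(List<String> raw) {
        Iterator<String> reader = raw.iterator();
        List<String> emitted = new ArrayList<>();
        String record;
        while ((record = nextDecodable(reader)) != null) {
            emitted.add(record);
        }
        return emitted;
    }

    public static void main(String[] args) {
        // Two good records interleaved with two undecodable ones.
        System.out.println(readAll(Arrays.asList("a", null, "b", null))); // prints [a, b]
    }
}
```

Because the loop lives inside the read method, the caller never sees an undecodable record. The HBaseScheme example instead returns true for a null key or value without setting a tuple, leaving the caller to deal with whatever the incoming entry already holds.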

Aggregations

Tuple (cascading.tuple.Tuple): 13
TupleEntry (cascading.tuple.TupleEntry): 5
Fields (cascading.tuple.Fields): 4
ImmutableBytesWritable (org.apache.hadoop.hbase.io.ImmutableBytesWritable): 4
OutputCollector (org.apache.hadoop.mapred.OutputCollector): 3
Put (org.apache.hadoop.hbase.client.Put): 2
Result (org.apache.hadoop.hbase.client.Result): 2
Container (org.apache.parquet.hadoop.mapred.Container): 2
Function (cascading.operation.Function): 1
TupleListCollector (cascading.tuple.TupleListCollector): 1
ArrayList (java.util.ArrayList): 1
Test (org.junit.Test): 1