Example 6 with TupleEntry

Use of cascading.tuple.TupleEntry in project parquet-mr by apache.

From the class ParquetTupleScheme, the sink method:

@Override
public void sink(FlowProcess<? extends JobConf> fp, SinkCall<Object[], OutputCollector> sink) throws IOException {
    TupleEntry tuple = sink.getOutgoingEntry();
    OutputCollector outputCollector = sink.getOutput();
    // The key is unused here; the whole TupleEntry is emitted as the value.
    outputCollector.collect(null, tuple);
}
Also used: OutputCollector (org.apache.hadoop.mapred.OutputCollector), TupleEntry (cascading.tuple.TupleEntry)

Example 7 with TupleEntry

Use of cascading.tuple.TupleEntry in project parquet-mr by apache.

From the class ParquetTupleScheme again; this variant differs from Example 6 only in the FlowProcess type parameter (FlowProcess<JobConf> rather than FlowProcess<? extends JobConf>):

@Override
public void sink(FlowProcess<JobConf> fp, SinkCall<Object[], OutputCollector> sink) throws IOException {
    TupleEntry tuple = sink.getOutgoingEntry();
    OutputCollector outputCollector = sink.getOutput();
    outputCollector.collect(null, tuple);
}
Also used: OutputCollector (org.apache.hadoop.mapred.OutputCollector), TupleEntry (cascading.tuple.TupleEntry)
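
For context, here is a minimal sketch of how such a scheme might be wired into a Cascading flow. It is not from the parquet-mr sources: the paths, field names, and Parquet message schema are invented, and the three-argument ParquetTupleScheme constructor (source fields, sink fields, schema string) is assumed from parquet-cascading.

import cascading.flow.Flow;
import cascading.flow.hadoop.HadoopFlowConnector;
import cascading.pipe.Pipe;
import cascading.scheme.hadoop.TextDelimited;
import cascading.tap.hadoop.Hfs;
import cascading.tuple.Fields;
// Package varies by parquet-mr version; older releases use parquet.cascading.
import org.apache.parquet.cascading.ParquetTupleScheme;

public class ParquetSinkFlow {
    public static void main(String[] args) {
        Fields fields = new Fields("name", "age");
        Hfs source = new Hfs(new TextDelimited(fields, "\t"), "input/people.tsv");
        // Hypothetical schema string and output path, for illustration only.
        Hfs sink = new Hfs(new ParquetTupleScheme(fields, fields,
                "message people { required binary name (UTF8); optional int32 age; }"),
                "output/people");
        Flow flow = new HadoopFlowConnector().connect(source, sink, new Pipe("copy"));
        // During the write phase, every outgoing TupleEntry passes through
        // the sink() method shown above.
        flow.complete();
    }
}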

Example 8 with TupleEntry

Use of cascading.tuple.TupleEntry in project parquet-mr by apache.

From the class ParquetValueScheme, the sink method:

@SuppressWarnings("unchecked")
@Override
public void sink(FlowProcess<JobConf> fp, SinkCall<Object[], OutputCollector> sc) throws IOException {
    TupleEntry tuple = sc.getOutgoingEntry();
    // A value scheme writes one object per record, so the outgoing tuple
    // must hold exactly one field.
    if (tuple.size() != 1) {
        throw new RuntimeException("ParquetValueScheme expects tuples with an arity of exactly 1, but found " + tuple.getFields());
    }
    T value = (T) tuple.getObject(0);
    OutputCollector output = sc.getOutput();
    // The key is unused; the single value carries the whole record.
    output.collect(null, value);
}
Also used: OutputCollector (org.apache.hadoop.mapred.OutputCollector), TupleEntry (cascading.tuple.TupleEntry)
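
To make the arity contract concrete, here is a small self-contained sketch; the field names and values are invented for illustration.

import cascading.tuple.Fields;
import cascading.tuple.Tuple;
import cascading.tuple.TupleEntry;

public class ArityCheckDemo {
    public static void main(String[] args) {
        // Accepted: exactly one field, so tuple.getObject(0) is the record.
        TupleEntry ok = new TupleEntry(new Fields("record"), new Tuple("payload"));
        System.out.println(ok.size()); // prints 1

        // Rejected by the sink above: arity 2 triggers the RuntimeException.
        TupleEntry bad = new TupleEntry(new Fields("record", "extra"),
                new Tuple("payload", "oops"));
        System.out.println(bad.size()); // prints 2
    }
}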

Example 9 with TupleEntry

Use of cascading.tuple.TupleEntry in project SpyGlass by ParallelAI.

From the class JDBCScheme, the sink method:

@Override
public void sink(FlowProcess<JobConf> flowProcess, SinkCall<Object[], OutputCollector> sinkCall) throws IOException {
    TupleEntry tupleEntry = sinkCall.getOutgoingEntry();
    OutputCollector outputCollector = sinkCall.getOutput();
    if (updateBy != null) {
        Tuple allValues = tupleEntry.selectTuple(updateValueFields);
        Tuple updateValues = tupleEntry.selectTuple(updateByFields);
        allValues = cleanTuple(allValues);
        TupleRecord key = new TupleRecord(allValues);
        // It's ok to use null here so the collector does not write anything.
        if (updateValues.equals(updateIfTuple))
            outputCollector.collect(key, null);
        else
            outputCollector.collect(key, key);
        return;
    }
    Tuple result = tupleEntry.selectTuple(getSinkFields());
    result = cleanTuple(result);
    outputCollector.collect(new TupleRecord(result), null);
}
Also used: OutputCollector (org.apache.hadoop.mapred.OutputCollector), TupleEntry (cascading.tuple.TupleEntry), Tuple (cascading.tuple.Tuple)
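
The update-versus-insert branching above hinges on TupleEntry.selectTuple(), which projects an entry down to a named subset of its fields. A minimal sketch, with invented field names standing in for updateByFields and updateValueFields:

import cascading.tuple.Fields;
import cascading.tuple.Tuple;
import cascading.tuple.TupleEntry;

public class SelectTupleDemo {
    public static void main(String[] args) {
        Fields all = new Fields("id", "name", "modified");
        TupleEntry entry = new TupleEntry(all, new Tuple(42, "alice", "2015-01-01"));

        // Project the full row down to the key fields and the value fields,
        // the same split the JDBC sink performs before collecting.
        Tuple updateBy = entry.selectTuple(new Fields("id"));
        Tuple values = entry.selectTuple(new Fields("name", "modified"));
        System.out.println(updateBy + " / " + values);
    }
}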

Example 10 with TupleEntry

Use of cascading.tuple.TupleEntry in project SpyGlass by ParallelAI.

From the class HBaseRawScheme, the sink method:

@SuppressWarnings("unchecked")
@Override
public void sink(FlowProcess<JobConf> flowProcess, SinkCall<Object[], OutputCollector> sinkCall) throws IOException {
    TupleEntry tupleEntry = sinkCall.getOutgoingEntry();
    OutputCollector outputCollector = sinkCall.getOutput();
    // The row key comes from the dedicated key field; all remaining fields
    // become column values on the Put.
    Tuple key = tupleEntry.selectTuple(RowKeyField);
    Object okey = key.getObject(0);
    ImmutableBytesWritable keyBytes = getBytes(okey);
    Put put = new Put(keyBytes.get());
    Fields outFields = tupleEntry.getFields().subtract(RowKeyField);
    if (null != outFields) {
        TupleEntry values = tupleEntry.selectEntry(outFields);
        for (int n = 0; n < values.getFields().size(); n++) {
            Object o = values.get(n);
            ImmutableBytesWritable valueBytes = getBytes(o);
            Comparable field = outFields.get(n);
            // parseColumn splits a field name into column family and qualifier;
            // a missing family falls back to the configured familyNames.
            ColumnName cn = parseColumn((String) field);
            if (null == cn.family) {
                if (n >= familyNames.length)
                    cn.family = familyNames[familyNames.length - 1];
                else
                    cn.family = familyNames[n];
            }
            // Null values are skipped unless writeNulls is enabled.
            if (null != o || writeNulls)
                put.add(Bytes.toBytes(cn.family), Bytes.toBytes(cn.name), valueBytes.get());
        }
    }
    // The key passed to the collector is unused; the Put carries the row.
    outputCollector.collect(null, put);
}
Also used: OutputCollector (org.apache.hadoop.mapred.OutputCollector), ImmutableBytesWritable (org.apache.hadoop.hbase.io.ImmutableBytesWritable), Fields (cascading.tuple.Fields), TupleEntry (cascading.tuple.TupleEntry), Tuple (cascading.tuple.Tuple), Put (org.apache.hadoop.hbase.client.Put)
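
parseColumn is internal to HBaseRawScheme, so the SpyGlass sources are the authority on its exact behavior. As a rough illustration of the family:qualifier naming convention the loop above relies on, a hypothetical parser might look like this:

// Hypothetical re-creation for illustration only; the real parseColumn in
// HBaseRawScheme may differ in edge-case handling.
public class ColumnNameDemo {
    static final class ColumnName {
        String family; // null when the field name carries no family prefix
        String name;
    }

    static ColumnName parseColumn(String field) {
        ColumnName cn = new ColumnName();
        int sep = field.indexOf(':');
        if (sep < 0) {
            cn.name = field; // no prefix: the sink falls back to familyNames[n]
        } else {
            cn.family = field.substring(0, sep);
            cn.name = field.substring(sep + 1);
        }
        return cn;
    }

    public static void main(String[] args) {
        ColumnName cn = parseColumn("cf:age");
        System.out.println(cn.family + " / " + cn.name); // prints cf / age
    }
}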

Aggregations

TupleEntry (cascading.tuple.TupleEntry): 11 uses
OutputCollector (org.apache.hadoop.mapred.OutputCollector): 9 uses
Tuple (cascading.tuple.Tuple): 5 uses
Fields (cascading.tuple.Fields): 2 uses
Put (org.apache.hadoop.hbase.client.Put): 2 uses
ImmutableBytesWritable (org.apache.hadoop.hbase.io.ImmutableBytesWritable): 2 uses