Example 1 with TupleEntry

Use of cascading.tuple.TupleEntry in project elephant-bird by twitter.

In class LzoBinaryScheme, method sink.

@Override
public void sink(FlowProcess<JobConf> flowProcess, SinkCall<T, OutputCollector> sinkCall) throws IOException {
    OutputCollector collector = sinkCall.getOutput();
    TupleEntry entry = sinkCall.getOutgoingEntry();
    T writable = sinkCall.getContext();
    writable.set((M) entry.getTuple().getObject(0));
    collector.collect(null, writable);
}
Also used: OutputCollector (org.apache.hadoop.mapred.OutputCollector), TupleEntry (cascading.tuple.TupleEntry)

Example 2 with TupleEntry

Use of cascading.tuple.TupleEntry in project parquet-mr by apache.

In class ParquetValueScheme, method sink.

@SuppressWarnings("unchecked")
@Override
public void sink(FlowProcess<? extends JobConf> fp, SinkCall<Object[], OutputCollector> sc) throws IOException {
    TupleEntry tuple = sc.getOutgoingEntry();
    if (tuple.size() != 1) {
        throw new RuntimeException("ParquetValueScheme expects tuples with an arity of exactly 1, but found " + tuple.getFields());
    }
    T value = (T) tuple.getObject(0);
    OutputCollector output = sc.getOutput();
    output.collect(null, value);
}
Also used: OutputCollector (org.apache.hadoop.mapred.OutputCollector), TupleEntry (cascading.tuple.TupleEntry)

Example 3 with TupleEntry

Use of cascading.tuple.TupleEntry in project Impatient by Cascading.

In class ScrubFunction, method operate.

public void operate(FlowProcess flowProcess, FunctionCall functionCall) {
    TupleEntry argument = functionCall.getArguments();
    String doc_id = argument.getString(0);
    String token = scrubText(argument.getString(1));
    if (token.length() > 0) {
        Tuple result = new Tuple();
        result.add(doc_id);
        result.add(token);
        functionCall.getOutputCollector().add(result);
    }
}
Also used: TupleEntry (cascading.tuple.TupleEntry), Tuple (cascading.tuple.Tuple)
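
The operate method in Example 3 calls a scrubText helper that the excerpt does not show. The following is a minimal plain-Java sketch of the same filter-and-emit logic without any Cascading dependency. It assumes scrubText simply trims and lower-cases its input; the class name ScrubSketch and the method scrubAll are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class ScrubSketch {
    // Hypothetical stand-in for the scrubText helper (assumption: trim + lower-case).
    static String scrubText(String text) {
        return text.trim().toLowerCase(Locale.ROOT);
    }

    // Mirrors the guard in operate(): a (doc_id, token) pair is emitted
    // only when the scrubbed token is non-empty.
    static List<String[]> scrubAll(String docId, List<String> rawTokens) {
        List<String[]> out = new ArrayList<>();
        for (String raw : rawTokens) {
            String token = scrubText(raw);
            if (token.length() > 0) {
                out.add(new String[] {docId, token});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The whitespace-only token is dropped; the other two are kept.
        for (String[] pair : scrubAll("doc01", Arrays.asList("  Rain ", "   ", "SHADOW"))) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

In the Cascading version, each surviving pair becomes a Tuple added to functionCall.getOutputCollector() rather than a list element.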

Example 4 with TupleEntry

Use of cascading.tuple.TupleEntry in project SpyGlass by ParallelAI.

In class HBaseScheme, method sink.

@Override
public void sink(FlowProcess<JobConf> flowProcess, SinkCall<Object[], OutputCollector> sinkCall) throws IOException {
    TupleEntry tupleEntry = sinkCall.getOutgoingEntry();
    OutputCollector outputCollector = sinkCall.getOutput();
    Tuple key = tupleEntry.selectTuple(keyField);
    ImmutableBytesWritable keyBytes = (ImmutableBytesWritable) key.getObject(0);
    if (useSalt) {
        keyBytes = HBaseSalter.addSaltPrefix(keyBytes);
    }
    Put put;
    if (this.timeStamp == 0L) {
        put = new Put(keyBytes.get());
    } else {
        put = new Put(keyBytes.get(), this.timeStamp);
    }
    for (int i = 0; i < valueFields.length; i++) {
        Fields fieldSelector = valueFields[i];
        TupleEntry values = tupleEntry.selectEntry(fieldSelector);
        // Field names and values for this column family.
        Fields fields = values.getFields();
        Tuple tuple = values.getTuple();
        for (int j = 0; j < fields.size(); j++) {
            ImmutableBytesWritable valueBytes = (ImmutableBytesWritable) tuple.getObject(j);
            // Skip null values so that no cell is written for them.
            if (valueBytes != null) {
                put.add(Bytes.toBytes(familyNames[i]), Bytes.toBytes((String) fields.get(j)), valueBytes.get());
            }
        }
    }
    outputCollector.collect(null, put);
}
Also used: OutputCollector (org.apache.hadoop.mapred.OutputCollector), ImmutableBytesWritable (org.apache.hadoop.hbase.io.ImmutableBytesWritable), Fields (cascading.tuple.Fields), TupleEntry (cascading.tuple.TupleEntry), Tuple (cascading.tuple.Tuple), Put (org.apache.hadoop.hbase.client.Put)

Example 5 with TupleEntry

Use of cascading.tuple.TupleEntry in project ambrose by twitter.

In class ScrubFunction, method operate.

public void operate(FlowProcess flowProcess, FunctionCall functionCall) {
    TupleEntry argument = functionCall.getArguments();
    String doc_id = argument.getString(0);
    String token = scrubText(argument.getString(1));
    if (token.length() > 0) {
        Tuple result = new Tuple();
        result.add(doc_id);
        result.add(token);
        functionCall.getOutputCollector().add(result);
    }
}
Also used: TupleEntry (cascading.tuple.TupleEntry), Tuple (cascading.tuple.Tuple)

Aggregations

TupleEntry (cascading.tuple.TupleEntry): 11
OutputCollector (org.apache.hadoop.mapred.OutputCollector): 9
Tuple (cascading.tuple.Tuple): 5
Fields (cascading.tuple.Fields): 2
Put (org.apache.hadoop.hbase.client.Put): 2
ImmutableBytesWritable (org.apache.hadoop.hbase.io.ImmutableBytesWritable): 2