Example 1 with DefaultDataBag

Use of org.apache.pig.data.DefaultDataBag in project hive by apache.

Class PigHCatUtil, method transformToBag.

// Converts an HCatalog array value (a Java List) into a Pig DataBag.
private static DataBag transformToBag(List<?> list, HCatFieldSchema hfs) throws Exception {
    if (list == null) {
        return null;
    }
    // schema of the array's elements: the first field of the array element schema
    HCatFieldSchema elementSubFieldSchema = hfs.getArrayElementSchema().getFields().get(0);
    DataBag db = new DefaultDataBag();
    for (Object o : list) {
        Tuple tuple;
        if (elementSubFieldSchema.getType() == Type.STRUCT) {
            // a struct element maps directly onto a tuple
            tuple = transformToTuple((List<?>) o, elementSubFieldSchema);
        } else {
            // bags always contain tuples, so wrap any other element in a one-field tuple
            tuple = tupFac.newTuple(extractPigObject(o, elementSubFieldSchema));
        }
        db.add(tuple);
    }
    return db;
}
Also used: DataBag(org.apache.pig.data.DataBag) DefaultDataBag(org.apache.pig.data.DefaultDataBag) ArrayList(java.util.ArrayList) List(java.util.List) Tuple(org.apache.pig.data.Tuple) HCatFieldSchema(org.apache.hive.hcatalog.data.schema.HCatFieldSchema)
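
The wrapping rule in transformToBag (struct elements become tuples directly, anything else is wrapped in a one-field tuple) can be tried without the Pig or HCatalog jars. The following is only a sketch under stated assumptions: BagSketch, tuple and toBag are hypothetical names, plain java.util lists stand in for Tuple and DataBag, and the boolean isStruct replaces the Type.STRUCT schema check. It is not the PigHCatUtil code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BagSketch {

    /** Hypothetical stand-in for a Pig Tuple: an ordered list of fields. */
    public static List<Object> tuple(Object... fields) {
        return new ArrayList<Object>(Arrays.asList(fields));
    }

    /**
     * Hypothetical stand-in for transformToBag: turns a list of elements
     * into a "bag", modelled here as a list of tuples.
     * isStruct replaces the elementSubFieldSchema.getType() == Type.STRUCT check.
     */
    public static List<List<Object>> toBag(List<?> elements, boolean isStruct) {
        if (elements == null) {
            return null; // same null handling as the original method
        }
        List<List<Object>> bag = new ArrayList<>();
        for (Object o : elements) {
            List<Object> t;
            if (isStruct) {
                // a struct element is already a list of fields: copy it into a tuple
                t = new ArrayList<Object>((List<?>) o);
            } else {
                // bags always contain tuples: wrap the value in a one-field tuple
                t = tuple(o);
            }
            bag.add(t);
        }
        return bag;
    }

    public static void main(String[] args) {
        // prints [[1], [2], [3]]
        System.out.println(toBag(Arrays.asList(1, 2, 3), false));
        // prints [[a, 1], [b, 2]]
        System.out.println(toBag(
                Arrays.asList(Arrays.asList("a", 1), Arrays.asList("b", 2)), true));
    }
}
```

In the real method the same decision is driven by the HCatFieldSchema of the array element, and the tuples are created through a TupleFactory rather than built by hand.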

Example 2 with DefaultDataBag

Use of org.apache.pig.data.DefaultDataBag in project pygmalion by jeromatron.

Class RangeBasedStringConcatTest, method testRange.

@Test
public void testRange() throws Exception {
    RangeBasedStringConcat rbsc = new RangeBasedStringConcat("0,1", " ");
    Tuple input = new DefaultTuple();
    for (String field : fields) {
        input.append(field);
    }
    String result = rbsc.exec(input);
    assertEquals("a b", result);
    rbsc = new RangeBasedStringConcat("2,6", " ");
    result = rbsc.exec(input);
    assertEquals("c g", result);
    //test out of range
    rbsc = new RangeBasedStringConcat("0,9,1000", " ");
    result = rbsc.exec(input);
    assertEquals("a", result);
    Tuple innerTuple = new DefaultTuple();
    innerTuple.append("j");
    innerTuple.append("k");
    input.append(innerTuple);
    rbsc = new RangeBasedStringConcat("0,9", " ");
    result = rbsc.exec(input);
    assertEquals("a j k", result);
    DataBag db = new DefaultDataBag();
    Tuple dbTuple = new DefaultTuple();
    dbTuple.append("l");
    dbTuple.append("m");
    db.add(dbTuple);
    innerTuple.append(db);
    rbsc = new RangeBasedStringConcat("0,9,10", " ");
    result = rbsc.exec(input);
    assertEquals("a j k l m", result);
}
Also used: DataBag(org.apache.pig.data.DataBag) DefaultDataBag(org.apache.pig.data.DefaultDataBag) DefaultTuple(org.apache.pig.data.DefaultTuple) RangeBasedStringConcat(org.pygmalion.udf.RangeBasedStringConcat) Tuple(org.apache.pig.data.Tuple) Test(org.junit.Test)

Example 3 with DefaultDataBag

Use of org.apache.pig.data.DefaultDataBag in project pygmalion by jeromatron.

Class RangeBasedStringConcatTest, method testAllConcat.

@Test
public void testAllConcat() throws Exception {
    RangeBasedStringConcat rbsc = new RangeBasedStringConcat("ALL", " ");
    Tuple input = new DefaultTuple();
    for (int i = 0; i < fields.length; i++) {
        input.append(fields[i]);
    }
    String result = rbsc.exec(input);
    assertEquals("a b c d e f g h i", result);
    Tuple innerTuple = new DefaultTuple();
    innerTuple.append("j");
    innerTuple.append("k");
    input.append(innerTuple);
    result = rbsc.exec(input);
    assertEquals("a b c d e f g h i j k", result);
    DataBag db = new DefaultDataBag();
    Tuple dbTuple = new DefaultTuple();
    dbTuple.append("l");
    dbTuple.append("m");
    db.add(dbTuple);
    innerTuple.append(db);
    result = rbsc.exec(input);
    assertEquals("a b c d e f g h i j k l m", result);
}
Also used: DataBag(org.apache.pig.data.DataBag) DefaultDataBag(org.apache.pig.data.DefaultDataBag) DefaultTuple(org.apache.pig.data.DefaultTuple) RangeBasedStringConcat(org.pygmalion.udf.RangeBasedStringConcat) Tuple(org.apache.pig.data.Tuple) Test(org.junit.Test)
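
The assertions in testAllConcat suggest that "ALL" mode joins every leaf value in order, descending into nested tuples and bags. One plausible way to get that behaviour is a depth-first walk, sketched below. This is an assumption drawn from the test output, not the pygmalion implementation: FlattenConcatSketch and concatAll are hypothetical names, and plain java.util lists stand in for Tuple and DataBag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlattenConcatSketch {

    /** Appends every leaf value under node to out, depth-first and in order. */
    private static void collect(Object node, List<String> out) {
        if (node instanceof Iterable) {
            // nested container (stand-in for a tuple or bag): recurse into it
            for (Object child : (Iterable<?>) node) {
                collect(child, out);
            }
        } else if (node != null) {
            // leaf value: keep its string form; null leaves are skipped
            out.add(node.toString());
        }
    }

    /** Joins all leaf values of the nested structure with the delimiter. */
    public static String concatAll(Iterable<?> tuple, String delimiter) {
        List<String> leaves = new ArrayList<>();
        collect(tuple, leaves);
        return String.join(delimiter, leaves);
    }

    public static void main(String[] args) {
        // bag holding one tuple ("l", "m")
        List<Object> bag = new ArrayList<>();
        bag.add(Arrays.asList("l", "m"));

        // inner tuple ("j", "k", bag)
        List<Object> inner = new ArrayList<Object>(Arrays.asList("j", "k"));
        inner.add(bag);

        // outer tuple ("a", "b", "c", inner)
        List<Object> input = new ArrayList<Object>(Arrays.asList("a", "b", "c"));
        input.add(inner);

        // prints a b c j k l m
        System.out.println(concatAll(input, " "));
    }
}
```

With real Pig types the walk would have to treat Tuple and DataBag separately, since only the stand-in lists used here are guaranteed to be traversable through a single Iterable check.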

Aggregations

DataBag (org.apache.pig.data.DataBag): 3
DefaultDataBag (org.apache.pig.data.DefaultDataBag): 3
Tuple (org.apache.pig.data.Tuple): 3
DefaultTuple (org.apache.pig.data.DefaultTuple): 2
Test (org.junit.Test): 2
RangeBasedStringConcat (org.pygmalion.udf.RangeBasedStringConcat): 2
ArrayList (java.util.ArrayList): 1
List (java.util.List): 1
HCatFieldSchema (org.apache.hive.hcatalog.data.schema.HCatFieldSchema): 1