Example 11 with FrontendException

use of org.apache.pig.impl.logicalLayer.FrontendException in project sketches-pig by DataSketches.

the class ReservoirSampling method outputSchema.

@Override
public Schema outputSchema(final Schema input) {
    if (input != null && input.size() > 0) {
        try {
            Schema source = input;
            // if we have a bag, grab one level down to get a tuple
            if (source.size() == 1 && source.getField(0).type == DataType.BAG) {
                source = source.getField(0).schema;
            }
            final Schema recordSchema = new Schema();
            recordSchema.add(new Schema.FieldSchema(N_ALIAS, DataType.LONG));
            recordSchema.add(new Schema.FieldSchema(K_ALIAS, DataType.INTEGER));
            // this should add a bag to the output
            recordSchema.add(new Schema.FieldSchema(SAMPLES_ALIAS, source, DataType.BAG));
            return new Schema(new Schema.FieldSchema(getSchemaName(this.getClass().getName().toLowerCase(), source), recordSchema, DataType.TUPLE));
        } catch (final FrontendException e) {
            throw new RuntimeException(e);
        }
    }
    return null;
}
Also used : Schema(org.apache.pig.impl.logicalLayer.schema.Schema) FrontendException(org.apache.pig.impl.logicalLayer.FrontendException)
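The catch block in this example rethrows the checked FrontendException as an unchecked RuntimeException, since outputSchema as shown declares no checked exceptions. The pattern can be sketched in isolation as follows; `LookupException`, `fieldAt`, and `firstField` are hypothetical stand-ins for FrontendException, Schema.getField, and the calling method, not Pig APIs.

```java
import java.util.List;

class WrapCheckedSketch {
    // Hypothetical stand-in for a checked exception such as FrontendException.
    static class LookupException extends Exception {
        LookupException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for a lookup that can fail with a checked exception.
    static String fieldAt(List<String> fields, int index) throws LookupException {
        if (index < 0 || index >= fields.size()) {
            throw new LookupException("no field at index " + index);
        }
        return fields.get(index);
    }

    // Declares no checked exceptions, so a failure is rethrown unchecked,
    // with the original exception preserved as the cause.
    static String firstField(List<String> fields) {
        try {
            return fieldAt(fields, 0);
        } catch (LookupException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstField(List.of("n", "k", "samples")));
        try {
            firstField(List.of());
        } catch (RuntimeException e) {
            // The checked exception is still reachable through getCause().
            System.out.println(e.getCause().getMessage());
        }
    }
}
```

Passing the original exception to the RuntimeException constructor keeps its message and stack trace available to callers and loggers.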

Example 12 with FrontendException

use of org.apache.pig.impl.logicalLayer.FrontendException in project parquet-mr by apache.

the class TupleReadSupport method getPigSchemaFromMultipleFiles.

/**
 * @param fileSchema the parquet schema from the file
 * @param keyValueMetaData the extra meta data from the files
 * @return the pig schema according to the file
 */
static Schema getPigSchemaFromMultipleFiles(MessageType fileSchema, Map<String, Set<String>> keyValueMetaData) {
    Set<String> pigSchemas = PigMetaData.getPigSchemas(keyValueMetaData);
    if (pigSchemas == null) {
        return pigSchemaConverter.convert(fileSchema);
    }
    Schema mergedPigSchema = null;
    for (String pigSchemaString : pigSchemas) {
        try {
            mergedPigSchema = union(mergedPigSchema, parsePigSchema(pigSchemaString));
        } catch (FrontendException e) {
            throw new ParquetDecodingException("can not merge " + pigSchemaString + " into " + mergedPigSchema, e);
        }
    }
    return mergedPigSchema;
}
Also used : ParquetDecodingException(org.apache.parquet.io.ParquetDecodingException) PigSchemaConverter.parsePigSchema(org.apache.parquet.pig.PigSchemaConverter.parsePigSchema) Schema(org.apache.pig.impl.logicalLayer.schema.Schema) FieldSchema(org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema) FrontendException(org.apache.pig.impl.logicalLayer.FrontendException)
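The loop in this example folds every schema string into a single accumulator that starts out as null. A minimal sketch of that fold is given below, assuming a simplified `union` over sets of field names; the `union` and `mergeAll` helpers here are hypothetical and do not reproduce the merge rules of the real Pig or Parquet code.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MergeFoldSketch {
    // Hypothetical stand-in for a schema union.
    // A null accumulator means that nothing has been merged yet.
    static Set<String> union(Set<String> merged, Set<String> next) {
        if (merged == null) {
            return new LinkedHashSet<>(next);
        }
        Set<String> result = new LinkedHashSet<>(merged);
        result.addAll(next);
        return result;
    }

    // Folds all inputs into one result; returns null when there are no inputs,
    // mirroring the null-initialized accumulator in the example above.
    static Set<String> mergeAll(List<Set<String>> schemas) {
        Set<String> merged = null;
        for (Set<String> schema : schemas) {
            merged = union(merged, schema);
        }
        return merged;
    }

    public static void main(String[] args) {
        Set<String> first = new LinkedHashSet<>(List.of("a", "b"));
        Set<String> second = new LinkedHashSet<>(List.of("b", "c"));
        System.out.println(mergeAll(List.of(first, second)));
        System.out.println(mergeAll(List.of()));
    }
}
```

Because the accumulator starts as null, an empty input yields null rather than an empty result, so callers of a method written this way may need a null check.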

Example 13 with FrontendException

use of org.apache.pig.impl.logicalLayer.FrontendException in project parquet-mr by apache.

the class PigSchemaConverter method convertMap.

/**
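 * @param alias the name of the resulting group
 * @param fieldSchema the Pig map field schema to convert
 * @return an optional group containing one repeated group field (key, value)
 * @throws SchemaConversionException if the map schema does not contain exactly one field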
 */
private GroupType convertMap(String alias, FieldSchema fieldSchema) {
    Schema innerSchema = fieldSchema.schema;
    if (innerSchema == null || innerSchema.size() != 1) {
        throw new SchemaConversionException("Invalid map Schema, schema should contain exactly one field: " + fieldSchema);
    }
    FieldSchema innerField = null;
    try {
        innerField = innerSchema.getField(0);
    } catch (FrontendException fe) {
        throw new SchemaConversionException("Invalid map schema, cannot infer innerschema: ", fe);
    }
    Type convertedValue = convertWithName(innerField, "value");
    return ConversionPatterns.stringKeyMapType(Repetition.OPTIONAL, alias, name(innerField.alias, "map"), convertedValue);
}
Also used : PrimitiveType(org.apache.parquet.schema.PrimitiveType) DataType(org.apache.pig.data.DataType) OriginalType(org.apache.parquet.schema.OriginalType) GroupType(org.apache.parquet.schema.GroupType) MessageType(org.apache.parquet.schema.MessageType) Type(org.apache.parquet.schema.Type) Schema(org.apache.pig.impl.logicalLayer.schema.Schema) FieldSchema(org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema) FrontendException(org.apache.pig.impl.logicalLayer.FrontendException)

Example 14 with FrontendException

use of org.apache.pig.impl.logicalLayer.FrontendException in project parquet-mr by apache.

the class ParquetLoader method getSchemaFromRequiredFieldList.

private Schema getSchemaFromRequiredFieldList(Schema schema, List<RequiredField> fieldList) throws FrontendException {
    Schema s = new Schema();
    for (RequiredField rf : fieldList) {
        FieldSchema f;
        try {
            f = schema.getField(rf.getAlias()).clone();
        } catch (CloneNotSupportedException e) {
            throw new FrontendException("Clone not supported for the fieldschema", e);
        }
        if (rf.getSubFields() == null) {
            s.add(f);
        } else {
            Schema innerSchema = getSchemaFromRequiredFieldList(f.schema, rf.getSubFields());
            if (innerSchema == null) {
                return null;
            } else {
                f.schema = innerSchema;
                s.add(f);
            }
        }
    }
    return s;
}
Also used : Schema(org.apache.pig.impl.logicalLayer.schema.Schema) PigSchemaConverter.parsePigSchema(org.apache.parquet.pig.PigSchemaConverter.parsePigSchema) ResourceSchema(org.apache.pig.ResourceSchema) FieldSchema(org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema) FrontendException(org.apache.pig.impl.logicalLayer.FrontendException)

Example 15 with FrontendException

use of org.apache.pig.impl.logicalLayer.FrontendException in project zeppelin by apache.

the class PigQueryInterpreter method interpret.

@Override
public InterpreterResult interpret(String st, InterpreterContext context) {
    // '-' is invalid for pig alias
    String alias = "paragraph_" + context.getParagraphId().replace("-", "_");
    String[] lines = st.split("\n");
    List<String> queries = new ArrayList<>();
    for (int i = 0; i < lines.length; ++i) {
        if (i == lines.length - 1) {
            lines[i] = alias + " = " + lines[i];
        }
        queries.add(lines[i]);
    }
    StringBuilder resultBuilder = new StringBuilder("%table ");
    try {
        pigServer.setJobName(createJobName(st, context));
        File tmpScriptFile = PigUtils.createTempPigScript(queries);
        // each thread should have its own ScriptState & PigStats
        ScriptState.start(pigServer.getPigContext().getExecutionEngine().instantiateScriptState());
        // reset PigStats; otherwise you may get the PigStats of the last job run in the same thread,
        // because PigStats is a ThreadLocal variable
        PigStats.start(pigServer.getPigContext().getExecutionEngine().instantiatePigStats());
        PigScriptListener scriptListener = new PigScriptListener();
        ScriptState.get().registerListener(scriptListener);
        listenerMap.put(context.getParagraphId(), scriptListener);
        pigServer.registerScript(tmpScriptFile.getAbsolutePath());
        Schema schema = pigServer.dumpSchema(alias);
        boolean schemaKnown = (schema != null);
        if (schemaKnown) {
            for (int i = 0; i < schema.size(); ++i) {
                Schema.FieldSchema field = schema.getField(i);
                resultBuilder.append(field.alias != null ? field.alias : "col_" + i);
                if (i != schema.size() - 1) {
                    resultBuilder.append("\t");
                }
            }
            resultBuilder.append("\n");
        }
        Iterator<Tuple> iter = pigServer.openIterator(alias);
        boolean firstRow = true;
        int index = 0;
        while (iter.hasNext() && index < maxResult) {
            index++;
            Tuple tuple = iter.next();
            if (firstRow && !schemaKnown) {
                for (int i = 0; i < tuple.size(); ++i) {
                    resultBuilder.append("c_" + i + "\t");
                }
                resultBuilder.append("\n");
                firstRow = false;
            }
            resultBuilder.append(StringUtils.join(tuple.iterator(), "\t"));
            resultBuilder.append("\n");
        }
        if (index >= maxResult && iter.hasNext()) {
            resultBuilder.append("\n");
            resultBuilder.append(ResultMessages.getExceedsLimitRowsMessage(maxResult, MAX_RESULTS));
        }
    } catch (IOException e) {
        // 4. Other errors.
        if (e instanceof FrontendException) {
            FrontendException fe = (FrontendException) e;
            if (!fe.getMessage().contains("Backend error :")) {
                LOGGER.error("Fail to run pig query.", e);
                return new InterpreterResult(Code.ERROR, ExceptionUtils.getStackTrace(e));
            }
        }
        if (e.getCause() instanceof ParseException) {
            return new InterpreterResult(Code.ERROR, e.getMessage());
        }
        PigStats stats = PigStats.get();
        if (stats != null) {
            String errorMsg = stats.getDisplayString();
            if (errorMsg != null) {
                return new InterpreterResult(Code.ERROR, errorMsg);
            }
        }
        LOGGER.error("Fail to run pig query.", e);
        return new InterpreterResult(Code.ERROR, ExceptionUtils.getStackTrace(e));
    } finally {
        listenerMap.remove(context.getParagraphId());
    }
    return new InterpreterResult(Code.SUCCESS, resultBuilder.toString());
}
Also used : PigStats(org.apache.pig.tools.pigstats.PigStats) Schema(org.apache.pig.impl.logicalLayer.schema.Schema) ArrayList(java.util.ArrayList) InterpreterResult(org.apache.zeppelin.interpreter.InterpreterResult) IOException(java.io.IOException) ParseException(org.apache.pig.tools.pigscript.parser.ParseException) File(java.io.File) Tuple(org.apache.pig.data.Tuple) FrontendException(org.apache.pig.impl.logicalLayer.FrontendException)
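Two small string-handling steps in this example are easy to try on their own: deriving a valid alias from the paragraph ID by replacing '-' with '_', and building the tab-separated header row with a "col_" + index fallback for fields that have no alias. The sketch below reproduces only those two steps with plain Java types; the class name, the helper names, and the sample paragraph ID are hypothetical, and a list of strings stands in for the Pig Schema.

```java
import java.util.Arrays;
import java.util.List;

class HeaderSketch {
    // '-' is invalid in a Pig alias, so it is replaced with '_'.
    static String aliasFor(String paragraphId) {
        return "paragraph_" + paragraphId.replace("-", "_");
    }

    // Builds a tab-separated header row; a null alias falls back to "col_" + index.
    static String headerRow(List<String> fieldAliases) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < fieldAliases.size(); ++i) {
            String fieldAlias = fieldAliases.get(i);
            builder.append(fieldAlias != null ? fieldAlias : "col_" + i);
            if (i != fieldAliases.size() - 1) {
                builder.append("\t");
            }
        }
        builder.append("\n");
        return builder.toString();
    }

    public static void main(String[] args) {
        // The paragraph ID below is a made-up sample value.
        System.out.println(aliasFor("20180101-123456_1"));
        // Arrays.asList is used because it permits null elements.
        System.out.print(headerRow(Arrays.asList("id", null, "name")));
    }
}
```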

Aggregations

FrontendException (org.apache.pig.impl.logicalLayer.FrontendException): 36
Schema (org.apache.pig.impl.logicalLayer.schema.Schema): 27
FieldSchema (org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema): 11
ArrayList (java.util.ArrayList): 6
HCatFieldSchema (org.apache.hive.hcatalog.data.schema.HCatFieldSchema): 5
HCatSchema (org.apache.hive.hcatalog.data.schema.HCatSchema): 5
ResourceSchema (org.apache.pig.ResourceSchema): 4
IOException (java.io.IOException): 3
OriginalType (org.apache.parquet.schema.OriginalType): 3
PigServer (org.apache.pig.PigServer): 3
DataType (org.apache.pig.data.DataType): 3
Tuple (org.apache.pig.data.Tuple): 3
File (java.io.File): 2
List (java.util.List): 2
HCatException (org.apache.hive.hcatalog.common.HCatException): 2
PigSchemaConverter.parsePigSchema (org.apache.parquet.pig.PigSchemaConverter.parsePigSchema): 2
GroupType (org.apache.parquet.schema.GroupType): 2
MessageType (org.apache.parquet.schema.MessageType): 2
PrimitiveType (org.apache.parquet.schema.PrimitiveType): 2
Type (org.apache.parquet.schema.Type): 2