Example 6 with LivyUserException

use of com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException in project kylo by Teradata.

the class SparkLivyRestClient method downloadTransform.

@Nonnull
@Override
public Optional<Response> downloadTransform(@Nonnull SparkShellProcess process, @Nonnull String transformId, @Nonnull String saveId) {
    // 1. get "SaveResult" serialized from Livy
    logger.entry(process, transformId, saveId);
    JerseyRestClient client = sparkLivyProcessManager.getClient(process);
    String script = scriptGenerator.script("getSaveResult", ScalaScriptUtils.scalaStr(saveId));
    Statement statement = submitCode(client, script, process);
    if (statement.getState() == StatementState.running || statement.getState() == StatementState.waiting) {
        statement = pollStatement(client, process, statement.getId());
    } else {
        throw logger.throwing(new LivyUserException("livy.unexpected_error"));
    }
    URI uri = LivyRestModelTransformer.toUri(statement);
    SaveResult result = new SaveResult(new Path(uri));
    // 2. Create a response with data from the filesystem
    if (result.getPath() != null) {
        Optional<Response> response = Optional.of(Response.ok(new ZipStreamingOutput(result.getPath(), fileSystem)).header(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_OCTET_STREAM).header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"" + saveId + ".zip\"").build());
        return logger.exit(response);
    } else {
        return logger.exit(Optional.of(createErrorResponse(Response.Status.NOT_FOUND, "download.notFound")));
    }
}
Also used : Path(org.apache.hadoop.fs.Path) SparkJobResponse(com.thinkbiganalytics.kylo.spark.rest.model.job.SparkJobResponse) TransformResponse(com.thinkbiganalytics.spark.rest.model.TransformResponse) Response(javax.ws.rs.core.Response) SaveResponse(com.thinkbiganalytics.spark.rest.model.SaveResponse) ServerStatusResponse(com.thinkbiganalytics.spark.rest.model.ServerStatusResponse) ZipStreamingOutput(com.thinkbiganalytics.spark.io.ZipStreamingOutput) Statement(com.thinkbiganalytics.kylo.spark.model.Statement) SaveResult(com.thinkbiganalytics.spark.model.SaveResult) LivyUserException(com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException) URI(java.net.URI) JerseyRestClient(com.thinkbiganalytics.rest.JerseyRestClient) Nonnull(javax.annotation.Nonnull)

Example 7 with LivyUserException

use of com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException in project kylo by Teradata.

the class DefaultLivyClient method pollStatement.

@Override
public Statement pollStatement(JerseyRestClient jerseyClient, SparkLivyProcess sparkLivyProcess, Integer stmtId, Long wait) {
    logger.entry(jerseyClient, sparkLivyProcess, stmtId, wait);
    long stopPolling = Long.MAX_VALUE;
    long startMillis = System.currentTimeMillis();
    if (wait != null) {
        // Limit the amount of time we will poll for a statement to complete.
        stopPolling = startMillis + livyProperties.getPollingLimit();
    }
    Statement statement;
    int pollCount = 1;
    do {
        statement = getStatement(jerseyClient, sparkLivyProcess, stmtId);
        if (statement.getState().equals(StatementState.error)) {
            // TODO: what about cancelled? or cancelling?
            logger.error("Unexpected error encountered while processing a statement", new LivyCodeException(statement.toString()));
            throw logger.throwing(new LivyUserException("livy.unexpected_error"));
        }
        if (System.currentTimeMillis() > stopPolling || statement.getState().equals(StatementState.available)) {
            break;
        }
        logger.trace("Statement was not ready, polling now with attempt '{}'", pollCount++);
        // statement not ready, wait for some time...
        try {
            Thread.sleep(livyProperties.getPollingInterval());
        } catch (InterruptedException e) {
            logger.error("Thread interrupted while polling Livy", e);
        }
    } while (true);
    logger.debug("exit DefaultLivyClient poll statement in '{}' millis, after '{}' attempts ", System.currentTimeMillis() - startMillis, pollCount);
    return logger.exit(statement);
}
Also used : Statement(com.thinkbiganalytics.kylo.spark.model.Statement) LivyCodeException(com.thinkbiganalytics.kylo.spark.exceptions.LivyCodeException) LivyUserException(com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException)
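
The loop in pollStatement boils down to "fetch, check, sleep, repeat until a deadline". A minimal, self-contained sketch of that pattern follows. PollingSketch and pollUntil are hypothetical names, not part of Kylo, and the sketch makes two choices that differ from the method above: it measures the deadline with System.nanoTime() rather than System.currentTimeMillis(), and it restores the interrupt flag and stops polling when the thread is interrupted.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Kylo): polls until a predicate holds or a deadline passes. */
final class PollingSketch {

    private PollingSketch() {
    }

    /**
     * Returns the last value fetched. The caller decides what to do if that
     * value is still not "done" once the time limit has elapsed.
     */
    static <T> T pollUntil(Supplier<T> fetch, Predicate<T> done, long limitMillis, long intervalMillis) {
        // nanoTime() is monotonic, so the deadline is unaffected by wall-clock changes.
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(limitMillis);
        T current = fetch.get();
        while (!done.test(current) && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop polling so the caller can react.
                Thread.currentThread().interrupt();
                break;
            }
            current = fetch.get();
        }
        return current;
    }
}
```

A caller would pass something like a lambda that fetches the statement as fetch and a check for the available state as done; those bindings are assumptions about how the sketch would be wired in, not code from the project.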

Example 8 with LivyUserException

use of com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException in project kylo by Teradata.

the class LivyRestModelTransformer method toTransformQueryResultWithSchema.

private static TransformQueryResult toTransformQueryResultWithSchema(TransformResponse transformResponse, StatementOutputResponse sor) {
    logger.entry(sor);
    checkCodeWasWellFormed(sor);
    TransformQueryResult tqr = new TransformQueryResult();
    transformResponse.setResults(tqr);
    tqr.setColumns(Lists.newArrayList());
    JsonNode data = sor.getData();
    if (data != null) {
        JsonNode appJson = data.get("application/json");
        String payload = appJson.asText();
        ArrayNode json;
        try {
            json = (ArrayNode) mapper.readTree(payload);
        } catch (IOException e) {
            logger.error("An unexpected IOException occurred", new LivyDeserializationException("could not deserialize JSON returned from Livy"));
            throw logger.throwing(new LivyUserException("livy.unexpected_error"));
        }
        // end try/catch
        // array contains three objects (dfRows, actualCols, actualRows )
        transformResponse.setActualCols(json.get(1).asInt());
        transformResponse.setActualRows(json.get(2).asInt());
        json = (ArrayNode) json.get(0);
        int numRows = 0;
        Iterator<JsonNode> rowIter = json.elements();
        List<List<Object>> rowData = Lists.newArrayList();
        while (rowIter.hasNext()) {
            JsonNode row = rowIter.next();
            if (numRows++ == 0) {
                String schemaPayload = row.asText();
                ObjectNode schemaObj;
                try {
                    schemaObj = (ObjectNode) mapper.readTree(schemaPayload);
                } catch (IOException e) {
                    logger.error("Unexpected error deserializing results", new LivyDeserializationException("Unable to deserialize dataFrame schema as serialized by Livy"));
                    throw logger.throwing(new LivyUserException("livy.unexpected_error"));
                }
                // end try/catch
                // build column metadata
                logger.debug("build column metadata");
                String type = schemaObj.get("type").asText();
                if (type.equals("struct")) {
                    ArrayNode fields = (ArrayNode) schemaObj.get("fields");
                    Iterator<JsonNode> colObjsIter = fields.elements();
                    int colIdx = 0;
                    while (colObjsIter.hasNext()) {
                        ObjectNode colObj = (ObjectNode) colObjsIter.next();
                        final JsonNode dataType = colObj.get("type");
                        JsonNode metadata = colObj.get("metadata");
                        String name = colObj.get("name").asText();
                        // "true"|"false"
                        String nullable = colObj.get("nullable").asText();
                        QueryResultColumn qrc = new DefaultQueryResultColumn();
                        qrc.setDisplayName(name);
                        qrc.setField(name);
                        // not used, but still expected to be unique
                        qrc.setHiveColumnLabel(name);
                        qrc.setIndex(colIdx++);
                        // dataType is always empty if %json of dataframe directly:: https://www.mail-archive.com/user@livy.incubator.apache.org/msg00262.html
                        qrc.setDataType(convertDataFrameDataType(dataType));
                        qrc.setComment(metadata.asText());
                        tqr.getColumns().add(qrc);
                    }
                }
                // will there be types other than "struct"?
                continue;
            }
            // end schema extraction
            // get row data
            logger.debug("build row data");
            ArrayNode valueRows = (ArrayNode) row;
            Iterator<JsonNode> valuesIter = valueRows.elements();
            while (valuesIter.hasNext()) {
                ArrayNode valueNode = (ArrayNode) valuesIter.next();
                Iterator<JsonNode> valueNodes = valueNode.elements();
                List<Object> newValues = Lists.newArrayListWithCapacity(tqr.getColumns().size());
                while (valueNodes.hasNext()) {
                    JsonNode value = valueNodes.next();
                    // extract values according to how jackson deserialized it
                    if (value.isObject()) {
                        // spark treats an array as a struct with a single field "values" ...
                        // Maps and structs can't contain arrays so
                        ArrayNode valuesArray = (ArrayNode) value.get("values");
                        if (valuesArray != null && valuesArray.isArray()) {
                            Iterator<JsonNode> arrIter = valuesArray.iterator();
                            List<Object> arrVals = Lists.newArrayListWithExpectedSize(valuesArray.size());
                            while (arrIter.hasNext()) {
                                JsonNode valNode = arrIter.next();
                                if (valNode.isNumber()) {
                                    arrVals.add(valNode.numberValue());
                                } else {
                                    arrVals.add(valNode.asText());
                                }
                            // end if
                            }
                            // end while
                            newValues.add(arrVals.toArray());
                        } else {
                            Map<String, Object> result = null;
                            try {
                                result = mapper.convertValue(value, Map.class);
                            } catch (Exception e) {
                                // column value must be a struct or other complex type that we don't handle special..
                                newValues.add(value.toString());
                            }
                            newValues.add(result);
                        }
                    // end if
                    } else if (value.isNumber()) {
                        // easy peasy.. it's just a number
                        newValues.add(value.numberValue());
                    } else if (value.isNull()) {
                        newValues.add(null);
                    } else if (value.isValueNode()) {
                        // value Nodes we just get the raw text..
                        newValues.add(value.asText());
                    } else {
                        // default = treat it as string..
                        newValues.add(value.toString());
                    }
                // end if
                }
                // end while
                rowData.add(newValues);
            }
        // end of valueRows
        }
        // end sor.data
        logger.trace("rowData={}", rowData);
        tqr.setRows(rowData);
    // tqr.setValidationResults(null);
    }
    return logger.exit(tqr);
}
Also used : ObjectNode(com.fasterxml.jackson.databind.node.ObjectNode) DefaultQueryResultColumn(com.thinkbiganalytics.discovery.model.DefaultQueryResultColumn) JsonNode(com.fasterxml.jackson.databind.JsonNode) IOException(java.io.IOException) SparkLivySaveException(com.thinkbiganalytics.kylo.spark.livy.SparkLivySaveException) LivyUserException(com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException) LivyCodeException(com.thinkbiganalytics.kylo.spark.exceptions.LivyCodeException) WebApplicationException(javax.ws.rs.WebApplicationException) LivyDeserializationException(com.thinkbiganalytics.kylo.spark.exceptions.LivyDeserializationException) TransformQueryResult(com.thinkbiganalytics.spark.rest.model.TransformQueryResult) List(java.util.List) ArrayNode(com.fasterxml.jackson.databind.node.ArrayNode) QueryResultColumn(com.thinkbiganalytics.discovery.schema.QueryResultColumn) Map(java.util.Map)
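
One detail in the object branch of toTransformQueryResultWithSchema is worth a closer look: when mapper.convertValue throws, the catch block adds value.toString(), and execution then continues to newValues.add(result) with result still null, so a failed conversion appears to append two entries for a single column. If one entry per column is the intent, the pattern can be sketched as "try the conversion, fall back to the string form, add exactly once". FallbackConversionSketch and addConverted are hypothetical names used only for illustration, and the sketch uses plain java.util types instead of Jackson.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper (not part of Kylo): convert a value, falling back to its string form. */
final class FallbackConversionSketch {

    private FallbackConversionSketch() {
    }

    /** Appends exactly one element to target, whether or not the conversion succeeds. */
    static <V> void addConverted(List<Object> target, V value, Function<V, ?> convert) {
        Object converted;
        try {
            converted = convert.apply(value);
        } catch (RuntimeException e) {
            // Conversion failed: keep the raw string representation instead.
            converted = String.valueOf(value);
        }
        target.add(converted);
    }
}
```

Because the add happens once, after the try/catch, the row keeps the same number of values as there are columns regardless of which path was taken.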

Example 9 with LivyUserException

use of com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException in project kylo by Teradata.

the class SparkLivyRestClient method kyloCatalogTransform.

@Nonnull
public TransformResponse kyloCatalogTransform(@Nonnull final SparkShellProcess process, @Nonnull final KyloCatalogReadRequest request) {
    logger.entry(process, request);
    String script = scriptGenerator.wrappedScript("kyloCatalogTransform", "", "\n", ScalaScriptUtils.toJsonInScalaString(request));
    logger.debug("scala str\n{}", script);
    JerseyRestClient client = sparkLivyProcessManager.getClient(process);
    Statement statement = submitCode(client, script, process);
    if (statement.getState() == StatementState.running || statement.getState() == StatementState.waiting) {
        statement = pollStatement(client, process, statement.getId());
    } else {
        throw logger.throwing(new LivyUserException("livy.unexpected_error"));
    }
    // call with null so a transformId will be generated for this query
    return logger.exit(LivyRestModelTransformer.toTransformResponse(statement, null));
}
Also used : Statement(com.thinkbiganalytics.kylo.spark.model.Statement) LivyUserException(com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException) JerseyRestClient(com.thinkbiganalytics.rest.JerseyRestClient) Nonnull(javax.annotation.Nonnull)

Example 10 with LivyUserException

use of com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException in project kylo by Teradata.

the class SparkLivyRestClient method getDataSources.

@Nonnull
@Override
public DataSources getDataSources(@Nonnull SparkShellProcess process) {
    logger.entry(process);
    JerseyRestClient client = sparkLivyProcessManager.getClient(process);
    String script = scriptGenerator.script("getDataSources");
    Statement statement = submitCode(client, script, process);
    if (statement.getState() == StatementState.running || statement.getState() == StatementState.waiting) {
        statement = pollStatement(client, process, statement.getId());
    } else {
        throw logger.throwing(new LivyUserException("livy.unexpected_error"));
    }
    return logger.exit(LivyRestModelTransformer.toDataSources(statement));
}
Also used : Statement(com.thinkbiganalytics.kylo.spark.model.Statement) LivyUserException(com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException) JerseyRestClient(com.thinkbiganalytics.rest.JerseyRestClient) Nonnull(javax.annotation.Nonnull)
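
The same guard appears in downloadTransform, kyloCatalogTransform and getDataSources: after submitting code, poll if the statement is running or waiting, otherwise throw with the key "livy.unexpected_error". A minimal sketch of factoring that guard into one place follows. StatementGuardSketch, its State enum and UnexpectedStateException are hypothetical stand-ins for the Kylo types so that the example compiles on its own; the behavior, including throwing for every state other than running or waiting, mirrors the snippets above.

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Hypothetical helper (not part of Kylo) that centralizes the "poll if pending, else fail" guard. */
final class StatementGuardSketch {

    /** Stand-in for the statement states referenced in the examples. */
    enum State { waiting, running, available, error }

    /** Stand-in for LivyUserException; carries a message key. */
    static final class UnexpectedStateException extends RuntimeException {
        UnexpectedStateException(String messageKey) {
            super(messageKey);
        }
    }

    private StatementGuardSketch() {
    }

    /**
     * Polls the submitted statement when it is still pending; any other state
     * is treated as unexpected, as in the original methods.
     */
    static <S> S pollIfPending(S submitted, Function<S, State> stateOf, UnaryOperator<S> poll) {
        State state = stateOf.apply(submitted);
        if (state == State.running || state == State.waiting) {
            return poll.apply(submitted);
        }
        throw new UnexpectedStateException("livy.unexpected_error");
    }
}
```

With such a helper, each of the three methods would reduce to submitting the script, calling the guard, and transforming the result, which keeps the state check from drifting between call sites.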

Aggregations

LivyUserException (com.thinkbiganalytics.kylo.spark.exceptions.LivyUserException) 13
JerseyRestClient (com.thinkbiganalytics.rest.JerseyRestClient) 7
Statement (com.thinkbiganalytics.kylo.spark.model.Statement) 6
Nonnull (javax.annotation.Nonnull) 5
JsonNode (com.fasterxml.jackson.databind.JsonNode) 4
LivyCodeException (com.thinkbiganalytics.kylo.spark.exceptions.LivyCodeException) 4
LivyDeserializationException (com.thinkbiganalytics.kylo.spark.exceptions.LivyDeserializationException) 4
ArrayNode (com.fasterxml.jackson.databind.node.ArrayNode) 3
ObjectNode (com.fasterxml.jackson.databind.node.ObjectNode) 3
Session (com.thinkbiganalytics.kylo.spark.model.Session) 3
TransformResponse (com.thinkbiganalytics.spark.rest.model.TransformResponse) 3
IOException (java.io.IOException) 3
DefaultQueryResultColumn (com.thinkbiganalytics.discovery.model.DefaultQueryResultColumn) 2
QueryResultColumn (com.thinkbiganalytics.discovery.schema.QueryResultColumn) 2
LivyServerNotReachableException (com.thinkbiganalytics.kylo.spark.exceptions.LivyServerNotReachableException) 2
SparkLivySaveException (com.thinkbiganalytics.kylo.spark.livy.SparkLivySaveException) 2
SparkJobResponse (com.thinkbiganalytics.kylo.spark.rest.model.job.SparkJobResponse) 2
SaveResponse (com.thinkbiganalytics.spark.rest.model.SaveResponse) 2
ServerStatusResponse (com.thinkbiganalytics.spark.rest.model.ServerStatusResponse) 2
TransformQueryResult (com.thinkbiganalytics.spark.rest.model.TransformQueryResult) 2