
Example 1 with SchemaParserDescriptor

Use of com.thinkbiganalytics.discovery.model.SchemaParserDescriptor in project kylo by Teradata.

From the class SchemaDiscoveryRestController, method getFileParsers:

@GET
@Path("/file-parsers")
@Produces(MediaType.APPLICATION_JSON)
@ApiOperation("Gets the available file parsers.")
@ApiResponses(@ApiResponse(code = 200, message = "Returns the file parsers.", response = SchemaParserDescriptor.class, responseContainer = "List"))
public Response getFileParsers() {
    List<FileSchemaParser> parsers = FileParserFactory.instance().listSchemaParsers();
    List<SchemaParserDescriptor> descriptors = new ArrayList<>();
    SchemaParserAnnotationTransformer transformer = new SchemaParserAnnotationTransformer();
    for (FileSchemaParser parser : parsers) {
        SchemaParserDescriptor descriptor = transformer.toUIModel(parser);
        descriptors.add(descriptor);
    }
    return Response.ok(descriptors).build();
}
Also used: ArrayList (java.util.ArrayList), FileSchemaParser (com.thinkbiganalytics.discovery.parser.FileSchemaParser), SchemaParserDescriptor (com.thinkbiganalytics.discovery.model.SchemaParserDescriptor), Path (javax.ws.rs.Path), Produces (javax.ws.rs.Produces), GET (javax.ws.rs.GET), ApiOperation (io.swagger.annotations.ApiOperation), ApiResponses (io.swagger.annotations.ApiResponses)
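The transform-and-collect loop in getFileParsers can also be written as a stream pipeline. A minimal sketch, using hypothetical stand-in types (the real FileSchemaParser, SchemaParserDescriptor, and SchemaParserAnnotationTransformer live in the kylo codebase):

```java
import java.util.List;
import java.util.stream.Collectors;

public class TransformLoopSketch {
    // Hypothetical stand-ins for kylo's FileSchemaParser and SchemaParserDescriptor.
    record Parser(String name) {}
    record Descriptor(String name) {}

    // Stand-in for SchemaParserAnnotationTransformer.toUIModel.
    static Descriptor toUiModel(Parser p) {
        return new Descriptor(p.name());
    }

    public static void main(String[] args) {
        List<Parser> parsers = List.of(new Parser("CSV"), new Parser("JSON"));
        // Same transform-and-collect shape as the for-loop in getFileParsers.
        List<Descriptor> descriptors = parsers.stream()
                .map(TransformLoopSketch::toUiModel)
                .collect(Collectors.toList());
        System.out.println(descriptors.size()); // prints 2
    }
}
```

Either form is fine; the explicit loop in the controller is arguably clearer when the transformer can throw.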

Example 2 with SchemaParserDescriptor

Use of com.thinkbiganalytics.discovery.model.SchemaParserDescriptor in project kylo by Teradata.

From the class SchemaDiscoveryRestController, method uploadFileSpark:

/**
 * Generates the Spark script that can parse the uploaded file using the supplied "parserDescriptor".
 *
 * @param parserDescriptor  metadata describing how the file should be parsed
 * @param dataFrameVariable the name of the dataframe variable in the generated Spark code
 * @param limit             the maximum number of rows the script should output
 * @param fileInputStream   the file contents
 * @param fileMetaData      metadata about the file
 * @return an object containing the name of the file on disk and the generated Spark script
 */
@POST
@Path("/spark/sample-file")
@Consumes(MediaType.MULTIPART_FORM_DATA)
@Produces(MediaType.APPLICATION_JSON)
@ApiOperation("Determines the schema of the provided file.")
@ApiResponses({
    @ApiResponse(code = 200, message = "Returns the spark script that parses the sample file.", response = Schema.class),
    @ApiResponse(code = 500, message = "The schema could not be determined.", response = RestResponseStatus.class)
})
public Response uploadFileSpark(@FormDataParam("parser") String parserDescriptor,
                                @FormDataParam("dataFrameVariable") @DefaultValue("df") String dataFrameVariable,
                                @FormDataParam("limit") @DefaultValue("-1") Integer limit,
                                @FormDataParam("file") InputStream fileInputStream,
                                @FormDataParam("file") FormDataContentDisposition fileMetaData) throws Exception {
    SampleFileSparkScript sampleFileSparkScript = null;
    SchemaParserAnnotationTransformer transformer = new SchemaParserAnnotationTransformer();
    try {
        SchemaParserDescriptor descriptor = ObjectMapperSerializer.deserialize(parserDescriptor, SchemaParserDescriptor.class);
        FileSchemaParser p = transformer.fromUiModel(descriptor);
        SparkFileSchemaParser sparkFileSchemaParser = (SparkFileSchemaParser) p;
        sparkFileSchemaParser.setDataFrameVariable(dataFrameVariable);
        sparkFileSchemaParser.setLimit(limit);
        sampleFileSparkScript = sparkFileSchemaParser.getSparkScript(fileInputStream);
    } catch (IOException e) {
        throw new WebApplicationException(e.getMessage());
    } catch (PolicyTransformException e) {
        log.warn("Failed to convert parser", e);
        throw new InternalServerErrorException(STRINGS.getString("discovery.transformError"), e);
    }
    if (sampleFileSparkScript == null) {
        log.warn("Failed to convert parser");
        throw new InternalServerErrorException(STRINGS.getString("discovery.transformError"));
    }
    return Response.ok(sampleFileSparkScript).build();
}
Also used: SampleFileSparkScript (com.thinkbiganalytics.discovery.parser.SampleFileSparkScript), SparkFileSchemaParser (com.thinkbiganalytics.discovery.parser.SparkFileSchemaParser), WebApplicationException (javax.ws.rs.WebApplicationException), InternalServerErrorException (javax.ws.rs.InternalServerErrorException), IOException (java.io.IOException), PolicyTransformException (com.thinkbiganalytics.policy.PolicyTransformException), FileSchemaParser (com.thinkbiganalytics.discovery.parser.FileSchemaParser), SchemaParserDescriptor (com.thinkbiganalytics.discovery.model.SchemaParserDescriptor), Path (javax.ws.rs.Path), POST (javax.ws.rs.POST), Consumes (javax.ws.rs.Consumes), Produces (javax.ws.rs.Produces), ApiOperation (io.swagger.annotations.ApiOperation), ApiResponses (io.swagger.annotations.ApiResponses)
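The multipart "parser" form field carries a JSON-serialized SchemaParserDescriptor, which ObjectMapperSerializer.deserialize turns back into the model. A minimal sketch of what a client might put in that field, using the field names that appear in the mock descriptor of Example 4 (objectClassType, properties with objectProperty/value); the class name below is hypothetical, and a real client would serialize a full SchemaParserDescriptor with a JSON library instead of string formatting:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class ParserFormFieldSketch {
    /**
     * Builds a minimal JSON body for the multipart "parser" form field.
     * This hand-rolls the JSON purely for illustration; it does no escaping.
     */
    static String parserJson(String objectClassType, Map<String, String> props) {
        String propEntries = props.entrySet().stream()
                .map(e -> String.format("{\"objectProperty\":\"%s\",\"value\":\"%s\"}",
                        e.getKey(), e.getValue()))
                .collect(Collectors.joining(","));
        return String.format("{\"objectClassType\":\"%s\",\"properties\":[%s]}",
                objectClassType, propEntries);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the property order stable in the output.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("headerRow", "false");
        System.out.println(parserJson("com.example.MockSchemaParser", props));
    }
}
```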

Example 3 with SchemaParserDescriptor

Use of com.thinkbiganalytics.discovery.model.SchemaParserDescriptor in project kylo by Teradata.

From the class SchemaDiscoveryRestController, method uploadFile:

@POST
@Path("/hive/sample-file")
@Consumes(MediaType.MULTIPART_FORM_DATA)
@Produces(MediaType.APPLICATION_JSON)
@ApiOperation("Determines the schema of the provided file.")
@ApiResponses({
    @ApiResponse(code = 200, message = "Returns the schema.", response = Schema.class),
    @ApiResponse(code = 500, message = "The schema could not be determined.", response = RestResponseStatus.class)
})
public Response uploadFile(@FormDataParam("parser") String parserDescriptor,
                           @FormDataParam("file") InputStream fileInputStream,
                           @FormDataParam("file") FormDataContentDisposition fileMetaData) throws Exception {
    Schema schema;
    SchemaParserAnnotationTransformer transformer = new SchemaParserAnnotationTransformer();
    try {
        SchemaParserDescriptor descriptor = ObjectMapperSerializer.deserialize(parserDescriptor, SchemaParserDescriptor.class);
        FileSchemaParser p = transformer.fromUiModel(descriptor);
        // TODO: Detect charset
        schema = p.parse(fileInputStream, Charset.defaultCharset(), TableSchemaType.HIVE);
    } catch (IOException e) {
        throw new WebApplicationException(e.getMessage());
    } catch (PolicyTransformException e) {
        log.warn("Failed to convert parser", e);
        throw new InternalServerErrorException(STRINGS.getString("discovery.transformError"), e);
    }
    return Response.ok(schema).build();
}
Also used: WebApplicationException (javax.ws.rs.WebApplicationException), Schema (com.thinkbiganalytics.discovery.schema.Schema), InternalServerErrorException (javax.ws.rs.InternalServerErrorException), IOException (java.io.IOException), PolicyTransformException (com.thinkbiganalytics.policy.PolicyTransformException), FileSchemaParser (com.thinkbiganalytics.discovery.parser.FileSchemaParser), SchemaParserDescriptor (com.thinkbiganalytics.discovery.model.SchemaParserDescriptor), Path (javax.ws.rs.Path), POST (javax.ws.rs.POST), Consumes (javax.ws.rs.Consumes), Produces (javax.ws.rs.Produces), ApiOperation (io.swagger.annotations.ApiOperation), ApiResponses (io.swagger.annotations.ApiResponses)
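The "// TODO: Detect charset" comment above marks where the controller falls back to Charset.defaultCharset(). One common lightweight approach, sketched here as an assumption rather than kylo's actual plan, is to sniff the byte-order mark before parsing; real-world detection of BOM-less files needs a statistical library such as ICU:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetSniffer {
    /**
     * Very small BOM-based sniff over the first bytes of a file.
     * Falls back to UTF-8 when no BOM is present (an assumption).
     */
    static Charset sniff(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;        // UTF-8 BOM: EF BB BF
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;     // UTF-16LE BOM: FF FE
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;     // UTF-16BE BOM: FE FF
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(sniff(new byte[]{(byte) 0xFE, (byte) 0xFF, 0x00, 0x41}));
        // prints UTF-16BE
    }
}
```

The sniffed charset would replace Charset.defaultCharset() in the p.parse(...) call; note the stream must be buffered and reset so the BOM bytes are not consumed before parsing.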

Example 4 with SchemaParserDescriptor

Use of com.thinkbiganalytics.discovery.model.SchemaParserDescriptor in project kylo by Teradata.

From the test class SchemaDiscoveryRestControllerTest, method createMockParserDescriptor:

private SchemaParserDescriptor createMockParserDescriptor() {
    SchemaParserDescriptor descriptor = new SchemaParserDescriptor();
    descriptor.setObjectClassType("com.thinkbiganalytics.discovery.rest.controller.MockSchemaParser2");
    FieldRuleProperty propDetect = new FieldRuleProperty();
    propDetect.setName("Auto Detect?");
    propDetect.setObjectProperty("autoDetect");
    propDetect.setValue("true");
    FieldRuleProperty propHeader = new FieldRuleProperty();
    propHeader.setName("Header?");
    propHeader.setObjectProperty("headerRow");
    propHeader.setValue("false");
    descriptor.setProperties(Arrays.asList(propDetect, propHeader));
    return descriptor;
}
Also used: FieldRuleProperty (com.thinkbiganalytics.policy.rest.model.FieldRuleProperty), SchemaParserDescriptor (com.thinkbiganalytics.discovery.model.SchemaParserDescriptor)

Example 5 with SchemaParserDescriptor

Use of com.thinkbiganalytics.discovery.model.SchemaParserDescriptor in project kylo by Teradata.

From the class SchemaParserAnnotationTransformer, method buildUiModel:

@Override
public SchemaParserDescriptor buildUiModel(SchemaParser annotation, FileSchemaParser policy, List<FieldRuleProperty> properties) {
    SchemaParserDescriptor descriptor = new SchemaParserDescriptor();
    descriptor.setProperties(properties);
    descriptor.setName(annotation.name());
    descriptor.setDescription(annotation.description());
    descriptor.setObjectClassType(policy.getClass().getTypeName());
    descriptor.setTags(annotation.tags());
    descriptor.setGeneratesHiveSerde(annotation.generatesHiveSerde());
    descriptor.setSupportsBinary(annotation.supportsBinary());
    descriptor.setAllowSkipHeader(annotation.allowSkipHeader());
    descriptor.setPrimary(annotation.primary());
    descriptor.setUsesSpark(annotation.usesSpark());
    descriptor.setMimeTypes(annotation.mimeTypes());
    descriptor.setSparkFormat(annotation.sparkFormat());
    return descriptor;
}
Also used: SchemaParserDescriptor (com.thinkbiganalytics.discovery.model.SchemaParserDescriptor)
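buildUiModel copies each attribute of the @SchemaParser annotation onto the UI descriptor. The same read-annotation-at-runtime pattern can be shown self-contained with hypothetical stand-in types (kylo's real @SchemaParser carries many more attributes, such as tags and mimeTypes):

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class AnnotationToModelSketch {
    // Hypothetical stand-in for kylo's @SchemaParser annotation; RUNTIME
    // retention is required so the attributes are readable via reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @interface SchemaParser {
        String name();
        String description() default "";
    }

    @SchemaParser(name = "CSV", description = "Parses delimited text")
    static class CsvParser {}

    public static void main(String[] args) {
        // Read the annotation at runtime and copy its attributes into plain
        // values, mirroring how buildUiModel populates SchemaParserDescriptor.
        SchemaParser ann = CsvParser.class.getAnnotation(SchemaParser.class);
        System.out.println(ann.name() + ": " + ann.description());
        // prints CSV: Parses delimited text
    }
}
```

As in the real transformer, the descriptor also records the policy's concrete class via policy.getClass().getTypeName(), so the UI model can be turned back into a parser instance later.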

Aggregations

Classes appearing across all 13 usages of SchemaParserDescriptor, with usage counts:

SchemaParserDescriptor (com.thinkbiganalytics.discovery.model.SchemaParserDescriptor): 13
FileSchemaParser (com.thinkbiganalytics.discovery.parser.FileSchemaParser): 7
SparkFileSchemaParser (com.thinkbiganalytics.discovery.parser.SparkFileSchemaParser): 5
SampleFileSparkScript (com.thinkbiganalytics.discovery.parser.SampleFileSparkScript): 4
ApiOperation (io.swagger.annotations.ApiOperation): 4
ApiResponses (io.swagger.annotations.ApiResponses): 4
Path (javax.ws.rs.Path): 4
Produces (javax.ws.rs.Produces): 4
SchemaParserAnnotationTransformer (com.thinkbiganalytics.discovery.rest.controller.SchemaParserAnnotationTransformer): 3
PolicyTransformException (com.thinkbiganalytics.policy.PolicyTransformException): 3
ArrayList (java.util.ArrayList): 3
List (java.util.List): 3
Map (java.util.Map): 3
Collectors (java.util.stream.Collectors): 3
Consumes (javax.ws.rs.Consumes): 3
InternalServerErrorException (javax.ws.rs.InternalServerErrorException): 3
POST (javax.ws.rs.POST): 3
FileParserFactory (com.thinkbiganalytics.discovery.FileParserFactory): 2
AbstractTransformResponseModifier (com.thinkbiganalytics.spark.rest.controller.AbstractTransformResponseModifier): 2
FileMetadataResponse (com.thinkbiganalytics.spark.rest.model.FileMetadataResponse): 2