Example 21 with PreparationDTO

use of org.talend.dataprep.api.preparation.PreparationDTO in project data-prep by Talend.

the class PreparationServiceTest method setHeadShouldCleanStepList.

@Test
public void setHeadShouldCleanStepList() throws IOException {
    // get a prep
    Preparation preparation = new Preparation();
    preparation.setName("prep_name_foo");
    preparation.setDataSetId("1234");
    preparation.setRowMetadata(new RowMetadata());
    PreparationDTO prep = clientTest.createPreparation(preparation, home.getId());
    final String step = IOUtils.toString(this.getClass().getResourceAsStream("actions/append_lower_case.json"), UTF_8);
    for (int i = 0; i < 5; i++) {
        clientTest.addStep(prep.getId(), step);
    }
    prep = clientTest.getPreparation(prep.getId());
    List<String> originalStepIds = prep.getSteps();
    updateHeadAndCheckResult(prep, originalStepIds, 0);
    updateHeadAndCheckResult(prep, originalStepIds, 3);
    updateHeadAndCheckResult(prep, originalStepIds, 2);
}
Also used : PreparationDTO(org.talend.dataprep.api.preparation.PreparationDTO) Preparation(org.talend.dataprep.api.preparation.Preparation) RowMetadata(org.talend.dataprep.api.dataset.RowMetadata) Test(org.junit.Test) BasePreparationTest(org.talend.dataprep.preparation.BasePreparationTest)

Example 22 with PreparationDTO

use of org.talend.dataprep.api.preparation.PreparationDTO in project data-prep by Talend.

the class TransformationService method getPreparationColumnSemanticCategories.

/**
 * Return the semantic types for a given preparation / column.
 *
 * @param preparationId the preparation id.
 * @param columnId the column id.
 * @param stepId the step id (optional, if not specified, it's 'head')
 * @return the semantic types for a given preparation / column.
 */
@RequestMapping(value = "/preparations/{preparationId}/columns/{columnId}/types", method = GET)
@ApiOperation(value = "list the types of the wanted column", notes = "This list can be used by user to change the column type.")
@Timed
@PublicAPI
public List<SemanticDomain> getPreparationColumnSemanticCategories(@ApiParam(value = "The preparation id") @PathVariable String preparationId, @ApiParam(value = "The column id") @PathVariable String columnId, @ApiParam(value = "The preparation version") @RequestParam(defaultValue = "head") String stepId) {
    LOG.debug("listing preparation semantic categories for preparation #{} column #{}@{}", preparationId, columnId, stepId);
    // get the preparation
    final PreparationDTO preparation = getPreparation(preparationId);
    // get the step (in case of 'head', the real step id must be found)
    final String version = StringUtils.equals("head", stepId)
            ? preparation.getSteps().get(preparation.getSteps().size() - 1)
            : stepId;
    /*
     * OK, this one is a bit tricky so pay attention.
     *
     * To be able to get the semantic types, the analyzer service needs to run on the result of the preparation.
     *
     * The result must be found in the cache, so if the preparation is not cached, the preparation is run so that
     * it gets cached.
     *
     * Then, the analyzer service just gets the data from the cache. That's it.
     */
    // generate the cache keys for both metadata & content
    final ContentCacheKey metadataKey = cacheKeyGenerator.metadataBuilder().preparationId(preparationId).stepId(version).sourceType(HEAD).build();
    final ContentCacheKey contentKey = cacheKeyGenerator.contentBuilder()
            .datasetId(preparation.getDataSetId())
            .preparationId(preparationId)
            .stepId(version)
            .format(JSON)
            .sourceType(HEAD)
            .build();
    // if the preparation is not cached, let's compute it to have some cache
    if (!contentCache.has(metadataKey) || !contentCache.has(contentKey)) {
        addPreparationInCache(preparation, stepId);
    }
    // run the analyzer service on the cached content
    try (final InputStream metadataCache = contentCache.get(metadataKey);
        final InputStream contentCacheStream = this.contentCache.get(contentKey)) {
        final DataSetMetadata metadata = mapper.readerFor(DataSetMetadata.class).readValue(metadataCache);
        final List<SemanticDomain> semanticDomains = getSemanticDomains(metadata, columnId, contentCacheStream);
        LOG.debug("found {} for preparation #{}, column #{}", semanticDomains, preparationId, columnId);
        return semanticDomains;
    } catch (IOException e) {
        throw new TDPException(UNEXPECTED_EXCEPTION, e);
    }
}
Also used : TDPException(org.talend.dataprep.exception.TDPException) PreparationDTO(org.talend.dataprep.api.preparation.PreparationDTO) PipedInputStream(java.io.PipedInputStream) InputStream(java.io.InputStream) ContentCacheKey(org.talend.dataprep.cache.ContentCacheKey) SemanticDomain(org.talend.dataprep.api.dataset.statistics.SemanticDomain) IOException(java.io.IOException) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) Timed(org.talend.dataprep.metrics.Timed) ApiOperation(io.swagger.annotations.ApiOperation) PublicAPI(org.talend.dataprep.security.PublicAPI) RequestMapping(org.springframework.web.bind.annotation.RequestMapping)
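
The comment in the method above describes two steps: turn the "head" alias into the id of the last step, then make sure the result is cached before reading it back. Below is a minimal sketch of that pattern using plain collections. The class HeadStepCacheSketch, its methods, and the string keys are hypothetical stand-ins for illustration and are not part of data-prep; the real service uses ContentCacheKey objects and a content cache rather than a map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical illustration of "resolve head, then compute if not cached". */
class HeadStepCacheSketch {

    /** In-memory map standing in for the content cache (an assumption of this sketch). */
    private final Map<String, String> cache = new HashMap<>();

    /** Resolves the "head" alias to the last step id; any other step id is returned unchanged. */
    static String resolveVersion(List<String> steps, String stepId) {
        if ("head".equals(stepId)) {
            if (steps.isEmpty()) {
                throw new IllegalArgumentException("preparation has no steps");
            }
            return steps.get(steps.size() - 1);
        }
        return stepId;
    }

    /** Returns the cached content for the key, running the computation first when it is missing. */
    String getOrCompute(String key, Supplier<String> computation) {
        if (!cache.containsKey(key)) {
            cache.put(key, computation.get());
        }
        return cache.get(key);
    }

    public static void main(String[] args) {
        List<String> steps = List.of("root", "step-1", "step-2");
        String version = resolveVersion(steps, "head");
        HeadStepCacheSketch sketch = new HeadStepCacheSketch();
        String content = sketch.getOrCompute("prep-1@" + version, () -> "computed for " + version);
        System.out.println(version + " -> " + content);
    }
}
```

Keying the cache on the resolved step id rather than on the literal "head" means a later change of head cannot return stale content under the same key, which may be why the original resolves the version before building its cache keys.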

Example 23 with PreparationDTO

use of org.talend.dataprep.api.preparation.PreparationDTO in project data-prep by Talend.

the class TransformationService method getPreparationExportTypesForPreparation.

/**
 * Get the available export formats for preparation
 */
@RequestMapping(value = "/export/formats/preparations/{preparationId}", method = GET)
@ApiOperation(value = "Get the available format types for the preparation")
@Timed
public Stream<ExportFormatMessage> getPreparationExportTypesForPreparation(@PathVariable String preparationId) {
    final PreparationDTO preparation = getPreparation(preparationId);
    final DataSetMetadata metadata = datasetClient.getDataSetMetadata(preparation.getDataSetId());
    return getPreparationExportTypesForDataSet(metadata.getId());
}
Also used : PreparationDTO(org.talend.dataprep.api.preparation.PreparationDTO) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) Timed(org.talend.dataprep.metrics.Timed) ApiOperation(io.swagger.annotations.ApiOperation) RequestMapping(org.springframework.web.bind.annotation.RequestMapping)

Example 24 with PreparationDTO

use of org.talend.dataprep.api.preparation.PreparationDTO in project data-prep by Talend.

the class TransformationService method executeMetadata.

@RequestMapping(value = "/apply/preparation/{preparationId}/{stepId}/metadata", method = GET)
@ApiOperation(value = "Run the transformation given the provided export parameters", notes = "This operation transforms the dataset or preparation using parameters in export parameters.")
@VolumeMetered
@AsyncOperation(conditionalClass = GetPrepMetadataAsyncCondition.class,
        resultUrlGenerator = PrepMetadataGetContentUrlGenerator.class,
        executionIdGeneratorClass = PrepMetadataExecutionIdGenerator.class)
public DataSetMetadata executeMetadata(@PathVariable("preparationId") @AsyncParameter String preparationId, @PathVariable("stepId") @AsyncParameter String stepId) {
    LOG.debug("getting preparation metadata for #{}, step {}", preparationId, stepId);
    final PreparationDTO preparation = getPreparation(preparationId);
    if (preparation.getSteps().size() > 1) {
        String headId = "head".equalsIgnoreCase(stepId) ? preparation.getHeadId() : stepId;
        final TransformationMetadataCacheKey cacheKey = cacheKeyGenerator.generateMetadataKey(preparationId, headId, HEAD);
        // No metadata in cache, recompute it
        if (!contentCache.has(cacheKey)) {
            try {
                LOG.debug("Metadata not available for preparation '{}' at step '{}'", preparationId, headId);
                ExportParameters parameters = new ExportParameters();
                parameters.setPreparationId(preparationId);
                parameters.setExportType("JSON");
                parameters.setStepId(headId);
                parameters.setFrom(HEAD);
                // we regenerate cache
                parameters = exportParametersUtil.populateFromPreparationExportParameter(parameters);
                preparationExportStrategy.performPreparation(parameters, new NullOutputStream());
            } catch (Exception e) {
                throw new TDPException(TransformationErrorCodes.METADATA_NOT_FOUND, e);
            }
        }
        if (contentCache.has(cacheKey)) {
            try (InputStream stream = contentCache.get(cacheKey)) {
                return mapper.readerFor(DataSetMetadata.class).readValue(stream);
            } catch (IOException e) {
                throw new TDPException(CommonErrorCodes.UNEXPECTED_EXCEPTION, e);
            }
        }
    } else {
        LOG.debug("No step in preparation '{}', falls back to get dataset metadata (id: {})", preparationId, preparation.getDataSetId());
        return datasetClient.getDataSetMetadata(preparation.getDataSetId());
    }
    return null;
}
Also used : TDPException(org.talend.dataprep.exception.TDPException) PreparationDTO(org.talend.dataprep.api.preparation.PreparationDTO) ExportParameters(org.talend.dataprep.api.export.ExportParameters) PipedInputStream(java.io.PipedInputStream) InputStream(java.io.InputStream) TransformationMetadataCacheKey(org.talend.dataprep.cache.TransformationMetadataCacheKey) IOException(java.io.IOException) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) NullOutputStream(org.apache.commons.io.output.NullOutputStream) AsyncOperation(org.talend.dataprep.async.AsyncOperation) VolumeMetered(org.talend.dataprep.metrics.VolumeMetered) ApiOperation(io.swagger.annotations.ApiOperation) RequestMapping(org.springframework.web.bind.annotation.RequestMapping)

Example 25 with PreparationDTO

use of org.talend.dataprep.api.preparation.PreparationDTO in project data-prep by Talend.

the class ApplyPreparationExportStrategy method executeApplyPreparation.

private void executeApplyPreparation(ExportParameters parameters, OutputStream outputStream) {
    final String stepId = parameters.getStepId();
    final String preparationId = parameters.getPreparationId();
    final String formatName = parameters.getExportType();
    final PreparationDTO preparation = getPreparation(preparationId);
    final String dataSetId = parameters.getDatasetId();
    try (DataSet dataSet = getDataset(parameters, dataSetId)) {
        // head is not allowed as step id
        final String version = getCleanStepId(preparation, stepId);
        // create tee to broadcast to cache + service output
        final TransformationCacheKey key = cacheKeyGenerator.generateContentKey(
                dataSetId,
                preparationId,
                version,
                formatName,
                parameters.getFrom(),
                parameters.getArguments(),
                parameters.getFilter());
        if (LOGGER.isDebugEnabled()) {
            LOGGER.debug("Transformation Cache Key : {}", key.getKey());
            LOGGER.debug("Cache key details: {}", key);
        }
        executePipeline(parameters, outputStream, key, preparationId, version, dataSet);
    } finally {
        if (!technicianIdentityReleased) {
            securityProxy.releaseIdentity();
        }
    }
}
Also used : TransformationCacheKey(org.talend.dataprep.cache.TransformationCacheKey) PreparationDTO(org.talend.dataprep.api.preparation.PreparationDTO) DataSet(org.talend.dataprep.api.dataset.DataSet)

Aggregations

PreparationDTO (org.talend.dataprep.api.preparation.PreparationDTO) 45
Test (org.junit.Test) 22
ExportParameters (org.talend.dataprep.api.export.ExportParameters) 9
ApiOperation (io.swagger.annotations.ApiOperation) 7
InputStream (java.io.InputStream) 7
RequestMapping (org.springframework.web.bind.annotation.RequestMapping) 7
ArrayList (java.util.ArrayList) 6
Action (org.talend.dataprep.api.preparation.Action) 6
Logger (org.slf4j.Logger) 5
LoggerFactory (org.slf4j.LoggerFactory) 5
DataSetMetadata (org.talend.dataprep.api.dataset.DataSetMetadata) 5
RowMetadata (org.talend.dataprep.api.dataset.RowMetadata) 5
Preparation (org.talend.dataprep.api.preparation.Preparation) 5
PreparationDetailsDTO (org.talend.dataprep.api.preparation.PreparationDetailsDTO) 5
TransformationCacheKey (org.talend.dataprep.cache.TransformationCacheKey) 5
TDPException (org.talend.dataprep.exception.TDPException) 5
OutputStream (java.io.OutputStream) 4
StringUtils (org.apache.commons.lang3.StringUtils) 4
Autowired (org.springframework.beans.factory.annotation.Autowired) 4
Component (org.springframework.stereotype.Component) 4