Example 26 with DataSetMetadata

use of org.talend.dataprep.api.dataset.DataSetMetadata in project data-prep by Talend.

the class DataSetJSONTest method testRoundTrip.

@Test
public void testRoundTrip() throws Exception {
    DataSet dataSet = from(DataSetJSONTest.class.getResourceAsStream("test3.json"));
    final DataSetMetadata metadata = dataSet.getMetadata();
    // check the metadata before dereferencing it
    assertNotNull(metadata);
    metadata.getContent().addParameter(CSVFormatFamily.SEPARATOR_PARAMETER, ",");
    metadata.getContent().setFormatFamilyId(new CSVFormatFamily().getBeanId());
    StringWriter writer = new StringWriter();
    to(dataSet, writer);
    assertThat(writer.toString(), sameJSONAsFile(DataSetJSONTest.class.getResourceAsStream("test3.json")));
}
Also used : DataSet(org.talend.dataprep.api.dataset.DataSet) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) CSVFormatFamily(org.talend.dataprep.schema.csv.CSVFormatFamily) ServiceBaseTest(org.talend.ServiceBaseTest) Test(org.junit.Test)

Example 27 with DataSetMetadata

use of org.talend.dataprep.api.dataset.DataSetMetadata in project data-prep by Talend.

the class AggregationService method aggregate.

/**
 * Process an aggregation.
 *
 * @param parameters the aggregation parameters.
 * @param dataset the dataset input.
 * @return the aggregation result.
 */
public AggregationResult aggregate(AggregationParameters parameters, DataSet dataset) {
    // check the parameters
    if (parameters.getOperations().isEmpty() || parameters.getGroupBy().isEmpty()) {
        throw new TDPException(CommonErrorCodes.BAD_AGGREGATION_PARAMETERS);
    }
    AggregationResult result = new AggregationResult(parameters.getOperations().get(0).getOperator());
    // get the aggregator
    Aggregator aggregator = factory.get(parameters);
    // Build optional filter
    final DataSetMetadata metadata = dataset.getMetadata();
    final RowMetadata rowMetadata = metadata != null ? metadata.getRowMetadata() : new RowMetadata();
    final Predicate<DataSetRow> filter = filterService.build(parameters.getFilter(), rowMetadata);
    // process the dataset
    dataset.getRecords().filter(filter).forEach(row -> aggregator.accept(row, result));
    // Normalize result (perform clean / optimization now that all input was processed).
    aggregator.normalize(result);
    return result;
}
Also used : TDPException(org.talend.dataprep.exception.TDPException) AggregationResult(org.talend.dataprep.transformation.aggregation.api.AggregationResult) Aggregator(org.talend.dataprep.transformation.aggregation.operation.Aggregator) RowMetadata(org.talend.dataprep.api.dataset.RowMetadata) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) DataSetRow(org.talend.dataprep.api.dataset.row.DataSetRow)
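
The aggregate method above follows a filter, accumulate, normalize sequence. A minimal, self-contained sketch of that sequence in plain Java is given here; it does not use the data-prep API. The names AggregationSketch, Row and sumByGroup are hypothetical, and the rounding step is only an assumed stand-in for whatever clean-up aggregator.normalize actually performs.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class AggregationSketch {

    /** Hypothetical stand-in for a dataset row: a group key and a numeric value. */
    public static final class Row {
        public final String group;
        public final double value;

        public Row(String group, double value) {
            this.group = group;
            this.value = value;
        }
    }

    /** Sums the values per group for the rows accepted by the filter, then rounds each sum. */
    public static Map<String, Double> sumByGroup(List<Row> rows, Predicate<Row> filter) {
        Map<String, Double> result = new LinkedHashMap<>();
        // process the rows: only those accepted by the filter reach the accumulator
        rows.stream().filter(filter).forEach(row -> result.merge(row.group, row.value, Double::sum));
        // normalize once all input was processed (assumption: round to 2 decimals)
        result.replaceAll((group, sum) -> Math.round(sum * 100.0) / 100.0);
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("a", 1.0), new Row("b", 2.5), new Row("a", 3.0), new Row("b", -1.0));
        // keep only rows with a positive value
        System.out.println(sumByGroup(rows, row -> row.value > 0));
    }
}
```

As in the original method, the result object is created before the rows are read and is only cleaned up after the last row, so the accumulator never needs the whole dataset in memory.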

Example 28 with DataSetMetadata

use of org.talend.dataprep.api.dataset.DataSetMetadata in project data-prep by Talend.

the class DataSetServiceTest method datePattern.

@Test
public void datePattern() throws Exception {
    int before = dataSetMetadataRepository.size();
    String dataSetId = given().body(IOUtils.toString(this.getClass().getResourceAsStream("../date_time_pattern.csv"), UTF_8)).queryParam(CONTENT_TYPE, "text/csv").when().post("/datasets").asString();
    int after = dataSetMetadataRepository.size();
    assertThat(after - before, is(1));
    assertQueueMessages(dataSetId);
    final DataSetMetadata dataSetMetadata = dataSetMetadataRepository.get(dataSetId);
    assertNotNull(dataSetMetadata);
    final ColumnMetadata column = dataSetMetadata.getRowMetadata().getById("0001");
    assertThat(column.getType(), is("date"));
    assertThat(column.getDomain(), is(""));
    final Statistics statistics = mapper.readerFor(Statistics.class).readValue(this.getClass().getResourceAsStream("../date_time_pattern_expected.json"));
    assertThat(column.getStatistics(), CoreMatchers.equalTo(statistics));
}
Also used : ColumnMetadata(org.talend.dataprep.api.dataset.ColumnMetadata) Matchers.containsString(org.hamcrest.Matchers.containsString) Matchers.isEmptyString(org.hamcrest.Matchers.isEmptyString) Statistics(org.talend.dataprep.api.dataset.statistics.Statistics) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) DataSetBaseTest(org.talend.dataprep.dataset.DataSetBaseTest) Test(org.junit.Test)

Example 29 with DataSetMetadata

use of org.talend.dataprep.api.dataset.DataSetMetadata in project data-prep by Talend.

the class DataSetServiceTest method previewNonDraft.

@Test
public void previewNonDraft() throws Exception {
    // Create a data set
    String dataSetId = given().body(IOUtils.toString(this.getClass().getResourceAsStream(TAGADA_CSV), UTF_8)).queryParam("Content-Type", "text/csv").when().post("/datasets").asString();
    final DataSetMetadata dataSetMetadata = dataSetMetadataRepository.get(dataSetId);
    assertThat(dataSetMetadata, notNullValue());
    // Ensure it is no draft
    dataSetMetadata.setDraft(false);
    dataSetMetadataRepository.save(dataSetMetadata);
    // Should receive a 301 that redirects to the GET data set content operation
    given().redirects().follow(false).contentType(JSON).get("/datasets/{id}/preview", dataSetId).then().statusCode(HttpStatus.MOVED_PERMANENTLY.value());
    // Should receive a 200 if code follows redirection
    given().redirects().follow(true).contentType(JSON).get("/datasets/{id}/preview", dataSetId).then().statusCode(OK.value());
}
Also used : Matchers.containsString(org.hamcrest.Matchers.containsString) Matchers.isEmptyString(org.hamcrest.Matchers.isEmptyString) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) DataSetBaseTest(org.talend.dataprep.dataset.DataSetBaseTest) Test(org.junit.Test)

Example 30 with DataSetMetadata

use of org.talend.dataprep.api.dataset.DataSetMetadata in project data-prep by Talend.

the class DataSetServiceTest method compatibleDatasetsListDateSort.

@Test
public void compatibleDatasetsListDateSort() throws Exception {
    String dataSetId = createCSVDataSet(this.getClass().getResourceAsStream(TAGADA_CSV), "ds-13");
    String dataSetId2 = createCSVDataSet(this.getClass().getResourceAsStream(TAGADA_CSV), "ds-12");
    String dataSetId3 = createCSVDataSet(this.getClass().getResourceAsStream(TAGADA_CSV), "ds-11");
    DataSetMetadata metadata1 = dataSetMetadataRepository.get(dataSetId);
    metadata1.setName("CCCC");
    dataSetMetadataRepository.save(metadata1);
    DataSetMetadata metadata2 = dataSetMetadataRepository.get(dataSetId2);
    metadata2.setName("BBBB");
    dataSetMetadataRepository.save(metadata2);
    DataSetMetadata metadata3 = dataSetMetadataRepository.get(dataSetId3);
    metadata3.setName("AAAA");
    dataSetMetadataRepository.save(metadata3);
    // when
    final String actual = expect().statusCode(200).log().ifValidationFails().get("/datasets/{id}/compatibledatasets?sort=creationDate", dataSetId).asString();
    // Ensure order by creation date (most recent first)
    final Iterator<JsonNode> elements = mapper.readTree(actual).elements();
    String[] expectedNames = new String[] { "AAAA", "BBBB" };
    int i = 0;
    while (elements.hasNext()) {
        assertThat(elements.next().get("name").asText(), is(expectedNames[i++]));
    }
}
Also used : JsonNode(com.fasterxml.jackson.databind.JsonNode) Matchers.containsString(org.hamcrest.Matchers.containsString) Matchers.isEmptyString(org.hamcrest.Matchers.isEmptyString) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) DataSetBaseTest(org.talend.dataprep.dataset.DataSetBaseTest) Test(org.junit.Test)
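
The expected names in the test above ("AAAA" before "BBBB") suggest that sort=creationDate returns the most recently created data set first. A minimal sketch of such an ordering with a plain Comparator is given here; it is not the service's implementation. The names CreationDateSortSketch, Meta and namesMostRecentFirst are hypothetical, and the timestamps are made-up values.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class CreationDateSortSketch {

    /** Hypothetical, minimal stand-in for data set metadata. */
    public static final class Meta {
        public final String name;
        public final long creationDate;

        public Meta(String name, long creationDate) {
            this.name = name;
            this.creationDate = creationDate;
        }
    }

    /** Returns the names ordered by creation date, most recent first. */
    public static List<String> namesMostRecentFirst(List<Meta> metas) {
        // copy first so the caller's list is left untouched
        List<Meta> sorted = new ArrayList<>(metas);
        sorted.sort(Comparator.comparingLong((Meta m) -> m.creationDate).reversed());
        List<String> names = new ArrayList<>();
        for (Meta meta : sorted) {
            names.add(meta.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // "BBBB" is created before "AAAA", mirroring the creation order in the test
        List<Meta> metas = List.of(new Meta("BBBB", 1000L), new Meta("AAAA", 2000L));
        System.out.println(namesMostRecentFirst(metas));
    }
}
```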

Aggregations

DataSetMetadata (org.talend.dataprep.api.dataset.DataSetMetadata) 192
Test (org.junit.Test) 126
DataSetBaseTest (org.talend.dataprep.dataset.DataSetBaseTest) 63
ColumnMetadata (org.talend.dataprep.api.dataset.ColumnMetadata) 48
InputStream (java.io.InputStream) 45
Matchers.containsString (org.hamcrest.Matchers.containsString) 28
Matchers.isEmptyString (org.hamcrest.Matchers.isEmptyString) 28
TDPException (org.talend.dataprep.exception.TDPException) 26
RowMetadata (org.talend.dataprep.api.dataset.RowMetadata) 20
DataSetServiceTest (org.talend.dataprep.dataset.service.DataSetServiceTest) 20
ApiOperation (io.swagger.annotations.ApiOperation) 18
DataSet (org.talend.dataprep.api.dataset.DataSet) 18
Type (org.talend.dataprep.api.type.Type) 17
Timed (org.talend.dataprep.metrics.Timed) 17
DistributedLock (org.talend.dataprep.lock.DistributedLock) 16
Autowired (org.springframework.beans.factory.annotation.Autowired) 14
DataSetRow (org.talend.dataprep.api.dataset.row.DataSetRow) 14
IOException (java.io.IOException) 13
RequestMapping (org.springframework.web.bind.annotation.RequestMapping) 13
ArrayList (java.util.ArrayList) 12