Example 1 with Statistics

use of org.talend.dataprep.api.dataset.statistics.Statistics in project data-prep by Talend.

the class ReorderColumn method swapColumnMetadata.

protected void swapColumnMetadata(ColumnMetadata originColumn, ColumnMetadata targetColumn) throws Exception {
    ColumnMetadata targetColumnCopy = ColumnMetadata.Builder.column().copy(targetColumn).build();
    ColumnMetadata originColumnCopy = ColumnMetadata.Builder.column().copy(originColumn).build();
    BeanUtils.copyProperties(targetColumn, originColumn);
    BeanUtils.copyProperties(originColumn, targetColumnCopy);
    Statistics originalStatistics = originColumnCopy.getStatistics();
    Statistics targetStatistics = targetColumnCopy.getStatistics();
    BeanUtils.copyProperties(targetColumn.getStatistics(), originalStatistics);
    BeanUtils.copyProperties(originColumn.getStatistics(), targetStatistics);
    Quality originalQuality = originColumnCopy.getQuality();
    Quality targetQualityCopy = targetColumnCopy.getQuality();
    BeanUtils.copyProperties(targetColumn.getQuality(), originalQuality);
    BeanUtils.copyProperties(originColumn.getQuality(), targetQualityCopy);
}
Also used : ColumnMetadata(org.talend.dataprep.api.dataset.ColumnMetadata) Quality(org.talend.dataprep.api.dataset.Quality) Statistics(org.talend.dataprep.api.dataset.statistics.Statistics)
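
The swap in Example 1 works because the original values are copied aside before either column is overwritten. Below is a minimal, self-contained sketch of that pattern. It assumes a hypothetical Column class in place of ColumnMetadata and plain field assignment in place of BeanUtils.copyProperties, so it illustrates the idea and is not the data-prep implementation.

```java
// Hypothetical stand-in for ColumnMetadata; not part of data-prep.
final class Column {
    String name;
    String type;

    Column(String name, String type) {
        this.name = name;
        this.type = type;
    }

    // Copy constructor: takes a snapshot of another column's fields.
    Column(Column other) {
        this(other.name, other.type);
    }

    // Overwrites this column's fields with the values of another column.
    void copyFrom(Column other) {
        this.name = other.name;
        this.type = other.type;
    }
}

final class ColumnSwap {
    // Swaps the contents of two columns in place, so existing references
    // to either object stay valid and simply see the other column's values.
    static void swap(Column origin, Column target) {
        Column originSnapshot = new Column(origin); // saved before origin is overwritten
        origin.copyFrom(target);
        target.copyFrom(originSnapshot);
    }
}
```

Swapping contents rather than references matters when other objects (a row's column list, for instance) already hold the two instances: after the call they see the exchanged values without being updated themselves.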

Example 2 with Statistics

use of org.talend.dataprep.api.dataset.statistics.Statistics in project data-prep by Talend.

the class ExtractDateTokensTest method should_update_metadata.

@Test
public void should_update_metadata() throws IOException {
    // given
    final List<ColumnMetadata> input = new ArrayList<>();
    input.add(createMetadata("0000"));
    input.add(createMetadata("0001"));
    input.add(createMetadata("0002"));
    final RowMetadata rowMetadata = new RowMetadata(input);
    ObjectMapper mapper = new ObjectMapper();
    final Statistics statistics = mapper.readerFor(Statistics.class).readValue(getDateTestJsonAsStream("statistics_yyyy-MM-dd.json"));
    input.get(1).setStatistics(statistics);
    // when
    ActionTestWorkbench.test(rowMetadata, actionRegistry, factory.create(action, parameters));
    // then
    assertNotNull(rowMetadata.getById("0003"));
    assertNotNull(rowMetadata.getById("0004"));
    assertNotNull(rowMetadata.getById("0005"));
    assertNotNull(rowMetadata.getById("0006"));
    assertNull(rowMetadata.getById("0007"));
}
Also used : ColumnMetadata(org.talend.dataprep.api.dataset.ColumnMetadata) RowMetadata(org.talend.dataprep.api.dataset.RowMetadata) Statistics(org.talend.dataprep.api.dataset.statistics.Statistics) ActionMetadataTestUtils.setStatistics(org.talend.dataprep.transformation.actions.ActionMetadataTestUtils.setStatistics) ObjectMapper(com.fasterxml.jackson.databind.ObjectMapper) Test(org.junit.Test)

Example 3 with Statistics

use of org.talend.dataprep.api.dataset.statistics.Statistics in project data-prep by Talend.

the class SplitTest method test_TDP_876.

@Test
public void test_TDP_876() {
    // given
    final DataSetRow row = builder()
            .with(value("lorem bacon").type(Type.STRING))
            .with(value("Bacon ipsum dolor amet swine leberkas pork belly").type(Type.STRING))
            .with(value("01/01/2015").type(Type.STRING))
            .build();
    // when
    ActionTestWorkbench.test(
            Collections.singletonList(row),
            // Test requires some analysis in asserts
            analyzerService,
            actionRegistry,
            factory.create(action, parameters));
    // then
    final RowMetadata actual = row.getRowMetadata();
    Statistics originalStats = actual.getById("0001").getStatistics();
    final List<PatternFrequency> originalPatterns = originalStats.getPatternFrequencies();
    assertFalse(originalPatterns.equals(actual.getById("0003").getStatistics().getPatternFrequencies()));
    assertFalse(originalPatterns.equals(actual.getById("0004").getStatistics().getPatternFrequencies()));
}
Also used : PatternFrequency(org.talend.dataprep.api.dataset.statistics.PatternFrequency) RowMetadata(org.talend.dataprep.api.dataset.RowMetadata) Statistics(org.talend.dataprep.api.dataset.statistics.Statistics) DataSetRow(org.talend.dataprep.api.dataset.row.DataSetRow) AbstractMetadataBaseTest(org.talend.dataprep.transformation.actions.AbstractMetadataBaseTest) Test(org.junit.Test)

Example 4 with Statistics

use of org.talend.dataprep.api.dataset.statistics.Statistics in project data-prep by Talend.

the class ColumnMetadataTest method should_set_empty_statistics.

@Test
public void should_set_empty_statistics() {
    ColumnMetadata column = ColumnMetadata.Builder.column().id(42).type(Type.STRING).build();
    column.setStatistics(null);
    assertEquals(new Statistics(), column.getStatistics());
}
Also used : Statistics(org.talend.dataprep.api.dataset.statistics.Statistics) Test(org.junit.Test)
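
The test in Example 4 expects setStatistics(null) to leave the column with an empty Statistics object rather than null, and it compares by value, so Statistics must implement equals. The source does not show how ColumnMetadata achieves this. The sketch below shows one possible way, using hypothetical SimpleColumn and SimpleStatistics classes with a single illustrative field.

```java
import java.util.Objects;

// Hypothetical stand-in for Statistics, reduced to one field for brevity.
final class SimpleStatistics {
    long count;

    // Value-based equality, so two empty instances compare as equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SimpleStatistics)) {
            return false;
        }
        return count == ((SimpleStatistics) o).count;
    }

    @Override
    public int hashCode() {
        return Objects.hash(count);
    }
}

// Hypothetical stand-in for ColumnMetadata.
final class SimpleColumn {
    private SimpleStatistics statistics = new SimpleStatistics();

    // Null-safe setter: a null argument resets the column to empty statistics.
    void setStatistics(SimpleStatistics statistics) {
        this.statistics = statistics == null ? new SimpleStatistics() : statistics;
    }

    SimpleStatistics getStatistics() {
        return statistics;
    }
}
```

With this design, callers of getStatistics() never need a null check, at the cost of not being able to distinguish "no statistics computed" from "statistics computed and empty".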

Example 5 with Statistics

use of org.talend.dataprep.api.dataset.statistics.Statistics in project data-prep by Talend.

the class DataSetServiceTest method datePattern.

@Test
public void datePattern() throws Exception {
    int before = dataSetMetadataRepository.size();
    String dataSetId = given().body(IOUtils.toString(this.getClass().getResourceAsStream("../date_time_pattern.csv"), UTF_8)).queryParam(CONTENT_TYPE, "text/csv").when().post("/datasets").asString();
    int after = dataSetMetadataRepository.size();
    assertThat(after - before, is(1));
    assertQueueMessages(dataSetId);
    final DataSetMetadata dataSetMetadata = dataSetMetadataRepository.get(dataSetId);
    assertNotNull(dataSetMetadata);
    final ColumnMetadata column = dataSetMetadata.getRowMetadata().getById("0001");
    assertThat(column.getType(), is("date"));
    assertThat(column.getDomain(), is(""));
    final Statistics statistics = mapper.readerFor(Statistics.class).readValue(this.getClass().getResourceAsStream("../date_time_pattern_expected.json"));
    assertThat(column.getStatistics(), CoreMatchers.equalTo(statistics));
}
Also used : ColumnMetadata(org.talend.dataprep.api.dataset.ColumnMetadata) Matchers.containsString(org.hamcrest.Matchers.containsString) Matchers.isEmptyString(org.hamcrest.Matchers.isEmptyString) Statistics(org.talend.dataprep.api.dataset.statistics.Statistics) DataSetMetadata(org.talend.dataprep.api.dataset.DataSetMetadata) DataSetBaseTest(org.talend.dataprep.dataset.DataSetBaseTest) Test(org.junit.Test)

Aggregations

Statistics (org.talend.dataprep.api.dataset.statistics.Statistics) 15
Test (org.junit.Test) 10
ColumnMetadata (org.talend.dataprep.api.dataset.ColumnMetadata) 10
RowMetadata (org.talend.dataprep.api.dataset.RowMetadata) 9
AbstractMetadataBaseTest (org.talend.dataprep.transformation.actions.AbstractMetadataBaseTest) 7
DataSetRow (org.talend.dataprep.api.dataset.row.DataSetRow) 6
ObjectMapper (com.fasterxml.jackson.databind.ObjectMapper) 3
SemanticDomain (org.talend.dataprep.api.dataset.statistics.SemanticDomain) 3
HashMap (java.util.HashMap) 2
PatternFrequency (org.talend.dataprep.api.dataset.statistics.PatternFrequency) 2
ChangeDatePatternTest (org.talend.dataprep.transformation.actions.date.ChangeDatePatternTest) 2
TemporalField (java.time.temporal.TemporalField) 1
Matchers.containsString (org.hamcrest.Matchers.containsString) 1
Matchers.isEmptyString (org.hamcrest.Matchers.isEmptyString) 1
DataSetMetadata (org.talend.dataprep.api.dataset.DataSetMetadata) 1
Quality (org.talend.dataprep.api.dataset.Quality) 1
DataSetBaseTest (org.talend.dataprep.dataset.DataSetBaseTest) 1
ActionMetadataTestUtils.setStatistics (org.talend.dataprep.transformation.actions.ActionMetadataTestUtils.setStatistics) 1
JulianDayConverter (org.talend.dataquality.converters.JulianDayConverter) 1