Example 1 with DataTypeEnum

Use of org.talend.dataquality.statistics.type.DataTypeEnum in project data-prep by Talend.

In class TypeUtilsTest, method testConvertDate.

@Test
public void testConvertDate() throws Exception {
    ColumnMetadata metadata = column().id(1).type(Type.DATE).build();
    final DataTypeEnum[] types = TypeUtils.convert(Collections.singletonList(metadata));
    assertThat(types[0], is(DataTypeEnum.DATE));
}
Also used: ColumnMetadata (org.talend.dataprep.api.dataset.ColumnMetadata), DataTypeEnum (org.talend.dataquality.statistics.type.DataTypeEnum), Test (org.junit.Test)

Example 2 with DataTypeEnum

Use of org.talend.dataquality.statistics.type.DataTypeEnum in project data-prep by Talend.

In class StreamDateHistogramAnalyzer, method analyze.

@Override
public boolean analyze(String... record) {
    if (record.length != types.length) {
        throw new IllegalArgumentException(
                "Each column of the record should be declared a DataType.Type corresponding! \n"
                        + types.length + " type(s) declared in this histogram analyzer but "
                        + record.length + " column(s) was found in this record. \n"
                        + "Using method: setTypes(DataType.Type[] types) to set the types. ");
    }
    stats.resize(record.length);
    for (int index = 0; index < types.length; ++index) {
        final DataTypeEnum type = this.types[index];
        final ColumnMetadata column = this.columns.get(index);
        final String value = record[index];
        if (type == DataTypeEnum.DATE) {
            final String mostUsedDatePattern = RowMetadataUtils.getMostUsedDatePattern(column);
            if (!TypeInferenceUtils.isDate(value, Collections.singletonList(mostUsedDatePattern))) {
                LOGGER.trace("Skip date value '{}' (not valid date)", value);
                continue;
            }
            try {
                final LocalDateTime adaptedValue = dateParser.parse(value, column);
                stats.get(index).add(adaptedValue);
            } catch (DateTimeException e) {
                // just skip this value
                LOGGER.debug("Unable to process date value '{}'", value, e);
            }
        }
    }
    return true;
}
Also used: LocalDateTime (java.time.LocalDateTime), ColumnMetadata (org.talend.dataprep.api.dataset.ColumnMetadata), DateTimeException (java.time.DateTimeException), DataTypeEnum (org.talend.dataquality.statistics.type.DataTypeEnum)
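
The analyze method above skips values that are not valid dates instead of failing the whole record. A minimal, self-contained sketch of that skip-on-failure pattern follows, using only java.time. It is not the Talend implementation: the class DateHistogramSketch, its per-month counting, and the single fixed pattern are assumptions made for illustration, and every value in the record is treated as a date rather than being checked against a per-column type.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a streaming date histogram; not a Talend class.
class DateHistogramSketch {

    private final DateTimeFormatter formatter;

    // Number of successfully parsed dates per calendar month, sorted by month.
    private final Map<YearMonth, Integer> countsByMonth = new TreeMap<>();

    DateHistogramSketch(String pattern) {
        this.formatter = DateTimeFormatter.ofPattern(pattern);
    }

    // Adds every parseable value to the histogram and silently skips the rest.
    boolean analyze(String... record) {
        for (String value : record) {
            if (value == null) {
                continue; // nothing to parse
            }
            try {
                LocalDate date = LocalDate.parse(value, formatter);
                countsByMonth.merge(YearMonth.from(date), 1, Integer::sum);
            } catch (DateTimeException e) {
                // DateTimeParseException is a DateTimeException: skip this value.
            }
        }
        return true;
    }

    Map<YearMonth, Integer> counts() {
        return countsByMonth;
    }

    public static void main(String[] args) {
        DateHistogramSketch histogram = new DateHistogramSketch("yyyy-MM-dd");
        histogram.analyze("2024-01-15", "not-a-date", "2024-01-20", "2024-03-02");
        System.out.println(histogram.counts());
    }
}
```

Catching the broad DateTimeException mirrors the original method and covers parse failures as well as other date-time errors. The sketch drops the logging calls to stay dependency-free.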

Example 3 with DataTypeEnum

Use of org.talend.dataquality.statistics.type.DataTypeEnum in project data-prep by Talend.

In class TypeUtilsTest, method testConvertBoolean.

@Test
public void testConvertBoolean() throws Exception {
    ColumnMetadata metadata = column().id(1).type(Type.BOOLEAN).build();
    final DataTypeEnum[] types = TypeUtils.convert(Collections.singletonList(metadata));
    assertThat(types[0], is(DataTypeEnum.BOOLEAN));
}
Also used: ColumnMetadata (org.talend.dataprep.api.dataset.ColumnMetadata), DataTypeEnum (org.talend.dataquality.statistics.type.DataTypeEnum), Test (org.junit.Test)

Example 4 with DataTypeEnum

Use of org.talend.dataquality.statistics.type.DataTypeEnum in project data-prep by Talend.

In class TypeUtilsTest, method testConvertString.

@Test
public void testConvertString() throws Exception {
    ColumnMetadata metadata = column().id(1).type(Type.ANY).build();
    DataTypeEnum[] types = TypeUtils.convert(Collections.singletonList(metadata));
    assertThat(types[0], is(DataTypeEnum.STRING));
    metadata = column().id(2).type(Type.STRING).build();
    types = TypeUtils.convert(Collections.singletonList(metadata));
    assertThat(types[0], is(DataTypeEnum.STRING));
}
Also used: ColumnMetadata (org.talend.dataprep.api.dataset.ColumnMetadata), DataTypeEnum (org.talend.dataquality.statistics.type.DataTypeEnum), Test (org.junit.Test)

Example 5 with DataTypeEnum

Use of org.talend.dataquality.statistics.type.DataTypeEnum in project data-prep by Talend.

In class TypeUtilsTest, method testConvertInteger.

@Test
public void testConvertInteger() throws Exception {
    ColumnMetadata metadata = column().id(1).type(Type.NUMERIC).build();
    DataTypeEnum[] types = TypeUtils.convert(Collections.singletonList(metadata));
    assertThat(types[0], is(DataTypeEnum.INTEGER));
    metadata = column().id(2).type(Type.INTEGER).build();
    types = TypeUtils.convert(Collections.singletonList(metadata));
    assertThat(types[0], is(DataTypeEnum.INTEGER));
}
Also used: ColumnMetadata (org.talend.dataprep.api.dataset.ColumnMetadata), DataTypeEnum (org.talend.dataquality.statistics.type.DataTypeEnum), Test (org.junit.Test)
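
Taken together, the TypeUtilsTest methods above assert a mapping from column types to analyzer types: DATE to DATE, BOOLEAN to BOOLEAN, ANY and STRING to STRING, NUMERIC and INTEGER to INTEGER. The sketch below reproduces only that asserted mapping with stand-in enums. ColumnType, InferredType, and TypeMappingSketch are hypothetical names, not the Talend classes, and the fallback to STRING for anything unlisted is an assumption. How the real TypeUtils.convert handles other types is not shown in these examples.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for org.talend.dataprep.api.type.Type (subset only).
enum ColumnType { ANY, STRING, NUMERIC, INTEGER, BOOLEAN, DATE }

// Hypothetical stand-in for DataTypeEnum (subset only).
enum InferredType { STRING, INTEGER, BOOLEAN, DATE }

class TypeMappingSketch {

    // Maps one column type to the analyzer type the tests expect.
    static InferredType convertOne(ColumnType type) {
        switch (type) {
            case DATE:
                return InferredType.DATE;
            case BOOLEAN:
                return InferredType.BOOLEAN;
            case NUMERIC:
            case INTEGER:
                return InferredType.INTEGER;
            case ANY:
            case STRING:
            default:
                // Assumption: anything else is treated as a string.
                return InferredType.STRING;
        }
    }

    // Converts a list of column types into an array, one entry per column.
    static InferredType[] convertAll(List<ColumnType> types) {
        return types.stream()
                .map(TypeMappingSketch::convertOne)
                .toArray(InferredType[]::new);
    }

    public static void main(String[] args) {
        InferredType[] result = convertAll(Arrays.asList(
                ColumnType.DATE, ColumnType.BOOLEAN, ColumnType.ANY, ColumnType.NUMERIC));
        System.out.println(Arrays.toString(result)); // prints [DATE, BOOLEAN, STRING, INTEGER]
    }
}
```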

Aggregations

DataTypeEnum (org.talend.dataquality.statistics.type.DataTypeEnum) 8
ColumnMetadata (org.talend.dataprep.api.dataset.ColumnMetadata) 7
Test (org.junit.Test) 5
SemanticType (org.talend.dataquality.semantic.statistics.SemanticType) 2
DataTypeOccurences (org.talend.dataquality.statistics.type.DataTypeOccurences) 2
PrintWriter (java.io.PrintWriter) 1
StringWriter (java.io.StringWriter) 1
DateTimeException (java.time.DateTimeException) 1
LocalDateTime (java.time.LocalDateTime) 1
java.util (java.util) 1
Collectors (java.util.stream.Collectors) 1
StringUtils (org.apache.commons.lang.StringUtils) 1
Logger (org.slf4j.Logger) 1
LoggerFactory (org.slf4j.LoggerFactory) 1
RowMetadataUtils (org.talend.dataprep.api.dataset.row.RowMetadataUtils) 1
StreamDateHistogramAnalyzer (org.talend.dataprep.api.dataset.statistics.date.StreamDateHistogramAnalyzer) 1
StreamDateHistogramStatistics (org.talend.dataprep.api.dataset.statistics.date.StreamDateHistogramStatistics) 1
StreamNumberHistogramAnalyzer (org.talend.dataprep.api.dataset.statistics.number.StreamNumberHistogramAnalyzer) 1
Type (org.talend.dataprep.api.type.Type) 1
TypeUtils (org.talend.dataprep.api.type.TypeUtils) 1