Search in sources :

Example 1 with EncodingFormat

use of org.apache.flink.table.connector.format.EncodingFormat in project flink by apache.

the class CsvFileFormatFactory method createEncodingFormat.

@Override
public EncodingFormat<Factory<RowData>> createEncodingFormat(DynamicTableFactory.Context context, ReadableConfig formatOptions) {
    return new EncodingFormat<BulkWriter.Factory<RowData>>() {

        @Override
        public BulkWriter.Factory<RowData> createRuntimeEncoder(DynamicTableSink.Context context, DataType physicalDataType) {
            final RowType rowType = (RowType) physicalDataType.getLogicalType();
            final CsvSchema schema = buildCsvSchema(rowType, formatOptions);
            final RowDataToCsvConverter converter = RowDataToCsvConverters.createRowConverter(rowType);
            final CsvMapper mapper = new CsvMapper();
            final ObjectNode container = mapper.createObjectNode();
            final RowDataToCsvConverter.RowDataToCsvFormatConverterContext converterContext = new RowDataToCsvConverter.RowDataToCsvFormatConverterContext(mapper, container);
            return out -> CsvBulkWriter.forSchema(mapper, schema, converter, converterContext, out);
        }

        @Override
        public ChangelogMode getChangelogMode() {
            return ChangelogMode.insertOnly();
        }
    };
}
Also used : Context(org.apache.flink.table.connector.source.DynamicTableSource.Context) DynamicTableFactory(org.apache.flink.table.factories.DynamicTableFactory) DataType(org.apache.flink.table.types.DataType) EncodingFormat(org.apache.flink.table.connector.format.EncodingFormat) ChangelogMode(org.apache.flink.table.connector.ChangelogMode) FIELD_DELIMITER(org.apache.flink.formats.csv.CsvFormatOptions.FIELD_DELIMITER) BulkWriterFormatFactory(org.apache.flink.connector.file.table.factories.BulkWriterFormatFactory) CsvSchema(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema) Context(org.apache.flink.table.connector.source.DynamicTableSource.Context) JsonNode(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.JsonNode) RowType(org.apache.flink.table.types.logical.RowType) ALLOW_COMMENTS(org.apache.flink.formats.csv.CsvFormatOptions.ALLOW_COMMENTS) Factory(org.apache.flink.api.common.serialization.BulkWriter.Factory) ReadableConfig(org.apache.flink.configuration.ReadableConfig) FileSourceSplit(org.apache.flink.connector.file.src.FileSourceSplit) IGNORE_PARSE_ERRORS(org.apache.flink.formats.csv.CsvFormatOptions.IGNORE_PARSE_ERRORS) QUOTE_CHARACTER(org.apache.flink.formats.csv.CsvFormatOptions.QUOTE_CHARACTER) RowDataToCsvConverter(org.apache.flink.formats.csv.RowDataToCsvConverters.RowDataToCsvConverter) ESCAPE_CHARACTER(org.apache.flink.formats.csv.CsvFormatOptions.ESCAPE_CHARACTER) StreamFormatAdapter(org.apache.flink.connector.file.src.impl.StreamFormatAdapter) ConfigOption(org.apache.flink.configuration.ConfigOption) StringEscapeUtils(org.apache.commons.lang3.StringEscapeUtils) Preconditions.checkNotNull(org.apache.flink.util.Preconditions.checkNotNull) BulkDecodingFormat(org.apache.flink.connector.file.table.format.BulkDecodingFormat) Projection(org.apache.flink.table.connector.Projection) BulkReaderFormatFactory(org.apache.flink.connector.file.table.factories.BulkReaderFormatFactory) RowData(org.apache.flink.table.data.RowData) DynamicTableSink(org.apache.flink.table.connector.sink.DynamicTableSink) BulkWriter(org.apache.flink.api.common.serialization.BulkWriter) ObjectNode(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.ObjectNode) Set(java.util.Set) ProjectableDecodingFormat(org.apache.flink.table.connector.format.ProjectableDecodingFormat) DISABLE_QUOTE_CHARACTER(org.apache.flink.formats.csv.CsvFormatOptions.DISABLE_QUOTE_CHARACTER) ARRAY_ELEMENT_DELIMITER(org.apache.flink.formats.csv.CsvFormatOptions.ARRAY_ELEMENT_DELIMITER) Converter(org.apache.flink.formats.common.Converter) NULL_LITERAL(org.apache.flink.formats.csv.CsvFormatOptions.NULL_LITERAL) Internal(org.apache.flink.annotation.Internal) BulkFormat(org.apache.flink.connector.file.src.reader.BulkFormat) Collections(java.util.Collections) CsvMapper(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvMapper) RowDataToCsvConverter(org.apache.flink.formats.csv.RowDataToCsvConverters.RowDataToCsvConverter) ObjectNode(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.ObjectNode) CsvMapper(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvMapper) RowType(org.apache.flink.table.types.logical.RowType) EncodingFormat(org.apache.flink.table.connector.format.EncodingFormat) RowData(org.apache.flink.table.data.RowData) CsvSchema(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.dataformat.csv.CsvSchema) BulkWriter(org.apache.flink.api.common.serialization.BulkWriter) DataType(org.apache.flink.table.types.DataType)

Example 2 with EncodingFormat

use of org.apache.flink.table.connector.format.EncodingFormat in project flink by apache.

the class CsvFormatFactory method createEncodingFormat.

@Override
public EncodingFormat<SerializationSchema<RowData>> createEncodingFormat(DynamicTableFactory.Context context, ReadableConfig formatOptions) {
    FactoryUtil.validateFactoryOptions(this, formatOptions);
    CsvCommons.validateFormatOptions(formatOptions);
    return new EncodingFormat<SerializationSchema<RowData>>() {

        @Override
        public SerializationSchema<RowData> createRuntimeEncoder(DynamicTableSink.Context context, DataType consumedDataType) {
            final RowType rowType = (RowType) consumedDataType.getLogicalType();
            final CsvRowDataSerializationSchema.Builder schemaBuilder = new CsvRowDataSerializationSchema.Builder(rowType);
            configureSerializationSchema(formatOptions, schemaBuilder);
            return schemaBuilder.build();
        }

        @Override
        public ChangelogMode getChangelogMode() {
            return ChangelogMode.insertOnly();
        }
    };
}
Also used : EncodingFormat(org.apache.flink.table.connector.format.EncodingFormat) RowData(org.apache.flink.table.data.RowData) DataType(org.apache.flink.table.types.DataType) RowType(org.apache.flink.table.types.logical.RowType)

Example 3 with EncodingFormat

use of org.apache.flink.table.connector.format.EncodingFormat in project flink by apache.

the class DebeziumJsonFormatFactory method createEncodingFormat.

@Override
public EncodingFormat<SerializationSchema<RowData>> createEncodingFormat(DynamicTableFactory.Context context, ReadableConfig formatOptions) {
    FactoryUtil.validateFactoryOptions(this, formatOptions);
    validateEncodingFormatOptions(formatOptions);
    TimestampFormat timestampFormat = JsonFormatOptionsUtil.getTimestampFormat(formatOptions);
    JsonFormatOptions.MapNullKeyMode mapNullKeyMode = JsonFormatOptionsUtil.getMapNullKeyMode(formatOptions);
    String mapNullKeyLiteral = formatOptions.get(JSON_MAP_NULL_KEY_LITERAL);
    final boolean encodeDecimalAsPlainNumber = formatOptions.get(ENCODE_DECIMAL_AS_PLAIN_NUMBER);
    return new EncodingFormat<SerializationSchema<RowData>>() {

        @Override
        public ChangelogMode getChangelogMode() {
            return ChangelogMode.newBuilder().addContainedKind(RowKind.INSERT).addContainedKind(RowKind.UPDATE_BEFORE).addContainedKind(RowKind.UPDATE_AFTER).addContainedKind(RowKind.DELETE).build();
        }

        @Override
        public SerializationSchema<RowData> createRuntimeEncoder(DynamicTableSink.Context context, DataType consumedDataType) {
            final RowType rowType = (RowType) consumedDataType.getLogicalType();
            return new DebeziumJsonSerializationSchema(rowType, timestampFormat, mapNullKeyMode, mapNullKeyLiteral, encodeDecimalAsPlainNumber);
        }
    };
}
Also used : EncodingFormat(org.apache.flink.table.connector.format.EncodingFormat) JsonFormatOptions(org.apache.flink.formats.json.JsonFormatOptions) RowData(org.apache.flink.table.data.RowData) DataType(org.apache.flink.table.types.DataType) RowType(org.apache.flink.table.types.logical.RowType) TimestampFormat(org.apache.flink.formats.common.TimestampFormat)

Example 4 with EncodingFormat

use of org.apache.flink.table.connector.format.EncodingFormat in project flink by apache.

the class RegistryAvroFormatFactory method createEncodingFormat.

@Override
public EncodingFormat<SerializationSchema<RowData>> createEncodingFormat(DynamicTableFactory.Context context, ReadableConfig formatOptions) {
    FactoryUtil.validateFactoryOptions(this, formatOptions);
    String schemaRegistryURL = formatOptions.get(URL);
    Optional<String> subject = formatOptions.getOptional(SUBJECT);
    Map<String, ?> optionalPropertiesMap = buildOptionalPropertiesMap(formatOptions);
    if (!subject.isPresent()) {
        throw new ValidationException(String.format("Option %s.%s is required for serialization", IDENTIFIER, SUBJECT.key()));
    }
    return new EncodingFormat<SerializationSchema<RowData>>() {

        @Override
        public SerializationSchema<RowData> createRuntimeEncoder(DynamicTableSink.Context context, DataType consumedDataType) {
            final RowType rowType = (RowType) consumedDataType.getLogicalType();
            return new AvroRowDataSerializationSchema(rowType, ConfluentRegistryAvroSerializationSchema.forGeneric(subject.get(), AvroSchemaConverter.convertToSchema(rowType), schemaRegistryURL, optionalPropertiesMap), RowDataToAvroConverters.createConverter(rowType));
        }

        @Override
        public ChangelogMode getChangelogMode() {
            return ChangelogMode.insertOnly();
        }
    };
}
Also used : EncodingFormat(org.apache.flink.table.connector.format.EncodingFormat) RowData(org.apache.flink.table.data.RowData) AvroRowDataSerializationSchema(org.apache.flink.formats.avro.AvroRowDataSerializationSchema) ValidationException(org.apache.flink.table.api.ValidationException) DataType(org.apache.flink.table.types.DataType) RowType(org.apache.flink.table.types.logical.RowType)

Example 5 with EncodingFormat

use of org.apache.flink.table.connector.format.EncodingFormat in project flink by apache.

the class DebeziumAvroFormatFactory method createEncodingFormat.

@Override
public EncodingFormat<SerializationSchema<RowData>> createEncodingFormat(DynamicTableFactory.Context context, ReadableConfig formatOptions) {
    FactoryUtil.validateFactoryOptions(this, formatOptions);
    String schemaRegistryURL = formatOptions.get(URL);
    Optional<String> subject = formatOptions.getOptional(SUBJECT);
    Map<String, ?> optionalPropertiesMap = buildOptionalPropertiesMap(formatOptions);
    if (!subject.isPresent()) {
        throw new ValidationException(String.format("Option '%s.%s' is required for serialization", IDENTIFIER, SUBJECT.key()));
    }
    return new EncodingFormat<SerializationSchema<RowData>>() {

        @Override
        public ChangelogMode getChangelogMode() {
            return ChangelogMode.newBuilder().addContainedKind(RowKind.INSERT).addContainedKind(RowKind.UPDATE_BEFORE).addContainedKind(RowKind.UPDATE_AFTER).addContainedKind(RowKind.DELETE).build();
        }

        @Override
        public SerializationSchema<RowData> createRuntimeEncoder(DynamicTableSink.Context context, DataType consumedDataType) {
            final RowType rowType = (RowType) consumedDataType.getLogicalType();
            return new DebeziumAvroSerializationSchema(rowType, schemaRegistryURL, subject.get(), optionalPropertiesMap);
        }
    };
}
Also used : EncodingFormat(org.apache.flink.table.connector.format.EncodingFormat) RowData(org.apache.flink.table.data.RowData) ValidationException(org.apache.flink.table.api.ValidationException) DataType(org.apache.flink.table.types.DataType) RowType(org.apache.flink.table.types.logical.RowType)

Aggregations

EncodingFormat (org.apache.flink.table.connector.format.EncodingFormat)13 RowData (org.apache.flink.table.data.RowData)13 DataType (org.apache.flink.table.types.DataType)12 RowType (org.apache.flink.table.types.logical.RowType)11 TimestampFormat (org.apache.flink.formats.common.TimestampFormat)5 JsonFormatOptions (org.apache.flink.formats.json.JsonFormatOptions)4 BulkWriter (org.apache.flink.api.common.serialization.BulkWriter)2 SerializationSchema (org.apache.flink.api.common.serialization.SerializationSchema)2 ReadableConfig (org.apache.flink.configuration.ReadableConfig)2 ValidationException (org.apache.flink.table.api.ValidationException)2 Collections (java.util.Collections)1 Set (java.util.Set)1 StringEscapeUtils (org.apache.commons.lang3.StringEscapeUtils)1 Internal (org.apache.flink.annotation.Internal)1 Factory (org.apache.flink.api.common.serialization.BulkWriter.Factory)1 ConfigOption (org.apache.flink.configuration.ConfigOption)1 DeliveryGuarantee (org.apache.flink.connector.base.DeliveryGuarantee)1 FileSourceSplit (org.apache.flink.connector.file.src.FileSourceSplit)1 StreamFormatAdapter (org.apache.flink.connector.file.src.impl.StreamFormatAdapter)1 BulkFormat (org.apache.flink.connector.file.src.reader.BulkFormat)1