Example 11 with RowFormatInfo

use of org.apache.inlong.sort.formats.common.RowFormatInfo in project incubator-inlong by apache.

the class InLongMsgCsvFormatFactory method createMixedFormatConverter.

@Override
public InLongMsgCsvMixedFormatConverter createMixedFormatConverter(Map<String, String> properties) {
    final DescriptorProperties descriptorProperties = new DescriptorProperties(true);
    descriptorProperties.putProperties(properties);
    RowFormatInfo rowFormatInfo = getDataFormatInfo(descriptorProperties);
    String timeFieldName = descriptorProperties.getOptionalString(FORMAT_TIME_FIELD_NAME).orElse(InLongMsgUtils.DEFAULT_TIME_FIELD_NAME);
    String attributesFieldName = descriptorProperties.getOptionalString(FORMAT_ATTRIBUTES_FIELD_NAME).orElse(InLongMsgUtils.DEFAULT_ATTRIBUTES_FIELD_NAME);
    validateFieldNames(timeFieldName, attributesFieldName, rowFormatInfo);
    String nullLiteral = descriptorProperties.getOptionalString(FORMAT_NULL_LITERAL).orElse(null);
    boolean ignoreErrors = descriptorProperties.getOptionalBoolean(FORMAT_IGNORE_ERRORS).orElse(DEFAULT_IGNORE_ERRORS);
    return new InLongMsgCsvMixedFormatConverter(rowFormatInfo, timeFieldName, attributesFieldName, nullLiteral, ignoreErrors);
}
Also used : DescriptorProperties(org.apache.flink.table.descriptors.DescriptorProperties) RowFormatInfo(org.apache.inlong.sort.formats.common.RowFormatInfo)

Example 12 with RowFormatInfo

use of org.apache.inlong.sort.formats.common.RowFormatInfo in project incubator-inlong by apache.

the class InLongMsgUtils method getDataFormatInfo.

public static RowFormatInfo getDataFormatInfo(DescriptorProperties descriptorProperties) {
    if (descriptorProperties.containsKey(TableFormatConstants.FORMAT_SCHEMA)) {
        return TableFormatUtils.deserializeRowFormatInfo(descriptorProperties);
    } else {
        TableSchema tableSchema = deriveSchema(descriptorProperties.asMap());
        String[] fieldNames = tableSchema.getFieldNames();
        DataType[] fieldTypes = tableSchema.getFieldDataTypes();
        String[] dataFieldNames = new String[fieldNames.length - 2];
        FormatInfo[] dataFieldFormatInfos = new FormatInfo[fieldNames.length - 2];
        for (int i = 0; i < dataFieldNames.length; ++i) {
            dataFieldNames[i] = fieldNames[i + 2];
            dataFieldFormatInfos[i] = TableFormatUtils.deriveFormatInfo(fieldTypes[i + 2].getLogicalType());
        }
        return new RowFormatInfo(dataFieldNames, dataFieldFormatInfos);
    }
}
Also used : TableSchema(org.apache.flink.table.api.TableSchema) RowFormatInfo(org.apache.inlong.sort.formats.common.RowFormatInfo) DataType(org.apache.flink.table.types.DataType) FormatInfo(org.apache.inlong.sort.formats.common.FormatInfo)
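
The else branch of getDataFormatInfo drops the first two columns of the derived schema before building the RowFormatInfo. The sketch below isolates that index arithmetic in plain Java. It assumes the two leading columns are the time and attributes fields, which the listing does not state explicitly; the class name DataFieldSketch, the constant METADATA_FIELD_COUNT, and the length guard are hypothetical additions for illustration and are not part of InLong.

```java
import java.util.Arrays;

public class DataFieldSketch {

    // Assumption: the derived schema starts with two metadata columns
    // (time and attributes) that are not part of the data row.
    static final int METADATA_FIELD_COUNT = 2;

    // Returns the field names that remain after skipping the leading metadata columns.
    public static String[] dataFieldNames(String[] fieldNames) {
        if (fieldNames.length < METADATA_FIELD_COUNT) {
            // The original code would fail with NegativeArraySizeException here;
            // this guard is an addition made for the sketch.
            throw new IllegalArgumentException(
                    "Expected at least " + METADATA_FIELD_COUNT + " fields, got " + fieldNames.length);
        }
        return Arrays.copyOfRange(fieldNames, METADATA_FIELD_COUNT, fieldNames.length);
    }

    public static void main(String[] args) {
        String[] all = {"time", "attributes", "f1", "f2"};
        // prints [f1, f2]
        System.out.println(Arrays.toString(dataFieldNames(all)));
    }
}
```

Arrays.copyOfRange does the same copy as the explicit loop with the i + 2 offset in the listing; the field types would be sliced the same way.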

Example 13 with RowFormatInfo

use of org.apache.inlong.sort.formats.common.RowFormatInfo in project incubator-inlong by apache.

the class MultiTenancyDeserializer method generateDeserializer.

@VisibleForTesting
Deserializer<SerializedRecord, Record> generateDeserializer(FieldInfo[] fields, DeserializationInfo deserializationInfo) {
    final RowFormatInfo rowFormatInfo = CommonUtils.generateDeserializationRowFormatInfo(fields);
    final Deserializer<SerializedRecord, Record> deserializer;
    if (deserializationInfo instanceof InLongMsgCsvDeserializationInfo) {
        InLongMsgCsvDeserializationInfo inLongMsgCsvDeserializationInfo = (InLongMsgCsvDeserializationInfo) deserializationInfo;
        InLongMsgCsvFormatDeserializer inLongMsgCsvFormatDeserializer = new InLongMsgCsvFormatDeserializer(rowFormatInfo, DEFAULT_TIME_FIELD_NAME, DEFAULT_ATTRIBUTES_FIELD_NAME, TableFormatConstants.DEFAULT_CHARSET, inLongMsgCsvDeserializationInfo.getDelimiter(), null, null, null, inLongMsgCsvDeserializationInfo.isDeleteHeadDelimiter(), TableFormatConstants.DEFAULT_IGNORE_ERRORS);
        deserializer = new InLongMsgDeserializer(inLongMsgCsvFormatDeserializer);
    } else {
        // TODO, support more formats here
        throw new UnsupportedOperationException("Not supported yet " + deserializationInfo.getClass().getSimpleName());
    }
    return deserializer;
}
Also used : SerializedRecord(org.apache.inlong.sort.flink.SerializedRecord) InLongMsgCsvFormatDeserializer(org.apache.inlong.sort.formats.inlongmsgcsv.InLongMsgCsvFormatDeserializer) RowFormatInfo(org.apache.inlong.sort.formats.common.RowFormatInfo) Record(org.apache.inlong.sort.flink.Record) InLongMsgCsvDeserializationInfo(org.apache.inlong.sort.protocol.deserialization.InLongMsgCsvDeserializationInfo) VisibleForTesting(com.google.common.annotations.VisibleForTesting)

Example 14 with RowFormatInfo

use of org.apache.inlong.sort.formats.common.RowFormatInfo in project incubator-inlong by apache.

the class CommonUtilsTest method testBuildAvroRecordSchemaInJsonForRecursiveFields.

@Test
public void testBuildAvroRecordSchemaInJsonForRecursiveFields() throws IOException {
    FieldInfo[] testFieldInfos = new FieldInfo[] { new FieldInfo("f1", new ArrayFormatInfo(new MapFormatInfo(new StringFormatInfo(), new ArrayFormatInfo(new ArrayFormatInfo(new ShortFormatInfo()))))), new FieldInfo("f2", new MapFormatInfo(new StringFormatInfo(), new MapFormatInfo(new StringFormatInfo(), new RowFormatInfo(new String[] { "f21", "f22" }, new FormatInfo[] { new IntFormatInfo(), new ArrayFormatInfo(new ByteFormatInfo()) })))), new FieldInfo("f3", new RowFormatInfo(new String[] { "f31", "f32" }, new FormatInfo[] { new ArrayFormatInfo(new StringFormatInfo()), new RowFormatInfo(new String[] { "f321", "f322" }, new FormatInfo[] { new ArrayFormatInfo(new IntFormatInfo()), new MapFormatInfo(new StringFormatInfo(), new ArrayFormatInfo(new ByteFormatInfo())) }) })) };
    // Expected Avro schema. Whitespace is insignificant because the string is parsed into a JsonNode.
    JsonNode expectedJsonNode = objectMapper.readTree("{\"type\":\"record\",\"name\":\"record\",\"fields\":["
            // f1: array of map of array of array of short
            + "{\"name\":\"f1\",\"type\":[\"null\",{\"type\":\"array\",\"items\":[\"null\",{\"type\":\"map\",\"values\":[\"null\","
            + "{\"type\":\"array\",\"items\":[\"null\",{\"type\":\"array\",\"items\":[\"null\",\"int\"]}]}]}]}],\"default\":null},"
            // f2: map of map of row(f21, f22)
            + "{\"name\":\"f2\",\"type\":[\"null\",{\"type\":\"map\",\"values\":[\"null\",{\"type\":\"map\",\"values\":[\"null\","
            + "{\"type\":\"record\",\"name\":\"record_f2\",\"fields\":["
            + "{\"name\":\"f21\",\"type\":[\"null\",\"int\"],\"default\":null},"
            + "{\"name\":\"f22\",\"type\":[\"null\",{\"type\":\"array\",\"items\":[\"null\",\"int\"]}],\"default\":null}"
            + "]}]}]}],\"default\":null},"
            // f3: row(f31, f32) where f32 is itself a row(f321, f322)
            + "{\"name\":\"f3\",\"type\":[\"null\",{\"type\":\"record\",\"name\":\"record_f3\",\"fields\":["
            + "{\"name\":\"f31\",\"type\":[\"null\",{\"type\":\"array\",\"items\":[\"null\",\"string\"]}],\"default\":null},"
            + "{\"name\":\"f32\",\"type\":[\"null\",{\"type\":\"record\",\"name\":\"record_f3_f32\",\"fields\":["
            + "{\"name\":\"f321\",\"type\":[\"null\",{\"type\":\"array\",\"items\":[\"null\",\"int\"]}],\"default\":null},"
            + "{\"name\":\"f322\",\"type\":[\"null\",{\"type\":\"map\",\"values\":[\"null\",{\"type\":\"array\",\"items\":[\"null\",\"int\"]}]}],\"default\":null}"
            + "]}],\"default\":null}"
            + "]}],\"default\":null}"
            + "]}");
    String actualJson = buildAvroRecordSchemaInJson(testFieldInfos);
    JsonNode actualJsonNode = objectMapper.readTree(actualJson);
    assertEquals(expectedJsonNode, actualJsonNode);
}
Also used : MapFormatInfo(org.apache.inlong.sort.formats.common.MapFormatInfo) ShortFormatInfo(org.apache.inlong.sort.formats.common.ShortFormatInfo) ByteFormatInfo(org.apache.inlong.sort.formats.common.ByteFormatInfo) RowFormatInfo(org.apache.inlong.sort.formats.common.RowFormatInfo) IntFormatInfo(org.apache.inlong.sort.formats.common.IntFormatInfo) ArrayFormatInfo(org.apache.inlong.sort.formats.common.ArrayFormatInfo) JsonNode(org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.JsonNode) FormatInfo(org.apache.inlong.sort.formats.common.FormatInfo) StringFormatInfo(org.apache.inlong.sort.formats.common.StringFormatInfo) BooleanFormatInfo(org.apache.inlong.sort.formats.common.BooleanFormatInfo) BuiltInFieldInfo(org.apache.inlong.sort.protocol.BuiltInFieldInfo) FieldInfo(org.apache.inlong.sort.protocol.FieldInfo) Test(org.junit.Test)
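
In the expected schema of this test, every field type and every array item type is wrapped in a union with "null", and every field carries a null default. The sketch below reproduces that shape with plain string concatenation so the pattern is easier to see. It is not how buildAvroRecordSchemaInJson is implemented (the listing does not show that method); the class NullableAvroSketch and the helpers nullable, array and field are hypothetical names used only for illustration.

```java
public class NullableAvroSketch {

    // Wraps a JSON type fragment in a union with "null", e.g. "int" becomes ["null","int"].
    public static String nullable(String typeJson) {
        return "[\"null\"," + typeJson + "]";
    }

    // Builds an Avro array type whose items are nullable.
    public static String array(String itemJson) {
        return "{\"type\":\"array\",\"items\":" + nullable(itemJson) + "}";
    }

    // Builds a record field whose type is nullable and whose default is null.
    public static String field(String name, String typeJson) {
        return "{\"name\":\"" + name + "\",\"type\":" + nullable(typeJson) + ",\"default\":null}";
    }

    public static void main(String[] args) {
        // Mirrors field f31 of the test: an array of strings.
        System.out.println(field("f31", array("\"string\"")));
        // prints {"name":"f31","type":["null",{"type":"array","items":["null","string"]}],"default":null}
    }
}
```

Listing "null" first in each union is what allows null to be used as the field default, since Avro requires a union's default value to match the first branch of the union.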

Example 15 with RowFormatInfo

use of org.apache.inlong.sort.formats.common.RowFormatInfo in project incubator-inlong by apache.

the class RowToAvroKafkaSinkTest method prepareData.

@Override
protected void prepareData() throws IOException, ClassNotFoundException {
    fieldInfos = new FieldInfo[] { new FieldInfo("f1", new StringFormatInfo()), new FieldInfo("f2", new IntFormatInfo()), new FieldInfo("f3", new NullFormatInfo()), new FieldInfo("f4", new BinaryFormatInfo()), new FieldInfo("f5", new MapFormatInfo(new StringFormatInfo(), new RowFormatInfo(new String[] { "f51", "f52" }, new FormatInfo[] { new IntFormatInfo(), new ArrayFormatInfo(new DoubleFormatInfo()) }))) };
    topic = "test_kafka_row_to_avro";
    serializationSchema = SerializationSchemaFactory.build(fieldInfos, new AvroSerializationInfo());
    prepareTestRows();
}
Also used : NullFormatInfo(org.apache.inlong.sort.formats.common.NullFormatInfo) MapFormatInfo(org.apache.inlong.sort.formats.common.MapFormatInfo) IntFormatInfo(org.apache.inlong.sort.formats.common.IntFormatInfo) RowFormatInfo(org.apache.inlong.sort.formats.common.RowFormatInfo) ArrayFormatInfo(org.apache.inlong.sort.formats.common.ArrayFormatInfo) DoubleFormatInfo(org.apache.inlong.sort.formats.common.DoubleFormatInfo) FormatInfo(org.apache.inlong.sort.formats.common.FormatInfo) BinaryFormatInfo(org.apache.inlong.sort.formats.common.BinaryFormatInfo) StringFormatInfo(org.apache.inlong.sort.formats.common.StringFormatInfo) AvroSerializationInfo(org.apache.inlong.sort.protocol.serialization.AvroSerializationInfo) FieldInfo(org.apache.inlong.sort.protocol.FieldInfo)

Aggregations

RowFormatInfo (org.apache.inlong.sort.formats.common.RowFormatInfo): 34
DescriptorProperties (org.apache.flink.table.descriptors.DescriptorProperties): 14
FormatInfo (org.apache.inlong.sort.formats.common.FormatInfo): 14
BasicFormatInfo (org.apache.inlong.sort.formats.common.BasicFormatInfo): 8
StringFormatInfo (org.apache.inlong.sort.formats.common.StringFormatInfo): 8
ArrayFormatInfo (org.apache.inlong.sort.formats.common.ArrayFormatInfo): 6
IntFormatInfo (org.apache.inlong.sort.formats.common.IntFormatInfo): 6
MapFormatInfo (org.apache.inlong.sort.formats.common.MapFormatInfo): 6
ValidationException (org.apache.flink.table.api.ValidationException): 5
BinaryFormatInfo (org.apache.inlong.sort.formats.common.BinaryFormatInfo): 5
BooleanFormatInfo (org.apache.inlong.sort.formats.common.BooleanFormatInfo): 5
ByteFormatInfo (org.apache.inlong.sort.formats.common.ByteFormatInfo): 5
DateFormatInfo (org.apache.inlong.sort.formats.common.DateFormatInfo): 5
DoubleFormatInfo (org.apache.inlong.sort.formats.common.DoubleFormatInfo): 5
NullFormatInfo (org.apache.inlong.sort.formats.common.NullFormatInfo): 5
ShortFormatInfo (org.apache.inlong.sort.formats.common.ShortFormatInfo): 5
TimeFormatInfo (org.apache.inlong.sort.formats.common.TimeFormatInfo): 5
TimestampFormatInfo (org.apache.inlong.sort.formats.common.TimestampFormatInfo): 5
Row (org.apache.flink.types.Row): 4
DecimalFormatInfo (org.apache.inlong.sort.formats.common.DecimalFormatInfo): 4