Example 1 with InLongMsgMixedSerializedRecord

Use of org.apache.inlong.sort.flink.InLongMsgMixedSerializedRecord in project incubator-inlong by apache.

In class InLongMsgMixedDeserializerTest, method testDeserialize.

@Test
public void testDeserialize() throws Exception {
    final InLongMsgMixedDeserializer mixedDeserializer = new InLongMsgMixedDeserializer();
    mixedDeserializer.updateDataFlow(dataFlowId1, tid, preDeserializer, deserializer);
    mixedDeserializer.updateDataFlow(dataFlowId2, tid, preDeserializer, deserializer);
    final Row row = new Row(3);
    row.setField(0, null);
    row.setField(1, new byte[0]);
    row.setField(2, tid);
    preDeserializer.records.add(row);
    mixedDeserializer.deserialize(new InLongMsgMixedSerializedRecord(), collector);
    assertEquals(2, collector.results.size());
    assertEquals(dataFlowId1, collector.results.get(0).getDataflowId());
    assertEquals(tid, collector.results.get(0).getRow().getField(2));
    assertEquals(dataFlowId2, collector.results.get(1).getDataflowId());
    assertEquals(tid, collector.results.get(1).getRow().getField(2));
}
Also used: InLongMsgMixedSerializedRecord (org.apache.inlong.sort.flink.InLongMsgMixedSerializedRecord), Row (org.apache.flink.types.Row), Test (org.junit.Test)

Example 2 with InLongMsgMixedSerializedRecord

Use of org.apache.inlong.sort.flink.InLongMsgMixedSerializedRecord in project incubator-inlong by apache.

In class DeserializationSchema, method processElement.

@Override
public void processElement(SerializedRecord serializedRecord, Context context, Collector<SerializedRecord> collector) throws Exception {
    try {
        if (enableOutputMetrics && !config.getString(Constants.SOURCE_TYPE).equals(Constants.SOURCE_TYPE_TUBE)) {
            // If source is tube, we do not output metrics of package number
            // Since the source cannot have side outputs, metrics for the source are output here.
            final MetricData metricData = new MetricData(
                    MetricSource.SOURCE, MetricType.SUCCESSFUL, serializedRecord.getTimestampMillis(), serializedRecord.getDataFlowId(), "", 1L);
            context.output(METRIC_DATA_OUTPUT_TAG, metricData);
        }
        final CallbackCollector<Record> transformCollector = new CallbackCollector<>(sourceRecord -> {
            final Record sinkRecord = fieldMappingTransformer.transform(sourceRecord);
            if (enableOutputMetrics) {
                // TODO: output this metric in the sink function.
                MetricData metricData = new MetricData(
                        MetricSource.SINK, MetricType.SUCCESSFUL, sinkRecord.getTimestampMillis(), sinkRecord.getDataflowId(), "", 1);
                context.output(METRIC_DATA_OUTPUT_TAG, metricData);
            }
            SerializedRecord serializedSinkRecord = recordTransformer.toSerializedRecord(sinkRecord);
            if (auditImp != null) {
                Pair<String, String> groupIdAndStreamId = inLongGroupIdAndStreamIdMap.getOrDefault(serializedRecord.getDataFlowId(), Pair.of("", ""));
                auditImp.add(Constants.METRIC_AUDIT_ID_FOR_INPUT, groupIdAndStreamId.getLeft(), groupIdAndStreamId.getRight(), sinkRecord.getTimestampMillis(), 1, serializedSinkRecord.getData().length);
            }
            collector.collect(serializedSinkRecord);
        });
        if (serializedRecord instanceof InLongMsgMixedSerializedRecord) {
            final InLongMsgMixedSerializedRecord inlongmsgRecord = (InLongMsgMixedSerializedRecord) serializedRecord;
            synchronized (schemaLock) {
                multiTenancyInLongMsgMixedDeserializer.deserialize(inlongmsgRecord, transformCollector);
            }
        } else {
            synchronized (schemaLock) {
                multiTenancyDeserializer.deserialize(serializedRecord, transformCollector);
            }
        }
    } catch (Exception e) {
        if (enableOutputMetrics && !config.getString(Constants.SOURCE_TYPE).equals(Constants.SOURCE_TYPE_TUBE)) {
            MetricData metricData = new MetricData(MetricSource.DESERIALIZATION, MetricType.ABANDONED, serializedRecord.getTimestampMillis(), serializedRecord.getDataFlowId(), (e.getMessage() == null || e.getMessage().isEmpty()) ? "Exception caught" : e.getMessage(), 1);
            context.output(METRIC_DATA_OUTPUT_TAG, metricData);
        }
        LOG.warn("Abandon data", e);
    }
}
Also used: InLongMsgMixedSerializedRecord (org.apache.inlong.sort.flink.InLongMsgMixedSerializedRecord), SerializedRecord (org.apache.inlong.sort.flink.SerializedRecord), Record (org.apache.inlong.sort.flink.Record), MetricData (org.apache.inlong.sort.flink.metrics.MetricData)
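
The dispatch at the end of processElement above (check for the mixed record subtype, deserialize under a shared lock, and abandon the record on any exception) can be sketched in isolation. This is only an illustrative sketch: the classes SerializedRecord and MixedSerializedRecord below are hypothetical stand-ins defined inside the example, not the real InLong types, and the string output replaces the real deserializers and collector.

```java
import java.util.ArrayList;
import java.util.List;

public class DispatchSketch {
    // Hypothetical stand-in for a plain serialized record.
    static class SerializedRecord {
        final long dataFlowId;
        SerializedRecord(long dataFlowId) { this.dataFlowId = dataFlowId; }
    }

    // Hypothetical stand-in for the mixed record subtype, which carries a topic.
    static class MixedSerializedRecord extends SerializedRecord {
        final String topic;
        MixedSerializedRecord(String topic) { super(-1L); this.topic = topic; }
    }

    // Guards the (assumed mutable) deserializer state, like schemaLock in the example above.
    private final Object schemaLock = new Object();
    final List<String> out = new ArrayList<>();

    void process(SerializedRecord record) {
        try {
            if (record instanceof MixedSerializedRecord) {
                // Subtype-specific path.
                MixedSerializedRecord mixed = (MixedSerializedRecord) record;
                synchronized (schemaLock) {
                    out.add("mixed:" + mixed.topic);
                }
            } else {
                // Generic path; a null record fails here and is abandoned below.
                synchronized (schemaLock) {
                    out.add("plain:" + record.dataFlowId);
                }
            }
        } catch (Exception e) {
            // Mirror the "abandon data" behaviour: record the failure and keep going.
            out.add("abandoned");
        }
    }

    public static void main(String[] args) {
        DispatchSketch sketch = new DispatchSketch();
        sketch.process(new MixedSerializedRecord("topic"));
        sketch.process(new SerializedRecord(1L));
        sketch.process(null);
        System.out.println(sketch.out); // prints [mixed:topic, plain:1, abandoned]
    }
}
```

Catching the exception per record keeps one malformed input from failing the whole stream, at the cost of silently dropping data unless the failure is also reported, as the original does through its ABANDONED metric.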

Example 3 with InLongMsgMixedSerializedRecord

Use of org.apache.inlong.sort.flink.InLongMsgMixedSerializedRecord in project incubator-inlong by apache.

In class MultiTenancyInLongMsgMixedDeserializerTest, method testDeserialize.

@Test
public void testDeserialize() throws Exception {
    final MultiTenancyInLongMsgMixedDeserializer deserializer = new MultiTenancyInLongMsgMixedDeserializer();
    final FieldInfo stringField = new FieldInfo("not_important", new StringFormatInfo());
    final FieldInfo longField = new FieldInfo("id", new LongFormatInfo());
    final TubeSourceInfo tubeSourceInfo = new TubeSourceInfo("topic", "address", null, new InLongMsgCsvDeserializationInfo("tid", '|', false), new FieldInfo[] { stringField, longField });
    final EmptySinkInfo sinkInfo = new EmptySinkInfo();
    final DataFlowInfo dataFlowInfo = new DataFlowInfo(1L, tubeSourceInfo, sinkInfo);
    deserializer.addDataFlow(dataFlowInfo);
    final InLongMsg inLongMsg = InLongMsg.newInLongMsg();
    final String attrs = "m=0&" + InLongMsgUtils.INLONGMSG_ATTR_STREAM_ID + "=tid&t=20210513";
    final String body1 = "tianqiwan|29";
    inLongMsg.addMsg(attrs, body1.getBytes());
    final TestingCollector<Record> collector = new TestingCollector<>();
    deserializer.deserialize(new InLongMsgMixedSerializedRecord("topic", 0, inLongMsg.buildArray()), collector);
    assertEquals(1, collector.results.size());
    assertEquals(1L, collector.results.get(0).getDataflowId());
    assertEquals(4, collector.results.get(0).getRow().getArity());
    final long time = new SimpleDateFormat("yyyyMMdd").parse("20210513").getTime();
    assertEquals(new Timestamp(time), collector.results.get(0).getRow().getField(0));
    final Map<String, String> attributes = new HashMap<>();
    attributes.put("m", "0");
    attributes.put(InLongMsgUtils.INLONGMSG_ATTR_STREAM_ID, "tid");
    attributes.put("t", "20210513");
    assertEquals(attributes, collector.results.get(0).getRow().getField(1));
    assertEquals("tianqiwan", collector.results.get(0).getRow().getField(2));
    assertEquals(29L, collector.results.get(0).getRow().getField(3));
}
Also used: EmptySinkInfo (org.apache.inlong.sort.util.TestingUtils.EmptySinkInfo), HashMap (java.util.HashMap), InLongMsg (org.apache.inlong.common.msg.InLongMsg), Timestamp (java.sql.Timestamp), TestingCollector (org.apache.inlong.sort.util.TestingUtils.TestingCollector), InLongMsgMixedSerializedRecord (org.apache.inlong.sort.flink.InLongMsgMixedSerializedRecord), Record (org.apache.inlong.sort.flink.Record), LongFormatInfo (org.apache.inlong.sort.formats.common.LongFormatInfo), InLongMsgCsvDeserializationInfo (org.apache.inlong.sort.protocol.deserialization.InLongMsgCsvDeserializationInfo), TubeSourceInfo (org.apache.inlong.sort.protocol.source.TubeSourceInfo), StringFormatInfo (org.apache.inlong.sort.formats.common.StringFormatInfo), SimpleDateFormat (java.text.SimpleDateFormat), FieldInfo (org.apache.inlong.sort.protocol.FieldInfo), DataFlowInfo (org.apache.inlong.sort.protocol.DataFlowInfo), Test (org.junit.Test)

Aggregations

InLongMsgMixedSerializedRecord (org.apache.inlong.sort.flink.InLongMsgMixedSerializedRecord) 3
Record (org.apache.inlong.sort.flink.Record) 2
Test (org.junit.Test) 2
Timestamp (java.sql.Timestamp) 1
SimpleDateFormat (java.text.SimpleDateFormat) 1
HashMap (java.util.HashMap) 1
Row (org.apache.flink.types.Row) 1
InLongMsg (org.apache.inlong.common.msg.InLongMsg) 1
SerializedRecord (org.apache.inlong.sort.flink.SerializedRecord) 1
MetricData (org.apache.inlong.sort.flink.metrics.MetricData) 1
LongFormatInfo (org.apache.inlong.sort.formats.common.LongFormatInfo) 1
StringFormatInfo (org.apache.inlong.sort.formats.common.StringFormatInfo) 1
DataFlowInfo (org.apache.inlong.sort.protocol.DataFlowInfo) 1
FieldInfo (org.apache.inlong.sort.protocol.FieldInfo) 1
InLongMsgCsvDeserializationInfo (org.apache.inlong.sort.protocol.deserialization.InLongMsgCsvDeserializationInfo) 1
TubeSourceInfo (org.apache.inlong.sort.protocol.source.TubeSourceInfo) 1
EmptySinkInfo (org.apache.inlong.sort.util.TestingUtils.EmptySinkInfo) 1
TestingCollector (org.apache.inlong.sort.util.TestingUtils.TestingCollector) 1