
Example 1 with FieldMappingRule

Use of org.apache.inlong.sort.protocol.transformation.FieldMappingRule in project incubator-inlong by apache.

Class TransformerTest, method testTransformWithNotExistSinkFieldName.

@Test
public void testTransformWithNotExistSinkFieldName() {
    TransformationInfo transformationInfo = new TransformationInfo(new FieldMappingRule(new FieldMappingUnit[] {
            new FieldMappingUnit(
                    new FieldInfo("age", StringFormatInfo.INSTANCE),
                    new FieldInfo("age_out", StringFormatInfo.INSTANCE)),
            new FieldMappingUnit(
                    new FieldInfo("name", StringFormatInfo.INSTANCE),
                    // not exist sink field name
                    new FieldInfo("name_out_not_exist", StringFormatInfo.INSTANCE)) }));
    Transformer transformer = new Transformer(transformationInfo, sourceFieldInfos, sinkFieldInfos);
    transformer.open(new Configuration());
    Row input = Row.of("name", 29, 179);
    ListCollector<Row> collector = new ListCollector<>();
    transformer.processElement(input, null, collector);
    assertEquals(Row.of(29, null), collector.getInnerList().get(0));
}
Also used : ListCollector(org.apache.inlong.sort.singletenant.flink.deserialization.ListCollector) FieldMappingRule(org.apache.inlong.sort.protocol.transformation.FieldMappingRule) Configuration(org.apache.flink.configuration.Configuration) Row(org.apache.flink.types.Row) FieldMappingUnit(org.apache.inlong.sort.protocol.transformation.FieldMappingRule.FieldMappingUnit) FieldInfo(org.apache.inlong.sort.protocol.FieldInfo) TransformationInfo(org.apache.inlong.sort.protocol.transformation.TransformationInfo) Test(org.junit.Test)

Example 2 with FieldMappingRule

Use of org.apache.inlong.sort.protocol.transformation.FieldMappingRule in project incubator-inlong by apache.

Class Transformer, method getRowTransformer.

private Function<Row, Row> getRowTransformer() {
    TransformationRule transformationRule = transformationInfo.getTransformRule();
    if (transformationRule instanceof FieldMappingRule) {
        // Get 'sink field name' => 'source field name' map
        FieldMappingRule fieldMappingRule = (FieldMappingRule) transformationRule;
        final Map<String, String> sinkFieldNameToSourceFieldName = new HashMap<>();
        for (FieldMappingUnit fieldMappingUnit : fieldMappingRule.getFieldMappingUnits()) {
            sinkFieldNameToSourceFieldName.put(fieldMappingUnit.getSinkFieldInfo().getName(), fieldMappingUnit.getSourceFieldInfo().getName());
        }
        // Get 'source field name' => 'index in source row' map
        final Map<String, Integer> sourceFieldNameToIndex = new HashMap<>();
        for (int i = 0; i < sourceFieldInfos.length; i++) {
            sourceFieldNameToIndex.put(sourceFieldInfos[i].getName(), i);
        }
        return row -> {
            final int sinkFieldsLength = sinkFieldInfos.length;
            Row sinkRow = new Row(sinkFieldsLength);
            for (int i = 0; i < sinkFieldsLength; i++) {
                String sinkFieldName = sinkFieldInfos[i].getName();
                String sourceFieldName = sinkFieldNameToSourceFieldName.get(sinkFieldName);
                if (sourceFieldName == null) {
                    LOGGER.warn("Mapping failed! Can't find correspond source field! sink field name `{}`", sinkFieldName);
                    sinkRow.setField(i, null);
                    continue;
                }
                Integer sourceFieldIndex = sourceFieldNameToIndex.get(sourceFieldName);
                if (sourceFieldIndex == null) {
                    LOGGER.warn("Mapping failed! Can't find correspond source field!" + " source field name `{}`, sink field name `{}`", sourceFieldName, sinkFieldName);
                    sinkRow.setField(i, null);
                    continue;
                }
                sinkRow.setField(i, row.getField(sourceFieldIndex));
            }
            return sinkRow;
        };
    } else {
        throw new IllegalArgumentException("Unsupported transformation rule " + transformationRule.getClass());
    }
}
Also used : FieldMappingUnit(org.apache.inlong.sort.protocol.transformation.FieldMappingRule.FieldMappingUnit) TransformationRule(org.apache.inlong.sort.protocol.transformation.TransformationRule) Logger(org.slf4j.Logger) Configuration(org.apache.flink.configuration.Configuration) LoggerFactory(org.slf4j.LoggerFactory) HashMap(java.util.HashMap) Function(java.util.function.Function) FieldMappingRule(org.apache.inlong.sort.protocol.transformation.FieldMappingRule) FieldInfo(org.apache.inlong.sort.protocol.FieldInfo) Collector(org.apache.flink.util.Collector) TransformationInfo(org.apache.inlong.sort.protocol.transformation.TransformationInfo) Map(java.util.Map) Preconditions(com.google.common.base.Preconditions) ProcessFunction(org.apache.flink.streaming.api.functions.ProcessFunction) Row(org.apache.flink.types.Row)
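
The lookup logic in getRowTransformer can be tried without Flink or InLong on the classpath. The following is a minimal sketch, not the project's code: the class FieldMappingSketch and the method mapRow are hypothetical, plain Object[] arrays stand in for Row, and the field names "height", "age_out" and "name_out" are assumed, since the test fixtures sourceFieldInfos and sinkFieldInfos are not shown in the examples.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone illustration of the sink-to-source field mapping.
public class FieldMappingSketch {

    /**
     * Builds a sink row: for each sink field, find the mapped source field name,
     * then its index in the source row. A missing mapping or an unknown source
     * field yields null in that position.
     */
    public static Object[] mapRow(Object[] sourceRow, String[] sourceFieldNames,
            String[] sinkFieldNames, Map<String, String> sinkToSource) {
        // 'source field name' => 'index in source row'
        Map<String, Integer> sourceIndex = new HashMap<>();
        for (int i = 0; i < sourceFieldNames.length; i++) {
            sourceIndex.put(sourceFieldNames[i], i);
        }
        Object[] sinkRow = new Object[sinkFieldNames.length];
        for (int i = 0; i < sinkFieldNames.length; i++) {
            String sourceName = sinkToSource.get(sinkFieldNames[i]);
            Integer index = sourceName == null ? null : sourceIndex.get(sourceName);
            sinkRow[i] = index == null ? null : sourceRow[index];
        }
        return sinkRow;
    }

    public static void main(String[] args) {
        // Assumed field layouts; the real fixtures are not part of the examples.
        String[] sourceFields = {"name", "age", "height"};
        String[] sinkFields = {"age_out", "name_out"};
        Object[] input = {"name", 29, 179};

        Map<String, String> mapping = new HashMap<>();
        mapping.put("age_out", "age");
        mapping.put("name_out", "name");
        // Both sink fields resolve: prints [29, name]
        System.out.println(Arrays.toString(mapRow(input, sourceFields, sinkFields, mapping)));

        // Source field that does not exist: prints [29, null]
        mapping.put("name_out", "name_not_exist");
        System.out.println(Arrays.toString(mapRow(input, sourceFields, sinkFields, mapping)));
    }
}
```

Under these assumptions the two printed rows correspond to the expectations in testTransform and testTransformWithNotExistSourceFieldName. Unlike the original method, the sketch does not log a warning when a lookup fails.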

Example 3 with FieldMappingRule

Use of org.apache.inlong.sort.protocol.transformation.FieldMappingRule in project incubator-inlong by apache.

Class TransformerTest, method testTransform.

@Test
public void testTransform() {
    TransformationInfo transformationInfo = new TransformationInfo(new FieldMappingRule(new FieldMappingUnit[] {
            new FieldMappingUnit(
                    new FieldInfo("age", StringFormatInfo.INSTANCE),
                    new FieldInfo("age_out", StringFormatInfo.INSTANCE)),
            new FieldMappingUnit(
                    new FieldInfo("name", StringFormatInfo.INSTANCE),
                    new FieldInfo("name_out", StringFormatInfo.INSTANCE)) }));
    Transformer transformer = new Transformer(transformationInfo, sourceFieldInfos, sinkFieldInfos);
    transformer.open(new Configuration());
    Row input = Row.of("name", 29, 179);
    ListCollector<Row> collector = new ListCollector<>();
    transformer.processElement(input, null, collector);
    assertEquals(Row.of(29, "name"), collector.getInnerList().get(0));
}
Also used : ListCollector(org.apache.inlong.sort.singletenant.flink.deserialization.ListCollector) FieldMappingRule(org.apache.inlong.sort.protocol.transformation.FieldMappingRule) Configuration(org.apache.flink.configuration.Configuration) Row(org.apache.flink.types.Row) FieldMappingUnit(org.apache.inlong.sort.protocol.transformation.FieldMappingRule.FieldMappingUnit) FieldInfo(org.apache.inlong.sort.protocol.FieldInfo) TransformationInfo(org.apache.inlong.sort.protocol.transformation.TransformationInfo) Test(org.junit.Test)

Example 4 with FieldMappingRule

Use of org.apache.inlong.sort.protocol.transformation.FieldMappingRule in project incubator-inlong by apache.

Class TransformerTest, method testTransformWithNotExistSourceFieldName.

@Test
public void testTransformWithNotExistSourceFieldName() {
    TransformationInfo transformationInfo = new TransformationInfo(new FieldMappingRule(new FieldMappingUnit[] {
            new FieldMappingUnit(
                    new FieldInfo("age", StringFormatInfo.INSTANCE),
                    new FieldInfo("age_out", StringFormatInfo.INSTANCE)),
            new FieldMappingUnit(
                    // not exist source field name
                    new FieldInfo("name_not_exist", StringFormatInfo.INSTANCE),
                    new FieldInfo("name_out", StringFormatInfo.INSTANCE)) }));
    Transformer transformer = new Transformer(transformationInfo, sourceFieldInfos, sinkFieldInfos);
    transformer.open(new Configuration());
    Row input = Row.of("name", 29, 179);
    ListCollector<Row> collector = new ListCollector<>();
    transformer.processElement(input, null, collector);
    assertEquals(Row.of(29, null), collector.getInnerList().get(0));
}
Also used : ListCollector(org.apache.inlong.sort.singletenant.flink.deserialization.ListCollector) FieldMappingRule(org.apache.inlong.sort.protocol.transformation.FieldMappingRule) Configuration(org.apache.flink.configuration.Configuration) Row(org.apache.flink.types.Row) FieldMappingUnit(org.apache.inlong.sort.protocol.transformation.FieldMappingRule.FieldMappingUnit) FieldInfo(org.apache.inlong.sort.protocol.FieldInfo) TransformationInfo(org.apache.inlong.sort.protocol.transformation.TransformationInfo) Test(org.junit.Test)

Example 5 with FieldMappingRule

Use of org.apache.inlong.sort.protocol.transformation.FieldMappingRule in project incubator-inlong by apache.

Class CommonOperateService, method createDataFlow.

/**
 * Create dataflow info for sort.
 */
public DataFlowInfo createDataFlow(InlongGroupInfo groupInfo, SinkResponse sinkResponse) {
    String groupId = sinkResponse.getInlongGroupId();
    String streamId = sinkResponse.getInlongStreamId();
    // TODO Support all source type, include AUTO_PUSH.
    List<SourceResponse> sourceList = streamSourceService.listSource(groupId, streamId);
    if (CollectionUtils.isEmpty(sourceList)) {
        throw new WorkflowListenerException(String.format("Source not found by groupId=%s and streamId=%s", groupId, streamId));
    }
    // Get all field info
    List<FieldInfo> sourceFields = new ArrayList<>();
    List<FieldInfo> sinkFields = new ArrayList<>();
    String partition = null;
    if (SinkType.forType(sinkResponse.getSinkType()) == SinkType.HIVE) {
        HiveSinkResponse hiveSink = (HiveSinkResponse) sinkResponse;
        partition = hiveSink.getPrimaryPartition();
    }
    // TODO Support more than one source and one sink
    final SourceResponse sourceResponse = sourceList.get(0);
    boolean isAllMigration = SourceInfoUtils.isBinlogAllMigration(sourceResponse);
    FieldMappingRule fieldMappingRule = FieldInfoUtils.createFieldInfo(isAllMigration, sinkResponse.getFieldList(), sourceFields, sinkFields, partition);
    // Get source info
    String masterAddress = getSpecifiedParam(Constant.TUBE_MASTER_URL);
    PulsarClusterInfo pulsarCluster = getPulsarClusterInfo(groupInfo.getMiddlewareType());
    InlongStreamInfo streamInfo = streamService.get(groupId, streamId);
    SourceInfo sourceInfo = SourceInfoUtils.createSourceInfo(pulsarCluster, masterAddress, clusterBean, groupInfo, streamInfo, sourceResponse, sourceFields);
    // Get sink info
    SinkInfo sinkInfo = SinkInfoUtils.createSinkInfo(sourceResponse, sinkResponse, sinkFields);
    // Get transformation info
    TransformationInfo transInfo = new TransformationInfo(fieldMappingRule);
    // Get properties
    Map<String, Object> properties = new HashMap<>();
    if (MapUtils.isNotEmpty(sinkResponse.getProperties())) {
        properties.putAll(sinkResponse.getProperties());
    }
    properties.put(Constant.DATA_FLOW_GROUP_ID_KEY, groupId);
    return new DataFlowInfo(sinkResponse.getId(), sourceInfo, transInfo, sinkInfo, properties);
}
Also used : SourceResponse(org.apache.inlong.manager.common.pojo.source.SourceResponse) FieldMappingRule(org.apache.inlong.sort.protocol.transformation.FieldMappingRule) SourceInfo(org.apache.inlong.sort.protocol.source.SourceInfo) HashMap(java.util.HashMap) PulsarClusterInfo(org.apache.inlong.common.pojo.dataproxy.PulsarClusterInfo) ArrayList(java.util.ArrayList) SinkInfo(org.apache.inlong.sort.protocol.sink.SinkInfo) HiveSinkResponse(org.apache.inlong.manager.common.pojo.sink.hive.HiveSinkResponse) WorkflowListenerException(org.apache.inlong.manager.common.exceptions.WorkflowListenerException) FieldInfo(org.apache.inlong.sort.protocol.FieldInfo) InlongStreamInfo(org.apache.inlong.manager.common.pojo.stream.InlongStreamInfo) TransformationInfo(org.apache.inlong.sort.protocol.transformation.TransformationInfo) DataFlowInfo(org.apache.inlong.sort.protocol.DataFlowInfo)

Aggregations

FieldInfo (org.apache.inlong.sort.protocol.FieldInfo) 6
FieldMappingRule (org.apache.inlong.sort.protocol.transformation.FieldMappingRule) 6
FieldMappingUnit (org.apache.inlong.sort.protocol.transformation.FieldMappingRule.FieldMappingUnit) 5
TransformationInfo (org.apache.inlong.sort.protocol.transformation.TransformationInfo) 5
Configuration (org.apache.flink.configuration.Configuration) 4
Row (org.apache.flink.types.Row) 4
ListCollector (org.apache.inlong.sort.singletenant.flink.deserialization.ListCollector) 3
Test (org.junit.Test) 3
ArrayList (java.util.ArrayList) 2
HashMap (java.util.HashMap) 2
Preconditions (com.google.common.base.Preconditions) 1
Map (java.util.Map) 1
Function (java.util.function.Function) 1
ProcessFunction (org.apache.flink.streaming.api.functions.ProcessFunction) 1
Collector (org.apache.flink.util.Collector) 1
PulsarClusterInfo (org.apache.inlong.common.pojo.dataproxy.PulsarClusterInfo) 1
WorkflowListenerException (org.apache.inlong.manager.common.exceptions.WorkflowListenerException) 1
SinkFieldResponse (org.apache.inlong.manager.common.pojo.sink.SinkFieldResponse) 1
HiveSinkResponse (org.apache.inlong.manager.common.pojo.sink.hive.HiveSinkResponse) 1
SourceResponse (org.apache.inlong.manager.common.pojo.source.SourceResponse) 1