Example 6 with TestingSourceSettings

use of org.apache.flink.connector.testframe.external.source.TestingSourceSettings in project flink by apache.

the class SourceTestSuiteBase method testMultipleSplits.

/**
 * Test connector source with multiple splits in the external system.
 *
 * <p>This test will create 4 splits in the external system, write test data to all splits, and
 * consume it back via a Flink job with a parallelism of 4.
 *
 * <p>The number and order of records in each split consumed by Flink need to be identical to
 * the test data written into the external system to pass this test. There's no requirement for
 * record order across splits.
 *
 * <p>A bounded source is required for this test.
 */
@TestTemplate
@DisplayName("Test source with multiple splits")
public void testMultipleSplits(
        TestEnvironment testEnv,
        DataStreamSourceExternalContext<T> externalContext,
        CheckpointingMode semantic)
        throws Exception {
    // Step 1: Preparation
    TestingSourceSettings sourceSettings =
            TestingSourceSettings.builder()
                    .setBoundedness(Boundedness.BOUNDED)
                    .setCheckpointingMode(semantic)
                    .build();
    TestEnvironmentSettings envOptions =
            TestEnvironmentSettings.builder()
                    .setConnectorJarPaths(externalContext.getConnectorJarPaths())
                    .build();
    Source<T, ?, ?> source = tryCreateSource(externalContext, sourceSettings);
    // Step 2: Write test data to external system
    int splitNumber = 4;
    List<List<T>> testRecordsLists = new ArrayList<>();
    for (int i = 0; i < splitNumber; i++) {
        testRecordsLists.add(generateAndWriteTestData(i, externalContext, sourceSettings));
    }
    // Step 3: Build and execute Flink job
    StreamExecutionEnvironment execEnv = testEnv.createExecutionEnvironment(envOptions);
    DataStreamSource<T> stream =
            execEnv.fromSource(source, WatermarkStrategy.noWatermarks(), "Tested Source")
                    .setParallelism(splitNumber);
    CollectIteratorBuilder<T> iteratorBuilder = addCollectSink(stream);
    JobClient jobClient = submitJob(execEnv, "Source Multiple Split Test");
    // Step 4: Validate test data
    try (CloseableIterator<T> resultIterator = iteratorBuilder.build(jobClient)) {
        // Check test result
        LOG.info("Checking test results");
        checkResultWithSemantic(resultIterator, testRecordsLists, semantic, null);
    }
}
Also used : ArrayList(java.util.ArrayList) TestEnvironmentSettings(org.apache.flink.connector.testframe.environment.TestEnvironmentSettings) JobClient(org.apache.flink.core.execution.JobClient) DEFAULT_COLLECT_DATA_TIMEOUT(org.apache.flink.connector.testframe.utils.ConnectorTestConstants.DEFAULT_COLLECT_DATA_TIMEOUT) DEFAULT_JOB_STATUS_CHANGE_TIMEOUT(org.apache.flink.connector.testframe.utils.ConnectorTestConstants.DEFAULT_JOB_STATUS_CHANGE_TIMEOUT) List(java.util.List) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) TestingSourceSettings(org.apache.flink.connector.testframe.external.source.TestingSourceSettings) TestTemplate(org.junit.jupiter.api.TestTemplate) DisplayName(org.junit.jupiter.api.DisplayName)
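
The Javadoc of testMultipleSplits requires the record order to match within each split but not across splits. The following is a minimal sketch of what such a check could look like in plain Java. It is not the implementation of checkResultWithSemantic; the class PerSplitOrderCheck and the method matchesPerSplitOrder are hypothetical names, and the sketch assumes that every record is unique across all splits and that each record is delivered exactly once.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical helper for illustration only; not part of Flink's connector test framework.
final class PerSplitOrderCheck {

    /**
     * Returns true if {@code consumed} contains exactly the records of all splits and the
     * records of each split appear in their original relative order.
     *
     * <p>Assumption: every record is unique across all splits, so a greedy match is sufficient.
     */
    static <T> boolean matchesPerSplitOrder(List<List<T>> expectedSplits, List<T> consumed) {
        // cursors[i] is the index of the next unmatched record in split i
        int[] cursors = new int[expectedSplits.size()];
        for (T record : consumed) {
            boolean matched = false;
            for (int i = 0; i < expectedSplits.size(); i++) {
                List<T> split = expectedSplits.get(i);
                if (cursors[i] < split.size() && Objects.equals(split.get(cursors[i]), record)) {
                    cursors[i]++;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                // Unknown, duplicated, or out-of-order record
                return false;
            }
        }
        // Every split must have been consumed completely
        for (int i = 0; i < cursors.length; i++) {
            if (cursors[i] != expectedSplits.get(i).size()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> splits =
                Arrays.asList(Arrays.asList("a1", "a2"), Arrays.asList("b1", "b2"));
        // Interleaving across splits is allowed
        System.out.println(matchesPerSplitOrder(splits, Arrays.asList("b1", "a1", "a2", "b2")));
        // Reordering within a split is not
        System.out.println(matchesPerSplitOrder(splits, Arrays.asList("a2", "a1", "b1", "b2")));
    }
}
```

Keeping one cursor per split means the check runs in a single pass over the consumed records and never compares positions across different splits.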

Example 7 with TestingSourceSettings

use of org.apache.flink.connector.testframe.external.source.TestingSourceSettings in project flink by apache.

the class SourceTestSuiteBase method testSourceSingleSplit.

// ----------------------------- Basic test cases ---------------------------------
/**
 * Test connector source with only one split in the external system.
 *
 * <p>This test will create one split in the external system, write test data into it, and
 * consume it back via a Flink job with a parallelism of 1.
 *
 * <p>The number and order of records consumed by Flink need to be identical to the test data
 * written to the external system in order to pass this test.
 *
 * <p>A bounded source is required for this test.
 */
@TestTemplate
@DisplayName("Test source with single split")
public void testSourceSingleSplit(
        TestEnvironment testEnv,
        DataStreamSourceExternalContext<T> externalContext,
        CheckpointingMode semantic)
        throws Exception {
    // Step 1: Preparation
    TestingSourceSettings sourceSettings =
            TestingSourceSettings.builder()
                    .setBoundedness(Boundedness.BOUNDED)
                    .setCheckpointingMode(semantic)
                    .build();
    TestEnvironmentSettings envSettings =
            TestEnvironmentSettings.builder()
                    .setConnectorJarPaths(externalContext.getConnectorJarPaths())
                    .build();
    Source<T, ?, ?> source = tryCreateSource(externalContext, sourceSettings);
    // Step 2: Write test data to external system
    List<T> testRecords = generateAndWriteTestData(0, externalContext, sourceSettings);
    // Step 3: Build and execute Flink job
    StreamExecutionEnvironment execEnv = testEnv.createExecutionEnvironment(envSettings);
    DataStreamSource<T> stream =
            execEnv.fromSource(source, WatermarkStrategy.noWatermarks(), "Tested Source")
                    .setParallelism(1);
    CollectIteratorBuilder<T> iteratorBuilder = addCollectSink(stream);
    JobClient jobClient = submitJob(execEnv, "Source Single Split Test");
    // Step 4: Validate test data
    try (CollectResultIterator<T> resultIterator = iteratorBuilder.build(jobClient)) {
        // Check test result
        LOG.info("Checking test results");
        checkResultWithSemantic(resultIterator, Arrays.asList(testRecords), semantic, null);
    }
    // Step 5: Clean up
    waitForJobStatus(
            jobClient,
            Collections.singletonList(JobStatus.FINISHED),
            Deadline.fromNow(DEFAULT_JOB_STATUS_CHANGE_TIMEOUT));
}
Also used : DEFAULT_COLLECT_DATA_TIMEOUT(org.apache.flink.connector.testframe.utils.ConnectorTestConstants.DEFAULT_COLLECT_DATA_TIMEOUT) DEFAULT_JOB_STATUS_CHANGE_TIMEOUT(org.apache.flink.connector.testframe.utils.ConnectorTestConstants.DEFAULT_JOB_STATUS_CHANGE_TIMEOUT) StreamExecutionEnvironment(org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) TestEnvironmentSettings(org.apache.flink.connector.testframe.environment.TestEnvironmentSettings) TestingSourceSettings(org.apache.flink.connector.testframe.external.source.TestingSourceSettings) JobClient(org.apache.flink.core.execution.JobClient) TestTemplate(org.junit.jupiter.api.TestTemplate) DisplayName(org.junit.jupiter.api.DisplayName)
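
The clean-up step in testSourceSingleSplit blocks until the job reports FINISHED or a deadline passes. The general pattern behind such a call can be sketched in plain Java as below. This is only an illustration of polling against a deadline, not Flink's waitForJobStatus; the class StatusPolling, the method waitForStatus, and the pollInterval parameter are hypothetical.

```java
import java.time.Duration;
import java.util.Collection;
import java.util.Collections;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not Flink's waitForJobStatus.
final class StatusPolling {

    /**
     * Polls {@code statusSupplier} until it returns one of the {@code expected} values,
     * or throws a TimeoutException once {@code timeout} has elapsed.
     */
    static <S> S waitForStatus(
            Supplier<S> statusSupplier,
            Collection<S> expected,
            Duration timeout,
            Duration pollInterval)
            throws InterruptedException, TimeoutException {
        // nanoTime is monotonic, so the deadline is unaffected by wall-clock changes
        long deadlineNanos = System.nanoTime() + timeout.toNanos();
        while (true) {
            S current = statusSupplier.get();
            if (expected.contains(current)) {
                return current;
            }
            if (System.nanoTime() - deadlineNanos >= 0) {
                throw new TimeoutException(
                        "Status " + current + " did not reach " + expected + " within " + timeout);
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated job that reports RUNNING twice before it reports FINISHED
        AtomicInteger calls = new AtomicInteger();
        Supplier<String> status = () -> calls.incrementAndGet() < 3 ? "RUNNING" : "FINISHED";
        String result =
                waitForStatus(
                        status,
                        Collections.singletonList("FINISHED"),
                        Duration.ofSeconds(1),
                        Duration.ofMillis(1));
        System.out.println(result + " after " + calls.get() + " polls");
    }
}
```

Checking the status before checking the deadline ensures that a job which finishes on the final poll is still reported as successful.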

Aggregations

TestEnvironmentSettings (org.apache.flink.connector.testframe.environment.TestEnvironmentSettings) 7
TestingSourceSettings (org.apache.flink.connector.testframe.external.source.TestingSourceSettings) 7
StreamExecutionEnvironment (org.apache.flink.streaming.api.environment.StreamExecutionEnvironment) 7
DEFAULT_COLLECT_DATA_TIMEOUT (org.apache.flink.connector.testframe.utils.ConnectorTestConstants.DEFAULT_COLLECT_DATA_TIMEOUT) 6
DEFAULT_JOB_STATUS_CHANGE_TIMEOUT (org.apache.flink.connector.testframe.utils.ConnectorTestConstants.DEFAULT_JOB_STATUS_CHANGE_TIMEOUT) 6
JobClient (org.apache.flink.core.execution.JobClient) 6
DisplayName (org.junit.jupiter.api.DisplayName) 6
TestTemplate (org.junit.jupiter.api.TestTemplate) 6
ArrayList (java.util.ArrayList) 4
List (java.util.List) 4
TestAbortedException (org.opentest4j.TestAbortedException) 2
ExecutorService (java.util.concurrent.ExecutorService) 1
Configuration (org.apache.flink.configuration.Configuration) 1
ExternalSystemSplitDataWriter (org.apache.flink.connector.testframe.external.ExternalSystemSplitDataWriter) 1
MetricQuerier (org.apache.flink.connector.testframe.utils.MetricQuerier) 1
RestClient (org.apache.flink.runtime.rest.RestClient) 1