Example 1 with DistributedTestDataSource

Use of org.apache.hudi.utilities.testutils.sources.DistributedTestDataSource in the hudi project by apache.

From the class TestHoodieDeltaStreamer, method testDistributedTestDataSource.

@Test
public void testDistributedTestDataSource() {
    // Configure the test source: at most 1000 unique records, one source partition.
    TypedProperties props = new TypedProperties();
    props.setProperty(SourceConfigs.MAX_UNIQUE_RECORDS_PROP, "1000");
    props.setProperty(SourceConfigs.NUM_SOURCE_PARTITIONS_PROP, "1");
    props.setProperty(SourceConfigs.USE_ROCKSDB_FOR_TEST_DATAGEN_KEYS, "true");
    // jsc and sparkSession are defined outside this method.
    DistributedTestDataSource distributedTestDataSource = new DistributedTestDataSource(props, jsc, sparkSession, null);
    // Request far more records than the configured maximum, with no previous checkpoint.
    InputBatch<JavaRDD<GenericRecord>> batch = distributedTestDataSource.fetchNext(Option.empty(), 10000000);
    batch.getBatch().get().cache();
    long c = batch.getBatch().get().count();
    // The batch is expected to hold exactly the configured maximum of 1000 records.
    assertEquals(1000, c);
}
Also used: TypedProperties (org.apache.hudi.common.config.TypedProperties), DistributedTestDataSource (org.apache.hudi.utilities.testutils.sources.DistributedTestDataSource), JavaRDD (org.apache.spark.api.java.JavaRDD), ParameterizedTest (org.junit.jupiter.params.ParameterizedTest), Test (org.junit.jupiter.api.Test)
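
The test above asks for 10000000 records but asserts that only 1000 come back, which is the value set for MAX_UNIQUE_RECORDS_PROP. A minimal plain-Java sketch of that capping arithmetic follows. It assumes the first fetch simply yields the smaller of the two numbers. The class CappedBatchSketch and the method firstBatchSize are hypothetical names made up for illustration; they are not part of Hudi and do not reproduce how DistributedTestDataSource is implemented.

```java
// Hypothetical illustration only; not part of Hudi.
public class CappedBatchSketch {

    // Size of a first batch, ASSUMING the source never emits more than
    // maxUniqueRecords records no matter how large sourceLimit is.
    static long firstBatchSize(long sourceLimit, long maxUniqueRecords) {
        return Math.min(sourceLimit, maxUniqueRecords);
    }

    public static void main(String[] args) {
        // Same numbers as the test: limit 10000000, cap 1000.
        System.out.println(firstBatchSize(10_000_000L, 1_000L)); // prints 1000
        // When the limit is below the cap, the limit wins.
        System.out.println(firstBatchSize(500L, 1_000L)); // prints 500
    }
}
```

Under this assumption the first call matches the count the test asserts.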
