Use of org.apache.hudi.utilities.testutils.sources.DistributedTestDataSource in project hudi by apache.
From the class TestHoodieDeltaStreamer, method testDistributedTestDataSource.
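@Test
public void testDistributedTestDataSource() {
  // Configure the test source: at most 1000 unique records, one source partition,
  // and RocksDB for the test data generator's keys.
  TypedProperties props = new TypedProperties();
  props.setProperty(SourceConfigs.MAX_UNIQUE_RECORDS_PROP, "1000");
  props.setProperty(SourceConfigs.NUM_SOURCE_PARTITIONS_PROP, "1");
  props.setProperty(SourceConfigs.USE_ROCKSDB_FOR_TEST_DATAGEN_KEYS, "true");
  // jsc and sparkSession are fields of the enclosing test class; the final constructor argument is left null.
  DistributedTestDataSource distributedTestDataSource = new DistributedTestDataSource(props, jsc, sparkSession, null);
  // Request far more records (10,000,000) than the configured unique-record maximum.
  InputBatch<JavaRDD<GenericRecord>> batch = distributedTestDataSource.fetchNext(Option.empty(), 10000000);
  // Cache the RDD before counting it.
  JavaRDD<GenericRecord> records = batch.getBatch().get();
  records.cache();
  long c = records.count();
  // The batch should contain exactly the configured 1000 records.
  assertEquals(1000, c);
}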