Example 1 with IterableDatasetFinder

Use of org.apache.gobblin.dataset.IterableDatasetFinder in the incubator-gobblin project by Apache.

Class DatasetFinderSource, method createWorkUnitStream.

private Stream<WorkUnit> createWorkUnitStream(SourceState state) throws IOException {
    IterableDatasetFinder datasetsFinder = createDatasetsFinder(state);
    Stream<Dataset> datasetStream = datasetsFinder.getDatasetsStream(0, null);
    if (this.drilldownIntoPartitions) {
        return datasetStream.flatMap(dataset -> {
            if (dataset instanceof PartitionableDataset) {
                try {
                    return (Stream<PartitionableDataset.DatasetPartition>) ((PartitionableDataset) dataset).getPartitions(0, null);
                } catch (IOException ioe) {
                    // Pass the exception so the stack trace is not lost.
                    log.error("Failed to get partitions for dataset " + dataset.getUrn(), ioe);
                    return Stream.empty();
                }
            } else {
                return Stream.of(new DatasetWrapper(dataset));
            }
        }).map(this::workUnitForPartitionInternal);
    } else {
        return datasetStream.map(this::workUnitForDataset);
    }
}
Also used : DatasetUtils(org.apache.gobblin.data.management.dataset.DatasetUtils) WorkUnitStream(org.apache.gobblin.source.workunit.WorkUnitStream) Getter(lombok.Getter) IOException(java.io.IOException) Collectors(java.util.stream.Collectors) PartitionableDataset(org.apache.gobblin.dataset.PartitionableDataset) IterableDatasetFinder(org.apache.gobblin.dataset.IterableDatasetFinder) List(java.util.List) Slf4j(lombok.extern.slf4j.Slf4j) Stream(java.util.stream.Stream) BasicWorkUnitStream(org.apache.gobblin.source.workunit.BasicWorkUnitStream) SourceState(org.apache.gobblin.configuration.SourceState) WorkUnitStreamSource(org.apache.gobblin.source.WorkUnitStreamSource) HadoopUtils(org.apache.gobblin.util.HadoopUtils) AllArgsConstructor(lombok.AllArgsConstructor) Dataset(org.apache.gobblin.dataset.Dataset) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit)
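
The drill-down branch above relies on Stream.flatMap to turn one dataset into one element per partition, while non-partitionable datasets pass through as a single element. A minimal self-contained sketch of that pattern follows. The Item class and the "urn/partition" labels are hypothetical stand-ins invented for illustration; they are not Gobblin's Dataset or PartitionableDataset types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DrilldownSketch {
    // Hypothetical stand-in for a dataset: a URN plus zero or more partition names.
    static final class Item {
        final String urn;
        final List<String> partitions;

        Item(String urn, List<String> partitions) {
            this.urn = urn;
            this.partitions = partitions;
        }
    }

    // With drilldown: one label per partition, or one label for an item without partitions.
    // Without drilldown: exactly one label per item.
    static List<String> expand(List<Item> items, boolean drilldown) {
        if (!drilldown) {
            return items.stream().map(item -> item.urn).collect(Collectors.toList());
        }
        return items.stream().<String>flatMap(item -> {
            if (!item.partitions.isEmpty()) {
                return item.partitions.stream().map(p -> item.urn + "/" + p);
            }
            return Stream.of(item.urn);
        }).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
            new Item("dataset1", Collections.<String>emptyList()),
            new Item("dataset2", Arrays.asList("p1", "p2")),
            new Item("dataset3", Collections.<String>emptyList()));
        System.out.println(expand(items, true));
        System.out.println(expand(items, false));
    }
}
```

With the three items in main, the drill-down call yields four labels and the plain call yields three.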

Example 2 with IterableDatasetFinder

Use of org.apache.gobblin.dataset.IterableDatasetFinder in the incubator-gobblin project by Apache.

Class CopySource, method getWorkunits.

/**
 * Does the following:
 * <ul>
 * <li>Instantiate a {@link DatasetsFinder}.
 * <li>Find all {@link Dataset} using {@link DatasetsFinder}.
 * <li>For each {@link CopyableDataset} get all {@link CopyEntity}s.
 * <li>Create a {@link WorkUnit} per {@link CopyEntity}.
 * </ul>
 *
 * <p>
 * In this implementation, one workunit is created for every {@link CopyEntity} found. However, the extractor/converters
 * and writers are built to support multiple {@link CopyEntity}s per workunit.
 * </p>
 *
 * @param state see {@link org.apache.gobblin.configuration.SourceState}
 * @return Work units for copying files.
 */
@Override
public List<WorkUnit> getWorkunits(final SourceState state) {
    this.metricContext = Instrumented.getMetricContext(state, CopySource.class);
    this.lineageInfo = LineageInfo.getLineageInfo(state.getBroker());
    try {
        DeprecationUtils.renameDeprecatedKeys(state, CopyConfiguration.MAX_COPY_PREFIX + "." + CopyResourcePool.ENTITIES_KEY, Lists.newArrayList(MAX_FILES_COPIED_KEY));
        final FileSystem sourceFs = HadoopUtils.getSourceFileSystem(state);
        final FileSystem targetFs = HadoopUtils.getWriterFileSystem(state, 1, 0);
        state.setProp(SlaEventKeys.SOURCE_URI, sourceFs.getUri());
        state.setProp(SlaEventKeys.DESTINATION_URI, targetFs.getUri());
        log.info("Identified source file system at {} and target file system at {}.", sourceFs.getUri(), targetFs.getUri());
        long maxSizePerBin = state.getPropAsLong(MAX_SIZE_MULTI_WORKUNITS, 0);
        long maxWorkUnitsPerMultiWorkUnit = state.getPropAsLong(MAX_WORK_UNITS_PER_BIN, 50);
        final long minWorkUnitWeight = Math.max(1, maxSizePerBin / maxWorkUnitsPerMultiWorkUnit);
        final Optional<CopyableFileWatermarkGenerator> watermarkGenerator = CopyableFileWatermarkHelper.getCopyableFileWatermarkGenerator(state);
        int maxThreads = state.getPropAsInt(MAX_CONCURRENT_LISTING_SERVICES, DEFAULT_MAX_CONCURRENT_LISTING_SERVICES);
        final CopyConfiguration copyConfiguration = CopyConfiguration.builder(targetFs, state.getProperties()).build();
        this.eventSubmitter = new EventSubmitter.Builder(this.metricContext, CopyConfiguration.COPY_PREFIX).build();
        DatasetsFinder<CopyableDatasetBase> datasetFinder = DatasetUtils.instantiateDatasetFinder(state.getProperties(), sourceFs, DEFAULT_DATASET_PROFILE_CLASS_KEY, this.eventSubmitter, state);
        IterableDatasetFinder<CopyableDatasetBase> iterableDatasetFinder = datasetFinder instanceof IterableDatasetFinder ? (IterableDatasetFinder<CopyableDatasetBase>) datasetFinder : new IterableDatasetFinderImpl<>(datasetFinder);
        Iterator<CopyableDatasetRequestor> requestorIteratorWithNulls = Iterators.transform(iterableDatasetFinder.getDatasetsIterator(), new CopyableDatasetRequestor.Factory(targetFs, copyConfiguration, log));
        Iterator<CopyableDatasetRequestor> requestorIterator = Iterators.filter(requestorIteratorWithNulls, Predicates.<CopyableDatasetRequestor>notNull());
        final SetMultimap<FileSet<CopyEntity>, WorkUnit> workUnitsMap = Multimaps.<FileSet<CopyEntity>, WorkUnit>synchronizedSetMultimap(HashMultimap.<FileSet<CopyEntity>, WorkUnit>create());
        RequestAllocator<FileSet<CopyEntity>> allocator = createRequestAllocator(copyConfiguration, maxThreads);
        Iterator<FileSet<CopyEntity>> prioritizedFileSets = allocator.allocateRequests(requestorIterator, copyConfiguration.getMaxToCopy());
        // Submit alertable events for unfulfilled requests
        submitUnfulfilledRequestEvents(allocator);
        Iterator<Callable<Void>> callableIterator = Iterators.transform(prioritizedFileSets, new Function<FileSet<CopyEntity>, Callable<Void>>() {

            @Nullable
            @Override
            public Callable<Void> apply(FileSet<CopyEntity> input) {
                return new FileSetWorkUnitGenerator((CopyableDatasetBase) input.getDataset(), input, state, workUnitsMap, watermarkGenerator, minWorkUnitWeight);
            }
        });
        try {
            List<Future<Void>> futures = new IteratorExecutor<>(callableIterator, maxThreads, ExecutorsUtils.newDaemonThreadFactory(Optional.of(log), Optional.of("Copy-file-listing-pool-%d"))).execute();
            for (Future<Void> future : futures) {
                try {
                    future.get();
                } catch (ExecutionException exc) {
                    log.error("Failed to get work units for dataset.", exc.getCause());
                }
            }
        } catch (InterruptedException ie) {
            log.error("Retrieval of work units was interrupted. Aborting.");
            return Lists.newArrayList();
        }
        log.info(String.format("Created %s workunits ", workUnitsMap.size()));
        copyConfiguration.getCopyContext().logCacheStatistics();
        if (state.contains(SIMULATE) && state.getPropAsBoolean(SIMULATE)) {
            log.info("Simulate mode enabled. Will not execute the copy.");
            for (Map.Entry<FileSet<CopyEntity>, Collection<WorkUnit>> entry : workUnitsMap.asMap().entrySet()) {
                log.info(String.format("Actions for dataset %s file set %s.", entry.getKey().getDataset().datasetURN(), entry.getKey().getName()));
                for (WorkUnit workUnit : entry.getValue()) {
                    CopyEntity copyEntity = deserializeCopyEntity(workUnit);
                    log.info(copyEntity.explain());
                }
            }
            return Lists.newArrayList();
        }
        List<? extends WorkUnit> workUnits = new WorstFitDecreasingBinPacking(maxSizePerBin).pack(Lists.newArrayList(workUnitsMap.values()), this.weighter);
        log.info(String.format("Bin packed work units. Initial work units: %d, packed work units: %d, max weight per bin: %d, " + "max work units per bin: %d.", workUnitsMap.size(), workUnits.size(), maxSizePerBin, maxWorkUnitsPerMultiWorkUnit));
        return ImmutableList.copyOf(workUnits);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
Also used : IterableDatasetFinder(org.apache.gobblin.dataset.IterableDatasetFinder) Callable(java.util.concurrent.Callable) WorstFitDecreasingBinPacking(org.apache.gobblin.util.binpacking.WorstFitDecreasingBinPacking) FileSystem(org.apache.hadoop.fs.FileSystem) ExecutionException(java.util.concurrent.ExecutionException) CopyableDatasetRequestor(org.apache.gobblin.data.management.partition.CopyableDatasetRequestor) FileSet(org.apache.gobblin.data.management.partition.FileSet) IOException(java.io.IOException) CopyableFileWatermarkGenerator(org.apache.gobblin.data.management.copy.watermark.CopyableFileWatermarkGenerator) Future(java.util.concurrent.Future) Collection(java.util.Collection) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) Map(java.util.Map) ImmutableMap(com.google.common.collect.ImmutableMap) Nullable(javax.annotation.Nullable)
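
The last step of getWorkunits hands the collected work units to WorstFitDecreasingBinPacking with a maximum weight per bin. As a rough illustration of the general worst-fit decreasing idea, here is a generic textbook-style sketch. It is not Gobblin's implementation: the Bin class, the use of plain long weights, and the handling of oversized items are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class WorstFitDecreasingSketch {
    // Hypothetical bin: the weights placed in it and their running total.
    static final class Bin {
        long total;
        final List<Long> items = new ArrayList<>();

        void add(long weight) {
            items.add(weight);
            total += weight;
        }
    }

    static List<Bin> pack(List<Long> weights, long maxWeightPerBin) {
        List<Long> sorted = new ArrayList<>(weights);
        sorted.sort(Comparator.reverseOrder()); // "decreasing": heaviest items first

        // "Worst fit": always try the bin with the smallest current load.
        PriorityQueue<Bin> byLoad = new PriorityQueue<>(Comparator.comparingLong((Bin b) -> b.total));
        for (long weight : sorted) {
            Bin target = byLoad.poll(); // null when no bin exists yet
            if (target != null && target.total + weight > maxWeightPerBin) {
                byLoad.add(target);     // does not fit even in the emptiest bin
                target = null;
            }
            if (target == null) {
                target = new Bin();     // an oversized item ends up alone in its own bin
            }
            target.add(weight);
            byLoad.add(target);
        }
        return new ArrayList<>(byLoad);
    }

    public static void main(String[] args) {
        List<Bin> bins = pack(Arrays.asList(5L, 4L, 3L, 2L, 1L), 6L);
        for (Bin bin : bins) {
            System.out.println(bin.items + " total=" + bin.total);
        }
    }
}
```

One plausible reading of minWorkUnitWeight in the method above, stated here as an assumption rather than confirmed behavior, is that giving every work unit a weight of at least maxSizePerBin / maxWorkUnitsPerMultiWorkUnit keeps a bin from holding more than roughly maxWorkUnitsPerMultiWorkUnit units.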

Example 3 with IterableDatasetFinder

Use of org.apache.gobblin.dataset.IterableDatasetFinder in the incubator-gobblin project by Apache.

Class CopySourceTest, method testSubmitUnfulfilledRequestEvents.

@Test
public void testSubmitUnfulfilledRequestEvents() throws IOException, NoSuchMethodException, InvocationTargetException, IllegalAccessException {
    SourceState state = new SourceState();
    state.setProp(ConfigurationKeys.SOURCE_FILEBASED_FS_URI, "file:///");
    state.setProp(ConfigurationKeys.WRITER_FILE_SYSTEM_URI, "file:///");
    state.setProp(ConfigurationKeys.DATA_PUBLISHER_FINAL_DIR, "/target/dir");
    state.setProp(DatasetUtils.DATASET_PROFILE_CLASS_KEY, TestCopyablePartitionableDatasedFinder.class.getCanonicalName());
    state.setProp(CopySource.MAX_CONCURRENT_LISTING_SERVICES, 2);
    state.setProp(CopyConfiguration.MAX_COPY_PREFIX + ".size", "50");
    state.setProp(CopyConfiguration.MAX_COPY_PREFIX + ".copyEntities", 2);
    state.setProp(CopyConfiguration.STORE_REJECTED_REQUESTS_KEY, RequestAllocatorConfig.StoreRejectedRequestsConfig.ALL.name().toLowerCase());
    state.setProp(ConfigurationKeys.METRICS_CUSTOM_BUILDERS, "org.apache.gobblin.metrics.ConsoleEventReporterFactory");
    CopySource source = new CopySource();
    final FileSystem sourceFs = HadoopUtils.getSourceFileSystem(state);
    final FileSystem targetFs = HadoopUtils.getWriterFileSystem(state, 1, 0);
    int maxThreads = state.getPropAsInt(CopySource.MAX_CONCURRENT_LISTING_SERVICES, CopySource.DEFAULT_MAX_CONCURRENT_LISTING_SERVICES);
    final CopyConfiguration copyConfiguration = CopyConfiguration.builder(targetFs, state.getProperties()).build();
    MetricContext metricContext = Instrumented.getMetricContext(state, CopySource.class);
    EventSubmitter eventSubmitter = new EventSubmitter.Builder(metricContext, CopyConfiguration.COPY_PREFIX).build();
    DatasetsFinder<CopyableDatasetBase> datasetFinder = DatasetUtils.instantiateDatasetFinder(state.getProperties(), sourceFs, CopySource.DEFAULT_DATASET_PROFILE_CLASS_KEY, eventSubmitter, state);
    IterableDatasetFinder<CopyableDatasetBase> iterableDatasetFinder = datasetFinder instanceof IterableDatasetFinder ? (IterableDatasetFinder<CopyableDatasetBase>) datasetFinder : new IterableDatasetFinderImpl<>(datasetFinder);
    Iterator<CopyableDatasetRequestor> requestorIteratorWithNulls = Iterators.transform(iterableDatasetFinder.getDatasetsIterator(), new CopyableDatasetRequestor.Factory(targetFs, copyConfiguration, log));
    Iterator<CopyableDatasetRequestor> requestorIterator = Iterators.filter(requestorIteratorWithNulls, Predicates.<CopyableDatasetRequestor>notNull());
    Method m = CopySource.class.getDeclaredMethod("createRequestAllocator", CopyConfiguration.class, int.class);
    m.setAccessible(true);
    PriorityIterableBasedRequestAllocator<FileSet<CopyEntity>> allocator = (PriorityIterableBasedRequestAllocator<FileSet<CopyEntity>>) m.invoke(source, copyConfiguration, maxThreads);
    Iterator<FileSet<CopyEntity>> prioritizedFileSets = allocator.allocateRequests(requestorIterator, copyConfiguration.getMaxToCopy());
    List<FileSet<CopyEntity>> fileSetList = allocator.getRequestsExceedingAvailableResourcePool();
    Assert.assertEquals(fileSetList.size(), 2);
    FileSet<CopyEntity> fileSet = fileSetList.get(0);
    Assert.assertEquals(fileSet.getDataset().getUrn(), "/test");
    Assert.assertEquals(fileSet.getTotalEntities(), 5);
    Assert.assertEquals(fileSet.getTotalSizeInBytes(), 50);
    fileSet = fileSetList.get(1);
    Assert.assertEquals(fileSet.getDataset().getUrn(), "/test");
    Assert.assertEquals(fileSet.getTotalEntities(), 5);
    Assert.assertEquals(fileSet.getTotalSizeInBytes(), 50);
}
Also used : IterableDatasetFinder(org.apache.gobblin.dataset.IterableDatasetFinder) MetricContext(org.apache.gobblin.metrics.MetricContext) FileSystem(org.apache.hadoop.fs.FileSystem) CopyableDatasetRequestor(org.apache.gobblin.data.management.partition.CopyableDatasetRequestor) SourceState(org.apache.gobblin.configuration.SourceState) FileSet(org.apache.gobblin.data.management.partition.FileSet) PriorityIterableBasedRequestAllocator(org.apache.gobblin.util.request_allocation.PriorityIterableBasedRequestAllocator) EventSubmitter(org.apache.gobblin.metrics.event.EventSubmitter) Method(java.lang.reflect.Method) Test(org.testng.annotations.Test)
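
The test above reaches the non-public createRequestAllocator method through reflection: getDeclaredMethod with the parameter types, setAccessible(true), then invoke. A minimal sketch of that technique in isolation is shown below. The Target class and its describe method are hypothetical and exist only for this example.

```java
import java.lang.reflect.Method;

class PrivateMethodAccessSketch {
    // Hypothetical class with a private method, standing in for the class under test.
    static class Target {
        private String describe(String name, int threads) {
            return name + ":" + threads;
        }
    }

    static Object invokePrivate() {
        try {
            // Primitive parameters are matched with int.class, not Integer.class.
            Method m = Target.class.getDeclaredMethod("describe", String.class, int.class);
            m.setAccessible(true); // lift the access check for this Method object only
            return m.invoke(new Target(), "allocator", 2);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invokePrivate());
    }
}
```

The result of invoke is typed as Object, which is why the test casts it to the allocator type it expects.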

Example 4 with IterableDatasetFinder

Use of org.apache.gobblin.dataset.IterableDatasetFinder in the incubator-gobblin project by Apache.

Class DatasetFinderSourceTest, method testDrilledDown.

@Test
public void testDrilledDown() {
    Dataset dataset1 = new SimpleDatasetForTesting("dataset1");
    Dataset dataset2 = new SimplePartitionableDatasetForTesting("dataset2", Lists.newArrayList(new SimpleDatasetPartitionForTesting("p1"), new SimpleDatasetPartitionForTesting("p2")));
    Dataset dataset3 = new SimpleDatasetForTesting("dataset3");
    IterableDatasetFinder finder = new StaticDatasetsFinderForTesting(Lists.newArrayList(dataset1, dataset2, dataset3));
    MySource mySource = new MySource(true, finder);
    List<WorkUnit> workUnits = mySource.getWorkunits(new SourceState());
    Assert.assertEquals(workUnits.size(), 4);
    Assert.assertEquals(workUnits.get(0).getProp(DATASET_URN), "dataset1");
    Assert.assertNull(workUnits.get(0).getProp(PARTITION_URN));
    Assert.assertEquals(workUnits.get(1).getProp(DATASET_URN), "dataset2");
    Assert.assertEquals(workUnits.get(1).getProp(PARTITION_URN), "p1");
    Assert.assertEquals(workUnits.get(2).getProp(DATASET_URN), "dataset2");
    Assert.assertEquals(workUnits.get(2).getProp(PARTITION_URN), "p2");
    Assert.assertEquals(workUnits.get(3).getProp(DATASET_URN), "dataset3");
    Assert.assertNull(workUnits.get(3).getProp(PARTITION_URN));
    WorkUnitStream workUnitStream = mySource.getWorkunitStream(new SourceState());
    Assert.assertEquals(Lists.newArrayList(workUnitStream.getWorkUnits()), workUnits);
}
Also used : SimpleDatasetPartitionForTesting(org.apache.gobblin.dataset.test.SimpleDatasetPartitionForTesting) WorkUnitStream(org.apache.gobblin.source.workunit.WorkUnitStream) SimpleDatasetForTesting(org.apache.gobblin.dataset.test.SimpleDatasetForTesting) SourceState(org.apache.gobblin.configuration.SourceState) IterableDatasetFinder(org.apache.gobblin.dataset.IterableDatasetFinder) PartitionableDataset(org.apache.gobblin.dataset.PartitionableDataset) Dataset(org.apache.gobblin.dataset.Dataset) SimplePartitionableDatasetForTesting(org.apache.gobblin.dataset.test.SimplePartitionableDatasetForTesting) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) StaticDatasetsFinderForTesting(org.apache.gobblin.dataset.test.StaticDatasetsFinderForTesting) Test(org.testng.annotations.Test)

Example 5 with IterableDatasetFinder

Use of org.apache.gobblin.dataset.IterableDatasetFinder in the incubator-gobblin project by Apache.

Class LoopingDatasetFinderSourceTest, method testNonDrilldown.

@Test
public void testNonDrilldown() {
    Dataset dataset1 = new SimpleDatasetForTesting("dataset1");
    Dataset dataset2 = new SimplePartitionableDatasetForTesting("dataset2", Lists.newArrayList(new SimpleDatasetPartitionForTesting("p1"), new SimpleDatasetPartitionForTesting("p2")));
    Dataset dataset3 = new SimpleDatasetForTesting("dataset3");
    Dataset dataset4 = new SimpleDatasetForTesting("dataset4");
    Dataset dataset5 = new SimpleDatasetForTesting("dataset5");
    IterableDatasetFinder finder = new StaticDatasetsFinderForTesting(Lists.newArrayList(dataset5, dataset4, dataset3, dataset2, dataset1));
    MySource mySource = new MySource(false, finder);
    SourceState sourceState = new SourceState();
    sourceState.setProp(LoopingDatasetFinderSource.MAX_WORK_UNITS_PER_RUN_KEY, 3);
    WorkUnitStream workUnitStream = mySource.getWorkunitStream(sourceState);
    List<WorkUnit> workUnits = Lists.newArrayList(workUnitStream.getWorkUnits());
    Assert.assertEquals(workUnits.size(), 3);
    Assert.assertEquals(workUnits.get(0).getProp(DatasetFinderSourceTest.DATASET_URN), "dataset1");
    Assert.assertNull(workUnits.get(0).getProp(DatasetFinderSourceTest.PARTITION_URN));
    Assert.assertEquals(workUnits.get(1).getProp(DatasetFinderSourceTest.DATASET_URN), "dataset2");
    Assert.assertNull(workUnits.get(1).getProp(DatasetFinderSourceTest.PARTITION_URN));
    Assert.assertEquals(workUnits.get(2).getProp(DatasetFinderSourceTest.DATASET_URN), "dataset3");
    Assert.assertNull(workUnits.get(2).getProp(DatasetFinderSourceTest.PARTITION_URN));
    // Second run should continue where it left off
    List<WorkUnitState> workUnitStates = workUnits.stream().map(WorkUnitState::new).collect(Collectors.toList());
    SourceState sourceStateSpy = Mockito.spy(sourceState);
    Mockito.doReturn(workUnitStates).when(sourceStateSpy).getPreviousWorkUnitStates();
    workUnitStream = mySource.getWorkunitStream(sourceStateSpy);
    workUnits = Lists.newArrayList(workUnitStream.getWorkUnits());
    Assert.assertEquals(workUnits.size(), 3);
    Assert.assertEquals(workUnits.get(0).getProp(DatasetFinderSourceTest.DATASET_URN), "dataset4");
    Assert.assertNull(workUnits.get(0).getProp(DatasetFinderSourceTest.PARTITION_URN));
    Assert.assertEquals(workUnits.get(1).getProp(DatasetFinderSourceTest.DATASET_URN), "dataset5");
    Assert.assertNull(workUnits.get(1).getProp(DatasetFinderSourceTest.PARTITION_URN));
    Assert.assertTrue(workUnits.get(2).getPropAsBoolean(LoopingDatasetFinderSource.END_OF_DATASETS_KEY));
    // Loop around
    workUnitStates = workUnits.stream().map(WorkUnitState::new).collect(Collectors.toList());
    Mockito.doReturn(workUnitStates).when(sourceStateSpy).getPreviousWorkUnitStates();
    workUnitStream = mySource.getWorkunitStream(sourceStateSpy);
    workUnits = Lists.newArrayList(workUnitStream.getWorkUnits());
    Assert.assertEquals(workUnits.size(), 3);
    Assert.assertEquals(workUnits.get(0).getProp(DatasetFinderSourceTest.DATASET_URN), "dataset1");
    Assert.assertNull(workUnits.get(0).getProp(DatasetFinderSourceTest.PARTITION_URN));
    Assert.assertEquals(workUnits.get(1).getProp(DatasetFinderSourceTest.DATASET_URN), "dataset2");
    Assert.assertNull(workUnits.get(1).getProp(DatasetFinderSourceTest.PARTITION_URN));
    Assert.assertEquals(workUnits.get(2).getProp(DatasetFinderSourceTest.DATASET_URN), "dataset3");
    Assert.assertNull(workUnits.get(2).getProp(DatasetFinderSourceTest.PARTITION_URN));
}
Also used : SimpleDatasetPartitionForTesting(org.apache.gobblin.dataset.test.SimpleDatasetPartitionForTesting) WorkUnitStream(org.apache.gobblin.source.workunit.WorkUnitStream) SimpleDatasetForTesting(org.apache.gobblin.dataset.test.SimpleDatasetForTesting) SourceState(org.apache.gobblin.configuration.SourceState) IterableDatasetFinder(org.apache.gobblin.dataset.IterableDatasetFinder) PartitionableDataset(org.apache.gobblin.dataset.PartitionableDataset) Dataset(org.apache.gobblin.dataset.Dataset) WorkUnitState(org.apache.gobblin.configuration.WorkUnitState) SimplePartitionableDatasetForTesting(org.apache.gobblin.dataset.test.SimplePartitionableDatasetForTesting) WorkUnit(org.apache.gobblin.source.workunit.WorkUnit) StaticDatasetsFinderForTesting(org.apache.gobblin.dataset.test.StaticDatasetsFinderForTesting) Test(org.testng.annotations.Test)
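
The assertions in this test describe a resumable loop: each run emits at most three work units in URN order, the next run continues after the last URN seen, and once the datasets run out an end marker is emitted so the following run starts over. The sketch below models only that observable behavior with plain strings. It is an assumption-laden illustration, not LoopingDatasetFinderSource itself: the nextRun method, the END marker value, and the handling of a run that lands exactly on the last dataset are all invented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class LoopingRunSketch {
    // Hypothetical end marker; the real source flags a work unit with END_OF_DATASETS_KEY instead.
    static final String END = "<end-of-datasets>";

    // previousLast is the last element of the previous run, or null on the very first run.
    static List<String> nextRun(List<String> urns, String previousLast, int maxPerRun) {
        TreeSet<String> sorted = new TreeSet<>(urns); // process in URN order
        // Restart from the beginning on the first run or after an end marker,
        // otherwise resume strictly after the last URN already handled.
        Iterable<String> remaining = (previousLast == null || END.equals(previousLast))
            ? sorted
            : sorted.tailSet(previousLast, false);

        List<String> run = new ArrayList<>();
        for (String urn : remaining) {
            if (run.size() == maxPerRun) {
                return run; // quota reached while datasets remain
            }
            run.add(urn);
        }
        if (run.size() < maxPerRun) {
            run.add(END);   // datasets exhausted: signal that the next run loops around
        }
        return run;
    }

    public static void main(String[] args) {
        List<String> urns = Arrays.asList("dataset5", "dataset4", "dataset3", "dataset2", "dataset1");
        String last = null;
        for (int i = 0; i < 3; i++) {
            List<String> run = nextRun(urns, last, 3);
            System.out.println(run);
            last = run.get(run.size() - 1);
        }
    }
}
```

Run with the five URNs from the test, the three iterations in main reproduce the same sequence the assertions check: the first three datasets, then the last two plus the end marker, then the first three again.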

Aggregations

IterableDatasetFinder (org.apache.gobblin.dataset.IterableDatasetFinder): 8
SourceState (org.apache.gobblin.configuration.SourceState): 7
WorkUnit (org.apache.gobblin.source.workunit.WorkUnit): 7
Dataset (org.apache.gobblin.dataset.Dataset): 6
PartitionableDataset (org.apache.gobblin.dataset.PartitionableDataset): 6
WorkUnitStream (org.apache.gobblin.source.workunit.WorkUnitStream): 6
Test (org.testng.annotations.Test): 5
SimpleDatasetForTesting (org.apache.gobblin.dataset.test.SimpleDatasetForTesting): 4
SimpleDatasetPartitionForTesting (org.apache.gobblin.dataset.test.SimpleDatasetPartitionForTesting): 4
SimplePartitionableDatasetForTesting (org.apache.gobblin.dataset.test.SimplePartitionableDatasetForTesting): 4
StaticDatasetsFinderForTesting (org.apache.gobblin.dataset.test.StaticDatasetsFinderForTesting): 4
IOException (java.io.IOException): 3
WorkUnitState (org.apache.gobblin.configuration.WorkUnitState): 3
List (java.util.List): 2
Stream (java.util.stream.Stream): 2
Nullable (javax.annotation.Nullable): 2
Slf4j (lombok.extern.slf4j.Slf4j): 2
CopyableDatasetRequestor (org.apache.gobblin.data.management.partition.CopyableDatasetRequestor): 2
FileSet (org.apache.gobblin.data.management.partition.FileSet): 2
BasicWorkUnitStream (org.apache.gobblin.source.workunit.BasicWorkUnitStream): 2