Example 46 with FileSourceSplit

use of org.apache.flink.connector.file.src.FileSourceSplit in project flink by apache.

the class LocalityAwareSplitAssigner method addSplits.

@Override
public void addSplits(Collection<FileSourceSplit> splits) {
    for (FileSourceSplit split : splits) {
        SplitWithInfo sc = new SplitWithInfo(split);
        remoteSplitChooser.addInputSplit(sc);
        unassigned.add(sc);
    }
}
Also used : FileSourceSplit(org.apache.flink.connector.file.src.FileSourceSplit)

Example 47 with FileSourceSplit

use of org.apache.flink.connector.file.src.FileSourceSplit in project flink by apache.

the class NonSplittingRecursiveEnumerator method convertToSourceSplits.

protected void convertToSourceSplits(final FileStatus file, final FileSystem fs, final List<FileSourceSplit> target) throws IOException {
    final String[] hosts = getHostsFromBlockLocations(fs.getFileBlockLocations(file, 0L, file.getLen()));
    target.add(new FileSourceSplit(getNextId(), file.getPath(), 0, file.getLen(), file.getModificationTime(), file.getLen(), hosts));
}
Also used : FileSourceSplit(org.apache.flink.connector.file.src.FileSourceSplit)

Example 48 with FileSourceSplit

use of org.apache.flink.connector.file.src.FileSourceSplit in project flink by apache.

the class NonSplittingRecursiveEnumerator method enumerateSplits.

@Override
public Collection<FileSourceSplit> enumerateSplits(Path[] paths, int minDesiredSplits) throws IOException {
    final ArrayList<FileSourceSplit> splits = new ArrayList<>();
    for (Path path : paths) {
        final FileSystem fs = path.getFileSystem();
        final FileStatus status = fs.getFileStatus(path);
        addSplitsForPath(status, fs, splits);
    }
    return splits;
}
Also used : Path(org.apache.flink.core.fs.Path) FileStatus(org.apache.flink.core.fs.FileStatus) FileSourceSplit(org.apache.flink.connector.file.src.FileSourceSplit) FileSystem(org.apache.flink.core.fs.FileSystem) ArrayList(java.util.ArrayList)

Example 49 with FileSourceSplit

use of org.apache.flink.connector.file.src.FileSourceSplit in project flink by apache.

the class ContinuousFileSplitEnumerator method assignSplits.

private void assignSplits() {
    final Iterator<Map.Entry<Integer, String>> awaitingReader = readersAwaitingSplit.entrySet().iterator();
    while (awaitingReader.hasNext()) {
        final Map.Entry<Integer, String> nextAwaiting = awaitingReader.next();
        // if the reader that requested another split has failed in the meantime, remove
        // it from the list of waiting readers
        if (!context.registeredReaders().containsKey(nextAwaiting.getKey())) {
            awaitingReader.remove();
            continue;
        }
        final String hostname = nextAwaiting.getValue();
        final int awaitingSubtask = nextAwaiting.getKey();
        final Optional<FileSourceSplit> nextSplit = splitAssigner.getNext(hostname);
        if (nextSplit.isPresent()) {
            context.assignSplit(nextSplit.get(), awaitingSubtask);
            awaitingReader.remove();
        } else {
            break;
        }
    }
}
Also used : FileSourceSplit(org.apache.flink.connector.file.src.FileSourceSplit) LinkedHashMap(java.util.LinkedHashMap) Map(java.util.Map) PendingSplitsCheckpoint(org.apache.flink.connector.file.src.PendingSplitsCheckpoint)
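
The loop in assignSplits can be tried without a Flink runtime. The following is a minimal sketch, not Flink code: the class AssignLoopSketch, its assign method, the String stand-ins for splits, and the plain Queue standing in for the split assigner are all hypothetical, and host locality is ignored. It only illustrates the pattern used above: walk the waiting readers in request order, drop requests from readers that are no longer registered, and stop at the first reader that cannot be served.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Queue;
import java.util.Set;

public class AssignLoopSketch {

    /**
     * Hands pending splits to waiting readers in request order.
     * Returns subtask -> split for every assignment made.
     * Mutates readersAwaitingSplit and pendingSplits (hypothetical stand-ins).
     */
    public static Map<Integer, String> assign(
            Map<Integer, String> readersAwaitingSplit, // subtask -> hostname
            Set<Integer> registeredReaders,
            Queue<String> pendingSplits) {
        final Map<Integer, String> assigned = new LinkedHashMap<>();
        final Iterator<Map.Entry<Integer, String>> awaitingReader =
                readersAwaitingSplit.entrySet().iterator();
        while (awaitingReader.hasNext()) {
            final Map.Entry<Integer, String> nextAwaiting = awaitingReader.next();
            // The reader failed after sending its request: forget the request.
            if (!registeredReaders.contains(nextAwaiting.getKey())) {
                awaitingReader.remove();
                continue;
            }
            // Stand-in for the assigner lookup; the hostname is not used here.
            final Optional<String> nextSplit = Optional.ofNullable(pendingSplits.poll());
            if (nextSplit.isPresent()) {
                assigned.put(nextAwaiting.getKey(), nextSplit.get());
                awaitingReader.remove();
            } else {
                // No splits left: remaining readers keep waiting.
                break;
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        Map<Integer, String> waiting = new LinkedHashMap<>();
        waiting.put(0, "host-a");
        waiting.put(1, "host-b"); // subtask 1 is not registered any more
        waiting.put(2, "host-c");
        Set<Integer> registered = new HashSet<>(Arrays.asList(0, 2));
        Queue<String> splits = new ArrayDeque<>();
        splits.add("split-1");

        System.out.println(assign(waiting, registered, splits)); // {0=split-1}
        System.out.println(waiting);                             // {2=host-c}
    }
}
```

Removing entries through the iterator rather than through the map avoids a ConcurrentModificationException while iterating, and an insertion-ordered map keeps the readers in request order.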

Example 50 with FileSourceSplit

use of org.apache.flink.connector.file.src.FileSourceSplit in project flink by apache.

the class StaticFileSplitEnumerator method handleSplitRequest.

@Override
public void handleSplitRequest(int subtask, @Nullable String hostname) {
    if (!context.registeredReaders().containsKey(subtask)) {
        // reader failed between sending the request and now. skip this request.
        return;
    }
    if (LOG.isInfoEnabled()) {
        final String hostInfo = hostname == null ? "(no host locality info)" : "(on host '" + hostname + "')";
        LOG.info("Subtask {} {} is requesting a file source split", subtask, hostInfo);
    }
    final Optional<FileSourceSplit> nextSplit = splitAssigner.getNext(hostname);
    if (nextSplit.isPresent()) {
        final FileSourceSplit split = nextSplit.get();
        context.assignSplit(split, subtask);
        LOG.info("Assigned split to subtask {} : {}", subtask, split);
    } else {
        context.signalNoMoreSplits(subtask);
        LOG.info("No more splits available for subtask {}", subtask);
    }
}
Also used : FileSourceSplit(org.apache.flink.connector.file.src.FileSourceSplit)

Aggregations

FileSourceSplit (org.apache.flink.connector.file.src.FileSourceSplit) 50
Test (org.junit.Test) 32
Path (org.apache.flink.core.fs.Path) 20
AtomicInteger (java.util.concurrent.atomic.AtomicInteger) 11
BulkFormat (org.apache.flink.connector.file.src.reader.BulkFormat) 11
Configuration (org.apache.flink.configuration.Configuration) 10
ArrayList (java.util.ArrayList) 9
TestingSplitEnumeratorContext (org.apache.flink.connector.testutils.source.reader.TestingSplitEnumeratorContext) 7
IOException (java.io.IOException) 6
RowData (org.apache.flink.table.data.RowData) 6
LogicalType (org.apache.flink.table.types.logical.LogicalType) 6
LinkedHashMap (java.util.LinkedHashMap) 5
TestingFileSystem (org.apache.flink.connector.file.src.testutils.TestingFileSystem) 5
FileStatus (org.apache.flink.core.fs.FileStatus) 5
AtomicLong (java.util.concurrent.atomic.AtomicLong) 4
BigIntType (org.apache.flink.table.types.logical.BigIntType) 4
DoubleType (org.apache.flink.table.types.logical.DoubleType) 4
IntType (org.apache.flink.table.types.logical.IntType) 4
SmallIntType (org.apache.flink.table.types.logical.SmallIntType) 4
TinyIntType (org.apache.flink.table.types.logical.TinyIntType) 4