
Example 1 with FileSystem

Use of org.apache.beam.sdk.io.FileSystem in project beam by apache.

In the class HadoopFileSystemRegistrar, method fromOptions.

@Override
public Iterable<FileSystem<?>> fromOptions(@Nonnull PipelineOptions options) {
    final List<Configuration> configurations = options.as(HadoopFileSystemOptions.class).getHdfsConfiguration();
    if (configurations == null) {
        // nothing to register
        return Collections.emptyList();
    }
    checkArgument(configurations.size() == 1, String.format("The %s currently only supports at most a single Hadoop configuration.", HadoopFileSystemRegistrar.class.getSimpleName()));
    final ImmutableList.Builder<FileSystem<?>> builder = ImmutableList.builder();
    final Set<String> registeredSchemes = new HashSet<>();
    // the check above guarantees exactly one configuration, so getOnlyElement cannot throw here
    final Configuration configuration = Iterables.getOnlyElement(configurations);
    final String defaultFs = configuration.get(org.apache.hadoop.fs.FileSystem.FS_DEFAULT_NAME_KEY);
    if (defaultFs != null && !defaultFs.isEmpty()) {
        final String scheme = Objects.requireNonNull(URI.create(defaultFs).getScheme(), String.format("Empty scheme for %s value.", org.apache.hadoop.fs.FileSystem.FS_DEFAULT_NAME_KEY));
        builder.add(new HadoopFileSystem(scheme, configuration));
        registeredSchemes.add(scheme);
    }
    final String nameServices = configuration.get(CONFIG_KEY_DFS_NAMESERVICES);
    if (nameServices != null && !nameServices.isEmpty()) {
        // register the schemes supported by an HA cluster, skipping any already registered above
        for (String scheme : HA_SCHEMES) {
            if (!registeredSchemes.contains(scheme)) {
                builder.add(new HadoopFileSystem(scheme, configuration));
            }
        }
    }
    return builder.build();
}
Also used : Configuration(org.apache.hadoop.conf.Configuration) ImmutableList(org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList) FileSystem(org.apache.beam.sdk.io.FileSystem) HashSet(java.util.HashSet)
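
The scheme selection in fromOptions can be followed without Hadoop or Beam on the classpath. The following is a minimal sketch using only the JDK, not the registrar's actual code: the class name SchemeResolutionSketch, the method resolveSchemes, and the contents of the stand-in HA_SCHEMES list are assumptions made for illustration, since the snippet above does not show how HA_SCHEMES is defined.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class SchemeResolutionSketch {
    // Assumption: stand-in values; the real HA_SCHEMES constant is not shown in the snippet.
    static final List<String> HA_SCHEMES = Arrays.asList("hdfs", "webhdfs");

    // Returns the schemes that would each get a file system instance, in registration order.
    static List<String> resolveSchemes(String defaultFs, String nameServices) {
        Set<String> schemes = new LinkedHashSet<>();
        if (defaultFs != null && !defaultFs.isEmpty()) {
            // The scheme is the part of the URI before "://", e.g. "hdfs".
            String scheme = Objects.requireNonNull(
                URI.create(defaultFs).getScheme(),
                "Empty scheme for default file system value.");
            schemes.add(scheme);
        }
        if (nameServices != null && !nameServices.isEmpty()) {
            // The set skips HA schemes that the default file system already covers.
            schemes.addAll(HA_SCHEMES);
        }
        return new ArrayList<>(schemes);
    }

    public static void main(String[] args) {
        System.out.println(resolveSchemes("hdfs://namenode:8020", null));   // [hdfs]
        System.out.println(resolveSchemes("s3a://bucket", "mycluster"));    // [s3a, hdfs, webhdfs]
        System.out.println(resolveSchemes(null, null));                     // []
    }
}
```

The ordered set plays the role of registeredSchemes in the original: the default scheme is added first, and an HA scheme is added only if it is not already present.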

Example 2 with FileSystem

Use of org.apache.beam.sdk.io.FileSystem in project beam by apache.

In the class HadoopFileSystemRegistrarTest, method testServiceLoader.

@Test
public void testServiceLoader() {
    HadoopFileSystemOptions options = PipelineOptionsFactory.as(HadoopFileSystemOptions.class);
    options.setHdfsConfiguration(ImmutableList.of(configuration));
    for (FileSystemRegistrar registrar : Lists.newArrayList(ServiceLoader.load(FileSystemRegistrar.class).iterator())) {
        if (registrar instanceof HadoopFileSystemRegistrar) {
            Iterable<FileSystem<?>> fileSystems = registrar.fromOptions(options);
            assertEquals(hdfsClusterBaseUri.getScheme(), ((HadoopFileSystem) Iterables.getOnlyElement(fileSystems)).getScheme());
            return;
        }
    }
    fail("Expected to find " + HadoopFileSystemRegistrar.class);
}
Also used : FileSystemRegistrar(org.apache.beam.sdk.io.FileSystemRegistrar) FileSystem(org.apache.beam.sdk.io.FileSystem) Test(org.junit.Test)

Example 3 with FileSystem

Use of org.apache.beam.sdk.io.FileSystem in project beam by apache.

In the class GcsFileSystemRegistrarTest, method testServiceLoader.

@Test
public void testServiceLoader() {
    for (FileSystemRegistrar registrar : Lists.newArrayList(ServiceLoader.load(FileSystemRegistrar.class).iterator())) {
        if (registrar instanceof GcsFileSystemRegistrar) {
            Iterable<FileSystem<?>> fileSystems = registrar.fromOptions(PipelineOptionsFactory.create());
            assertThat(fileSystems, contains(instanceOf(GcsFileSystem.class)));
            return;
        }
    }
    fail("Expected to find " + GcsFileSystemRegistrar.class);
}
Also used : FileSystemRegistrar(org.apache.beam.sdk.io.FileSystemRegistrar) FileSystem(org.apache.beam.sdk.io.FileSystem) Test(org.junit.Test)

Aggregations

FileSystem (org.apache.beam.sdk.io.FileSystem): 3
FileSystemRegistrar (org.apache.beam.sdk.io.FileSystemRegistrar): 2
Test (org.junit.Test): 2
HashSet (java.util.HashSet): 1
ImmutableList (org.apache.beam.vendor.guava.v26_0_jre.com.google.common.collect.ImmutableList): 1
Configuration (org.apache.hadoop.conf.Configuration): 1