Example 1 with FileSourceConfiguration

Use of com.hazelcast.jet.pipeline.file.impl.FileSourceConfiguration in project hazelcast by hazelcast.

Class HadoopFileSourceFactory, method configureFn.

private static <T> ConsumerEx<Configuration> configureFn(FileSourceConfiguration<T> fsc, JobConfigurer configurer, FileFormat<T> fileFormat) {
    return new ConsumerEx<Configuration>() {

        @Override
        public void acceptEx(Configuration configuration) throws Exception {
            try {
                configuration.setBoolean(FileInputFormat.INPUT_DIR_NONRECURSIVE_IGNORE_SUBDIRS, true);
                configuration.setBoolean(FileInputFormat.INPUT_DIR_RECURSIVE, false);
                configuration.setBoolean(HadoopSources.SHARED_LOCAL_FS, fsc.isSharedFileSystem());
                configuration.setBoolean(HadoopSources.IGNORE_FILE_NOT_FOUND, fsc.isIgnoreFileNotFound());
                for (Entry<String, String> option : fsc.getOptions().entrySet()) {
                    configuration.set(option.getKey(), option.getValue());
                }
                // Some methods we use to configure actually take a Job
                Job job = Job.getInstance(configuration);
                Path inputPath = getInputPath(fsc, configuration);
                FileInputFormat.addInputPath(job, inputPath);
                configurer.configure(job, fileFormat);
                // The job works on its own copy of the configuration, so copy the
                // resulting settings back to the original configuration instance
                for (Entry<String, String> entry : job.getConfiguration()) {
                    configuration.set(entry.getKey(), entry.getValue());
                }
            } catch (IOException e) {
                throw new JetException("Could not create a source", e);
            }
        }

        @Override
        public List<Permission> permissions() {
            String keyFile = fsc.getOptions().get("google.cloud.auth.service.account.json.keyfile");
            if (keyFile != null) {
                return asList(ConnectorPermission.file(keyFile, ACTION_READ), ConnectorPermission.file(fsc.getPath(), ACTION_READ));
            }
            return singletonList(ConnectorPermission.file(fsc.getPath(), ACTION_READ));
        }
    };
}
Also used : ConsumerEx(com.hazelcast.function.ConsumerEx) Path(org.apache.hadoop.fs.Path) Configuration(org.apache.hadoop.conf.Configuration) FileSourceConfiguration(com.hazelcast.jet.pipeline.file.impl.FileSourceConfiguration) ConnectorPermission(com.hazelcast.security.permission.ConnectorPermission) Permission(java.security.Permission) IOException(java.io.IOException) JetException(com.hazelcast.jet.JetException) Job(org.apache.hadoop.mapreduce.Job) AvroJob(org.apache.avro.mapreduce.AvroJob)
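
The copy-back loop at the end of configureFn can be easy to miss, so here is a minimal plain-Java sketch of the same pattern. It assumes only that the job object works on a private copy of the settings it is given; JobLike, configure and the "input.dir" key are hypothetical stand-ins for illustration and are not part of Hadoop or Hazelcast.

```java
import java.util.HashMap;
import java.util.Map;

final class CopyBackDemo {

    /** Hypothetical stand-in for a job object that keeps its own copy of the settings. */
    static final class JobLike {
        private final Map<String, String> settings;

        JobLike(Map<String, String> original) {
            // Defensive copy: later changes do not reach the caller's map.
            this.settings = new HashMap<>(original);
        }

        Map<String, String> settings() {
            return settings;
        }
    }

    /** Configures through the job, then copies the result back to the caller's map. */
    static void configure(Map<String, String> configuration) {
        JobLike job = new JobLike(configuration);
        // Hypothetical setting; only the job's private copy changes here.
        job.settings().put("input.dir", "/data/in");
        // Without this step the caller would never see "input.dir".
        configuration.putAll(job.settings());
    }

    public static void main(String[] args) {
        Map<String, String> configuration = new HashMap<>();
        configuration.put("recursive", "false");
        configure(configuration);
        System.out.println(configuration);
    }
}
```

If the putAll call is dropped, the caller's map still holds only the "recursive" entry, which is the situation the final loop in configureFn appears to guard against.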

Example 2 with FileSourceConfiguration

Use of com.hazelcast.jet.pipeline.file.impl.FileSourceConfiguration in project hazelcast by hazelcast.

Class FileSourceBuilder, method buildMetaSupplier.

/**
 * Builds a {@link ProcessorMetaSupplier} based on the current state of the
 * builder. Use for integration with the Core API.
 * <p>
 * This method is a part of Core API and has lower backward-compatibility
 * guarantees (we can change it in minor version).
 */
@Nonnull
public ProcessorMetaSupplier buildMetaSupplier() {
    if (path == null) {
        throw new IllegalStateException("Parameter 'path' is required");
    }
    if (format == null) {
        throw new IllegalStateException("Parameter 'format' is required");
    }
    FileSourceConfiguration<T> fsc = new FileSourceConfiguration<>(path, glob, format, sharedFileSystem, ignoreFileNotFound, options);
    if (shouldUseHadoop()) {
        ServiceLoader<FileSourceFactory> loader = ServiceLoader.load(FileSourceFactory.class);
        // Only one implementation is expected to be present on classpath
        Iterator<FileSourceFactory> iterator = loader.iterator();
        if (!iterator.hasNext()) {
            throw new JetException("No suitable FileSourceFactory found. Do you have Jet's Hadoop module on classpath?");
        }
        FileSourceFactory fileSourceFactory = iterator.next();
        if (iterator.hasNext()) {
            throw new JetException("Multiple FileSourceFactory implementations found");
        }
        return fileSourceFactory.create(fsc);
    }
    return new LocalFileSourceFactory().create(fsc);
}
Also used : FileSourceConfiguration(com.hazelcast.jet.pipeline.file.impl.FileSourceConfiguration) LocalFileSourceFactory(com.hazelcast.jet.pipeline.file.impl.LocalFileSourceFactory) FileSourceFactory(com.hazelcast.jet.pipeline.file.impl.FileSourceFactory) JetException(com.hazelcast.jet.JetException) Nonnull(javax.annotation.Nonnull)
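
The "exactly one implementation on the classpath" check in buildMetaSupplier can be factored into a small helper. The sketch below is one possible way to do that with the standard java.util.ServiceLoader; the Factory interface and the requireSingle helper are hypothetical names introduced here and do not exist in Hazelcast.

```java
import java.util.Iterator;
import java.util.List;
import java.util.ServiceLoader;

final class SingleProviderDemo {

    /** Hypothetical service interface standing in for FileSourceFactory. */
    interface Factory {
        String name();
    }

    /** Returns the only element of the iterator; fails if there are none or several. */
    static <T> T requireSingle(Iterator<T> candidates, String what) {
        if (!candidates.hasNext()) {
            throw new IllegalStateException("No " + what + " found");
        }
        T first = candidates.next();
        if (candidates.hasNext()) {
            throw new IllegalStateException("Multiple " + what + " implementations found");
        }
        return first;
    }

    public static void main(String[] args) {
        // No provider is registered for Factory here, so the lookup is expected to fail.
        Iterator<Factory> discovered = ServiceLoader.load(Factory.class).iterator();
        try {
            requireSingle(discovered, "Factory");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // A single candidate passes the check.
        Factory only = () -> "local";
        System.out.println(requireSingle(List.of(only).iterator(), "Factory").name());
    }
}
```

Passing an Iterator rather than the ServiceLoader itself keeps the helper testable without registering providers under META-INF/services.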

Aggregations

JetException (com.hazelcast.jet.JetException): 2
FileSourceConfiguration (com.hazelcast.jet.pipeline.file.impl.FileSourceConfiguration): 2
ConsumerEx (com.hazelcast.function.ConsumerEx): 1
FileSourceFactory (com.hazelcast.jet.pipeline.file.impl.FileSourceFactory): 1
LocalFileSourceFactory (com.hazelcast.jet.pipeline.file.impl.LocalFileSourceFactory): 1
ConnectorPermission (com.hazelcast.security.permission.ConnectorPermission): 1
IOException (java.io.IOException): 1
Permission (java.security.Permission): 1
Nonnull (javax.annotation.Nonnull): 1
AvroJob (org.apache.avro.mapreduce.AvroJob): 1
Configuration (org.apache.hadoop.conf.Configuration): 1
Path (org.apache.hadoop.fs.Path): 1
Job (org.apache.hadoop.mapreduce.Job): 1