Example 46 with ConfigSource

use of org.embulk.config.ConfigSource in project embulk by embulk.

the class EmbulkRunner method runInternal.

private void runInternal(final ConfigSource originalConfigSource, final Path configDiffPath,
final Path outputPath, // deprecated
final Path resumeStatePath) throws IOException {
    // Fail early if any target file cannot be written; keep the IOException as the cause.
    try {
        checkFileWritable(outputPath);
    } catch (IOException ex) {
        throw new RuntimeException("Not writable: " + outputPath.toString(), ex);
    }
    try {
        checkFileWritable(configDiffPath);
    } catch (IOException ex) {
        throw new RuntimeException("Not writable: " + configDiffPath.toString(), ex);
    }
    try {
        checkFileWritable(resumeStatePath);
    } catch (IOException ex) {
        throw new RuntimeException("Not writable: " + resumeStatePath.toString(), ex);
    }
    final ConfigSource configSource;
    if (configDiffPath != null && Files.size(configDiffPath) > 0L) {
        configSource = originalConfigSource.merge(readConfig(configDiffPath, Collections.<String, Object>emptyMap(), null));
    } else {
        configSource = originalConfigSource;
    }
    final ConfigSource resumeConfig;
    if (resumeStatePath != null) {
        ConfigSource resumeConfigTemp = null;
        try {
            resumeConfigTemp = readYamlConfigFile(resumeStatePath);
        } catch (Throwable ex) {
            // TODO log?
            resumeConfigTemp = null;
        }
        if (resumeConfigTemp == null || resumeConfigTemp.isEmpty()) {
            resumeConfig = null;
        } else {
            resumeConfig = resumeConfigTemp;
        }
    } else {
        resumeConfig = null;
    }
    final EmbulkEmbed.ResumableResult resumableResult;
    final ExecutionResult executionResultTemp;
    if (resumeConfig != null) {
        resumableResult = this.embed.resumeState(configSource, resumeConfig).resume();
        executionResultTemp = null;
    } else if (resumeStatePath != null) {
        resumableResult = this.embed.runResumable(configSource);
        executionResultTemp = null;
    } else {
        resumableResult = null;
        executionResultTemp = this.embed.run(configSource);
    }
    final ExecutionResult executionResult;
    if (executionResultTemp == null) {
        if (!resumableResult.isSuccessful()) {
            if (resumableResult.getTransactionStage().isBefore(TransactionStage.RUN)) {
                // delete resume file
                if (resumeStatePath != null) {
                    try {
                        Files.deleteIfExists(resumeStatePath);
                    } catch (Throwable ex) {
                        System.err.println("Failed to delete: " + resumeStatePath.toString());
                    }
                }
            } else {
                rootLogger.info("Writing resume state to '" + resumeStatePath.toString() + "'");
                try {
                    writeResumeState(resumeStatePath, resumableResult.getResumeState());
                } catch (IOException ex) {
                    throw new RuntimeException(ex);
                }
                rootLogger.info("Resume state is written. Run the transaction again with -r option to resume or use \"cleanup\" subcommand to delete intermediate data.");
            }
            throw new RuntimeException(resumableResult.getCause());
        }
        executionResult = resumableResult.getSuccessfulResult();
    } else {
        executionResult = executionResultTemp;
    }
    // delete resume file
    if (resumeStatePath != null) {
        try {
            Files.deleteIfExists(resumeStatePath);
        } catch (Throwable ex) {
            System.err.println("Failed to delete: " + resumeStatePath.toString());
        }
    }
    final ConfigDiff configDiff = executionResult.getConfigDiff();
    rootLogger.info("Committed.");
    rootLogger.info("Next config diff: " + configDiff.toString());
    writeConfig(configDiffPath, configDiff);
    // deprecated
    writeConfig(outputPath, configSource.merge(configDiff));
}
Also used : ConfigSource(org.embulk.config.ConfigSource) ExecutionResult(org.embulk.exec.ExecutionResult) IOException(java.io.IOException) ConfigDiff(org.embulk.config.ConfigDiff)
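
The method above folds a config diff into a config twice: once when reading the previous diff file, and once when writing the output config. The following is a minimal sketch of that merge idea using plain java.util.Map values instead of Embulk types. It assumes that nested maps are merged key by key and that any other value from the diff overwrites the existing one. The class ConfigMergeSketch, its merge helper, and the sample keys are hypothetical and are not part of the Embulk API; unlike a real ConfigSource, this helper returns a new map and leaves both arguments untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ConfigMergeSketch {
    // Hypothetical helper (not Embulk API): returns a new map and does not
    // modify either argument. Nested maps are merged recursively; any other
    // value from the diff replaces the value in the base.
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> base, Map<String, Object> diff) {
        Map<String, Object> merged = new LinkedHashMap<>(base);
        for (Map.Entry<String, Object> entry : diff.entrySet()) {
            Object current = merged.get(entry.getKey());
            Object update = entry.getValue();
            if (current instanceof Map && update instanceof Map) {
                merged.put(entry.getKey(),
                        merge((Map<String, Object>) current, (Map<String, Object>) update));
            } else {
                merged.put(entry.getKey(), update);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Illustrative config: an input section with two settings.
        Map<String, Object> in = new LinkedHashMap<>();
        in.put("type", "file");
        in.put("path_prefix", "/data/sample_");
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("in", in);

        // Illustrative diff: one new key inside the same section.
        Map<String, Object> inDiff = new LinkedHashMap<>();
        inDiff.put("last_path", "/data/sample_02.csv");
        Map<String, Object> diff = new LinkedHashMap<>();
        diff.put("in", inDiff);

        System.out.println(merge(config, diff));
    }
}
```

Returning a fresh map keeps the sketch easy to reason about; whether the real merge mutates its receiver should be checked against the Embulk source.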

Example 47 with ConfigSource

use of org.embulk.config.ConfigSource in project embulk by embulk.

the class EmbulkRunner method guessInternal.

private void guessInternal(final ConfigSource configSource, final Path outputPath) throws IOException {
    try {
        checkFileWritable(outputPath);
    } catch (IOException ex) {
        // Keep the original IOException as the cause.
        throw new RuntimeException("Not writable: " + outputPath.toString(), ex);
    }
    final ConfigDiff configDiff = this.embed.guess(configSource);
    final ConfigSource guessedConfigSource = configSource.merge(configDiff);
    final String yaml = writeConfig(outputPath, guessedConfigSource);
    System.err.println(yaml);
    if (outputPath != null) {
        System.out.println("Created '" + outputPath + "' file.");
    } else {
        System.out.println("Use -o PATH option to write the guessed config file to a file.");
    }
}
Also used : ConfigSource(org.embulk.config.ConfigSource) IOException(java.io.IOException) ConfigDiff(org.embulk.config.ConfigDiff)

Example 48 with ConfigSource

use of org.embulk.config.ConfigSource in project embulk by embulk.

the class SamplingParserPlugin method runFileInputSampling.

public static Buffer runFileInputSampling(final FileInputRunner runner, ConfigSource inputConfig, ConfigSource sampleBufferConfig) {
    final SampleBufferTask sampleBufferTask = sampleBufferConfig.loadConfig(SampleBufferTask.class);
    // override in.parser.type so that FileInputRunner creates SamplingParserPlugin
    ConfigSource samplingInputConfig = inputConfig.deepCopy();
    samplingInputConfig.getNestedOrSetEmpty("parser").set("type", "system_sampling").set("sample_buffer_bytes", sampleBufferTask.getSampleBufferBytes());
    samplingInputConfig.set("decoders", null);
    try {
        runner.transaction(samplingInputConfig, new InputPlugin.Control() {

            @Override
            public List<TaskReport> run(TaskSource taskSource, Schema schema, int taskCount) {
                if (taskCount == 0) {
                    throw new NoSampleException("No input files to read sample data");
                }
                int maxSize = -1;
                int maxSizeTaskIndex = -1;
                for (int taskIndex = 0; taskIndex < taskCount; taskIndex++) {
                    try {
                        runner.run(taskSource, schema, taskIndex, new PageOutput() {

                            @Override
                            public void add(Page page) {
                                // TODO exception class
                                throw new RuntimeException("Input plugin must be a FileInputPlugin to guess parser configuration");
                            }

                            @Override
                            public void finish() {
                            }

                            @Override
                            public void close() {
                            }
                        });
                    } catch (NotEnoughSampleError ex) {
                        if (maxSize < ex.getSize()) {
                            maxSize = ex.getSize();
                            maxSizeTaskIndex = taskIndex;
                        }
                        continue;
                    }
                }
                if (maxSize <= 0) {
                    throw new NoSampleException("All input files are empty");
                }
                taskSource.getNested("ParserTaskSource").set("force", true);
                try {
                    runner.run(taskSource, schema, maxSizeTaskIndex, new PageOutput() {

                        @Override
                        public void add(Page page) {
                            // TODO exception class
                            throw new RuntimeException("Input plugin must be a FileInputPlugin to guess parser configuration");
                        }

                        @Override
                        public void finish() {
                        }

                        @Override
                        public void close() {
                        }
                    });
                } catch (NotEnoughSampleError ex) {
                    throw new NoSampleException("All input files are smaller than minimum sampling size");
                }
                throw new NoSampleException("All input files are smaller than minimum sampling size");
            }
        });
        throw new AssertionError("SamplingParserPlugin must throw SampledNoticeError");
    } catch (SampledNoticeError error) {
        return error.getSample();
    }
}
Also used : InputPlugin(org.embulk.spi.InputPlugin) Schema(org.embulk.spi.Schema) Page(org.embulk.spi.Page) ConfigSource(org.embulk.config.ConfigSource) PageOutput(org.embulk.spi.PageOutput) List(java.util.List) TaskSource(org.embulk.config.TaskSource)
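
The method above gets its return value out of a nested callback by catching SampledNoticeError and returning error.getSample(); reaching the line after runner.transaction(...) is treated as a bug. The following is a minimal sketch of that control-flow pattern in plain Java, without any Embulk types. The names SampleViaExceptionSketch, SampledNotice, transaction, and runSampling are hypothetical stand-ins, and the sketch assumes the sample is simply the first bytes of the input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class SampleViaExceptionSketch {
    // Hypothetical signal type that carries the sample out of the callback.
    static final class SampledNotice extends RuntimeException {
        private final byte[] sample;

        SampledNotice(byte[] sample) {
            // Used for control flow only, so suppression and stack trace are disabled.
            super(null, null, false, false);
            this.sample = sample;
        }

        byte[] getSample() {
            return sample;
        }
    }

    // Hypothetical stand-in for a framework method that invokes a callback.
    static void transaction(Runnable control) {
        control.run();
    }

    static byte[] runSampling(byte[] input, int sampleBytes) {
        try {
            transaction(() -> {
                int length = Math.min(sampleBytes, input.length);
                throw new SampledNotice(Arrays.copyOf(input, length));
            });
            // The callback always throws, so this line signals a programming error.
            throw new AssertionError("callback must throw SampledNotice");
        } catch (SampledNotice notice) {
            return notice.getSample();
        }
    }

    public static void main(String[] args) {
        byte[] sample = runSampling("hello world".getBytes(StandardCharsets.UTF_8), 5);
        System.out.println(new String(sample, StandardCharsets.UTF_8));
    }
}
```

The pattern is useful when the callback signature offers no way to return the value directly, at the cost of hiding the data flow inside exception handling.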

Example 49 with ConfigSource

use of org.embulk.config.ConfigSource in project embulk by embulk.

the class FileInputRunner method guess.

public ConfigDiff guess(ConfigSource execConfig, ConfigSource inputConfig) {
    final ConfigSource sampleBufferConfig = createSampleBufferConfigFromExecConfig(execConfig);
    final Buffer sample = SamplingParserPlugin.runFileInputSampling(this, inputConfig, sampleBufferConfig);
    // SamplingParserPlugin.runFileInputSampling throws NoSampleException if there are
    // no files or if all files are smaller than minSampleSize (40 bytes).
    GuessExecutor guessExecutor = Exec.getInjector().getInstance(GuessExecutor.class);
    return guessExecutor.guessParserConfig(sample, inputConfig, execConfig);
}
Also used : ConfigSource(org.embulk.config.ConfigSource) GuessExecutor(org.embulk.exec.GuessExecutor)

Example 50 with ConfigSource

use of org.embulk.config.ConfigSource in project embulk by embulk.

the class ColumnConfig method getConfigSource.

@JsonValue
public ConfigSource getConfigSource() {
    ConfigSource conf = option.deepCopy();
    conf.set("name", name);
    conf.set("type", type);
    return conf;
}
Also used : ConfigSource(org.embulk.config.ConfigSource) JsonValue(com.fasterxml.jackson.annotation.JsonValue)
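
The getter above copies the stored option before setting "name" and "type", so serializing a column does not modify the column's own option. The following is a minimal sketch of that copy-then-set idea with a plain map. ColumnConfigSketch, toConfigMap, and optionSize are hypothetical names, and the sketch assumes flat option values, for which a shallow copy is enough; nested values would need a real deep copy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ColumnConfigSketch {
    private final String name;
    private final String type;
    private final Map<String, Object> option;

    ColumnConfigSketch(String name, String type, Map<String, Object> option) {
        this.name = name;
        this.type = type;
        // Defensive copy so later changes by the caller do not leak in.
        this.option = new LinkedHashMap<>(option);
    }

    // Builds the serialized form on a copy; the stored option is left unchanged.
    // Assumption: option values are flat, so a shallow copy is sufficient here.
    Map<String, Object> toConfigMap() {
        Map<String, Object> conf = new LinkedHashMap<>(option);
        conf.put("name", name);
        conf.put("type", type);
        return conf;
    }

    int optionSize() {
        return option.size();
    }

    public static void main(String[] args) {
        Map<String, Object> option = new LinkedHashMap<>();
        option.put("format", "%Y-%m-%d");
        ColumnConfigSketch column = new ColumnConfigSketch("created_at", "timestamp", option);
        System.out.println(column.toConfigMap());
        System.out.println(column.optionSize());
    }
}
```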

Aggregations

ConfigSource (org.embulk.config.ConfigSource): 50
Test (org.junit.Test): 33
TaskSource (org.embulk.config.TaskSource): 12
Schema (org.embulk.spi.Schema): 9
HashMap (java.util.HashMap): 8
ArrayList (java.util.ArrayList): 6
List (java.util.List): 6
ConfigDiff (org.embulk.config.ConfigDiff): 6
FilterPlugin (org.embulk.spi.FilterPlugin): 6
ImmutableList (com.google.common.collect.ImmutableList): 5
SchemaConfigException (org.embulk.spi.SchemaConfigException): 4
ConfigException (org.embulk.config.ConfigException): 3
Column (org.embulk.spi.Column): 3
InputPlugin (org.embulk.spi.InputPlugin): 3
ImmutableMap (com.google.common.collect.ImmutableMap): 2
IOException (java.io.IOException): 2
Path (java.nio.file.Path): 2
LinkedList (java.util.LinkedList): 2
DataSource (org.embulk.config.DataSource): 2
TaskReport (org.embulk.config.TaskReport): 2