Example 1 with Workflow

Use of com.datastax.oss.dsbulk.workflow.api.Workflow in project dsbulk by datastax.

Class UnloadWorkflow, method manyWriters.

private Flux<Record> manyWriters() {
    // writeConcurrency and readConcurrency are >= 0.5C here
    int actualConcurrency = Math.min(readConcurrency, writeConcurrency);
    int numThreads = Math.min(numCores * 2, actualConcurrency);
    Scheduler scheduler = Schedulers.newParallel(numThreads, new DefaultThreadFactory("workflow"));
    schedulers.add(scheduler);
    return Flux.fromIterable(readStatements).flatMap(results -> {
        Flux<Record> records = Flux.from(executor.readReactive(results))
            .publishOn(scheduler, 500)
            .transform(queryWarningsHandler)
            .transform(totalItemsMonitor)
            .transform(totalItemsCounter)
            .transform(failedReadResultsMonitor)
            .transform(failedReadsHandler)
            .map(readResultMapper::map)
            .transform(failedRecordsMonitor)
            .transform(unmappableRecordsHandler);
        if (actualConcurrency == writeConcurrency) {
            records = records.transform(writer);
        } else {
            // If the actual concurrency is lower than the connector's desired write
            // concurrency, we need to give the connector a chance to switch writers
            // frequently so that it can really redirect records to all the final destinations
            // (to that many files on disk for example). If the connector is correctly
            // implemented, each window will be redirected to a different destination
            // in a round-robin fashion.
            records = records.window(500).flatMap(window -> window.transform(writer), 1, 500);
        }
        return records.transform(failedRecordsMonitor).transform(failedRecordsHandler);
    }, actualConcurrency, 500);
}
Also used : DefaultThreadFactory(io.netty.util.concurrent.DefaultThreadFactory) ReadResult(com.datastax.oss.dsbulk.executor.api.result.ReadResult) Connector(com.datastax.oss.dsbulk.connectors.api.Connector) BulkReader(com.datastax.oss.dsbulk.executor.api.reader.BulkReader) DriverSettings(com.datastax.oss.dsbulk.workflow.commons.settings.DriverSettings) LoggerFactory(org.slf4j.LoggerFactory) AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) Workflow(com.datastax.oss.dsbulk.workflow.api.Workflow) Scheduler(reactor.core.scheduler.Scheduler) Function(java.util.function.Function) ExecutorSettings(com.datastax.oss.dsbulk.workflow.commons.settings.ExecutorSettings) SchemaSettings(com.datastax.oss.dsbulk.workflow.commons.settings.SchemaSettings) HashSet(java.util.HashSet) RecordMetadata(com.datastax.oss.dsbulk.connectors.api.RecordMetadata) CqlSession(com.datastax.oss.driver.api.core.CqlSession) ConnectorSettings(com.datastax.oss.dsbulk.workflow.commons.settings.ConnectorSettings) Duration(java.time.Duration) SchemaGenerationStrategy(com.datastax.oss.dsbulk.workflow.commons.settings.SchemaGenerationStrategy) Schedulers(reactor.core.scheduler.Schedulers) Record(com.datastax.oss.dsbulk.connectors.api.Record) Stopwatch(com.datastax.oss.driver.shaded.guava.common.base.Stopwatch) CommonConnectorFeature(com.datastax.oss.dsbulk.connectors.api.CommonConnectorFeature) Logger(org.slf4j.Logger) Config(com.typesafe.config.Config) LogSettings(com.datastax.oss.dsbulk.workflow.commons.settings.LogSettings) Publisher(org.reactivestreams.Publisher) ConvertingCodecFactory(com.datastax.oss.dsbulk.codecs.api.ConvertingCodecFactory) SettingsManager(com.datastax.oss.dsbulk.workflow.commons.settings.SettingsManager) EngineSettings(com.datastax.oss.dsbulk.workflow.commons.settings.EngineSettings) Set(java.util.Set) ClusterInformationUtils(com.datastax.oss.dsbulk.workflow.commons.utils.ClusterInformationUtils) CodecSettings(com.datastax.oss.dsbulk.workflow.commons.settings.CodecSettings) MonitoringSettings(com.datastax.oss.dsbulk.workflow.commons.settings.MonitoringSettings) TimeUnit(java.util.concurrent.TimeUnit) Flux(reactor.core.publisher.Flux) List(java.util.List) CloseableUtils(com.datastax.oss.dsbulk.workflow.commons.utils.CloseableUtils) ReadResultMapper(com.datastax.oss.dsbulk.workflow.commons.schema.ReadResultMapper) DurationUtils(com.datastax.oss.dsbulk.workflow.api.utils.DurationUtils) MetricsManager(com.datastax.oss.dsbulk.workflow.commons.metrics.MetricsManager) Statement(com.datastax.oss.driver.api.core.cql.Statement) LogManager(com.datastax.oss.dsbulk.workflow.commons.log.LogManager)
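
The comment in manyWriters describes cutting the record stream into fixed-size windows so that a connector can send each window to a different destination in turn. The following is a minimal plain-JDK sketch of that idea only; it does not use Reactor and is not dsbulk code. The class name WindowRoundRobin, the methods actualConcurrency and assignWindows, and the use of integer destination indices are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class WindowRoundRobin {

    /** Effective concurrency as computed in the example: the smaller of the two settings. */
    static int actualConcurrency(int readConcurrency, int writeConcurrency) {
        return Math.min(readConcurrency, writeConcurrency);
    }

    /**
     * Splits records into consecutive windows of windowSize items and appends
     * window i to destination (i % destinations). Hypothetical helper.
     */
    static <T> List<List<T>> assignWindows(List<T> records, int windowSize, int destinations) {
        if (windowSize <= 0 || destinations <= 0) {
            throw new IllegalArgumentException("windowSize and destinations must be positive");
        }
        List<List<T>> perDestination = new ArrayList<>();
        for (int d = 0; d < destinations; d++) {
            perDestination.add(new ArrayList<>());
        }
        int window = 0;
        for (int start = 0; start < records.size(); start += windowSize) {
            int end = Math.min(start + windowSize, records.size());
            // Each new window goes to the next destination, wrapping around.
            perDestination.get(window % destinations).addAll(records.subList(start, end));
            window++;
        }
        return perDestination;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            records.add(i);
        }
        // Windows of 2 records spread over 3 destinations.
        System.out.println(assignWindows(records, 2, 3));
        // prints [[0, 1, 6, 7], [2, 3, 8, 9], [4, 5]]
    }
}
```

With a small window size every destination receives records early on, which is the reason the original comment gives for switching writers frequently.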

Example 2 with Workflow

Use of com.datastax.oss.dsbulk.workflow.api.Workflow in project dsbulk by datastax.

Class UnloadWorkflow, method oneWriter.

private Flux<Record> oneWriter() {
    int numThreads = Math.min(numCores * 2, readConcurrency);
    Scheduler scheduler = numThreads == 1
        ? Schedulers.immediate()
        : Schedulers.newParallel(numThreads, new DefaultThreadFactory("workflow"));
    schedulers.add(scheduler);
    return Flux.fromIterable(readStatements)
        .flatMap(results -> Flux.from(executor.readReactive(results))
            .publishOn(scheduler, 500)
            .transform(queryWarningsHandler)
            .transform(totalItemsMonitor)
            .transform(totalItemsCounter)
            .transform(failedReadResultsMonitor)
            .transform(failedReadsHandler)
            .map(readResultMapper::map)
            .transform(failedRecordsMonitor)
            .transform(unmappableRecordsHandler), readConcurrency, 500)
        .transform(writer)
        .transform(failedRecordsMonitor)
        .transform(failedRecordsHandler);
}
Also used : DefaultThreadFactory(io.netty.util.concurrent.DefaultThreadFactory) ReadResult(com.datastax.oss.dsbulk.executor.api.result.ReadResult) Connector(com.datastax.oss.dsbulk.connectors.api.Connector) BulkReader(com.datastax.oss.dsbulk.executor.api.reader.BulkReader) DriverSettings(com.datastax.oss.dsbulk.workflow.commons.settings.DriverSettings) LoggerFactory(org.slf4j.LoggerFactory) AtomicBoolean(java.util.concurrent.atomic.AtomicBoolean) Workflow(com.datastax.oss.dsbulk.workflow.api.Workflow) Scheduler(reactor.core.scheduler.Scheduler) Function(java.util.function.Function) ExecutorSettings(com.datastax.oss.dsbulk.workflow.commons.settings.ExecutorSettings) SchemaSettings(com.datastax.oss.dsbulk.workflow.commons.settings.SchemaSettings) HashSet(java.util.HashSet) RecordMetadata(com.datastax.oss.dsbulk.connectors.api.RecordMetadata) CqlSession(com.datastax.oss.driver.api.core.CqlSession) ConnectorSettings(com.datastax.oss.dsbulk.workflow.commons.settings.ConnectorSettings) Duration(java.time.Duration) SchemaGenerationStrategy(com.datastax.oss.dsbulk.workflow.commons.settings.SchemaGenerationStrategy) Schedulers(reactor.core.scheduler.Schedulers) Record(com.datastax.oss.dsbulk.connectors.api.Record) Stopwatch(com.datastax.oss.driver.shaded.guava.common.base.Stopwatch) CommonConnectorFeature(com.datastax.oss.dsbulk.connectors.api.CommonConnectorFeature) Logger(org.slf4j.Logger) Config(com.typesafe.config.Config) LogSettings(com.datastax.oss.dsbulk.workflow.commons.settings.LogSettings) Publisher(org.reactivestreams.Publisher) ConvertingCodecFactory(com.datastax.oss.dsbulk.codecs.api.ConvertingCodecFactory) SettingsManager(com.datastax.oss.dsbulk.workflow.commons.settings.SettingsManager) EngineSettings(com.datastax.oss.dsbulk.workflow.commons.settings.EngineSettings) Set(java.util.Set) ClusterInformationUtils(com.datastax.oss.dsbulk.workflow.commons.utils.ClusterInformationUtils) CodecSettings(com.datastax.oss.dsbulk.workflow.commons.settings.CodecSettings) MonitoringSettings(com.datastax.oss.dsbulk.workflow.commons.settings.MonitoringSettings) TimeUnit(java.util.concurrent.TimeUnit) Flux(reactor.core.publisher.Flux) List(java.util.List) CloseableUtils(com.datastax.oss.dsbulk.workflow.commons.utils.CloseableUtils) ReadResultMapper(com.datastax.oss.dsbulk.workflow.commons.schema.ReadResultMapper) DurationUtils(com.datastax.oss.dsbulk.workflow.api.utils.DurationUtils) MetricsManager(com.datastax.oss.dsbulk.workflow.commons.metrics.MetricsManager) Statement(com.datastax.oss.driver.api.core.cql.Statement) LogManager(com.datastax.oss.dsbulk.workflow.commons.log.LogManager)
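
oneWriter caps the thread count at twice the number of cores and skips the thread pool entirely when a single thread is enough. The sketch below shows a rough plain-JDK analog of that decision using java.util.concurrent executors instead of Reactor schedulers, so it is an illustration of the pattern rather than of the Reactor API. The class name ExecutorChoice and the methods numThreads and choose are hypothetical.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ExecutorChoice {

    /** Thread count as in the example: at most two threads per core. */
    static int numThreads(int numCores, int readConcurrency) {
        return Math.min(numCores * 2, readConcurrency);
    }

    /** Returns a caller-thread executor for one thread, otherwise a fixed pool. */
    static Executor choose(int numThreads) {
        if (numThreads == 1) {
            // No pool needed: each task runs on the thread that submits it.
            return Runnable::run;
        }
        return Executors.newFixedThreadPool(numThreads);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        int threads = numThreads(cores, 1);
        Executor executor = choose(threads);
        executor.execute(() ->
            System.out.println("task ran on " + Thread.currentThread().getName()));
        if (executor instanceof ExecutorService) {
            // Pools own threads and must be shut down; the direct executor does not.
            ((ExecutorService) executor).shutdown();
        }
    }
}
```

Avoiding the pool in the single-thread case saves a thread hop per item; the pool only pays off when several tasks can actually run in parallel.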

Example 3 with Workflow

Use of com.datastax.oss.dsbulk.workflow.api.Workflow in project dsbulk by datastax.

Class DataStaxBulkLoader, method run.

@NonNull
public ExitStatus run() {
    Workflow workflow = null;
    try {
        AnsiConfigurator.configureAnsi(args);
        CommandLineParser parser = new CommandLineParser(args);
        ParsedCommandLine result = parser.parse();
        Config config = result.getConfig();
        workflow = result.getWorkflowProvider().newWorkflow(config);
        WorkflowThread workflowThread = new WorkflowThread(workflow);
        Runtime.getRuntime().addShutdownHook(new CleanupThread(workflow, workflowThread));
        // start the workflow and wait for its completion
        workflowThread.start();
        workflowThread.join();
        return workflowThread.getExitStatus();
    } catch (GlobalHelpRequestException e) {
        HelpEmitter.emitGlobalHelp(e.getConnectorName());
        return STATUS_OK;
    } catch (SectionHelpRequestException e) {
        try {
            HelpEmitter.emitSectionHelp(e.getSectionName(), e.getConnectorName());
            return STATUS_OK;
        } catch (Exception e2) {
            LOGGER.error(e2.getMessage(), e2);
            return STATUS_CRASHED;
        }
    } catch (VersionRequestException e) {
        // Use the OS charset
        PrintWriter pw = new PrintWriter(new BufferedWriter(new OutputStreamWriter(System.out, Charset.defaultCharset())));
        pw.println(WorkflowUtils.getBulkLoaderNameAndVersion());
        pw.flush();
        return STATUS_OK;
    } catch (Throwable t) {
        return ErrorHandler.handleUnexpectedError(workflow, t);
    }
}
Also used : Config(com.typesafe.config.Config) Workflow(com.datastax.oss.dsbulk.workflow.api.Workflow) GlobalHelpRequestException(com.datastax.oss.dsbulk.runner.cli.GlobalHelpRequestException) SectionHelpRequestException(com.datastax.oss.dsbulk.runner.cli.SectionHelpRequestException) VersionRequestException(com.datastax.oss.dsbulk.runner.cli.VersionRequestException) BufferedWriter(java.io.BufferedWriter) ParsedCommandLine(com.datastax.oss.dsbulk.runner.cli.ParsedCommandLine) OutputStreamWriter(java.io.OutputStreamWriter) CommandLineParser(com.datastax.oss.dsbulk.runner.cli.CommandLineParser) PrintWriter(java.io.PrintWriter) NonNull(edu.umd.cs.findbugs.annotations.NonNull)
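
The run method starts the workflow on a dedicated thread, waits for it with join, and returns the status that thread recorded. A minimal stand-alone sketch of that start/join/status pattern follows. It is not the dsbulk implementation: WorkerRunner, WorkerThread, and the two-value ExitStatus enum are hypothetical stand-ins, and the shutdown hook and help/version handling from the original are left out.

```java
final class WorkerRunner {

    /** Hypothetical two-value status; the real ExitStatus type is not shown in the example. */
    enum ExitStatus { STATUS_OK, STATUS_CRASHED }

    /** Thread that runs a task and records whether it completed normally. */
    static final class WorkerThread extends Thread {
        private final Runnable task;
        // Pessimistic default: only a normal return flips it to STATUS_OK.
        private volatile ExitStatus exitStatus = ExitStatus.STATUS_CRASHED;

        WorkerThread(Runnable task) {
            super("worker");
            this.task = task;
        }

        @Override
        public void run() {
            try {
                task.run();
                exitStatus = ExitStatus.STATUS_OK;
            } catch (RuntimeException e) {
                exitStatus = ExitStatus.STATUS_CRASHED;
            }
        }

        ExitStatus getExitStatus() {
            return exitStatus;
        }
    }

    /** Starts the task on its own thread, waits for completion, and returns its status. */
    static ExitStatus run(Runnable task) {
        WorkerThread worker = new WorkerThread(task);
        try {
            worker.start();
            worker.join();
            return worker.getExitStatus();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can observe it.
            Thread.currentThread().interrupt();
            return ExitStatus.STATUS_CRASHED;
        }
    }

    public static void main(String[] args) {
        System.out.println(run(() -> { }));
        System.out.println(run(() -> { throw new IllegalStateException("boom"); }));
    }
}
```

Running the work on a separate thread keeps the caller free to register cleanup logic against that thread, which is what the original does with its shutdown hook.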

Aggregations

Workflow (com.datastax.oss.dsbulk.workflow.api.Workflow): 3
Config (com.typesafe.config.Config): 3
CqlSession (com.datastax.oss.driver.api.core.CqlSession): 2
Statement (com.datastax.oss.driver.api.core.cql.Statement): 2
Stopwatch (com.datastax.oss.driver.shaded.guava.common.base.Stopwatch): 2
ConvertingCodecFactory (com.datastax.oss.dsbulk.codecs.api.ConvertingCodecFactory): 2
CommonConnectorFeature (com.datastax.oss.dsbulk.connectors.api.CommonConnectorFeature): 2
Connector (com.datastax.oss.dsbulk.connectors.api.Connector): 2
Record (com.datastax.oss.dsbulk.connectors.api.Record): 2
RecordMetadata (com.datastax.oss.dsbulk.connectors.api.RecordMetadata): 2
BulkReader (com.datastax.oss.dsbulk.executor.api.reader.BulkReader): 2
ReadResult (com.datastax.oss.dsbulk.executor.api.result.ReadResult): 2
DurationUtils (com.datastax.oss.dsbulk.workflow.api.utils.DurationUtils): 2
LogManager (com.datastax.oss.dsbulk.workflow.commons.log.LogManager): 2
MetricsManager (com.datastax.oss.dsbulk.workflow.commons.metrics.MetricsManager): 2
ReadResultMapper (com.datastax.oss.dsbulk.workflow.commons.schema.ReadResultMapper): 2
CodecSettings (com.datastax.oss.dsbulk.workflow.commons.settings.CodecSettings): 2
ConnectorSettings (com.datastax.oss.dsbulk.workflow.commons.settings.ConnectorSettings): 2
DriverSettings (com.datastax.oss.dsbulk.workflow.commons.settings.DriverSettings): 2
EngineSettings (com.datastax.oss.dsbulk.workflow.commons.settings.EngineSettings): 2