
Example 1 with Either

Use of org.apache.gobblin.util.Either in project incubator-gobblin by apache.

From the class ConfigBasedDatasetsFinder, method executeItertorExecutor:

protected void executeItertorExecutor(Iterator<Callable<Void>> callableIterator) throws IOException {
    try {
        IteratorExecutor<Void> executor = new IteratorExecutor<>(callableIterator, this.threadPoolSize, ExecutorsUtils.newDaemonThreadFactory(Optional.of(log), Optional.of(this.getClass().getSimpleName())));
        List<Either<Void, ExecutionException>> results = executor.executeAndGetResults();
        IteratorExecutor.logFailures(results, log, 10);
    } catch (InterruptedException ie) {
        throw new IOException("Dataset finder is interrupted.", ie);
    }
}
Also used: IteratorExecutor (org.apache.gobblin.util.executors.IteratorExecutor), Either (org.apache.gobblin.util.Either), IOException (java.io.IOException)
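
executeItertorExecutor only wires the pieces together, so the excerpt can be hard to read in isolation. Below is a minimal, self-contained sketch of the same pattern: turn a plain list into an Iterator<Callable<Void>> with Guava's Iterators.transform and run it on IteratorExecutor, assuming the same imports as the excerpt above (Guava's Function, Iterators and Optional, slf4j's Logger/LoggerFactory, and Gobblin's IteratorExecutor, ExecutorsUtils and Either). The dataset names, thread-pool size and logger name are made up for illustration.

// Illustrative sketch, not from the Gobblin sources: run a few callables
// through IteratorExecutor the same way executeItertorExecutor does above.
final Logger log = LoggerFactory.getLogger("EitherExample");
List<String> datasets = Arrays.asList("dataset1", "dataset2", "dataset3");
Iterator<Callable<Void>> callables = Iterators.transform(datasets.iterator(),
    new Function<String, Callable<Void>>() {
        @Override
        public Callable<Void> apply(final String dataset) {
            return new Callable<Void>() {
                @Override
                public Void call() throws Exception {
                    log.info("Processing {}", dataset);
                    return null;
                }
            };
        }
    });
IteratorExecutor<Void> executor = new IteratorExecutor<>(callables, 3,
    ExecutorsUtils.newDaemonThreadFactory(Optional.of(log), Optional.of("EitherExample")));
try {
    // Each element of the result list holds either the callable's (Void) result
    // or the ExecutionException it failed with.
    List<Either<Void, ExecutionException>> results = executor.executeAndGetResults();
    IteratorExecutor.logFailures(results, log, 10);
} catch (InterruptedException ie) {
    Thread.currentThread().interrupt();
}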

Example 2 with Either

Use of org.apache.gobblin.util.Either in project incubator-gobblin by apache.

From the class FsDatasetStateStore, method getLatestDatasetStatesByUrns:

/**
 * Get a {@link Map} from dataset URNs to the latest {@link JobState.DatasetState}s.
 *
 * @param jobName the job name
 * @return a {@link Map} from dataset URNs to the latest {@link JobState.DatasetState}s
 * @throws IOException if there's something wrong reading the {@link JobState.DatasetState}s
 */
public Map<String, JobState.DatasetState> getLatestDatasetStatesByUrns(final String jobName) throws IOException {
    Path stateStorePath = new Path(this.storeRootDir, jobName);
    if (!this.fs.exists(stateStorePath)) {
        return ImmutableMap.of();
    }
    FileStatus[] stateStoreFileStatuses = this.fs.listStatus(stateStorePath, new PathFilter() {

        @Override
        public boolean accept(Path path) {
            return path.getName().endsWith(CURRENT_DATASET_STATE_FILE_SUFFIX + DATASET_STATE_STORE_TABLE_SUFFIX);
        }
    });
    if (stateStoreFileStatuses == null || stateStoreFileStatuses.length == 0) {
        return ImmutableMap.of();
    }
    final Map<String, JobState.DatasetState> datasetStatesByUrns = new ConcurrentHashMap<>();
    Iterator<Callable<Void>> callableIterator = Iterators.transform(Arrays.asList(stateStoreFileStatuses).iterator(), new Function<FileStatus, Callable<Void>>() {

        @Override
        public Callable<Void> apply(final FileStatus stateStoreFileStatus) {
            return new Callable<Void>() {

                @Override
                public Void call() throws Exception {
                    Path stateStoreFilePath = stateStoreFileStatus.getPath();
                    LOGGER.info("Getting dataset states from: {}", stateStoreFilePath);
                    List<JobState.DatasetState> previousDatasetStates = getAll(jobName, stateStoreFilePath.getName());
                    if (!previousDatasetStates.isEmpty()) {
                        // There should be a single dataset state on the list if the list is not empty
                        JobState.DatasetState previousDatasetState = previousDatasetStates.get(0);
                        datasetStatesByUrns.put(previousDatasetState.getDatasetUrn(), previousDatasetState);
                    }
                    return null;
                }
            };
        }
    });
    try {
        List<Either<Void, ExecutionException>> results = new IteratorExecutor<>(callableIterator, this.threadPoolOfGettingDatasetState, ExecutorsUtils.newDaemonThreadFactory(Optional.of(LOGGER), Optional.of("GetFsDatasetStateStore-"))).executeAndGetResults();
        int maxNumberOfErrorLogs = 10;
        IteratorExecutor.logAndThrowFailures(results, LOGGER, maxNumberOfErrorLogs);
    } catch (InterruptedException e) {
        throw new IOException("Failed to get latest dataset states.", e);
    }
    // A state under the default dataset URN may still be present even though the job
    // has transitioned to the new dataset-based mechanism, so remove it when other dataset states exist.
    if (datasetStatesByUrns.size() > 1) {
        datasetStatesByUrns.remove(ConfigurationKeys.DEFAULT_DATASET_URN);
    }
    return datasetStatesByUrns;
}
Also used: Path (org.apache.hadoop.fs.Path), PathFilter (org.apache.hadoop.fs.PathFilter), FileStatus (org.apache.hadoop.fs.FileStatus), IOException (java.io.IOException), Callable (java.util.concurrent.Callable), ExecutionException (java.util.concurrent.ExecutionException), Either (org.apache.gobblin.util.Either), List (java.util.List), ConcurrentHashMap (java.util.concurrent.ConcurrentHashMap)
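
Note the two different failure-handling helpers in these excerpts: Example 1 calls IteratorExecutor.logFailures, which only logs a bounded number of failures, while this method calls logAndThrowFailures, which, as the name suggests, also propagates the failures so the read does not silently return partial results. A caller that wants to inspect the results itself can walk the Either list directly; the sketch below is illustrative only and assumes Either.Right is the failure side of Either<Void, ExecutionException> and exposes a get() accessor, which should be verified against the actual org.apache.gobblin.util.Either class.

// Illustrative sketch, not from the Gobblin sources. Either.Right and its
// get() accessor are assumed; verify against org.apache.gobblin.util.Either.
int failures = 0;
for (Either<Void, ExecutionException> result : results) {
    if (result instanceof Either.Right) {
        failures++;
        ExecutionException cause = (ExecutionException) ((Either.Right<Void, ExecutionException>) result).get();
        LOGGER.warn("Failed to read a dataset state file.", cause);
    }
}
if (failures > 0) {
    throw new IOException(failures + " dataset state file(s) could not be read.");
}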

Example 3 with Either

Use of org.apache.gobblin.util.Either in project incubator-gobblin by apache.

From the class KafkaAvroJobMonitorTest, method testWrongSchema:

@Test
public void testWrongSchema() throws Exception {
    TestKafkaAvroJobMonitor monitor = new TestKafkaAvroJobMonitor(GobblinTrackingEvent.SCHEMA$, new NoopSchemaVersionWriter());
    monitor.buildMetricsContextAndMetrics();
    AvroSerializer<MetricReport> serializer = new AvroBinarySerializer<>(MetricReport.SCHEMA$, new NoopSchemaVersionWriter());
    MetricReport event = new MetricReport(Maps.<String, String>newHashMap(), 0L, Lists.<Metric>newArrayList());
    Collection<Either<JobSpec, URI>> results = monitor.parseJobSpec(serializer.serializeRecord(event));
    Assert.assertEquals(results.size(), 0);
    Assert.assertEquals(monitor.events.size(), 0);
    Assert.assertEquals(monitor.getMessageParseFailures().getCount(), 1);
    monitor.shutdownMetrics();
}
Also used: Either (org.apache.gobblin.util.Either), MetricReport (org.apache.gobblin.metrics.MetricReport), NoopSchemaVersionWriter (org.apache.gobblin.metrics.reporter.util.NoopSchemaVersionWriter), AvroBinarySerializer (org.apache.gobblin.metrics.reporter.util.AvroBinarySerializer), HighLevelConsumerTest (org.apache.gobblin.runtime.kafka.HighLevelConsumerTest), Test (org.testng.annotations.Test)

Example 4 with Either

Use of org.apache.gobblin.util.Either in project incubator-gobblin by apache.

From the class KafkaAvroJobMonitorTest, method testWrongSchemaVersionWriter:

@Test
public void testWrongSchemaVersionWriter() throws Exception {
    TestKafkaAvroJobMonitor monitor = new TestKafkaAvroJobMonitor(GobblinTrackingEvent.SCHEMA$, new NoopSchemaVersionWriter());
    monitor.buildMetricsContextAndMetrics();
    AvroSerializer<GobblinTrackingEvent> serializer = new AvroBinarySerializer<>(GobblinTrackingEvent.SCHEMA$, new FixedSchemaVersionWriter());
    GobblinTrackingEvent event = new GobblinTrackingEvent(0L, "namespace", "event", Maps.<String, String>newHashMap());
    Collection<Either<JobSpec, URI>> results = monitor.parseJobSpec(serializer.serializeRecord(event));
    Assert.assertEquals(results.size(), 0);
    Assert.assertEquals(monitor.events.size(), 0);
    Assert.assertEquals(monitor.getMessageParseFailures().getCount(), 1);
    monitor.shutdownMetrics();
}
Also used: GobblinTrackingEvent (org.apache.gobblin.metrics.GobblinTrackingEvent), FixedSchemaVersionWriter (org.apache.gobblin.metrics.reporter.util.FixedSchemaVersionWriter), Either (org.apache.gobblin.util.Either), NoopSchemaVersionWriter (org.apache.gobblin.metrics.reporter.util.NoopSchemaVersionWriter), AvroBinarySerializer (org.apache.gobblin.metrics.reporter.util.AvroBinarySerializer), HighLevelConsumerTest (org.apache.gobblin.runtime.kafka.HighLevelConsumerTest), Test (org.testng.annotations.Test)

Example 5 with Either

Use of org.apache.gobblin.util.Either in project incubator-gobblin by apache.

From the class KafkaAvroJobMonitorTest, method testUsingSchemaVersion:

@Test
public void testUsingSchemaVersion() throws Exception {
    TestKafkaAvroJobMonitor monitor = new TestKafkaAvroJobMonitor(GobblinTrackingEvent.SCHEMA$, new FixedSchemaVersionWriter());
    monitor.buildMetricsContextAndMetrics();
    AvroSerializer<GobblinTrackingEvent> serializer = new AvroBinarySerializer<>(GobblinTrackingEvent.SCHEMA$, new FixedSchemaVersionWriter());
    GobblinTrackingEvent event = new GobblinTrackingEvent(0L, "namespace", "event", Maps.<String, String>newHashMap());
    Collection<Either<JobSpec, URI>> results = monitor.parseJobSpec(serializer.serializeRecord(event));
    Assert.assertEquals(results.size(), 1);
    Assert.assertEquals(monitor.events.size(), 1);
    Assert.assertEquals(monitor.events.get(0), event);
    monitor.shutdownMetrics();
}
Also used: GobblinTrackingEvent (org.apache.gobblin.metrics.GobblinTrackingEvent), FixedSchemaVersionWriter (org.apache.gobblin.metrics.reporter.util.FixedSchemaVersionWriter), Either (org.apache.gobblin.util.Either), AvroBinarySerializer (org.apache.gobblin.metrics.reporter.util.AvroBinarySerializer), HighLevelConsumerTest (org.apache.gobblin.runtime.kafka.HighLevelConsumerTest), Test (org.testng.annotations.Test)
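
The three tests above only assert on the number of parsed results; none of them looks inside the returned collection. When the output of parseJobSpec is actually consumed, each element of Collection<Either<JobSpec, URI>> carries either a full JobSpec (typically an added or updated spec) or a bare URI (typically a spec to remove). The dispatch below is a sketch of that branching: the Either.Left/Either.Right subtypes and their get() accessor are assumed, and onNewJobSpec/onJobSpecRemoved are hypothetical handlers, not methods of KafkaAvroJobMonitor.

// Illustrative sketch only. Either.Left/Either.Right and get() are assumed
// here; onNewJobSpec and onJobSpecRemoved are hypothetical handlers.
for (Either<JobSpec, URI> parsed : results) {
    if (parsed instanceof Either.Left) {
        // a full spec was parsed from the message
        JobSpec jobSpec = (JobSpec) ((Either.Left<JobSpec, URI>) parsed).get();
        onNewJobSpec(jobSpec);
    } else {
        // only a URI was parsed, signalling a spec removal
        URI removedSpecUri = (URI) ((Either.Right<JobSpec, URI>) parsed).get();
        onJobSpecRemoved(removedSpecUri);
    }
}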

Aggregations

Either (org.apache.gobblin.util.Either): 14 usages
HighLevelConsumerTest (org.apache.gobblin.runtime.kafka.HighLevelConsumerTest): 7 usages
Test (org.testng.annotations.Test): 7 usages
GobblinTrackingEvent (org.apache.gobblin.metrics.GobblinTrackingEvent): 6 usages
NoopSchemaVersionWriter (org.apache.gobblin.metrics.reporter.util.NoopSchemaVersionWriter): 6 usages
IOException (java.io.IOException): 5 usages
URI (java.net.URI): 4 usages
ExecutionException (java.util.concurrent.ExecutionException): 4 usages
AvroBinarySerializer (org.apache.gobblin.metrics.reporter.util.AvroBinarySerializer): 4 usages
Callable (java.util.concurrent.Callable): 2 usages
FixedSchemaVersionWriter (org.apache.gobblin.metrics.reporter.util.FixedSchemaVersionWriter): 2 usages
JobSpec (org.apache.gobblin.runtime.api.JobSpec): 2 usages
Logger (org.slf4j.Logger): 2 usages
Function (com.google.common.base.Function): 1 usage
ImmutableMap (com.google.common.collect.ImmutableMap): 1 usage
Config (com.typesafe.config.Config): 1 usage
URISyntaxException (java.net.URISyntaxException): 1 usage
List (java.util.List): 1 usage
Map (java.util.Map): 1 usage
Properties (java.util.Properties): 1 usage