Example 1 with Sink

use of com.hazelcast.jet.pipeline.Sink in project hazelcast-jet by hazelcast.

the class PipelineImpl method drainTo.

@Override
public <T> SinkStage drainTo(@Nonnull Sink<T> sink, GeneralStage<?>... stagesToDrain) {
    if (stagesToDrain == null || stagesToDrain.length == 0) {
        throw new IllegalArgumentException("No stages supplied to Pipeline.drainTo()");
    }
    List<Transform> upstream = Arrays.stream(stagesToDrain)
            .map(s -> (AbstractStage) s)
            .map(s -> s.transform)
            .collect(toList());
    int[] ordinalsToAdapt = IntStream.range(0, stagesToDrain.length)
            .filter(i -> ((ComputeStageImplBase) stagesToDrain[i]).fnAdapter == ADAPT_TO_JET_EVENT)
            .toArray();
    SinkImpl sinkImpl = (SinkImpl) sink;
    SinkTransform sinkTransform = new SinkTransform(sinkImpl, upstream, ordinalsToAdapt);
    SinkStageImpl sinkStage = new SinkStageImpl(sinkTransform, this);
    sinkImpl.onAssignToStage();
    connect(upstream, sinkTransform);
    return sinkStage;
}
Also used : GeneralStage(com.hazelcast.jet.pipeline.GeneralStage) IntStream(java.util.stream.IntStream) DONT_ADAPT(com.hazelcast.jet.impl.pipeline.ComputeStageImplBase.DONT_ADAPT) StreamStage(com.hazelcast.jet.pipeline.StreamStage) Arrays(java.util.Arrays) BatchSource(com.hazelcast.jet.pipeline.BatchSource) BatchSourceTransform(com.hazelcast.jet.impl.pipeline.transform.BatchSourceTransform) Pipeline(com.hazelcast.jet.pipeline.Pipeline) HashMap(java.util.HashMap) StreamSource(com.hazelcast.jet.pipeline.StreamSource) Transform(com.hazelcast.jet.impl.pipeline.transform.Transform) ADAPT_TO_JET_EVENT(com.hazelcast.jet.impl.pipeline.ComputeStageImplBase.ADAPT_TO_JET_EVENT) ArrayList(java.util.ArrayList) BatchStage(com.hazelcast.jet.pipeline.BatchStage) SinkTransform(com.hazelcast.jet.impl.pipeline.transform.SinkTransform) SinkStage(com.hazelcast.jet.pipeline.SinkStage) List(java.util.List) Collectors.toList(java.util.stream.Collectors.toList) Map(java.util.Map) StreamSourceTransform(com.hazelcast.jet.impl.pipeline.transform.StreamSourceTransform) DAG(com.hazelcast.jet.core.DAG) Nonnull(javax.annotation.Nonnull) Sink(com.hazelcast.jet.pipeline.Sink)
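
The ordinalsToAdapt computation in drainTo above selects the positions of those input stages whose adapter is ADAPT_TO_JET_EVENT. The following is a minimal, library-free sketch of that index-filtering idiom only. The OrdinalSelection class and its Adapter enum are hypothetical stand-ins introduced for illustration; they are not part of the Hazelcast code base, where fnAdapter is a field of ComputeStageImplBase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

final class OrdinalSelection {

    // Hypothetical stand-in for the adapter marker that drainTo compares with ==.
    enum Adapter { DONT_ADAPT, ADAPT_TO_JET_EVENT }

    // Returns the positions (ordinals) of the inputs whose adapter is ADAPT_TO_JET_EVENT.
    static int[] ordinalsToAdapt(List<Adapter> adapters) {
        return IntStream.range(0, adapters.size())
                .filter(i -> adapters.get(i) == Adapter.ADAPT_TO_JET_EVENT)
                .toArray();
    }

    public static void main(String[] args) {
        int[] ordinals = ordinalsToAdapt(List.of(
                Adapter.DONT_ADAPT, Adapter.ADAPT_TO_JET_EVENT, Adapter.ADAPT_TO_JET_EVENT));
        System.out.println(Arrays.toString(ordinals)); // [1, 2]
    }
}
```

Iterating over indices rather than elements is what lets the result refer back to positions in the upstream list.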

Example 2 with Sink

use of com.hazelcast.jet.pipeline.Sink in project hazelcast by hazelcast.

the class PipelineImpl method writeTo.

@Nonnull
@Override
@SuppressWarnings({ "rawtypes", "unchecked" })
public <T> SinkStage writeTo(
        @Nonnull Sink<? super T> sink,
        @Nonnull GeneralStage<? extends T> stage0,
        @Nonnull GeneralStage<? extends T> stage1,
        @Nonnull GeneralStage<? extends T>... moreStages
) {
    List<GeneralStage> stages = new ArrayList<>(asList(moreStages));
    stages.add(0, stage0);
    stages.add(1, stage1);
    List<Transform> upstream = stages.stream()
            .map(s -> (AbstractStage) s)
            .map(s -> s.transform)
            .collect(toList());
    int[] ordinalsToAdapt = IntStream.range(0, stages.size())
            .filter(i -> ((ComputeStageImplBase) stages.get(i)).fnAdapter == ADAPT_TO_JET_EVENT)
            .toArray();
    SinkImpl sinkImpl = (SinkImpl) sink;
    SinkTransform sinkTransform = new SinkTransform(sinkImpl, upstream, ordinalsToAdapt);
    SinkStageImpl sinkStage = new SinkStageImpl(sinkTransform, this);
    sinkImpl.onAssignToStage();
    connectGeneralStages(stages, sinkTransform);
    return sinkStage;
}
Also used : IntStream(java.util.stream.IntStream) Util.escapeGraphviz(com.hazelcast.jet.impl.util.Util.escapeGraphviz) HashMap(java.util.HashMap) StreamSource(com.hazelcast.jet.pipeline.StreamSource) ArrayList(java.util.ArrayList) Collections.singletonList(java.util.Collections.singletonList) BatchStage(com.hazelcast.jet.pipeline.BatchStage) HashSet(java.util.HashSet) LinkedHashMap(java.util.LinkedHashMap) SinkTransform(com.hazelcast.jet.impl.pipeline.transform.SinkTransform) Arrays.asList(java.util.Arrays.asList) Map(java.util.Map) DAG(com.hazelcast.jet.core.DAG) Nonnull(javax.annotation.Nonnull) GeneralStage(com.hazelcast.jet.pipeline.GeneralStage) BatchSource(com.hazelcast.jet.pipeline.BatchSource) BatchSourceTransform(com.hazelcast.jet.impl.pipeline.transform.BatchSourceTransform) Pipeline(com.hazelcast.jet.pipeline.Pipeline) Set(java.util.Set) AbstractTransform(com.hazelcast.jet.impl.pipeline.transform.AbstractTransform) Transform(com.hazelcast.jet.impl.pipeline.transform.Transform) File(java.io.File) ADAPT_TO_JET_EVENT(com.hazelcast.jet.impl.pipeline.ComputeStageImplBase.ADAPT_TO_JET_EVENT) SinkStage(com.hazelcast.jet.pipeline.SinkStage) List(java.util.List) Collectors.toList(java.util.stream.Collectors.toList) Entry(java.util.Map.Entry) Util.addOrIncrementIndexInName(com.hazelcast.jet.impl.util.Util.addOrIncrementIndexInName) StreamSourceTransform(com.hazelcast.jet.impl.pipeline.transform.StreamSourceTransform) StreamSourceStage(com.hazelcast.jet.pipeline.StreamSourceStage) Collections(java.util.Collections) Sink(com.hazelcast.jet.pipeline.Sink)

Example 3 with Sink

use of com.hazelcast.jet.pipeline.Sink in project hazelcast by hazelcast.

the class SinkStressTestUtil method test_withRestarts.

public static void test_withRestarts(
        @Nonnull HazelcastInstance instance,
        @Nonnull ILogger logger,
        @Nonnull Sink<Integer> sink,
        boolean graceful,
        boolean exactlyOnce,
        @Nonnull SupplierEx<List<Integer>> actualItemsSupplier
) {
    int numItems = 1000;
    Pipeline p = Pipeline.create();
    p.readFrom(SourceBuilder
            .stream("src", procCtx -> new int[] { procCtx.globalProcessorIndex() == 0 ? 0 : Integer.MAX_VALUE })
            .<Integer>fillBufferFn((ctx, buf) -> {
                if (ctx[0] < numItems) {
                    buf.add(ctx[0]++);
                    sleepMillis(5);
                }
            })
            .distributed(1)
            .createSnapshotFn(ctx -> ctx[0] < Integer.MAX_VALUE ? ctx[0] : null)
            .restoreSnapshotFn((ctx, state) -> ctx[0] = ctx[0] != Integer.MAX_VALUE ? state.get(0) : Integer.MAX_VALUE)
            .build())
            .withoutTimestamps()
            .peek()
            .writeTo(sink);
    JobConfig config = new JobConfig()
            .setProcessingGuarantee(exactlyOnce ? EXACTLY_ONCE : AT_LEAST_ONCE)
            .setSnapshotIntervalMillis(50);
    JobProxy job = (JobProxy) instance.getJet().newJob(p, config);
    long endTime = System.nanoTime() + SECONDS.toNanos(TEST_TIMEOUT_SECONDS);
    int lastCount = 0;
    String expectedRows = IntStream.range(0, numItems)
            .mapToObj(i -> i + (exactlyOnce ? "=1" : ""))
            .collect(joining("\n"));
    // We'll restart once, then restart again after a short sleep (possibly during initialization),
    // and then assert some output so that the test isn't constantly restarting without any progress
    Long lastExecutionId = null;
    for (; ; ) {
        lastExecutionId = assertJobRunningEventually(instance, job, lastExecutionId);
        job.restart(graceful);
        lastExecutionId = assertJobRunningEventually(instance, job, lastExecutionId);
        sleepMillis(ThreadLocalRandom.current().nextInt(400));
        job.restart(graceful);
        try {
            List<Integer> actualItems;
            Set<Integer> distinctActualItems;
            do {
                actualItems = actualItemsSupplier.get();
                distinctActualItems = new HashSet<>(actualItems);
            } while (distinctActualItems.size() < Math.min(numItems, 100 + lastCount) && System.nanoTime() < endTime);
            lastCount = distinctActualItems.size();
            logger.info("number of committed items in the sink so far: " + lastCount);
            if (exactlyOnce) {
                String actualItemsStr = actualItems.stream()
                        .collect(groupingBy(identity(), TreeMap::new, counting()))
                        .entrySet().stream()
                        .map(Object::toString)
                        .collect(joining("\n"));
                assertEquals(expectedRows, actualItemsStr);
            } else {
                assertEquals(expectedRows, distinctActualItems.stream().map(Objects::toString).collect(joining("\n")));
            }
            // if content matches, break the loop. Otherwise restart and try again
            break;
        } catch (AssertionError e) {
            if (System.nanoTime() >= endTime) {
                throw e;
            }
        }
    }
}
Also used : IntStream(java.util.stream.IntStream) Collectors.counting(java.util.stream.Collectors.counting) Collectors.groupingBy(java.util.stream.Collectors.groupingBy) JobProxy(com.hazelcast.jet.impl.JobProxy) HashSet(java.util.HashSet) ILogger(com.hazelcast.logging.ILogger) ThreadLocalRandom(java.util.concurrent.ThreadLocalRandom) Nonnull(javax.annotation.Nonnull) HazelcastInstance(com.hazelcast.core.HazelcastInstance) HazelcastTestSupport.sleepMillis(com.hazelcast.test.HazelcastTestSupport.sleepMillis) Pipeline(com.hazelcast.jet.pipeline.Pipeline) EXACTLY_ONCE(com.hazelcast.jet.config.ProcessingGuarantee.EXACTLY_ONCE) JobConfig(com.hazelcast.jet.config.JobConfig) Set(java.util.Set) SupplierEx(com.hazelcast.function.SupplierEx) Collectors.joining(java.util.stream.Collectors.joining) Objects(java.util.Objects) List(java.util.List) TreeMap(java.util.TreeMap) JetTestSupport.assertJobRunningEventually(com.hazelcast.jet.core.JetTestSupport.assertJobRunningEventually) Function.identity(java.util.function.Function.identity) SourceBuilder(com.hazelcast.jet.pipeline.SourceBuilder) AT_LEAST_ONCE(com.hazelcast.jet.config.ProcessingGuarantee.AT_LEAST_ONCE) Sink(com.hazelcast.jet.pipeline.Sink) SECONDS(java.util.concurrent.TimeUnit.SECONDS) Assert.assertEquals(org.junit.Assert.assertEquals)
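
The exactly-once branch of test_withRestarts above compares two strings: one "item=1" row per expected item, and one "item=count" row per distinct item actually found in the sink. A duplicate write therefore shows up as a count other than 1. The following is a self-contained sketch of just that comparison, using only the JDK. The class name ExactlyOnceCheck and its method names are assumptions made for illustration.

```java
import java.util.List;
import java.util.TreeMap;
import java.util.stream.IntStream;

import static java.util.function.Function.identity;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.joining;

final class ExactlyOnceCheck {

    // Expected rows for items 0..numItems-1, each seen exactly once: "0=1\n1=1\n..."
    static String expectedRows(int numItems) {
        return IntStream.range(0, numItems)
                .mapToObj(i -> i + "=1")
                .collect(joining("\n"));
    }

    // Counts occurrences per item and renders them, sorted by item, as "item=count" rows.
    static String actualRows(List<Integer> items) {
        return items.stream()
                .collect(groupingBy(identity(), TreeMap::new, counting()))
                .entrySet().stream()
                .map(Object::toString)
                .collect(joining("\n"));
    }

    public static void main(String[] args) {
        // Order of arrival does not matter because the TreeMap sorts by item.
        System.out.println(expectedRows(3).equals(actualRows(List.of(2, 0, 1))));    // true
        // Item 1 was written twice, so its row becomes "1=2" and the comparison fails.
        System.out.println(expectedRows(3).equals(actualRows(List.of(0, 1, 1, 2)))); // false
    }
}
```

Comparing rendered strings rather than maps keeps the assertion failure message readable, since the mismatching rows appear directly in the diff.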

Example 4 with Sink

use of com.hazelcast.jet.pipeline.Sink in project hazelcast by hazelcast.

the class ElasticSinkBuilder method build.

/**
 * Create a sink that writes data into Elasticsearch based on this builder configuration
 */
@Nonnull
public Sink<T> build() {
    requireNonNull(clientFn, "clientFn is not set");
    requireNonNull(mapToRequestFn, "mapToRequestFn is not set");
    return SinkBuilder
            .sinkBuilder(DEFAULT_NAME, ctx -> new BulkContext(
                    new RestHighLevelClient(clientFn.get()), bulkRequestFn, optionsFn, retries, ctx.logger()))
            .<T>receiveFn((bulkContext, item) -> bulkContext.add(mapToRequestFn.apply(item)))
            .flushFn(BulkContext::flush)
            .destroyFn(BulkContext::close)
            .preferredLocalParallelism(DEFAULT_LOCAL_PARALLELISM)
            .build();
}
Also used : FunctionEx(com.hazelcast.function.FunctionEx) ActionRequest(org.elasticsearch.action.ActionRequest) RestClientBuilder(org.elasticsearch.client.RestClientBuilder) Util.checkNonNullAndSerializable(com.hazelcast.jet.impl.util.Util.checkNonNullAndSerializable) BulkResponse(org.elasticsearch.action.bulk.BulkResponse) IOException(java.io.IOException) DocWriteRequest(org.elasticsearch.action.DocWriteRequest) RestHighLevelClient(org.elasticsearch.client.RestHighLevelClient) SupplierEx(com.hazelcast.function.SupplierEx) Serializable(java.io.Serializable) JetException(com.hazelcast.jet.JetException) ILogger(com.hazelcast.logging.ILogger) Objects.requireNonNull(java.util.Objects.requireNonNull) RequestOptions(org.elasticsearch.client.RequestOptions) SinkBuilder(com.hazelcast.jet.pipeline.SinkBuilder) BulkRequest(org.elasticsearch.action.bulk.BulkRequest) Nonnull(javax.annotation.Nonnull) Sink(com.hazelcast.jet.pipeline.Sink) RetryUtils.withRetry(com.hazelcast.jet.elastic.impl.RetryUtils.withRetry)
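
The build method above wires a per-processor context object into three callbacks: receiveFn adds an item to the context, flushFn sends whatever has accumulated, and destroyFn releases the context. The following is a rough, library-free sketch of a context with that add/flush/close shape. BufferingContext is a hypothetical stand-in introduced here; it is not the BulkContext used by ElasticSinkBuilder, whose internals are not shown in the snippet, and it talks to a plain Consumer rather than an Elasticsearch client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical batching context: collects requests and hands them over in one batch on flush.
final class BufferingContext<R> implements AutoCloseable {

    private final Consumer<List<R>> sender;      // assumed destination for a finished batch
    private final List<R> pending = new ArrayList<>();
    private boolean closed;

    BufferingContext(Consumer<List<R>> sender) {
        this.sender = sender;
    }

    // Called once per item, in the role of receiveFn.
    void add(R request) {
        pending.add(request);
    }

    // Called in the role of flushFn: sends the batch only if something is pending.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(pending));
        pending.clear();
    }

    // Called in the role of destroyFn: marks the context as released.
    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        List<List<String>> sent = new ArrayList<>();
        try (BufferingContext<String> ctx = new BufferingContext<>(sent::add)) {
            ctx.add("a");
            ctx.add("b");
            ctx.flush();
            ctx.flush(); // nothing pending, so nothing is sent
        }
        System.out.println(sent); // [[a, b]]
    }
}
```

Skipping empty flushes matters in this shape because a flush callback may be invoked even when no new items have arrived since the previous one.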

Example 5 with Sink

use of com.hazelcast.jet.pipeline.Sink in project hazelcast by hazelcast.

the class LocalElasticSinkTest method when_writeToSink_then_shouldCloseClient.

@Test
public void when_writeToSink_then_shouldCloseClient() throws IOException {
    ClientHolder.elasticClients.clear();
    Sink<String> elasticSink = new ElasticSinkBuilder<>()
            .clientFn(() -> {
                RestClientBuilder builder = spy(RestClient.builder(
                        HttpHost.create(ElasticSupport.elastic.get().getHttpHostAddress())));
                when(builder.build()).thenAnswer(invocation -> {
                    Object result = invocation.callRealMethod();
                    RestClient client = (RestClient) spy(result);
                    ClientHolder.elasticClients.add(client);
                    return client;
                });
                return builder;
            })
            .bulkRequestFn(() -> new BulkRequest().setRefreshPolicy(RefreshPolicy.IMMEDIATE))
            .mapToRequestFn((String item) -> new IndexRequest("my-index").source(Collections.emptyMap()))
            .build();
    Pipeline p = Pipeline.create();
    p.readFrom(TestSources.items("a", "b", "c")).writeTo(elasticSink);
    hz.getJet().newJob(p).join();
    for (RestClient client : ClientHolder.elasticClients) {
        verify(client).close();
    }
}
Also used : RestClient(org.elasticsearch.client.RestClient) HazelcastInstance(com.hazelcast.core.HazelcastInstance) RestClientBuilder(org.elasticsearch.client.RestClientBuilder) Pipeline(com.hazelcast.jet.pipeline.Pipeline) Test(org.junit.Test) IOException(java.io.IOException) Mockito.when(org.mockito.Mockito.when) Mockito.spy(org.mockito.Mockito.spy) Mockito.verify(org.mockito.Mockito.verify) TestSources(com.hazelcast.jet.pipeline.test.TestSources) IndexRequest(org.elasticsearch.action.index.IndexRequest) ClientHolder(com.hazelcast.jet.elastic.ElasticSinkBuilderTest.ClientHolder) After(org.junit.After) TestHazelcastFactory(com.hazelcast.client.test.TestHazelcastFactory) HttpHost(org.apache.http.HttpHost) BulkRequest(org.elasticsearch.action.bulk.BulkRequest) RefreshPolicy(org.elasticsearch.action.support.WriteRequest.RefreshPolicy) Collections(java.util.Collections) Sink(com.hazelcast.jet.pipeline.Sink)

Aggregations

Sink (com.hazelcast.jet.pipeline.Sink) 7
Pipeline (com.hazelcast.jet.pipeline.Pipeline) 5
Nonnull (javax.annotation.Nonnull) 5
HazelcastInstance (com.hazelcast.core.HazelcastInstance) 4
FunctionEx (com.hazelcast.function.FunctionEx) 3
SupplierEx (com.hazelcast.function.SupplierEx) 3
BatchSource (com.hazelcast.jet.pipeline.BatchSource) 3
List (java.util.List) 3
IntStream (java.util.stream.IntStream) 3
DAG (com.hazelcast.jet.core.DAG) 2
ADAPT_TO_JET_EVENT (com.hazelcast.jet.impl.pipeline.ComputeStageImplBase.ADAPT_TO_JET_EVENT) 2
BatchSourceTransform (com.hazelcast.jet.impl.pipeline.transform.BatchSourceTransform) 2
SinkTransform (com.hazelcast.jet.impl.pipeline.transform.SinkTransform) 2
StreamSourceTransform (com.hazelcast.jet.impl.pipeline.transform.StreamSourceTransform) 2
Transform (com.hazelcast.jet.impl.pipeline.transform.Transform) 2
BatchStage (com.hazelcast.jet.pipeline.BatchStage) 2
GeneralStage (com.hazelcast.jet.pipeline.GeneralStage) 2
SinkBuilder (com.hazelcast.jet.pipeline.SinkBuilder) 2
SinkStage (com.hazelcast.jet.pipeline.SinkStage) 2
SourceBuilder (com.hazelcast.jet.pipeline.SourceBuilder) 2