
Example 36 with BulkResponse

Use of org.elasticsearch.action.bulk.BulkResponse in project fess-crawler by codelibs.

Class EsUrlQueueService, method updateSessionId.

@Override
public void updateSessionId(final String oldSessionId, final String newSessionId) {
    SearchResponse response = null;
    while (true) {
        if (response == null) {
            response = getClient().get(c -> c.prepareSearch(index).setTypes(type).setScroll(new TimeValue(scrollTimeout)).setQuery(QueryBuilders.boolQuery().filter(QueryBuilders.termQuery(SESSION_ID, oldSessionId))).setSize(scrollSize).execute());
        } else {
            final String scrollId = response.getScrollId();
            response = getClient().get(c -> c.prepareSearchScroll(scrollId).setScroll(new TimeValue(scrollTimeout)).execute());
        }
        final SearchHits searchHits = response.getHits();
        if (searchHits.getHits().length == 0) {
            break;
        }
        final BulkResponse bulkResponse = getClient().get(c -> {
            final BulkRequestBuilder builder = c.prepareBulk();
            for (final SearchHit searchHit : searchHits) {
                final UpdateRequestBuilder updateRequest = c.prepareUpdate(index, type, searchHit.getId()).setDoc(SESSION_ID, newSessionId);
                builder.add(updateRequest);
            }
            return builder.execute();
        });
        if (bulkResponse.hasFailures()) {
            throw new EsAccessException(bulkResponse.buildFailureMessage());
        }
    }
}
Also used : SortBuilders(org.elasticsearch.search.sort.SortBuilders) SearchHits(org.elasticsearch.search.SearchHits) LoggerFactory(org.slf4j.LoggerFactory) UpdateRequestBuilder(org.elasticsearch.action.update.UpdateRequestBuilder) QueryBuilders(org.elasticsearch.index.query.QueryBuilders) ArrayList(java.util.ArrayList) PreDestroy(javax.annotation.PreDestroy) OpType(org.elasticsearch.action.DocWriteRequest.OpType) EsUrlQueue(org.codelibs.fess.crawler.entity.EsUrlQueue) Map(java.util.Map) TimeValue(org.elasticsearch.common.unit.TimeValue) SearchResponse(org.elasticsearch.action.search.SearchResponse) RefreshPolicy(org.elasticsearch.action.support.WriteRequest.RefreshPolicy) SearchHit(org.elasticsearch.search.SearchHit) Logger(org.slf4j.Logger) EsAccessException(org.codelibs.fess.crawler.exception.EsAccessException) ConcurrentHashMap(java.util.concurrent.ConcurrentHashMap) Resource(javax.annotation.Resource) StringUtil(org.codelibs.core.lang.StringUtil) BulkResponse(org.elasticsearch.action.bulk.BulkResponse) Collectors(java.util.stream.Collectors) Constants(org.codelibs.fess.crawler.Constants) UrlQueueService(org.codelibs.fess.crawler.service.UrlQueueService) List(java.util.List) PostConstruct(javax.annotation.PostConstruct) SortOrder(org.elasticsearch.search.sort.SortOrder) AccessResult(org.codelibs.fess.crawler.entity.AccessResult) Queue(java.util.Queue) UrlQueue(org.codelibs.fess.crawler.entity.UrlQueue) ConcurrentLinkedQueue(java.util.concurrent.ConcurrentLinkedQueue) BulkRequestBuilder(org.elasticsearch.action.bulk.BulkRequestBuilder)

Example 37 with BulkResponse

Use of org.elasticsearch.action.bulk.BulkResponse in project fess-crawler by codelibs.

Class EsUrlQueueService, method poll.

@Override
public EsUrlQueue poll(final String sessionId) {
    final QueueHolder queueHolder = getQueueHolder(sessionId);
    final Queue<EsUrlQueue> waitingQueue = queueHolder.waitingQueue;
    final Queue<EsUrlQueue> crawlingQueue = queueHolder.crawlingQueue;
    EsUrlQueue urlQueue = waitingQueue.poll();
    if (urlQueue != null) {
        if (crawlingQueue.size() > maxCrawlingQueueSize) {
            crawlingQueue.poll();
        }
        crawlingQueue.add(urlQueue);
        return urlQueue;
    }
    synchronized (queueHolder) {
        urlQueue = waitingQueue.poll();
        if (urlQueue == null) {
            final List<EsUrlQueue> urlQueueList = getList(EsUrlQueue.class, sessionId, null, 0, pollingFetchSize, SortBuilders.fieldSort(CREATE_TIME).order(SortOrder.ASC));
            if (urlQueueList.isEmpty()) {
                return null;
            }
            if (logger.isDebugEnabled()) {
                logger.debug("Queued URL: {}", urlQueueList);
            }
            waitingQueue.addAll(urlQueueList);
            if (!urlQueueList.isEmpty()) {
                try {
                    // delete from es
                    final BulkResponse response = getClient().get(c -> {
                        final BulkRequestBuilder bulkBuilder = c.prepareBulk();
                        for (final EsUrlQueue uq : urlQueueList) {
                            bulkBuilder.add(c.prepareDelete(index, type, uq.getId()));
                        }
                        return bulkBuilder.setRefreshPolicy(RefreshPolicy.IMMEDIATE).execute();
                    });
                    if (response.hasFailures()) {
                        logger.warn(response.buildFailureMessage());
                    }
                } catch (final Exception e) {
                    throw new EsAccessException("Failed to delete " + urlQueueList, e);
                }
            }
            urlQueue = waitingQueue.poll();
            if (urlQueue == null) {
                return null;
            }
        }
    }
    if (crawlingQueue.size() > maxCrawlingQueueSize) {
        crawlingQueue.poll();
    }
    crawlingQueue.add(urlQueue);
    return urlQueue;
}
Also used : EsAccessException(org.codelibs.fess.crawler.exception.EsAccessException) BulkResponse(org.elasticsearch.action.bulk.BulkResponse) EsUrlQueue(org.codelibs.fess.crawler.entity.EsUrlQueue) BulkRequestBuilder(org.elasticsearch.action.bulk.BulkRequestBuilder)
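
The poll method in Example 37 moves each URL from a waiting queue to a size-limited crawling queue. The following is a minimal, single-threaded sketch of just that hand-off, with the Elasticsearch refill and delete steps left out. `TwoQueueSketch` and its `String` items are hypothetical stand-ins for illustration, not fess-crawler code.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical holder mirroring the waiting/crawling queue pair; not fess-crawler code.
final class TwoQueueSketch {
    final Queue<String> waiting = new ArrayDeque<>();
    final Queue<String> crawling = new ArrayDeque<>();
    final int maxCrawlingQueueSize;

    TwoQueueSketch(int maxCrawlingQueueSize) {
        this.maxCrawlingQueueSize = maxCrawlingQueueSize;
    }

    // Takes the next waiting item and records it as in flight,
    // evicting the oldest in-flight entry once the size exceeds the limit.
    String poll() {
        String next = waiting.poll();
        if (next == null) {
            return null; // the method in Example 37 would try to refill from Elasticsearch here
        }
        if (crawling.size() > maxCrawlingQueueSize) {
            crawling.poll();
        }
        crawling.add(next);
        return next;
    }
}
```

Because the size check runs before the add, the crawling queue in this sketch can hold up to maxCrawlingQueueSize + 1 entries, which matches the order of operations in the example above.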

Example 38 with BulkResponse

Use of org.elasticsearch.action.bulk.BulkResponse in project samza by apache.

Class ElasticsearchSystemProducer, method register.

@Override
public void register(final String source) {
    BulkProcessor.Listener listener = new BulkProcessor.Listener() {

        @Override
        public void beforeBulk(long executionId, BulkRequest request) {
            // Nothing to do.
        }

        @Override
        public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
            boolean hasFatalError = false;
            // Do not consider version conflicts to be errors; ignore old versions.
            if (response.hasFailures()) {
                for (BulkItemResponse itemResp : response.getItems()) {
                    if (itemResp.isFailed()) {
                        if (itemResp.getFailure().getStatus().equals(RestStatus.CONFLICT)) {
                            LOGGER.info("Failed to index document in Elasticsearch: " + itemResp.getFailureMessage());
                        } else {
                            hasFatalError = true;
                            LOGGER.error("Failed to index document in Elasticsearch: " + itemResp.getFailureMessage());
                        }
                    }
                }
            }
            if (hasFatalError) {
                sendFailed.set(true);
            } else {
                updateSuccessMetrics(response);
            }
        }

        @Override
        public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
            LOGGER.error(failure.getMessage());
            thrown.compareAndSet(null, failure);
            sendFailed.set(true);
        }

        private void updateSuccessMetrics(BulkResponse response) {
            metrics.bulkSendSuccess.inc();
            int writes = 0;
            for (BulkItemResponse itemResp : response.getItems()) {
                if (itemResp.isFailed()) {
                    if (itemResp.getFailure().getStatus().equals(RestStatus.CONFLICT)) {
                        metrics.conflicts.inc();
                    }
                } else {
                    ActionResponse resp = itemResp.getResponse();
                    if (resp instanceof IndexResponse) {
                        writes += 1;
                        if (((IndexResponse) resp).isCreated()) {
                            metrics.inserts.inc();
                        } else {
                            metrics.updates.inc();
                        }
                    } else {
                        LOGGER.error("Unexpected Elasticsearch action response type: " + resp.getClass().getSimpleName());
                    }
                }
            }
            LOGGER.info(String.format("Wrote %s messages from %s to %s.", writes, source, system));
        }
    };
    sourceBulkProcessor.put(source, bulkProcessorFactory.getBulkProcessor(client, listener));
}
Also used : IndexResponse(org.elasticsearch.action.index.IndexResponse) BulkProcessor(org.elasticsearch.action.bulk.BulkProcessor) BulkRequest(org.elasticsearch.action.bulk.BulkRequest) BulkItemResponse(org.elasticsearch.action.bulk.BulkItemResponse) BulkResponse(org.elasticsearch.action.bulk.BulkResponse) ActionResponse(org.elasticsearch.action.ActionResponse)
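
The listener in Example 38 treats version conflicts as non-fatal and any other item failure as fatal. The following is a minimal, dependency-free sketch of that classification. `ItemResult` and the integer status codes are hypothetical stand-ins for `BulkItemResponse` and `RestStatus`; they are not Elasticsearch APIs.

```java
import java.util.List;

// Hypothetical stand-in for one bulk item result; not an Elasticsearch class.
final class ItemResult {
    final boolean failed;
    final int status; // HTTP-style status code; 409 is assumed to mean a version conflict

    ItemResult(boolean failed, int status) {
        this.failed = failed;
        this.status = status;
    }
}

final class BulkFailureSketch {
    static final int CONFLICT = 409;

    // Returns true if any failed item has a status other than CONFLICT.
    static boolean hasFatalError(List<ItemResult> items) {
        for (ItemResult item : items) {
            if (item.failed && item.status != CONFLICT) {
                return true;
            }
        }
        return false;
    }

    // Counts failed items whose status is CONFLICT.
    static long countConflicts(List<ItemResult> items) {
        return items.stream()
                .filter(item -> item.failed && item.status == CONFLICT)
                .count();
    }
}
```

Separating the fatal check from the conflict count keeps the two concerns independent, much as the example keeps error handling in afterBulk and metrics in updateSuccessMetrics.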

Example 39 with BulkResponse

Use of org.elasticsearch.action.bulk.BulkResponse in project samza by apache.

Class ElasticsearchSystemProducerTest, method testIgnoreVersionConficts.

@Test
public void testIgnoreVersionConficts() throws Exception {
    ArgumentCaptor<BulkProcessor.Listener> listenerCaptor = ArgumentCaptor.forClass(BulkProcessor.Listener.class);
    when(BULK_PROCESSOR_FACTORY.getBulkProcessor(eq(CLIENT), listenerCaptor.capture())).thenReturn(processorOne);
    producer.register(SOURCE_ONE);
    BulkResponse response = getRespWithFailedDocument(RestStatus.CONFLICT);
    listenerCaptor.getValue().afterBulk(0, null, response);
    assertEquals(1, metrics.conflicts.getCount());
    producer.flush(SOURCE_ONE);
}
Also used : BulkProcessor(org.elasticsearch.action.bulk.BulkProcessor) BulkResponse(org.elasticsearch.action.bulk.BulkResponse) Test(org.junit.Test)

Example 40 with BulkResponse

Use of org.elasticsearch.action.bulk.BulkResponse in project flink by apache.

Class Elasticsearch7SinkBuilder, method getBulkProcessorBuilderFactory.

@Override
protected BulkProcessorBuilderFactory getBulkProcessorBuilderFactory() {
    return new BulkProcessorBuilderFactory() {

        @Override
        public BulkProcessor.Builder apply(RestHighLevelClient client, BulkProcessorConfig bulkProcessorConfig, BulkProcessor.Listener listener) {
            BulkProcessor.Builder builder = BulkProcessor.builder(new BulkRequestConsumerFactory() {

                // This cannot be inlined as a lambda because then deserialization fails.
                @Override
                public void accept(BulkRequest bulkRequest, ActionListener<BulkResponse> bulkResponseActionListener) {
                    client.bulkAsync(bulkRequest, RequestOptions.DEFAULT, bulkResponseActionListener);
                }
            }, listener);
            if (bulkProcessorConfig.getBulkFlushMaxActions() != -1) {
                builder.setBulkActions(bulkProcessorConfig.getBulkFlushMaxActions());
            }
            if (bulkProcessorConfig.getBulkFlushMaxMb() != -1) {
                builder.setBulkSize(new ByteSizeValue(bulkProcessorConfig.getBulkFlushMaxMb(), ByteSizeUnit.MB));
            }
            if (bulkProcessorConfig.getBulkFlushInterval() != -1) {
                builder.setFlushInterval(new TimeValue(bulkProcessorConfig.getBulkFlushInterval()));
            }
            BackoffPolicy backoffPolicy;
            final TimeValue backoffDelay = new TimeValue(bulkProcessorConfig.getBulkFlushBackOffDelay());
            final int maxRetryCount = bulkProcessorConfig.getBulkFlushBackoffRetries();
            switch(bulkProcessorConfig.getFlushBackoffType()) {
                case CONSTANT:
                    backoffPolicy = BackoffPolicy.constantBackoff(backoffDelay, maxRetryCount);
                    break;
                case EXPONENTIAL:
                    backoffPolicy = BackoffPolicy.exponentialBackoff(backoffDelay, maxRetryCount);
                    break;
                case NONE:
                    backoffPolicy = BackoffPolicy.noBackoff();
                    break;
                default:
                    throw new IllegalArgumentException("Received unknown backoff policy type " + bulkProcessorConfig.getFlushBackoffType());
            }
            builder.setBackoffPolicy(backoffPolicy);
            return builder;
        }
    };
}
Also used : ActionListener(org.elasticsearch.action.ActionListener) ByteSizeValue(org.elasticsearch.common.unit.ByteSizeValue) BulkResponse(org.elasticsearch.action.bulk.BulkResponse) RestHighLevelClient(org.elasticsearch.client.RestHighLevelClient) BackoffPolicy(org.elasticsearch.action.bulk.BackoffPolicy) BulkProcessor(org.elasticsearch.action.bulk.BulkProcessor) BulkRequest(org.elasticsearch.action.bulk.BulkRequest) TimeValue(org.elasticsearch.core.TimeValue)
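
Example 40 picks a backoff policy from a configured type, initial delay, and retry count. To make the effect of that choice on retry timing concrete, here is a small dependency-free sketch. `BackoffSketch`, its `BackoffType` enum, and the doubling rule for the exponential case are assumptions made for illustration. Elasticsearch's `BackoffPolicy.exponentialBackoff` uses its own growth formula, which this sketch does not reproduce.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of retry schedules; not Elasticsearch's BackoffPolicy.
final class BackoffSketch {
    enum BackoffType { CONSTANT, EXPONENTIAL, NONE }

    // Returns the wait before each retry, in milliseconds.
    static List<Long> delaysMillis(BackoffType type, long initialDelayMillis, int maxRetries) {
        List<Long> delays = new ArrayList<>();
        switch (type) {
            case CONSTANT:
                // Same wait before every retry.
                for (int i = 0; i < maxRetries; i++) {
                    delays.add(initialDelayMillis);
                }
                break;
            case EXPONENTIAL:
                // Assumed rule: double the wait after each retry.
                long delay = initialDelayMillis;
                for (int i = 0; i < maxRetries; i++) {
                    delays.add(delay);
                    delay *= 2;
                }
                break;
            case NONE:
                // No retries at all.
                break;
            default:
                throw new IllegalArgumentException("Unknown backoff type " + type);
        }
        return delays;
    }
}
```

For instance, under the doubling assumption, an initial delay of 50 ms with 3 retries yields waits of 50, 100, and 200 ms, while the constant variant yields 50 ms three times.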

Aggregations

BulkResponse (org.elasticsearch.action.bulk.BulkResponse) 111
BulkRequestBuilder (org.elasticsearch.action.bulk.BulkRequestBuilder) 60
BulkItemResponse (org.elasticsearch.action.bulk.BulkItemResponse) 40
BulkRequest (org.elasticsearch.action.bulk.BulkRequest) 28
IOException (java.io.IOException) 21
IndexRequest (org.elasticsearch.action.index.IndexRequest) 20
XContentBuilder (org.elasticsearch.common.xcontent.XContentBuilder) 17
IndexRequestBuilder (org.elasticsearch.action.index.IndexRequestBuilder) 15
ArrayList (java.util.ArrayList) 13
List (java.util.List) 11
Map (java.util.Map) 11
IndexResponse (org.elasticsearch.action.index.IndexResponse) 10
Test (org.junit.Test) 10
SearchResponse (org.elasticsearch.action.search.SearchResponse) 9
SearchHit (org.elasticsearch.search.SearchHit) 9
ElasticsearchException (org.elasticsearch.ElasticsearchException) 8
ElasticsearchTimeoutException (org.elasticsearch.ElasticsearchTimeoutException) 8
BulkProcessor (org.elasticsearch.action.bulk.BulkProcessor) 8
EsRejectedExecutionException (org.elasticsearch.common.util.concurrent.EsRejectedExecutionException) 8
DeleteRequest (org.elasticsearch.action.delete.DeleteRequest) 7