Examples with ChunkMeta - io.georocket.storage.ChunkMeta

Example 1 with ChunkMeta

Use of io.georocket.storage.ChunkMeta in project georocket by georocket.

Class StoreEndpoint, method doMerge.

/**
 * Perform a search and merge all retrieved chunks using the given merger
 * @param merger the merger
 * @param data Data to merge into the response
 * @param out the response to write the merged chunks to
 * @return a single that will emit one item when all chunks have been merged
 */
private Single<Void> doMerge(Merger<ChunkMeta> merger, Single<StoreCursor> data, WriteStream<Buffer> out) {
    return data.map(RxStoreCursor::new)
        .flatMapObservable(RxStoreCursor::toObservable)
        .flatMap(p -> store.rxGetOne(p.getRight())
            .flatMapObservable(crs -> merger.merge(crs, p.getLeft(), out)
                // left: count, right: not_accepted
                .map(v -> Pair.of(1L, 0L))
                .onErrorResumeNext(t -> {
                    if (t instanceof IllegalStateException) {
                        // ignore it, but emit a warning later
                        return Observable.just(Pair.of(0L, 1L));
                    }
                    return Observable.error(t);
                })
                .doOnTerminate(() -> {
                    // don't forget to close the chunk!
                    crs.close();
                })), 1)
        .defaultIfEmpty(Pair.of(0L, 0L))
        .reduce((p1, p2) -> Pair.of(p1.getLeft() + p2.getLeft(), p1.getRight() + p2.getRight()))
        .flatMap(p -> {
            long count = p.getLeft();
            long notaccepted = p.getRight();
            if (notaccepted > 0) {
                log.warn("Could not merge " + notaccepted + " chunks "
                    + "because the merger did not accept them. Most likely "
                    + "these are new chunks that were added while the "
                    + "merge was in progress. If this worries you, just "
                    + "repeat the request.");
            }
            if (count > 0) {
                merger.finish(out);
                return Observable.just(null);
            } else {
                return Observable.error(new FileNotFoundException("Not Found"));
            }
        })
        .toSingle()
        .map(v -> null);
}
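
The counting logic in doMerge can be easier to follow without the reactive plumbing. The following is only a minimal sketch in plain Java, not GeoRocket code: the class MergeTally, the enum Outcome, and the methods tally and found are hypothetical names introduced for illustration. It shows the same idea as the method above: every chunk contributes either (1, 0) when it was merged or (0, 1) when the merger did not accept it, the pairs are summed starting from (0, 0), and the request only counts as found if at least one chunk was merged.

```java
import java.util.Arrays;
import java.util.List;

public class MergeTally {
    /** Outcome of trying to merge one chunk (hypothetical enum for this sketch). */
    enum Outcome { MERGED, NOT_ACCEPTED }

    /** Returns {count, notAccepted}; an empty input yields {0, 0}. */
    static long[] tally(List<Outcome> outcomes) {
        return outcomes.stream()
            // index 0: count, index 1: not accepted
            .map(o -> o == Outcome.MERGED ? new long[] {1L, 0L} : new long[] {0L, 1L})
            .reduce(new long[] {0L, 0L}, (a, b) -> new long[] {a[0] + b[0], a[1] + b[1]});
    }

    /** Mirrors the final decision: at least one merged chunk means success. */
    static boolean found(long[] tally) {
        return tally[0] > 0;
    }

    public static void main(String[] args) {
        long[] t = tally(Arrays.asList(Outcome.MERGED, Outcome.NOT_ACCEPTED, Outcome.MERGED));
        // prints "merged=2, notAccepted=1, found=true"
        System.out.println("merged=" + t[0] + ", notAccepted=" + t[1] + ", found=" + found(t));
    }
}
```

In the original method the not-found case is reported as a FileNotFoundException, and a non-zero second component only triggers a warning.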

Also used : Arrays(java.util.Arrays) Router(io.vertx.ext.web.Router) RxStoreCursor(io.georocket.storage.RxStoreCursor) RoutingContext(io.vertx.ext.web.RoutingContext) StringUtils(org.apache.commons.lang3.StringUtils) ChunkMeta(io.georocket.storage.ChunkMeta) RxStore(io.georocket.storage.RxStore) StoreCursor(io.georocket.storage.StoreCursor) Single(rx.Single) Pair(org.apache.commons.lang3.tuple.Pair) Map(java.util.Map) Pump(io.vertx.core.streams.Pump) JsonObject(io.vertx.core.json.JsonObject) Logger(io.vertx.core.logging.Logger) Splitter(com.google.common.base.Splitter) OpenOptions(io.vertx.core.file.OpenOptions) ContentType(org.apache.http.entity.ContentType) UUID(java.util.UUID) Future(io.vertx.core.Future) FileNotFoundException(java.io.FileNotFoundException) List(java.util.List) Buffer(io.vertx.core.buffer.Buffer) HttpServerResponse(io.vertx.core.http.HttpServerResponse) FileSystem(io.vertx.core.file.FileSystem) RxHelper(io.vertx.rx.java.RxHelper) MultiMerger(io.georocket.output.MultiMerger) Pattern(java.util.regex.Pattern) AddressConstants(io.georocket.constants.AddressConstants) AsyncFile(io.vertx.core.file.AsyncFile) HttpServerRequest(io.vertx.core.http.HttpServerRequest) MimeTypeUtils(io.georocket.util.MimeTypeUtils) HashMap(java.util.HashMap) LoggerFactory(io.vertx.core.logging.LoggerFactory) Observable(rx.Observable) ServerAPIException(io.georocket.ServerAPIException) WriteStream(io.vertx.core.streams.WriteStream) StoreFactory(io.georocket.storage.StoreFactory) AsyncResult(io.vertx.core.AsyncResult) HttpException(io.georocket.util.HttpException) ParseException(org.apache.http.ParseException) ObservableFuture(io.vertx.rx.java.ObservableFuture) Vertx(io.vertx.core.Vertx) IOException(java.io.IOException) StringEscapeUtils(org.apache.commons.text.StringEscapeUtils) RxAsyncCursor(io.georocket.storage.RxAsyncCursor) File(java.io.File) JsonArray(io.vertx.core.json.JsonArray) ObjectId(org.bson.types.ObjectId) Merger(io.georocket.output.Merger) 
Handler(io.vertx.core.Handler) ConfigConstants(io.georocket.constants.ConfigConstants)

Example 2 with ChunkMeta

Use of io.georocket.storage.ChunkMeta in project georocket by georocket.

Class MultiMergerTest, method doMerge.

private void doMerge(TestContext context, Observable<Buffer> chunks, Observable<ChunkMeta> metas, String jsonContents) {
    MultiMerger m = new MultiMerger();
    BufferWriteStream bws = new BufferWriteStream();
    Async async = context.async();
    metas.flatMap(meta -> m.init(meta).map(v -> meta))
        .toList()
        .flatMap(l -> chunks.map(DelegateChunkReadStream::new)
            .<ChunkMeta, Pair<ChunkReadStream, ChunkMeta>>zipWith(l, Pair::of))
        .flatMap(p -> m.merge(p.getLeft(), p.getRight(), bws))
        .last()
        .subscribe(v -> {
            m.finish(bws);
            context.assertEquals(jsonContents, bws.getBuffer().toString("utf-8"));
            async.complete();
        }, err -> {
            context.fail(err);
        });
}

Also used : TestContext(io.vertx.ext.unit.TestContext) Async(io.vertx.ext.unit.Async) Arrays(java.util.Arrays) GeoJsonChunkMeta(io.georocket.storage.GeoJsonChunkMeta) RunWith(org.junit.runner.RunWith) XMLStartElement(io.georocket.util.XMLStartElement) Test(org.junit.Test) VertxUnitRunner(io.vertx.ext.unit.junit.VertxUnitRunner) XMLChunkMeta(io.georocket.storage.XMLChunkMeta) ChunkMeta(io.georocket.storage.ChunkMeta) Observable(rx.Observable) Rule(org.junit.Rule) DelegateChunkReadStream(io.georocket.util.io.DelegateChunkReadStream) Pair(org.apache.commons.lang3.tuple.Pair) Buffer(io.vertx.core.buffer.Buffer) BufferWriteStream(io.georocket.util.io.BufferWriteStream) RunTestOnContext(io.vertx.ext.unit.junit.RunTestOnContext) ChunkReadStream(io.georocket.storage.ChunkReadStream)

Example 3 with ChunkMeta

Use of io.georocket.storage.ChunkMeta in project georocket by georocket.

Class IndexerVerticle, method onQuery.

/**
 * Write result of a query given the Elasticsearch response
 * @param body the message containing the query
 * @return an observable that emits the results of the query
 */
private Observable<JsonObject> onQuery(JsonObject body) {
    String search = body.getString("search");
    String path = body.getString("path");
    String scrollId = body.getString("scrollId");
    int pageSize = body.getInteger("size", 100);
    // one minute
    String timeout = "1m";
    JsonObject parameters = new JsonObject().put("size", pageSize);
    Observable<JsonObject> observable;
    if (scrollId == null) {
        // Execute a new search. Use a post_filter because we only want to get
        // a yes/no answer and no scoring (i.e. we only want to get matching
        // documents and not those that likely match). For the difference between
        // query and post_filter see the Elasticsearch documentation.
        JsonObject postFilter;
        try {
            postFilter = queryCompiler.compileQuery(search, path);
        } catch (Throwable t) {
            return Observable.error(t);
        }
        observable = client.beginScroll(TYPE_NAME, null, postFilter, parameters, timeout);
    } else {
        // continue searching
        observable = client.continueScroll(scrollId, timeout);
    }
    return observable.map(sr -> {
        // iterate through all hits and convert them to JSON
        JsonObject hits = sr.getJsonObject("hits");
        long totalHits = hits.getLong("total");
        JsonArray resultHits = new JsonArray();
        JsonArray hitsHits = hits.getJsonArray("hits");
        for (Object o : hitsHits) {
            JsonObject hit = (JsonObject) o;
            String id = hit.getString("_id");
            JsonObject source = hit.getJsonObject("_source");
            JsonObject jsonMeta = source.getJsonObject("chunkMeta");
            ChunkMeta meta = getMeta(jsonMeta);
            JsonObject obj = meta.toJsonObject().put("id", id);
            resultHits.add(obj);
        }
        // create result and send it to the client
        return new JsonObject().put("totalHits", totalHits).put("hits", resultHits).put("scrollId", sr.getString("_scroll_id"));
    });
}
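
The scrollId branch above implies a paging protocol: a request without a scrollId starts a new search, and the scrollId returned with each result is sent back to fetch the next page. The following is a minimal sketch of that idea in plain Java, under loudly stated assumptions. ScrollSketch, Page, InMemoryScroller, and fetchAll are hypothetical names, the backend is an in-memory list rather than Elasticsearch, and stopping on the first empty page is an assumption of this sketch rather than something the source confirms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ScrollSketch {
    /** One page of results plus the id needed to request the next page. */
    static final class Page {
        final List<String> hits;
        final String scrollId;

        Page(List<String> hits, String scrollId) {
            this.hits = hits;
            this.scrollId = scrollId;
        }
    }

    /** Hypothetical in-memory stand-in for a scrolling search backend. */
    static final class InMemoryScroller {
        private final List<String> all;
        private final int pageSize;
        private final Map<String, Integer> offsets = new HashMap<>();
        private int nextId = 0;

        InMemoryScroller(List<String> all, int pageSize) {
            this.all = all;
            this.pageSize = pageSize;
        }

        /** A null scrollId starts a new search; otherwise the earlier one continues. */
        Page query(String scrollId) {
            int from = 0;
            if (scrollId != null) {
                Integer saved = offsets.remove(scrollId);
                if (saved == null) {
                    throw new IllegalArgumentException("Unknown scrollId: " + scrollId);
                }
                from = saved;
            }
            int to = Math.min(from + pageSize, all.size());
            String id = "scroll-" + (nextId++);
            offsets.put(id, to);
            return new Page(new ArrayList<>(all.subList(from, to)), id);
        }
    }

    /** Requests pages until an empty page signals the end (assumption of this sketch). */
    static List<String> fetchAll(InMemoryScroller scroller) {
        List<String> result = new ArrayList<>();
        Page page = scroller.query(null);
        while (!page.hits.isEmpty()) {
            result.addAll(page.hits);
            page = scroller.query(page.scrollId);
        }
        return result;
    }

    public static void main(String[] args) {
        InMemoryScroller scroller = new InMemoryScroller(Arrays.asList("a", "b", "c", "d", "e"), 2);
        // prints "[a, b, c, d, e]"
        System.out.println(fetchAll(scroller));
    }
}
```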

Also used : JsonArray(io.vertx.core.json.JsonArray) JsonObject(io.vertx.core.json.JsonObject) NoStackTraceThrowable(io.vertx.core.impl.NoStackTraceThrowable) GeoJsonChunkMeta(io.georocket.storage.GeoJsonChunkMeta) XMLChunkMeta(io.georocket.storage.XMLChunkMeta) ChunkMeta(io.georocket.storage.ChunkMeta) JsonChunkMeta(io.georocket.storage.JsonChunkMeta)

Example 4 with ChunkMeta

Use of io.georocket.storage.ChunkMeta in project georocket by georocket.

Class IndexerVerticle, method onAdd.

/**
 * Will be called when chunks should be added to the index
 * @param messages the list of add messages that contain the paths to
 * the chunks to be indexed
 * @return an observable that completes when the operation has finished
 */
private Observable<Void> onAdd(List<Message<JsonObject>> messages) {
    return Observable.from(messages).flatMap(msg -> {
        // get path to chunk from message
        JsonObject body = msg.body();
        String path = body.getString("path");
        if (path == null) {
            msg.fail(400, "Missing path to the chunk to index");
            return Observable.empty();
        }
        // get chunk metadata
        JsonObject meta = body.getJsonObject("meta");
        if (meta == null) {
            msg.fail(400, "Missing metadata for chunk " + path);
            return Observable.empty();
        }
        // get tags
        JsonArray tagsArr = body.getJsonArray("tags");
        List<String> tags = tagsArr != null
            ? tagsArr.stream()
                .flatMap(o -> o != null ? Stream.of(o.toString()) : Stream.of())
                .collect(Collectors.toList())
            : null;
        // get properties
        JsonObject propertiesObj = body.getJsonObject("properties");
        Map<String, Object> properties = propertiesObj != null ? propertiesObj.getMap() : null;
        // get fallback CRS
        String fallbackCRSString = body.getString("fallbackCRSString");
        log.trace("Indexing " + path);
        String correlationId = body.getString("correlationId");
        String filename = body.getString("filename");
        long timestamp = body.getLong("timestamp", System.currentTimeMillis());
        ChunkMeta chunkMeta = getMeta(meta);
        IndexMeta indexMeta = new IndexMeta(correlationId, filename, timestamp, tags, properties, fallbackCRSString);
        // open chunk and create IndexRequest
        return openChunkToDocument(path, chunkMeta, indexMeta).map(doc -> Tuple.tuple(path, new JsonObject(doc), msg)).onErrorResumeNext(err -> {
            msg.fail(throwableToCode(err), throwableToMessage(err, ""));
            return Observable.empty();
        });
    }).toList().flatMap(l -> {
        if (!l.isEmpty()) {
            return insertDocuments(TYPE_NAME, l);
        }
        return Observable.empty();
    });
}
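
The tag handling in onAdd converts every non-null array element to a string and drops null elements, while a missing array stays null. A minimal plain-Java sketch of the same conversion follows. TagsSketch and toTags are hypothetical names, and the input is a java.util.List instead of a Vert.x JsonArray. It uses filter plus map, which should be equivalent to the flatMap with a conditional used above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class TagsSketch {
    /** Converts raw tag values to strings, skipping nulls; a null input stays null. */
    static List<String> toTags(List<?> raw) {
        if (raw == null) {
            return null;
        }
        return raw.stream()
            .filter(Objects::nonNull)   // drop null entries
            .map(Object::toString)      // stringify everything else
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints "[a, 42]"
        System.out.println(toTags(Arrays.asList("a", null, 42)));
    }
}
```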

Also used : JsonArray(io.vertx.core.json.JsonArray) MetaIndexer(io.georocket.index.xml.MetaIndexer) GeoJsonChunkMeta(io.georocket.storage.GeoJsonChunkMeta) IndexMeta(io.georocket.storage.IndexMeta) StreamEvent(io.georocket.util.StreamEvent) XMLChunkMeta(io.georocket.storage.XMLChunkMeta) XMLParserOperator(io.georocket.util.XMLParserOperator) ChunkMeta(io.georocket.storage.ChunkMeta) RxStore(io.georocket.storage.RxStore) Tuple2(org.jooq.lambda.tuple.Tuple2) Tuple3(org.jooq.lambda.tuple.Tuple3) JsonParserOperator(io.georocket.util.JsonParserOperator) Map(java.util.Map) JsonObject(io.vertx.core.json.JsonObject) Logger(io.vertx.core.logging.Logger) MetaIndexerFactory(io.georocket.index.xml.MetaIndexerFactory) Message(io.vertx.rxjava.core.eventbus.Message) ServiceLoader(java.util.ServiceLoader) Collectors(java.util.stream.Collectors) Future(io.vertx.core.Future) List(java.util.List) ElasticsearchClientFactory(io.georocket.index.elasticsearch.ElasticsearchClientFactory) Stream(java.util.stream.Stream) Tuple(org.jooq.lambda.tuple.Tuple) MapUtils(io.georocket.util.MapUtils) Buffer(io.vertx.core.buffer.Buffer) MimeTypeUtils.belongsTo(io.georocket.util.MimeTypeUtils.belongsTo) RxHelper(io.vertx.rx.java.RxHelper) AddressConstants(io.georocket.constants.AddressConstants) ChunkReadStream(io.georocket.storage.ChunkReadStream) Operator(rx.Observable.Operator) HashMap(java.util.HashMap) Seq(org.jooq.lambda.Seq) DefaultQueryCompiler(io.georocket.query.DefaultQueryCompiler) LoggerFactory(io.vertx.core.logging.LoggerFactory) ArrayList(java.util.ArrayList) AbstractVerticle(io.vertx.rxjava.core.AbstractVerticle) Observable(rx.Observable) Func1(rx.functions.Func1) ImmutableList(com.google.common.collect.ImmutableList) XMLIndexerFactory(io.georocket.index.xml.XMLIndexerFactory) StoreFactory(io.georocket.storage.StoreFactory) JsonIndexerFactory(io.georocket.index.xml.JsonIndexerFactory) NoStackTraceThrowable(io.vertx.core.impl.NoStackTraceThrowable) 
ThrowableHelper.throwableToMessage(io.georocket.util.ThrowableHelper.throwableToMessage) StreamIndexer(io.georocket.index.xml.StreamIndexer) TimeUnit(java.util.concurrent.TimeUnit) ThrowableHelper.throwableToCode(io.georocket.util.ThrowableHelper.throwableToCode) ElasticsearchClient(io.georocket.index.elasticsearch.ElasticsearchClient) RxUtils(io.georocket.util.RxUtils) DefaultMetaIndexerFactory(io.georocket.index.generic.DefaultMetaIndexerFactory) ConfigConstants(io.georocket.constants.ConfigConstants) JsonChunkMeta(io.georocket.storage.JsonChunkMeta)

Example 5 with ChunkMeta

Use of io.georocket.storage.ChunkMeta in project georocket by georocket.

Class StoreEndpoint, method getChunks.

/**
 * Retrieve all chunks matching the specified query and path
 * @param context the routing context
 */
private void getChunks(RoutingContext context) {
    HttpServerResponse response = context.response();
    Single<StoreCursor> data = prepareCursor(context);
    // Our responses must always be chunked because we cannot calculate
    // the exact content-length beforehand. We perform two searches, one to
    // initialize the merger and one to do the actual merge. The problem is
    // that the result set may change between these two searches and so we
    // cannot calculate the content-length just from looking at the result
    // from the first search.
    response.setChunked(true);
    // perform two searches: first initialize the merger and then
    // merge all retrieved chunks
    Merger<ChunkMeta> merger = createMerger(context);
    initializeMerger(merger, data)
        .flatMapSingle(v -> doMerge(merger, data, response))
        .subscribe(v -> {
            response.end();
        }, err -> {
            if (!(err instanceof FileNotFoundException)) {
                log.error("Could not perform query", err);
            }
            fail(response, err);
        });
}
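
The comments in getChunks describe two searches: one to initialize the merger and one to do the actual merge, with the caveat that the result set may change in between. The sketch below illustrates that situation in plain Java and is not the GeoRocket implementation. TwoPassMergeSketch, Chunk, SimpleMerger, and mergeAll are hypothetical names, and the acceptance rule (a merger only accepts chunk types it saw during initialization) is invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TwoPassMergeSketch {
    /** A chunk reduced to a type label and its content (hypothetical). */
    static final class Chunk {
        final String type;
        final String content;

        Chunk(String type, String content) {
            this.type = type;
            this.content = content;
        }
    }

    /** Hypothetical merger that only accepts chunk types seen during init. */
    static final class SimpleMerger {
        private final Set<String> knownTypes = new HashSet<>();

        void init(Chunk chunk) {
            knownTypes.add(chunk.type);
        }

        void merge(Chunk chunk, StringBuilder out) {
            if (!knownTypes.contains(chunk.type)) {
                throw new IllegalStateException("Chunk type not seen during init: " + chunk.type);
            }
            out.append(chunk.content);
        }
    }

    /** Runs both passes and returns how many chunks the merger did not accept. */
    static int mergeAll(List<Chunk> firstSearch, List<Chunk> secondSearch, StringBuilder out) {
        SimpleMerger merger = new SimpleMerger();
        // pass 1: initialize the merger with everything the first search found
        for (Chunk c : firstSearch) {
            merger.init(c);
        }
        // pass 2: merge what the second search found; it may contain new chunks
        int notAccepted = 0;
        for (Chunk c : secondSearch) {
            try {
                merger.merge(c, out);
            } catch (IllegalStateException e) {
                notAccepted++; // count it instead of failing the whole merge
            }
        }
        return notAccepted;
    }

    public static void main(String[] args) {
        List<Chunk> first = Arrays.asList(new Chunk("xml", "<a/>"));
        List<Chunk> second = Arrays.asList(new Chunk("xml", "<a/>"), new Chunk("json", "{}"));
        StringBuilder out = new StringBuilder();
        int notAccepted = mergeAll(first, second, out);
        // prints "<a/> (not accepted: 1)"
        System.out.println(out + " (not accepted: " + notAccepted + ")");
    }
}
```

Because the second pass can produce a different amount of output than the first pass suggested, the total length is not known up front, which matches the reason given above for sending a chunked response.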

Also used : Arrays(java.util.Arrays) Router(io.vertx.ext.web.Router) RxStoreCursor(io.georocket.storage.RxStoreCursor) RoutingContext(io.vertx.ext.web.RoutingContext) StringUtils(org.apache.commons.lang3.StringUtils) ChunkMeta(io.georocket.storage.ChunkMeta) RxStore(io.georocket.storage.RxStore) StoreCursor(io.georocket.storage.StoreCursor) Single(rx.Single) Pair(org.apache.commons.lang3.tuple.Pair) Map(java.util.Map) Pump(io.vertx.core.streams.Pump) JsonObject(io.vertx.core.json.JsonObject) Logger(io.vertx.core.logging.Logger) Splitter(com.google.common.base.Splitter) OpenOptions(io.vertx.core.file.OpenOptions) ContentType(org.apache.http.entity.ContentType) UUID(java.util.UUID) Future(io.vertx.core.Future) FileNotFoundException(java.io.FileNotFoundException) List(java.util.List) Buffer(io.vertx.core.buffer.Buffer) HttpServerResponse(io.vertx.core.http.HttpServerResponse) FileSystem(io.vertx.core.file.FileSystem) RxHelper(io.vertx.rx.java.RxHelper) MultiMerger(io.georocket.output.MultiMerger) Pattern(java.util.regex.Pattern) AddressConstants(io.georocket.constants.AddressConstants) AsyncFile(io.vertx.core.file.AsyncFile) HttpServerRequest(io.vertx.core.http.HttpServerRequest) MimeTypeUtils(io.georocket.util.MimeTypeUtils) HashMap(java.util.HashMap) LoggerFactory(io.vertx.core.logging.LoggerFactory) Observable(rx.Observable) ServerAPIException(io.georocket.ServerAPIException) WriteStream(io.vertx.core.streams.WriteStream) StoreFactory(io.georocket.storage.StoreFactory) AsyncResult(io.vertx.core.AsyncResult) HttpException(io.georocket.util.HttpException) ParseException(org.apache.http.ParseException) ObservableFuture(io.vertx.rx.java.ObservableFuture) Vertx(io.vertx.core.Vertx) IOException(java.io.IOException) StringEscapeUtils(org.apache.commons.text.StringEscapeUtils) RxAsyncCursor(io.georocket.storage.RxAsyncCursor) File(java.io.File) JsonArray(io.vertx.core.json.JsonArray) ObjectId(org.bson.types.ObjectId) Merger(io.georocket.output.Merger) 
Handler(io.vertx.core.Handler) ConfigConstants(io.georocket.constants.ConfigConstants)

Aggregations

ChunkMeta (io.georocket.storage.ChunkMeta): 5
Buffer (io.vertx.core.buffer.Buffer): 4
JsonArray (io.vertx.core.json.JsonArray): 4
JsonObject (io.vertx.core.json.JsonObject): 4
AddressConstants (io.georocket.constants.AddressConstants): 3
ConfigConstants (io.georocket.constants.ConfigConstants): 3
GeoJsonChunkMeta (io.georocket.storage.GeoJsonChunkMeta): 3
RxStore (io.georocket.storage.RxStore): 3
StoreFactory (io.georocket.storage.StoreFactory): 3
XMLChunkMeta (io.georocket.storage.XMLChunkMeta): 3
Future (io.vertx.core.Future): 3
Logger (io.vertx.core.logging.Logger): 3
LoggerFactory (io.vertx.core.logging.LoggerFactory): 3
RxHelper (io.vertx.rx.java.RxHelper): 3
Arrays (java.util.Arrays): 3
HashMap (java.util.HashMap): 3
List (java.util.List): 3
Observable (rx.Observable): 3
Splitter (com.google.common.base.Splitter): 2
ServerAPIException (io.georocket.ServerAPIException): 2