Example 1 with AsyncRDFHandler

Use of org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler in the project wikidata-query-rdf by wikimedia.

The run method of the class Munge.

public void run() throws RDFHandlerException, IOException, RDFParseException, InterruptedException {
    try {
        AsyncRDFHandler chunkWriter = AsyncRDFHandler.processAsync(new RDFChunkWriter(chunkFileFormat), false, BUFFER_SIZE);
        AtomicLong actualChunk = new AtomicLong(0);
        EntityMungingRdfHandler.EntityCountListener chunker = (entities) -> {
            long currentChunk = entities / chunkSize;
            if (currentChunk != actualChunk.get()) {
                actualChunk.set(currentChunk);
                // endRDF will cause RDFChunkWriter to start writing a new chunk
                chunkWriter.endRDF();
            }
        };
        EntityMungingRdfHandler munger = new EntityMungingRdfHandler(uris, this.munger, chunkWriter, chunker);
        RDFParser parser = RDFParserSuppliers.defaultRdfParser().get(AsyncRDFHandler.processAsync(new NormalizingRdfHandler(munger), true, BUFFER_SIZE));
        parser.parse(from, uris.root());
        // thread:main: parser -> AsyncRDFHandler -> queue
        // thread:replayer1: Normalizing/Munging -> AsyncRDFHandler -> queue
        // thread:replayer2: RDFChunkWriter -> RDFWriter -> IO
        chunkWriter.waitForCompletion();
    } finally {
        try {
            from.close();
        } catch (IOException e) {
            log.error("Error closing input", e);
        }
    }
}
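
The thread comments in run describe a pipeline in which each stage hands its output to a bounded queue that is drained by a dedicated replayer thread. The following is a minimal sketch of that hand-off pattern using only java.util.concurrent. It is not the actual AsyncRDFHandler implementation: the names AsyncStage, runPipeline, put and finish are hypothetical, and plain strings stand in for RDF statements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

class AsyncStageSketch {
    /** Hypothetical stand-in for an async handler: buffers items and replays them on its own thread. */
    static final class AsyncStage<T> {
        private static final Object END = new Object(); // end-of-stream sentinel
        private final BlockingQueue<Object> queue;
        private final Thread replayer;

        AsyncStage(Consumer<T> downstream, int bufferSize, String threadName) {
            this.queue = new ArrayBlockingQueue<>(bufferSize);
            this.replayer = new Thread(() -> {
                try {
                    // Drain the queue in FIFO order until the sentinel arrives.
                    for (Object item = queue.take(); item != END; item = queue.take()) {
                        @SuppressWarnings("unchecked") T typed = (T) item;
                        downstream.accept(typed);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, threadName);
            this.replayer.start();
        }

        /** Blocks the producer when the buffer is full (back-pressure). */
        void put(T item) {
            enqueue(item);
        }

        /** Signals end of input and waits until everything queued has been replayed. */
        void finish() {
            enqueue(END);
            try {
                replayer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for completion", e);
            }
        }

        private void enqueue(Object item) {
            try {
                queue.put(item);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while queueing", e);
            }
        }
    }

    /** main thread -> queue -> "replayer1" (transform) -> queue -> "replayer2" (write). */
    static List<String> runPipeline(List<String> input) {
        List<String> written = new ArrayList<>();
        AsyncStage<String> writer = new AsyncStage<>(written::add, 4, "replayer2");
        AsyncStage<String> transformer = new AsyncStage<>(
                s -> writer.put(s.trim().toUpperCase(Locale.ROOT)), 4, "replayer1");
        for (String s : input) {
            transformer.put(s);
        }
        // Finish upstream first so the writer has received everything before it is closed.
        transformer.finish();
        writer.finish();
        return written;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(Arrays.asList(" a", "b ", " c "))); // prints [A, B, C]
    }
}
```

In this sketch, the bounded queue plays the role that BUFFER_SIZE appears to play in run: a fast producer blocks once the buffer is full instead of accumulating unbounded work. The stages are finished in upstream-to-downstream order, so the last stage is closed only after it has received all of its input.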
Also used: Statement(org.openrdf.model.Statement) Munger(org.wikidata.query.rdf.tool.rdf.Munger) LoggerFactory(org.slf4j.LoggerFactory) NormalizingRdfHandler(org.wikidata.query.rdf.tool.rdf.NormalizingRdfHandler) LinkedHashMap(java.util.LinkedHashMap) RDFFormat(org.openrdf.rio.RDFFormat) Locale(java.util.Locale) Map(java.util.Map) MungeOptions(org.wikidata.query.rdf.tool.options.MungeOptions) BasicWriterSettings(org.openrdf.rio.helpers.BasicWriterSettings) AsyncRDFHandler(org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler) OptionsUtils.mungerFromOptions(org.wikidata.query.rdf.tool.options.OptionsUtils.mungerFromOptions) FALSE(java.lang.Boolean.FALSE) Logger(org.slf4j.Logger) RDFHandlerException(org.openrdf.rio.RDFHandlerException) OptionsUtils(org.wikidata.query.rdf.tool.options.OptionsUtils) RDFParserSuppliers(org.wikidata.query.rdf.tool.rdf.RDFParserSuppliers) WriterConfig(org.openrdf.rio.WriterConfig) IOException(java.io.IOException) Rio(org.openrdf.rio.Rio) Reader(java.io.Reader) PrefixRecordingRdfHandler(org.wikidata.query.rdf.tool.rdf.PrefixRecordingRdfHandler) AtomicLong(java.util.concurrent.atomic.AtomicLong) RDFParser(org.openrdf.rio.RDFParser) OptionsUtils.handleOptions(org.wikidata.query.rdf.tool.options.OptionsUtils.handleOptions) RDFParseException(org.openrdf.rio.RDFParseException) UrisScheme(org.wikidata.query.rdf.common.uri.UrisScheme) Writer(java.io.Writer) EntityMungingRdfHandler(org.wikidata.query.rdf.tool.rdf.EntityMungingRdfHandler) RDFHandler(org.openrdf.rio.RDFHandler) RDFWriter(org.openrdf.rio.RDFWriter)

Aggregations

IOException (java.io.IOException): 1
Reader (java.io.Reader): 1
Writer (java.io.Writer): 1
FALSE (java.lang.Boolean.FALSE): 1
LinkedHashMap (java.util.LinkedHashMap): 1
Locale (java.util.Locale): 1
Map (java.util.Map): 1
AtomicLong (java.util.concurrent.atomic.AtomicLong): 1
Statement (org.openrdf.model.Statement): 1
RDFFormat (org.openrdf.rio.RDFFormat): 1
RDFHandler (org.openrdf.rio.RDFHandler): 1
RDFHandlerException (org.openrdf.rio.RDFHandlerException): 1
RDFParseException (org.openrdf.rio.RDFParseException): 1
RDFParser (org.openrdf.rio.RDFParser): 1
RDFWriter (org.openrdf.rio.RDFWriter): 1
Rio (org.openrdf.rio.Rio): 1
WriterConfig (org.openrdf.rio.WriterConfig): 1
BasicWriterSettings (org.openrdf.rio.helpers.BasicWriterSettings): 1
Logger (org.slf4j.Logger): 1
LoggerFactory (org.slf4j.LoggerFactory): 1