Example 1 with RdfRepository

Use of org.wikidata.query.rdf.tool.rdf.RdfRepository in the wikidata-query-rdf project by wikimedia.

From class UpdaterUnitTest, method testUpdateLeftOffTime.

@Test
public void testUpdateLeftOffTime() {
    Instant leftOffInstant1 = Instant.ofEpochMilli(25);
    Instant leftOffInstant2 = Instant.ofEpochSecond(40);
    ImmutableList<Change> changes = ImmutableList.of(new Change("Q2", 1, Instant.ofEpochSecond(10), 2), new Change("Q3", 2, Instant.ofEpochMilli(20), 3));
    TestChange batch1 = new TestChange(changes, 20, leftOffInstant1, false);
    changes = ImmutableList.of(new Change("Q2", 1, Instant.ofEpochSecond(30), 4), new Change("Q3", 2, Instant.ofEpochMilli(40), 5));
    TestChange batch2 = new TestChange(changes, 20, leftOffInstant2, true);
    TestChangeSource source = new TestChangeSource(Arrays.asList(batch1, batch2));
    WikibaseRepository wbRepo = mock(WikibaseRepository.class);
    RdfRepository rdfRepo = mock(RdfRepository.class);
    CollectedUpdateMetrics mutationCountOnlyMetrics = CollectedUpdateMetrics.getMutationCountOnlyMetrics(0);
    when(rdfRepo.syncFromChanges(anyCollectionOf(Change.class), anyBoolean())).thenReturn(mutationCountOnlyMetrics);
    Munger munger = Munger.builder(UrisSchemeFactory.WIKIDATA).build();
    ExecutorService executorService = Executors.newFixedThreadPool(2, (r) -> new Thread(r, "Thread-" + this.getClass().getSimpleName()));
    MetricRegistry metricRegistry = new MetricRegistry();
    Updater<TestChange> updater = new Updater<>(source, wbRepo, rdfRepo, munger, executorService, true, 100, UrisSchemeFactory.WIKIDATA, false, metricRegistry);
    updater.run();
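    // lestOffDateCaptor is declared outside this snippet (presumably an ArgumentCaptor<Instant> field of the test class).
    verify(rdfRepo, times(2)).updateLeftOffTime(lestOffDateCaptor.capture());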
    assertThat(lestOffDateCaptor.getAllValues()).containsExactly(leftOffInstant1.minusSeconds(1), leftOffInstant2.minusSeconds(1));
    assertThat(source.isBatchMarkedDone(batch1)).isTrue();
    assertThat(source.isBatchMarkedDone(batch2)).isTrue();
}
Also used: CollectedUpdateMetrics (org.wikidata.query.rdf.tool.rdf.CollectedUpdateMetrics), Instant (java.time.Instant), Munger (org.wikidata.query.rdf.tool.rdf.Munger), MetricRegistry (com.codahale.metrics.MetricRegistry), WikibaseRepository (org.wikidata.query.rdf.tool.wikibase.WikibaseRepository), RdfRepository (org.wikidata.query.rdf.tool.rdf.RdfRepository), Change (org.wikidata.query.rdf.tool.change.Change), ExecutorService (java.util.concurrent.ExecutorService), Test (org.junit.Test)
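
The test above checks, through a Mockito mock and a captor, that each batch's left-off time reaches the repository shifted back by one second. A minimal sketch of the same capture-and-compare idea using only the JDK follows. The names LeftOffSink, RecordingSink, and processBatches are hypothetical and are not part of wikidata-query-rdf; the one-second back-off is taken from the assertion in the test, not from the Updater implementation.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LeftOffCaptureSketch {

    /** Hypothetical stand-in for the single repository call that the test verifies. */
    interface LeftOffSink {
        void updateLeftOffTime(Instant leftOff);
    }

    /** Hand-written recording fake: plays the role of the mock plus the captor. */
    static final class RecordingSink implements LeftOffSink {
        final List<Instant> captured = new ArrayList<>();

        @Override
        public void updateLeftOffTime(Instant leftOff) {
            captured.add(leftOff);
        }
    }

    /**
     * Hypothetical driver (an assumption for illustration): forwards each
     * batch's left-off time minus one second, mirroring what the test asserts.
     */
    static void processBatches(List<Instant> batchLeftOffs, LeftOffSink sink) {
        for (Instant leftOff : batchLeftOffs) {
            sink.updateLeftOffTime(leftOff.minusSeconds(1));
        }
    }

    public static void main(String[] args) {
        Instant leftOff1 = Instant.ofEpochMilli(25);
        Instant leftOff2 = Instant.ofEpochSecond(40);

        RecordingSink sink = new RecordingSink();
        processBatches(Arrays.asList(leftOff1, leftOff2), sink);

        // Same expectation as containsExactly(...) in the test: order and values must match.
        List<Instant> expected = Arrays.asList(leftOff1.minusSeconds(1), leftOff2.minusSeconds(1));
        if (!sink.captured.equals(expected)) {
            throw new AssertionError("unexpected left-off times: " + sink.captured);
        }
        System.out.println("captured " + sink.captured.size() + " left-off times");
    }
}
```

A recording fake like this trades Mockito's verification features for zero dependencies; in the real test the captor additionally lets verify(...) check the call count with times(2).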

Example 2 with RdfRepository

Use of org.wikidata.query.rdf.tool.rdf.RdfRepository in the wikidata-query-rdf project by wikimedia.

From class Update, method initialize.

private static Updater<? extends Change.Batch> initialize(String[] args, Closer closer) throws URISyntaxException {
    try {
        UpdateOptions options = handleOptions(UpdateOptions.class, args);
        MetricRegistry metricRegistry = createMetricRegistry(closer, options.metricDomain());
        StreamDumper wikibaseStreamDumper = createStreamDumper(dumpDirPath(options));
        WikibaseRepository wikibaseRepository = new WikibaseRepository(UpdateOptions.uris(options), options.constraints(), metricRegistry, wikibaseStreamDumper, UpdateOptions.revisionDuration(options), RDFParserSuppliers.defaultRdfParser());
        closer.register(wikibaseRepository);
        UrisScheme wikibaseUris = WikibaseOptions.wikibaseUris(options);
        URI root = wikibaseRepository.getUris().builder().build();
        URI sparqlUri = UpdateOptions.sparqlUri(options);
        HttpClient httpClient = buildHttpClient(getHttpProxyHost(), getHttpProxyPort());
        closer.register(wrapHttpClient(httpClient));
        Retryer<ContentResponse> retryer = buildHttpClientRetryer();
        Duration rdfClientTimeout = getRdfClientTimeout();
        RdfClient rdfClient = new RdfClient(httpClient, sparqlUri, retryer, rdfClientTimeout);
        RdfRepository rdfRepository = new RdfRepository(wikibaseUris, rdfClient, MAX_FORM_CONTENT_SIZE);
        Instant startTime = getStartTime(startInstant(options), rdfRepository, options.init());
        Change.Source<? extends Change.Batch> changeSource = buildChangeSource(options, startTime, wikibaseRepository, rdfClient, root, metricRegistry);
        Munger munger = mungerFromOptions(options);
        ExecutorService updaterExecutorService = createUpdaterExecutorService(options.threadCount());
        Updater<? extends Change.Batch> updater = createUpdater(wikibaseRepository, wikibaseUris, rdfRepository, changeSource, munger, updaterExecutorService, options.importAsync(), options.pollDelay(), options.verify(), metricRegistry);
        closer.register(updater);
        return updater;
    } catch (Exception e) {
        log.error("Error during initialization.", e);
        throw e;
    }
}
Also used: ContentResponse (org.eclipse.jetty.client.api.ContentResponse), UrisScheme (org.wikidata.query.rdf.common.uri.UrisScheme), MetricRegistry (com.codahale.metrics.MetricRegistry), UpdateOptions.startInstant (org.wikidata.query.rdf.tool.options.UpdateOptions.startInstant), Instant (java.time.Instant), Munger (org.wikidata.query.rdf.tool.rdf.Munger), WikibaseRepository (org.wikidata.query.rdf.tool.wikibase.WikibaseRepository), Duration (java.time.Duration), RdfRepository (org.wikidata.query.rdf.tool.rdf.RdfRepository), RdfClient (org.wikidata.query.rdf.tool.rdf.client.RdfClient), Change (org.wikidata.query.rdf.tool.change.Change), URI (java.net.URI), UpdateOptions (org.wikidata.query.rdf.tool.options.UpdateOptions), URISyntaxException (java.net.URISyntaxException), IOException (java.io.IOException), FileStreamDumper (org.wikidata.query.rdf.tool.utils.FileStreamDumper), StreamDumper (org.wikidata.query.rdf.tool.utils.StreamDumper), NullStreamDumper (org.wikidata.query.rdf.tool.utils.NullStreamDumper), HttpClient (org.eclipse.jetty.client.HttpClient), HttpClientUtils.buildHttpClient (org.wikidata.query.rdf.tool.HttpClientUtils.buildHttpClient), ExecutorService (java.util.concurrent.ExecutorService)

Aggregations

MetricRegistry (com.codahale.metrics.MetricRegistry): 2
Instant (java.time.Instant): 2
ExecutorService (java.util.concurrent.ExecutorService): 2
Change (org.wikidata.query.rdf.tool.change.Change): 2
Munger (org.wikidata.query.rdf.tool.rdf.Munger): 2
RdfRepository (org.wikidata.query.rdf.tool.rdf.RdfRepository): 2
WikibaseRepository (org.wikidata.query.rdf.tool.wikibase.WikibaseRepository): 2
IOException (java.io.IOException): 1
URI (java.net.URI): 1
URISyntaxException (java.net.URISyntaxException): 1
Duration (java.time.Duration): 1
HttpClient (org.eclipse.jetty.client.HttpClient): 1
ContentResponse (org.eclipse.jetty.client.api.ContentResponse): 1
Test (org.junit.Test): 1
UrisScheme (org.wikidata.query.rdf.common.uri.UrisScheme): 1
HttpClientUtils.buildHttpClient (org.wikidata.query.rdf.tool.HttpClientUtils.buildHttpClient): 1
UpdateOptions (org.wikidata.query.rdf.tool.options.UpdateOptions): 1
UpdateOptions.startInstant (org.wikidata.query.rdf.tool.options.UpdateOptions.startInstant): 1
CollectedUpdateMetrics (org.wikidata.query.rdf.tool.rdf.CollectedUpdateMetrics): 1
RdfClient (org.wikidata.query.rdf.tool.rdf.client.RdfClient): 1