Example 1 with ExaminedCellLimit

Use of com.palantir.atlasdb.sweep.CellsToSweepPartitioningIterator.ExaminedCellLimit in project atlasdb by palantir.

The class SweepTaskRunner, method doRun.

private SweepResults doRun(TableReference tableRef, SweepBatchConfig batchConfig, byte[] startRow, RunType runType, Sweeper sweeper) {
    Stopwatch watch = Stopwatch.createStarted();
    long timeSweepStarted = System.currentTimeMillis();
    log.info("Beginning iteration of sweep for table {} starting at row {}", LoggingArgs.tableRef(tableRef), UnsafeArg.of("startRow", PtBytes.encodeHexString(startRow)));
    // Earliest start timestamp of any currently open transaction, with two caveats:
    // (1) unreadableTimestamps are calculated via wall-clock time, and so may not be correct
    // under pathological clock conditions
    // (2) immutableTimestamps do not account for transactions whose locks have timed out after
    // checking their locks; such a transaction may have a start timestamp less than the
    // immutableTimestamp, and it could still get successfully committed (its commit timestamp
    // may or may not be less than the immutableTimestamp)
    // Note that this is fine, because we'll either
    // (1) force old readers to abort (if they read a garbage collection sentinel), or
    // (2) force old writers to retry (note that we must roll back any uncommitted transactions
    // that we encounter)
    long sweepTs = sweeper.getSweepTimestampSupplier().getSweepTimestamp(unreadableTimestampSupplier, immutableTimestampSupplier);
    CandidateCellForSweepingRequest request = ImmutableCandidateCellForSweepingRequest.builder()
            .startRowInclusive(startRow)
            .batchSizeHint(batchConfig.candidateBatchSize())
            .maxTimestampExclusive(sweepTs)
            .shouldCheckIfLatestValueIsEmpty(sweeper.shouldSweepLastCommitted())
            .shouldDeleteGarbageCollectionSentinels(!sweeper.shouldAddSentinels())
            .build();
    SweepableCellFilter sweepableCellFilter = new SweepableCellFilter(transactionService, sweeper, sweepTs);
    try (ClosableIterator<List<CandidateCellForSweeping>> candidates = keyValueService.getCandidateCellsForSweeping(tableRef, request)) {
        ExaminedCellLimit limit = new ExaminedCellLimit(startRow, batchConfig.maxCellTsPairsToExamine());
        Iterator<BatchOfCellsToSweep> batchesToSweep = getBatchesToSweep(candidates, batchConfig, sweepableCellFilter, limit);
        long totalCellTsPairsExamined = 0;
        long totalCellTsPairsDeleted = 0;
        metricsManager.ifPresent(SweepMetricsManager::resetBeforeDeleteBatch);
        byte[] lastRow = startRow;
        while (batchesToSweep.hasNext()) {
            BatchOfCellsToSweep batch = batchesToSweep.next();
            /*
             * At this point cells were merged in batches of at least deleteBatchSize blocks per batch. Therefore we
             * expect most batches to have slightly more than deleteBatchSize blocks. Partitioning such batches with
             * deleteBatchSize as a limit results in a small second batch, which is bad for performance reasons.
             * Therefore, deleteBatchSize is doubled.
             */
            long cellsDeleted = sweepBatch(tableRef, batch.cells(), runType, 2 * batchConfig.deleteBatchSize());
            totalCellTsPairsDeleted += cellsDeleted;
            long cellsExamined = batch.numCellTsPairsExamined();
            totalCellTsPairsExamined += cellsExamined;
            metricsManager.ifPresent(manager -> manager.updateAfterDeleteBatch(cellsExamined, cellsDeleted));
            lastRow = batch.lastCellExamined().getRowName();
        }
        return SweepResults.builder()
                .previousStartRow(Optional.of(startRow))
                .nextStartRow(Arrays.equals(startRow, lastRow) ? Optional.empty() : Optional.of(lastRow))
                .cellTsPairsExamined(totalCellTsPairsExamined)
                .staleValuesDeleted(totalCellTsPairsDeleted)
                .minSweptTimestamp(sweepTs)
                .timeInMillis(watch.elapsed(TimeUnit.MILLISECONDS))
                .timeSweepStarted(timeSweepStarted)
                .build();
    }
}
Also used: ExaminedCellLimit (com.palantir.atlasdb.sweep.CellsToSweepPartitioningIterator.ExaminedCellLimit), ImmutableCandidateCellForSweepingRequest (com.palantir.atlasdb.keyvalue.api.ImmutableCandidateCellForSweepingRequest), CandidateCellForSweepingRequest (com.palantir.atlasdb.keyvalue.api.CandidateCellForSweepingRequest), Stopwatch (com.google.common.base.Stopwatch), List (java.util.List), SweepMetricsManager (com.palantir.atlasdb.sweep.metrics.SweepMetricsManager)
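
The comment inside the loop of doRun explains why sweepBatch is called with 2 * deleteBatchSize. The arithmetic can be illustrated with a small standalone sketch. It is not AtlasDB code: the class name, the partition helper, and the values 100 and 105 are assumptions chosen only for illustration, and the real sweepBatch may partition differently.

```java
import java.util.ArrayList;
import java.util.List;

class DeleteBatchPartitionSketch {
    // Hypothetical helper: splits items into consecutive chunks of at most `limit` elements.
    static <T> List<List<T>> partition(List<T> items, int limit) {
        List<List<T>> chunks = new ArrayList<>();
        for (int from = 0; from < items.size(); from += limit) {
            int to = Math.min(from + limit, items.size());
            chunks.add(new ArrayList<>(items.subList(from, to)));
        }
        return chunks;
    }

    // Returns the sizes of the chunks produced for a batch of `batchSize` cells.
    static List<Integer> chunkSizes(int batchSize, int limit) {
        List<Integer> cells = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            cells.add(i);
        }
        List<Integer> sizes = new ArrayList<>();
        for (List<Integer> chunk : partition(cells, limit)) {
            sizes.add(chunk.size());
        }
        return sizes;
    }

    public static void main(String[] args) {
        int deleteBatchSize = 100; // illustrative value, not a real default
        int batchSize = 105;       // "slightly more than deleteBatchSize"

        // Limit equal to deleteBatchSize: a full chunk plus a tiny trailing chunk.
        System.out.println(chunkSizes(batchSize, deleteBatchSize));     // [100, 5]
        // Doubled limit: the whole batch fits in a single chunk.
        System.out.println(chunkSizes(batchSize, 2 * deleteBatchSize)); // [105]
    }
}
```

With the undoubled limit, every batch of this shape costs two delete calls, one of which removes only a handful of cells; doubling the limit keeps each such batch in a single call.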

Aggregations

Stopwatch (com.google.common.base.Stopwatch): 1
CandidateCellForSweepingRequest (com.palantir.atlasdb.keyvalue.api.CandidateCellForSweepingRequest): 1
ImmutableCandidateCellForSweepingRequest (com.palantir.atlasdb.keyvalue.api.ImmutableCandidateCellForSweepingRequest): 1
ExaminedCellLimit (com.palantir.atlasdb.sweep.CellsToSweepPartitioningIterator.ExaminedCellLimit): 1
SweepMetricsManager (com.palantir.atlasdb.sweep.metrics.SweepMetricsManager): 1
List (java.util.List): 1