
Example 81 with CorruptIndexException

Use of org.apache.lucene.index.CorruptIndexException in project Solbase by Photobucket.

From the class IndexReader, the method document:

public Document document(int docNum, FieldSelector selector) throws CorruptIndexException, IOException {
    // if we are not sharding, skip this logic
    if (firstPhase.get()) {
        Document doc = new Document();
        Field docId = new Field("docId", Integer.toString(docNum), Field.Store.YES, Field.Index.ANALYZED);
        doc.add(docId);
        // TODO: fetch sortable field values here
        return doc;
    }
    try {
        if (selector instanceof SolbaseFieldSelector) {
            // TODO: this logic should be more generic; currently it only runs on the initial shard request
            List<byte[]> fieldNames = ((SolbaseFieldSelector) selector).getFieldNames();
            if (fieldNames != null) {
                StringBuilder fieldNamesString = new StringBuilder();
                for (byte[] fieldName : fieldNames) {
                    fieldNamesString.append(Bytes.toString(fieldName));
                }
                // this will hit the ShardDocument cache
                Document doc = ReaderCache.getDocument(docNum + "~" + fieldNamesString, selector, this.indexName, this.startDocId, this.endDocId).getValue();
                return doc;
            }
        }
        CachedObjectWrapper<Document, Long> docObj = ReaderCache.getDocument(docNum, selector, this.indexName, this.startDocId, this.endDocId);
        Document doc = docObj.getValue();
        if (selector instanceof SolbaseFieldSelector) {
            for (Integer docId : ((SolbaseFieldSelector) selector).getOtherDocsToCache()) {
                // pre-cache the other ids
                // TODO: maybe pull HTablePool.getTable up to here; otherwise each
                // getDocument() call acquires its own HTableInterface
                ReaderCache.getDocument(docId, selector, this.indexName, this.startDocId, this.endDocId);
            }
        }
        return doc;
    } catch (Exception ex) {
        throw new IOException(ex);
    }
}
Also used : Field(org.apache.lucene.document.Field) IOException(java.io.IOException) Document(org.apache.lucene.document.Document) SolbaseFieldSelector(org.solbase.SolbaseFieldSelector) CorruptIndexException(org.apache.lucene.index.CorruptIndexException)
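
For orientation, here is a minimal, self-contained sketch (not taken from Solbase) of the most common place a CorruptIndexException surfaces: opening a reader over an index directory. The index path is a placeholder and the handling is illustrative only.

import java.io.IOException;
import java.nio.file.Paths;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class OpenReaderSketch {
    public static void main(String[] args) throws IOException {
        // "/tmp/example-index" is a hypothetical path; point it at a real index directory.
        try (Directory dir = FSDirectory.open(Paths.get("/tmp/example-index"));
             DirectoryReader reader = DirectoryReader.open(dir)) {
            System.out.println("maxDoc=" + reader.maxDoc());
        } catch (CorruptIndexException e) {
            // Lucene detected a checksum or structural mismatch while decoding an index file.
            System.err.println("index corrupted: " + e.getMessage());
        }
    }
}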

Example 82 with CorruptIndexException

Use of org.apache.lucene.index.CorruptIndexException in project neo4j by neo4j.

From the class LuceneSchemaIndexCorruptionTest, the method shouldRequestIndexPopulationIfTheIndexIsCorrupt:

@Test
void shouldRequestIndexPopulationIfTheIndexIsCorrupt() {
    // Given
    long faultyIndexId = 1;
    CorruptIndexException error = new CorruptIndexException("It's broken.", "");
    LuceneIndexProvider provider = newFaultyIndexProvider(faultyIndexId, error);
    // When
    IndexDescriptor descriptor = forSchema(forLabel(1, 1), provider.getProviderDescriptor()).withName("index_" + faultyIndexId).materialise(faultyIndexId);
    InternalIndexState initialState = provider.getInitialState(descriptor, NULL);
    // Then
    assertThat(initialState).isEqualTo(InternalIndexState.POPULATING);
    assertThat(logProvider).containsException(error);
}
Also used : InternalIndexState(org.neo4j.internal.kernel.api.InternalIndexState) CorruptIndexException(org.apache.lucene.index.CorruptIndexException) IndexDescriptor(org.neo4j.internal.schema.IndexDescriptor) Test(org.junit.jupiter.api.Test)
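
A side note on the two-argument constructor used above: it takes a message plus a resource description identifying the file or stream being read. A minimal sketch of what it produces; the strings are illustrative and the exact message format can vary by Lucene version.

CorruptIndexException error = new CorruptIndexException("It's broken.", "segments_1");
// The resource description is folded into the final message, along the lines of
// "It's broken. (resource=segments_1)".
System.out.println(error.getMessage());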

Example 83 with CorruptIndexException

Use of org.apache.lucene.index.CorruptIndexException in project crate by crate.

From the class RecoverySourceHandler, the method sendFiles:

void sendFiles(Store store, StoreFileMetadata[] files, IntSupplier translogOps, ActionListener<Void> listener) {
    // send smallest first
    ArrayUtil.timSort(files, Comparator.comparingLong(StoreFileMetadata::length));
    final MultiChunkTransfer<StoreFileMetadata, FileChunk> multiFileSender = new MultiChunkTransfer<StoreFileMetadata, FileChunk>(logger, listener, maxConcurrentFileChunks, Arrays.asList(files)) {

        final Deque<byte[]> buffers = new ConcurrentLinkedDeque<>();

        InputStreamIndexInput currentInput = null;

        long offset = 0;

        @Override
        protected void onNewResource(StoreFileMetadata md) throws IOException {
            offset = 0;
            IOUtils.close(currentInput, () -> currentInput = null);
            final IndexInput indexInput = store.directory().openInput(md.name(), IOContext.READONCE);
            currentInput = new InputStreamIndexInput(indexInput, md.length()) {

                @Override
                public void close() throws IOException {
                    // InputStreamIndexInput's close is a noop
                    IOUtils.close(indexInput, super::close);
                }
            };
        }

        private byte[] acquireBuffer() {
            final byte[] buffer = buffers.pollFirst();
            if (buffer != null) {
                return buffer;
            }
            return new byte[chunkSizeInBytes];
        }

        @Override
        protected FileChunk nextChunkRequest(StoreFileMetadata md) throws IOException {
            assert Transports.assertNotTransportThread("read file chunk");
            cancellableThreads.checkForCancel();
            final byte[] buffer = acquireBuffer();
            final int bytesRead = currentInput.read(buffer);
            if (bytesRead == -1) {
                throw new CorruptIndexException("file truncated; length=" + md.length() + " offset=" + offset, md.name());
            }
            final boolean lastChunk = offset + bytesRead == md.length();
            final FileChunk chunk = new FileChunk(md, new BytesArray(buffer, 0, bytesRead), offset, lastChunk, () -> buffers.addFirst(buffer));
            offset += bytesRead;
            return chunk;
        }

        @Override
        protected void executeChunkRequest(FileChunk request, ActionListener<Void> listener) {
            cancellableThreads.checkForCancel();
            recoveryTarget.writeFileChunk(request.md, request.position, request.content, request.lastChunk, translogOps.getAsInt(), ActionListener.runBefore(listener, request::close));
        }

        @Override
        protected void handleError(StoreFileMetadata md, Exception e) throws Exception {
            handleErrorOnSendFiles(store, e, new StoreFileMetadata[] { md });
        }

        @Override
        public void close() throws IOException {
            IOUtils.close(currentInput, () -> currentInput = null);
        }
    };
    resources.add(multiFileSender);
    multiFileSender.start();
}
Also used : BytesArray(org.elasticsearch.common.bytes.BytesArray) CorruptIndexException(org.apache.lucene.index.CorruptIndexException) StoreFileMetadata(org.elasticsearch.index.store.StoreFileMetadata) IOException(java.io.IOException) Deque(java.util.Deque) ConcurrentLinkedDeque(java.util.concurrent.ConcurrentLinkedDeque) IndexFormatTooNewException(org.apache.lucene.index.IndexFormatTooNewException) RecoveryEngineException(org.elasticsearch.index.engine.RecoveryEngineException) RetentionLeaseNotFoundException(org.elasticsearch.index.seqno.RetentionLeaseNotFoundException) RemoteTransportException(org.elasticsearch.transport.RemoteTransportException) IndexShardClosedException(org.elasticsearch.index.shard.IndexShardClosedException) IndexShardRelocatedException(org.elasticsearch.index.shard.IndexShardRelocatedException) IndexFormatTooOldException(org.apache.lucene.index.IndexFormatTooOldException) ThreadedActionListener(org.elasticsearch.action.support.ThreadedActionListener) ActionListener(org.elasticsearch.action.ActionListener) InputStreamIndexInput(org.elasticsearch.common.lucene.store.InputStreamIndexInput) IndexInput(org.apache.lucene.store.IndexInput)
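
The truncation check in nextChunkRequest above generalizes to any chunked copy: if EOF arrives before the advertised file length, the source is shorter than its metadata claims and the transfer must abort. A standalone sketch of that check, assuming hypothetical expectedLength and resourceName parameters:

import java.io.IOException;
import java.io.InputStream;
import org.apache.lucene.index.CorruptIndexException;

final class ChunkReader {
    // Reads one chunk; throws if the stream ends before the expected length is reached.
    static int readChunk(InputStream in, byte[] buffer, long offset,
                         long expectedLength, String resourceName) throws IOException {
        int bytesRead = in.read(buffer);
        if (bytesRead == -1 && offset < expectedLength) {
            // EOF before the advertised file length: the source file is truncated.
            throw new CorruptIndexException(
                    "file truncated; length=" + expectedLength + " offset=" + offset, resourceName);
        }
        return bytesRead;
    }
}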

Example 84 with CorruptIndexException

Use of org.apache.lucene.index.CorruptIndexException in project crate by crate.

From the class RecoverySourceHandler, the method phase1:

/**
 * Perform phase1 of the recovery operations. Once this {@link IndexCommit}
 * snapshot has been taken, no commit operations (files being fsync'd)
 * are effectively allowed on this index until all recovery phases are done.
 * <p>
 * Phase1 examines the segment files on the target node and copies over the
 * segments that are missing. Only segments that have the same size and
 * checksum can be reused.
 */
void phase1(IndexCommit snapshot, long startingSeqNo, IntSupplier translogOps, ActionListener<SendFileResult> listener) {
    cancellableThreads.checkForCancel();
    final Store store = shard.store();
    try {
        final StopWatch stopWatch = new StopWatch().start();
        final Store.MetadataSnapshot recoverySourceMetadata;
        try {
            recoverySourceMetadata = store.getMetadata(snapshot);
        } catch (CorruptIndexException | IndexFormatTooOldException | IndexFormatTooNewException ex) {
            shard.failShard("recovery", ex);
            throw ex;
        }
        for (String name : snapshot.getFileNames()) {
            final StoreFileMetadata md = recoverySourceMetadata.get(name);
            if (md == null) {
                logger.info("Snapshot differs from actual index for file: {} meta: {}", name, recoverySourceMetadata.asMap());
                throw new CorruptIndexException("Snapshot differs from actual index - maybe index was removed; metadata has " + recoverySourceMetadata.asMap().size() + " files", name);
            }
        }
        if (canSkipPhase1(recoverySourceMetadata, request.metadataSnapshot()) == false) {
            final List<String> phase1FileNames = new ArrayList<>();
            final List<Long> phase1FileSizes = new ArrayList<>();
            final List<String> phase1ExistingFileNames = new ArrayList<>();
            final List<Long> phase1ExistingFileSizes = new ArrayList<>();
            // Total size of segment files that are recovered
            long totalSizeInBytes = 0;
            // Total size of segment files that were able to be re-used
            long existingTotalSizeInBytes = 0;
            // Generate a "diff" of all the identical, different, and missing
            // segment files on the target node, using the existing files on
            // the source node
            final Store.RecoveryDiff diff = recoverySourceMetadata.recoveryDiff(request.metadataSnapshot());
            for (StoreFileMetadata md : diff.identical) {
                phase1ExistingFileNames.add(md.name());
                phase1ExistingFileSizes.add(md.length());
                existingTotalSizeInBytes += md.length();
                if (logger.isTraceEnabled()) {
                    logger.trace("recovery [phase1]: not recovering [{}], exist in local store and has checksum [{}]," + " size [{}]", md.name(), md.checksum(), md.length());
                }
                totalSizeInBytes += md.length();
            }
            List<StoreFileMetadata> phase1Files = new ArrayList<>(diff.different.size() + diff.missing.size());
            phase1Files.addAll(diff.different);
            phase1Files.addAll(diff.missing);
            for (StoreFileMetadata md : phase1Files) {
                if (request.metadataSnapshot().asMap().containsKey(md.name())) {
                    logger.trace("recovery [phase1]: recovering [{}], exists in local store, but is different: remote [{}], local [{}]", md.name(), request.metadataSnapshot().asMap().get(md.name()), md);
                } else {
                    logger.trace("recovery [phase1]: recovering [{}], does not exist in remote", md.name());
                }
                phase1FileNames.add(md.name());
                phase1FileSizes.add(md.length());
                totalSizeInBytes += md.length();
            }
            logger.trace("recovery [phase1]: recovering_files [{}] with total_size [{}], reusing_files [{}] with total_size [{}]", phase1FileNames.size(), new ByteSizeValue(totalSizeInBytes), phase1ExistingFileNames.size(), new ByteSizeValue(existingTotalSizeInBytes));
            final StepListener<Void> sendFileInfoStep = new StepListener<>();
            final StepListener<Void> sendFilesStep = new StepListener<>();
            final StepListener<RetentionLease> createRetentionLeaseStep = new StepListener<>();
            final StepListener<Void> cleanFilesStep = new StepListener<>();
            cancellableThreads.checkForCancel();
            recoveryTarget.receiveFileInfo(phase1FileNames, phase1FileSizes, phase1ExistingFileNames, phase1ExistingFileSizes, translogOps.getAsInt(), sendFileInfoStep);
            sendFileInfoStep.whenComplete(r -> sendFiles(store, phase1Files.toArray(new StoreFileMetadata[0]), translogOps, sendFilesStep), listener::onFailure);
            sendFilesStep.whenComplete(r -> createRetentionLease(startingSeqNo, createRetentionLeaseStep), listener::onFailure);
            createRetentionLeaseStep.whenComplete(retentionLease -> {
                final long lastKnownGlobalCheckpoint = shard.getLastKnownGlobalCheckpoint();
                assert retentionLease == null || retentionLease.retainingSequenceNumber() - 1 <= lastKnownGlobalCheckpoint : retentionLease + " vs " + lastKnownGlobalCheckpoint;
                // Establishes new empty translog on the replica with global checkpoint set to lastKnownGlobalCheckpoint. We want
                // the commit we just copied to be a safe commit on the replica, so why not set the global checkpoint on the replica
                // to the max seqno of this commit? Because (in rare corner cases) this commit might not be a safe commit here on
                // the primary, and in these cases the max seqno would be too high to be valid as a global checkpoint.
                cleanFiles(store, recoverySourceMetadata, translogOps, lastKnownGlobalCheckpoint, cleanFilesStep);
            }, listener::onFailure);
            final long totalSize = totalSizeInBytes;
            final long existingTotalSize = existingTotalSizeInBytes;
            cleanFilesStep.whenComplete(r -> {
                final TimeValue took = stopWatch.totalTime();
                logger.trace("recovery [phase1]: took [{}]", took);
                listener.onResponse(new SendFileResult(phase1FileNames, phase1FileSizes, totalSize, phase1ExistingFileNames, phase1ExistingFileSizes, existingTotalSize, took));
            }, listener::onFailure);
        } else {
            logger.trace("skipping [phase1] since source and target have identical sync id [{}]", recoverySourceMetadata.getSyncId());
            // but we must still create a retention lease
            final StepListener<RetentionLease> createRetentionLeaseStep = new StepListener<>();
            createRetentionLease(startingSeqNo, createRetentionLeaseStep);
            createRetentionLeaseStep.whenComplete(retentionLease -> {
                final TimeValue took = stopWatch.totalTime();
                logger.trace("recovery [phase1]: took [{}]", took);
                listener.onResponse(new SendFileResult(Collections.emptyList(), Collections.emptyList(), 0L, Collections.emptyList(), Collections.emptyList(), 0L, took));
            }, listener::onFailure);
        }
    } catch (Exception e) {
        throw new RecoverFilesRecoveryException(request.shardId(), 0, new ByteSizeValue(0L), e);
    }
}
Also used : CopyOnWriteArrayList(java.util.concurrent.CopyOnWriteArrayList) ArrayList(java.util.ArrayList) ByteSizeValue(org.elasticsearch.common.unit.ByteSizeValue) Store(org.elasticsearch.index.store.Store) StoreFileMetadata(org.elasticsearch.index.store.StoreFileMetadata) IndexFormatTooOldException(org.apache.lucene.index.IndexFormatTooOldException) TimeValue(io.crate.common.unit.TimeValue) CorruptIndexException(org.apache.lucene.index.CorruptIndexException) IndexFormatTooNewException(org.apache.lucene.index.IndexFormatTooNewException) RecoveryEngineException(org.elasticsearch.index.engine.RecoveryEngineException) RetentionLeaseNotFoundException(org.elasticsearch.index.seqno.RetentionLeaseNotFoundException) RemoteTransportException(org.elasticsearch.transport.RemoteTransportException) IndexShardClosedException(org.elasticsearch.index.shard.IndexShardClosedException) IndexShardRelocatedException(org.elasticsearch.index.shard.IndexShardRelocatedException) IOException(java.io.IOException) StopWatch(org.elasticsearch.common.StopWatch) RetentionLease(org.elasticsearch.index.seqno.RetentionLease) AtomicLong(java.util.concurrent.atomic.AtomicLong) StepListener(org.elasticsearch.action.StepListener)
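
Structurally, phase1 sequences its asynchronous work by chaining StepListener instances: each whenComplete registers the next step and routes any failure to the outer listener. A minimal sketch of the pattern, where startStepOne and startStepTwo are hypothetical async calls:

final StepListener<Void> stepOne = new StepListener<>();
final StepListener<Void> stepTwo = new StepListener<>();
startStepOne(stepOne);
stepOne.whenComplete(r -> startStepTwo(stepTwo), listener::onFailure);
stepTwo.whenComplete(r -> listener.onResponse(null), listener::onFailure);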

Example 85 with CorruptIndexException

Use of org.apache.lucene.index.CorruptIndexException in project crate by crate.

From the class RecoverySourceHandler, the method handleErrorOnSendFiles:

private void handleErrorOnSendFiles(Store store, Exception e, StoreFileMetadata[] mds) throws Exception {
    final IOException corruptIndexException = ExceptionsHelper.unwrapCorruption(e);
    assert Transports.assertNotTransportThread(RecoverySourceHandler.this + "[handle error on send/clean files]");
    if (corruptIndexException != null) {
        Exception localException = null;
        for (StoreFileMetadata md : mds) {
            cancellableThreads.checkForCancel();
            logger.debug("checking integrity for file {} after remove corruption exception", md);
            if (store.checkIntegrityNoException(md) == false) {
                // we are corrupted on the primary -- fail!
                logger.warn("{} Corrupted file detected {} checksum mismatch", shardId, md);
                if (localException == null) {
                    localException = corruptIndexException;
                }
                failEngine(corruptIndexException);
            }
        }
        if (localException != null) {
            throw localException;
        } else {
            // corruption happened on the way to the replica
            RemoteTransportException remoteException = new RemoteTransportException("File corruption occurred on recovery but checksums are ok", null);
            remoteException.addSuppressed(e);
            logger.warn(() -> new ParameterizedMessage("{} Remote file corruption on node {}, recovering {}. local checksum OK", shardId, request.targetNode(), mds), corruptIndexException);
            throw remoteException;
        }
    }
    throw e;
}
Also used : RemoteTransportException(org.elasticsearch.transport.RemoteTransportException) ParameterizedMessage(org.apache.logging.log4j.message.ParameterizedMessage) IOException(java.io.IOException) StoreFileMetadata(org.elasticsearch.index.store.StoreFileMetadata) IndexFormatTooNewException(org.apache.lucene.index.IndexFormatTooNewException) RecoveryEngineException(org.elasticsearch.index.engine.RecoveryEngineException) RetentionLeaseNotFoundException(org.elasticsearch.index.seqno.RetentionLeaseNotFoundException) CorruptIndexException(org.apache.lucene.index.CorruptIndexException) IndexShardClosedException(org.elasticsearch.index.shard.IndexShardClosedException) IndexShardRelocatedException(org.elasticsearch.index.shard.IndexShardRelocatedException) IndexFormatTooOldException(org.apache.lucene.index.IndexFormatTooOldException)
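
The method relies on ExceptionsHelper.unwrapCorruption to decide whether the failure is corruption-related at all. A simplified approximation of that unwrapping, walking the cause chain for the three corruption-type exceptions (the real helper is more thorough, e.g. it also considers suppressed exceptions):

import java.io.IOException;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexFormatTooNewException;
import org.apache.lucene.index.IndexFormatTooOldException;

final class CorruptionUnwrapper {
    // Returns the first corruption-type exception in the cause chain, or null if none.
    static IOException unwrapCorruption(Throwable t) {
        while (t != null) {
            if (t instanceof CorruptIndexException
                    || t instanceof IndexFormatTooOldException
                    || t instanceof IndexFormatTooNewException) {
                return (IOException) t;
            }
            t = t.getCause();
        }
        return null;
    }
}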

Aggregations

CorruptIndexException (org.apache.lucene.index.CorruptIndexException): 93 usages
IndexFormatTooNewException (org.apache.lucene.index.IndexFormatTooNewException): 35 usages
IndexFormatTooOldException (org.apache.lucene.index.IndexFormatTooOldException): 35 usages
Directory (org.apache.lucene.store.Directory): 26 usages
IOException (java.io.IOException): 25 usages
ChecksumIndexInput (org.apache.lucene.store.ChecksumIndexInput): 24 usages
IndexInput (org.apache.lucene.store.IndexInput): 24 usages
IndexOutput (org.apache.lucene.store.IndexOutput): 23 usages
ArrayList (java.util.ArrayList): 16 usages
RAMDirectory (org.apache.lucene.store.RAMDirectory): 15 usages
BytesRef (org.apache.lucene.util.BytesRef): 14 usages
FileNotFoundException (java.io.FileNotFoundException): 12 usages
ShardId (org.elasticsearch.index.shard.ShardId): 12 usages
NoSuchFileException (java.nio.file.NoSuchFileException): 11 usages
IOContext (org.apache.lucene.store.IOContext): 11 usages
EOFException (java.io.EOFException): 10 usages
HashMap (java.util.HashMap): 10 usages
AlreadyClosedException (org.apache.lucene.store.AlreadyClosedException): 10 usages
FilterDirectory (org.apache.lucene.store.FilterDirectory): 10 usages
Document (org.apache.lucene.document.Document): 8 usages