
Example 1 with UpdateType

use of org.wali.UpdateType in project nifi by apache.

the class HashMapSnapshot method update.

@Override
public void update(final Collection<T> records) {
    // For each update, then, we will update the record in the map.
    for (final T record : records) {
        final Object recordId = serdeFactory.getRecordIdentifier(record);
        final UpdateType updateType = serdeFactory.getUpdateType(record);
        switch(updateType) {
            case DELETE:
                recordMap.remove(recordId);
                break;
            case SWAP_OUT:
                final String location = serdeFactory.getLocation(record);
                if (location == null) {
                    logger.error("Received Record (ID=" + recordId + ") with UpdateType of SWAP_OUT but " + "no indicator of where the Record is to be Swapped Out to; these records may be " + "lost when the repository is restored!");
                } else {
                    recordMap.remove(recordId);
                    this.swapLocations.add(location);
                }
                break;
            case SWAP_IN:
                final String swapLocation = serdeFactory.getLocation(record);
                if (swapLocation == null) {
                    logger.error("Received Record (ID=" + recordId + ") with UpdateType of SWAP_IN but no " + "indicator of where the Record is to be Swapped In from; these records may be duplicated " + "when the repository is restored!");
                } else {
                    swapLocations.remove(swapLocation);
                }
                recordMap.put(recordId, record);
                break;
            default:
                recordMap.put(recordId, record);
                break;
        }
    }
}
Also used : UpdateType(org.wali.UpdateType)
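
The switch above can be tried in isolation with a small stand-in. This is only a sketch: `Kind` is a local enum mirroring the five constants that appear in these examples, and `Edit` is a hypothetical record type that carries the ID, update type, and swap location which the real code obtains from its SerDe. Logging is omitted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SnapshotSketch {
    // Local stand-in for org.wali.UpdateType, so the sketch compiles without WALI.
    public enum Kind { CREATE, UPDATE, DELETE, SWAP_IN, SWAP_OUT }

    // Hypothetical record type; the real code asks a SerDe for these values.
    public static final class Edit {
        final String id;
        final Kind kind;
        final String location; // swap location, or null when absent

        public Edit(String id, Kind kind, String location) {
            this.id = id;
            this.kind = kind;
            this.location = location;
        }
    }

    public final Map<String, Edit> recordMap = new HashMap<>();
    public final Set<String> swapLocations = new HashSet<>();

    public void update(List<Edit> edits) {
        for (Edit e : edits) {
            switch (e.kind) {
                case DELETE:
                    recordMap.remove(e.id);
                    break;
                case SWAP_OUT:
                    // Without a location the record stays in the map (the original logs an error here).
                    if (e.location != null) {
                        recordMap.remove(e.id);
                        swapLocations.add(e.location);
                    }
                    break;
                case SWAP_IN:
                    if (e.location != null) {
                        swapLocations.remove(e.location);
                    }
                    recordMap.put(e.id, e);
                    break;
                default: // CREATE and UPDATE
                    recordMap.put(e.id, e);
                    break;
            }
        }
    }
}
```

Note the asymmetry carried over from the original: a SWAP_IN without a location still puts the record into the map, while a SWAP_OUT without a location leaves the map untouched.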

Example 2 with UpdateType

use of org.wali.UpdateType in project nifi by apache.

the class LengthDelimitedJournal method recoverRecords.

@Override
public JournalRecovery recoverRecords(final Map<Object, T> recordMap, final Set<String> swapLocations) throws IOException {
    long maxTransactionId = -1L;
    int updateCount = 0;
    boolean eofException = false;
    logger.info("Recovering records from journal {}", journalFile);
    final double journalLength = journalFile.length();
    try (final InputStream fis = new FileInputStream(journalFile);
        final InputStream bufferedIn = new BufferedInputStream(fis);
        final ByteCountingInputStream byteCountingIn = new ByteCountingInputStream(bufferedIn);
        final DataInputStream in = new DataInputStream(byteCountingIn)) {
        try {
            // Validate that the header is what we expect and obtain the appropriate SerDe and Version information
            final SerDeAndVersion serdeAndVersion = validateHeader(in);
            final SerDe<T> serde = serdeAndVersion.getSerDe();
            // Ensure that we get a valid transaction indicator
            int transactionIndicator = in.read();
            if (transactionIndicator != TRANSACTION_FOLLOWS && transactionIndicator != JOURNAL_COMPLETE && transactionIndicator != -1) {
                throw new IOException("After reading " + byteCountingIn.getBytesConsumed() + " bytes from " + journalFile + ", encountered unexpected value of " + transactionIndicator + " for the Transaction Indicator. This journal may have been corrupted.");
            }
            long consumedAtLog = 0L;
            // We don't want to apply the updates in a transaction until we've finished recovering the entire
            // transaction. Otherwise, we could apply say 8 out of 10 updates and then hit an EOF. In such a case,
            // we want to rollback the entire transaction. We handle this by not updating recordMap or swapLocations
            // variables directly but instead keeping track of the things that occurred and then once we've read the
            // entire transaction, we can apply those updates to the recordMap and swapLocations.
            final Map<Object, T> transactionRecordMap = new HashMap<>();
            final Set<Object> idsRemoved = new HashSet<>();
            final Set<String> swapLocationsRemoved = new HashSet<>();
            final Set<String> swapLocationsAdded = new HashSet<>();
            int transactionUpdates = 0;
            // While we have a transaction to recover, recover it
            while (transactionIndicator == TRANSACTION_FOLLOWS) {
                transactionRecordMap.clear();
                idsRemoved.clear();
                swapLocationsRemoved.clear();
                swapLocationsAdded.clear();
                transactionUpdates = 0;
                // Format is <Transaction ID: 8 bytes> <Transaction Length: 4 bytes> <Transaction data: # of bytes indicated by Transaction Length Field>
                final long transactionId = in.readLong();
                maxTransactionId = Math.max(maxTransactionId, transactionId);
                final int transactionLength = in.readInt();
                // Use SerDe to deserialize the update. We use a LimitingInputStream to ensure that the SerDe is not able to read past its intended
                // length, in case there is a bug in the SerDe. We then use a ByteCountingInputStream so that we can ensure that all of the data has
                // been read and throw EOFException otherwise.
                final InputStream transactionLimitingIn = new LimitingInputStream(in, transactionLength);
                final ByteCountingInputStream transactionByteCountingIn = new ByteCountingInputStream(transactionLimitingIn);
                final DataInputStream transactionDis = new DataInputStream(transactionByteCountingIn);
                while (transactionByteCountingIn.getBytesConsumed() < transactionLength) {
                    final T record = serde.deserializeEdit(transactionDis, recordMap, serdeAndVersion.getVersion());
                    // Update our RecordMap so that we have the most up-to-date version of the Record.
                    final Object recordId = serde.getRecordIdentifier(record);
                    final UpdateType updateType = serde.getUpdateType(record);
                    switch(updateType) {
                        case DELETE:
                            {
                                idsRemoved.add(recordId);
                                transactionRecordMap.remove(recordId);
                                break;
                            }
                        case SWAP_IN:
                            {
                                final String location = serde.getLocation(record);
                                if (location == null) {
                                    logger.error("Recovered SWAP_IN record from edit log, but it did not contain a Location; skipping record");
                                } else {
                                    swapLocationsRemoved.add(location);
                                    swapLocationsAdded.remove(location);
                                    transactionRecordMap.put(recordId, record);
                                }
                                break;
                            }
                        case SWAP_OUT:
                            {
                                final String location = serde.getLocation(record);
                                if (location == null) {
                                    logger.error("Recovered SWAP_OUT record from edit log, but it did not contain a Location; skipping record");
                                } else {
                                    swapLocationsRemoved.remove(location);
                                    swapLocationsAdded.add(location);
                                    idsRemoved.add(recordId);
                                    transactionRecordMap.remove(recordId);
                                }
                                break;
                            }
                        default:
                            {
                                transactionRecordMap.put(recordId, record);
                                idsRemoved.remove(recordId);
                                break;
                            }
                    }
                    transactionUpdates++;
                }
                // Apply the transaction
                for (final Object id : idsRemoved) {
                    recordMap.remove(id);
                }
                recordMap.putAll(transactionRecordMap);
                swapLocations.removeAll(swapLocationsRemoved);
                swapLocations.addAll(swapLocationsAdded);
                updateCount += transactionUpdates;
                // Check if there is another transaction to read
                transactionIndicator = in.read();
                if (transactionIndicator != TRANSACTION_FOLLOWS && transactionIndicator != JOURNAL_COMPLETE && transactionIndicator != -1) {
                    throw new IOException("After reading " + byteCountingIn.getBytesConsumed() + " bytes from " + journalFile + ", encountered unexpected value of " + transactionIndicator + " for the Transaction Indicator. This journal may have been corrupted.");
                }
                // If we have a very large journal (for instance, if checkpoint is not called for a long time, or if there is a problem rolling over
                // the journal), then we want to occasionally notify the user that we are, in fact, making progress, so that it doesn't appear that
                // NiFi has become "stuck".
                final long consumed = byteCountingIn.getBytesConsumed();
                if (consumed - consumedAtLog > 50_000_000) {
                    final double percentage = consumed / journalLength * 100D;
                    final String pct = new DecimalFormat("#.00").format(percentage);
                    logger.info("{}% of the way finished recovering journal {}, having recovered {} updates", pct, journalFile, updateCount);
                    consumedAtLog = consumed;
                }
            }
        } catch (final EOFException eof) {
            eofException = true;
            logger.warn("Encountered unexpected End-of-File when reading journal file {}; assuming that NiFi was shutdown unexpectedly and continuing recovery", journalFile);
        } catch (final Exception e) {
            // If the rest of the file is all NUL bytes (as can happen after a sudden power loss), log a warning
            // and skip it. Otherwise there is not much that we can do but to re-throw the Exception.
            if (remainingBytesAllNul(in)) {
                logger.warn("Failed to recover some of the data from Write-Ahead Log Journal because encountered trailing NUL bytes. " + "This will sometimes happen after a sudden power loss. The rest of this journal file will be skipped for recovery purposes. " + "The following Exception was encountered while recovering the updates to the journal:", e);
            } else {
                throw e;
            }
        }
    }
    logger.info("Successfully recovered {} updates from journal {}", updateCount, journalFile);
    return new StandardJournalRecovery(updateCount, maxTransactionId, eofException);
}
Also used : HashMap(java.util.HashMap) DecimalFormat(java.text.DecimalFormat) ByteCountingInputStream(org.apache.nifi.stream.io.ByteCountingInputStream) BufferedInputStream(java.io.BufferedInputStream) EOFException(java.io.EOFException) HashSet(java.util.HashSet) DataInputStream(java.io.DataInputStream) LimitingInputStream(org.apache.nifi.stream.io.LimitingInputStream) FileInputStream(java.io.FileInputStream) InputStream(java.io.InputStream) IOException(java.io.IOException) UpdateType(org.wali.UpdateType) FileNotFoundException(java.io.FileNotFoundException)
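
The idea of staging a transaction and applying it only once it has been read in full can be sketched with the JDK alone. Everything below is an assumption made for illustration. The marker values are hypothetical, each edit is reduced to a key/value pair of UTF strings, and `readFully` on a byte array takes the place of NiFi's `LimitingInputStream` and `ByteCountingInputStream`. Only the layout (marker byte, 8-byte transaction ID, 4-byte length, payload) follows the comment in the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class JournalSketch {
    // Hypothetical marker value; the real constant is not shown in the example.
    public static final int TRANSACTION_FOLLOWS = 1;

    // Writes one transaction: marker, 8-byte ID, 4-byte payload length, then the UTF strings.
    public static void writeTransaction(DataOutputStream out, long txId, String... keyValues) throws IOException {
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        DataOutputStream payloadOut = new DataOutputStream(payload);
        for (String s : keyValues) {
            payloadOut.writeUTF(s);
        }
        payloadOut.flush();
        out.write(TRANSACTION_FOLLOWS);
        out.writeLong(txId);
        out.writeInt(payload.size());
        payload.writeTo(out);
    }

    // Replays complete transactions into recordMap and returns the highest transaction ID applied.
    public static long recover(byte[] journal, Map<String, String> recordMap) throws IOException {
        long maxTxId = -1L;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(journal))) {
            int indicator = in.read();
            while (indicator == TRANSACTION_FOLLOWS) {
                long txId = in.readLong();
                int length = in.readInt();
                byte[] payload = new byte[length];
                in.readFully(payload); // throws EOFException if the transaction is truncated

                // Stage the edits; recordMap is untouched until the whole payload has been parsed.
                Map<String, String> staged = new HashMap<>();
                DataInputStream tx = new DataInputStream(new ByteArrayInputStream(payload));
                while (tx.available() > 0) {
                    String key = tx.readUTF();
                    String value = tx.readUTF();
                    staged.put(key, value);
                }

                recordMap.putAll(staged);
                maxTxId = Math.max(maxTxId, txId);
                indicator = in.read(); // -1 at end of stream ends the loop
            }
        } catch (EOFException eof) {
            // A partially written trailing transaction is dropped, as if it had been rolled back.
        }
        return maxTxId;
    }
}
```

Unlike the example, this sketch updates the maximum transaction ID only after a transaction has been applied, and it does not validate the marker byte. Both are simplifications.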

Example 3 with UpdateType

use of org.wali.UpdateType in project nifi by apache.

the class StateMapSerDe method deserializeRecord.

@Override
public StateMapUpdate deserializeRecord(final DataInputStream in, final int version) throws IOException {
    final String componentId = in.readUTF();
    final String updateTypeName = in.readUTF();
    final UpdateType updateType = UpdateType.valueOf(updateTypeName);
    if (updateType == UpdateType.DELETE) {
        return new StateMapUpdate(null, componentId, updateType);
    }
    final long recordVersion = in.readLong();
    final int numEntries = in.readInt();
    final Map<String, String> stateValues = new HashMap<>(numEntries);
    for (int i = 0; i < numEntries; i++) {
        final boolean hasKey = in.readBoolean();
        final String key = hasKey ? in.readUTF() : null;
        final boolean hasValue = in.readBoolean();
        final String value = hasValue ? in.readUTF() : null;
        stateValues.put(key, value);
    }
    return new StateMapUpdate(new StandardStateMap(stateValues, recordVersion), componentId, updateType);
}
Also used : HashMap(java.util.HashMap) UpdateType(org.wali.UpdateType)
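
The reader above expects each key and value to be preceded by a boolean presence flag, which is what lets null strings survive a round trip. The matching writer is not shown in this example, so the one below is an assumed mirror image of the read loop. `NullableStringCodec` and its method names are hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class NullableStringCodec {
    // Writes a presence flag, then the string itself only when it is non-null.
    static void writeNullable(DataOutputStream out, String s) throws IOException {
        out.writeBoolean(s != null);
        if (s != null) {
            out.writeUTF(s);
        }
    }

    // Assumed writer: entry count first, then flag-prefixed key and value per entry.
    public static void writeEntries(DataOutputStream out, Map<String, String> values) throws IOException {
        out.writeInt(values.size());
        for (Map.Entry<String, String> entry : values.entrySet()) {
            writeNullable(out, entry.getKey());
            writeNullable(out, entry.getValue());
        }
    }

    // Same read loop as in deserializeRecord, limited to the map portion of the record.
    public static Map<String, String> readEntries(DataInputStream in) throws IOException {
        final int numEntries = in.readInt();
        final Map<String, String> values = new HashMap<>(numEntries);
        for (int i = 0; i < numEntries; i++) {
            final String key = in.readBoolean() ? in.readUTF() : null;
            final String value = in.readBoolean() ? in.readUTF() : null;
            values.put(key, value);
        }
        return values;
    }
}
```

A `HashMap` accepts one null key and any number of null values, so the decoded map can hold whatever the flags describe.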

Example 4 with UpdateType

use of org.wali.UpdateType in project nifi by apache.

the class WriteAheadRepositoryRecordSerde method serializeEdit.

public void serializeEdit(final RepositoryRecord previousRecordState, final RepositoryRecord record, final DataOutputStream out, final boolean forceAttributesWritten) throws IOException {
    if (record.isMarkedForAbort()) {
        logger.warn("Repository Record {} is marked to be aborted; it will be persisted in the FlowFileRepository as a DELETE record", record);
        out.write(ACTION_DELETE);
        out.writeLong(getRecordIdentifier(record));
        serializeContentClaim(record.getCurrentClaim(), record.getCurrentClaimOffset(), out);
        return;
    }
    final UpdateType updateType = getUpdateType(record);
    if (updateType.equals(UpdateType.DELETE)) {
        out.write(ACTION_DELETE);
        out.writeLong(getRecordIdentifier(record));
        serializeContentClaim(record.getCurrentClaim(), record.getCurrentClaimOffset(), out);
        return;
    }
    // If there's a Destination Connection, that's the one that we want to associate with this record.
    // However, on restart, we will restore the FlowFile and set this connection to its "originalConnection".
    // If we then serialize the FlowFile again before it's transferred, it's important to allow this to happen,
    // so we use the originalConnection instead
    FlowFileQueue associatedQueue = record.getDestination();
    if (associatedQueue == null) {
        associatedQueue = record.getOriginalQueue();
    }
    if (updateType.equals(UpdateType.SWAP_OUT)) {
        out.write(ACTION_SWAPPED_OUT);
        out.writeLong(getRecordIdentifier(record));
        out.writeUTF(associatedQueue.getIdentifier());
        out.writeUTF(getLocation(record));
        return;
    }
    final FlowFile flowFile = record.getCurrent();
    final ContentClaim claim = record.getCurrentClaim();
    switch(updateType) {
        case UPDATE:
            out.write(ACTION_UPDATE);
            break;
        case CREATE:
            out.write(ACTION_CREATE);
            break;
        case SWAP_IN:
            out.write(ACTION_SWAPPED_IN);
            break;
        default:
            throw new AssertionError();
    }
    out.writeLong(getRecordIdentifier(record));
    out.writeLong(flowFile.getEntryDate());
    out.writeLong(flowFile.getLineageStartDate());
    out.writeLong(flowFile.getLineageStartIndex());
    final Long queueDate = flowFile.getLastQueueDate();
    out.writeLong(queueDate == null ? System.currentTimeMillis() : queueDate);
    out.writeLong(flowFile.getQueueDateIndex());
    out.writeLong(flowFile.getSize());
    if (associatedQueue == null) {
        logger.warn("{} Repository Record {} has no Connection associated with it; it will be destroyed on restart", new Object[] { this, record });
        writeString("", out);
    } else {
        writeString(associatedQueue.getIdentifier(), out);
    }
    serializeContentClaim(claim, record.getCurrentClaimOffset(), out);
    if (forceAttributesWritten || record.isAttributesChanged() || updateType == UpdateType.CREATE || updateType == UpdateType.SWAP_IN) {
        // indicate attributes changed
        out.write(1);
        final Map<String, String> attributes = flowFile.getAttributes();
        out.writeInt(attributes.size());
        for (final Map.Entry<String, String> entry : attributes.entrySet()) {
            writeString(entry.getKey(), out);
            writeString(entry.getValue(), out);
        }
    } else {
        // indicate attributes did not change
        out.write(0);
    }
    if (updateType == UpdateType.SWAP_IN) {
        out.writeUTF(record.getSwapLocation());
    }
}
Also used : FlowFile(org.apache.nifi.flowfile.FlowFile) ContentClaim(org.apache.nifi.controller.repository.claim.ContentClaim) StandardContentClaim(org.apache.nifi.controller.repository.claim.StandardContentClaim) FlowFileQueue(org.apache.nifi.controller.queue.FlowFileQueue) UpdateType(org.wali.UpdateType) HashMap(java.util.HashMap) Map(java.util.Map)

Example 5 with UpdateType

use of org.wali.UpdateType in project nifi by apache.

the class RepositoryRecordUpdate method getFieldValue.

@Override
public Object getFieldValue(final String fieldName) {
    if (RepositoryRecordSchema.REPOSITORY_RECORD_UPDATE_V2.equals(fieldName)) {
        String actionType = (String) fieldMap.getFieldValue(RepositoryRecordSchema.ACTION_TYPE);
        if (RepositoryRecordType.CONTENTMISSING.name().equals(actionType)) {
            actionType = RepositoryRecordType.DELETE.name();
        }
        final UpdateType updateType = UpdateType.valueOf(actionType);
        final String actionName;
        switch(updateType) {
            case CREATE:
            case UPDATE:
                actionName = RepositoryRecordSchema.CREATE_OR_UPDATE_ACTION;
                break;
            case DELETE:
                actionName = RepositoryRecordSchema.DELETE_ACTION;
                break;
            case SWAP_IN:
                actionName = RepositoryRecordSchema.SWAP_IN_ACTION;
                break;
            case SWAP_OUT:
                actionName = RepositoryRecordSchema.SWAP_OUT_ACTION;
                break;
            default:
                return null;
        }
        return new NamedValue(actionName, fieldMap);
    }
    return null;
}
Also used : NamedValue(org.apache.nifi.repository.schema.NamedValue) UpdateType(org.wali.UpdateType)
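
One detail worth keeping in mind when reading the method above: a generated enum `valueOf` throws `IllegalArgumentException` for an unknown name and `NullPointerException` for a null one, so the `default` branch is reached only for constants that the switch does not list. The sketch below shows the alias-then-parse step on a local stand-in enum. `Kind` and `parse` are hypothetical names, and returning null for an unknown name is a choice made for this sketch, not the behavior of the example.

```java
public class ActionTypeSketch {
    // Local stand-in for org.wali.UpdateType, so the sketch compiles without WALI.
    public enum Kind { CREATE, UPDATE, DELETE, SWAP_IN, SWAP_OUT }

    // Returns the matching constant, or null when the name is missing or unknown.
    public static Kind parse(String actionType) {
        // CONTENTMISSING is treated as DELETE, as in the example above.
        String name = "CONTENTMISSING".equals(actionType) ? "DELETE" : actionType;
        try {
            return Kind.valueOf(name);
        } catch (IllegalArgumentException | NullPointerException e) {
            return null;
        }
    }
}
```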

Aggregations

UpdateType (org.wali.UpdateType): 6
HashMap (java.util.HashMap): 3
BufferedInputStream (java.io.BufferedInputStream): 2
DataInputStream (java.io.DataInputStream): 2
EOFException (java.io.EOFException): 2
FileInputStream (java.io.FileInputStream): 2
HashSet (java.util.HashSet): 2
File (java.io.File): 1
FileNotFoundException (java.io.FileNotFoundException): 1
IOException (java.io.IOException): 1
InputStream (java.io.InputStream): 1
DecimalFormat (java.text.DecimalFormat): 1
Map (java.util.Map): 1
FlowFileQueue (org.apache.nifi.controller.queue.FlowFileQueue): 1
ContentClaim (org.apache.nifi.controller.repository.claim.ContentClaim): 1
StandardContentClaim (org.apache.nifi.controller.repository.claim.StandardContentClaim): 1
FlowFile (org.apache.nifi.flowfile.FlowFile): 1
NamedValue (org.apache.nifi.repository.schema.NamedValue): 1
ByteCountingInputStream (org.apache.nifi.stream.io.ByteCountingInputStream): 1
LimitingInputStream (org.apache.nifi.stream.io.LimitingInputStream): 1