Example 16 with Transaction

use of com.linkedin.databus2.producers.ds.Transaction in project databus by linkedin.

the class DbUpdateState method onStartElement.

@Override
public void onStartElement(StateMachine stateMachine, XMLStreamReader xmlStreamReader) throws DatabusException, XMLStreamException {
    _currentStateType = STATETYPE.STARTELEMENT;
    _opType = DBUpdateImage.OpType.UNKNOWN;
    boolean isPreImage = false;
    for (int i = 0; i < xmlStreamReader.getAttributeCount(); i++) {
        if (xmlStreamReader.getAttributeName(i).getLocalPart().equals(TABLEATTR))
            _currentTable = xmlStreamReader.getAttributeValue(i);
        else if (xmlStreamReader.getAttributeName(i).getLocalPart().equalsIgnoreCase(UPDATEATTRNAME) && xmlStreamReader.getAttributeValue(i).equalsIgnoreCase(UPDATEVAL)) {
            _opType = DBUpdateImage.OpType.UPDATE;
        } else if (xmlStreamReader.getAttributeName(i).getLocalPart().equalsIgnoreCase(DELETEATTRNAME) && xmlStreamReader.getAttributeValue(i).equalsIgnoreCase(DELETEVAL)) {
            _opType = DBUpdateImage.OpType.DELETE;
        } else if (xmlStreamReader.getAttributeName(i).getLocalPart().equalsIgnoreCase(INSERTATTRNAME) && xmlStreamReader.getAttributeValue(i).equalsIgnoreCase(INSERTVAL)) {
            _opType = DBUpdateImage.OpType.INSERT;
        } else if (xmlStreamReader.getAttributeName(i).getLocalPart().equalsIgnoreCase(PREIMAGEATTRNAME) && xmlStreamReader.getAttributeValue(i).equalsIgnoreCase(PREIMAGEVAL)) {
            isPreImage = true;
        }
    }
    // This is the pre-image of the row, so we can skip it
    if (isPreImage) {
        if (LOG.isDebugEnabled())
            LOG.debug("Skipping current dbUpdate because it's a preimage");
        skipCurrentDBupdate(stateMachine, xmlStreamReader);
        return;
    }
    if (_currentTable == null || _currentTable.length() == 0) {
        LOG.fatal("PROBLEM WITH XML: Dbupdate does not have any table name associated with it, stopping ");
        throw new DatabusException("Dbupdate does not have any table name associated with it, stopping");
    }
    Schema schema = StateMachineHelper.tableToSchema(_currentTable, stateMachine.getTableToSourceNameMap(), stateMachine.getSchemaRegistryService());
    if (schema == null) {
        if (LOG.isDebugEnabled())
            LOG.debug("This source is not configured (couldn't find namespace). Skipping to tokens, to capture scn for empty DBUpdate");
        /**
         * State jump: DBUPDATE -> TOKENS (we skip COLUMNS).
         * At this point we can't capture this update (the tableName -> namespace configuration
         * was not found), but we are still interested in the SCN associated with the current
         * dbUpdate. SCNs are captured because, if it's a slow source, the goldengate event
         * producer needs to insert an EOP marker at some threshold. If the transaction has no
         * updates relevant to the tables we are interested in, it is passed to the transaction
         * callback as an "empty" transaction containing just the SCNs.
         */
        skipToTokens(stateMachine, xmlStreamReader);
        setNextStateProcessor(stateMachine, xmlStreamReader);
        return;
    }
    stateMachine.columnsState.setCurrentSchema(schema);
    stateMachine.columnsState.setKeyPairs(new ArrayList<ColumnsState.KeyPair>());
    xmlStreamReader.nextTag();
    setNextStateProcessor(stateMachine, xmlStreamReader);
}
Also used : DatabusException(com.linkedin.databus2.core.DatabusException) Schema(org.apache.avro.Schema)
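
The op-type detection above is plain StAX attribute scanning. Below is a minimal, self-contained sketch of the same pattern; the element and attribute names are illustrative, not the actual trail-file schema.

import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class AttributeScanDemo {
    public static void main(String[] args) throws Exception {
        // Illustrative input; the real trail-file schema may differ.
        String xml = "<dbupdate table=\"MEMBER_PROFILE\" update=\"true\"/>";
        XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT) {
                // Same loop shape as onStartElement above
                for (int i = 0; i < r.getAttributeCount(); i++) {
                    System.out.println(r.getAttributeName(i).getLocalPart()
                            + " = " + r.getAttributeValue(i));
                }
            }
        }
        r.close();
    }
}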

Example 17 with Transaction

use of com.linkedin.databus2.producers.ds.Transaction in project databus by linkedin.

the class TrailFilePositionSetter method locateFilePosition.

/**
   * An optimized search for a trail-file position. It starts the lookup from the latest file,
   * quickly skipping trail files whose first transaction has an SCN newer than the requested SCN.
   *
   * Steps:
   * 1. Get the list of all trail files.
   * 2. Iterate the trail files from the latest to the earliest until exhausted or successful:
   *    a) FindResult = call findPosition() on the current file.
   *    b) If FindResult was successful (FOUND: the exact SCN is present; or EXACT_SCN_NOT_FOUND:
   *       SCNs both lower and higher than the requested one were seen, but not the exact one)
   *       and the found txn was not the first transaction seen, then return.
   *    c) Otherwise, if it was the first transaction, reset and look at the previous file.
   *
   * This method is quick: if the current trail file's first SCN is higher than the requested SCN,
   * or the transaction found was the first one, it fails fast (after the first transaction) so
   * that the lookup moves to the previous file. In EI/Prod, each trail file is on the order of
   * 50 MB, so this jumping saves a lot of time.
   *
   * Reason for continuing to scan when the requested transaction is the first one:
   *        In each round of scanning we start from one trail file. If the first transaction read
   *        has SCN == requestedScn, we still need to look at the previous file, since it could
   *        contain txns with the same SCN; we must locate the first txn with this SCN. If there
   *        is no previous file (the earliest txn matches the requested SCN), we return an error.
   * @param scn SCN to locate
   * @param callback TransactionSCNFinderCallback to parse and store offsets
   * @return FilePositionResult of the locate operation
   * @throws IOException on problems with file operations
   */
public synchronized FilePositionResult locateFilePosition(long scn, TransactionSCNFinderCallback callback) throws IOException {
    TrailFileNotifier notifier = new TrailFileNotifier(_dir, _filter, null, 0, null);
    List<File> orderedTrailFiles = notifier.getCandidateTrailFiles();
    _log.info("Initial set of Trail Files :" + orderedTrailFiles);
    if ((null == orderedTrailFiles) || orderedTrailFiles.isEmpty()) {
        return FilePositionResult.createNoTxnsFoundResult();
    }
    FilePositionResult res = null;
    if (scn == USE_EARLIEST_SCN) {
        res = getFilePosition(scn, callback);
    } else {
        for (int i = orderedTrailFiles.size() - 1; i >= 0; i--) {
            callback.reset();
            File startFile = orderedTrailFiles.get(i);
            _log.info("Locating the SCN (" + scn + ") starting from the trail file :" + startFile);
            res = getFilePosition(scn, callback, startFile.getName());
            _log.info("Result of the location operation for SCN (" + scn + ") starting from trail file (" + startFile + ") is : " + res);
            // TxnRank is 0 if this is the first txn scanned; in that case we need to go to the previous file
            if (((res.getStatus() == Status.EXACT_SCN_NOT_FOUND) || (res.getStatus() == Status.FOUND))
                    && (res.getTxnPos().getTxnRank() > 0)) {
                break;
            }
            if ((0 == i) && (res != null)) {
                ScnTxnPos scnTxnPos = res.getTxnPos();
                if ((scnTxnPos != null) && (scnTxnPos.getTxnRank() <= 0)) {
                    return FilePositionResult.createErrorResult(new DatabusException("A transaction with scn less than requested SCN was not found. " + "Without this txn, we cannot identify if all transactions for requested SCN " + "have been located. Requested SinceSCN is :" + scn));
                }
            }
        }
    }
    return res;
}
Also used : DatabusException(com.linkedin.databus2.core.DatabusException) File(java.io.File)
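
A hedged usage sketch for locateFilePosition: the TrailFilePositionSetter constructor arguments and the choice of GGXMLTrailTransactionFinder as the callback (it appears in the Aggregations below) are assumptions for illustration, not verified API.

// Hypothetical wiring; constructor signature and callback choice are assumptions.
TrailFilePositionSetter setter =
        new TrailFilePositionSetter("/mnt/gg/trail", "x3");  // trail dir + file prefix (assumed)
TransactionSCNFinderCallback callback = new GGXMLTrailTransactionFinder();
FilePositionResult result = setter.locateFilePosition(697023L, callback);
switch (result.getStatus()) {
    case FOUND:               // the exact SCN was located
    case EXACT_SCN_NOT_FOUND: // bracketing SCNs were located; the position is still usable
        System.out.println("Start reading from: " + result.getTxnPos());
        break;
    default:
        System.out.println("Lookup failed: " + result);
}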

Example 18 with Transaction

use of com.linkedin.databus2.producers.ds.Transaction in project databus by linkedin.

the class OracleTxlogEventReader method readEventsFromAllSources.

@Override
public ReadEventCycleSummary readEventsFromAllSources(long sinceSCN) throws DatabusException, EventCreationException, UnsupportedKeyException {
    boolean eventBufferNeedsRollback = true;
    boolean debugEnabled = _log.isDebugEnabled();
    List<EventReaderSummary> summaries = new ArrayList<EventReaderSummary>();
    try {
        long cycleStartTS = System.currentTimeMillis();
        _eventBuffer.startEvents();
        // Open the database connection if it is closed (at start or after an SQLException)
        if (_eventSelectConnection == null || _eventSelectConnection.isClosed()) {
            resetConnections();
        }
        /**
       * Chunking in Relay:
       * =================
       *
       *  Variables used:
       *  ===============
       *
       *  1. _inChunking : Flag to indicate if the relay is in chunking mode
       *  2. _chunkingType : Type of chunking supported
       *  3. _chunkedScnThreshold :
       *               The threshold Scn diff which triggers chunking. If the relay's maxScn is older
       *               than DB's maxScn by this threshold, then chunking will be enabled.
       *  4. _txnsPerChunk : Chunk size of txns for txn based chunking.
       *  5. _scnChunkSize : Chunk Size for scn based chunking.
       *  6. _catchupTargetMaxScn : Cached copy of DB's maxScn used as chunking's target SCN.
       *
       *  =========================================
       *  Behavior of Chunking for Slow Sources:
       *  =========================================
       *
       *  The slow-source case illustrated here is when all the sources in the sourcesList (fetched by the relay) are slow.
       *  In this case, the endOfPeriodSCN will not increase on its own, whereas in all other cases it will.
       *
       *  At startup, if the _catchupTargetMaxScn - currScn > _chunkedScnThreshold, then chunking is enabled.
       *  1. Txn_based_chunking
       *
       *    a) If chunking is on at startup, then txn-based chunking query is used. Otherwise, regular query is used.
       *    b) For a period till SLOW_SOURCE_QUERY_THRESHOLD msec, the endOfPeriodSCN/SinceSCN will not increase.
       *    c) After SLOW_SOURCE_QUERY_THRESHOLD msec, the sinceScn/endOfPeriodSCN will be increased to the current maxScn. If chunking was previously enabled
       *        at this time, it will be disabled for up to MAX_SCN_DELAY_MS msec, after which _catchupTargetMaxScn will be refreshed.
       *    d) if the new _catchupTargetMaxScn - currScn > _chunkedScnThreshold, then chunking is again enabled.
       *    e) go to (b)
       *
       *  2. SCN based Chunking
       *    a) If chunking is on at startup, then scn-based chunking query is used. Otherwise, regular query is used.
       *    b) For a period till SLOW_SOURCE_QUERY_THRESHOLD msec, the endOfPeriodSCN/SinceSCN keep increasing by _scnChunkSize with no rows fetched.
       *    c) When _catchupTargetMaxScn - endOfPeriodSCN <  _chunkedScnThreshold, then chunking is disabled and regular query kicks in and in this
       *       phase sinceSCN/endOfPeriodSCN will not increase. After MAX_SCN_DELAY_MS interval, _catchupTargetSCN will be refreshed.
       *    d) If the new _catchupTargetMaxScn - currScn > _chunkedScnThreshold, then SCN chunking is again enabled.
       *    e) go to (b)
       *
       */
        if (sinceSCN <= 0) {
            _catchupTargetMaxScn = sinceSCN = getMaxTxlogSCN(_eventSelectConnection);
            _log.debug("sinceSCN was <= 0. Overriding with the current max SCN=" + sinceSCN);
            _eventBuffer.setStartSCN(sinceSCN);
            try {
                DBHelper.commit(_eventSelectConnection);
            } catch (SQLException s) {
                DBHelper.rollback(_eventSelectConnection);
            }
        } else if ((_chunkingType.isChunkingEnabled()) && (_catchupTargetMaxScn <= 0)) {
            _catchupTargetMaxScn = getMaxTxlogSCN(_eventSelectConnection);
            _log.debug("catchupTargetMaxScn was <= 0. Overriding with the current max SCN=" + _catchupTargetMaxScn);
        }
        if (_catchupTargetMaxScn <= 0)
            _inChunkingMode = false;
        // Get events for each source
        List<OracleTriggerMonitoredSourceInfo> filteredSources = filterSources(sinceSCN);
        long endOfPeriodScn = EventReaderSummary.NO_EVENTS_SCN;
        for (OracleTriggerMonitoredSourceInfo source : _sources) {
            if (filteredSources.contains(source)) {
                long startTS = System.currentTimeMillis();
                EventReaderSummary summary = readEventsFromOneSource(_eventSelectConnection, source, sinceSCN);
                summaries.add(summary);
                endOfPeriodScn = Math.max(endOfPeriodScn, summary.getEndOfPeriodSCN());
                long endTS = System.currentTimeMillis();
                source.getStatisticsBean().addTimeOfLastDBAccess(endTS);
                if (_eventsLog.isDebugEnabled() || (_eventsLog.isInfoEnabled() && summary.getNumberOfEvents() > 0)) {
                    _eventsLog.info(summary.toString());
                }
                // Update statistics for the source
                if (summary.getNumberOfEvents() > 0) {
                    source.getStatisticsBean().addEventCycle(summary.getNumberOfEvents(), endTS - startTS, summary.getSizeOfSerializedEvents(), summary.getEndOfPeriodSCN());
                } else {
                    source.getStatisticsBean().addEmptyEventCycle();
                }
            } else {
                source.getStatisticsBean().addEmptyEventCycle();
            }
        }
        _lastSeenEOP = Math.max(_lastSeenEOP, Math.max(endOfPeriodScn, sinceSCN));
        // If we did not read any events in this cycle then get the max SCN from the txlog. This
        // is for slow sources so that the endOfPeriodScn never lags too far behind the max scn
        // in the txlog table.
        long curtime = System.currentTimeMillis();
        if (endOfPeriodScn == EventReaderSummary.NO_EVENTS_SCN) {
            // If in SCN chunking mode, it's possible to get empty batches for an SCN range.
            if ((sinceSCN + _scnChunkSize <= _catchupTargetMaxScn) && (ChunkingType.SCN_CHUNKING == _chunkingType)) {
                endOfPeriodScn = sinceSCN + _scnChunkSize;
                _lastquerytime = curtime;
            } else if (ChunkingType.TXN_CHUNKING == _chunkingType && _inChunkingMode) {
                long nextBatchScn = getMaxScnSkippedForTxnChunked(_eventSelectConnection, sinceSCN, _txnsPerChunk);
                _log.info("No events while in txn chunking. CurrScn : " + sinceSCN + ", jumping to :" + nextBatchScn);
                endOfPeriodScn = nextBatchScn;
                _lastquerytime = curtime;
            } else if ((curtime - _lastquerytime) > _slowQuerySourceThreshold) {
                _lastquerytime = curtime;
                // get the new start scn for subsequent calls
                final long maxTxlogSCN = getMaxTxlogSCN(_eventSelectConnection);
                //For performance reasons, getMaxTxlogSCN() returns the max scn only among txlog rows
                //which have their scn rewritten (i.e. scn < infinity). This allows the getMaxTxlogSCN
                //query to be evaluated using only the SCN index. Getting the true max SCN requires
                //scanning the rows where scn == infinity which is expensive.
                //On the other hand, readEventsFromOneSource will read the latter events. So it is
                //possible that maxTxlogSCN < scn of the last event in the buffer!
                //We use max() to guarantee that there are no SCN regressions.
                endOfPeriodScn = Math.max(maxTxlogSCN, sinceSCN);
                _log.info("SlowSourceQueryThreshold hit. currScn : " + sinceSCN + ". Advanced endOfPeriodScn to " + endOfPeriodScn + " and added the event to relay");
                if (debugEnabled) {
                    _log.debug("No events processed. Read max SCN from txlog table for endOfPeriodScn. endOfPeriodScn=" + endOfPeriodScn);
                }
            }
            if (endOfPeriodScn != EventReaderSummary.NO_EVENTS_SCN && endOfPeriodScn > sinceSCN) {
                // The SCN moved forward in the if/else block above
                _log.info("The endOfPeriodScn has advanced to " + endOfPeriodScn);
                _eventBuffer.endEvents(endOfPeriodScn, _relayInboundStatsCollector);
                eventBufferNeedsRollback = false;
            } else {
                eventBufferNeedsRollback = true;
            }
        } else {
            //we have appended some events; and a new end of period has been found
            _lastquerytime = curtime;
            _eventBuffer.endEvents(endOfPeriodScn, _relayInboundStatsCollector);
            if (debugEnabled) {
                _log.debug("End of events: " + endOfPeriodScn + " windown range= " + _eventBuffer.getMinScn() + "," + _eventBuffer.lastWrittenScn());
            }
            //no need to roll back
            eventBufferNeedsRollback = false;
        }
        //save endOfPeriodScn if new one has been discovered
        if (endOfPeriodScn != EventReaderSummary.NO_EVENTS_SCN) {
            if (null != _maxScnWriter && (endOfPeriodScn != sinceSCN)) {
                _maxScnWriter.saveMaxScn(endOfPeriodScn);
            }
            for (OracleTriggerMonitoredSourceInfo source : _sources) {
                //update maxDBScn here
                source.getStatisticsBean().addMaxDBScn(endOfPeriodScn);
                source.getStatisticsBean().addTimeOfLastDBAccess(System.currentTimeMillis());
            }
        }
        long cycleEndTS = System.currentTimeMillis();
        //check if we should refresh _catchupTargetMaxScn
        if (_chunkingType.isChunkingEnabled() && (_lastSeenEOP >= _catchupTargetMaxScn) && (curtime - _lastMaxScnTime >= _maxScnDelayMs)) {
            //reset it to -1 so it gets refreshed next time around
            _catchupTargetMaxScn = -1;
        }
        boolean chunkMode = _chunkingType.isChunkingEnabled() && (_catchupTargetMaxScn > 0) && (_lastSeenEOP < _catchupTargetMaxScn);
        if (!chunkMode && _inChunkingMode)
            _log.info("Disabling chunking for sources !!");
        _inChunkingMode = chunkMode;
        if (_inChunkingMode && debugEnabled)
            _log.debug("_inChunkingMode = true, _catchupTargetMaxScn=" + _catchupTargetMaxScn + ", endOfPeriodScn=" + endOfPeriodScn + ", _lastSeenEOP=" + _lastSeenEOP);
        ReadEventCycleSummary summary = new ReadEventCycleSummary(_name, summaries, Math.max(endOfPeriodScn, sinceSCN), (cycleEndTS - cycleStartTS));
        // Have to commit the transaction since we are in serializable isolation level
        DBHelper.commit(_eventSelectConnection);
        // Return the event summaries
        return summary;
    } catch (SQLException ex) {
        try {
            DBHelper.rollback(_eventSelectConnection);
        } catch (SQLException s) {
            throw new DatabusException(s.getMessage());
        }
        handleExceptionInReadEvents(ex);
        throw new DatabusException(ex);
    } catch (Exception e) {
        handleExceptionInReadEvents(e);
        throw new DatabusException(e);
    } finally {
        // If an exception prevented the events from being committed, roll back the event buffer.
        if (eventBufferNeedsRollback) {
            if (_log.isDebugEnabled()) {
                _log.debug("Rolling back the event buffer because eventBufferNeedsRollback is true.");
            }
            _eventBuffer.rollbackEvents();
        }
    }
}
Also used : DatabusException(com.linkedin.databus2.core.DatabusException) SQLException(java.sql.SQLException) ArrayList(java.util.ArrayList) EventCreationException(com.linkedin.databus2.producers.EventCreationException) UnsupportedKeyException(com.linkedin.databus.core.UnsupportedKeyException)
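
The chunking rules documented in the long comment above reduce to two threshold checks. Here is a minimal sketch of those predicates with illustrative numbers; the method and parameter names mirror the fields of this reader but are not databus API.

// Illustrative sketch, not databus code: the two chunking decisions described above.
final class ChunkingSketch {
    // Enter chunking when the relay lags the DB's max SCN by more than the threshold.
    static boolean shouldEnterChunking(long catchupTargetMaxScn, long currScn,
                                       long chunkedScnThreshold) {
        return catchupTargetMaxScn > 0
                && (catchupTargetMaxScn - currScn) > chunkedScnThreshold;
    }

    // Stay in chunking only while the last seen end-of-period SCN trails the target.
    static boolean shouldStayInChunking(long catchupTargetMaxScn, long lastSeenEOP) {
        return catchupTargetMaxScn > 0 && lastSeenEOP < catchupTargetMaxScn;
    }

    public static void main(String[] args) {
        System.out.println(shouldEnterChunking(1_000_000L, 100_000L, 500_000L)); // true: lag 900k
        System.out.println(shouldEnterChunking(1_000_000L, 900_000L, 500_000L)); // false: lag 100k
        System.out.println(shouldStayInChunking(1_000_000L, 999_999L));          // true
    }
}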

Example 19 with Transaction

use of com.linkedin.databus2.producers.ds.Transaction in project databus by linkedin.

the class TransactionState method onStartElement.

@Override
public void onStartElement(StateMachine stateMachine, XMLStreamReader xmlStreamReader) throws DatabusException, XMLStreamException {
    _currentStateType = STATETYPE.STARTELEMENT;
    for (int i = 0; i < xmlStreamReader.getAttributeCount(); i++) {
        if (xmlStreamReader.getAttributeName(i).getLocalPart().equals(TRANSACTIONTIMESTAMPATTR)) {
            StringBuilder timeStamp = new StringBuilder(xmlStreamReader.getAttributeValue(i));
            _lastSeenTimestampStr = timeStamp.toString();
            //The GoldenGate timestamp lacks the nanosecond precision required for an Oracle timestamp, so pad it with zeros
            String correctedTimestamp = timeStamp.append("000").toString();
            _currentTimeStamp = GGEventGenerationFactory.ggTimeStampStringToNanoSeconds(correctedTimestamp);
        }
    }
    if (_currentTimeStamp == UNINITIALIZEDTS)
        throw new DatabusException("Unable to locate timestamp in the transaction tag in the xml");
    // start of new transaction
    if (_startTransProcessingTimeNs == 0) {
        _startTransProcessingTimeNs = System.nanoTime();
        _startTransLocation = xmlStreamReader.getLocation().getCharacterOffset();
        // this is location of the END of the <transaction timestamp="..."> entity
        // so we need to adjust the size of the transactions in bytes
        _startTransLocation -= TRANSACTION_ELEMENT_SIZE;
    }
    //create dbupdates list
    stateMachine.dbUpdateState.setSourceDbUpdatesMap(new HashMap<Integer, HashSet<DbUpdateState.DBUpdateImage>>());
    xmlStreamReader.nextTag();
    setNextStateProcessor(stateMachine, xmlStreamReader);
}
Also used : DatabusException(com.linkedin.databus2.core.DatabusException) HashSet(java.util.HashSet)
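
A small, self-contained sketch of the padding step above. The exact GoldenGate timestamp layout used here (microsecond fraction, ':' between date and time) is an assumption; only the zero-padding and the nanosecond conversion come from the example.

import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class GGTimestampDemo {
    public static void main(String[] args) {
        // Assumed GG layout; the real trail format may differ.
        String ggTimestamp = "2013-08-01:12:00:00.123456";
        // Pad the fraction to nanosecond width, as onStartElement does with append("000").
        String padded = ggTimestamp + "000";
        DateTimeFormatter fmt =
                DateTimeFormatter.ofPattern("yyyy-MM-dd:HH:mm:ss.SSSSSSSSS");
        LocalDateTime ts = LocalDateTime.parse(padded, fmt);
        long nanos = ts.toEpochSecond(ZoneOffset.UTC) * 1_000_000_000L + ts.getNano();
        System.out.println(nanos); // nanoseconds since the epoch (UTC assumed)
    }
}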

Example 20 with Transaction

use of com.linkedin.databus2.producers.ds.Transaction in project databus by linkedin.

the class TransactionState method onEndElement.

@Override
public void onEndElement(StateMachine stateMachine, XMLStreamReader xmlStreamReader) throws Exception {
    _currentStateType = STATETYPE.ENDELEMENT;
    if (LOG.isDebugEnabled())
        LOG.debug("The current transaction has " + stateMachine.dbUpdateState.getSourceDbUpdatesMap().size() + " DbUpdates");
    if (_transactionSuccessCallBack == null) {
        throw new DatabusException("No callback specified for the transaction state! Cannot proceed without a callback");
    }
    long endTransactionLocation = xmlStreamReader.getLocation().getCharacterOffset();
    _transactionSize = endTransactionLocation - _startTransLocation;
    // collect stats
    long trTime = System.nanoTime() - _startTransProcessingTimeNs;
    long scn = stateMachine.dbUpdateState.getScn();
    TransactionInfo trInfo = new TransactionInfo(_transactionSize, trTime, _currentTimeStamp, scn);
    if (stateMachine.dbUpdateState.getSourceDbUpdatesMap().size() == 0) {
        if (LOG.isDebugEnabled())
            LOG.debug("The current transaction contains no dbUpdates, giving empty callback");
        _transactionSuccessCallBack.onTransactionEnd(null, trInfo);
    } else {
        List<PerSourceTransactionalUpdate> dbUpdates = sortDbUpdates(stateMachine.dbUpdateState.getSourceDbUpdatesMap());
        _transactionSuccessCallBack.onTransactionEnd(dbUpdates, trInfo);
    }
    stateMachine.dbUpdateState.cleanUpState(stateMachine, xmlStreamReader);
    cleanUpState(stateMachine, xmlStreamReader);
    xmlStreamReader.nextTag();
    setNextStateProcessor(stateMachine, xmlStreamReader);
}
Also used : DatabusException(com.linkedin.databus2.core.DatabusException) TransactionInfo(com.linkedin.databus.monitoring.mbean.GGParserStatistics.TransactionInfo)
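
A hedged sketch of the callback contract exercised above. The interface shape and the getScn() accessor are inferred from the calls in this example (onTransactionEnd receives null for an empty transaction) and should be treated as assumptions, not verified API.

// Assumed shape of the transaction callback, inferred from the calls above.
TransactionSuccessCallBack callback = new TransactionSuccessCallBack() {
    @Override
    public void onTransactionEnd(List<PerSourceTransactionalUpdate> dbUpdates,
                                 TransactionInfo trInfo) {
        if (dbUpdates == null) {
            // Empty transaction: nothing to emit, but the SCN still advances.
            System.out.println("Empty txn, scn=" + trInfo.getScn());
        } else {
            System.out.println(dbUpdates.size() + " per-source updates, scn=" + trInfo.getScn());
        }
    }
};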

Aggregations

DatabusException (com.linkedin.databus2.core.DatabusException): 10
File (java.io.File): 9
Test (org.testng.annotations.Test): 8
FilePositionResult (com.linkedin.databus.core.TrailFilePositionSetter.FilePositionResult): 6
GGXMLTrailTransactionFinder (com.linkedin.databus2.producers.db.GGXMLTrailTransactionFinder): 6
Logger (org.apache.log4j.Logger): 6
ArrayList (java.util.ArrayList): 5
DatabusRuntimeException (com.linkedin.databus.core.DatabusRuntimeException): 3
HashSet (java.util.HashSet): 3
Schema (org.apache.avro.Schema): 3
QueryEvent (com.google.code.or.binlog.impl.event.QueryEvent): 2
XidEvent (com.google.code.or.binlog.impl.event.XidEvent): 2
DbusEventBufferAppendable (com.linkedin.databus.core.DbusEventBufferAppendable): 2
TransactionInfo (com.linkedin.databus.monitoring.mbean.GGParserStatistics.TransactionInfo): 2
TransactionState (com.linkedin.databus2.ggParser.XmlStateMachine.TransactionState): 2
PhysicalSourceStaticConfig (com.linkedin.databus2.relay.config.PhysicalSourceStaticConfig): 2
VersionedSchema (com.linkedin.databus2.schemas.VersionedSchema): 2
GenericRecord (org.apache.avro.generic.GenericRecord): 2
BinlogEventV4 (com.google.code.or.binlog.BinlogEventV4): 1
AbstractBinlogEventV4 (com.google.code.or.binlog.impl.event.AbstractBinlogEventV4): 1