
Example 41 with VisibleForTesting

use of org.apache.flink.annotation.VisibleForTesting in project flink by apache.

In the class KinesisDataFetcher, the method emitWatermark:

/**
 * Called periodically to emit a watermark. Checks all shards for the current event time
 * watermark, and possibly emits the next watermark.
 *
 * <p>Shards that have not received an update for a certain interval are considered inactive so
 * as to not hold back the watermark indefinitely. When all shards are inactive, the subtask
 * will be marked as temporarily idle to not block downstream operators.
 */
@VisibleForTesting
protected void emitWatermark() {
    LOG.debug("Evaluating watermark for subtask {} time {}", indexOfThisConsumerSubtask, getCurrentTimeMillis());
    long potentialWatermark = Long.MAX_VALUE;
    long potentialNextWatermark = Long.MAX_VALUE;
    long idleTime = (shardIdleIntervalMillis > 0) ? getCurrentTimeMillis() - shardIdleIntervalMillis : Long.MAX_VALUE;
    for (Map.Entry<Integer, ShardWatermarkState> e : shardWatermarks.entrySet()) {
        Watermark w = e.getValue().lastEmittedRecordWatermark;
        // consider only active shards, or those that would advance the watermark
        if (w != null && (e.getValue().lastUpdated >= idleTime || e.getValue().emitQueue.getSize() > 0 || w.getTimestamp() > lastWatermark)) {
            potentialWatermark = Math.min(potentialWatermark, w.getTimestamp());
            // For watermark synchronization, use the watermark of the next record when available;
            // otherwise the watermark may stall while a record is blocked by synchronization.
            RecordEmitter.RecordQueue<RecordWrapper<T>> q = e.getValue().emitQueue;
            RecordWrapper<T> nextRecord = q.peek();
            Watermark nextWatermark = (nextRecord != null) ? nextRecord.watermark : w;
            potentialNextWatermark = Math.min(potentialNextWatermark, nextWatermark.getTimestamp());
        }
    }
    // advance watermark if possible (watermarks can only be ascending)
    if (potentialWatermark == Long.MAX_VALUE) {
        if (shardWatermarks.isEmpty() || shardIdleIntervalMillis > 0) {
            LOG.info("No active shard for subtask {}, marking the source idle.", indexOfThisConsumerSubtask);
            // no active shard, signal downstream operators to not wait for a watermark
            sourceContext.markAsTemporarilyIdle();
            isIdle = true;
        }
    } else {
        if (potentialWatermark > lastWatermark) {
            LOG.debug("Emitting watermark {} from subtask {}", potentialWatermark, indexOfThisConsumerSubtask);
            sourceContext.emitWatermark(new Watermark(potentialWatermark));
            lastWatermark = potentialWatermark;
            isIdle = false;
        }
        nextWatermark = potentialNextWatermark;
    }
}
Also used : AtomicInteger(java.util.concurrent.atomic.AtomicInteger) Map(java.util.Map) ConcurrentHashMap(java.util.concurrent.ConcurrentHashMap) HashMap(java.util.HashMap) Watermark(org.apache.flink.streaming.api.watermark.Watermark) RecordEmitter(org.apache.flink.streaming.connectors.kinesis.util.RecordEmitter) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting)
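
The heart of emitWatermark() is the minimum taken over the last-seen watermarks of all shards that still count as active. The stripped-down sketch below isolates that computation; it is illustrative only, and none of its names (ShardWatermarkMath, minActiveWatermark) are Flink API:

import java.util.Map;

final class ShardWatermarkMath {

    /**
     * Returns the smallest last-seen timestamp across shards considered
     * active, or Long.MAX_VALUE when every shard is idle, in which case
     * the caller can mark the source temporarily idle, mirroring the
     * markAsTemporarilyIdle() branch above.
     */
    static long minActiveWatermark(
            Map<String, Long> lastTimestampPerShard,
            Map<String, Long> lastUpdatedPerShard,
            long idleCutoffMillis) {
        long min = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : lastTimestampPerShard.entrySet()) {
            Long lastUpdated = lastUpdatedPerShard.get(e.getKey());
            // only shards updated since the idle cutoff hold back the watermark
            if (lastUpdated != null && lastUpdated >= idleCutoffMillis) {
                min = Math.min(min, e.getValue());
            }
        }
        return min;
    }
}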

Example 42 with VisibleForTesting

use of org.apache.flink.annotation.VisibleForTesting in project flink by apache.

In the class StateBootstrapTransformation, the method getConfig:

@VisibleForTesting
StreamConfig getConfig(OperatorID operatorID, StateBackend stateBackend, Configuration additionalConfig, StreamOperator<TaggedOperatorSubtaskState> operator) {
    // Eagerly perform a deep copy of the configuration, otherwise it will result in undefined
    // behavior when deploying with multiple bootstrap transformations.
    Configuration deepCopy = new Configuration(MutableConfig.of(stream.getExecutionEnvironment().getConfiguration()));
    deepCopy.addAll(additionalConfig);
    final StreamConfig config = new StreamConfig(deepCopy);
    config.setChainStart();
    config.setCheckpointingEnabled(true);
    config.setCheckpointMode(CheckpointingMode.EXACTLY_ONCE);
    if (keyType != null) {
        TypeSerializer<?> keySerializer = keyType.createSerializer(stream.getExecutionEnvironment().getConfig());
        config.setStateKeySerializer(keySerializer);
        config.setStatePartitioner(0, keySelector);
    }
    config.setStreamOperator(operator);
    config.setOperatorName(operatorID.toHexString());
    config.setOperatorID(operatorID);
    config.setStateBackend(stateBackend);
    // Disable the changelog state backend wrapper, i.e. leave this stateBackend unwrapped.
    config.setChangelogStateBackendEnabled(TernaryBoolean.FALSE);
    config.setManagedMemoryFractionOperatorOfUseCase(ManagedMemoryUseCase.STATE_BACKEND, 1.0);
    return config;
}
Also used : Configuration(org.apache.flink.configuration.Configuration) StreamConfig(org.apache.flink.streaming.api.graph.StreamConfig) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting)
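
The comment about undefined behavior hinges on copying the configuration before mutating it. Below is a minimal, hedged demonstration of that deep-copy idea using only org.apache.flink.configuration.Configuration and its copy constructor; the DeepCopyDemo class and the key it sets are made up for illustration:

import org.apache.flink.configuration.Configuration;

public class DeepCopyDemo {
    public static void main(String[] args) {
        Configuration shared = new Configuration();
        shared.setString("pipeline.name", "bootstrap");

        // Copy first, then mutate: the shared configuration is untouched,
        // so other bootstrap transformations see no side effects.
        Configuration deepCopy = new Configuration(shared);
        deepCopy.setString("pipeline.name", "bootstrap-transformation-1");

        System.out.println(shared.getString("pipeline.name", "?"));   // prints: bootstrap
        System.out.println(deepCopy.getString("pipeline.name", "?")); // prints: bootstrap-transformation-1
    }
}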

Example 43 with VisibleForTesting

use of org.apache.flink.annotation.VisibleForTesting in project flink by apache.

In the class HiveTableFileInputFormat, the method toHadoopFileSplit:

@VisibleForTesting
static FileSplit toHadoopFileSplit(FileInputSplit fileSplit) throws IOException {
    URI uri = fileSplit.getPath().toUri();
    long length = fileSplit.getLength();
    // Hadoop FileSplit should not have -1 length.
    if (length == -1) {
        length = fileSplit.getPath().getFileSystem().getFileStatus(fileSplit.getPath()).getLen() - fileSplit.getStart();
    }
    return new FileSplit(new Path(uri), fileSplit.getStart(), length, (String[]) null);
}
Also used : Path(org.apache.hadoop.fs.Path) FileSplit(org.apache.hadoop.mapred.FileSplit) URI(java.net.URI) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting)
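
Since toHadoopFileSplit is package-private, a unit test placed in the same package can call it directly; that access pattern is exactly what @VisibleForTesting signals. A hedged JUnit 4 sketch follows; the test class name is hypothetical, and the explicit length of 1024 avoids the -1 branch so no file system access is needed:

import org.apache.flink.core.fs.FileInputSplit;
import org.apache.flink.core.fs.Path;
import org.apache.hadoop.mapred.FileSplit;
import org.junit.Test;

import static org.junit.Assert.assertEquals;

public class HiveTableFileInputFormatTest {

    @Test
    public void convertsSplitWithKnownLength() throws Exception {
        FileInputSplit flinkSplit =
                new FileInputSplit(0, new Path("file:///tmp/data.orc"), 0L, 1024L, null);

        FileSplit hadoopSplit = HiveTableFileInputFormat.toHadoopFileSplit(flinkSplit);

        assertEquals(0L, hadoopSplit.getStart());
        assertEquals(1024L, hadoopSplit.getLength());
    }
}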

Example 44 with VisibleForTesting

use of org.apache.flink.annotation.VisibleForTesting in project flink by apache.

In the class HiveCatalog, the method createHiveConf:

@VisibleForTesting
static HiveConf createHiveConf(@Nullable String hiveConfDir, @Nullable String hadoopConfDir) {
    // create HiveConf from hadoop configuration with hadoop conf directory configured.
    Configuration hadoopConf = null;
    if (isNullOrWhitespaceOnly(hadoopConfDir)) {
        for (String possibleHadoopConfPath : HadoopUtils.possibleHadoopConfPaths(new org.apache.flink.configuration.Configuration())) {
            hadoopConf = getHadoopConfiguration(possibleHadoopConfPath);
            if (hadoopConf != null) {
                break;
            }
        }
    } else {
        hadoopConf = getHadoopConfiguration(hadoopConfDir);
        if (hadoopConf == null) {
            String possibleUsedConfFiles = "core-site.xml | hdfs-site.xml | yarn-site.xml | mapred-site.xml";
            throw new CatalogException("Failed to load the hadoop conf from specified path:" + hadoopConfDir, new FileNotFoundException("Please check the path; none of the conf files (" + possibleUsedConfFiles + ") exist in the folder."));
        }
    }
    if (hadoopConf == null) {
        hadoopConf = new Configuration();
    }
    // ignore all the static conf file URLs that HiveConf may have set
    HiveConf.setHiveSiteLocation(null);
    HiveConf.setLoadMetastoreConfig(false);
    HiveConf.setLoadHiveServer2Config(false);
    HiveConf hiveConf = new HiveConf(hadoopConf, HiveConf.class);
    LOG.info("Setting hive conf dir as {}", hiveConfDir);
    if (hiveConfDir != null) {
        Path hiveSite = new Path(hiveConfDir, HIVE_SITE_FILE);
        if (!hiveSite.toUri().isAbsolute()) {
            // treat relative URI as local file to be compatible with previous behavior
            hiveSite = new Path(new File(hiveSite.toString()).toURI());
        }
        try (InputStream inputStream = hiveSite.getFileSystem(hadoopConf).open(hiveSite)) {
            hiveConf.addResource(inputStream, hiveSite.toString());
            // read a property from the conf so that the input stream is actually consumed
            isEmbeddedMetastore(hiveConf);
        } catch (IOException e) {
            throw new CatalogException("Failed to load hive-site.xml from specified path:" + hiveSite, e);
        }
    } else {
        // the user didn't provide a hive conf dir; try to find hive-site.xml on the classpath
        URL hiveSite = Thread.currentThread().getContextClassLoader().getResource(HIVE_SITE_FILE);
        if (hiveSite != null) {
            LOG.info("Found {} in classpath: {}", HIVE_SITE_FILE, hiveSite);
            hiveConf.addResource(hiveSite);
        }
    }
    return hiveConf;
}
Also used : Path(org.apache.hadoop.fs.Path) ObjectPath(org.apache.flink.table.catalog.ObjectPath) Configuration(org.apache.hadoop.conf.Configuration) HiveTableUtil.getHadoopConfiguration(org.apache.flink.table.catalog.hive.util.HiveTableUtil.getHadoopConfiguration) InputStream(java.io.InputStream) CatalogException(org.apache.flink.table.catalog.exceptions.CatalogException) FileNotFoundException(java.io.FileNotFoundException) IOException(java.io.IOException) URL(java.net.URL) HiveConf(org.apache.hadoop.hive.conf.HiveConf) File(java.io.File) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting)
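
A test in the same package can exercise the fallback path by passing null for both directories, in which case the method builds its HiveConf from a fresh hadoop Configuration (or from whatever hive-site.xml happens to be on the test classpath). A hedged sketch, assuming flink-connector-hive and Hive are available to the test:

import org.apache.hadoop.hive.conf.HiveConf;
import org.junit.Test;

import static org.junit.Assert.assertNotNull;

public class HiveCatalogHiveConfTest {

    @Test
    public void createsConfWithoutExplicitDirs() {
        // both dirs null: falls through to a fresh hadoop Configuration
        HiveConf conf = HiveCatalog.createHiveConf(null, null);
        assertNotNull(conf);
    }
}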

Example 45 with VisibleForTesting

use of org.apache.flink.annotation.VisibleForTesting in project flink by apache.

In the class TaskMailboxImpl, the method size:

@VisibleForTesting
public int size() {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        return batch.size() + queue.size();
    } finally {
        lock.unlock();
    }
}
Also used : ReentrantLock(java.util.concurrent.locks.ReentrantLock) VisibleForTesting(org.apache.flink.annotation.VisibleForTesting)
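
Because size() is public but annotated @VisibleForTesting, production code is not meant to call it; tests are. A hedged JUnit 4 sketch in the spirit of Flink's own mailbox tests; the Mail constructor shown matches recent Flink versions but may differ in others:

import org.apache.flink.streaming.runtime.tasks.mailbox.Mail;
import org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailbox;
import org.apache.flink.streaming.runtime.tasks.mailbox.TaskMailboxImpl;
import org.junit.Test;

import static org.junit.Assert.assertEquals;

public class TaskMailboxSizeTest {

    @Test
    public void sizeCountsQueuedMail() {
        TaskMailboxImpl mailbox = new TaskMailboxImpl();
        assertEquals(0, mailbox.size());

        // a no-op mail with minimum priority
        mailbox.put(new Mail(() -> {}, TaskMailbox.MIN_PRIORITY, "no-op"));
        assertEquals(1, mailbox.size());
    }
}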

Aggregations

VisibleForTesting (org.apache.flink.annotation.VisibleForTesting): 64
HashMap (java.util.HashMap): 11
IOException (java.io.IOException): 8
ArrayList (java.util.ArrayList): 7
Configuration (org.apache.flink.configuration.Configuration): 7
Map (java.util.Map): 6
File (java.io.File): 5
URI (java.net.URI): 4
List (java.util.List): 4
Tuple2 (org.apache.flink.api.java.tuple.Tuple2): 4
Field (java.lang.reflect.Field): 3
Set (java.util.Set): 3
Nullable (javax.annotation.Nullable): 3
ByteArrayOutputStream (java.io.ByteArrayOutputStream): 2
InputStream (java.io.InputStream): 2
Path (java.nio.file.Path): 2
ConcurrentHashMap (java.util.concurrent.ConcurrentHashMap): 2
Matcher (java.util.regex.Matcher): 2
MetricGroup (org.apache.flink.metrics.MetricGroup): 2
ExecutionJobVertex (org.apache.flink.runtime.executiongraph.ExecutionJobVertex): 2