Example 1 with Watermark

Use of org.apache.gobblin.source.extractor.Watermark in the project incubator-gobblin by Apache.

In class ReplicaHadoopFsEndPoint, method getWatermark.

@Override
public synchronized Optional<ComparableWatermark> getWatermark() {
    // The watermark file is read at most once; later calls return the cached value.
    if (this.watermarkInitialized) {
        return this.cachedWatermark;
    }
    // Marked as initialized before reading, so a failed read is not retried.
    this.watermarkInitialized = true;
    try {
        Path metaData = new Path(rc.getPath(), WATERMARK_FILE);
        FileSystem fs = FileSystem.get(rc.getFsURI(), new Configuration());
        if (fs.exists(metaData)) {
            try (FSDataInputStream fin = fs.open(metaData)) {
                InputStreamReader reader = new InputStreamReader(fin, Charsets.UTF_8);
                String content = CharStreams.toString(reader);
                Watermark w = WatermarkMetadataUtil.deserialize(content);
                // Only comparable watermarks are cached; any other type leaves the cache unchanged.
                if (w instanceof ComparableWatermark) {
                    this.cachedWatermark = Optional.of((ComparableWatermark) w);
                }
            }
            return this.cachedWatermark;
        }
        // For a replica, the file timestamp cannot be used because it differs from the original source timestamp.
        return this.cachedWatermark;
    } catch (IOException e) {
        log.warn("Cannot read " + WATERMARK_FILE + " for replica " + this, e);
        return this.cachedWatermark;
    } catch (WatermarkMetadataUtil.WatermarkMetadataMulFormatException e) {
        log.warn("Cannot create watermark from " + WATERMARK_FILE + " for replica " + this, e);
        return this.cachedWatermark;
    }
}
Also used: Path (org.apache.hadoop.fs.Path), ComparableWatermark (org.apache.gobblin.source.extractor.ComparableWatermark), Configuration (org.apache.hadoop.conf.Configuration), InputStreamReader (java.io.InputStreamReader), FileSystem (org.apache.hadoop.fs.FileSystem), FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream), IOException (java.io.IOException), Watermark (org.apache.gobblin.source.extractor.Watermark)
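
The read-once caching used by getWatermark can be tried without a Hadoop cluster. The following is a minimal sketch of that pattern only, not the Gobblin implementation: the class CachedWatermarkReader, its fileReads counter, and the plain numeric file format are hypothetical, and it swaps in java.nio.file and java.util.Optional for the Hadoop FileSystem and the Optional type used in the original.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/**
 * Hypothetical stand-in for the endpoint: reads a numeric watermark
 * from a local file at most once and caches the result.
 */
class CachedWatermarkReader {
    private final Path watermarkFile;
    private boolean initialized = false;
    private Optional<Long> cached = Optional.empty();
    private int fileReads = 0; // illustration only: counts how often the file was read

    CachedWatermarkReader(Path watermarkFile) {
        this.watermarkFile = watermarkFile;
    }

    synchronized Optional<Long> getWatermark() {
        // Later calls return the cached value without touching the file.
        if (initialized) {
            return cached;
        }
        // Set before reading, so a failed read is not retried (as in the example above).
        initialized = true;
        try {
            if (Files.exists(watermarkFile)) {
                fileReads++;
                byte[] bytes = Files.readAllBytes(watermarkFile);
                String content = new String(bytes, StandardCharsets.UTF_8).trim();
                // Assumed format: a single decimal number.
                cached = Optional.of(Long.parseLong(content));
            }
        } catch (IOException | NumberFormatException e) {
            // Unreadable or malformed file: keep the empty watermark.
        }
        return cached;
    }

    synchronized int fileReads() {
        return fileReads;
    }
}
```

Calling getWatermark() twice on the same instance reads the file only once, and a change to the file after the first call is not picked up. That is the trade-off of this design: cheap repeated calls in exchange for a value that stays fixed for the lifetime of the object.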

Aggregations

IOException (java.io.IOException): 1
InputStreamReader (java.io.InputStreamReader): 1
ComparableWatermark (org.apache.gobblin.source.extractor.ComparableWatermark): 1
Watermark (org.apache.gobblin.source.extractor.Watermark): 1
Configuration (org.apache.hadoop.conf.Configuration): 1
FSDataInputStream (org.apache.hadoop.fs.FSDataInputStream): 1
FileSystem (org.apache.hadoop.fs.FileSystem): 1
Path (org.apache.hadoop.fs.Path): 1