
Example 1 with DataCompletenessConfigDTO

use of com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO in project pinot by linkedin.

the class Wo4WAvgDataCompletenessAlgorithm method getBaselineCounts.

@Override
public List<Long> getBaselineCounts(String dataset, Long bucketValue) {
    long weekInMillis = TimeUnit.MILLISECONDS.convert(7, TimeUnit.DAYS);
    long baselineInMS = bucketValue;
    List<Long> baselineCounts = new ArrayList<>();
    for (int i = 0; i < 4; i++) {
        long count = 0;
        baselineInMS = baselineInMS - weekInMillis;
        DataCompletenessConfigDTO config = dataCompletenessConfigDAO.findByDatasetAndDateMS(dataset, baselineInMS);
        if (config != null) {
            count = config.getCountStar();
        }
        baselineCounts.add(count);
    }
    return baselineCounts;
}
Also used : DataCompletenessConfigDTO(com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO) ArrayList(java.util.ArrayList)
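
The loop above steps back one week at a time for four weeks and records a count of 0 whenever no entry exists for that week. A minimal, dependency-free sketch of the same idea follows. The `Map` stands in for the DAO lookup, and the `average` helper is an assumption suggested only by the class name (Wo4WAvg); neither is taken from the ThirdEye code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;

class BaselineSketch {
    // countsByBucketMs is a hypothetical stand-in for the DAO: bucket timestamp (ms) -> row count.
    static List<Long> baselineCounts(Map<Long, Long> countsByBucketMs, long bucketValue) {
        long weekInMillis = TimeUnit.DAYS.toMillis(7);
        List<Long> counts = new ArrayList<>();
        long baseline = bucketValue;
        for (int i = 0; i < 4; i++) {
            baseline -= weekInMillis;
            // A missing week contributes 0, mirroring the null check in the example above.
            counts.add(countsByBucketMs.getOrDefault(baseline, 0L));
        }
        return counts;
    }

    // Assumed follow-up step: plain mean of the four weekly counts.
    static double average(List<Long> counts) {
        return counts.stream().mapToLong(Long::longValue).average().orElse(0.0);
    }
}
```

Because missing weeks are recorded as 0 rather than skipped, they pull any mean computed over the list downward.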

Example 2 with DataCompletenessConfigDTO

use of com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO in project pinot by linkedin.

the class DataCompletenessTaskRunner method createEntriesInDatabaseIfNotPresent.

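/**
   * Creates the current bucket entries in the table, if they are not already present
   * @param dataset name of the dataset
   * @param bucketNameToBucketValueMS map of bucket name (dateToCheckInSDF) to bucket value in millis (dateToCheckInMS)
   * @return the number of entries created
   */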
private int createEntriesInDatabaseIfNotPresent(String dataset, Map<String, Long> bucketNameToBucketValueMS) {
    int numEntriesCreated = 0;
    for (Entry<String, Long> entry : bucketNameToBucketValueMS.entrySet()) {
        String bucketName = entry.getKey();
        Long bucketValue = entry.getValue();
        DataCompletenessConfigDTO checkOrCreateConfig = DAO_REGISTRY.getDataCompletenessConfigDAO().findByDatasetAndDateSDF(dataset, bucketName);
        if (checkOrCreateConfig == null) {
            checkOrCreateConfig = new DataCompletenessConfigDTO();
            checkOrCreateConfig.setDataset(dataset);
            checkOrCreateConfig.setDateToCheckInSDF(bucketName);
            checkOrCreateConfig.setDateToCheckInMS(bucketValue);
            DAO_REGISTRY.getDataCompletenessConfigDAO().save(checkOrCreateConfig);
            numEntriesCreated++;
            // NOTE: Decided to not store timeValue in the DataCompletenessConfig, because one bucket can have multiple
            // timeValues (5 MINUTES bucketed into 30 MINUTES case)
        }
    }
    return numEntriesCreated;
}
Also used : DataCompletenessConfigDTO(com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO)

Example 3 with DataCompletenessConfigDTO

use of com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO in project pinot by linkedin.

the class DataCompletenessTaskRunner method getBucketsToProcess.

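/**
   * Gets all the buckets that need to be checked
   * @param dataset name of the dataset
   * @param adjustedStart start of the time range in millis (inclusive)
   * @param adjustedEnd end of the time range in millis (exclusive)
   * @param dataCompletenessAlgorithm algorithm that supplies the consider-complete-after threshold
   * @param dateTimeFormatter formatter used to derive the bucket name from the bucket start time
   * @param bucketSize bucket size in millis
   * @return map of bucket name to bucket value in millis, for buckets not yet considered complete
   */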
private Map<String, Long> getBucketsToProcess(String dataset, long adjustedStart, long adjustedEnd, DataCompletenessAlgorithm dataCompletenessAlgorithm, DateTimeFormatter dateTimeFormatter, long bucketSize) {
    // find completed buckets from database in timerange, for dataset, and percentComplete >= 95%
    // We use this call instead of checking isDataComplete because we want to keep re-checking entries
    // even after they are marked complete, in case the percentage changes.
    // Rather than re-checking every entry that is below 100%, we stop at a limit called CONSIDER_COMPLETE_AFTER
    List<DataCompletenessConfigDTO> completeEntries = DAO_REGISTRY.getDataCompletenessConfigDAO().findAllByDatasetAndInTimeRangeAndPercentCompleteGT(dataset, adjustedStart, adjustedEnd, dataCompletenessAlgorithm.getConsiderCompleteAfter());
    List<String> completeBuckets = new ArrayList<>();
    for (DataCompletenessConfigDTO entry : completeEntries) {
        completeBuckets.add(entry.getDateToCheckInSDF());
    }
    LOG.info("Data complete buckets size:{} buckets:{}", completeBuckets.size(), completeBuckets);
    // get all buckets
    // dateToCheckInSDF -> dateToCheckInMS
    Map<String, Long> bucketNameToBucketValueMS = new HashMap<>();
    while (adjustedStart < adjustedEnd) {
        String bucketName = dateTimeFormatter.print(adjustedStart);
        bucketNameToBucketValueMS.put(bucketName, adjustedStart);
        adjustedStart = adjustedStart + bucketSize;
    }
    LOG.info("All buckets size:{} buckets:{}", bucketNameToBucketValueMS.size(), bucketNameToBucketValueMS.keySet());
    // get buckets to process = all buckets - complete buckets
    bucketNameToBucketValueMS.keySet().removeAll(completeBuckets);
    LOG.info("Buckets to process = (All buckets - data complete buckets) size:{} buckets:{}", bucketNameToBucketValueMS.size(), bucketNameToBucketValueMS.keySet());
    return bucketNameToBucketValueMS;
}
Also used : DataCompletenessConfigDTO(com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO) HashMap(java.util.HashMap) ArrayList(java.util.ArrayList)
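
The method above enumerates every bucket in the half-open range [adjustedStart, adjustedEnd) and then removes the names already considered complete. The sketch below shows that set difference in isolation. It swaps in `java.time` for Joda-Time so that it runs without dependencies. The `yyyyMMddHH` pattern, the UTC zone, and the guard on `bucketSize` are illustrative assumptions, not values from the project.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

class BucketSketch {
    // Assumed bucket-name format; the real formatter is passed in by the caller.
    static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);

    static Map<String, Long> bucketsToProcess(long start, long end, long bucketSize,
                                              Collection<String> completeBuckets) {
        if (bucketSize <= 0) {
            // Added guard: a non-positive step would never reach 'end'.
            throw new IllegalArgumentException("bucketSize must be positive");
        }
        // All buckets: bucket name -> bucket start in millis, in time order.
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (long t = start; t < end; t += bucketSize) {
            buckets.put(FORMAT.format(Instant.ofEpochMilli(t)), t);
        }
        // Buckets to process = all buckets - complete buckets.
        buckets.keySet().removeAll(completeBuckets);
        return buckets;
    }
}
```

Removing through `keySet()` works because the key set is a live view of the map, so the matching entries disappear from the map itself.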

Example 4 with DataCompletenessConfigDTO

use of com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO in project pinot by linkedin.

the class DataCompletenessTaskRunner method executeCleanupTask.

/**
   * This task cleans up the database of data completeness config entries,
   * if they are older than constants CLEANUP_TIME_DURATION and CLEANUP_TIMEUNIT
   * @param dataCompletenessTaskInfo
   */
private void executeCleanupTask(DataCompletenessTaskInfo dataCompletenessTaskInfo) {
    LOG.info("Execute data completeness cleanup {}", dataCompletenessTaskInfo);
    try {
        // find all entries older than 30 days, delete them
        long cleanupOlderThanDuration = TimeUnit.MILLISECONDS.convert(DataCompletenessConstants.CLEANUP_TIME_DURATION, DataCompletenessConstants.CLEANUP_TIMEUNIT);
        long cleanupOlderThanMillis = new DateTime().minus(cleanupOlderThanDuration).getMillis();
        List<DataCompletenessConfigDTO> findAllByTimeOlderThan = DAO_REGISTRY.getDataCompletenessConfigDAO().findAllByTimeOlderThan(cleanupOlderThanMillis);
        LOG.info("Deleting {} entries older than {} i.e. {}", findAllByTimeOlderThan.size(), cleanupOlderThanMillis, new DateTime(cleanupOlderThanMillis));
        for (DataCompletenessConfigDTO config : findAllByTimeOlderThan) {
            DAO_REGISTRY.getDataCompletenessConfigDAO().delete(config);
        }
        // find all entries older than LOOKBACK and still dataComplete=false, mark timedOut
        long timeOutOlderThanDuration = TimeUnit.MILLISECONDS.convert(DataCompletenessConstants.LOOKBACK_TIME_DURATION, DataCompletenessConstants.LOOKBACK_TIMEUNIT);
        long timeOutOlderThanMillis = new DateTime().minus(timeOutOlderThanDuration).getMillis();
        List<DataCompletenessConfigDTO> findAllByTimeOlderThanAndStatus = DAO_REGISTRY.getDataCompletenessConfigDAO().findAllByTimeOlderThanAndStatus(timeOutOlderThanMillis, false);
        LOG.info("Timing out {} entries older than {} i.e. {} and still not complete", findAllByTimeOlderThanAndStatus.size(), timeOutOlderThanMillis, new DateTime(timeOutOlderThanMillis));
        for (DataCompletenessConfigDTO config : findAllByTimeOlderThanAndStatus) {
            config.setTimedOut(true);
            DAO_REGISTRY.getDataCompletenessConfigDAO().update(config);
        }
    } catch (Exception e) {
        LOG.error("Exception data completeness cleanup task", e);
    }
}
Also used : DataCompletenessConfigDTO(com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO) DateTime(org.joda.time.DateTime)
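
Both halves of the cleanup task follow the same pattern: convert a (duration, unit) pair to milliseconds, subtract it from the current time to get a cutoff, and select the entries older than that cutoff. A minimal sketch of the cutoff arithmetic follows. The helper names are hypothetical, and the strict `<` comparison is an assumption, since the query behind `findAllByTimeOlderThan` is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class CleanupSketch {
    // Cutoff = now - duration, with the duration converted to milliseconds.
    static long cutoffMillis(long nowMillis, long duration, TimeUnit unit) {
        return nowMillis - unit.toMillis(duration);
    }

    // Assumed semantics: "older than" means strictly before the cutoff.
    static List<Long> olderThan(List<Long> timestamps, long cutoff) {
        List<Long> result = new ArrayList<>();
        for (long ts : timestamps) {
            if (ts < cutoff) {
                result.add(ts);
            }
        }
        return result;
    }
}
```

Passing the current time in as a parameter, rather than reading the clock inside the helper, keeps the arithmetic easy to test.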

Example 5 with DataCompletenessConfigDTO

use of com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO in project pinot by linkedin.

the class DataCompletenessConfigManagerImpl method convertListOfBeanToDTO.

private List<DataCompletenessConfigDTO> convertListOfBeanToDTO(List<DataCompletenessConfigBean> list) {
    List<DataCompletenessConfigDTO> results = new ArrayList<>();
    for (DataCompletenessConfigBean abstractBean : list) {
        DataCompletenessConfigDTO dto = MODEL_MAPPER.map(abstractBean, DataCompletenessConfigDTO.class);
        results.add(dto);
    }
    return results;
}
Also used : DataCompletenessConfigDTO(com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO) ArrayList(java.util.ArrayList) DataCompletenessConfigBean(com.linkedin.thirdeye.datalayer.pojo.DataCompletenessConfigBean)

Aggregations

DataCompletenessConfigDTO (com.linkedin.thirdeye.datalayer.dto.DataCompletenessConfigDTO) 15
DateTime (org.joda.time.DateTime) 4
DataCompletenessConfigBean (com.linkedin.thirdeye.datalayer.pojo.DataCompletenessConfigBean) 3
ArrayList (java.util.ArrayList) 3
Test (org.testng.annotations.Test) 3
Predicate (com.linkedin.thirdeye.datalayer.util.Predicate) 2
HashMap (java.util.HashMap) 2
DecimalFormat (java.text.DecimalFormat) 1
Period (org.joda.time.Period) 1