Search in sources :

Example 6 with DimensionValue

use of io.cdap.cdap.api.dataset.lib.cube.DimensionValue in project cdap by cdapio.

the class DefaultCube method query.

@Override
public Collection<TimeSeries> query(CubeQuery query) {
    /*
      CubeQuery example: "dataset read ops for app per dataset". Or:

      SELECT count('read.ops')                                           << measure name and type
      FROM aggregation1.1min_resolution                                  << aggregation and resolution
      GROUP BY dataset,                                                  << groupByDimensions
      WHERE namespace='ns1' AND app='myApp' AND program='myFlow' AND     << dimensionValues
            ts>=1423370200 AND ts{@literal<}1423398198                   << startTs and endTs
      LIMIT 100                                                          << limit

      Execution:

      1) (optional, if aggregation to query in is not provided) find aggregation to supply results

      Here, we need aggregation that has following dimensions: 'namespace', 'app', 'program', 'dataset'.

      Ideally (to reduce the scan range), 'dataset' should be in the end, other dimensions as close to the beginning
      as possible, and minimal number of other "unspecified" dimensions.

      Let's say we found aggregation: 'namespace', 'app', 'program', 'instance', 'dataset'

      2) build a scan in the aggregation

      For scan we set "any" into the dimension values that aggregation has but query doesn't define value for:

      'namespace'='ns1', 'app'='myApp', 'program'='myFlow', 'instance'=*, 'dataset'=*

      Plus specified measure & aggregation?:

      'measureName'='read.ops'
      'measureType'='COUNTER'

      3) While scanning build a table: dimension values -> time -> value. Use measureType as values aggregate
         function if needed.
    */
    incrementMetric("cube.query.request.count", 1);
    if (!resolutionToFactTable.containsKey(query.getResolution())) {
        incrementMetric("cube.query.request.failure.count", 1);
        throw new IllegalArgumentException("There's no data aggregated for specified resolution to satisfy the query: " + query.toString());
    }
    // 1) find aggregation to query
    Aggregation agg;
    String aggName;
    if (query.getAggregation() != null) {
        aggName = query.getAggregation();
        agg = aggregations.get(query.getAggregation());
        if (agg == null) {
            incrementMetric("cube.query.request.failure.count", 1);
            throw new IllegalArgumentException(String.format("Specified aggregation %s is not found in cube aggregations: %s", query.getAggregation(), aggregations.keySet().toString()));
        }
    } else {
        ImmutablePair<String, Aggregation> aggregation = findAggregation(query);
        if (aggregation == null) {
            incrementMetric("cube.query.request.failure.count", 1);
            throw new IllegalArgumentException("There's no data aggregated for specified dimensions " + "to satisfy the query: " + query.toString());
        }
        agg = aggregation.getSecond();
        aggName = aggregation.getFirst();
    }
    // tell how many queries end up querying specific pre-aggregated views and resolutions
    incrementMetric("cube.query.agg." + aggName + ".count", 1);
    incrementMetric("cube.query.res." + query.getResolution() + ".count", 1);
    // 2) build a scan for a query
    List<DimensionValue> dimensionValues = Lists.newArrayList();
    for (String dimensionName : agg.getDimensionNames()) {
        // if not defined in query, will be set as null, which means "any"
        dimensionValues.add(new DimensionValue(dimensionName, query.getDimensionValues().get(dimensionName)));
    }
    FactScan scan = new FactScan(query.getStartTs(), query.getEndTs(), query.getMeasurements().keySet(), dimensionValues);
    // 3) execute scan query
    FactTable table = resolutionToFactTable.get(query.getResolution());
    FactScanner scanner = table.scan(scan);
    Table<Map<String, String>, String, Map<Long, Long>> resultMap = getTimeSeries(query, scanner);
    incrementMetric("cube.query.request.success.count", 1);
    incrementMetric("cube.query.result.size", resultMap.size());
    Collection<TimeSeries> timeSeries = convertToQueryResult(query, resultMap);
    incrementMetric("cube.query.result.timeseries.count", timeSeries.size());
    return timeSeries;
}
Also used : FactScan(io.cdap.cdap.data2.dataset2.lib.timeseries.FactScan) TimeSeries(io.cdap.cdap.api.dataset.lib.cube.TimeSeries) FactScanner(io.cdap.cdap.data2.dataset2.lib.timeseries.FactScanner) FactTable(io.cdap.cdap.data2.dataset2.lib.timeseries.FactTable) DimensionValue(io.cdap.cdap.api.dataset.lib.cube.DimensionValue) Map(java.util.Map) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap)

Example 7 with DimensionValue

use of io.cdap.cdap.api.dataset.lib.cube.DimensionValue in project cdap by cdapio.

the class DefaultCube method findDimensionValues.

@Override
public Collection<DimensionValue> findDimensionValues(CubeExploreQuery query) {
    LOG.trace("Searching for next-level context, query: {}", query);
    // In each aggregation that matches given dimensions, try to fill in value in a single null-valued given dimension.
    // NOTE: that we try to fill in first value that is non-null-valued in a stored record
    // (see FactTable#findSingleDimensionValue)
    SortedSet<DimensionValue> result = Sets.newTreeSet(DIMENSION_VALUE_COMPARATOR);
    // todo: the passed query should have map instead
    LinkedHashMap<String, String> slice = Maps.newLinkedHashMap();
    for (DimensionValue dimensionValue : query.getDimensionValues()) {
        slice.put(dimensionValue.getName(), dimensionValue.getValue());
    }
    FactTable table = resolutionToFactTable.get(query.getResolution());
    for (Aggregation agg : aggregations.values()) {
        if (agg.getDimensionNames().containsAll(slice.keySet())) {
            result.addAll(table.findSingleDimensionValue(agg.getDimensionNames(), slice, query.getStartTs(), query.getEndTs()));
        }
    }
    return result;
}
Also used : FactTable(io.cdap.cdap.data2.dataset2.lib.timeseries.FactTable) DimensionValue(io.cdap.cdap.api.dataset.lib.cube.DimensionValue)

Example 8 with DimensionValue

use of io.cdap.cdap.api.dataset.lib.cube.DimensionValue in project cdap by cdapio.

the class DefaultCube method add.

@Override
public void add(Collection<? extends CubeFact> facts) {
    List<Fact> toWrite = Lists.newArrayList();
    int dimValuesCount = 0;
    for (CubeFact fact : facts) {
        for (Map.Entry<String, ? extends Aggregation> aggEntry : aggregations.entrySet()) {
            Aggregation agg = aggEntry.getValue();
            AggregationAlias aggregationAlias = null;
            if (aggregationAliasMap.containsKey(aggEntry.getKey())) {
                aggregationAlias = aggregationAliasMap.get(aggEntry.getKey());
            }
            if (agg.accept(fact)) {
                List<DimensionValue> dimensionValues = Lists.newArrayList();
                for (String dimensionName : agg.getDimensionNames()) {
                    String dimensionValueKey = aggregationAlias == null ? dimensionName : aggregationAlias.getAlias(dimensionName);
                    dimensionValues.add(new DimensionValue(dimensionName, fact.getDimensionValues().get(dimensionValueKey)));
                    dimValuesCount++;
                }
                toWrite.add(new Fact(fact.getTimestamp(), dimensionValues, fact.getMeasurements()));
            }
        }
    }
    Map<Integer, Future<?>> futures = new HashMap<>();
    for (Map.Entry<Integer, FactTable> table : resolutionToFactTable.entrySet()) {
        futures.put(table.getKey(), executorService.submit(() -> table.getValue().add(toWrite)));
    }
    boolean failed = false;
    Exception failedException = null;
    StringBuilder failedMessage = new StringBuilder("Failed to add metrics to ");
    for (Map.Entry<Integer, Future<?>> future : futures.entrySet()) {
        try {
            Uninterruptibles.getUninterruptibly(future.getValue());
        } catch (ExecutionException e) {
            if (!failed) {
                failed = true;
                failedMessage.append(String.format("the %d resolution table", future.getKey()));
            } else {
                failedMessage.append(String.format(", the %d resolution table", future.getKey()));
            }
            if (failedException == null) {
                failedException = e;
            } else {
                failedException.addSuppressed(e);
            }
        }
    }
    if (failed) {
        throw new RuntimeException(failedMessage.append(".").toString(), failedException);
    }
    incrementMetric("cube.cubeFact.add.request.count", 1);
    incrementMetric("cube.cubeFact.added.count", facts.size());
    incrementMetric("cube.tsFact.created.count", toWrite.size());
    incrementMetric("cube.tsFact.created.dimValues.count", dimValuesCount);
    incrementMetric("cube.tsFact.added.count", toWrite.size() * resolutionToFactTable.size());
}
Also used : HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap) Fact(io.cdap.cdap.data2.dataset2.lib.timeseries.Fact) CubeFact(io.cdap.cdap.api.dataset.lib.cube.CubeFact) IOException(java.io.IOException) ExecutionException(java.util.concurrent.ExecutionException) CubeFact(io.cdap.cdap.api.dataset.lib.cube.CubeFact) FactTable(io.cdap.cdap.data2.dataset2.lib.timeseries.FactTable) DimensionValue(io.cdap.cdap.api.dataset.lib.cube.DimensionValue) Future(java.util.concurrent.Future) ExecutionException(java.util.concurrent.ExecutionException) Map(java.util.Map) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap)

Example 9 with DimensionValue

use of io.cdap.cdap.api.dataset.lib.cube.DimensionValue in project cdap by cdapio.

the class TestAppWithCube method testApp.

@Category(SlowTests.class)
@Test
public void testApp() throws Exception {
    // Deploy the application
    ApplicationManager appManager = deployApplication(AppWithCube.class);
    ServiceManager serviceManager = appManager.getServiceManager(AppWithCube.SERVICE_NAME).start();
    try {
        serviceManager.waitForRun(ProgramRunStatus.RUNNING, 10, TimeUnit.SECONDS);
        URL url = serviceManager.getServiceURL();
        long tsInSec = System.currentTimeMillis() / 1000;
        // round to a minute for testing minute resolution
        tsInSec = (tsInSec / 60) * 60;
        // add couple facts
        add(url, ImmutableList.of(new CubeFact(tsInSec).addDimensionValue("user", "alex").addDimensionValue("action", "click").addMeasurement("count", MeasureType.COUNTER, 1)));
        add(url, ImmutableList.of(new CubeFact(tsInSec).addDimensionValue("user", "alex").addDimensionValue("action", "click").addMeasurement("count", MeasureType.COUNTER, 1), new CubeFact(tsInSec + 1).addDimensionValue("user", "alex").addDimensionValue("action", "back").addMeasurement("count", MeasureType.COUNTER, 1), new CubeFact(tsInSec + 2).addDimensionValue("user", "alex").addDimensionValue("action", "click").addMeasurement("count", MeasureType.COUNTER, 1)));
        // search for tags
        Collection<DimensionValue> tags = searchDimensionValue(url, new CubeExploreQuery(tsInSec - 60, tsInSec + 60, 1, 100, new ArrayList<DimensionValue>()));
        Assert.assertEquals(1, tags.size());
        DimensionValue tv = tags.iterator().next();
        Assert.assertEquals("user", tv.getName());
        Assert.assertEquals("alex", tv.getValue());
        tags = searchDimensionValue(url, CubeExploreQuery.builder().from().resolution(1, TimeUnit.SECONDS).where().dimension("user", "alex").timeRange(tsInSec - 60, tsInSec + 60).limit(100).build());
        Assert.assertEquals(2, tags.size());
        Iterator<DimensionValue> iterator = tags.iterator();
        tv = iterator.next();
        Assert.assertEquals("action", tv.getName());
        Assert.assertEquals("back", tv.getValue());
        tv = iterator.next();
        Assert.assertEquals("action", tv.getName());
        Assert.assertEquals("click", tv.getValue());
        // search for measures
        Collection<String> measures = searchMeasure(url, new CubeExploreQuery(tsInSec - 60, tsInSec + 60, 1, 100, ImmutableList.of(new DimensionValue("user", "alex"))));
        Assert.assertEquals(1, measures.size());
        String measure = measures.iterator().next();
        Assert.assertEquals("count", measure);
        // query for data
        // 1-sec resolution
        Collection<TimeSeries> data = query(url, CubeQuery.builder().select().measurement("count", AggregationFunction.SUM).from(null).resolution(1, TimeUnit.SECONDS).where().dimension("action", "click").timeRange(tsInSec - 60, tsInSec + 60).limit(100).build());
        Assert.assertEquals(1, data.size());
        TimeSeries series = data.iterator().next();
        List<TimeValue> timeValues = series.getTimeValues();
        Assert.assertEquals(2, timeValues.size());
        TimeValue timeValue = timeValues.get(0);
        Assert.assertEquals(tsInSec, timeValue.getTimestamp());
        Assert.assertEquals(2, timeValue.getValue());
        timeValue = timeValues.get(1);
        Assert.assertEquals(tsInSec + 2, timeValue.getTimestamp());
        Assert.assertEquals(1, timeValue.getValue());
        // 60-sec resolution
        data = query(url, new CubeQuery(null, tsInSec - 60, tsInSec + 60, 60, 100, ImmutableMap.of("count", AggregationFunction.SUM), ImmutableMap.of("action", "click"), new ArrayList<String>(), null, null));
        Assert.assertEquals(1, data.size());
        series = data.iterator().next();
        timeValues = series.getTimeValues();
        Assert.assertEquals(1, timeValues.size());
        timeValue = timeValues.get(0);
        Assert.assertEquals(tsInSec, timeValue.getTimestamp());
        Assert.assertEquals(3, timeValue.getValue());
    } finally {
        serviceManager.stop();
        serviceManager.waitForStopped(10, TimeUnit.SECONDS);
    }
}
Also used : ApplicationManager(io.cdap.cdap.test.ApplicationManager) TimeSeries(io.cdap.cdap.api.dataset.lib.cube.TimeSeries) ArrayList(java.util.ArrayList) CubeQuery(io.cdap.cdap.api.dataset.lib.cube.CubeQuery) URL(java.net.URL) CubeFact(io.cdap.cdap.api.dataset.lib.cube.CubeFact) DimensionValue(io.cdap.cdap.api.dataset.lib.cube.DimensionValue) ServiceManager(io.cdap.cdap.test.ServiceManager) CubeExploreQuery(io.cdap.cdap.api.dataset.lib.cube.CubeExploreQuery) TimeValue(io.cdap.cdap.api.dataset.lib.cube.TimeValue) Category(org.junit.experimental.categories.Category) Test(org.junit.Test)

Example 10 with DimensionValue

use of io.cdap.cdap.api.dataset.lib.cube.DimensionValue in project cdap by cdapio.

the class TestAppWithCube method searchDimensionValue.

private Collection<DimensionValue> searchDimensionValue(URL serviceUrl, CubeExploreQuery query) throws IOException {
    URL url = new URL(serviceUrl, "searchDimensionValue");
    HttpRequest request = HttpRequest.post(url).withBody(GSON.toJson(query)).build();
    HttpResponse response = executeHttp(request);
    Assert.assertEquals(200, response.getResponseCode());
    return GSON.fromJson(response.getResponseBodyAsString(), new TypeToken<Collection<DimensionValue>>() {
    }.getType());
}
Also used : HttpRequest(io.cdap.common.http.HttpRequest) DimensionValue(io.cdap.cdap.api.dataset.lib.cube.DimensionValue) TypeToken(com.google.gson.reflect.TypeToken) HttpResponse(io.cdap.common.http.HttpResponse) URL(java.net.URL)

Aggregations

DimensionValue (io.cdap.cdap.api.dataset.lib.cube.DimensionValue)40 Test (org.junit.Test)12 InMemoryMetricsTable (io.cdap.cdap.data2.dataset2.lib.table.inmemory.InMemoryMetricsTable)10 FactTable (io.cdap.cdap.data2.dataset2.lib.timeseries.FactTable)10 TimeValue (io.cdap.cdap.api.dataset.lib.cube.TimeValue)8 Row (io.cdap.cdap.api.dataset.table.Row)8 ArrayList (java.util.ArrayList)8 Map (java.util.Map)8 Measurement (io.cdap.cdap.api.dataset.lib.cube.Measurement)6 Scanner (io.cdap.cdap.api.dataset.table.Scanner)6 FuzzyRowFilter (io.cdap.cdap.data2.dataset2.lib.table.FuzzyRowFilter)6 HashMap (java.util.HashMap)6 LinkedHashMap (java.util.LinkedHashMap)6 CubeFact (io.cdap.cdap.api.dataset.lib.cube.CubeFact)4 TimeSeries (io.cdap.cdap.api.dataset.lib.cube.TimeSeries)4 FactScan (io.cdap.cdap.data2.dataset2.lib.timeseries.FactScan)4 URL (java.net.URL)4 AbstractIterator (com.google.common.collect.AbstractIterator)2 ImmutableList (com.google.common.collect.ImmutableList)2 TypeToken (com.google.gson.reflect.TypeToken)2