Example 11 with Dataset

use of co.cask.cdap.api.dataset.Dataset in project cdap by caskdata.

the class SingleTypeModule method register.

@Override
public void register(DatasetDefinitionRegistry registry) {
    final Constructor ctor = findSuitableCtorOrFail(dataSetClass);
    DatasetType typeAnn = dataSetClass.getAnnotation(DatasetType.class);
    // default type name to dataset class name
    String typeName = typeAnn != null ? typeAnn.value() : dataSetClass.getName();
    // The ordering is important. It is the same order as the parameters
    final Map<String, DatasetDefinition> embeddedDefinitions = Maps.newLinkedHashMap();
    final Class<?>[] paramTypes = ctor.getParameterTypes();
    Annotation[][] paramAnns = ctor.getParameterAnnotations();
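    // Gather all dataset name and type information for the @EmbeddedDataset parameters.
    // Index 0 is skipped because the first constructor argument is the DatasetSpecification (see params.add(spec) below).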
    for (int i = 1; i < paramTypes.length; i++) {
        // Must have the EmbeddedDataset as it's the contract of the findSuitableCtorOrFail method
        EmbeddedDataset anno = Iterables.filter(Arrays.asList(paramAnns[i]), EmbeddedDataset.class).iterator().next();
        String type = anno.type();
        // default to dataset class name if dataset type name is not specified through the annotation
        if (EmbeddedDataset.DEFAULT_TYPE_NAME.equals(type)) {
            type = paramTypes[i].getName();
        }
        DatasetDefinition embeddedDefinition = registry.get(type);
        if (embeddedDefinition == null) {
            throw new IllegalStateException(String.format("Unknown Dataset type '%s', specified by parameter number %d of the %s Dataset", type, i, dataSetClass.getName()));
        }
        embeddedDefinitions.put(anno.value(), embeddedDefinition);
    }
    registry.add(new CompositeDatasetDefinition<Dataset>(typeName, embeddedDefinitions) {

        @Override
        public Dataset getDataset(DatasetContext datasetContext, DatasetSpecification spec, Map<String, String> arguments, ClassLoader classLoader) throws IOException {
            List<Object> params = new ArrayList<>();
            params.add(spec);
            for (Map.Entry<String, DatasetDefinition> entry : embeddedDefinitions.entrySet()) {
                params.add(entry.getValue().getDataset(datasetContext, spec.getSpecification(entry.getKey()), arguments, classLoader));
            }
            try {
                return (Dataset) ctor.newInstance(params.toArray());
            } catch (Exception e) {
                throw Throwables.propagate(e);
            }
        }
    });
}
Also used : EmbeddedDataset(co.cask.cdap.api.dataset.module.EmbeddedDataset) Constructor(java.lang.reflect.Constructor) Dataset(co.cask.cdap.api.dataset.Dataset) DatasetSpecification(co.cask.cdap.api.dataset.DatasetSpecification) DatasetType(co.cask.cdap.api.dataset.module.DatasetType) IOException(java.io.IOException) DatasetDefinition(co.cask.cdap.api.dataset.DatasetDefinition) CompositeDatasetDefinition(co.cask.cdap.api.dataset.lib.CompositeDatasetDefinition) ArrayList(java.util.ArrayList) List(java.util.List) DatasetContext(co.cask.cdap.api.dataset.DatasetContext)
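
The register method above relies on a LinkedHashMap so that the embedded datasets are passed to the reflective constructor in the same order as its parameters. The following is a minimal sketch of that idea using only the JDK; the Composite class, its fields, and the "index"/"data" names are hypothetical stand-ins, not part of CDAP.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderedCtorSketch {

    // Hypothetical stand-in for a composite dataset: a spec followed by two embedded parts.
    public static class Composite {
        final String spec;
        final String index;
        final String data;

        public Composite(String spec, String index, String data) {
            this.spec = spec;
            this.index = index;
            this.data = data;
        }

        @Override
        public String toString() {
            return spec + "[" + index + "," + data + "]";
        }
    }

    // Builds the argument list as: spec first, then the embedded values in map insertion order.
    public static Object instantiate(Constructor<?> ctor, String spec, Map<String, String> embedded)
            throws ReflectiveOperationException {
        List<Object> params = new ArrayList<>();
        params.add(spec);
        params.addAll(embedded.values());
        return ctor.newInstance(params.toArray());
    }

    // Runs the sketch and returns the string form of the created object.
    public static String demo() {
        try {
            Constructor<?> ctor =
                Composite.class.getConstructor(String.class, String.class, String.class);
            // LinkedHashMap keeps insertion order, which must match the constructor parameters.
            Map<String, String> embedded = new LinkedHashMap<>();
            embedded.put("index", "idx-table");
            embedded.put("data", "data-table");
            return instantiate(ctor, "spec", embedded).toString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a plain HashMap the iteration order would not be guaranteed, so the arguments could reach the constructor in the wrong positions.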

Example 12 with Dataset

use of co.cask.cdap.api.dataset.Dataset in project cdap by caskdata.

the class SingleThreadDatasetCache method getDataset.

@Override
public <T extends Dataset> T getDataset(DatasetCacheKey key, boolean bypass) throws DatasetInstantiationException {
    Dataset dataset;
    try {
        if (bypass) {
            dataset = datasetLoader.load(key);
        } else {
            try {
                dataset = datasetCache.get(key);
            } catch (ExecutionException | UncheckedExecutionException e) {
                throw e.getCause();
            }
        }
    } catch (DatasetInstantiationException | ServiceUnavailableException e) {
        throw e;
    } catch (Throwable t) {
        throw new DatasetInstantiationException(String.format("Could not instantiate dataset '%s:%s'", key.getNamespace(), key.getName()), t);
    }
    // make sure the dataset exists and is of the right type
    if (dataset == null) {
        throw new DatasetInstantiationException(String.format("Dataset '%s' does not exist", key.getName()));
    }
    T typedDataset;
    try {
        @SuppressWarnings("unchecked") T t = (T) dataset;
        typedDataset = t;
    } catch (Throwable t) {
        // must be ClassCastException
        throw new DatasetInstantiationException(String.format("Could not cast dataset '%s' to requested type. Actual type is %s.", key.getName(), dataset.getClass().getName()), t);
    }
    // any transaction aware that is not in the active tx-awares is added to the current tx context (if there is one).
    if (!bypass && dataset instanceof TransactionAware) {
        TransactionAware txAware = (TransactionAware) dataset;
        TransactionAware existing = activeTxAwares.get(key);
        if (existing == null) {
            activeTxAwares.put(key, txAware);
            if (txContext != null) {
                txContext.addTransactionAware(txAware);
            }
        } else if (existing != dataset) {
            // this better be the same dataset, otherwise the cache did not work
            throw new IllegalStateException(String.format("Unexpected state: Cache returned %s for %s, which is different from the " + "active transaction aware %s for the same key. This should never happen.", dataset, key, existing));
        }
    }
    return typedDataset;
}
Also used : UncheckedExecutionException(com.google.common.util.concurrent.UncheckedExecutionException) MeteredDataset(co.cask.cdap.api.dataset.metrics.MeteredDataset) Dataset(co.cask.cdap.api.dataset.Dataset) TransactionAware(org.apache.tephra.TransactionAware) ServiceUnavailableException(co.cask.cdap.common.ServiceUnavailableException) ExecutionException(java.util.concurrent.ExecutionException) DatasetInstantiationException(co.cask.cdap.api.data.DatasetInstantiationException)
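
One detail worth noting about the unchecked cast to T in getDataset: because of type erasure, a cast to a type variable is not checked against the caller's concrete type inside the generic method, so a mismatch would normally surface as a ClassCastException at the caller instead. The sketch below illustrates this with JDK types only; uncheckedCast and checkedCast are hypothetical helpers, and passing a Class token is one possible alternative, not something the CDAP code does.

```java
public class ErasedCastSketch {

    // Same shape as the cast in getDataset: T is erased here,
    // so this cast cannot fail inside the method.
    @SuppressWarnings("unchecked")
    static <T> T uncheckedCast(Object value) {
        return (T) value;
    }

    // Passing the expected class allows a real runtime check inside the method.
    static <T> T checkedCast(Object value, Class<T> type) {
        return type.cast(value);
    }

    // Returns true if the ClassCastException surfaces at the caller's assignment.
    public static boolean failsAtCallSite() {
        try {
            Integer n = uncheckedCast("not a number");
            return n == null; // not reached for this input
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Returns true if the ClassCastException is raised by Class.cast itself.
    public static boolean failsInsideCheckedCast() {
        try {
            checkedCast("not a number", Integer.class);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtCallSite());
        System.out.println(failsInsideCheckedCast());
    }
}
```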

Example 13 with Dataset

use of co.cask.cdap.api.dataset.Dataset in project cdap by caskdata.

the class DatasetClassRewriterTest method testDatasetAccessRecorder.

@Test
public void testDatasetAccessRecorder() throws Exception {
    ByteCodeClassLoader classLoader = new ByteCodeClassLoader(getClass().getClassLoader());
    classLoader.addClass(rewrite(TopLevelExtendsDataset.class));
    classLoader.addClass(rewrite(TopLevelDirectDataset.class));
    classLoader.addClass(rewrite(TopLevelDataset.class));
    classLoader.addClass(rewrite(DefaultTopLevelExtendsDataset.class));
    classLoader.addClass(rewrite(CustomDatasetApp.InnerStaticInheritDataset.class));
    classLoader.addClass(rewrite(CustomDatasetApp.InnerDataset.class));
    InMemoryAccessRecorder accessRecorder = new InMemoryAccessRecorder();
    TestAuthorizationEnforcer authEnforcer = new TestAuthorizationEnforcer(EnumSet.allOf(Action.class));
    testDatasetAccessRecord(accessRecorder, createDataset(accessRecorder, authEnforcer, TopLevelDataset.class.getName(), classLoader));
    accessRecorder.clear();
    testDatasetAccessRecord(accessRecorder, createDataset(accessRecorder, authEnforcer, DefaultTopLevelExtendsDataset.class.getName(), classLoader));
    accessRecorder.clear();
    Dataset delegate = createDataset(accessRecorder, authEnforcer, TopLevelDataset.class.getName(), classLoader);
    testDatasetAccessRecord(accessRecorder, createDataset(accessRecorder, authEnforcer, DelegatingDataset.class.getName(), classLoader, new Class<?>[] { CustomOperations.class }, new Object[] { delegate }));
    accessRecorder.clear();
    testDatasetAccessRecord(accessRecorder, createDataset(accessRecorder, authEnforcer, CustomDatasetApp.InnerStaticInheritDataset.class.getName(), classLoader));
    accessRecorder.clear();
    testDatasetAccessRecord(accessRecorder, createDataset(accessRecorder, authEnforcer, CustomDatasetApp.InnerDataset.class.getName(), classLoader, new Class<?>[] { CustomDatasetApp.class }, new Object[] { new CustomDatasetApp() }));
}
Also used : ByteCodeClassLoader(co.cask.cdap.internal.asm.ByteCodeClassLoader) Action(co.cask.cdap.proto.security.Action) TopLevelDataset(co.cask.cdap.data2.dataset2.customds.TopLevelDataset) DelegatingDataset(co.cask.cdap.data2.dataset2.customds.DelegatingDataset) TopLevelDirectDataset(co.cask.cdap.data2.dataset2.customds.TopLevelDirectDataset) Dataset(co.cask.cdap.api.dataset.Dataset) DefaultTopLevelExtendsDataset(co.cask.cdap.data2.dataset2.customds.DefaultTopLevelExtendsDataset) TopLevelExtendsDataset(co.cask.cdap.data2.dataset2.customds.TopLevelExtendsDataset) CustomDatasetApp(co.cask.cdap.data2.dataset2.customds.CustomDatasetApp) CustomOperations(co.cask.cdap.data2.dataset2.customds.CustomOperations) Test(org.junit.Test)

Example 14 with Dataset

use of co.cask.cdap.api.dataset.Dataset in project cdap by caskdata.

the class DatasetClassRewriterTest method testDatasetAuthorization.

@Test
public void testDatasetAuthorization() throws Exception {
    ByteCodeClassLoader classLoader = new ByteCodeClassLoader(getClass().getClassLoader());
    classLoader.addClass(rewrite(TopLevelExtendsDataset.class));
    classLoader.addClass(rewrite(TopLevelDirectDataset.class));
    classLoader.addClass(rewrite(TopLevelDataset.class));
    classLoader.addClass(rewrite(DefaultTopLevelExtendsDataset.class));
    classLoader.addClass(rewrite(CustomDatasetApp.InnerStaticInheritDataset.class));
    classLoader.addClass(rewrite(CustomDatasetApp.InnerDataset.class));
    InMemoryAccessRecorder accessRecorder = new InMemoryAccessRecorder();
    // Test no access
    TestAuthorizationEnforcer authEnforcer = new TestAuthorizationEnforcer(EnumSet.noneOf(Action.class));
    testNoAccess(createDataset(accessRecorder, authEnforcer, TopLevelDataset.class.getName(), classLoader));
    testNoAccess(createDataset(accessRecorder, authEnforcer, DefaultTopLevelExtendsDataset.class.getName(), classLoader));
    Dataset delegate = createDataset(accessRecorder, authEnforcer, TopLevelDataset.class.getName(), classLoader);
    testNoAccess(createDataset(accessRecorder, authEnforcer, DelegatingDataset.class.getName(), classLoader, new Class<?>[] { CustomOperations.class }, new Object[] { delegate }));
    testNoAccess(createDataset(accessRecorder, authEnforcer, CustomDatasetApp.InnerStaticInheritDataset.class.getName(), classLoader));
    testNoAccess(createDataset(accessRecorder, authEnforcer, CustomDatasetApp.InnerDataset.class.getName(), classLoader, new Class<?>[] { CustomDatasetApp.class }, new Object[] { new CustomDatasetApp() }));
    // Test read only access
    authEnforcer = new TestAuthorizationEnforcer(EnumSet.of(Action.READ));
    testReadOnlyAccess(createDataset(accessRecorder, authEnforcer, TopLevelDataset.class.getName(), classLoader));
    testReadOnlyAccess(createDataset(accessRecorder, authEnforcer, DefaultTopLevelExtendsDataset.class.getName(), classLoader));
    delegate = createDataset(accessRecorder, authEnforcer, TopLevelDataset.class.getName(), classLoader);
    testReadOnlyAccess(createDataset(accessRecorder, authEnforcer, DelegatingDataset.class.getName(), classLoader, new Class<?>[] { CustomOperations.class }, new Object[] { delegate }));
    testReadOnlyAccess(createDataset(accessRecorder, authEnforcer, CustomDatasetApp.InnerStaticInheritDataset.class.getName(), classLoader));
    testReadOnlyAccess(createDataset(accessRecorder, authEnforcer, CustomDatasetApp.InnerDataset.class.getName(), classLoader, new Class<?>[] { CustomDatasetApp.class }, new Object[] { new CustomDatasetApp() }));
    // Test write only access
    authEnforcer = new TestAuthorizationEnforcer(EnumSet.of(Action.WRITE));
    testWriteOnlyAccess(createDataset(accessRecorder, authEnforcer, TopLevelDataset.class.getName(), classLoader));
    testWriteOnlyAccess(createDataset(accessRecorder, authEnforcer, DefaultTopLevelExtendsDataset.class.getName(), classLoader));
    delegate = createDataset(accessRecorder, authEnforcer, TopLevelDataset.class.getName(), classLoader);
    testWriteOnlyAccess(createDataset(accessRecorder, authEnforcer, DelegatingDataset.class.getName(), classLoader, new Class<?>[] { CustomOperations.class }, new Object[] { delegate }));
    testWriteOnlyAccess(createDataset(accessRecorder, authEnforcer, CustomDatasetApp.InnerStaticInheritDataset.class.getName(), classLoader));
    testWriteOnlyAccess(createDataset(accessRecorder, authEnforcer, CustomDatasetApp.InnerDataset.class.getName(), classLoader, new Class<?>[] { CustomDatasetApp.class }, new Object[] { new CustomDatasetApp() }));
}
Also used : ByteCodeClassLoader(co.cask.cdap.internal.asm.ByteCodeClassLoader) Action(co.cask.cdap.proto.security.Action) TopLevelDataset(co.cask.cdap.data2.dataset2.customds.TopLevelDataset) DelegatingDataset(co.cask.cdap.data2.dataset2.customds.DelegatingDataset) TopLevelDirectDataset(co.cask.cdap.data2.dataset2.customds.TopLevelDirectDataset) Dataset(co.cask.cdap.api.dataset.Dataset) DefaultTopLevelExtendsDataset(co.cask.cdap.data2.dataset2.customds.DefaultTopLevelExtendsDataset) TopLevelExtendsDataset(co.cask.cdap.data2.dataset2.customds.TopLevelExtendsDataset) CustomDatasetApp(co.cask.cdap.data2.dataset2.customds.CustomDatasetApp) CustomOperations(co.cask.cdap.data2.dataset2.customds.CustomOperations) Test(org.junit.Test)
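
The test above drives its authorization enforcer with EnumSet.noneOf, EnumSet.of and (in the previous example) EnumSet.allOf. As a rough illustration of how such a set can gate access, here is a minimal sketch; the Action enum, the Enforcer class and the isAllowed helper are all hypothetical and only assume that the real Action has READ and WRITE values, as used in the test.

```java
import java.util.EnumSet;

public class EnumSetEnforcerSketch {

    // Hypothetical action enum; the real co.cask.cdap.proto.security.Action may define other values.
    public enum Action { READ, WRITE }

    // Hypothetical enforcer that permits only the actions in the granted set.
    public static final class Enforcer {
        private final EnumSet<Action> allowed;

        public Enforcer(EnumSet<Action> allowed) {
            // Defensive copy so later changes to the caller's set do not affect this enforcer.
            this.allowed = EnumSet.copyOf(allowed);
        }

        public void enforce(Action action) {
            if (!allowed.contains(action)) {
                throw new SecurityException("Action not permitted: " + action);
            }
        }
    }

    // Convenience wrapper that turns the exception into a boolean.
    public static boolean isAllowed(Enforcer enforcer, Action action) {
        try {
            enforcer.enforce(action);
            return true;
        } catch (SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Enforcer readOnly = new Enforcer(EnumSet.of(Action.READ));
        System.out.println(isAllowed(readOnly, Action.READ));
        System.out.println(isAllowed(readOnly, Action.WRITE));
    }
}
```

The three configurations map naturally onto the test cases: an empty set for no access, a single-element set for read-only or write-only access, and the full set when every action should pass.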

Example 15 with Dataset

use of co.cask.cdap.api.dataset.Dataset in project cdap by caskdata.

the class BasicMapReduceContext method createInput.

private Input.InputFormatProviderInput createInput(Input.DatasetInput datasetInput) {
    String datasetName = datasetInput.getName();
    Map<String, String> datasetArgs = datasetInput.getArguments();
    // keep track of the original alias to set it on the created Input before returning it
    String originalAlias = datasetInput.getAlias();
    Dataset dataset;
    if (datasetInput.getNamespace() == null) {
        dataset = getDataset(datasetName, datasetArgs, AccessType.READ);
    } else {
        dataset = getDataset(datasetInput.getNamespace(), datasetName, datasetArgs, AccessType.READ);
    }
    DatasetInputFormatProvider datasetInputFormatProvider = new DatasetInputFormatProvider(datasetInput.getNamespace(), datasetName, datasetArgs, dataset, datasetInput.getSplits(), MapReduceBatchReadableInputFormat.class);
    return (Input.InputFormatProviderInput) Input.of(datasetName, datasetInputFormatProvider).alias(originalAlias);
}
Also used : DatasetInputFormatProvider(co.cask.cdap.internal.app.runtime.batch.dataset.DatasetInputFormatProvider) Dataset(co.cask.cdap.api.dataset.Dataset)

Aggregations

Dataset (co.cask.cdap.api.dataset.Dataset) 18
IOException (java.io.IOException) 11
DatasetManagementException (co.cask.cdap.api.dataset.DatasetManagementException) 7
SystemDatasetInstantiator (co.cask.cdap.data.dataset.SystemDatasetInstantiator) 6
DatasetInstantiationException (co.cask.cdap.api.data.DatasetInstantiationException) 3
UnsupportedTypeException (co.cask.cdap.api.data.schema.UnsupportedTypeException) 3
PartitionedFileSet (co.cask.cdap.api.dataset.lib.PartitionedFileSet) 3
BadRequestException (co.cask.cdap.common.BadRequestException) 3
DatasetSpecification (co.cask.cdap.api.dataset.DatasetSpecification) 2
PartitionKey (co.cask.cdap.api.dataset.lib.PartitionKey) 2
Partitioning (co.cask.cdap.api.dataset.lib.Partitioning) 2
TopicNotFoundException (co.cask.cdap.api.messaging.TopicNotFoundException) 2
ServiceUnavailableException (co.cask.cdap.common.ServiceUnavailableException) 2
CustomDatasetApp (co.cask.cdap.data2.dataset2.customds.CustomDatasetApp) 2
CustomOperations (co.cask.cdap.data2.dataset2.customds.CustomOperations) 2
DefaultTopLevelExtendsDataset (co.cask.cdap.data2.dataset2.customds.DefaultTopLevelExtendsDataset) 2
DelegatingDataset (co.cask.cdap.data2.dataset2.customds.DelegatingDataset) 2
TopLevelDataset (co.cask.cdap.data2.dataset2.customds.TopLevelDataset) 2
TopLevelDirectDataset (co.cask.cdap.data2.dataset2.customds.TopLevelDirectDataset) 2
TopLevelExtendsDataset (co.cask.cdap.data2.dataset2.customds.TopLevelExtendsDataset) 2