Example 31 with DatasetFramework

Use of co.cask.cdap.data2.dataset2.DatasetFramework in project cdap by caskdata.

In class AbstractDatasetFrameworkTest, method testMultipleTransitiveDependencies:

@Test
public void testMultipleTransitiveDependencies() throws DatasetManagementException, IOException {
    // Adding modules
    DatasetFramework framework = getFramework();
    try {
        framework.addModule(IN_MEMORY, new InMemoryTableModule());
        framework.addModule(CORE, new CoreDatasetsModule());
        framework.addModule(FILE, new FileSetModule());
        framework.addModule(PFS, new PartitionedFileSetModule());
        framework.addModule(TWICE, new SingleTypeModule(EmbedsTableTwiceDataset.class));
        // Create an instance of a type that transitively embeds the Table type twice
        framework.addInstance(EmbedsTableTwiceDataset.class.getName(), MY_DS,
            PartitionedFileSetProperties.builder()
                .setPartitioning(Partitioning.builder().addStringField("x").build())
                .build());
        Assert.assertTrue(framework.hasInstance(MY_DS));
        framework.getDataset(MY_DS, DatasetProperties.EMPTY.getProperties(), null);
    } finally {
        framework.deleteAllInstances(NAMESPACE_ID);
        framework.deleteAllModules(NAMESPACE_ID);
    }
}
Also used : LineageWriterDatasetFramework(co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework) InMemoryTableModule(co.cask.cdap.data2.dataset2.module.lib.inmemory.InMemoryTableModule) CoreDatasetsModule(co.cask.cdap.data2.dataset2.lib.table.CoreDatasetsModule) PartitionedFileSetModule(co.cask.cdap.data2.dataset2.lib.partitioned.PartitionedFileSetModule) FileSetModule(co.cask.cdap.data2.dataset2.lib.file.FileSetModule) Test(org.junit.Test)
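The EmbedsTableTwiceDataset type exercised above is not reproduced in this listing. Below is a minimal sketch of what such a class might look like, assuming it embeds a PartitionedFileSet (which itself embeds a Table for its partition index) plus a Table directly, so the Table type is reachable along two dependency paths. The class shape follows the standard SingleTypeModule constructor convention (a DatasetSpecification plus @EmbeddedDataset parameters); all field and dataset names here are illustrative only.

// Hypothetical sketch, not the actual test class.
public class EmbedsTableTwiceDataset extends AbstractDataset {

    // PartitionedFileSet embeds a Table internally, so the Table type
    // arrives once transitively through it and once directly below.
    private final PartitionedFileSet files;
    private final Table meta;

    public EmbedsTableTwiceDataset(DatasetSpecification spec,
                                   @EmbeddedDataset("files") PartitionedFileSet files,
                                   @EmbeddedDataset("meta") Table meta) {
        super(spec.getName(), files, meta);
        this.files = files;
        this.meta = meta;
    }
}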

Example 32 with DatasetFramework

Use of co.cask.cdap.data2.dataset2.DatasetFramework in project cdap by caskdata.

In class AbstractDatasetFrameworkTest, method testNamespaceModuleIsolation:

@Test
public void testNamespaceModuleIsolation() throws Exception {
    DatasetFramework framework = getFramework();
    // create 2 namespaces
    NamespaceId namespace1 = new NamespaceId("ns1");
    NamespaceId namespace2 = new NamespaceId("ns2");
    namespaceAdmin.create(new NamespaceMeta.Builder().setName(namespace1).build());
    namespaceAdmin.create(new NamespaceMeta.Builder().setName(namespace2).build());
    namespacedLocationFactory.get(namespace1).mkdirs();
    namespacedLocationFactory.get(namespace2).mkdirs();
    // add modules in each namespace, with one module that shares the same name
    DatasetModuleId simpleModuleNs1 = namespace1.datasetModule(SimpleKVTable.class.getName());
    DatasetModuleId simpleModuleNs2 = namespace2.datasetModule(SimpleKVTable.class.getName());
    DatasetModuleId doubleModuleNs2 = namespace2.datasetModule(DoubleWrappedKVTable.class.getName());
    DatasetModule module1 = new SingleTypeModule(SimpleKVTable.class);
    DatasetModule module2 = new SingleTypeModule(DoubleWrappedKVTable.class);
    framework.addModule(simpleModuleNs1, module1);
    framework.addModule(simpleModuleNs2, module1);
    framework.addModule(doubleModuleNs2, module2);
    // check that we can add instances of datasets in those modules
    framework.addInstance(SimpleKVTable.class.getName(), namespace1.dataset("kv1"), DatasetProperties.EMPTY);
    framework.addInstance(SimpleKVTable.class.getName(), namespace2.dataset("kv1"), DatasetProperties.EMPTY);
    // check that only namespace2 can add an instance of this type, since the module is only in namespace2
    framework.addInstance(DoubleWrappedKVTable.class.getName(), namespace2.dataset("kv2"), DatasetProperties.EMPTY);
    try {
        framework.addInstance(DoubleWrappedKVTable.class.getName(), namespace1.dataset("kv2"), DatasetProperties.EMPTY);
        Assert.fail();
    } catch (Exception e) {
        // expected: the type is not registered in namespace1
    }
    // check that deleting all modules from namespace2 does not affect namespace1
    framework.deleteAllInstances(namespace2);
    framework.deleteAllModules(namespace2);
    // should still be able to add an instance in namespace1
    framework.addInstance(SimpleKVTable.class.getName(), namespace1.dataset("kv3"), DatasetProperties.EMPTY);
    // but not in namespace2
    try {
        framework.addInstance(SimpleKVTable.class.getName(), namespace2.dataset("kv3"), DatasetProperties.EMPTY);
        Assert.fail();
    } catch (Exception e) {
        // expected: the module was deleted from namespace2
    }
    // add back modules to namespace2
    framework.addModule(simpleModuleNs2, module1);
    framework.addModule(doubleModuleNs2, module2);
    // check that deleting a single module from namespace1 does not affect namespace2
    framework.deleteAllInstances(namespace1);
    framework.deleteModule(simpleModuleNs1);
    // should still be able to add an instance in namespace2
    framework.addInstance(DoubleWrappedKVTable.class.getName(), namespace2.dataset("kv1"), DatasetProperties.EMPTY);
    // but not in namespace1
    try {
        framework.addInstance(SimpleKVTable.class.getName(), namespace1.dataset("kv1"), DatasetProperties.EMPTY);
        Assert.fail();
    } catch (Exception e) {
        // expected: the module was deleted from namespace1
    }
}
Also used : LineageWriterDatasetFramework(co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework) DatasetModuleId(co.cask.cdap.proto.id.DatasetModuleId) NamespaceMeta(co.cask.cdap.proto.NamespaceMeta) NamespaceId(co.cask.cdap.proto.id.NamespaceId) DatasetModule(co.cask.cdap.api.dataset.module.DatasetModule) InstanceConflictException(co.cask.cdap.api.dataset.InstanceConflictException) DatasetManagementException(co.cask.cdap.api.dataset.DatasetManagementException) IOException(java.io.IOException) Test(org.junit.Test)
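SimpleKVTable and DoubleWrappedKVTable are test-only classes that this listing does not show. A plausible minimal shape for SimpleKVTable, assuming the usual SingleTypeModule pattern of a constructor taking the DatasetSpecification plus @EmbeddedDataset parameters; names and methods are hypothetical:

// Hypothetical sketch of a single-type key/value dataset backed by one Table.
public class SimpleKVTable extends AbstractDataset {

    private static final byte[] COL = new byte[0];
    private final Table table;

    public SimpleKVTable(DatasetSpecification spec,
                         @EmbeddedDataset("data") Table table) {
        super(spec.getName(), table);
        this.table = table;
    }

    public void put(String key, String value) {
        table.put(Bytes.toBytes(key), COL, Bytes.toBytes(value));
    }

    public String get(String key) {
        return Bytes.toString(table.get(Bytes.toBytes(key), COL));
    }
}

DoubleWrappedKVTable would presumably follow the same pattern but embed a SimpleKVTable rather than a Table, which is what ties its module to the SimpleKVTable type and makes it deployable only where that module exists.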

Example 33 with DatasetFramework

Use of co.cask.cdap.data2.dataset2.DatasetFramework in project cdap by caskdata.

In class AbstractDatasetFrameworkTest, method testNamespaceInstanceIsolation:

@Test
@SuppressWarnings("ConstantConditions")
public void testNamespaceInstanceIsolation() throws Exception {
    DatasetFramework framework = getFramework();
    // create 2 namespaces
    NamespaceId namespace1 = new NamespaceId("ns1");
    NamespaceId namespace2 = new NamespaceId("ns2");
    namespaceAdmin.create(new NamespaceMeta.Builder().setName(namespace1).build());
    namespaceAdmin.create(new NamespaceMeta.Builder().setName(namespace2).build());
    namespacedLocationFactory.get(namespace1).mkdirs();
    namespacedLocationFactory.get(namespace2).mkdirs();
    // create 2 tables, one in each namespace. both tables have the same name.
    DatasetId table1ID = namespace1.dataset("table");
    DatasetId table2ID = namespace2.dataset("table");
    // have slightly different properties so that we can distinguish between them
    framework.addInstance(Table.class.getName(), table1ID, DatasetProperties.builder().add("tag", "table1").build());
    framework.addInstance(Table.class.getName(), table2ID, DatasetProperties.builder().add("tag", "table2").build());
    // perform some data operations to make sure they are not the same underlying table
    final Table table1 = framework.getDataset(table1ID, Maps.<String, String>newHashMap(), null);
    final Table table2 = framework.getDataset(table2ID, Maps.<String, String>newHashMap(), null);
    TransactionExecutor txnl = new DefaultTransactionExecutor(new MinimalTxSystemClient(), (TransactionAware) table1, (TransactionAware) table2);
    txnl.execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() throws Exception {
            table1.put(Bytes.toBytes("rowkey"), Bytes.toBytes("column"), Bytes.toBytes("val1"));
            table2.put(Bytes.toBytes("rowkey"), Bytes.toBytes("column"), Bytes.toBytes("val2"));
        }
    });
    // check data is different, which means they are different underlying tables
    txnl.execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() throws Exception {
            Assert.assertEquals("val1", Bytes.toString(table1.get(Bytes.toBytes("rowkey"), Bytes.toBytes("column"))));
            Assert.assertEquals("val2", Bytes.toString(table2.get(Bytes.toBytes("rowkey"), Bytes.toBytes("column"))));
        }
    });
    // check get all in a namespace only includes those in that namespace
    Collection<DatasetSpecificationSummary> specs = framework.getInstances(namespace1);
    Assert.assertEquals(1, specs.size());
    Assert.assertEquals("table1", specs.iterator().next().getProperties().get("tag"));
    specs = framework.getInstances(namespace2);
    Assert.assertEquals(1, specs.size());
    Assert.assertEquals("table2", specs.iterator().next().getProperties().get("tag"));
    // delete one instance and make sure the other still exists
    framework.deleteInstance(table1ID);
    Assert.assertFalse(framework.hasInstance(table1ID));
    Assert.assertTrue(framework.hasInstance(table2ID));
    // delete all instances in one namespace and make sure the other still exists
    framework.addInstance(Table.class.getName(), table1ID, DatasetProperties.EMPTY);
    framework.deleteAllInstances(namespace1);
    Assert.assertTrue(framework.hasInstance(table2ID));
    // delete one namespace and make sure the other still exists
    namespacedLocationFactory.get(namespace1).delete(true);
    Assert.assertTrue(framework.hasInstance(table2ID));
}
Also used : Table(co.cask.cdap.api.dataset.table.Table) TransactionExecutor(org.apache.tephra.TransactionExecutor) DefaultTransactionExecutor(org.apache.tephra.DefaultTransactionExecutor) DatasetSpecificationSummary(co.cask.cdap.proto.DatasetSpecificationSummary) InstanceConflictException(co.cask.cdap.api.dataset.InstanceConflictException) DatasetManagementException(co.cask.cdap.api.dataset.DatasetManagementException) IOException(java.io.IOException) DatasetId(co.cask.cdap.proto.id.DatasetId) MinimalTxSystemClient(org.apache.tephra.inmemory.MinimalTxSystemClient) LineageWriterDatasetFramework(co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework) NamespaceMeta(co.cask.cdap.proto.NamespaceMeta) NamespaceId(co.cask.cdap.proto.id.NamespaceId) Test(org.junit.Test)
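A side note on the anonymous TransactionExecutor.Subroutine classes used above: Subroutine declares a single apply() method, so on Java 8 and later the same transactional blocks can be written as lambdas. A sketch of the first block in that style:

// Equivalent to the first anonymous Subroutine above, written as a lambda.
txnl.execute(() -> {
    table1.put(Bytes.toBytes("rowkey"), Bytes.toBytes("column"), Bytes.toBytes("val1"));
    table2.put(Bytes.toBytes("rowkey"), Bytes.toBytes("column"), Bytes.toBytes("val2"));
});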

Example 34 with DatasetFramework

Use of co.cask.cdap.data2.dataset2.DatasetFramework in project cdap by caskdata.

In class ProgramScheduleStoreDatasetTest, method checkDatasetType:

@Test
public void checkDatasetType() throws DatasetManagementException {
    DatasetFramework dsFramework = getInjector().getInstance(DatasetFramework.class);
    Assert.assertTrue(dsFramework.hasType(NamespaceId.SYSTEM.datasetType(Schedulers.STORE_TYPE_NAME)));
}
Also used : DatasetFramework(co.cask.cdap.data2.dataset2.DatasetFramework) Test(org.junit.Test)
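Beyond the boolean hasType check, DatasetFramework can also return the registered type's metadata via getTypeInfo. A small variant of the assertion above, assuming getTypeInfo returns null when the type is absent:

// Fetch the type's metadata instead of only checking for existence.
DatasetTypeMeta typeMeta = dsFramework.getTypeInfo(
    NamespaceId.SYSTEM.datasetType(Schedulers.STORE_TYPE_NAME));
Assert.assertNotNull(typeMeta);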

Example 35 with DatasetFramework

Use of co.cask.cdap.data2.dataset2.DatasetFramework in project cdap by caskdata.

In class ProgramScheduleStoreDatasetTest, method testFindSchedulesByEventAndUpdateSchedule:

@Test
public void testFindSchedulesByEventAndUpdateSchedule() throws Exception {
    DatasetFramework dsFramework = getInjector().getInstance(DatasetFramework.class);
    TransactionSystemClient txClient = getInjector().getInstance(TransactionSystemClient.class);
    TransactionExecutorFactory txExecutorFactory = new DynamicTransactionExecutorFactory(txClient);
    final ProgramScheduleStoreDataset store = dsFramework.getDataset(Schedulers.STORE_DATASET_ID, new HashMap<String, String>(), null);
    Assert.assertNotNull(store);
    TransactionExecutor txExecutor = txExecutorFactory.createExecutor(Collections.singleton((TransactionAware) store));
    final ProgramSchedule sched11 = new ProgramSchedule("sched11", "one partition schedule", PROG1_ID, ImmutableMap.of("prop3", "abc"), new PartitionTrigger(DS1_ID, 1), ImmutableList.<Constraint>of());
    final ProgramSchedule sched12 = new ProgramSchedule("sched12", "two partition schedule", PROG1_ID, ImmutableMap.of("propper", "popper"), new PartitionTrigger(DS2_ID, 2), ImmutableList.<Constraint>of());
    final ProgramSchedule sched22 = new ProgramSchedule("sched22", "twentytwo partition schedule", PROG2_ID, ImmutableMap.of("nn", "4"), new PartitionTrigger(DS2_ID, 22), ImmutableList.<Constraint>of());
    txExecutor.execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() throws Exception {
            // event for DS1 or DS2 should trigger nothing. validate it returns an empty collection
            Assert.assertTrue(store.findSchedules(Schedulers.triggerKeyForPartition(DS1_ID)).isEmpty());
            Assert.assertTrue(store.findSchedules(Schedulers.triggerKeyForPartition(DS2_ID)).isEmpty());
        }
    });
    txExecutor.execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() throws Exception {
            store.addSchedules(ImmutableList.of(sched11, sched12, sched22));
        }
    });
    txExecutor.execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() throws Exception {
            // event for DS1 should trigger only sched11
            Assert.assertEquals(ImmutableSet.of(sched11), toScheduleSet(store.findSchedules(Schedulers.triggerKeyForPartition(DS1_ID))));
            // event for DS2 triggers only sched12 and sched22
            Assert.assertEquals(ImmutableSet.of(sched12, sched22), toScheduleSet(store.findSchedules(Schedulers.triggerKeyForPartition(DS2_ID))));
        }
    });
    final ProgramSchedule sched11New = new ProgramSchedule(sched11.getName(), "time schedule", PROG1_ID, ImmutableMap.of("timeprop", "time"), new TimeTrigger("* * * * *"), ImmutableList.<Constraint>of());
    final ProgramSchedule sched12New = new ProgramSchedule(sched12.getName(), "one partition schedule", PROG1_ID, ImmutableMap.of("pp", "p"), new PartitionTrigger(DS1_ID, 2), ImmutableList.<Constraint>of());
    final ProgramSchedule sched22New = new ProgramSchedule(sched22.getName(), "one streamsize schedule", PROG2_ID, ImmutableMap.of("ss", "s"), new StreamSizeTrigger(NS_ID.stream("stream"), 1), ImmutableList.<Constraint>of());
    txExecutor.execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() throws Exception {
            store.updateSchedule(sched11New);
            store.updateSchedule(sched12New);
            store.updateSchedule(sched22New);
        }
    });
    txExecutor.execute(new TransactionExecutor.Subroutine() {

        @Override
        public void apply() throws Exception {
            // event for DS1 should trigger only sched12New after update
            Assert.assertEquals(ImmutableSet.of(sched12New), toScheduleSet(store.findSchedules(Schedulers.triggerKeyForPartition(DS1_ID))));
            // event for DS2 triggers no schedule after update
            Assert.assertEquals(ImmutableSet.<ProgramSchedule>of(), toScheduleSet(store.findSchedules(Schedulers.triggerKeyForPartition(DS2_ID))));
        }
    });
}
Also used : TimeTrigger(co.cask.cdap.internal.app.runtime.schedule.trigger.TimeTrigger) DynamicTransactionExecutorFactory(co.cask.cdap.data.runtime.DynamicTransactionExecutorFactory) TransactionExecutor(org.apache.tephra.TransactionExecutor) DatasetManagementException(co.cask.cdap.api.dataset.DatasetManagementException) TransactionExecutorFactory(co.cask.cdap.data2.transaction.TransactionExecutorFactory) DatasetFramework(co.cask.cdap.data2.dataset2.DatasetFramework) TransactionSystemClient(org.apache.tephra.TransactionSystemClient) ProgramSchedule(co.cask.cdap.internal.app.runtime.schedule.ProgramSchedule) TransactionAware(org.apache.tephra.TransactionAware) StreamSizeTrigger(co.cask.cdap.internal.app.runtime.schedule.trigger.StreamSizeTrigger) PartitionTrigger(co.cask.cdap.internal.app.runtime.schedule.trigger.PartitionTrigger) Test(org.junit.Test)
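The toScheduleSet helper used in the assertions is not part of this listing. A minimal sketch, assuming findSchedules hands back the matching ProgramSchedule objects directly (if it returns wrapper records instead, the helper would unwrap them first):

// Hypothetical helper: copy the found schedules into a set so assertions
// are insensitive to iteration order.
private Set<ProgramSchedule> toScheduleSet(Collection<ProgramSchedule> schedules) {
    return ImmutableSet.copyOf(schedules);
}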

Aggregations

DatasetFramework (co.cask.cdap.data2.dataset2.DatasetFramework): 26 usages
Test (org.junit.Test): 21 usages
TransactionSystemClient (org.apache.tephra.TransactionSystemClient): 11 usages
CConfiguration (co.cask.cdap.common.conf.CConfiguration): 10 usages
Location (org.apache.twill.filesystem.Location): 9 usages
SystemDatasetInstantiator (co.cask.cdap.data.dataset.SystemDatasetInstantiator): 8 usages
LineageWriterDatasetFramework (co.cask.cdap.data2.metadata.writer.LineageWriterDatasetFramework): 8 usages
DatasetManagementException (co.cask.cdap.api.dataset.DatasetManagementException): 7 usages
IOException (java.io.IOException): 7 usages
LocationFactory (org.apache.twill.filesystem.LocationFactory): 6 usages
Transactional (co.cask.cdap.api.Transactional): 5 usages
DatasetManager (co.cask.cdap.api.dataset.DatasetManager): 5 usages
DefaultDatasetManager (co.cask.cdap.data2.datafabric.dataset.DefaultDatasetManager): 5 usages
MultiThreadDatasetCache (co.cask.cdap.data2.dataset2.MultiThreadDatasetCache): 5 usages
CoreDatasetsModule (co.cask.cdap.data2.dataset2.lib.table.CoreDatasetsModule): 5 usages
LogPathIdentifier (co.cask.cdap.logging.appender.system.LogPathIdentifier): 5 usages
FileMetaDataWriter (co.cask.cdap.logging.meta.FileMetaDataWriter): 5 usages
Injector (com.google.inject.Injector): 5 usages
Table (co.cask.cdap.api.dataset.table.Table): 4 usages
MetricsCollectionService (co.cask.cdap.api.metrics.MetricsCollectionService): 4 usages