Search in sources :

Example 11 with SystemDatasetInstantiator

use of co.cask.cdap.data.dataset.SystemDatasetInstantiator in project cdap by caskdata.

the class ExistingEntitySystemMetadataWriter method writeSystemMetadataForDatasets.

private void writeSystemMetadataForDatasets(NamespaceId namespace, DatasetFramework dsFramework) throws DatasetManagementException, IOException, NamespaceNotFoundException {
    SystemDatasetInstantiatorFactory systemDatasetInstantiatorFactory = new SystemDatasetInstantiatorFactory(locationFactory, dsFramework, cConf);
    try (SystemDatasetInstantiator systemDatasetInstantiator = systemDatasetInstantiatorFactory.create()) {
        for (DatasetSpecificationSummary summary : dsFramework.getInstances(namespace)) {
            final DatasetId dsInstance = namespace.dataset(summary.getName());
            DatasetProperties dsProperties = DatasetProperties.of(summary.getProperties());
            String dsType = summary.getType();
            Dataset dataset = null;
            try {
                try {
                    dataset = impersonator.doAs(dsInstance, new Callable<Dataset>() {

                        @Override
                        public Dataset call() throws Exception {
                            return systemDatasetInstantiator.getDataset(dsInstance);
                        }
                    });
                } catch (Exception e) {
                    LOG.warn("Exception while instantiating dataset {}", dsInstance, e);
                }
                SystemMetadataWriter writer = new DatasetSystemMetadataWriter(metadataStore, dsInstance, dsProperties, dataset, dsType, summary.getDescription());
                writer.write();
            } finally {
                if (dataset != null) {
                    dataset.close();
                }
            }
        }
    }
}
Also used : SystemDatasetInstantiatorFactory(co.cask.cdap.data.dataset.SystemDatasetInstantiatorFactory) DatasetSystemMetadataWriter(co.cask.cdap.data2.metadata.system.DatasetSystemMetadataWriter) SystemDatasetInstantiator(co.cask.cdap.data.dataset.SystemDatasetInstantiator) Dataset(co.cask.cdap.api.dataset.Dataset) DatasetProperties(co.cask.cdap.api.dataset.DatasetProperties) DatasetSystemMetadataWriter(co.cask.cdap.data2.metadata.system.DatasetSystemMetadataWriter) ProgramSystemMetadataWriter(co.cask.cdap.data2.metadata.system.ProgramSystemMetadataWriter) ViewSystemMetadataWriter(co.cask.cdap.data2.metadata.system.ViewSystemMetadataWriter) SystemMetadataWriter(co.cask.cdap.data2.metadata.system.SystemMetadataWriter) AppSystemMetadataWriter(co.cask.cdap.data2.metadata.system.AppSystemMetadataWriter) ArtifactSystemMetadataWriter(co.cask.cdap.data2.metadata.system.ArtifactSystemMetadataWriter) StreamSystemMetadataWriter(co.cask.cdap.data2.metadata.system.StreamSystemMetadataWriter) DatasetSpecificationSummary(co.cask.cdap.proto.DatasetSpecificationSummary) Callable(java.util.concurrent.Callable) DatasetManagementException(co.cask.cdap.api.dataset.DatasetManagementException) NamespaceNotFoundException(co.cask.cdap.common.NamespaceNotFoundException) IOException(java.io.IOException) DatasetId(co.cask.cdap.proto.id.DatasetId)

Example 12 with SystemDatasetInstantiator

use of co.cask.cdap.data.dataset.SystemDatasetInstantiator in project cdap by caskdata.

the class ExploreExecutorHttpHandler method doAddPartition.

private void doAddPartition(HttpRequest request, HttpResponder responder, DatasetId datasetId) {
    Dataset dataset;
    try (SystemDatasetInstantiator datasetInstantiator = datasetInstantiatorFactory.create()) {
        dataset = datasetInstantiator.getDataset(datasetId);
        if (dataset == null) {
            responder.sendString(HttpResponseStatus.NOT_FOUND, "Cannot load dataset " + datasetId);
            return;
        }
    } catch (IOException e) {
        String classNotFoundMessage = isClassNotFoundException(e);
        if (classNotFoundMessage != null) {
            JsonObject json = new JsonObject();
            json.addProperty("handle", QueryHandle.NO_OP.getHandle());
            responder.sendJson(HttpResponseStatus.OK, json);
            return;
        }
        LOG.error("Exception instantiating dataset {}.", datasetId, e);
        responder.sendString(HttpResponseStatus.INTERNAL_SERVER_ERROR, "Exception instantiating dataset " + datasetId.getDataset());
        return;
    }
    try {
        if (!(dataset instanceof PartitionedFileSet)) {
            responder.sendString(HttpResponseStatus.BAD_REQUEST, "not a partitioned dataset.");
            return;
        }
        Partitioning partitioning = ((PartitionedFileSet) dataset).getPartitioning();
        Reader reader = new InputStreamReader(new ChannelBufferInputStream(request.getContent()));
        Map<String, String> properties = GSON.fromJson(reader, new TypeToken<Map<String, String>>() {
        }.getType());
        String fsPath = properties.get("path");
        if (fsPath == null) {
            responder.sendString(HttpResponseStatus.BAD_REQUEST, "path was not specified.");
            return;
        }
        PartitionKey partitionKey;
        try {
            partitionKey = PartitionedFileSetArguments.getOutputPartitionKey(properties, partitioning);
        } catch (Exception e) {
            responder.sendString(HttpResponseStatus.BAD_REQUEST, "invalid partition key: " + e.getMessage());
            return;
        }
        if (partitionKey == null) {
            responder.sendString(HttpResponseStatus.BAD_REQUEST, "no partition key was given.");
            return;
        }
        QueryHandle handle = exploreTableManager.addPartition(datasetId, properties, partitionKey, fsPath);
        JsonObject json = new JsonObject();
        json.addProperty("handle", handle.getHandle());
        responder.sendJson(HttpResponseStatus.OK, json);
    } catch (Throwable e) {
        LOG.error("Got exception:", e);
        responder.sendString(HttpResponseStatus.INTERNAL_SERVER_ERROR, e.getMessage());
    }
}
Also used : InputStreamReader(java.io.InputStreamReader) Dataset(co.cask.cdap.api.dataset.Dataset) JsonObject(com.google.gson.JsonObject) Reader(java.io.Reader) InputStreamReader(java.io.InputStreamReader) PartitionedFileSet(co.cask.cdap.api.dataset.lib.PartitionedFileSet) IOException(java.io.IOException) BadRequestException(co.cask.cdap.common.BadRequestException) ExploreException(co.cask.cdap.explore.service.ExploreException) SQLException(java.sql.SQLException) DatasetManagementException(co.cask.cdap.api.dataset.DatasetManagementException) JsonSyntaxException(com.google.gson.JsonSyntaxException) UnsupportedTypeException(co.cask.cdap.api.data.schema.UnsupportedTypeException) IOException(java.io.IOException) Partitioning(co.cask.cdap.api.dataset.lib.Partitioning) SystemDatasetInstantiator(co.cask.cdap.data.dataset.SystemDatasetInstantiator) TypeToken(com.google.common.reflect.TypeToken) PartitionKey(co.cask.cdap.api.dataset.lib.PartitionKey) ChannelBufferInputStream(org.jboss.netty.buffer.ChannelBufferInputStream) QueryHandle(co.cask.cdap.proto.QueryHandle)

Example 13 with SystemDatasetInstantiator

use of co.cask.cdap.data.dataset.SystemDatasetInstantiator in project cdap by caskdata.

the class HiveExploreStructuredRecordTestRun method start.

@BeforeClass
public static void start() throws Exception {
    initialize(tmpFolder);
    DatasetModuleId moduleId = NAMESPACE_ID.datasetModule("email");
    datasetFramework.addModule(moduleId, new EmailTableDefinition.EmailTableModule());
    datasetFramework.addInstance("email", MY_TABLE, DatasetProperties.EMPTY);
    transactional = Transactions.createTransactional(new MultiThreadDatasetCache(new SystemDatasetInstantiator(datasetFramework), transactionSystemClient, NAMESPACE_ID, Collections.<String, String>emptyMap(), null, null));
    transactional.execute(new TxRunnable() {

        @Override
        public void run(DatasetContext context) throws Exception {
            // Accessing dataset instance to perform data operations
            EmailTableDefinition.EmailTable table = context.getDataset(MY_TABLE.getDataset());
            Assert.assertNotNull(table);
            table.writeEmail("email1", "this is the subject", "this is the body", "sljackson@boss.com");
        }
    });
    datasetFramework.addModule(NAMESPACE_ID.datasetModule("TableWrapper"), new TableWrapperDefinition.Module());
}
Also used : DatasetModuleId(co.cask.cdap.proto.id.DatasetModuleId) MultiThreadDatasetCache(co.cask.cdap.data2.dataset2.MultiThreadDatasetCache) SystemDatasetInstantiator(co.cask.cdap.data.dataset.SystemDatasetInstantiator) TxRunnable(co.cask.cdap.api.TxRunnable) TableWrapperDefinition(co.cask.cdap.explore.service.datasets.TableWrapperDefinition) EmailTableDefinition(co.cask.cdap.explore.service.datasets.EmailTableDefinition) DatasetContext(co.cask.cdap.api.data.DatasetContext) BeforeClass(org.junit.BeforeClass)

Example 14 with SystemDatasetInstantiator

use of co.cask.cdap.data.dataset.SystemDatasetInstantiator in project cdap by caskdata.

the class ExploreTableManager method generateDisableStatement.

private String generateDisableStatement(DatasetId datasetId, DatasetSpecification spec) throws ExploreException {
    String tableName = tableNaming.getTableName(datasetId, spec.getProperties());
    String databaseName = ExploreProperties.getExploreDatabaseName(spec.getProperties());
    // If table does not exist, nothing to be done
    try {
        exploreService.getTableInfo(datasetId.getNamespace(), databaseName, tableName);
    } catch (TableNotFoundException e) {
        // Ignore exception, since this means table was not found.
        return null;
    }
    try (SystemDatasetInstantiator datasetInstantiator = datasetInstantiatorFactory.create()) {
        Dataset dataset = datasetInstantiator.getDataset(datasetId);
        try {
            if (dataset instanceof FileSet || dataset instanceof PartitionedFileSet) {
                // do not drop the explore table that dataset is reusing an existing table
                if (FileSetProperties.isUseExisting(spec.getProperties())) {
                    return null;
                }
            }
            return generateDeleteStatement(dataset, databaseName, tableName);
        } finally {
            Closeables.closeQuietly(dataset);
        }
    } catch (IOException e) {
        LOG.error("Exception creating dataset classLoaderProvider for dataset {}.", datasetId, e);
        throw new ExploreException("Exception instantiating dataset " + datasetId);
    }
}
Also used : FileSet(co.cask.cdap.api.dataset.lib.FileSet) PartitionedFileSet(co.cask.cdap.api.dataset.lib.PartitionedFileSet) SystemDatasetInstantiator(co.cask.cdap.data.dataset.SystemDatasetInstantiator) Dataset(co.cask.cdap.api.dataset.Dataset) PartitionedFileSet(co.cask.cdap.api.dataset.lib.PartitionedFileSet) IOException(java.io.IOException)

Example 15 with SystemDatasetInstantiator

use of co.cask.cdap.data.dataset.SystemDatasetInstantiator in project cdap by caskdata.

the class FileMetadataCleanerTest method testFileMetadataWithCommonContextPrefix.

@Test
public void testFileMetadataWithCommonContextPrefix() throws Exception {
    DatasetFramework datasetFramework = injector.getInstance(DatasetFramework.class);
    DatasetManager datasetManager = new DefaultDatasetManager(datasetFramework, NamespaceId.SYSTEM, co.cask.cdap.common.service.RetryStrategies.noRetry(), null);
    Transactional transactional = Transactions.createTransactionalWithRetry(Transactions.createTransactional(new MultiThreadDatasetCache(new SystemDatasetInstantiator(datasetFramework), injector.getInstance(TransactionSystemClient.class), NamespaceId.SYSTEM, ImmutableMap.<String, String>of(), null, null)), RetryStrategies.retryOnConflict(20, 100));
    FileMetaDataWriter fileMetaDataWriter = new FileMetaDataWriter(datasetManager, transactional);
    FileMetaDataReader fileMetadataReader = injector.getInstance(FileMetaDataReader.class);
    FileMetadataCleaner fileMetadataCleaner = new FileMetadataCleaner(datasetManager, transactional);
    try {
        List<LogPathIdentifier> logPathIdentifiers = new ArrayList<>();
        // this should be able to scan and delete common prefix programs like testFlow1, testFlow10 during clenaup.
        for (int i = 1; i <= 20; i++) {
            logPathIdentifiers.add(new LogPathIdentifier(NamespaceId.DEFAULT.getNamespace(), "testApp", String.format("testFlow%s", i)));
        }
        LocationFactory locationFactory = injector.getInstance(LocationFactory.class);
        Location location = locationFactory.create(TMP_FOLDER.newFolder().getPath()).append("/logs");
        long currentTime = System.currentTimeMillis();
        long newCurrentTime = currentTime + 100;
        for (int i = 1; i <= 20; i++) {
            LogPathIdentifier identifier = logPathIdentifiers.get(i - 1);
            for (int j = 0; j < 10; j++) {
                fileMetaDataWriter.writeMetaData(identifier, newCurrentTime + j, newCurrentTime + j, location.append("testFileNew" + Integer.toString(j)));
            }
        }
        List<LogLocation> locations;
        for (int i = 1; i <= 20; i++) {
            locations = fileMetadataReader.listFiles(logPathIdentifiers.get(i - 1), newCurrentTime, newCurrentTime + 10);
            // should include files from currentTime (0..9)
            Assert.assertEquals(10, locations.size());
        }
        long tillTime = newCurrentTime + 4;
        List<FileMetadataCleaner.DeletedEntry> deleteEntries = fileMetadataCleaner.scanAndGetFilesToDelete(tillTime, TRANSACTION_TIMEOUT);
        // 20 context, 5 entries each
        Assert.assertEquals(100, deleteEntries.size());
        for (int i = 1; i <= 20; i++) {
            locations = fileMetadataReader.listFiles(logPathIdentifiers.get(i - 1), newCurrentTime, newCurrentTime + 10);
            // should include files from time (5..9)
            Assert.assertEquals(5, locations.size());
            int startIndex = 5;
            for (LogLocation logLocation : locations) {
                Assert.assertEquals(String.format("testFileNew%s", startIndex), logLocation.getLocation().getName());
                startIndex++;
            }
        }
    } finally {
        // cleanup meta
        cleanupMetadata(transactional, datasetManager);
    }
}
Also used : FileMetaDataWriter(co.cask.cdap.logging.meta.FileMetaDataWriter) DefaultDatasetManager(co.cask.cdap.data2.datafabric.dataset.DefaultDatasetManager) DatasetManager(co.cask.cdap.api.dataset.DatasetManager) ArrayList(java.util.ArrayList) DefaultDatasetManager(co.cask.cdap.data2.datafabric.dataset.DefaultDatasetManager) LocationFactory(org.apache.twill.filesystem.LocationFactory) DatasetFramework(co.cask.cdap.data2.dataset2.DatasetFramework) TransactionSystemClient(org.apache.tephra.TransactionSystemClient) MultiThreadDatasetCache(co.cask.cdap.data2.dataset2.MultiThreadDatasetCache) SystemDatasetInstantiator(co.cask.cdap.data.dataset.SystemDatasetInstantiator) LogLocation(co.cask.cdap.logging.write.LogLocation) FileMetaDataReader(co.cask.cdap.logging.meta.FileMetaDataReader) LogPathIdentifier(co.cask.cdap.logging.appender.system.LogPathIdentifier) Transactional(co.cask.cdap.api.Transactional) Location(org.apache.twill.filesystem.Location) LogLocation(co.cask.cdap.logging.write.LogLocation) Test(org.junit.Test)

Aggregations

SystemDatasetInstantiator (co.cask.cdap.data.dataset.SystemDatasetInstantiator)20 DatasetFramework (co.cask.cdap.data2.dataset2.DatasetFramework)10 MultiThreadDatasetCache (co.cask.cdap.data2.dataset2.MultiThreadDatasetCache)9 IOException (java.io.IOException)8 TransactionSystemClient (org.apache.tephra.TransactionSystemClient)8 Test (org.junit.Test)8 Transactional (co.cask.cdap.api.Transactional)7 Dataset (co.cask.cdap.api.dataset.Dataset)7 DatasetManager (co.cask.cdap.api.dataset.DatasetManager)7 DefaultDatasetManager (co.cask.cdap.data2.datafabric.dataset.DefaultDatasetManager)7 FileMetaDataWriter (co.cask.cdap.logging.meta.FileMetaDataWriter)7 LocationFactory (org.apache.twill.filesystem.LocationFactory)7 DatasetManagementException (co.cask.cdap.api.dataset.DatasetManagementException)5 LogPathIdentifier (co.cask.cdap.logging.appender.system.LogPathIdentifier)5 Location (org.apache.twill.filesystem.Location)5 UnsupportedTypeException (co.cask.cdap.api.data.schema.UnsupportedTypeException)4 PartitionedFileSet (co.cask.cdap.api.dataset.lib.PartitionedFileSet)4 FileMetaDataReader (co.cask.cdap.logging.meta.FileMetaDataReader)4 LogLocation (co.cask.cdap.logging.write.LogLocation)4 PartitionKey (co.cask.cdap.api.dataset.lib.PartitionKey)3