Example 1 with SystemMetadata

use of io.cdap.cdap.data2.metadata.system.SystemMetadata in project cdap by caskdata.

In the class DatasetMetadataStorage, method read.

private Metadata read(MetadataDatasetContext context, Read read) {
    // Read the record for each scope separately.
    MetadataDataset.Record userMetadata = readScope(context, MetadataScope.USER, read);
    MetadataDataset.Record systemMetadata = readScope(context, MetadataScope.SYSTEM, read);
    // Wrap each record in a scoped Metadata object and merge the two.
    return mergeDisjointMetadata(
        new Metadata(USER, userMetadata.getTags(), userMetadata.getProperties()),
        new Metadata(SYSTEM, systemMetadata.getTags(), systemMetadata.getProperties()));
}
Also used: MetadataDataset (io.cdap.cdap.data2.metadata.dataset.MetadataDataset), Metadata (io.cdap.cdap.spi.metadata.Metadata)

Example 2 with SystemMetadata

use of io.cdap.cdap.data2.metadata.system.SystemMetadata in project cdap by caskdata.

In the class DatasetAdminService, method createOrUpdate.

/**
 * Configures and creates a Dataset
 *
 * @param datasetInstanceId dataset instance to be created
 * @param typeMeta type meta for the dataset
 * @param props dataset instance properties
 * @param existing the existing specification if the dataset already exists (update case), or null when creating
 * @return dataset specification
 */
public DatasetCreationResponse createOrUpdate(final DatasetId datasetInstanceId, final DatasetTypeMeta typeMeta, final DatasetProperties props, @Nullable final DatasetSpecification existing) throws Exception {
    if (existing == null) {
        LOG.info("Creating dataset instance {}, type meta: {}", datasetInstanceId, typeMeta);
    } else {
        LOG.info("Updating dataset instance {}, type meta: {}, existing: {}", datasetInstanceId, typeMeta, existing);
    }
    try (DatasetClassLoaderProvider classLoaderProvider = new DirectoryClassLoaderProvider(cConf, locationFactory)) {
        final DatasetContext context = DatasetContext.from(datasetInstanceId.getNamespace());
        UserGroupInformation ugi = getUgiForDataset(impersonator, datasetInstanceId);
        final DatasetType type = ImpersonationUtils.doAs(ugi, () -> {
            LOG.trace("Getting dataset type {}", typeMeta.getName());
            DatasetType type1 = dsFramework.getDatasetType(typeMeta, null, classLoaderProvider);
            if (type1 == null) {
                throw new BadRequestException(String.format("Cannot instantiate dataset type using provided type meta: %s", typeMeta));
            }
            LOG.trace("Got dataset type {}", typeMeta.getName());
            return type1;
        });
        DatasetSpecification spec = ImpersonationUtils.doAs(ugi, () -> {
            LOG.trace("Configuring dataset {} of type {}", datasetInstanceId.getDataset(), typeMeta.getName());
            DatasetSpecification spec1 = existing == null ? type.configure(datasetInstanceId.getEntityName(), props) : type.reconfigure(datasetInstanceId.getEntityName(), props, existing);
            LOG.trace("Configured dataset {} of type {}", datasetInstanceId.getDataset(), typeMeta.getName());
            DatasetAdmin admin = type.getAdmin(context, spec1);
            try {
                if (existing != null) {
                    if (admin instanceof Updatable) {
                        ((Updatable) admin).update(existing);
                    } else {
                        admin.upgrade();
                    }
                } else {
                    LOG.trace("Creating dataset {} of type {}", datasetInstanceId.getDataset(), typeMeta.getName());
                    admin.create();
                    LOG.trace("Created dataset {} of type {}", datasetInstanceId.getDataset(), typeMeta.getName());
                }
            } finally {
                Closeables.closeQuietly(admin);
            }
            return spec1;
        });
        // Writing system metadata should be done without impersonation, since the user may not have access to system tables.
        LOG.trace("Computing metadata for dataset {}", datasetInstanceId.getDataset());
        SystemMetadata metadata = computeSystemMetadata(datasetInstanceId, spec, props, typeMeta, type, context, existing != null, ugi);
        LOG.trace("Computed metadata for dataset {}", datasetInstanceId.getDataset());
        return new DatasetCreationResponse(spec, metadata);
    } catch (Exception e) {
        if (e instanceof IncompatibleUpdateException) {
            // this is expected to happen if user provides bad update properties, so we log this as debug
            LOG.debug("Incompatible update for dataset '{}'", datasetInstanceId, e);
        } else {
            LOG.error("Error {} dataset '{}': {}", existing == null ? "creating" : "updating", datasetInstanceId, e.getMessage(), e);
        }
        throw e;
    }
}
Also used: DatasetSpecification (io.cdap.cdap.api.dataset.DatasetSpecification), DatasetAdmin (io.cdap.cdap.api.dataset.DatasetAdmin), DatasetType (io.cdap.cdap.data2.datafabric.dataset.DatasetType), IncompatibleUpdateException (io.cdap.cdap.api.dataset.IncompatibleUpdateException), AccessException (io.cdap.cdap.api.security.AccessException), IOException (java.io.IOException), BadRequestException (io.cdap.cdap.common.BadRequestException), NotFoundException (io.cdap.cdap.common.NotFoundException), DirectoryClassLoaderProvider (io.cdap.cdap.data2.datafabric.dataset.type.DirectoryClassLoaderProvider), Updatable (io.cdap.cdap.api.dataset.Updatable), SystemMetadata (io.cdap.cdap.data2.metadata.system.SystemMetadata), DatasetClassLoaderProvider (io.cdap.cdap.data2.datafabric.dataset.type.DatasetClassLoaderProvider), DatasetContext (io.cdap.cdap.api.dataset.DatasetContext), UserGroupInformation (org.apache.hadoop.security.UserGroupInformation)
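The create-versus-update branch in this method can be isolated into a small, dependency-free sketch. The `Admin` and `Updatable` interfaces and the two recording classes below are hypothetical stand-ins, not the CDAP `DatasetAdmin` API. They only illustrate the dispatch: create when there is no existing specification, call `update` when the admin supports it, and fall back to `upgrade` otherwise.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the admin interfaces; not the CDAP API.
interface Admin {
    void create();

    void upgrade();
}

interface Updatable {
    void update(String oldSpec);
}

// Records which lifecycle calls were made so the dispatch can be observed.
class PlainAdmin implements Admin {
    final List<String> calls = new ArrayList<>();

    @Override
    public void create() {
        calls.add("create");
    }

    @Override
    public void upgrade() {
        calls.add("upgrade");
    }
}

// An admin that additionally supports in-place updates.
class UpdatableAdmin extends PlainAdmin implements Updatable {
    @Override
    public void update(String oldSpec) {
        calls.add("update:" + oldSpec);
    }
}

final class CreateOrUpdateSketch {
    private CreateOrUpdateSketch() { }

    // existingSpec == null means "create"; anything else means "update".
    static void apply(Admin admin, String existingSpec) {
        if (existingSpec == null) {
            admin.create();
        } else if (admin instanceof Updatable) {
            ((Updatable) admin).update(existingSpec);
        } else {
            admin.upgrade();
        }
    }

    public static void main(String[] args) {
        UpdatableAdmin admin = new UpdatableAdmin();
        apply(admin, "old-spec");
        System.out.println(admin.calls);
    }
}
```

In the sketch the existing specification is a plain `String` for brevity; the method above passes a `DatasetSpecification` instead.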

Example 3 with SystemMetadata

use of io.cdap.cdap.data2.metadata.system.SystemMetadata in project cdap by caskdata.

In the class DatasetInstanceService, method publishMetadata.

private void publishMetadata(DatasetId dataset, SystemMetadata metadata) {
    if (metadata != null && !metadata.isEmpty()) {
        SystemMetadataWriter metadataWriter = new DelegateSystemMetadataWriter(metadataServiceClient, dataset, metadata);
        metadataWriter.write();
    }
}
Also used: DelegateSystemMetadataWriter (io.cdap.cdap.data2.metadata.system.DelegateSystemMetadataWriter), SystemMetadataWriter (io.cdap.cdap.data2.metadata.system.SystemMetadataWriter)

Example 4 with SystemMetadata

use of io.cdap.cdap.data2.metadata.system.SystemMetadata in project cdap by caskdata.

In the class DatasetInstanceService, method create.

/**
 * Creates a dataset instance.
 *
 * @param namespaceId the namespace to create the dataset instance in
 * @param name the name of the new dataset instance
 * @param props the properties for the new dataset instance
 * @throws NamespaceNotFoundException if the specified namespace was not found
 * @throws DatasetAlreadyExistsException if a dataset with the same name already exists
 * @throws DatasetTypeNotFoundException if the dataset type was not found
 * @throws UnauthorizedException if perimeter security and authorization are enabled, and the current user does not
 *  have {@link StandardPermission#CREATE} privilege on the dataset instance
 */
void create(String namespaceId, String name, DatasetInstanceConfiguration props) throws Exception {
    NamespaceId namespace = ConversionHelpers.toNamespaceId(namespaceId);
    DatasetId datasetId = ConversionHelpers.toDatasetInstanceId(namespaceId, name);
    Principal requestingUser = authenticationContext.getPrincipal();
    String ownerPrincipal = props.getOwnerPrincipal();
    // need to enforce on the principal id if impersonation is involved
    KerberosPrincipalId effectiveOwner = SecurityUtil.getEffectiveOwner(ownerAdmin, namespace, ownerPrincipal);
    if (DatasetsUtil.isUserDataset(datasetId)) {
        LOG.trace("Authorizing impersonation for dataset {}", name);
        if (effectiveOwner != null) {
            accessEnforcer.enforce(effectiveOwner, requestingUser, AccessPermission.SET_OWNER);
        }
        accessEnforcer.enforce(datasetId, requestingUser, StandardPermission.CREATE);
        LOG.trace("Authorized impersonation for dataset {}", name);
    }
    LOG.trace("Ensuring existence of namespace {} for dataset {}", namespace, name);
    ensureNamespaceExists(namespace);
    LOG.trace("Ensured existence of namespace {} for dataset {}", namespace, name);
    LOG.trace("Retrieving instance metadata from MDS for dataset {}", name);
    DatasetSpecification existing = instanceManager.get(datasetId);
    if (existing != null) {
        throw new DatasetAlreadyExistsException(datasetId);
    }
    LOG.trace("Retrieved instance metadata from MDS for dataset {}", name);
    // for creation, we need enforcement for dataset type for user dataset, but bypass for system datasets
    DatasetTypeMeta typeMeta = getTypeInfo(namespace, props.getTypeName(), !DatasetsUtil.isUserDataset(datasetId));
    if (typeMeta == null) {
        // Type not found in the instance's namespace and the system namespace. Bail out.
        throw new DatasetTypeNotFoundException(ConversionHelpers.toDatasetTypeId(namespace, props.getTypeName()));
    }
    LOG.info("Creating dataset {}.{}, type name: {}, properties: {}", namespaceId, name, props.getTypeName(), props.getProperties());
    // Store the owner principal first, if one was provided; the dataset was already verified not to exist above
    if (ownerPrincipal != null) {
        LOG.trace("Adding owner for dataset {}", name);
        KerberosPrincipalId owner = new KerberosPrincipalId(ownerPrincipal);
        ownerAdmin.add(datasetId, owner);
        LOG.trace("Added owner {} for dataset {}", owner, name);
    }
    try {
        DatasetProperties datasetProperties = DatasetProperties.builder().addAll(props.getProperties()).setDescription(props.getDescription()).build();
        LOG.trace("Calling op executor service to configure dataset {}", name);
        DatasetCreationResponse response = opExecutorClient.create(datasetId, typeMeta, datasetProperties);
        LOG.trace("Received spec and metadata from op executor service for dataset {}: {}", name, response);
        LOG.trace("Adding instance metadata for dataset {}", name);
        DatasetSpecification spec = response.getSpec();
        instanceManager.add(namespace, spec);
        LOG.trace("Added instance metadata for dataset {}", name);
        metaCache.invalidate(datasetId);
        LOG.trace("Publishing audit for creation of dataset {}", name);
        publishAudit(datasetId, AuditType.CREATE);
        LOG.trace("Published audit for creation of dataset {}", name);
        SystemMetadata metadata = response.getMetadata();
        LOG.trace("Publishing system metadata for creation of dataset {}: {}", name, metadata);
        publishMetadata(datasetId, metadata);
        LOG.trace("Published system metadata for creation of dataset {}", name);
        // Enable explore
        enableExplore(datasetId, spec, props);
    } catch (Exception e) {
        // creating the dataset instance failed, so delete the owner if it was added earlier;
        // this is safe to call even for entities that do not have an owner
        ownerAdmin.delete(datasetId);
        throw e;
    }
}
Also used: DatasetProperties (io.cdap.cdap.api.dataset.DatasetProperties), DatasetSpecification (io.cdap.cdap.api.dataset.DatasetSpecification), DatasetTypeMeta (io.cdap.cdap.proto.DatasetTypeMeta), DatasetCreationResponse (io.cdap.cdap.data2.datafabric.dataset.service.executor.DatasetCreationResponse), HandlerException (io.cdap.cdap.common.HandlerException), NotFoundException (io.cdap.cdap.common.NotFoundException), UnauthorizedException (io.cdap.cdap.security.spi.authorization.UnauthorizedException), DatasetTypeNotFoundException (io.cdap.cdap.common.DatasetTypeNotFoundException), NamespaceNotFoundException (io.cdap.cdap.common.NamespaceNotFoundException), IOException (java.io.IOException), DatasetAlreadyExistsException (io.cdap.cdap.common.DatasetAlreadyExistsException), ExecutionException (java.util.concurrent.ExecutionException), DatasetNotFoundException (io.cdap.cdap.common.DatasetNotFoundException), DatasetId (io.cdap.cdap.proto.id.DatasetId), SystemMetadata (io.cdap.cdap.data2.metadata.system.SystemMetadata), NamespaceId (io.cdap.cdap.proto.id.NamespaceId), KerberosPrincipalId (io.cdap.cdap.proto.id.KerberosPrincipalId), Principal (io.cdap.cdap.proto.security.Principal)
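The owner handling in this method follows a register-then-roll-back shape: the owner is stored before the creation step, and removed again if that step throws. A minimal sketch of that shape, assuming a hypothetical in-memory `OwnerStore` and a hypothetical `CreateStep` callback in place of the real owner admin and op executor client, might look like this:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory owner registry; stands in for the owner admin used above.
final class OwnerStore {
    private final Map<String, String> owners = new HashMap<>();

    void add(String datasetId, String owner) {
        owners.put(datasetId, owner);
    }

    // Safe to call even when no owner was stored for the dataset.
    void delete(String datasetId) {
        owners.remove(datasetId);
    }

    boolean hasOwner(String datasetId) {
        return owners.containsKey(datasetId);
    }
}

// Hypothetical creation step that may fail.
interface CreateStep {
    void run(String datasetId) throws Exception;
}

final class RollbackSketch {
    private RollbackSketch() { }

    static void createWithOwner(OwnerStore store, String datasetId, String owner, CreateStep step)
            throws Exception {
        // Register the owner first, but only if one was provided.
        if (owner != null) {
            store.add(datasetId, owner);
        }
        try {
            step.run(datasetId);
        } catch (Exception e) {
            // Undo the owner registration, then surface the original failure.
            store.delete(datasetId);
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        OwnerStore store = new OwnerStore();
        createWithOwner(store, "purchases", "alice", dsId -> { });
        System.out.println(store.hasOwner("purchases"));
    }
}
```

Because `delete` is a no-op for unknown ids, the catch block does not need to remember whether an owner was actually added, which mirrors the comment in the method above.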

Example 5 with SystemMetadata

use of io.cdap.cdap.data2.metadata.system.SystemMetadata in project cdap by caskdata.

In the class DatasetAdminService, method computeSystemMetadata.

private SystemMetadata computeSystemMetadata(DatasetId datasetInstanceId, final DatasetSpecification spec, DatasetProperties props, final DatasetTypeMeta typeMeta, final DatasetType type, final DatasetContext context, boolean existing, UserGroupInformation ugi) throws IOException {
    // add system metadata for user datasets only
    if (DatasetsUtil.isUserDataset(datasetInstanceId)) {
        Dataset dataset = null;
        try {
            try {
                dataset = ImpersonationUtils.doAs(ugi, () -> type.getDataset(context, spec, DatasetDefinition.NO_ARGUMENTS));
            } catch (Exception e) {
                LOG.warn("Exception while instantiating Dataset {}", datasetInstanceId, e);
            }
            // Make sure to write whatever system metadata can be derived,
            // even if the above instantiation throws an exception
            DatasetSystemMetadataProvider metadataProvider;
            if (existing) {
                metadataProvider = new DatasetSystemMetadataProvider(datasetInstanceId, props, dataset, typeMeta.getName(), spec.getDescription());
            } else {
                long createTime = System.currentTimeMillis();
                metadataProvider = new DatasetSystemMetadataProvider(datasetInstanceId, props, createTime, dataset, typeMeta.getName(), spec.getDescription());
            }
            return new SystemMetadata(metadataProvider.getSystemPropertiesToAdd(), metadataProvider.getSystemTagsToAdd(), metadataProvider.getSchemaToAdd());
        } finally {
            if (dataset != null) {
                dataset.close();
            }
        }
    }
    return SystemMetadata.EMPTY;
}
Also used: Dataset (io.cdap.cdap.api.dataset.Dataset), DatasetSystemMetadataProvider (io.cdap.cdap.data2.metadata.system.DatasetSystemMetadataProvider), SystemMetadata (io.cdap.cdap.data2.metadata.system.SystemMetadata), IncompatibleUpdateException (io.cdap.cdap.api.dataset.IncompatibleUpdateException), AccessException (io.cdap.cdap.api.security.AccessException), IOException (java.io.IOException), BadRequestException (io.cdap.cdap.common.BadRequestException), NotFoundException (io.cdap.cdap.common.NotFoundException)
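This method returns an `EMPTY` sentinel rather than null for non-user datasets, and the publishMetadata method in Example 3 skips metadata that is null or empty. A rough sketch of how such a producer and consumer fit together is shown below. `MetadataSketch`, its fields, and the property and tag values are all hypothetical; the real `SystemMetadata` class may be shaped differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical value class; the real SystemMetadata may differ.
final class MetadataSketch {
    // Shared sentinel meaning "nothing to publish".
    static final MetadataSketch EMPTY =
        new MetadataSketch(Collections.emptyMap(), Collections.emptySet());

    private final Map<String, String> properties;
    private final Set<String> tags;

    MetadataSketch(Map<String, String> properties, Set<String> tags) {
        // Defensive copies keep the value immutable.
        this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
        this.tags = Collections.unmodifiableSet(new HashSet<>(tags));
    }

    boolean isEmpty() {
        return properties.isEmpty() && tags.isEmpty();
    }
}

final class MetadataFlowSketch {
    private MetadataFlowSketch() { }

    // Producer: only user datasets get metadata; everything else yields the sentinel.
    static MetadataSketch compute(String name, boolean userDataset) {
        if (!userDataset) {
            return MetadataSketch.EMPTY;
        }
        Map<String, String> props = new HashMap<>();
        props.put("example-name", name);
        return new MetadataSketch(props, Collections.singleton("example-tag"));
    }

    // Consumer: writes to the sink only when there is something to write.
    // Returns true if the metadata was written.
    static boolean publish(MetadataSketch metadata, List<MetadataSketch> sink) {
        if (metadata != null && !metadata.isEmpty()) {
            sink.add(metadata);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<MetadataSketch> sink = new ArrayList<>();
        publish(compute("purchases", true), sink);
        publish(compute("system.table", false), sink);
        System.out.println(sink.size());
    }
}
```

Returning a sentinel keeps the producer's return type non-null, while the consumer's null check still protects against callers that pass null anyway.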

Aggregations

NotFoundException (io.cdap.cdap.common.NotFoundException): 3
SystemMetadata (io.cdap.cdap.data2.metadata.system.SystemMetadata): 3
IOException (java.io.IOException): 3
DatasetSpecification (io.cdap.cdap.api.dataset.DatasetSpecification): 2
IncompatibleUpdateException (io.cdap.cdap.api.dataset.IncompatibleUpdateException): 2
AccessException (io.cdap.cdap.api.security.AccessException): 2
BadRequestException (io.cdap.cdap.common.BadRequestException): 2
MetadataDataset (io.cdap.cdap.data2.metadata.dataset.MetadataDataset): 2
ImmutableMap (com.google.common.collect.ImmutableMap): 1
Dataset (io.cdap.cdap.api.dataset.Dataset): 1
DatasetAdmin (io.cdap.cdap.api.dataset.DatasetAdmin): 1
DatasetContext (io.cdap.cdap.api.dataset.DatasetContext): 1
DatasetProperties (io.cdap.cdap.api.dataset.DatasetProperties): 1
Updatable (io.cdap.cdap.api.dataset.Updatable): 1
Metadata (io.cdap.cdap.api.metadata.Metadata): 1
MetadataEntity (io.cdap.cdap.api.metadata.MetadataEntity): 1
MetadataScope (io.cdap.cdap.api.metadata.MetadataScope): 1
DatasetAlreadyExistsException (io.cdap.cdap.common.DatasetAlreadyExistsException): 1
DatasetNotFoundException (io.cdap.cdap.common.DatasetNotFoundException): 1
DatasetTypeNotFoundException (io.cdap.cdap.common.DatasetTypeNotFoundException): 1