
Example 6 with UnsupportedTypeException

Use of co.cask.cdap.api.data.schema.UnsupportedTypeException in project cdap by caskdata.

The class DatasetSerDe, method getDatasetSchema:

private void getDatasetSchema(Configuration conf, DatasetId datasetId) throws SerDeException {
    try (ContextManager.Context hiveContext = ContextManager.getContext(conf)) {
        // conf, and therefore the context, can be null when Hive calls initialize just to get the object inspector
        if (hiveContext == null) {
            LOG.info("Hive provided a null conf, will not be able to get dataset schema.");
            return;
        }
        // some datasets like Table and ObjectMappedTable have schema in the dataset properties
        try {
            DatasetSpecification datasetSpec = hiveContext.getDatasetSpec(datasetId);
            String schemaStr = datasetSpec.getProperty("schema");
            if (schemaStr != null) {
                schema = Schema.parseJson(schemaStr);
                return;
            }
        } catch (DatasetManagementException | ServiceUnavailableException e) {
            throw new SerDeException("Could not instantiate dataset " + datasetId, e);
        } catch (IOException e) {
            throw new SerDeException("Exception getting schema for dataset " + datasetId, e);
        }
        // other datasets must be instantiated to get their schema
        // conf is null if this is a query that writes to a dataset
        ClassLoader parentClassLoader = conf == null ? null : conf.getClassLoader();
        try (SystemDatasetInstantiator datasetInstantiator = hiveContext.createDatasetInstantiator(parentClassLoader)) {
            Dataset dataset = datasetInstantiator.getDataset(datasetId);
            if (dataset == null) {
                throw new SerDeException("Could not find dataset " + datasetId);
            }
            Type recordType;
            if (dataset instanceof RecordScannable) {
                recordType = ((RecordScannable) dataset).getRecordType();
            } else if (dataset instanceof RecordWritable) {
                recordType = ((RecordWritable) dataset).getRecordType();
            } else {
                throw new SerDeException("Dataset " + datasetId + " is not explorable.");
            }
            schema = schemaGenerator.generate(recordType);
        } catch (UnsupportedTypeException e) {
            throw new SerDeException("Dataset " + datasetId + " has an unsupported schema.", e);
        } catch (IOException e) {
            throw new SerDeException("Exception while trying to instantiate dataset " + datasetId, e);
        }
    } catch (IOException e) {
        throw new SerDeException("Could not get hive context from configuration.", e);
    }
}
Also used : RecordWritable(co.cask.cdap.api.data.batch.RecordWritable) Dataset(co.cask.cdap.api.dataset.Dataset) DatasetSpecification(co.cask.cdap.api.dataset.DatasetSpecification) ServiceUnavailableException(co.cask.cdap.common.ServiceUnavailableException) IOException(java.io.IOException) RecordScannable(co.cask.cdap.api.data.batch.RecordScannable) DatasetManagementException(co.cask.cdap.api.dataset.DatasetManagementException) Type(java.lang.reflect.Type) SystemDatasetInstantiator(co.cask.cdap.data.dataset.SystemDatasetInstantiator) ContextManager(co.cask.cdap.hive.context.ContextManager) UnsupportedTypeException(co.cask.cdap.api.data.schema.UnsupportedTypeException) SerDeException(org.apache.hadoop.hive.serde2.SerDeException)
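For reference, the schemaGenerator.generate(recordType) call above is where UnsupportedTypeException originates in this example. Below is a minimal, hedged sketch of that step using CDAP's ReflectionSchemaGenerator; the import path and the Purchase record type are assumptions for illustration, not taken from the example above.

import co.cask.cdap.api.data.schema.Schema;
import co.cask.cdap.api.data.schema.UnsupportedTypeException;
import co.cask.cdap.internal.io.ReflectionSchemaGenerator;

public class SchemaGenerationSketch {

    // Illustrative record type; any plain POJO with supported field types works.
    public static class Purchase {
        String customer;
        int quantity;
    }

    public static void main(String[] args) {
        try {
            // Reflects over the record type and builds the corresponding CDAP Schema.
            Schema schema = new ReflectionSchemaGenerator().generate(Purchase.class);
            System.out.println(schema);
        } catch (UnsupportedTypeException e) {
            // Thrown when the type cannot be represented as a schema, as handled in the SerDe above.
            System.err.println("Unsupported record type: " + e.getMessage());
        }
    }
}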

Example 7 with UnsupportedTypeException

Use of co.cask.cdap.api.data.schema.UnsupportedTypeException in project cdap by caskdata.

The class StreamSerDe, method initialize:

// initialize gets called multiple times by Hive. It may seem like a good idea to put additional settings into
// the conf, but be very careful when doing so. If multiple Hive tables are involved in a query, initialize
// is called for each table before input splits are fetched for any table. It is therefore not safe to put anything
// the input format may need into the conf in this method. Instead, use the StorageHandler's methods to place the
// needed config into the properties map; that map is passed here and is also copied into the job conf for the
// input format to consume.
@Override
public void initialize(Configuration conf, Properties properties) throws SerDeException {
    // The columns property comes from the Hive metastore, which gets it from the create table statement.
    // It is therefore important that this schema be accurate and in the right order - the same order in which
    // the object inspectors will reflect the columns.
    String streamName = properties.getProperty(Constants.Explore.STREAM_NAME);
    String streamNamespace = properties.getProperty(Constants.Explore.STREAM_NAMESPACE);
    // to avoid a null pointer exception that prevents dropping a table, we handle the null namespace case here.
    if (streamNamespace == null) {
        // we also still need an ObjectInspector as Hive uses it to check what columns the table has.
        this.inspector = new ObjectDeserializer(properties, null).getInspector();
        return;
    }
    StreamId streamId = new StreamId(streamNamespace, streamName);
    try (ContextManager.Context context = ContextManager.getContext(conf)) {
        Schema schema = null;
        // the context can be null when Hive calls initialize just to get the object inspector
        if (context != null) {
            // Get the stream format from the stream config.
            FormatSpecification formatSpec = getFormatSpec(properties, streamId, context);
            this.streamFormat = (AbstractStreamEventRecordFormat) RecordFormats.createInitializedFormat(formatSpec);
            schema = formatSpec.getSchema();
        }
        this.deserializer = new ObjectDeserializer(properties, schema, BODY_OFFSET);
        this.inspector = deserializer.getInspector();
    } catch (UnsupportedTypeException e) {
        // this should have been validated up front when schema was set on the stream.
        // if we hit this something went wrong much earlier.
        LOG.error("Schema unsupported by format.", e);
        throw new SerDeException("Schema unsupported by format.", e);
    } catch (IOException e) {
        LOG.error("Could not get the config for stream {}.", streamName, e);
        throw new SerDeException("Could not get the config for stream " + streamName, e);
    } catch (Exception e) {
        LOG.error("Could not create the format for stream {}.", streamName, e);
        throw new SerDeException("Could not create the format for stream " + streamName, e);
    }
}
Also used : StreamId(co.cask.cdap.proto.id.StreamId) ContextManager(co.cask.cdap.hive.context.ContextManager) Schema(co.cask.cdap.api.data.schema.Schema) FormatSpecification(co.cask.cdap.api.data.format.FormatSpecification) UnsupportedTypeException(co.cask.cdap.api.data.schema.UnsupportedTypeException) IOException(java.io.IOException) ObjectDeserializer(co.cask.cdap.hive.serde.ObjectDeserializer) SerDeException(org.apache.hadoop.hive.serde2.SerDeException)
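The comment at the top of initialize recommends staging config in the StorageHandler rather than mutating the conf. A minimal sketch of that pattern against Hive's DefaultStorageHandler follows; the handler class name and the property key are illustrative assumptions, not CDAP's actual storage handler.

import java.util.Map;
import org.apache.hadoop.hive.ql.metadata.DefaultStorageHandler;
import org.apache.hadoop.hive.ql.plan.TableDesc;

public class ExampleStorageHandler extends DefaultStorageHandler {

    // Hypothetical property key, for illustration only.
    private static final String STREAM_NAME_KEY = "example.explore.stream.name";

    @Override
    public void configureInputJobProperties(TableDesc tableDesc, Map<String, String> jobProperties) {
        // Copy what the input format will need from the table properties into the job properties.
        // Per the comment above, Hive passes these properties to SerDe.initialize() and also copies
        // them into the job conf, which stays safe when several tables take part in one query.
        String streamName = tableDesc.getProperties().getProperty(STREAM_NAME_KEY);
        if (streamName != null) {
            jobProperties.put(STREAM_NAME_KEY, streamName);
        }
    }
}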

Example 8 with UnsupportedTypeException

Use of co.cask.cdap.api.data.schema.UnsupportedTypeException in project cdap by caskdata.

The class RecordFormat, method initialize:

/**
   * Initialize the format with the given desired schema and properties.
   * Guaranteed to be called once before any other method is called.
   *
   * @param formatSpecification the specification for the format, containing the desired schema and settings
   * @throws UnsupportedTypeException if the desired schema and properties are not supported
   */
public void initialize(@Nullable FormatSpecification formatSpecification) throws UnsupportedTypeException {
    Schema desiredSchema = null;
    Map<String, String> settings = Collections.emptyMap();
    if (formatSpecification != null) {
        desiredSchema = formatSpecification.getSchema();
        settings = formatSpecification.getSettings();
    }
    desiredSchema = desiredSchema == null ? getDefaultSchema() : desiredSchema;
    if (desiredSchema == null) {
        String msg = "A schema must be provided to the format: ";
        if (formatSpecification != null) {
            msg += formatSpecification.getName();
        }
        throw new UnsupportedTypeException(msg);
    }
    validateIsRecord(desiredSchema);
    validateSchema(desiredSchema);
    this.schema = desiredSchema;
    configure(settings);
}
Also used : Schema(co.cask.cdap.api.data.schema.Schema) UnsupportedTypeException(co.cask.cdap.api.data.schema.UnsupportedTypeException)
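A brief usage sketch of the contract documented above: build a FormatSpecification carrying the desired schema and pass it to initialize. The "text" format name, the record schema, and the import paths are assumptions for illustration; how the concrete RecordFormat instance is obtained is left out (the earlier examples use RecordFormats.createInitializedFormat, which performs this initialize call itself).

import co.cask.cdap.api.data.format.FormatSpecification;
import co.cask.cdap.api.data.format.RecordFormat;
import co.cask.cdap.api.data.schema.Schema;
import co.cask.cdap.api.data.schema.UnsupportedTypeException;
import java.util.Collections;

public class FormatInitializeSketch {

    static void initialize(RecordFormat<?, ?> format) throws UnsupportedTypeException {
        // Desired schema: a single-field record, the way a stream body is commonly modeled.
        Schema schema = Schema.recordOf("event",
            Schema.Field.of("body", Schema.of(Schema.Type.STRING)));
        FormatSpecification spec =
            new FormatSpecification("text", schema, Collections.<String, String>emptyMap());
        // Throws UnsupportedTypeException when the format cannot support the desired schema.
        format.initialize(spec);
    }
}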

Example 9 with UnsupportedTypeException

Use of co.cask.cdap.api.data.schema.UnsupportedTypeException in project cdap by caskdata.

The class ArtifactInspector, method inspectPlugins:

/**
   * Inspects the plugin file and extracts plugin classes information.
   */
private ArtifactClasses.Builder inspectPlugins(ArtifactClasses.Builder builder, File artifactFile, ArtifactId artifactId, PluginInstantiator pluginInstantiator) throws IOException, InvalidArtifactException {
    // See if there are export packages. Plugins should be in those packages
    Set<String> exportPackages = getExportPackages(artifactFile);
    if (exportPackages.isEmpty()) {
        return builder;
    }
    try {
        ClassLoader pluginClassLoader = pluginInstantiator.getArtifactClassLoader(artifactId);
        for (Class<?> cls : getPluginClasses(exportPackages, pluginClassLoader)) {
            Plugin pluginAnnotation = cls.getAnnotation(Plugin.class);
            if (pluginAnnotation == null) {
                continue;
            }
            Map<String, PluginPropertyField> pluginProperties = Maps.newHashMap();
            try {
                String configField = getProperties(TypeToken.of(cls), pluginProperties);
                Set<String> pluginEndpoints = getPluginEndpoints(cls);
                PluginClass pluginClass = new PluginClass(pluginAnnotation.type(), getPluginName(cls), getPluginDescription(cls), cls.getName(), configField, pluginProperties, pluginEndpoints);
                builder.addPlugin(pluginClass);
            } catch (UnsupportedTypeException e) {
                LOG.warn("Plugin configuration type not supported. Plugin ignored. {}", cls, e);
            }
        }
    } catch (Throwable t) {
        throw new InvalidArtifactException(String.format("Class could not be found while inspecting artifact for plugins. " + "Please check dependencies are available, and that the correct parent artifact was specified. " + "Error class: %s, message: %s.", t.getClass(), t.getMessage()), t);
    }
    return builder;
}
Also used : CloseableClassLoader(co.cask.cdap.api.artifact.CloseableClassLoader) UnsupportedTypeException(co.cask.cdap.api.data.schema.UnsupportedTypeException) PluginClass(co.cask.cdap.api.plugin.PluginClass) PluginPropertyField(co.cask.cdap.api.plugin.PluginPropertyField) InvalidArtifactException(co.cask.cdap.common.InvalidArtifactException) Plugin(co.cask.cdap.api.annotation.Plugin)
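A hedged sketch of the kind of class inspectPlugins discovers: a class annotated with @Plugin in one of the artifact's exported packages, with a PluginConfig whose fields become PluginPropertyField entries. The plugin type "transform", the names, and the config field below are illustrative assumptions.

import co.cask.cdap.api.annotation.Description;
import co.cask.cdap.api.annotation.Name;
import co.cask.cdap.api.annotation.Plugin;
import co.cask.cdap.api.plugin.PluginConfig;

// Must live in a package listed in the artifact's Export-Package header to be discovered.
@Plugin(type = "transform")
@Name("example")
@Description("Illustrative plugin; the inspector records its type, name, description and config.")
public class ExamplePlugin {

    // Each field here is reflected into a PluginPropertyField. A config type the reflection code
    // cannot handle raises UnsupportedTypeException, which inspectPlugins logs and then skips.
    public static class Config extends PluginConfig {
        private String threshold;
    }

    private final Config config;

    public ExamplePlugin(Config config) {
        this.config = config;
    }
}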

Example 10 with UnsupportedTypeException

Use of co.cask.cdap.api.data.schema.UnsupportedTypeException in project cdap by caskdata.

The class ViewSystemMetadataWriter, method getSchemaToAdd:

@Nullable
@Override
protected String getSchemaToAdd() {
    Schema schema = viewSpec.getFormat().getSchema();
    if (schema == null) {
        FormatSpecification format = viewSpec.getFormat();
        RecordFormat<Object, Object> initializedFormat;
        try {
            initializedFormat = RecordFormats.createInitializedFormat(format);
            schema = initializedFormat.getSchema();
        } catch (IllegalAccessException | InstantiationException | UnsupportedTypeException | ClassNotFoundException e) {
            LOG.debug("Exception: ", e);
            LOG.warn("Exception while determining schema for view {}. View {} will not contain schema as metadata.", viewId, viewId);
        }
    }
    return schema == null ? null : schema.toString();
}
Also used : Schema(co.cask.cdap.api.data.schema.Schema) FormatSpecification(co.cask.cdap.api.data.format.FormatSpecification) UnsupportedTypeException(co.cask.cdap.api.data.schema.UnsupportedTypeException) Nullable(javax.annotation.Nullable)

Aggregations

UnsupportedTypeException (co.cask.cdap.api.data.schema.UnsupportedTypeException): 30 uses
Schema (co.cask.cdap.api.data.schema.Schema): 11 uses
Stream (co.cask.cdap.api.data.stream.Stream): 10 uses
IOException (java.io.IOException): 9 uses
FormatSpecification (co.cask.cdap.api.data.format.FormatSpecification): 4 uses
DatasetManagementException (co.cask.cdap.api.dataset.DatasetManagementException): 3 uses
BadRequestException (co.cask.cdap.common.BadRequestException): 3 uses
ExploreException (co.cask.cdap.explore.service.ExploreException): 3 uses
QueryHandle (co.cask.cdap.proto.QueryHandle): 3 uses
JsonObject (com.google.gson.JsonObject): 3 uses
SQLException (java.sql.SQLException): 3 uses
DatasetSpecification (co.cask.cdap.api.dataset.DatasetSpecification): 2 uses
PluginPropertyField (co.cask.cdap.api.plugin.PluginPropertyField): 2 uses
InvalidArtifactException (co.cask.cdap.common.InvalidArtifactException): 2 uses
AuditPolicy (co.cask.cdap.common.security.AuditPolicy): 2 uses
ContextManager (co.cask.cdap.hive.context.ContextManager): 2 uses
StreamId (co.cask.cdap.proto.id.StreamId): 2 uses
JsonSyntaxException (com.google.gson.JsonSyntaxException): 2 uses
InputStream (java.io.InputStream): 2 uses
InputStreamReader (java.io.InputStreamReader): 2 uses