Example 71 with Type

use of org.apache.avro.Schema.Type in project incubator-gobblin by apache.

the class AvroToJdbcEntryConverter method convertSchema.

/**
 * Converts an Avro schema into a JdbcEntrySchema.
 *
 * A few preconditions on the Avro schema:
 * 1. The Avro schema should have exactly one record type at the first depth.
 * 2. The Avro schema can recurse by having a record inside a record.
 * 3. Supported Avro primitive types and their conversions:
 *  boolean --> java.lang.Boolean
 *  int --> java.lang.Integer
 *  long --> java.lang.Long, java.sql.Date, java.sql.Time, or java.sql.Timestamp
 *  float --> java.lang.Float
 *  double --> java.lang.Double
 *  bytes --> byte[]
 *  string --> java.lang.String
 *  null: only allowed within a union (see complex types for more details)
 * 4. Supported Avro complex types:
 *  Record: nested record types are supported as well.
 *  Enum --> java.lang.String
 *  Union: only allowed if it contains exactly one primitive type, either alongside a record type or paired with a null type, in which case the null is ignored.
 *  Once a union is narrowed down to one primitive type, it follows the primitive type conversion above.
 * {@inheritDoc}
 *
 * 5. To convert from the Avro long type to java.sql.Date, java.sql.Time, or java.sql.Timestamp,
 * the converter retrieves table metadata via JDBC.
 * 6. As condition 5 requires a JDBC connection, the converter also assumes the use of a JDBC publisher, from which it obtains the connection information.
 * 7. The conversion assumes that both schemas, Avro and JDBC, use the same column names; the namespace in Avro is ignored.
 *    Regarding case sensitivity, Avro is case sensitive, while JDBC behavior depends on the underlying database. Since Avro is case sensitive, column name equality is also case sensitive.
 *
 * @see org.apache.gobblin.converter.Converter#convertSchema(java.lang.Object, org.apache.gobblin.configuration.WorkUnitState)
 */
@Override
public JdbcEntrySchema convertSchema(Schema inputSchema, WorkUnitState workUnit) throws SchemaConversionException {
    LOG.info("Converting schema " + inputSchema);
    Preconditions.checkArgument(Type.RECORD.equals(inputSchema.getType()), "%s is expected for the first level element in Avro schema %s", Type.RECORD, inputSchema);
    Map<String, Type> avroColumnType = flatten(inputSchema);
    String jsonStr = Preconditions.checkNotNull(workUnit.getProp(CONVERTER_AVRO_JDBC_DATE_FIELDS));
    java.lang.reflect.Type typeOfMap = new TypeToken<Map<String, JdbcType>>() {}.getType();
    Map<String, JdbcType> dateColumnMapping = new Gson().fromJson(jsonStr, typeOfMap);
    LOG.info("Date column mapping: " + dateColumnMapping);
    List<JdbcEntryMetaDatum> jdbcEntryMetaData = Lists.newArrayList();
    for (Map.Entry<String, Type> avroEntry : avroColumnType.entrySet()) {
        String colName = tryConvertAvroColNameToJdbcColName(avroEntry.getKey());
        JdbcType jdbcType = dateColumnMapping.get(colName);
        if (jdbcType == null) {
            jdbcType = AVRO_TYPE_JDBC_TYPE_MAPPING.get(avroEntry.getValue());
        }
        Preconditions.checkNotNull(jdbcType, "Failed to convert " + avroEntry + " AVRO_TYPE_JDBC_TYPE_MAPPING: " + AVRO_TYPE_JDBC_TYPE_MAPPING + " , dateColumnMapping: " + dateColumnMapping);
        jdbcEntryMetaData.add(new JdbcEntryMetaDatum(colName, jdbcType));
    }
    JdbcEntrySchema converted = new JdbcEntrySchema(jdbcEntryMetaData);
    LOG.info("Converted schema into " + converted);
    return converted;
}
Also used : Gson(com.google.gson.Gson) Type(org.apache.avro.Schema.Type) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap) Map(java.util.Map) ImmutableMap(com.google.common.collect.ImmutableMap)
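
The date-column mapping above arrives as a JSON string in the work unit and is parsed with Gson through a TypeToken. Below is a minimal standalone sketch of that parsing step; the property value and column names are hypothetical, and plain strings stand in for Gobblin's JdbcType enum so the sketch has no Gobblin dependency.

import java.util.Map;

import com.google.gson.Gson;
import com.google.gson.reflect.TypeToken;

public class DateFieldMappingSketch {
    public static void main(String[] args) {
        // Hypothetical value of the CONVERTER_AVRO_JDBC_DATE_FIELDS property;
        // the column names are illustrative, not from the original source.
        String jsonStr = "{\"created_at\":\"TIMESTAMP\",\"birth_date\":\"DATE\"}";
        // Same Gson + TypeToken pattern as convertSchema above.
        java.lang.reflect.Type typeOfMap = new TypeToken<Map<String, String>>() {}.getType();
        Map<String, String> dateColumnMapping = new Gson().fromJson(jsonStr, typeOfMap);
        // Prints: created_at -> TIMESTAMP and birth_date -> DATE
        dateColumnMapping.forEach((col, type) -> System.out.println(col + " -> " + type));
    }
}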

Example 72 with Type

use of org.apache.avro.Schema.Type in project avro-util by linkedin.

the class RecordBuilderBase method isValidValue.

protected static boolean isValidValue(Field f, Object value) {
    // Non-null values are always valid.
    if (value != null) {
        return true;
    }
    Schema schema = f.schema();
    Type type = schema.getType();
    // A null value is valid if the field's schema is NULL itself.
    if (type == Type.NULL) {
        return true;
    }
    // A null value is also valid for a union that contains a NULL branch.
    if (type == Type.UNION) {
        for (Schema s : schema.getTypes()) {
            if (s.getType() == Type.NULL) {
                return true;
            }
        }
    }
    return false;
}
Also used : Type(org.apache.avro.Schema.Type) Schema(org.apache.avro.Schema) Iterator(java.util.Iterator)
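
A quick usage sketch of the check above: a null value is accepted only when the field's schema is NULL or a union containing NULL. Since the original method is protected, the sketch uses a local stand-in with the same logic; the field names are hypothetical, and it assumes an Avro version (1.9+) with the Object-default Field constructor.

import java.util.Arrays;

import org.apache.avro.Schema;
import org.apache.avro.Schema.Field;
import org.apache.avro.Schema.Type;

public class IsValidValueSketch {
    public static void main(String[] args) {
        // A nullable string: union of NULL and STRING.
        Schema nullableString = Schema.createUnion(
                Arrays.asList(Schema.create(Type.NULL), Schema.create(Type.STRING)));
        Field optional = new Field("nickname", nullableString, null, (Object) null);
        // A plain string field: null is not a valid value for it.
        Field required = new Field("name", Schema.create(Type.STRING), null, (Object) null);

        System.out.println(isValidValue(optional, null)); // true
        System.out.println(isValidValue(required, null)); // false
    }

    // Local stand-in mirroring RecordBuilderBase.isValidValue above.
    static boolean isValidValue(Field f, Object value) {
        if (value != null || f.schema().getType() == Type.NULL) {
            return true;
        }
        return f.schema().getType() == Type.UNION
                && f.schema().getTypes().stream().anyMatch(s -> s.getType() == Type.NULL);
    }
}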

Example 73 with Type

use of org.apache.avro.Schema.Type in project akhq by tchiotludo.

the class AvroDeserializer method objectDeserializer.

@SuppressWarnings("unchecked")
private static Object objectDeserializer(Object value, Schema schema) {
    LogicalType logicalType = schema.getLogicalType();
    Type primitiveType = schema.getType();
    if (logicalType != null) {
        switch(logicalType.getName()) {
            case DATE:
                return AvroDeserializer.dateDeserializer(value, schema, primitiveType, logicalType);
            case DECIMAL:
                return AvroDeserializer.decimalDeserializer(value, schema, primitiveType, logicalType);
            case TIME_MICROS:
                return AvroDeserializer.timeMicrosDeserializer(value, schema, primitiveType, logicalType);
            case TIME_MILLIS:
                return AvroDeserializer.timeMillisDeserializer(value, schema, primitiveType, logicalType);
            case TIMESTAMP_MICROS:
                return AvroDeserializer.timestampMicrosDeserializer(value, schema, primitiveType, logicalType);
            case TIMESTAMP_MILLIS:
                return AvroDeserializer.timestampMillisDeserializer(value, schema, primitiveType, logicalType);
            case UUID:
                return AvroDeserializer.uuidDeserializer(value, schema, primitiveType, logicalType);
            default:
                throw new IllegalStateException("Unexpected value: " + logicalType);
        }
    } else {
        switch(primitiveType) {
            case UNION:
                return AvroDeserializer.unionDeserializer(value, schema);
            case MAP:
                return AvroDeserializer.mapDeserializer((Map<String, ?>) value, schema);
            case RECORD:
                return AvroDeserializer.recordDeserializer((GenericRecord) value);
            case ENUM:
                return value.toString();
            case ARRAY:
                return arrayDeserializer((Collection<?>) value, schema);
            case FIXED:
                return ((GenericFixed) value).bytes();
            case STRING:
                return ((CharSequence) value).toString();
            case BYTES:
                return ((ByteBuffer) value).array();
            case INT:
            case LONG:
            case FLOAT:
            case DOUBLE:
            case BOOLEAN:
            case NULL:
                return value;
            default:
                throw new IllegalStateException("Unexpected value: " + primitiveType);
        }
    }
}
Also used : GenericFixed(org.apache.avro.generic.GenericFixed) LogicalType(org.apache.avro.LogicalType) Type(org.apache.avro.Schema.Type) LogicalType(org.apache.avro.LogicalType) ByteBuffer(java.nio.ByteBuffer)
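
The per-logical-type helpers are referenced but not shown above. Below is a minimal sketch of what the timestamp-millis branch might do; the helper body is an assumption for illustration, not AKHQ's actual implementation. Avro's timestamp-millis logical type annotates a long holding epoch milliseconds.

import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class TimestampMillisSketch {
    // Hypothetical stand-in for AvroDeserializer.timestampMillisDeserializer.
    static ZonedDateTime timestampMillisDeserializer(Object value) {
        // timestamp-millis is encoded as a long of milliseconds since the epoch.
        return Instant.ofEpochMilli((Long) value).atZone(ZoneOffset.UTC);
    }

    public static void main(String[] args) {
        System.out.println(timestampMillisDeserializer(0L)); // 1970-01-01T00:00Z
    }
}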

Example 74 with Type

use of org.apache.avro.Schema.Type in project avro by a0x8o.

the class SpecificData method createSchema.

/**
 * Create the schema for a Java type.
 */
@SuppressWarnings(value = "unchecked")
protected Schema createSchema(java.lang.reflect.Type type, Map<String, Schema> names) {
    if (type instanceof Class && CharSequence.class.isAssignableFrom((Class) type))
        return Schema.create(Type.STRING);
    else if (type == ByteBuffer.class)
        return Schema.create(Type.BYTES);
    else if ((type == Integer.class) || (type == Integer.TYPE))
        return Schema.create(Type.INT);
    else if ((type == Long.class) || (type == Long.TYPE))
        return Schema.create(Type.LONG);
    else if ((type == Float.class) || (type == Float.TYPE))
        return Schema.create(Type.FLOAT);
    else if ((type == Double.class) || (type == Double.TYPE))
        return Schema.create(Type.DOUBLE);
    else if ((type == Boolean.class) || (type == Boolean.TYPE))
        return Schema.create(Type.BOOLEAN);
    else if ((type == Void.class) || (type == Void.TYPE))
        return Schema.create(Type.NULL);
    else if (type instanceof ParameterizedType) {
        ParameterizedType ptype = (ParameterizedType) type;
        Class raw = (Class) ptype.getRawType();
        java.lang.reflect.Type[] params = ptype.getActualTypeArguments();
        if (Collection.class.isAssignableFrom(raw)) {
            // array
            if (params.length != 1)
                throw new AvroTypeException("No array type specified.");
            return Schema.createArray(createSchema(params[0], names));
        } else if (Map.class.isAssignableFrom(raw)) {
            // map
            java.lang.reflect.Type key = params[0];
            java.lang.reflect.Type value = params[1];
            if (!(key instanceof Class && CharSequence.class.isAssignableFrom((Class<?>) key)))
                throw new AvroTypeException("Map key class not CharSequence: " + SchemaUtil.describe(key));
            return Schema.createMap(createSchema(value, names));
        } else {
            return createSchema(raw, names);
        }
    } else if (type instanceof Class) {
        // class
        Class c = (Class) type;
        String fullName = c.getName();
        Schema schema = names.get(fullName);
        if (schema == null)
            try {
                schema = (Schema) (c.getDeclaredField("SCHEMA$").get(null));
                if (!fullName.equals(getClassName(schema)))
                    // HACK: schema mismatches class. maven shade plugin? try replacing.
                    schema = new Schema.Parser().parse(schema.toString().replace(schema.getNamespace(), c.getPackage().getName()));
            } catch (NoSuchFieldException e) {
                throw new AvroRuntimeException("Not a Specific class: " + c);
            } catch (IllegalAccessException e) {
                throw new AvroRuntimeException(e);
            }
        names.put(fullName, schema);
        return schema;
    }
    throw new AvroTypeException("Unknown type: " + type);
}
Also used : Schema(org.apache.avro.Schema) AvroRuntimeException(org.apache.avro.AvroRuntimeException) ByteBuffer(java.nio.ByteBuffer) ParameterizedType(java.lang.reflect.ParameterizedType) Type(org.apache.avro.Schema.Type) ParameterizedType(java.lang.reflect.ParameterizedType) HashMap(java.util.HashMap) ConcurrentMap(java.util.concurrent.ConcurrentMap) Map(java.util.Map) WeakHashMap(java.util.WeakHashMap) ConcurrentHashMap(java.util.concurrent.ConcurrentHashMap) AvroTypeException(org.apache.avro.AvroTypeException)
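
SpecificData#getSchema is the public, caching entry point that delegates to createSchema. A short sketch exercising a few of the branches above; the static field exists only to obtain a ParameterizedType for List<Integer> via reflection.

import java.nio.ByteBuffer;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.specific.SpecificData;

public class CreateSchemaSketch {
    // Declared so its generic type can demonstrate the ParameterizedType branch.
    static List<Integer> sample;

    public static void main(String[] args) throws Exception {
        SpecificData data = SpecificData.get();

        System.out.println(data.getSchema(Integer.TYPE));     // "int"
        System.out.println(data.getSchema(ByteBuffer.class)); // "bytes"

        // List<Integer> maps to an Avro array of int via the Collection branch.
        java.lang.reflect.Type listType =
                CreateSchemaSketch.class.getDeclaredField("sample").getGenericType();
        System.out.println(data.getSchema(listType));         // {"type":"array","items":"int"}
    }
}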

Example 75 with Type

use of org.apache.avro.Schema.Type in project knime-cloud by knime.

the class AbstractAmazonPersonalizeDataUploadNodeModel method createSchema.

private String createSchema(final AmazonPersonalize personalizeClient, final DataTableSpec spec) {
    final StringBuilder schemaNameBuilder = new StringBuilder(getSchemaNamePrefix());
    FieldAssembler<Schema> fieldAssembler = createFieldAssembler(SCHEMA_NAMESPACE);
    for (final String colName : spec.getColumnNames()) {
        if (!colName.startsWith(PREFIX_METADATA_FIELD)) {
            continue;
        }
        final DataColumnSpec colSpec = spec.getColumnSpec(colName);
        final boolean isCategorical;
        final Type type;
        if (colSpec.getType().isCompatible(StringValue.class)) {
            isCategorical = true;
            type = Type.STRING;
        } else if (colSpec.getType().isCompatible(IntValue.class)) {
            isCategorical = false;
            type = Type.INT;
        } else if (colSpec.getType().isCompatible(LongValue.class)) {
            isCategorical = false;
            type = Type.LONG;
        } else {
            isCategorical = false;
            type = Type.DOUBLE;
        }
        schemaNameBuilder.append("-" + type);
        // 'categorical' must be set for metadata
        fieldAssembler = fieldAssembler.name(colName).prop("categorical", isCategorical).type(Schema.create(type)).noDefault();
    }
    final String schemaName = schemaNameBuilder.toString();
    // check if the same schema has been created before
    final List<DatasetSchemaSummary> existingSchemas = AmazonPersonalizeUtils.listAllSchemas(personalizeClient);
    final Optional<DatasetSchemaSummary> schemaSummary = existingSchemas.stream().filter(e -> e.getName().equals(schemaName)).findAny();
    // if so, use this one again
    if (schemaSummary.isPresent()) {
        return schemaSummary.get().getSchemaArn();
    }
    // otherwise create new one
    final Schema schema = fieldAssembler.endRecord();
    final CreateSchemaRequest createSchemaRequest = new CreateSchemaRequest().withName(schemaName).withSchema(schema.toString());
    return personalizeClient.createSchema(createSchemaRequest).getSchemaArn();
}
Also used : ConnectionMonitor(org.knime.base.filehandling.remote.files.ConnectionMonitor) Arrays(java.util.Arrays) NodeSettingsRO(org.knime.core.node.NodeSettingsRO) AmazonConnectionInformationPortObject(org.knime.cloud.aws.util.AmazonConnectionInformationPortObject) CSVWriter(org.knime.base.node.io.csvwriter.CSVWriter) InvalidSettingsException(org.knime.core.node.InvalidSettingsException) CanceledExecutionException(org.knime.core.node.CanceledExecutionException) URISyntaxException(java.net.URISyntaxException) ListDatasetGroupsResult(com.amazonaws.services.personalize.model.ListDatasetGroupsResult) DescribeDatasetGroupResult(com.amazonaws.services.personalize.model.DescribeDatasetGroupResult) RemoteFile(org.knime.base.filehandling.remote.files.RemoteFile) CreateDatasetGroupResult(com.amazonaws.services.personalize.model.CreateDatasetGroupResult) CreateDatasetImportJobRequest(com.amazonaws.services.personalize.model.CreateDatasetImportJobRequest) InvalidInputException(com.amazonaws.services.personalize.model.InvalidInputException) Status(org.knime.cloud.aws.mlservices.utils.personalize.AmazonPersonalizeUtils.Status) DataColumnSpec(org.knime.core.data.DataColumnSpec) Map(java.util.Map) FieldAssembler(org.apache.avro.SchemaBuilder.FieldAssembler) URI(java.net.URI) DeleteDatasetGroupRequest(com.amazonaws.services.personalize.model.DeleteDatasetGroupRequest) DescribeDatasetImportJobRequest(com.amazonaws.services.personalize.model.DescribeDatasetImportJobRequest) PortType(org.knime.core.node.port.PortType) FileWriterSettings(org.knime.base.node.io.csvwriter.FileWriterSettings) IntValue(org.knime.core.data.IntValue) ExecutionMonitor(org.knime.core.node.ExecutionMonitor) Schema(org.apache.avro.Schema) AmazonPersonalize(com.amazonaws.services.personalize.AmazonPersonalize) NodeModel(org.knime.core.node.NodeModel) Collectors(java.util.stream.Collectors) List(java.util.List) BufferedDataTable(org.knime.core.node.BufferedDataTable) RemoteFileFactory(org.knime.base.filehandling.remote.files.RemoteFileFactory) Optional(java.util.Optional) DataSource(com.amazonaws.services.personalize.model.DataSource) DescribeDatasetImportJobResult(com.amazonaws.services.personalize.model.DescribeDatasetImportJobResult) PortObject(org.knime.core.node.port.PortObject) LongValue(org.knime.core.data.LongValue) DataTableSpec(org.knime.core.data.DataTableSpec) DatasetGroupSummary(com.amazonaws.services.personalize.model.DatasetGroupSummary) DescribeDatasetGroupRequest(com.amazonaws.services.personalize.model.DescribeDatasetGroupRequest) HashMap(java.util.HashMap) DatasetSummary(com.amazonaws.services.personalize.model.DatasetSummary) BufferedOutputStream(java.io.BufferedOutputStream) ExecutionContext(org.knime.core.node.ExecutionContext) CloudConnectionInformation(org.knime.cloud.core.util.port.CloudConnectionInformation) Connection(org.knime.base.filehandling.remote.files.Connection) AmazonPersonalizeUtils(org.knime.cloud.aws.mlservices.utils.personalize.AmazonPersonalizeUtils) CreateSchemaRequest(com.amazonaws.services.personalize.model.CreateSchemaRequest) OutputStreamWriter(java.io.OutputStreamWriter) AmazonPersonalizeConnection(org.knime.cloud.aws.mlservices.nodes.personalize.AmazonPersonalizeConnection) DataCell(org.knime.core.data.DataCell) Type(org.apache.avro.Schema.Type) StringValue(org.knime.core.data.StringValue) CreateDatasetGroupRequest(com.amazonaws.services.personalize.model.CreateDatasetGroupRequest) ConnectionInformation(org.knime.base.filehandling.remote.connectioninformation.port.ConnectionInformation) CloseableRowIterator(org.knime.core.data.container.CloseableRowIterator) CreateDatasetRequest(com.amazonaws.services.personalize.model.CreateDatasetRequest) ListDatasetsResult(com.amazonaws.services.personalize.model.ListDatasetsResult) FileOutputStream(java.io.FileOutputStream) PortObjectSpec(org.knime.core.node.port.PortObjectSpec) IOException(java.io.IOException) DatasetSchemaSummary(com.amazonaws.services.personalize.model.DatasetSchemaSummary) DeleteDatasetRequest(com.amazonaws.services.personalize.model.DeleteDatasetRequest) File(java.io.File) DataRow(org.knime.core.data.DataRow) NodeSettingsWO(org.knime.core.node.NodeSettingsWO) ListDatasetGroupsRequest(com.amazonaws.services.personalize.model.ListDatasetGroupsRequest) ListDatasetsRequest(com.amazonaws.services.personalize.model.ListDatasetsRequest) StringUtils(com.amazonaws.util.StringUtils) ColumnRearranger(org.knime.core.data.container.ColumnRearranger) FileUtil(org.knime.core.util.FileUtil)
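
The FieldAssembler chain above comes from Avro's SchemaBuilder API. Here is a minimal standalone sketch assembling a comparable record schema; the record name, namespace, and column names are hypothetical, while the "categorical" property mirrors the code above.

import org.apache.avro.Schema;
import org.apache.avro.Schema.Type;
import org.apache.avro.SchemaBuilder;

public class PersonalizeSchemaSketch {
    public static void main(String[] args) {
        Schema schema = SchemaBuilder.record("Items")
                // Hypothetical namespace standing in for SCHEMA_NAMESPACE.
                .namespace("com.example.personalize.schema")
                .fields()
                // A string column is marked categorical, as in the loop above.
                .name("METADATA_GENRE").prop("categorical", true)
                        .type(Schema.create(Type.STRING)).noDefault()
                // Numeric columns are not categorical.
                .name("METADATA_PRICE").prop("categorical", false)
                        .type(Schema.create(Type.DOUBLE)).noDefault()
                .endRecord();

        System.out.println(schema.toString(true));
    }
}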

Aggregations

Type (org.apache.avro.Schema.Type): 80 uses
Schema (org.apache.avro.Schema): 58 uses
Field (org.apache.avro.Schema.Field): 32 uses
Map (java.util.Map): 20 uses
List (java.util.List): 16 uses
HashMap (java.util.HashMap): 15 uses
ArrayList (java.util.ArrayList): 13 uses
ByteBuffer (java.nio.ByteBuffer): 11 uses
Collectors (java.util.stream.Collectors): 11 uses
IOException (java.io.IOException): 10 uses
LogicalType (org.apache.avro.LogicalType): 8 uses
LinkedHashMap (java.util.LinkedHashMap): 7 uses
ConcurrentHashMap (java.util.concurrent.ConcurrentHashMap): 7 uses
ImmutableMap (com.google.common.collect.ImmutableMap): 6 uses
Arrays (java.util.Arrays): 5 uses
PersistentBase (org.apache.gora.persistency.impl.PersistentBase): 5 uses
Test (org.junit.Test): 5 uses
BaseRuntimeChildDefinition (ca.uhn.fhir.context.BaseRuntimeChildDefinition): 4 uses
BaseRuntimeElementDefinition (ca.uhn.fhir.context.BaseRuntimeElementDefinition): 4 uses
DataType (com.linkedin.pinot.common.data.FieldSpec.DataType): 4 uses