Example 1 with DataSchemaTraverse

Use of com.linkedin.data.schema.DataSchemaTraverse in the rest.li project by LinkedIn.

Class SchemaTranslator, method dataToAvroSchemaJson.

/**
   * Translate from a {@link DataSchema} to an Avro {@link Schema}
   * <p>
   * This method translates optional fields in the {@link DataSchema} to union with null
   * fields in Avro {@link Schema}. Record fields with optional attribute set to true will
   * be translated to a union type that has a null member type. If the field's type is not
   * a union, then the new union type will be a union of the field's type and the null type.
   * If the field's type is already a union, the new union type contains all the
   * union's member types and the null type.
   * <p>
   * This method also translates or sets the default value for optional fields in
   * the {@link DataSchema}. If the optional field does not have a default value,
   * set the translated default value to null. {@link OptionalDefaultMode}
   * specifies how an optional field with a default value is translated.
   * <p>
   * Both the schema and default value translation take into account that default value
   * representation for Avro unions does not include the member type discriminator and
   * the type of the default value is always the 1st member type of the union. Schema translation
   * fails by throwing an {@link IllegalArgumentException} if the default value's type
   * is not the same as the 1st member type of the union.
   * <p>
   * If {@link DataToAvroSchemaTranslationOptions#getEmbeddedSchema()} is
   * set to {@link EmbedSchemaMode#ROOT_ONLY}, then the input {@link DataSchema} will be embedded in the
   * translated Avro {@link Schema}.
   * The embedded schema will be the value of the "schema" property within the "com.linkedin.data" property.
   * If the input {@link DataSchema} is a typeref, then embedded schema will be that of the
   * actual type referenced.
   *
   * @param dataSchema provides the {@link DataSchema}.
   * @param options specifies the {@link DataToAvroSchemaTranslationOptions}.
   * @return the JSON representation of the Avro {@link Schema}.
   * @throws IllegalArgumentException if the {@link DataSchema} cannot be translated.
   */
public static String dataToAvroSchemaJson(DataSchema dataSchema, DataToAvroSchemaTranslationOptions options) throws IllegalArgumentException {
    // convert default values
    DataSchemaTraverse postOrderTraverse = new DataSchemaTraverse(DataSchemaTraverse.Order.POST_ORDER);
    final DefaultDataToAvroConvertCallback defaultConverter = new DefaultDataToAvroConvertCallback(options);
    postOrderTraverse.traverse(dataSchema, defaultConverter);
    // convert schema
    String schemaJson = SchemaToAvroJsonEncoder.schemaToAvro(dataSchema, defaultConverter.fieldDefaultValueProvider(), options);
    return schemaJson;
}
Also used : ByteString(com.linkedin.data.ByteString) DataSchemaTraverse(com.linkedin.data.schema.DataSchemaTraverse)
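
The optional-field rule described in the Javadoc above can be illustrated with a small, self-contained sketch. It does not use the rest.li API: the class OptionalUnionSketch and its method are hypothetical, and types are modelled as plain strings. It also assumes that when the translated default is not null, the null type is appended last so that the original first member stays first; the actual behaviour depends on OptionalDefaultMode, which is not modelled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the optional-field rule; hypothetical, not part of rest.li. */
final class OptionalUnionSketch {
    static final String NULL_TYPE = "null";

    /**
     * Returns the member types of the Avro union for an optional field.
     *
     * @param memberTypes one entry for a non-union field type, or the
     *                    members of the union if the field type is a union.
     * @param defaultIsNull true if the translated default value is null.
     *                      Avro requires the default to match the 1st member,
     *                      so the null type is placed first in that case.
     */
    static List<String> toOptionalUnion(List<String> memberTypes, boolean defaultIsNull) {
        List<String> result = new ArrayList<>();
        if (defaultIsNull) {
            result.add(NULL_TYPE);
        }
        result.addAll(memberTypes);
        if (!defaultIsNull) {
            // Assumption of this sketch: keep the original 1st member first.
            result.add(NULL_TYPE);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toOptionalUnion(Arrays.asList("string"), true));
        System.out.println(toOptionalUnion(Arrays.asList("int", "string"), false));
    }
}
```

A non-union type becomes a two-member union, while an existing union keeps all of its members and gains the null type, which matches the two cases the Javadoc distinguishes.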

Example 2 with DataSchemaTraverse

Use of com.linkedin.data.schema.DataSchemaTraverse in the rest.li project by LinkedIn.

Class DataSchemaAnnotationValidator, method buildSchemaValidators.

/**
   * Build a cache of {@link Validator}s declared for the specified schema.
   *
   * @param schema to cache {@link Validator}s for.
   * @return the cache if successful.
   */
private IdentityHashMap<Object, List<Validator>> buildSchemaValidators(DataSchema schema) {
    final IdentityHashMap<Object, List<Validator>> map = new IdentityHashMap<Object, List<Validator>>();
    DataSchemaTraverse traverse = new DataSchemaTraverse();
    traverse.traverse(schema, new DataSchemaTraverse.Callback() {

        @Override
        public void callback(List<String> path, DataSchema schema) {
            List<Validator> validatorList = map.get(schema);
            if (validatorList == null) {
                Object validateObject = schema.getProperties().get(VALIDATE);
                if (validateObject == null) {
                    validatorList = NO_VALIDATORS;
                } else {
                    validatorList = buildValidatorList(validateObject, path, schema);
                }
                map.put(schema, validatorList);
                if (schema.getType() == DataSchema.Type.RECORD) {
                    RecordDataSchema recordDataSchema = (RecordDataSchema) schema;
                    for (RecordDataSchema.Field field : recordDataSchema.getFields()) {
                        validateObject = field.getProperties().get(VALIDATE);
                        if (validateObject == null) {
                            validatorList = NO_VALIDATORS;
                        } else {
                            path.add(field.getName());
                            validatorList = buildValidatorList(validateObject, path, field);
                            path.remove(path.size() - 1);
                        }
                        map.put(field, validatorList);
                    }
                }
            }
        }
    });
    return map;
}
Also used : IdentityHashMap(java.util.IdentityHashMap) DataSchema(com.linkedin.data.schema.DataSchema) TyperefDataSchema(com.linkedin.data.schema.TyperefDataSchema) RecordDataSchema(com.linkedin.data.schema.RecordDataSchema) NamedDataSchema(com.linkedin.data.schema.NamedDataSchema) MessageList(com.linkedin.data.message.MessageList) ArrayList(java.util.ArrayList) List(java.util.List) DataSchemaTraverse(com.linkedin.data.schema.DataSchemaTraverse)
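
The examples on this page pass a callback that receives a path and a schema, and Example 1 explicitly requests POST_ORDER. The sketch below shows the general shape of such a traversal on a toy tree. It is not the DataSchemaTraverse implementation: TraverseSketch, Node, Order and Callback are hypothetical stand-ins, and tracking visited nodes by identity is an assumption made so that shared or recursive nodes are reported once.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;

/** Toy pre-order/post-order traversal with a path-aware callback (hypothetical). */
final class TraverseSketch {
    enum Order { PRE_ORDER, POST_ORDER }

    /** Stand-in for a schema node: a name plus child nodes. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        Node add(Node child) { children.add(child); return this; }
    }

    interface Callback {
        void callback(List<String> path, Node node);
    }

    static void traverse(Node root, Order order, Callback callback) {
        walk(root, order, callback, new ArrayList<>(), new IdentityHashMap<>());
    }

    private static void walk(Node node, Order order, Callback callback,
                             List<String> path, IdentityHashMap<Node, Boolean> seen) {
        if (seen.put(node, Boolean.TRUE) != null) {
            return; // already visited; this also stops cycles
        }
        path.add(node.name);
        if (order == Order.PRE_ORDER) {
            callback.callback(path, node); // parent before children
        }
        for (Node child : node.children) {
            walk(child, order, callback, path, seen);
        }
        if (order == Order.POST_ORDER) {
            callback.callback(path, node); // children before parent
        }
        path.remove(path.size() - 1);
    }

    public static void main(String[] args) {
        Node root = new Node("a").add(new Node("b").add(new Node("d"))).add(new Node("c"));
        traverse(root, Order.POST_ORDER, (path, node) -> System.out.println(path));
    }
}
```

Post-order suits work that needs child results first, such as converting nested defaults before the enclosing record. The path list is reused across calls, so a callback that wants to keep it should copy it.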

Example 3 with DataSchemaTraverse

Use of com.linkedin.data.schema.DataSchemaTraverse in the rest.li project by LinkedIn.

Class SchemaTranslator, method avroToDataSchema.

/**
   * Translate an Avro {@link Schema} to a {@link DataSchema}.
   * <p>
   * If the translation mode is {@link AvroToDataSchemaTranslationMode#RETURN_EMBEDDED_SCHEMA}
   * and a {@link DataSchema} is embedded in the Avro schema, then return the embedded schema.
   * An embedded schema is present if the Avro {@link Schema} has a "com.linkedin.data" property and the
   * "com.linkedin.data" property contains both "schema" and "optionalDefaultMode" properties.
   * The "schema" property provides the embedded {@link DataSchema}.
   * The "optionalDefaultMode" property provides how optional default values were translated.
   * <p>
   * If the translation mode is {@link AvroToDataSchemaTranslationMode#VERIFY_EMBEDDED_SCHEMA}
   * and a {@link DataSchema} is embedded in the Avro schema, then verify that the embedded schema
   * translates to the input Avro schema. If the translated and embedded schema is the same,
   * then return the embedded schema, else throw {@link IllegalArgumentException}.
   * <p>
   * If the translation mode is {@link com.linkedin.data.avro.AvroToDataSchemaTranslationMode#TRANSLATE}
   * or no embedded {@link DataSchema} is present, then this method
   * translates the provided Avro {@link Schema} to a {@link DataSchema}
   * as follows:
   * <p>
   * This method translates union with null record fields in Avro {@link Schema}
   * to optional fields in {@link DataSchema}. Record fields
   * whose type is a union with null will be translated to a new type, and the field becomes optional.
   * If the Avro union has two types (one of them is the null type), then the new type of the
   * field is the non-null member type of the union. If the Avro union does not have two types
   * (one of them is the null type) then the new type of the field is a union type with the null type
   * removed from the original union.
   * <p>
   * This method also translates default values. If the field's type is a union with null
   * and has a default value, then this method also translates the default value of the field
   * to comply with the new type of the field. If the default value is null,
   * then remove the default value. If the new type is not a union and the default value
   * is of the non-null member type, then assign the default value to the
   * non-null value within the union value (i.e. the value of the only entry within the
   * JSON object.) If the new type is a union and the default value is of the
   * non-null member type, then assign the default value to a JSON object
   * containing a single entry with the key being the member type discriminator of
   * the first union member and the value being the actual member value.
   * <p>
   * Both the schema and default value translation take into account that default value
   * representation for Avro unions does not include the member type discriminator and
   * the type of the default value is always the 1st member of the union.
   *
   * @param avroSchemaInJson provides the JSON representation of the Avro {@link Schema}.
   * @param options specifies the {@link AvroToDataSchemaTranslationOptions}.
   * @return the translated {@link DataSchema}.
   * @throws IllegalArgumentException if the Avro {@link Schema} cannot be translated.
   */
public static DataSchema avroToDataSchema(String avroSchemaInJson, AvroToDataSchemaTranslationOptions options) throws IllegalArgumentException {
    ValidationOptions validationOptions = SchemaParser.getDefaultSchemaParserValidationOptions();
    validationOptions.setAvroUnionMode(true);
    SchemaParserFactory parserFactory = SchemaParserFactory.instance(validationOptions);
    DataSchemaResolver resolver = getResolver(parserFactory, options);
    PegasusSchemaParser parser = parserFactory.create(resolver);
    parser.parse(avroSchemaInJson);
    if (parser.hasError()) {
        throw new IllegalArgumentException(parser.errorMessage());
    }
    assert (parser.topLevelDataSchemas().size() == 1);
    DataSchema dataSchema = parser.topLevelDataSchemas().get(0);
    DataSchema resultDataSchema = null;
    AvroToDataSchemaTranslationMode translationMode = options.getTranslationMode();
    if (translationMode == AvroToDataSchemaTranslationMode.RETURN_EMBEDDED_SCHEMA || translationMode == AvroToDataSchemaTranslationMode.VERIFY_EMBEDDED_SCHEMA) {
        // check for embedded schema
        Object dataProperty = dataSchema.getProperties().get(SchemaTranslator.DATA_PROPERTY);
        if (dataProperty != null && dataProperty.getClass() == DataMap.class) {
            Object schemaProperty = ((DataMap) dataProperty).get(SchemaTranslator.SCHEMA_PROPERTY);
            if (schemaProperty != null && schemaProperty.getClass() == DataMap.class) {
                SchemaParser embeddedSchemaParser = SchemaParserFactory.instance().create(null);
                embeddedSchemaParser.parse(Arrays.asList(schemaProperty));
                if (embeddedSchemaParser.hasError()) {
                    throw new IllegalArgumentException("Embedded schema is invalid\n" + embeddedSchemaParser.errorMessage());
                }
                assert (embeddedSchemaParser.topLevelDataSchemas().size() == 1);
                resultDataSchema = embeddedSchemaParser.topLevelDataSchemas().get(0);
                if (translationMode == AvroToDataSchemaTranslationMode.VERIFY_EMBEDDED_SCHEMA) {
                    // additional verification to make sure that embedded schema translates to Avro schema
                    DataToAvroSchemaTranslationOptions dataToAvroSchemaOptions = new DataToAvroSchemaTranslationOptions();
                    Object optionalDefaultModeProperty = ((DataMap) dataProperty).get(SchemaTranslator.OPTIONAL_DEFAULT_MODE_PROPERTY);
                    dataToAvroSchemaOptions.setOptionalDefaultMode(OptionalDefaultMode.valueOf(optionalDefaultModeProperty.toString()));
                    Schema avroSchemaFromEmbedded = dataToAvroSchema(resultDataSchema, dataToAvroSchemaOptions);
                    Schema avroSchemaFromJson = Schema.parse(avroSchemaInJson);
                    if (!avroSchemaFromEmbedded.equals(avroSchemaFromJson)) {
                        throw new IllegalArgumentException("Embedded schema does not translate to input Avro schema: " + avroSchemaInJson);
                    }
                }
            }
        }
    }
    if (resultDataSchema == null) {
        // translationMode == TRANSLATE or no embedded schema
        DataSchemaTraverse traverse = new DataSchemaTraverse();
        traverse.traverse(dataSchema, AvroToDataSchemaConvertCallback.INSTANCE);
        // convert default values
        traverse.traverse(dataSchema, DefaultAvroToDataConvertCallback.INSTANCE);
        // make sure it can round-trip
        String dataSchemaJson = dataSchema.toString();
        resultDataSchema = DataTemplateUtil.parseSchema(dataSchemaJson);
    }
    return resultDataSchema;
}
Also used : PegasusSchemaParser(com.linkedin.data.schema.PegasusSchemaParser) SchemaParserFactory(com.linkedin.data.schema.SchemaParserFactory) FixedDataSchema(com.linkedin.data.schema.FixedDataSchema) DataSchema(com.linkedin.data.schema.DataSchema) UnionDataSchema(com.linkedin.data.schema.UnionDataSchema) MapDataSchema(com.linkedin.data.schema.MapDataSchema) EnumDataSchema(com.linkedin.data.schema.EnumDataSchema) Schema(org.apache.avro.Schema) RecordDataSchema(com.linkedin.data.schema.RecordDataSchema) ArrayDataSchema(com.linkedin.data.schema.ArrayDataSchema) ByteString(com.linkedin.data.ByteString) ValidationOptions(com.linkedin.data.schema.validation.ValidationOptions) SchemaParser(com.linkedin.data.schema.SchemaParser) DataMap(com.linkedin.data.DataMap) FileDataSchemaResolver(com.linkedin.data.schema.resolver.FileDataSchemaResolver) DataSchemaResolver(com.linkedin.data.schema.DataSchemaResolver) DefaultDataSchemaResolver(com.linkedin.data.schema.resolver.DefaultDataSchemaResolver) DataSchemaTraverse(com.linkedin.data.schema.DataSchemaTraverse)
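
The reverse rule in the Javadoc above (a record field whose Avro type is a union with null becomes an optional field) can be sketched in isolation. This is a toy model with types as strings, not the AvroToDataSchemaConvertCallback logic; the names AvroUnionToOptionalSketch and Translated are hypothetical, and default value translation is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the "union with null becomes optional" rule (hypothetical). */
final class AvroUnionToOptionalSketch {
    /** Result of translating one field type. */
    static final class Translated {
        final List<String> memberTypes; // remaining member types
        final boolean optional;         // true if a null member was removed

        Translated(List<String> memberTypes, boolean optional) {
            this.memberTypes = memberTypes;
            this.optional = optional;
        }

        /** A single remaining member means the field type is that member itself. */
        boolean isUnion() { return memberTypes.size() != 1; }
    }

    static Translated translate(List<String> avroUnionMembers) {
        List<String> remaining = new ArrayList<>(avroUnionMembers);
        // remove(Object) drops the first "null" entry and reports whether one existed
        boolean hadNull = remaining.remove("null");
        return new Translated(remaining, hadNull);
    }

    public static void main(String[] args) {
        Translated t = translate(Arrays.asList("null", "string"));
        System.out.println(t.memberTypes + " optional=" + t.optional + " union=" + t.isUnion());
    }
}
```

With two members (one of them null) the field takes the non-null member type directly; with any other member count the field keeps a union type with the null member removed, as the Javadoc states.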

Example 4 with DataSchemaTraverse

Use of com.linkedin.data.schema.DataSchemaTraverse in the rest.li project by LinkedIn.

Class RestLiResourceRelationship, method findDataModels.

private void findDataModels() {
    final ResourceSchemaVisitior visitor = new BaseResourceSchemaVisitor() {

        @Override
        public void visitResourceSchema(VisitContext visitContext, ResourceSchema resourceSchema) {
            final String schema = resourceSchema.getSchema();
            // ActionSet resources do not have a schema
            if (schema != null) {
                final NamedDataSchema schemaSchema = extractSchema(schema);
                if (schemaSchema != null) {
                    connectSchemaToResource(visitContext, schemaSchema);
                }
            }
        }

        @Override
        public void visitCollectionResource(VisitContext visitContext, CollectionSchema collectionSchema) {
            final IdentifierSchema id = collectionSchema.getIdentifier();
            final NamedDataSchema typeSchema = extractSchema(id.getType());
            if (typeSchema != null) {
                connectSchemaToResource(visitContext, typeSchema);
            }
            final String params = id.getParams();
            if (params != null) {
                final NamedDataSchema paramsSchema = extractSchema(params);
                if (paramsSchema != null) {
                    connectSchemaToResource(visitContext, paramsSchema);
                }
            }
        }

        @Override
        public void visitAssociationResource(VisitContext visitContext, AssociationSchema associationSchema) {
            for (AssocKeySchema key : associationSchema.getAssocKeys()) {
                final NamedDataSchema keyTypeSchema = extractSchema(key.getType());
                if (keyTypeSchema != null) {
                    connectSchemaToResource(visitContext, keyTypeSchema);
                }
            }
        }

        @Override
        public void visitParameter(VisitContext visitContext, RecordTemplate parentResource, Object parentMethodSchema, ParameterSchema parameterSchema) {
            String parameterTypeString = parameterSchema.getType();
            // the parameter type field contains an inline schema, so we traverse into it
            if (isInlineSchema(parameterTypeString)) {
                visitInlineSchema(visitContext, parameterTypeString);
            } else {
                final NamedDataSchema schema;
                // if the parameter has an items field, grab the schema name from it
                if (parameterSchema.hasItems()) {
                    schema = extractSchema(parameterSchema.getItems());
                } else {
                    // the only remaining possibility is that the type field contains the name of a data schema
                    schema = extractSchema(parameterTypeString);
                }
                if (schema != null) {
                    connectSchemaToResource(visitContext, schema);
                }
            }
        }

        @Override
        public void visitFinder(VisitContext visitContext, RecordTemplate parentResource, FinderSchema finderSchema) {
            final MetadataSchema metadata = finderSchema.getMetadata();
            if (metadata != null) {
                final NamedDataSchema metadataTypeSchema = extractSchema(metadata.getType());
                if (metadataTypeSchema != null) {
                    connectSchemaToResource(visitContext, metadataTypeSchema);
                }
            }
        }

        @Override
        public void visitAction(VisitContext visitContext, RecordTemplate parentResource, ResourceLevel resourceLevel, ActionSchema actionSchema) {
            final String returns = actionSchema.getReturns();
            if (returns != null) {
                // the returns field contains an inline schema, so we traverse into it
                if (isInlineSchema(returns)) {
                    visitInlineSchema(visitContext, returns);
                } else {
                    // otherwise the returns field contains the name of a data schema
                    final NamedDataSchema returnsSchema = extractSchema(returns);
                    if (returnsSchema != null) {
                        connectSchemaToResource(visitContext, returnsSchema);
                    }
                }
            }
            final StringArray throwsArray = actionSchema.getThrows();
            if (throwsArray != null) {
                for (String errorName : throwsArray) {
                    final NamedDataSchema errorSchema = extractSchema(errorName);
                    if (errorSchema != null) {
                        connectSchemaToResource(visitContext, errorSchema);
                    }
                }
            }
        }

        private boolean isInlineSchema(String schemaString) {
            return schemaString.startsWith("{");
        }

        private void visitInlineSchema(VisitContext visitContext, String schemaString) {
            DataSchema schema = DataTemplateUtil.parseSchema(schemaString, _schemaResolver);
            if (schema instanceof ArrayDataSchema) {
                DataSchema itemSchema = ((ArrayDataSchema) schema).getItems();
                if (itemSchema instanceof NamedDataSchema) {
                    connectSchemaToResource(visitContext, (NamedDataSchema) itemSchema);
                }
            }
            if (schema instanceof MapDataSchema) {
                DataSchema valueSchema = ((MapDataSchema) schema).getValues();
                if (valueSchema instanceof NamedDataSchema) {
                    connectSchemaToResource(visitContext, (NamedDataSchema) valueSchema);
                }
            }
        }

        private void connectSchemaToResource(VisitContext visitContext, final NamedDataSchema schema) {
            final Node<NamedDataSchema> schemaNode = _relationships.get(schema);
            _dataModels.put(schema.getFullName(), schema);
            final DataSchemaTraverse traveler = new DataSchemaTraverse();
            traveler.traverse(schema, new DataSchemaTraverse.Callback() {

                @Override
                public void callback(List<String> path, DataSchema nestedSchema) {
                    if (nestedSchema instanceof RecordDataSchema && nestedSchema != schema) {
                        final RecordDataSchema nestedRecordSchema = (RecordDataSchema) nestedSchema;
                        _dataModels.put(nestedRecordSchema.getFullName(), nestedRecordSchema);
                        final Node<RecordDataSchema> node = _relationships.get(nestedRecordSchema);
                        schemaNode.addAdjacentNode(node);
                    }
                }
            });
            final Node<ResourceSchema> resourceNode = _relationships.get(visitContext.getParentSchema());
            resourceNode.addAdjacentNode(schemaNode);
            schemaNode.addAdjacentNode(resourceNode);
        }
    };
    ResourceSchemaCollection.visitResources(_resourceSchemas.getResources().values(), visitor);
}
Also used : ResourceSchema(com.linkedin.restli.restspec.ResourceSchema) ResourceLevel(com.linkedin.restli.server.ResourceLevel) MapDataSchema(com.linkedin.data.schema.MapDataSchema) ParameterSchema(com.linkedin.restli.restspec.ParameterSchema) FinderSchema(com.linkedin.restli.restspec.FinderSchema) StringArray(com.linkedin.data.template.StringArray) IdentifierSchema(com.linkedin.restli.restspec.IdentifierSchema) RecordTemplate(com.linkedin.data.template.RecordTemplate) AssociationSchema(com.linkedin.restli.restspec.AssociationSchema) CollectionSchema(com.linkedin.restli.restspec.CollectionSchema) MetadataSchema(com.linkedin.restli.restspec.MetadataSchema) ActionSchema(com.linkedin.restli.restspec.ActionSchema) AssocKeySchema(com.linkedin.restli.restspec.AssocKeySchema) NamedDataSchema(com.linkedin.data.schema.NamedDataSchema) DataSchema(com.linkedin.data.schema.DataSchema) RecordDataSchema(com.linkedin.data.schema.RecordDataSchema) ArrayDataSchema(com.linkedin.data.schema.ArrayDataSchema) DataSchemaTraverse(com.linkedin.data.schema.DataSchemaTraverse)

Aggregations

DataSchemaTraverse (com.linkedin.data.schema.DataSchemaTraverse): 4
DataSchema (com.linkedin.data.schema.DataSchema): 3
RecordDataSchema (com.linkedin.data.schema.RecordDataSchema): 3
ByteString (com.linkedin.data.ByteString): 2
ArrayDataSchema (com.linkedin.data.schema.ArrayDataSchema): 2
MapDataSchema (com.linkedin.data.schema.MapDataSchema): 2
NamedDataSchema (com.linkedin.data.schema.NamedDataSchema): 2
DataMap (com.linkedin.data.DataMap): 1
MessageList (com.linkedin.data.message.MessageList): 1
DataSchemaResolver (com.linkedin.data.schema.DataSchemaResolver): 1
EnumDataSchema (com.linkedin.data.schema.EnumDataSchema): 1
FixedDataSchema (com.linkedin.data.schema.FixedDataSchema): 1
PegasusSchemaParser (com.linkedin.data.schema.PegasusSchemaParser): 1
SchemaParser (com.linkedin.data.schema.SchemaParser): 1
SchemaParserFactory (com.linkedin.data.schema.SchemaParserFactory): 1
TyperefDataSchema (com.linkedin.data.schema.TyperefDataSchema): 1
UnionDataSchema (com.linkedin.data.schema.UnionDataSchema): 1
DefaultDataSchemaResolver (com.linkedin.data.schema.resolver.DefaultDataSchemaResolver): 1
FileDataSchemaResolver (com.linkedin.data.schema.resolver.FileDataSchemaResolver): 1
ValidationOptions (com.linkedin.data.schema.validation.ValidationOptions): 1