Example 1 with SchemaCoder

Use of org.apache.beam.sdk.schemas.SchemaCoder in project beam by apache.

From the class ParDo, method getDoFnSchemaInformation.

/**
 * Extract information on how the DoFn uses schemas. In particular, if the schema of an element
 * parameter does not match the input PCollection's schema, convert.
 */
@Internal
public static DoFnSchemaInformation getDoFnSchemaInformation(DoFn<?, ?> fn, PCollection<?> input) {
    DoFnSignature signature = DoFnSignatures.getSignature(fn.getClass());
    DoFnSignature.ProcessElementMethod processElementMethod = signature.processElement();
    if (!processElementMethod.getSchemaElementParameters().isEmpty()) {
        if (!input.hasSchema()) {
            throw new IllegalArgumentException("Type of @Element must match the DoFn type: " + input);
        }
    }
    SchemaRegistry schemaRegistry = input.getPipeline().getSchemaRegistry();
    DoFnSchemaInformation doFnSchemaInformation = DoFnSchemaInformation.create();
    for (SchemaElementParameter parameter : processElementMethod.getSchemaElementParameters()) {
        TypeDescriptor<?> elementT = parameter.elementT();
        FieldAccessDescriptor accessDescriptor = getFieldAccessDescriptorFromParameter(parameter.fieldAccessString(), input.getSchema(), signature.fieldAccessDeclarations(), fn);
        doFnSchemaInformation = doFnSchemaInformation.withFieldAccessDescriptor(accessDescriptor);
        Schema selectedSchema = SelectHelpers.getOutputSchema(input.getSchema(), accessDescriptor);
        ConvertHelpers.ConvertedSchemaInformation converted = ConvertHelpers.getConvertedSchemaInformation(selectedSchema, elementT, schemaRegistry);
        if (converted.outputSchemaCoder != null) {
            doFnSchemaInformation = doFnSchemaInformation.withSelectFromSchemaParameter((SchemaCoder<?>) input.getCoder(), accessDescriptor, selectedSchema, converted.outputSchemaCoder, converted.unboxedType != null);
        } else {
            // If the selected schema is a Row containing a single primitive type (which is the output
            // of Select when selecting a primitive), attempt to unbox it and match against the
            // parameter.
            checkArgument(converted.unboxedType != null);
            doFnSchemaInformation = doFnSchemaInformation.withUnboxPrimitiveParameter((SchemaCoder<?>) input.getCoder(), accessDescriptor, selectedSchema, elementT);
        }
    }
    for (DoFnSignature.Parameter p : processElementMethod.extraParameters()) {
        if (p instanceof ProcessContextParameter || p instanceof ElementParameter) {
            doFnSchemaInformation = doFnSchemaInformation.withFieldAccessDescriptor(FieldAccessDescriptor.withAllFields());
            break;
        }
    }
    return doFnSchemaInformation;
}
Also used: FieldAccessDescriptor (org.apache.beam.sdk.schemas.FieldAccessDescriptor), ConvertHelpers (org.apache.beam.sdk.schemas.utils.ConvertHelpers), SchemaCoder (org.apache.beam.sdk.schemas.SchemaCoder), Schema (org.apache.beam.sdk.schemas.Schema), ProcessContextParameter (org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.ProcessContextParameter), ElementParameter (org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.ElementParameter), SchemaElementParameter (org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.SchemaElementParameter), SchemaRegistry (org.apache.beam.sdk.schemas.SchemaRegistry), DoFnSignature (org.apache.beam.sdk.transforms.reflect.DoFnSignature), Internal (org.apache.beam.sdk.annotations.Internal)
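
The select-then-convert-or-unbox branch in the loop above can be illustrated without Beam. The following is a minimal sketch under stated assumptions: a LinkedHashMap stands in for a Row, and the class SelectOrUnboxSketch with its methods select and toParameter are hypothetical names invented for this illustration. They are not part of the Beam API, and the real method works with SchemaCoder and FieldAccessDescriptor instead.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a Map stands in for a Beam Row.
public class SelectOrUnboxSketch {

    /** Keeps only the named fields, in the requested order (stand-in for a schema select). */
    static Map<String, Object> select(Map<String, Object> row, List<String> fields) {
        Map<String, Object> selected = new LinkedHashMap<>();
        for (String field : fields) {
            if (!row.containsKey(field)) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            selected.put(field, row.get(field));
        }
        return selected;
    }

    /**
     * Returns the selected row itself when the parameter type accepts a row (a Map here).
     * Otherwise the selection must contain exactly one field, whose value is unboxed
     * and cast to the parameter type.
     */
    static <T> T toParameter(Map<String, Object> selected, Class<T> parameterType) {
        if (parameterType.isInstance(selected)) {
            return parameterType.cast(selected);
        }
        if (selected.size() != 1) {
            throw new IllegalArgumentException(
                "Cannot unbox a row with " + selected.size() + " fields");
        }
        Object value = selected.values().iterator().next();
        return parameterType.cast(value); // throws ClassCastException on a type mismatch
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("userId", "u-17");
        row.put("amount", 42L);

        // Selecting a single field and asking for a String triggers the unboxing path.
        String userId = toParameter(select(row, Arrays.asList("userId")), String.class);
        System.out.println(userId);
    }
}
```

The sketch mirrors only the control flow: a row-typed parameter receives the selected row, while a primitive-typed parameter receives the single selected value.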

Example 2 with SchemaCoder

Use of org.apache.beam.sdk.schemas.SchemaCoder in project beam by apache.

From the class ParDo, method schemasForStateSpecTypes.

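/**
 * Builds one SchemaCoder per type argument of the declared state type. Returns an empty array
 * when the state type is not parameterized, and throws NoSuchSchemaException when the registry
 * has no schema for one of the type arguments.
 */
private static SchemaCoder[] schemasForStateSpecTypes(DoFnSignature.StateDeclaration stateDeclaration, SchemaRegistry schemaRegistry) throws NoSuchSchemaException {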
    Type stateType = stateDeclaration.stateType().getType();
    if (!(stateType instanceof ParameterizedType)) {
        // No type arguments means no coders to infer.
        return new SchemaCoder[0];
    }
    Type[] typeArguments = ((ParameterizedType) stateType).getActualTypeArguments();
    SchemaCoder[] coders = new SchemaCoder[typeArguments.length];
    for (int i = 0; i < typeArguments.length; i++) {
        Type typeArgument = typeArguments[i];
        TypeDescriptor typeDescriptor = TypeDescriptor.of(typeArgument);
        coders[i] = SchemaCoder.of(schemaRegistry.getSchema(typeDescriptor), typeDescriptor, schemaRegistry.getToRowFunction(typeDescriptor), schemaRegistry.getFromRowFunction(typeDescriptor));
    }
    return coders;
}
Also used: ParameterizedType (java.lang.reflect.ParameterizedType), Type (java.lang.reflect.Type), TypeDescriptor (org.apache.beam.sdk.values.TypeDescriptor), SchemaCoder (org.apache.beam.sdk.schemas.SchemaCoder)
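
The reflection step the method relies on, reading the actual type arguments of a ParameterizedType, can be tried with the JDK alone. The sketch below is an illustration, not Beam code: the class TypeArgumentsSketch, its nested Holder class, and the helper methods are hypothetical names chosen for this example. Only java.lang.reflect calls are used.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of extracting generic type arguments via reflection.
public class TypeArgumentsSketch {

    /** Example declarations whose generic field types are inspected below. */
    static class Holder {
        Map<String, Integer> counts;
        List<String> names;
        String plain;
    }

    /** Returns the actual type arguments, or an empty array for a non-parameterized type. */
    static Type[] typeArgumentsOf(Type type) {
        if (!(type instanceof ParameterizedType)) {
            // No type arguments, so there is nothing to infer.
            return new Type[0];
        }
        return ((ParameterizedType) type).getActualTypeArguments();
    }

    /** Looks up a declared field by name and returns the type arguments of its generic type. */
    static Type[] typeArgumentsOfField(Class<?> owner, String fieldName) {
        try {
            return typeArgumentsOf(owner.getDeclaredField(fieldName).getGenericType());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No field named " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"counts", "names", "plain"}) {
            Type[] typeArguments = typeArgumentsOfField(Holder.class, name);
            System.out.println(name + " -> " + Arrays.toString(typeArguments));
        }
    }
}
```

In the Beam method, each extracted Type is then wrapped in a TypeDescriptor and passed to the SchemaRegistry; the sketch stops at the extraction step.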

Aggregations

SchemaCoder (org.apache.beam.sdk.schemas.SchemaCoder): 2
ParameterizedType (java.lang.reflect.ParameterizedType): 1
Type (java.lang.reflect.Type): 1
Internal (org.apache.beam.sdk.annotations.Internal): 1
FieldAccessDescriptor (org.apache.beam.sdk.schemas.FieldAccessDescriptor): 1
Schema (org.apache.beam.sdk.schemas.Schema): 1
SchemaRegistry (org.apache.beam.sdk.schemas.SchemaRegistry): 1
ConvertHelpers (org.apache.beam.sdk.schemas.utils.ConvertHelpers): 1
DoFnSignature (org.apache.beam.sdk.transforms.reflect.DoFnSignature): 1
ElementParameter (org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.ElementParameter): 1
ProcessContextParameter (org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.ProcessContextParameter): 1
SchemaElementParameter (org.apache.beam.sdk.transforms.reflect.DoFnSignature.Parameter.SchemaElementParameter): 1
TypeDescriptor (org.apache.beam.sdk.values.TypeDescriptor): 1