Example 11 with FailsafeElementCoder

Use of com.google.cloud.teleport.v2.coders.FailsafeElementCoder in project DataflowTemplates by GoogleCloudPlatform.

From the class CsvConvertersTest, method testLineToFailsafeJsonNoHeadersUdfDeadletter.

/**
 * Tests that {@link CsvConverters.LineToFailsafeJson} converts a line to a {@link FailsafeElement}
 * correctly using a JavaScript UDF. UDF processing is handled by {@link
 * JavascriptTextTransformer}. The record should be output to the dead-letter tag.
 */
@Test
public void testLineToFailsafeJsonNoHeadersUdfDeadletter() {
    FailsafeElementCoder<String, String> coder = FAILSAFE_ELEMENT_CODER;
    CoderRegistry coderRegistry = pipeline.getCoderRegistry();
    coderRegistry.registerCoderForType(coder.getEncodedTypeDescriptor(), coder);
    PCollection<String> lines =
        pipeline.apply(Create.of(BAD_JSON_STRING_RECORD).withCoder(StringUtf8Coder.of()));
    PCollectionTuple linesTuple = PCollectionTuple.of(CSV_LINES, lines);
    CsvConverters.CsvPipelineOptions options =
        PipelineOptionsFactory.create().as(CsvConverters.CsvPipelineOptions.class);
    options.setDelimiter(",");
    options.setJavascriptTextTransformGcsPath(SCRIPT_PARSE_EXCEPTION_FILE_PATH);
    options.setJavascriptTextTransformFunctionName("transform");
    PCollectionTuple failsafe =
        linesTuple.apply(
            "TestLineToFailsafeJsonNoHeadersUdfBad",
            CsvConverters.LineToFailsafeJson.newBuilder()
                .setDelimiter(options.getDelimiter())
                .setUdfFileSystemPath(options.getJavascriptTextTransformGcsPath())
                .setUdfFunctionName(options.getJavascriptTextTransformFunctionName())
                .setJsonSchemaPath(options.getJsonSchemaPath())
                .setJsonSchemaPath(null)
                .setHeaderTag(CSV_HEADERS)
                .setLineTag(CSV_LINES)
                .setUdfOutputTag(PROCESSING_OUT)
                .setUdfDeadletterTag(PROCESSING_DEADLETTER_OUT)
                .build());
    PAssert.that(failsafe.get(PROCESSING_OUT)).empty();
    PAssert.that(failsafe.get(PROCESSING_DEADLETTER_OUT)).satisfies(collection -> {
        FailsafeElement result = collection.iterator().next();
        assertThat(result.getPayload(), is(equalTo(BAD_JSON_STRING_RECORD)));
        return null;
    });
    pipeline.run();
}
Also used: CoderRegistry (org.apache.beam.sdk.coders.CoderRegistry), PCollectionTuple (org.apache.beam.sdk.values.PCollectionTuple), FailsafeElement (com.google.cloud.teleport.v2.values.FailsafeElement), Test (org.junit.Test)
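
The test above checks that a record whose UDF fails ends up on the dead-letter output with its original payload intact, while the main output stays empty. The routing idea can be sketched in plain Java without Beam. This is only an illustration under stated assumptions: FailsafeSketch, Element, Result, and process are hypothetical names invented here, not part of DataflowTemplates, and the real FailsafeElement and LineToFailsafeJson may behave differently in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class FailsafeSketch {

    /** Hypothetical stand-in for a failsafe element: original input plus current payload. */
    static final class Element<O, P> {
        final O original;
        final P payload;
        final String errorMessage; // null when processing succeeded

        Element(O original, P payload, String errorMessage) {
            this.original = original;
            this.payload = payload;
            this.errorMessage = errorMessage;
        }
    }

    /** Hypothetical two-output result, mirroring a main tag and a dead-letter tag. */
    static final class Result<O, P> {
        final List<Element<O, P>> out = new ArrayList<>();
        final List<Element<O, O>> deadletter = new ArrayList<>();
    }

    /** Applies the transform to each input; failures are routed to the dead-letter list. */
    static <O, P> Result<O, P> process(List<O> inputs, Function<O, P> transform) {
        Result<O, P> result = new Result<>();
        for (O input : inputs) {
            try {
                result.out.add(new Element<>(input, transform.apply(input), null));
            } catch (RuntimeException e) {
                // The failed record keeps its original payload so it can be inspected or replayed.
                result.deadletter.add(new Element<>(input, input, e.getMessage()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A transform that always fails, standing in for a UDF script that cannot be parsed.
        Function<String, String> brokenUdf = line -> {
            throw new IllegalStateException("script parse error");
        };
        Result<String, String> result = process(Arrays.asList("bad,record"), brokenUdf);
        System.out.println("main output empty: " + result.out.isEmpty());
        System.out.println("dead-letter payload: " + result.deadletter.get(0).payload);
    }
}
```

Keeping the original input on the failed element is the design choice the assertion on getPayload() relies on: the dead-letter consumer sees exactly what went in, not a partially transformed value.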

Aggregations

CoderRegistry (org.apache.beam.sdk.coders.CoderRegistry): 10
PCollectionTuple (org.apache.beam.sdk.values.PCollectionTuple): 9
FailsafeElement (com.google.cloud.teleport.v2.values.FailsafeElement): 8
Test (org.junit.Test): 8
TableRow (com.google.api.services.bigquery.model.TableRow): 4
GCSToElasticsearchOptions (com.google.cloud.teleport.v2.elasticsearch.options.GCSToElasticsearchOptions): 3
KV (org.apache.beam.sdk.values.KV): 3
DateTime (org.joda.time.DateTime): 3
Instant (org.joda.time.Instant): 3
ArrayList (java.util.ArrayList): 2
Pipeline (org.apache.beam.sdk.Pipeline): 2
PubsubMessage (org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage): 2
FailsafeElementCoder (com.google.cloud.teleport.v2.coders.FailsafeElementCoder): 1
MessageToTableRow (com.google.cloud.teleport.v2.templates.KafkaToBigQuery.MessageToTableRow): 1
FailsafeJsonToTableRow (com.google.cloud.teleport.v2.transforms.BigQueryConverters.FailsafeJsonToTableRow): 1
FailedStringToTableRowFn (com.google.cloud.teleport.v2.transforms.ErrorConverters.FailedStringToTableRowFn): 1
FormatDatastreamJsonToJson (com.google.cloud.teleport.v2.transforms.FormatDatastreamJsonToJson): 1
FormatTransform (com.google.cloud.teleport.v2.transforms.FormatTransform): 1
PubSubToFailSafeElement (com.google.cloud.teleport.v2.transforms.PubSubToFailSafeElement): 1
InputUDFToTableRow (com.google.cloud.teleport.v2.transforms.UDFTextTransformer.InputUDFToTableRow): 1