Example 1 with AvroOutputFormat

Use of org.apache.flink.api.java.io.AvroOutputFormat in the Apache Flink project.

From class HDFSTest, method testAvroOut.

@Test
public void testAvroOut() {
    String type = "one";
    AvroOutputFormat<String> avroOut = new AvroOutputFormat<String>(String.class);
    // Target directory on HDFS; converted to a Flink Path for the output format.
    org.apache.hadoop.fs.Path result = new org.apache.hadoop.fs.Path(hdfsURI + "/avroTest");
    avroOut.setOutputFilePath(new Path(result.toString()));
    avroOut.setWriteMode(FileSystem.WriteMode.NO_OVERWRITE);
    // Always write into a directory, one file per parallel task.
    avroOut.setOutputDirectoryMode(FileOutputFormat.OutputDirectoryMode.ALWAYS);
    try {
        // Simulate two parallel tasks (indices 0 and 1 of 2), each writing one record.
        avroOut.open(0, 2);
        avroOut.writeRecord(type);
        avroOut.close();
        avroOut.open(1, 2);
        avroOut.writeRecord(type);
        avroOut.close();
        assertTrue("No result file present", hdfs.exists(result));
        // Expect exactly one file per task, named 1.avro and 2.avro.
        FileStatus[] files = hdfs.listStatus(result);
        Assert.assertEquals(2, files.length);
        for (FileStatus file : files) {
            assertTrue("1.avro".equals(file.getPath().getName()) || "2.avro".equals(file.getPath().getName()));
        }
    } catch (IOException e) {
        e.printStackTrace();
        Assert.fail(e.getMessage());
    }
}
Also used: Path (org.apache.flink.core.fs.Path), FileStatus (org.apache.hadoop.fs.FileStatus), AvroOutputFormat (org.apache.flink.api.java.io.AvroOutputFormat), IOException (java.io.IOException), Test (org.junit.Test)
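
The file-name assertions in Example 1 rely on each parallel task writing a file named after its 1-based task index with an .avro suffix. The sketch below only illustrates that convention as inferred from the test (open(0, 2) and open(1, 2) yielding 1.avro and 2.avro). The class AvroFileNames and its methods are hypothetical helpers, not part of the Flink API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: mirrors the file naming the test expects.
public class AvroFileNames {

    // Assumption inferred from the test: task index 0 maps to "1.avro", index 1 to "2.avro".
    static String expectedFileName(int taskNumber) {
        return (taskNumber + 1) + ".avro";
    }

    // Collects the expected file names for the given number of parallel tasks.
    static List<String> expectedFileNames(int numTasks) {
        List<String> names = new ArrayList<>();
        for (int task = 0; task < numTasks; task++) {
            names.add(expectedFileName(task));
        }
        return names;
    }

    public static void main(String[] args) {
        // Two tasks, as in the test above.
        System.out.println(expectedFileNames(2)); // prints [1.avro, 2.avro]
    }
}
```

A helper like this could replace the hard-coded string comparison in the test loop if the number of tasks were ever changed.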

Example 2 with AvroOutputFormat

Use of org.apache.flink.api.java.io.AvroOutputFormat in the Apache Flink project.

From class AvroOutputFormatITCase, method testProgram.

@Override
protected void testProgram() throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    DataSet<Tuple3<String, Integer, String>> input = env.readCsvFile(inputPath).fieldDelimiter("|").types(String.class, Integer.class, String.class);
    // Output the data with AvroOutputFormat for the specific (generated) user type
    DataSet<User> specificUser = input.map(new ConvertToUser());
    AvroOutputFormat<User> avroOutputFormat = new AvroOutputFormat<User>(User.class);
    // FLINK-4771: use a codec
    avroOutputFormat.setCodec(AvroOutputFormat.Codec.SNAPPY);
    // FLINK-3304: ensure the output format properly serializes the schema
    avroOutputFormat.setSchema(User.SCHEMA$);
    specificUser.write(avroOutputFormat, outputPath1);
    // Output the data with AvroOutputFormat for the reflective user type
    DataSet<ReflectiveUser> reflectiveUser = specificUser.map(new ConvertToReflective());
    reflectiveUser.write(new AvroOutputFormat<ReflectiveUser>(ReflectiveUser.class), outputPath2);
    env.execute();
}
Also used: ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment), User (org.apache.flink.api.io.avro.example.User), Tuple3 (org.apache.flink.api.java.tuple.Tuple3), AvroOutputFormat (org.apache.flink.api.java.io.AvroOutputFormat)

Aggregations

AvroOutputFormat (org.apache.flink.api.java.io.AvroOutputFormat): 2
IOException (java.io.IOException): 1
User (org.apache.flink.api.io.avro.example.User): 1
ExecutionEnvironment (org.apache.flink.api.java.ExecutionEnvironment): 1
Tuple3 (org.apache.flink.api.java.tuple.Tuple3): 1
Path (org.apache.flink.core.fs.Path): 1
FileStatus (org.apache.hadoop.fs.FileStatus): 1
Test (org.junit.Test): 1