
Example 1 with EmployeeRecordReader

Use of org.apache.beam.sdk.io.hadoop.inputformat.EmployeeInputFormat.EmployeeRecordReader in the apache/beam project.

From the class HadoopInputFormatIOTest, method testReadersStartWhenZeroRecords:

/**
   * This test validates the behavior of the
   * {@link HadoopInputFormatBoundedSource.HadoopInputFormatReader#start() start()} method when the
   * InputFormat's {@link InputFormat#getSplits(JobContext) getSplits()} method returns a list of
   * InputSplits with zero records.
   */
@Test
public void testReadersStartWhenZeroRecords() throws Exception {
    InputFormat<Text, Employee> mockInputFormat = Mockito.mock(EmployeeInputFormat.class);
    EmployeeRecordReader mockReader = Mockito.mock(EmployeeRecordReader.class);
    Mockito.when(mockInputFormat.createRecordReader(
            Mockito.any(InputSplit.class), Mockito.any(TaskAttemptContext.class)))
        .thenReturn(mockReader);
    Mockito.when(mockReader.nextKeyValue()).thenReturn(false);
    InputSplit mockInputSplit = Mockito.mock(NewObjectsEmployeeInputSplit.class);
    HadoopInputFormatBoundedSource<Text, Employee> boundedSource =
        new HadoopInputFormatBoundedSource<Text, Employee>(
            serConf,
            WritableCoder.of(Text.class),
            AvroCoder.of(Employee.class),
            // No key translation required.
            null,
            // No value translation required.
            null,
            new SerializableSplit(mockInputSplit));
    boundedSource.setInputFormatObj(mockInputFormat);
    BoundedReader<KV<Text, Employee>> reader = boundedSource.createReader(p.getOptions());
    assertEquals(false, reader.start());
    assertEquals(Double.valueOf(1), reader.getFractionConsumed());
    reader.close();
}
Also used: EmployeeRecordReader (org.apache.beam.sdk.io.hadoop.inputformat.EmployeeInputFormat.EmployeeRecordReader), InputFormat (org.apache.hadoop.mapreduce.InputFormat), HadoopInputFormatBoundedSource (org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO.HadoopInputFormatBoundedSource), SerializableSplit (org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO.SerializableSplit), TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext), Text (org.apache.hadoop.io.Text), KV (org.apache.beam.sdk.values.KV), InputSplit (org.apache.hadoop.mapreduce.InputSplit), NewObjectsEmployeeInputSplit (org.apache.beam.sdk.io.hadoop.inputformat.EmployeeInputFormat.NewObjectsEmployeeInputSplit), Test (org.junit.Test)
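
For readers without the Beam test tree at hand: EmployeeInputFormat, EmployeeRecordReader, and NewObjectsEmployeeInputSplit are test helpers inside the Beam repository, not public API. The mock-driven setup above can also be written as a hand-rolled stub; the sketch below is a hypothetical stand-in (the class name EmptyEmployeeRecordReader is ours, and Employee is assumed to be the POJO from the same Beam test package), intended only to make the contract explicit: nextKeyValue() returning false on the very first call is what forces start() to return false and the source to report a fraction consumed of 1.0.

import org.apache.beam.sdk.io.hadoop.inputformat.Employee; // Beam test POJO (assumed on the test classpath).
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

// Hypothetical hand-written equivalent of the mocked EmployeeRecordReader:
// a RecordReader over zero records.
public class EmptyEmployeeRecordReader extends RecordReader<Text, Employee> {

    @Override
    public void initialize(InputSplit split, TaskAttemptContext context) {
        // Nothing to set up: the split carries no data.
    }

    @Override
    public boolean nextKeyValue() {
        // Zero records: the first call already reports exhaustion. This is what
        // Mockito.when(mockReader.nextKeyValue()).thenReturn(false) simulates, and
        // it is why reader.start() returns false in the test.
        return false;
    }

    @Override
    public Text getCurrentKey() {
        return null; // Never reached: nextKeyValue() never returns true.
    }

    @Override
    public Employee getCurrentValue() {
        return null; // Never reached: nextKeyValue() never returns true.
    }

    @Override
    public float getProgress() {
        return 1.0f; // An empty split is trivially fully consumed.
    }

    @Override
    public void close() {
        // No resources to release.
    }
}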

Example 2 with EmployeeRecordReader

Use of org.apache.beam.sdk.io.hadoop.inputformat.EmployeeInputFormat.EmployeeRecordReader in the apache/beam project.

From the class HadoopInputFormatIOTest, method testGetFractionConsumedForBadProgressValue:

/**
   * This test validates the {@code getFractionConsumed()} method when a bad progress value is
   * returned by the InputFormat's RecordReader.
   */
@Test
public void testGetFractionConsumedForBadProgressValue() throws Exception {
    InputFormat<Text, Employee> mockInputFormat = Mockito.mock(EmployeeInputFormat.class);
    EmployeeRecordReader mockReader = Mockito.mock(EmployeeRecordReader.class);
    Mockito.when(mockInputFormat.createRecordReader(
            Mockito.any(InputSplit.class), Mockito.any(TaskAttemptContext.class)))
        .thenReturn(mockReader);
    Mockito.when(mockReader.nextKeyValue()).thenReturn(true);
    // Set to a bad value, outside the range 0 to 1.
    Mockito.when(mockReader.getProgress()).thenReturn(2.0F);
    InputSplit mockInputSplit = Mockito.mock(NewObjectsEmployeeInputSplit.class);
    HadoopInputFormatBoundedSource<Text, Employee> boundedSource =
        new HadoopInputFormatBoundedSource<Text, Employee>(
            serConf,
            WritableCoder.of(Text.class),
            AvroCoder.of(Employee.class),
            // No key translation required.
            null,
            // No value translation required.
            null,
            new SerializableSplit(mockInputSplit));
    boundedSource.setInputFormatObj(mockInputFormat);
    BoundedReader<KV<Text, Employee>> reader = boundedSource.createReader(p.getOptions());
    assertEquals(Double.valueOf(0), reader.getFractionConsumed());
    boolean start = reader.start();
    assertEquals(true, start);
    if (start) {
        boolean advance = reader.advance();
        assertEquals(null, reader.getFractionConsumed());
        assertEquals(true, advance);
        if (advance) {
            advance = reader.advance();
            assertEquals(null, reader.getFractionConsumed());
        }
    }
    // Validate that getFractionConsumed() returns null after a few reads, since getProgress()
    // returns the invalid value 2.0, which is outside the range 0 to 1.
    assertEquals(null, reader.getFractionConsumed());
    reader.close();
}
Also used: EmployeeRecordReader (org.apache.beam.sdk.io.hadoop.inputformat.EmployeeInputFormat.EmployeeRecordReader), HadoopInputFormatBoundedSource (org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO.HadoopInputFormatBoundedSource), SerializableSplit (org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO.SerializableSplit), Text (org.apache.hadoop.io.Text), TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext), KV (org.apache.beam.sdk.values.KV), InputSplit (org.apache.hadoop.mapreduce.InputSplit), NewObjectsEmployeeInputSplit (org.apache.beam.sdk.io.hadoop.inputformat.EmployeeInputFormat.NewObjectsEmployeeInputSplit), Test (org.junit.Test)
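
The assertion sequence maps onto a range check inside the reader's getFractionConsumed(): Hadoop documents RecordReader.getProgress() as returning a value between 0.0 and 1.0, and the Beam reader reports null rather than propagating an out-of-range value. Below is a minimal sketch of that guard under a hypothetical helper name (checkedFraction is our own; the real logic lives inside HadoopInputFormatBoundedSource.HadoopInputFormatReader), not the Beam implementation itself.

import java.io.IOException;
import org.apache.hadoop.mapreduce.RecordReader;

public class ProgressGuard {

    // Hypothetical helper mirroring the check the test exercises: an out-of-range
    // progress value (here the mocked 2.0F) yields null, meaning "fraction unknown".
    public static Double checkedFraction(RecordReader<?, ?> reader) {
        float progress;
        try {
            progress = reader.getProgress();
        } catch (IOException e) {
            return null; // Progress unavailable; treat as unknown.
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Restore the interrupt flag.
            return null;
        }
        if (progress < 0.0f || progress > 1.0f) {
            return null; // Outside the documented [0, 1] range: refuse to report it.
        }
        return (double) progress;
    }
}

Read against that guard, the test is straightforward: before start() the source reports 0.0, and after any read the mocked getProgress() of 2.0F trips the range check, so every subsequent getFractionConsumed() call returns null.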

Aggregations

EmployeeRecordReader (org.apache.beam.sdk.io.hadoop.inputformat.EmployeeInputFormat.EmployeeRecordReader): 2 uses
NewObjectsEmployeeInputSplit (org.apache.beam.sdk.io.hadoop.inputformat.EmployeeInputFormat.NewObjectsEmployeeInputSplit): 2 uses
HadoopInputFormatBoundedSource (org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO.HadoopInputFormatBoundedSource): 2 uses
SerializableSplit (org.apache.beam.sdk.io.hadoop.inputformat.HadoopInputFormatIO.SerializableSplit): 2 uses
KV (org.apache.beam.sdk.values.KV): 2 uses
Text (org.apache.hadoop.io.Text): 2 uses
InputSplit (org.apache.hadoop.mapreduce.InputSplit): 2 uses
TaskAttemptContext (org.apache.hadoop.mapreduce.TaskAttemptContext): 2 uses
Test (org.junit.Test): 2 uses
InputFormat (org.apache.hadoop.mapreduce.InputFormat): 1 use