Search in sources :

Example 26 with MoveWork

use of org.apache.hadoop.hive.ql.plan.MoveWork in project hive by apache.

the class TestGenMapRedUtilsCreateConditionalTask method testMergePathValidMoveWorkReturnsNewMoveWork.

@Test
public void testMergePathValidMoveWorkReturnsNewMoveWork() {
    final Path condInputPath = new Path("s3a://bucket/scratch/-ext-10000");
    final Path condOutputPath = new Path("s3a://bucket/scratch/-ext-10002");
    final Path targetMoveWorkPath = new Path("s3a://bucket/scratch/-ext-10003");
    final MoveWork mockWork = mock(MoveWork.class);
    final LineageState lineageState = new LineageState();
    MoveWork newWork;
    // test using loadFileWork
    when(mockWork.getLoadFileWork()).thenReturn(new LoadFileDesc(condOutputPath, targetMoveWorkPath, false, "", "", false));
    newWork = GenMapRedUtils.mergeMovePaths(condInputPath, mockWork, lineageState);
    assertNotNull(newWork);
    assertNotEquals(newWork, mockWork);
    assertEquals(condInputPath, newWork.getLoadFileWork().getSourcePath());
    assertEquals(targetMoveWorkPath, newWork.getLoadFileWork().getTargetDir());
    // test using loadTableWork
    TableDesc tableDesc = new TableDesc();
    reset(mockWork);
    when(mockWork.getLoadTableWork()).thenReturn(new LoadTableDesc(condOutputPath, tableDesc, null));
    newWork = GenMapRedUtils.mergeMovePaths(condInputPath, mockWork, lineageState);
    assertNotNull(newWork);
    assertNotEquals(newWork, mockWork);
    assertEquals(condInputPath, newWork.getLoadTableWork().getSourcePath());
    assertTrue(newWork.getLoadTableWork().getTable().equals(tableDesc));
}
Also used : Path(org.apache.hadoop.fs.Path) MoveWork(org.apache.hadoop.hive.ql.plan.MoveWork) LoadTableDesc(org.apache.hadoop.hive.ql.plan.LoadTableDesc) LoadFileDesc(org.apache.hadoop.hive.ql.plan.LoadFileDesc) LineageState(org.apache.hadoop.hive.ql.session.LineageState) TableDesc(org.apache.hadoop.hive.ql.plan.TableDesc) LoadTableDesc(org.apache.hadoop.hive.ql.plan.LoadTableDesc) Test(org.junit.Test)

Example 27 with MoveWork

use of org.apache.hadoop.hive.ql.plan.MoveWork in project hive by apache.

the class TestGenMapRedUtilsCreateConditionalTask method createMoveTask.

private Task<MoveWork> createMoveTask(Path source, Path destination) {
    Task<MoveWork> moveTask = mock(MoveTask.class);
    MoveWork moveWork = new MoveWork();
    moveWork.setLoadFileWork(new LoadFileDesc(source, destination, true, null, null, false));
    when(moveTask.getWork()).thenReturn(moveWork);
    return moveTask;
}
Also used : MoveWork(org.apache.hadoop.hive.ql.plan.MoveWork) LoadFileDesc(org.apache.hadoop.hive.ql.plan.LoadFileDesc)

Example 28 with MoveWork

use of org.apache.hadoop.hive.ql.plan.MoveWork in project hive by apache.

the class TestGenMapRedUtilsCreateConditionalTask method testMergePathWithInvalidMoveWorkThrowsException.

@Test(expected = IllegalArgumentException.class)
public void testMergePathWithInvalidMoveWorkThrowsException() {
    final Path condInputPath = new Path("s3a://bucket/scratch/-ext-10000");
    final MoveWork mockWork = mock(MoveWork.class);
    final LineageState lineageState = new LineageState();
    when(mockWork.getLoadMultiFilesWork()).thenReturn(new LoadMultiFilesDesc());
    GenMapRedUtils.mergeMovePaths(condInputPath, mockWork, lineageState);
}
Also used : Path(org.apache.hadoop.fs.Path) MoveWork(org.apache.hadoop.hive.ql.plan.MoveWork) LineageState(org.apache.hadoop.hive.ql.session.LineageState) LoadMultiFilesDesc(org.apache.hadoop.hive.ql.plan.LoadMultiFilesDesc) Test(org.junit.Test)

Example 29 with MoveWork

use of org.apache.hadoop.hive.ql.plan.MoveWork in project hive by apache.

the class TestGenMapRedUtilsCreateConditionalTask method testConditionalMoveOnHdfsIsNotOptimized.

@Test
public void testConditionalMoveOnHdfsIsNotOptimized() throws SemanticException {
    hiveConf.set(HiveConf.ConfVars.HIVE_BLOBSTORE_OPTIMIZATIONS_ENABLED.varname, "true");
    Path sinkDirName = new Path("hdfs://bucket/scratch/-ext-10002");
    FileSinkOperator fileSinkOperator = createFileSinkOperator(sinkDirName);
    Path finalDirName = new Path("hdfs://bucket/scratch/-ext-10000");
    Path tableLocation = new Path("hdfs://bucket/warehouse/table");
    Task<MoveWork> moveTask = createMoveTask(finalDirName, tableLocation);
    List<Task<MoveWork>> moveTaskList = Collections.singletonList(moveTask);
    GenMapRedUtils.createMRWorkForMergingFiles(fileSinkOperator, finalDirName, null, moveTaskList, hiveConf, dummyMRTask, new LineageState());
    ConditionalTask conditionalTask = (ConditionalTask) dummyMRTask.getChildTasks().get(0);
    Task<? extends Serializable> moveOnlyTask = conditionalTask.getListTasks().get(0);
    Task<? extends Serializable> mergeOnlyTask = conditionalTask.getListTasks().get(1);
    Task<? extends Serializable> mergeAndMoveTask = conditionalTask.getListTasks().get(2);
    // Verify moveOnlyTask is NOT optimized
    assertEquals(1, moveOnlyTask.getChildTasks().size());
    verifyMoveTask(moveOnlyTask, sinkDirName, finalDirName);
    verifyMoveTask(moveOnlyTask.getChildTasks().get(0), finalDirName, tableLocation);
    // Verify mergeOnlyTask is NOT optimized
    assertEquals(1, mergeOnlyTask.getChildTasks().size());
    verifyMoveTask(mergeOnlyTask.getChildTasks().get(0), finalDirName, tableLocation);
    // Verify mergeAndMoveTask is NOT optimized
    assertEquals(1, mergeAndMoveTask.getChildTasks().size());
    assertEquals(1, mergeAndMoveTask.getChildTasks().get(0).getChildTasks().size());
    verifyMoveTask(mergeAndMoveTask.getChildTasks().get(0), sinkDirName, finalDirName);
    verifyMoveTask(mergeAndMoveTask.getChildTasks().get(0).getChildTasks().get(0), finalDirName, tableLocation);
}
Also used : Path(org.apache.hadoop.fs.Path) MoveWork(org.apache.hadoop.hive.ql.plan.MoveWork) ConditionalTask(org.apache.hadoop.hive.ql.exec.ConditionalTask) Task(org.apache.hadoop.hive.ql.exec.Task) MapRedTask(org.apache.hadoop.hive.ql.exec.mr.MapRedTask) MoveTask(org.apache.hadoop.hive.ql.exec.MoveTask) FileSinkOperator(org.apache.hadoop.hive.ql.exec.FileSinkOperator) ConditionalTask(org.apache.hadoop.hive.ql.exec.ConditionalTask) LineageState(org.apache.hadoop.hive.ql.session.LineageState) Test(org.junit.Test)

Aggregations

MoveWork (org.apache.hadoop.hive.ql.plan.MoveWork)29 Path (org.apache.hadoop.fs.Path)21 LoadTableDesc (org.apache.hadoop.hive.ql.plan.LoadTableDesc)16 LoadFileDesc (org.apache.hadoop.hive.ql.plan.LoadFileDesc)9 Test (org.junit.Test)7 ConditionalTask (org.apache.hadoop.hive.ql.exec.ConditionalTask)6 HiveException (org.apache.hadoop.hive.ql.metadata.HiveException)6 TableDesc (org.apache.hadoop.hive.ql.plan.TableDesc)6 Task (org.apache.hadoop.hive.ql.exec.Task)5 DDLWork (org.apache.hadoop.hive.ql.plan.DDLWork)5 Serializable (java.io.Serializable)4 ArrayList (java.util.ArrayList)4 Context (org.apache.hadoop.hive.ql.Context)4 FileSinkOperator (org.apache.hadoop.hive.ql.exec.FileSinkOperator)4 MoveTask (org.apache.hadoop.hive.ql.exec.MoveTask)4 MapRedTask (org.apache.hadoop.hive.ql.exec.mr.MapRedTask)4 Partition (org.apache.hadoop.hive.ql.metadata.Partition)4 BasicStatsWork (org.apache.hadoop.hive.ql.plan.BasicStatsWork)4 StatsWork (org.apache.hadoop.hive.ql.plan.StatsWork)4 URISyntaxException (java.net.URISyntaxException)3