Example 1 with HivePartitionFileSet

Use of org.apache.gobblin.data.management.copy.hive.HivePartitionFileSet in project incubator-gobblin by apache.

From the class RegistrationTimeSkipPredicateTest, method test.

@Test
public void test() throws Exception {
    Path partition1Path = new Path("/path/to/partition1");
    long modTime = 100000;
    CopyContext copyContext = new CopyContext();
    CopyConfiguration copyConfiguration = Mockito.mock(CopyConfiguration.class);
    Mockito.doReturn(copyContext).when(copyConfiguration).getCopyContext();
    HiveDataset dataset = Mockito.mock(HiveDataset.class);
    FileSystem fs = Mockito.spy(FileSystem.getLocal(new Configuration()));
    FileStatus status = new FileStatus(1, false, 1, 1, modTime, partition1Path);
    Path qualifiedPath = fs.makeQualified(partition1Path);
    Mockito.doReturn(status).when(fs).getFileStatus(qualifiedPath);
    Mockito.doReturn(status).when(fs).getFileStatus(partition1Path);
    Mockito.doReturn(fs).when(dataset).getFs();
    HiveCopyEntityHelper helper = Mockito.mock(HiveCopyEntityHelper.class);
    Mockito.doReturn(copyConfiguration).when(helper).getConfiguration();
    Mockito.doReturn(dataset).when(helper).getDataset();
    RegistrationTimeSkipPredicate predicate = new RegistrationTimeSkipPredicate(helper);
    // Target partition exists, but its registration time is before the source modification time => don't skip
    HivePartitionFileSet pc = createPartitionCopy(partition1Path, modTime - 1, true);
    Assert.assertFalse(predicate.apply(pc));
    // Target partition exists, registration time equals the modification time => don't skip
    pc = createPartitionCopy(partition1Path, modTime, true);
    Assert.assertFalse(predicate.apply(pc));
    // Target partition exists, registration time is later than the modification time => skip
    pc = createPartitionCopy(partition1Path, modTime + 1, true);
    Assert.assertTrue(predicate.apply(pc));
    // Target partition doesn't exist => don't skip
    pc = createPartitionCopy(partition1Path, modTime + 1, false);
    Assert.assertFalse(predicate.apply(pc));
    // Target partition exists but carries no registration-time parameter => don't skip
    pc = createPartitionCopy(partition1Path, modTime + 1, true);
    pc.getExistingTargetPartition().get().getParameters().clear();
    Assert.assertFalse(predicate.apply(pc));
}
Also used : Path(org.apache.hadoop.fs.Path) FileStatus(org.apache.hadoop.fs.FileStatus) CopyConfiguration(org.apache.gobblin.data.management.copy.CopyConfiguration) Configuration(org.apache.hadoop.conf.Configuration) FileSystem(org.apache.hadoop.fs.FileSystem) HiveDataset(org.apache.gobblin.data.management.copy.hive.HiveDataset) CopyContext(org.apache.gobblin.data.management.copy.CopyContext) HivePartitionFileSet(org.apache.gobblin.data.management.copy.hive.HivePartitionFileSet) HiveCopyEntityHelper(org.apache.gobblin.data.management.copy.hive.HiveCopyEntityHelper) Test(org.testng.annotations.Test)
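
The five assertions in the test above amount to a small decision rule: skip only when the target partition exists, carries a registration time, and that time is strictly later than the source modification time. The following is a minimal sketch of that rule as inferred from the test alone; it is not Gobblin's implementation of RegistrationTimeSkipPredicate. The class name SkipRuleSketch, the helper params, and the key string are hypothetical, and java.util.Optional stands in for the Optional type used in the original code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class SkipRuleSketch {
    // Hypothetical key; the real value of HiveDataset.REGISTRATION_GENERATION_TIME_MILLIS is not shown in the source.
    static final String REGISTRATION_TIME_KEY = "registrationGenerationTimeMillis";

    // Returns true only if the target exists, is annotated, and was registered after the source was last modified.
    static boolean shouldSkip(Optional<Map<String, String>> targetParameters, long sourceModTime) {
        if (!targetParameters.isPresent()) {
            return false; // no target partition => must copy
        }
        String raw = targetParameters.get().get(REGISTRATION_TIME_KEY);
        if (raw == null) {
            return false; // target not annotated => must copy
        }
        return Long.parseLong(raw) > sourceModTime; // strictly later, so equality does not skip
    }

    // Builds a parameter map annotated with the given registration time.
    static Map<String, String> params(long registrationTime) {
        Map<String, String> parameters = new HashMap<>();
        parameters.put(REGISTRATION_TIME_KEY, Long.toString(registrationTime));
        return parameters;
    }

    public static void main(String[] args) {
        long modTime = 100000L;
        Map<String, String> unannotated = new HashMap<>();
        Optional<Map<String, String>> absent = Optional.empty();

        System.out.println(shouldSkip(Optional.of(params(modTime - 1)), modTime)); // false
        System.out.println(shouldSkip(Optional.of(params(modTime)), modTime));     // false
        System.out.println(shouldSkip(Optional.of(params(modTime + 1)), modTime)); // true
        System.out.println(shouldSkip(absent, modTime));                           // false
        System.out.println(shouldSkip(Optional.of(unannotated), modTime));         // false
    }
}
```

The strict comparison is the notable choice here: a registration time equal to the modification time is treated as possibly stale, which matches the second assertion in the test.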

Example 2 with HivePartitionFileSet

Use of org.apache.gobblin.data.management.copy.hive.HivePartitionFileSet in project incubator-gobblin by apache.

From the class RegistrationTimeSkipPredicateTest, method createPartitionCopy.

public HivePartitionFileSet createPartitionCopy(Path location, long registrationGenerationTime, boolean targetPartitionExists) {
    HivePartitionFileSet partitionCopy = Mockito.mock(HivePartitionFileSet.class);
    Partition partition = Mockito.mock(Partition.class);
    Mockito.doReturn(location).when(partition).getDataLocation();
    Mockito.doReturn(partition).when(partitionCopy).getPartition();
    if (targetPartitionExists) {
        Partition targetPartition = Mockito.mock(Partition.class);
        Map<String, String> parameters = Maps.newHashMap();
        parameters.put(HiveDataset.REGISTRATION_GENERATION_TIME_MILLIS, Long.toString(registrationGenerationTime));
        Mockito.doReturn(parameters).when(targetPartition).getParameters();
        Mockito.doReturn(Optional.of(targetPartition)).when(partitionCopy).getExistingTargetPartition();
    } else {
        Mockito.doReturn(Optional.absent()).when(partitionCopy).getExistingTargetPartition();
    }
    return partitionCopy;
}
Also used : Partition(org.apache.hadoop.hive.ql.metadata.Partition) HivePartitionFileSet(org.apache.gobblin.data.management.copy.hive.HivePartitionFileSet)

Aggregations

HivePartitionFileSet (org.apache.gobblin.data.management.copy.hive.HivePartitionFileSet): 2
CopyConfiguration (org.apache.gobblin.data.management.copy.CopyConfiguration): 1
CopyContext (org.apache.gobblin.data.management.copy.CopyContext): 1
HiveCopyEntityHelper (org.apache.gobblin.data.management.copy.hive.HiveCopyEntityHelper): 1
HiveDataset (org.apache.gobblin.data.management.copy.hive.HiveDataset): 1
Configuration (org.apache.hadoop.conf.Configuration): 1
FileStatus (org.apache.hadoop.fs.FileStatus): 1
FileSystem (org.apache.hadoop.fs.FileSystem): 1
Path (org.apache.hadoop.fs.Path): 1
Partition (org.apache.hadoop.hive.ql.metadata.Partition): 1
Test (org.testng.annotations.Test): 1