Search in sources:

Example 1 with HiveTestUtil.fileSystem

Use of org.apache.hudi.hive.testutils.HiveTestUtil.fileSystem in project hudi by apache.

From class TestHiveSyncTool, method testNotPickingOlderParquetFileWhenLatestCommitReadFailsForExistingTable:

@ParameterizedTest
@MethodSource("syncMode")
public void testNotPickingOlderParquetFileWhenLatestCommitReadFailsForExistingTable(String syncMode) throws Exception {
    hiveSyncConfig.syncMode = syncMode;
    HiveTestUtil.hiveSyncConfig.batchSyncNum = 2;
    final String commitTime = "100";
    HiveTestUtil.createCOWTable(commitTime, 1, true);
    HoodieCommitMetadata commitMetadata = new HoodieCommitMetadata();
    // create empty commit
    final String emptyCommitTime = "200";
    HiveTestUtil.createCommitFileWithSchema(commitMetadata, emptyCommitTime, true);
    // HiveTestUtil.createCommitFile(commitMetadata, emptyCommitTime);
    HoodieHiveClient hiveClient = new HoodieHiveClient(HiveTestUtil.hiveSyncConfig, HiveTestUtil.getHiveConf(), HiveTestUtil.fileSystem);
    assertFalse(hiveClient.doesTableExist(HiveTestUtil.hiveSyncConfig.tableName), "Table " + HiveTestUtil.hiveSyncConfig.tableName + " should not exist initially");
    HiveSyncTool tool = new HiveSyncTool(HiveTestUtil.hiveSyncConfig, HiveTestUtil.getHiveConf(), HiveTestUtil.fileSystem);
    tool.syncHoodieTable();
    verifyOldParquetFileTest(hiveClient, emptyCommitTime);
    // evolve the schema
    ZonedDateTime dateTime = ZonedDateTime.now().plusDays(6);
    String commitTime2 = "301";
    HiveTestUtil.addCOWPartitions(1, false, true, dateTime, commitTime2);
    // HiveTestUtil.createCommitFileWithSchema(commitMetadata, "400", false); // create another empty commit
    // HiveTestUtil.createCommitFile(commitMetadata, "400"); // create another empty commit
    tool = new HiveSyncTool(HiveTestUtil.hiveSyncConfig, HiveTestUtil.getHiveConf(), HiveTestUtil.fileSystem);
    HoodieHiveClient hiveClientLatest = new HoodieHiveClient(HiveTestUtil.hiveSyncConfig, HiveTestUtil.getHiveConf(), HiveTestUtil.fileSystem);
    // now delete the evolved commit instant
    Path fullPath = new Path(HiveTestUtil.hiveSyncConfig.basePath + "/" + HoodieTableMetaClient.METAFOLDER_NAME + "/"
        + hiveClientLatest.getActiveTimeline().getInstants()
            .filter(inst -> inst.getTimestamp().equals(commitTime2))
            .findFirst().get().getFileName());
    assertTrue(HiveTestUtil.fileSystem.delete(fullPath, false));
    try {
        tool.syncHoodieTable();
    } catch (RuntimeException e) {
        // the sync is expected to fail because the evolved commit instant file was deleted
    }
    // old sync values should be left intact
    verifyOldParquetFileTest(hiveClient, emptyCommitTime);
}
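The helper verifyOldParquetFileTest is defined elsewhere in TestHiveSyncTool and is not part of this search result. As a hedged sketch of the kind of check such a helper performs after the failed sync, the snippet below asserts that the table is still registered and that the sync marker has not advanced past the older, valid instant. The helper name verifySyncStateUnchanged and the exact assertions are illustrative assumptions; doesTableExist appears in the example above, while getLastCommitTimeSynced is assumed to be available on HoodieHiveClient.

private static void verifySyncStateUnchanged(HoodieHiveClient hiveClient, String expectedCommitTime) {
    // Illustrative only: not the actual body of verifyOldParquetFileTest.
    // The table created by the first, successful sync must still be registered.
    assertTrue(hiveClient.doesTableExist(HiveTestUtil.hiveSyncConfig.tableName),
        "Table " + HiveTestUtil.hiveSyncConfig.tableName + " should still exist after the failed sync");
    // The sync marker must still point at the older commit, not the deleted instant.
    // getLastCommitTimeSynced returning Option<String> is an assumption about HoodieHiveClient.
    assertEquals(expectedCommitTime,
        hiveClient.getLastCommitTimeSynced(HiveTestUtil.hiveSyncConfig.tableName).get(),
        "Last synced commit should remain at the older instant");
}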

Example 2 with HiveTestUtil.fileSystem

Use of org.apache.hudi.hive.testutils.HiveTestUtil.fileSystem in project hudi by apache.

From class TestHiveSyncTool, method testNotPickingOlderParquetFileWhenLatestCommitReadFails:

@ParameterizedTest
@MethodSource("syncMode")
public void testNotPickingOlderParquetFileWhenLatestCommitReadFails(String syncMode) throws Exception {
    hiveSyncConfig.syncMode = syncMode;
    HiveTestUtil.hiveSyncConfig.batchSyncNum = 2;
    final String commitTime = "100";
    HiveTestUtil.createCOWTable(commitTime, 1, true);
    HoodieCommitMetadata commitMetadata = new HoodieCommitMetadata();
    // evolve the schema
    ZonedDateTime dateTime = ZonedDateTime.now().plusDays(6);
    String commitTime2 = "101";
    HiveTestUtil.addCOWPartitions(1, false, true, dateTime, commitTime2);
    // create empty commit
    final String emptyCommitTime = "200";
    HiveTestUtil.createCommitFile(commitMetadata, emptyCommitTime, hiveSyncConfig.basePath);
    HoodieHiveClient hiveClient = new HoodieHiveClient(HiveTestUtil.hiveSyncConfig, HiveTestUtil.getHiveConf(), HiveTestUtil.fileSystem);
    assertFalse(hiveClient.doesTableExist(HiveTestUtil.hiveSyncConfig.tableName), "Table " + HiveTestUtil.hiveSyncConfig.tableName + " should not exist initially");
    HiveSyncTool tool = new HiveSyncTool(HiveTestUtil.hiveSyncConfig, HiveTestUtil.getHiveConf(), HiveTestUtil.fileSystem);
    // now delete the evolved commit instant
    Path fullPath = new Path(HiveTestUtil.hiveSyncConfig.basePath + "/" + HoodieTableMetaClient.METAFOLDER_NAME + "/"
        + hiveClient.getActiveTimeline().getInstants()
            .filter(inst -> inst.getTimestamp().equals(commitTime2))
            .findFirst().get().getFileName());
    assertTrue(HiveTestUtil.fileSystem.delete(fullPath, false));
    try {
        tool.syncHoodieTable();
    } catch (RuntimeException e) {
        // the sync is expected to fail because the evolved commit instant file was deleted
    }
    // table should not be synced yet
    assertFalse(hiveClient.doesTableExist(HiveTestUtil.hiveSyncConfig.tableName), "Table " + HiveTestUtil.hiveSyncConfig.tableName + " should not exist at all");
}
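Because Assertions.assertThrows is already among this class's imports, the empty catch block above can be written as an explicit failure expectation instead. A minimal sketch, assuming the failure still surfaces as a RuntimeException:

// Equivalent, more explicit form of the "sync should fail" expectation using
// JUnit 5's assertThrows. Assumes the failure surfaces as a RuntimeException,
// matching the catch block in the original example.
assertThrows(RuntimeException.class, tool::syncHoodieTable,
    "Sync should fail when the latest commit instant file has been deleted");
// The table must still be absent after the failed sync attempt.
assertFalse(hiveClient.doesTableExist(HiveTestUtil.hiveSyncConfig.tableName),
    "Table " + HiveTestUtil.hiveSyncConfig.tableName + " should not exist at all");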

Aggregations

IOException (java.io.IOException) 2
URISyntaxException (java.net.URISyntaxException) 2
ZonedDateTime (java.time.ZonedDateTime) 2
ArrayList (java.util.ArrayList) 2
Arrays (java.util.Arrays) 2
HashMap (java.util.HashMap) 2
List (java.util.List) 2
Locale (java.util.Locale) 2
Map (java.util.Map) 2
Collectors (java.util.stream.Collectors) 2
Schema (org.apache.avro.Schema) 2
Field (org.apache.avro.Schema.Field) 2
Path (org.apache.hadoop.fs.Path) 2
FieldSchema (org.apache.hadoop.hive.metastore.api.FieldSchema) 2
MetaException (org.apache.hadoop.hive.metastore.api.MetaException) 2
Partition (org.apache.hadoop.hive.metastore.api.Partition) 2
Driver (org.apache.hadoop.hive.ql.Driver) 2
HiveException (org.apache.hadoop.hive.ql.metadata.HiveException) 2
SessionState (org.apache.hadoop.hive.ql.session.SessionState) 2
HoodieCommitMetadata (org.apache.hudi.common.model.HoodieCommitMetadata) 2