Example 6 with HdfsConfigurationInitializer

Use of io.trino.plugin.hive.HdfsConfigurationInitializer in project trino by trinodb, in the class TestHiveProjectionPushdownIntoTableScan, method createLocalQueryRunner.

@Override
protected LocalQueryRunner createLocalQueryRunner() {
    baseDir = Files.createTempDir();
    HdfsConfig config = new HdfsConfig();
    HdfsConfiguration configuration = new HiveHdfsConfiguration(new HdfsConfigurationInitializer(config), ImmutableSet.of());
    HdfsEnvironment environment = new HdfsEnvironment(configuration, config, new NoHdfsAuthentication());
    HiveMetastore metastore = new FileHiveMetastore(new NodeVersion("test_version"), environment, new MetastoreConfig(), new FileHiveMetastoreConfig().setCatalogDirectory(baseDir.toURI().toString()).setMetastoreUser("test"));
    Database database = Database.builder().setDatabaseName(SCHEMA_NAME).setOwnerName(Optional.of("public")).setOwnerType(Optional.of(PrincipalType.ROLE)).build();
    metastore.createDatabase(database);
    LocalQueryRunner queryRunner = LocalQueryRunner.create(HIVE_SESSION);
    queryRunner.createCatalog(HIVE_CATALOG_NAME, new TestingHiveConnectorFactory(metastore), ImmutableMap.of());
    return queryRunner;
}
Also used : HdfsConfigurationInitializer(io.trino.plugin.hive.HdfsConfigurationInitializer) HiveHdfsConfiguration(io.trino.plugin.hive.HiveHdfsConfiguration) MetastoreConfig(io.trino.plugin.hive.metastore.MetastoreConfig) FileHiveMetastoreConfig(io.trino.plugin.hive.metastore.file.FileHiveMetastoreConfig) FileHiveMetastore(io.trino.plugin.hive.metastore.file.FileHiveMetastore) HiveMetastore(io.trino.plugin.hive.metastore.HiveMetastore) HdfsConfig(io.trino.plugin.hive.HdfsConfig) HdfsConfiguration(io.trino.plugin.hive.HdfsConfiguration) NoHdfsAuthentication(io.trino.plugin.hive.authentication.NoHdfsAuthentication) LocalQueryRunner(io.trino.testing.LocalQueryRunner) HdfsEnvironment(io.trino.plugin.hive.HdfsEnvironment) NodeVersion(io.trino.plugin.hive.NodeVersion) TestingHiveConnectorFactory(io.trino.plugin.hive.TestingHiveConnectorFactory) Database(io.trino.plugin.hive.metastore.Database)

Example 7 with HdfsConfigurationInitializer

Use of io.trino.plugin.hive.HdfsConfigurationInitializer in project trino by trinodb, in the class TestRubixCaching, method getNonCachingFileSystem.

private FileSystem getNonCachingFileSystem() throws IOException {
    HdfsConfigurationInitializer configurationInitializer = new HdfsConfigurationInitializer(config);
    HiveHdfsConfiguration configuration = new HiveHdfsConfiguration(configurationInitializer, ImmutableSet.of());
    HdfsEnvironment environment = new HdfsEnvironment(configuration, config, new NoHdfsAuthentication());
    return environment.getFileSystem(context, cacheStoragePath);
}
Also used : HdfsConfigurationInitializer(io.trino.plugin.hive.HdfsConfigurationInitializer) HiveHdfsConfiguration(io.trino.plugin.hive.HiveHdfsConfiguration) NoHdfsAuthentication(io.trino.plugin.hive.authentication.NoHdfsAuthentication) HdfsEnvironment(io.trino.plugin.hive.HdfsEnvironment)

Example 8 with HdfsConfigurationInitializer

Use of io.trino.plugin.hive.HdfsConfigurationInitializer in project trino by trinodb, in the class TestRubixCaching, method testCoordinatorNotJoining.

@Test
public void testCoordinatorNotJoining() {
    RubixConfig rubixConfig = new RubixConfig().setCacheLocation("/tmp/not/existing/dir");
    HdfsConfigurationInitializer configurationInitializer = new HdfsConfigurationInitializer(config, ImmutableSet.of());
    InternalNode workerNode = new InternalNode("worker", URI.create("http://127.0.0.2:8080"), UNKNOWN, false);
    RubixInitializer rubixInitializer = new RubixInitializer(retry().maxAttempts(1), rubixConfig.setStartServerOnCoordinator(true), new TestingNodeManager(ImmutableList.of(workerNode)), new CatalogName("catalog"), configurationInitializer, new DefaultRubixHdfsInitializer(new HdfsAuthenticationConfig()));
    assertThatThrownBy(rubixInitializer::initializeRubix).hasMessage("No coordinator node available");
}
Also used : DefaultRubixHdfsInitializer(io.trino.plugin.hive.rubix.RubixModule.DefaultRubixHdfsInitializer) HdfsConfigurationInitializer(io.trino.plugin.hive.HdfsConfigurationInitializer) TestingNodeManager(io.trino.testing.TestingNodeManager) HdfsAuthenticationConfig(io.trino.plugin.hive.authentication.HdfsAuthenticationConfig) CatalogName(io.trino.plugin.base.CatalogName) InternalNode(io.trino.metadata.InternalNode) Test(org.testng.annotations.Test)
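
The example above passes an empty set as the second constructor argument, while another TestRubixCaching method (initializeRubix) passes a lambda that adjusts the configuration. The sketch below illustrates that general pattern in plain Java, as a base configuration followed by optional extra callbacks. It is not Trino's implementation: InitializerSketch, ConfigInitializer, the Map-based configuration and the keys "base.setting" and "fetch.interval" are hypothetical stand-ins chosen so the code runs without Trino or Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

class InitializerSketch {
    // Hypothetical stand-in for a configuration initializer that accepts extra callbacks.
    static final class ConfigInitializer {
        private final Set<Consumer<Map<String, String>>> extraInitializers;

        ConfigInitializer(Set<Consumer<Map<String, String>>> extraInitializers) {
            this.extraInitializers = Set.copyOf(extraInitializers);
        }

        // Builds the base configuration, then lets every extra callback adjust it.
        Map<String, String> initialize() {
            Map<String, String> config = new LinkedHashMap<>();
            config.put("base.setting", "default"); // hypothetical default value
            for (Consumer<Map<String, String>> extra : extraInitializers) {
                extra.accept(config);
            }
            return config;
        }
    }

    public static void main(String[] args) {
        // Hypothetical callback, playing the role of the lambda passed in initializeRubix.
        Consumer<Map<String, String>> zeroInterval = config -> config.put("fetch.interval", "0");

        // Empty set: only the base configuration is produced.
        System.out.println(new ConfigInitializer(Set.of()).initialize());
        // prints {base.setting=default}

        // One callback: the base configuration plus the extra key.
        System.out.println(new ConfigInitializer(Set.of(zeroInterval)).initialize());
        // prints {base.setting=default, fetch.interval=0}
    }
}
```

Passing the callbacks as a set keeps test-specific tweaks out of the initializer itself, which is presumably why the tests can reuse one initializer class with different arguments.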

Example 9 with HdfsConfigurationInitializer

Use of io.trino.plugin.hive.HdfsConfigurationInitializer in project trino by trinodb, in the class TestRubixCaching, method initializeRubix.

private void initializeRubix(RubixConfig rubixConfig, List<Node> nodes) throws Exception {
    tempDirectory = createTempDirectory(getClass().getSimpleName());
    // create cache directories
    List<java.nio.file.Path> cacheDirectories = ImmutableList.of(tempDirectory.resolve("cache1"), tempDirectory.resolve("cache2"));
    for (java.nio.file.Path directory : cacheDirectories) {
        createDirectories(directory);
    }
    // initialize rubix in master-only mode
    rubixConfig.setStartServerOnCoordinator(true);
    rubixConfig.setCacheLocation(Joiner.on(",").join(cacheDirectories.stream().map(java.nio.file.Path::toString).collect(toImmutableList())));
    HdfsConfigurationInitializer configurationInitializer = new HdfsConfigurationInitializer(config, ImmutableSet.of(
            // fetch data immediately in async mode
            config -> setRemoteFetchProcessInterval(config, 0)));
    TestingNodeManager nodeManager = new TestingNodeManager(nodes);
    rubixInitializer = new RubixInitializer(rubixConfig, nodeManager, new CatalogName("catalog"), configurationInitializer, new DefaultRubixHdfsInitializer(new HdfsAuthenticationConfig()));
    rubixConfigInitializer = new RubixConfigurationInitializer(rubixInitializer);
    rubixInitializer.initializeRubix();
    retry().run("wait for rubix to startup", () -> {
        if (!rubixInitializer.isServerUp()) {
            throw new IllegalStateException("Rubix server has not started");
        }
        return null;
    });
}
Also used : Path(org.apache.hadoop.fs.Path) Arrays(java.util.Arrays) Assertions.assertInstanceOf(io.airlift.testing.Assertions.assertInstanceOf) BlockLocation(org.apache.hadoop.fs.BlockLocation) FileSystem(org.apache.hadoop.fs.FileSystem) MoreFiles.deleteRecursively(com.google.common.io.MoreFiles.deleteRecursively) Assertions.assertGreaterThan(io.airlift.testing.Assertions.assertGreaterThan) Test(org.testng.annotations.Test) Random(java.util.Random) ReadMode(io.trino.plugin.hive.rubix.RubixConfig.ReadMode) FileStatus(org.apache.hadoop.fs.FileStatus) AfterMethod(org.testng.annotations.AfterMethod) Duration(io.airlift.units.Duration) NoHdfsAuthentication(io.trino.plugin.hive.authentication.NoHdfsAuthentication) ASYNC(io.trino.plugin.hive.rubix.RubixConfig.ReadMode.ASYNC) Future(java.util.concurrent.Future) InetAddress.getLocalHost(java.net.InetAddress.getLocalHost) Files.createTempDirectory(java.nio.file.Files.createTempDirectory) HiveHdfsConfiguration(io.trino.plugin.hive.HiveHdfsConfiguration) URI(java.net.URI) FSDataInputStream(org.apache.hadoop.fs.FSDataInputStream) ImmutableSet(com.google.common.collect.ImmutableSet) HdfsEnvironment(io.trino.plugin.hive.HdfsEnvironment) TestingNodeManager(io.trino.testing.TestingNodeManager) Collections.nCopies(java.util.Collections.nCopies) BeforeClass(org.testng.annotations.BeforeClass) ImmutableList.toImmutableList(com.google.common.collect.ImmutableList.toImmutableList) BeforeMethod(org.testng.annotations.BeforeMethod) PropertyMetadata(io.trino.spi.session.PropertyMetadata) ObjectName(javax.management.ObjectName) Files.createDirectories(java.nio.file.Files.createDirectories) String.format(java.lang.String.format) DataSize(io.airlift.units.DataSize) HdfsContext(io.trino.plugin.hive.HdfsEnvironment.HdfsContext) List(java.util.List) CachingPrestoAzureBlobFileSystem(com.qubole.rubix.prestosql.CachingPrestoAzureBlobFileSystem) HdfsAuthenticationConfig(io.trino.plugin.hive.authentication.HdfsAuthenticationConfig) OrcReaderConfig(io.trino.plugin.hive.orc.OrcReaderConfig) UNKNOWN(io.trino.client.NodeVersion.UNKNOWN) HdfsConfig(io.trino.plugin.hive.HdfsConfig) ByteStreams(com.google.common.io.ByteStreams) HdfsConfigurationInitializer(io.trino.plugin.hive.HdfsConfigurationInitializer) Joiner(com.google.common.base.Joiner) DataProvider(org.testng.annotations.DataProvider) READ_THROUGH(io.trino.plugin.hive.rubix.RubixConfig.ReadMode.READ_THROUGH) CachingPrestoDistributedFileSystem(com.qubole.rubix.prestosql.CachingPrestoDistributedFileSystem) MEGABYTE(io.airlift.units.DataSize.Unit.MEGABYTE) Assert.assertEquals(org.testng.Assert.assertEquals) Callable(java.util.concurrent.Callable) CachingPrestoGoogleHadoopFileSystem(com.qubole.rubix.prestosql.CachingPrestoGoogleHadoopFileSystem) CachingPrestoSecureAzureBlobFileSystem(com.qubole.rubix.prestosql.CachingPrestoSecureAzureBlobFileSystem) FSDataOutputStream(org.apache.hadoop.fs.FSDataOutputStream) ALLOW_INSECURE(com.google.common.io.RecursiveDeleteOption.ALLOW_INSECURE) ImmutableList(com.google.common.collect.ImmutableList) FilterFileSystem(org.apache.hadoop.fs.FilterFileSystem) Assertions.assertThatThrownBy(org.assertj.core.api.Assertions.assertThatThrownBy) Closer(com.google.common.io.Closer) CachingPrestoAdlFileSystem(com.qubole.rubix.prestosql.CachingPrestoAdlFileSystem) MBeanServer(javax.management.MBeanServer) ManagementFactory(java.lang.management.ManagementFactory) ExecutorService(java.util.concurrent.ExecutorService) Node(io.trino.spi.Node) AfterClass(org.testng.annotations.AfterClass) RetryDriver.retry(io.trino.plugin.hive.util.RetryDriver.retry) CachingFileSystem(com.qubole.rubix.core.CachingFileSystem) UTF_8(java.nio.charset.StandardCharsets.UTF_8) DefaultRubixHdfsInitializer(io.trino.plugin.hive.rubix.RubixModule.DefaultRubixHdfsInitializer) IOException(java.io.IOException) HiveTestUtils.getHiveSessionProperties(io.trino.plugin.hive.HiveTestUtils.getHiveSessionProperties) CatalogName(io.trino.plugin.base.CatalogName) Executors.newFixedThreadPool(java.util.concurrent.Executors.newFixedThreadPool) TestingConnectorSession(io.trino.testing.TestingConnectorSession) InternalNode(io.trino.metadata.InternalNode) Assert.assertEventually(io.trino.testing.assertions.Assert.assertEventually) CacheConfig.setRemoteFetchProcessInterval(com.qubole.rubix.spi.CacheConfig.setRemoteFetchProcessInterval) Assert.assertTrue(org.testng.Assert.assertTrue) HiveConfig(io.trino.plugin.hive.HiveConfig) SECONDS(java.util.concurrent.TimeUnit.SECONDS)
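
The initializeRubix method above ends by retrying until the Rubix server reports that it is up, throwing IllegalStateException while it is not. A minimal sketch of that wait-for-startup idea in plain Java follows. It does not use Trino's RetryDriver, and StartupPoller, waitUntilReady, the attempt limit and the sleep interval are all hypothetical names and values chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class StartupPoller {
    // Hypothetical helper: polls the readiness check up to maxAttempts times,
    // sleeping between attempts, and returns the attempt number that succeeded.
    static int waitUntilReady(BooleanSupplier isReady, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (isReady.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                }
                catch (InterruptedException e) {
                    // Restore the interrupt flag and give up waiting.
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for startup", e);
                }
            }
        }
        throw new IllegalStateException("Server has not started after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Simulated server that reports "up" on the third check.
        AtomicInteger checks = new AtomicInteger();
        int attempts = waitUntilReady(() -> checks.incrementAndGet() >= 3, 5, 10);
        System.out.println("ready after " + attempts + " attempts");
        // prints "ready after 3 attempts"
    }
}
```

Bounding the number of attempts keeps a test from hanging indefinitely when the server never comes up; the original code delegates that policy to the retry() helper it imports.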

Example 10 with HdfsConfigurationInitializer

Use of io.trino.plugin.hive.HdfsConfigurationInitializer in project trino by trinodb, in the class TestIcebergSplitSource, method createQueryRunner.

@Override
protected QueryRunner createQueryRunner() throws Exception {
    HdfsConfig config = new HdfsConfig();
    HdfsConfiguration configuration = new HiveHdfsConfiguration(new HdfsConfigurationInitializer(config), ImmutableSet.of());
    HdfsEnvironment hdfsEnvironment = new HdfsEnvironment(configuration, config, new NoHdfsAuthentication());
    File tempDir = Files.createTempDirectory("test_iceberg_split_source").toFile();
    this.metastoreDir = new File(tempDir, "iceberg_data");
    HiveMetastore metastore = createTestingFileHiveMetastore(metastoreDir);
    IcebergTableOperationsProvider operationsProvider = new FileMetastoreTableOperationsProvider(new HdfsFileIoProvider(hdfsEnvironment));
    this.catalog = new TrinoHiveCatalog(new CatalogName("hive"), memoizeMetastore(metastore, 1000), hdfsEnvironment, new TestingTypeManager(), operationsProvider, "test", false, false, false);
    return createIcebergQueryRunner(ImmutableMap.of(), ImmutableMap.of(), ImmutableList.of(NATION), Optional.of(metastoreDir));
}
Also used : HdfsConfigurationInitializer(io.trino.plugin.hive.HdfsConfigurationInitializer) HiveHdfsConfiguration(io.trino.plugin.hive.HiveHdfsConfiguration) HiveMetastore(io.trino.plugin.hive.metastore.HiveMetastore) FileHiveMetastore.createTestingFileHiveMetastore(io.trino.plugin.hive.metastore.file.FileHiveMetastore.createTestingFileHiveMetastore) HdfsConfig(io.trino.plugin.hive.HdfsConfig) HdfsConfiguration(io.trino.plugin.hive.HdfsConfiguration) NoHdfsAuthentication(io.trino.plugin.hive.authentication.NoHdfsAuthentication) HdfsEnvironment(io.trino.plugin.hive.HdfsEnvironment) FileMetastoreTableOperationsProvider(io.trino.plugin.iceberg.catalog.file.FileMetastoreTableOperationsProvider) TrinoHiveCatalog(io.trino.plugin.iceberg.catalog.hms.TrinoHiveCatalog) CatalogName(io.trino.plugin.base.CatalogName) IcebergTableOperationsProvider(io.trino.plugin.iceberg.catalog.IcebergTableOperationsProvider) File(java.io.File) TestingTypeManager(io.trino.spi.type.TestingTypeManager)

Aggregations

HdfsConfigurationInitializer (io.trino.plugin.hive.HdfsConfigurationInitializer) 25
HdfsEnvironment (io.trino.plugin.hive.HdfsEnvironment) 24
HiveHdfsConfiguration (io.trino.plugin.hive.HiveHdfsConfiguration) 24
NoHdfsAuthentication (io.trino.plugin.hive.authentication.NoHdfsAuthentication) 24
HdfsConfig (io.trino.plugin.hive.HdfsConfig) 23
HdfsConfiguration (io.trino.plugin.hive.HdfsConfiguration) 19
NodeVersion (io.trino.plugin.hive.NodeVersion) 11
MetastoreConfig (io.trino.plugin.hive.metastore.MetastoreConfig) 11
FileHiveMetastore (io.trino.plugin.hive.metastore.file.FileHiveMetastore) 10
FileHiveMetastoreConfig (io.trino.plugin.hive.metastore.file.FileHiveMetastoreConfig) 10
File (java.io.File) 8
HiveMetastore (io.trino.plugin.hive.metastore.HiveMetastore) 7
CatalogName (io.trino.plugin.base.CatalogName) 6
Test (org.testng.annotations.Test) 6
CheckpointSchemaManager (io.trino.plugin.deltalake.transactionlog.checkpoint.CheckpointSchemaManager) 5
Path (org.apache.hadoop.fs.Path) 5
FileFormatDataSourceStats (io.trino.plugin.hive.FileFormatDataSourceStats) 4
HdfsContext (io.trino.plugin.hive.HdfsEnvironment.HdfsContext) 4
ParquetReaderConfig (io.trino.plugin.hive.parquet.ParquetReaderConfig) 4
DistributedQueryRunner (io.trino.testing.DistributedQueryRunner) 4