Example 6 with SparkCatalog

use of org.apache.iceberg.spark.SparkCatalog in project iceberg by apache.

the class TestRemoveOrphanFilesAction3 method testSparkCatalogTable.

@Test
public void testSparkCatalogTable() throws Exception {
    // register a Hadoop-backed Iceberg catalog named "mycat" in the Spark session
    spark.conf().set("spark.sql.catalog.mycat", "org.apache.iceberg.spark.SparkCatalog");
    spark.conf().set("spark.sql.catalog.mycat.type", "hadoop");
    spark.conf().set("spark.sql.catalog.mycat.warehouse", tableLocation);
    SparkCatalog cat = (SparkCatalog) spark.sessionState().catalogManager().catalog("mycat");
    String[] database = { "default" };
    Identifier id = Identifier.of(database, "table");
    Map<String, String> options = Maps.newHashMap();
    Transform[] transforms = {};
    cat.createTable(id, SparkSchemaUtil.convert(SCHEMA), transforms, options);
    SparkTable table = cat.loadTable(id);
    spark.sql("INSERT INTO mycat.default.table VALUES (1,1,1)");
    // drop an untracked file into the table's data directory so it becomes an orphan
    String location = table.table().location().replaceFirst("file:", "");
    new File(location + "/data/trashfile").createNewFile();
    // olderThan(now + 1s) makes the just-created file old enough to be collected
    DeleteOrphanFiles.Result results = SparkActions.get().deleteOrphanFiles(table.table()).olderThan(System.currentTimeMillis() + 1000).execute();
    Assert.assertTrue("trash file should be removed", StreamSupport.stream(results.orphanFileLocations().spliterator(), false).anyMatch(file -> file.contains("file:" + location + "/data/trashfile")));
}
Also used : SparkCatalog(org.apache.iceberg.spark.SparkCatalog) Maps(org.apache.iceberg.relocated.com.google.common.collect.Maps) Test(org.junit.Test) DeleteOrphanFiles(org.apache.iceberg.actions.DeleteOrphanFiles) SparkSchemaUtil(org.apache.iceberg.spark.SparkSchemaUtil) File(java.io.File) SparkSessionCatalog(org.apache.iceberg.spark.SparkSessionCatalog) Map(java.util.Map) After(org.junit.After) Transform(org.apache.spark.sql.connector.expressions.Transform) StreamSupport(java.util.stream.StreamSupport) Identifier(org.apache.spark.sql.connector.catalog.Identifier) Assert(org.junit.Assert) SparkTable(org.apache.iceberg.spark.source.SparkTable)

Example 7 with SparkCatalog

use of org.apache.iceberg.spark.SparkCatalog in project iceberg by apache.

the class TestPathIdentifier method before.

@Before
public void before() throws IOException {
    // create an isolated directory to serve as the table location
    tableLocation = temp.newFolder();
    identifier = new PathIdentifier(tableLocation.getAbsolutePath());
    // initialize a standalone SparkCatalog with no extra options, so defaults apply
    sparkCatalog = new SparkCatalog();
    sparkCatalog.initialize("test", new CaseInsensitiveStringMap(ImmutableMap.of()));
}
Also used : SparkCatalog(org.apache.iceberg.spark.SparkCatalog) PathIdentifier(org.apache.iceberg.spark.PathIdentifier) CaseInsensitiveStringMap(org.apache.spark.sql.util.CaseInsensitiveStringMap) Before(org.junit.Before)
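
The before() method above only wires things up. As a hedged sketch (not part of the snippet above), the initialized catalog and PathIdentifier could then be exercised roughly as follows, assuming SCHEMA is an Iceberg Schema defined on the test class and that ImmutableMap, Transform, SparkSchemaUtil, and Assert are imported:

@Test
public void createAndLoadByPath() throws Exception {
    // create a path-based table at tableLocation, then load it back through the same identifier
    SparkTable table = (SparkTable) sparkCatalog.createTable(identifier, SparkSchemaUtil.convert(SCHEMA), new Transform[0], ImmutableMap.of());
    SparkTable loaded = sparkCatalog.loadTable(identifier);
    // the table's location should point at the path carried by the PathIdentifier
    Assert.assertTrue(loaded.table().location().endsWith(tableLocation.getAbsolutePath()));
}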

Example 8 with SparkCatalog

use of org.apache.iceberg.spark.SparkCatalog in project iceberg by apache.

the class TestSparkCatalogCacheExpiration method testCacheEnabledAndExpirationDisabled.

@Test
public void testCacheEnabledAndExpirationDisabled() {
    SparkCatalog sparkCatalog = getSparkCatalog("expiration_disabled");
    // extracting(...) reads the named private fields of the catalog reflectively
    Assertions.assertThat(sparkCatalog).extracting("cacheEnabled").isEqualTo(true);
    Assertions.assertThat(sparkCatalog).extracting("icebergCatalog").isInstanceOfSatisfying(CachingCatalog.class, icebergCatalog -> {
        Assertions.assertThat(icebergCatalog).extracting("expirationIntervalMillis").isEqualTo(-1L);
    });
}
Also used : SparkCatalog(org.apache.iceberg.spark.SparkCatalog) Test(org.junit.Test)
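
getSparkCatalog and the catalog configuration live elsewhere in the test class. A plausible registration for the "expiration_disabled" catalog, using the real Iceberg options cache-enabled and cache.expiration-interval-ms (the catalog type and values here are assumptions, not taken from the test), would look like:

// cache stays enabled; an interval of -1 means cache entries never expire
spark.conf().set("spark.sql.catalog.expiration_disabled", "org.apache.iceberg.spark.SparkCatalog");
spark.conf().set("spark.sql.catalog.expiration_disabled.type", "hadoop");
spark.conf().set("spark.sql.catalog.expiration_disabled.cache-enabled", "true");
spark.conf().set("spark.sql.catalog.expiration_disabled.cache.expiration-interval-ms", "-1");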

Example 9 with SparkCatalog

use of org.apache.iceberg.spark.SparkCatalog in project iceberg by apache.

the class TestSparkCatalogCacheExpiration method testCacheDisabledImplicitly.

@Test
public void testCacheDisabledImplicitly() {
    SparkCatalog sparkCatalog = getSparkCatalog("cache_disabled_implicitly");
    Assertions.assertThat(sparkCatalog).extracting("cacheEnabled").isEqualTo(false);
    Assertions.assertThat(sparkCatalog).extracting("icebergCatalog").isInstanceOfSatisfying(Catalog.class, icebergCatalog -> Assertions.assertThat(icebergCatalog).isNotInstanceOf(CachingCatalog.class));
}
Also used : SparkCatalog(org.apache.iceberg.spark.SparkCatalog) CachingCatalog(org.apache.iceberg.CachingCatalog) Test(org.junit.Test)
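
For the "cache_disabled_implicitly" case, a likely configuration (again an assumption, not shown in the snippet) leaves cache-enabled at its default and sets the expiration interval to 0, which SparkCatalog treats as disabling caching, so the wrapped catalog is not a CachingCatalog:

// cache-enabled is not set; an expiration interval of 0 implicitly turns caching off
spark.conf().set("spark.sql.catalog.cache_disabled_implicitly", "org.apache.iceberg.spark.SparkCatalog");
spark.conf().set("spark.sql.catalog.cache_disabled_implicitly.type", "hadoop");
spark.conf().set("spark.sql.catalog.cache_disabled_implicitly.cache.expiration-interval-ms", "0");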

Example 10 with SparkCatalog

use of org.apache.iceberg.spark.SparkCatalog in project OpenLineage by OpenLineage.

the class IcebergHandlerTest method testGetVersionString.

@Test
public void testGetVersionString() throws NoSuchTableException {
    SparkCatalog sparkCatalog = mock(SparkCatalog.class);
    // deep stubs allow stubbing the chained call table().currentSnapshot().snapshotId()
    SparkTable sparkTable = mock(SparkTable.class, RETURNS_DEEP_STUBS);
    Identifier identifier = Identifier.of(new String[] { "database", "schema" }, "table");
    when(sparkCatalog.loadTable(identifier)).thenReturn(sparkTable);
    when(sparkTable.table().currentSnapshot().snapshotId()).thenReturn(1500100900L);
    Optional<String> version = icebergHandler.getDatasetVersion(sparkCatalog, identifier, Collections.emptyMap());
    assertTrue(version.isPresent());
    assertEquals("1500100900", version.get());
}
Also used : DatasetIdentifier(io.openlineage.spark.agent.util.DatasetIdentifier) Identifier(org.apache.spark.sql.connector.catalog.Identifier) SparkCatalog(org.apache.iceberg.spark.SparkCatalog) SparkTable(org.apache.iceberg.spark.source.SparkTable) Test(org.junit.jupiter.api.Test) ParameterizedTest(org.junit.jupiter.params.ParameterizedTest)
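
For orientation, a rough sketch of the lookup this test exercises follows; it is an illustration of the expected behavior, not the actual OpenLineage IcebergHandler code, and it assumes org.apache.iceberg.Snapshot is imported:

private Optional<String> currentSnapshotVersion(SparkCatalog catalog, Identifier id) throws NoSuchTableException {
    // load the Spark-facing table, then read the underlying Iceberg table's current snapshot id
    SparkTable table = catalog.loadTable(id);
    Snapshot snapshot = table.table().currentSnapshot();
    return snapshot == null ? Optional.empty() : Optional.of(String.valueOf(snapshot.snapshotId()));
}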

Aggregations

SparkCatalog (org.apache.iceberg.spark.SparkCatalog): 11 usages
SparkTable (org.apache.iceberg.spark.source.SparkTable): 6 usages
Identifier (org.apache.spark.sql.connector.catalog.Identifier): 5 usages
Test (org.junit.Test): 5 usages
DatasetIdentifier (io.openlineage.spark.agent.util.DatasetIdentifier): 4 usages
Map (java.util.Map): 4 usages
File (java.io.File): 3 usages
StreamSupport (java.util.stream.StreamSupport): 3 usages
DeleteOrphanFiles (org.apache.iceberg.actions.DeleteOrphanFiles): 3 usages
Maps (org.apache.iceberg.relocated.com.google.common.collect.Maps): 3 usages
SparkSchemaUtil (org.apache.iceberg.spark.SparkSchemaUtil): 3 usages
SparkSessionCatalog (org.apache.iceberg.spark.SparkSessionCatalog): 3 usages
Transform (org.apache.spark.sql.connector.expressions.Transform): 3 usages
After (org.junit.After): 3 usages
Assert (org.junit.Assert): 3 usages
ParameterizedTest (org.junit.jupiter.params.ParameterizedTest): 3 usages
HashMap (java.util.HashMap): 2 usages
SneakyThrows (lombok.SneakyThrows): 2 usages
NoSuchTableException (org.apache.spark.sql.catalyst.analysis.NoSuchTableException): 2 usages
Test (org.junit.jupiter.api.Test): 2 usages