Example 16 with ContentKey

use of org.projectnessie.model.ContentKey in project iceberg by apache.

the class NessieCatalog method table.

private IcebergTable table(TableIdentifier tableIdentifier) {
    try {
        ContentKey key = NessieUtil.toKey(tableIdentifier);
        Content table = api.getContent().key(key).reference(reference.getReference()).get().get(key);
        return table != null ? table.unwrap(IcebergTable.class).orElse(null) : null;
    } catch (NessieNotFoundException e) {
        return null;
    }
}
Also used : ContentKey(org.projectnessie.model.ContentKey) Content(org.projectnessie.model.Content) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException)

Example 17 with ContentKey

use of org.projectnessie.model.ContentKey in project nessie by projectnessie.

the class AbstractCompatibilityTests method commit.

@Test
void commit() throws Exception {
    Branch defaultBranch = api.getDefaultBranch();
    Branch branch = Branch.of("commitToBranch", defaultBranch.getHash());
    Reference created = api.createReference().sourceRefName(defaultBranch.getName()).reference(branch).create();
    assertThat(created).isEqualTo(branch);
    ContentKey key = ContentKey.of("my", "tables", "table_name");
    IcebergTable content = IcebergTable.of("metadata-location", 42L, 43, 44, 45, "content-id");
    String commitMessage = "hello world";
    Put operation = Put.of(key, content);
    Branch branchNew = api.commitMultipleOperations().commitMeta(CommitMeta.fromMessage(commitMessage)).operation(operation).branch(branch).commit();
    assertThat(branchNew).isNotEqualTo(branch).extracting(Branch::getName).isEqualTo(branch.getName());
    LogResponse commitLog = api.getCommitLog().refName(branch.getName()).get();
    assertThat(commitLog.getLogEntries()).hasSize(1).map(LogEntry::getCommitMeta).map(CommitMeta::getMessage).containsExactly(commitMessage);
}
Also used : ContentKey(org.projectnessie.model.ContentKey) LogResponse(org.projectnessie.model.LogResponse) Branch(org.projectnessie.model.Branch) Reference(org.projectnessie.model.Reference) IcebergTable(org.projectnessie.model.IcebergTable) Put(org.projectnessie.model.Operation.Put) LogEntry(org.projectnessie.model.LogResponse.LogEntry) Test(org.junit.jupiter.api.Test)

Example 18 with ContentKey

use of org.projectnessie.model.ContentKey in project nessie by projectnessie.

the class IdentifyContentsPerExecutor method walkLiveCommitsInReference.

private Map<String, ContentBloomFilter> walkLiveCommitsInReference(GCStateParamsPerTask gcStateParamsPerTask) {
    Map<String, ContentBloomFilter> bloomFilterMap = new HashMap<>();
    Set<ContentKey> liveContentKeys = new HashSet<>();
    try (Stream<LogResponse.LogEntry> commits = StreamingUtil.getCommitLogStream(gcStateParamsPerTask.getApi(), builder -> builder.hashOnRef(gcStateParamsPerTask.getReference().getHash()).refName(Detached.REF_NAME).fetch(FetchOption.ALL), OptionalInt.empty())) {
        MutableBoolean foundAllLiveCommitHeadsBeforeCutoffTime = new MutableBoolean(false);
        // commit handler for the spliterator
        Consumer<LogResponse.LogEntry> commitHandler = logEntry -> handleLiveCommit(gcStateParamsPerTask, logEntry, bloomFilterMap, foundAllLiveCommitHeadsBeforeCutoffTime, liveContentKeys);
        // traverse commits using the spliterator
        GCUtil.traverseLiveCommits(foundAllLiveCommitHeadsBeforeCutoffTime, commits, commitHandler);
    } catch (NessieNotFoundException e) {
        throw new RuntimeException(e);
    }
    return bloomFilterMap;
}
Also used : Operation(org.projectnessie.model.Operation) Detached(org.projectnessie.model.Detached) LogResponse(org.projectnessie.model.LogResponse) Predicate(java.util.function.Predicate) Set(java.util.Set) HashMap(java.util.HashMap) Instant(java.time.Instant) OptionalInt(java.util.OptionalInt) Reference(org.projectnessie.model.Reference) Serializable(java.io.Serializable) NessieApiV1(org.projectnessie.client.api.NessieApiV1) HashSet(java.util.HashSet) Consumer(java.util.function.Consumer) FetchOption(org.projectnessie.api.params.FetchOption) Stream(java.util.stream.Stream) StreamingUtil(org.projectnessie.client.StreamingUtil) Map(java.util.Map) Content(org.projectnessie.model.Content) ContentKey(org.projectnessie.model.ContentKey) Function(org.apache.spark.api.java.function.Function) CommitMeta(org.projectnessie.model.CommitMeta) MutableBoolean(org.apache.commons.lang3.mutable.MutableBoolean) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException) SparkSession(org.apache.spark.sql.SparkSession)

Example 19 with ContentKey

use of org.projectnessie.model.ContentKey in project nessie by projectnessie.

the class IdentifyContentsPerExecutor method handleLiveCommit.

private void handleLiveCommit(GCStateParamsPerTask gcStateParamsPerTask, LogResponse.LogEntry logEntry, Map<String, ContentBloomFilter> bloomFilterMap, MutableBoolean foundAllLiveCommitHeadsBeforeCutoffTime, Set<ContentKey> liveContentKeys) {
    if (logEntry.getOperations() != null) {
        boolean isExpired = !gcStateParamsPerTask.getLiveCommitPredicate().test(logEntry.getCommitMeta());
        if (isExpired && liveContentKeys.isEmpty()) {
            // This is the first expired commit: collect the keys live at its hash, since time travel is supported up to this state.
            try {
                gcStateParamsPerTask.getApi().getEntries().refName(Detached.REF_NAME).hashOnRef(logEntry.getCommitMeta().getHash()).get().getEntries().forEach(entries -> liveContentKeys.add(entries.getName()));
            } catch (NessieNotFoundException e) {
                throw new RuntimeException(e);
            }
        }
        logEntry.getOperations().stream().filter(operation -> operation instanceof Operation.Put).forEach(operation -> {
            boolean addContent;
            if (liveContentKeys.contains(operation.getKey())) {
                // commit head of this key
                addContent = true;
                liveContentKeys.remove(operation.getKey());
                if (liveContentKeys.isEmpty()) {
                    // found all the live commit heads before cutoff time.
                    foundAllLiveCommitHeadsBeforeCutoffTime.setTrue();
                }
            } else {
                addContent = !isExpired;
            }
            if (addContent) {
                Content content = ((Operation.Put) operation).getContent();
                bloomFilterMap.computeIfAbsent(content.getId(), k -> new ContentBloomFilter(gcStateParamsPerTask.getBloomFilterSize(), gcParams.getBloomFilterFpp())).put(content);
            }
        });
    }
}
Also used : Operation(org.projectnessie.model.Operation) Detached(org.projectnessie.model.Detached) LogResponse(org.projectnessie.model.LogResponse) Predicate(java.util.function.Predicate) Set(java.util.Set) HashMap(java.util.HashMap) Instant(java.time.Instant) OptionalInt(java.util.OptionalInt) Reference(org.projectnessie.model.Reference) Serializable(java.io.Serializable) NessieApiV1(org.projectnessie.client.api.NessieApiV1) HashSet(java.util.HashSet) Consumer(java.util.function.Consumer) FetchOption(org.projectnessie.api.params.FetchOption) Stream(java.util.stream.Stream) StreamingUtil(org.projectnessie.client.StreamingUtil) Map(java.util.Map) Content(org.projectnessie.model.Content) ContentKey(org.projectnessie.model.ContentKey) Function(org.apache.spark.api.java.function.Function) CommitMeta(org.projectnessie.model.CommitMeta) MutableBoolean(org.apache.commons.lang3.mutable.MutableBoolean) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException) SparkSession(org.apache.spark.sql.SparkSession)
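
The bookkeeping in handleLiveCommit can be hard to follow inside the lambda, so here is a minimal, self-contained sketch of the same idea using only java.util types. Everything in it is a hypothetical stand-in: LiveCommitSketch, Put, Commit and retainedContentIds are illustration names, not Nessie classes; the expired flag replaces the live-commit predicate, keysAtCommit replaces the getEntries() call, and a plain list of content ids replaces the per-content bloom filters. It also assumes the traversal stops once all commit heads have been found, which the snippet above does not show.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the live-commit walk; not the Nessie GC implementation.
final class LiveCommitSketch {

    /** Hypothetical stand-in for Operation.Put: a key and the id of the content written to it. */
    static final class Put {
        final String key;
        final String contentId;

        Put(String key, String contentId) {
            this.key = key;
            this.contentId = contentId;
        }
    }

    /** Hypothetical stand-in for a log entry. */
    static final class Commit {
        final boolean expired;          // true if the commit is older than the cutoff
        final Set<String> keysAtCommit; // keys visible at this commit's hash
        final List<Put> puts;           // put operations recorded in this commit

        Commit(boolean expired, Set<String> keysAtCommit, List<Put> puts) {
            this.expired = expired;
            this.keysAtCommit = keysAtCommit;
            this.puts = puts;
        }
    }

    /** Walks commits newest-first and returns the content ids that would be kept. */
    static List<String> retainedContentIds(List<Commit> newestFirst) {
        List<String> retained = new ArrayList<>();
        Set<String> liveKeys = new HashSet<>();
        boolean foundAllHeads = false;
        for (Commit commit : newestFirst) {
            if (foundAllHeads) {
                break; // assumption: the traversal stops once every commit head is found
            }
            if (commit.expired && liveKeys.isEmpty()) {
                // First expired commit: remember which keys are still reachable here.
                liveKeys.addAll(commit.keysAtCommit);
            }
            for (Put put : commit.puts) {
                boolean keep;
                if (liveKeys.remove(put.key)) {
                    // Commit head of this key at the cutoff state.
                    keep = true;
                    if (liveKeys.isEmpty()) {
                        foundAllHeads = true;
                    }
                } else {
                    // Not a pending head: keep only content from live commits.
                    keep = !commit.expired;
                }
                if (keep) {
                    retained.add(put.contentId);
                }
            }
        }
        return retained;
    }

    public static void main(String[] args) {
        List<Commit> log = List.of(
                new Commit(false, Set.of("a", "b"), List.of(new Put("a", "a3"))),
                new Commit(true, Set.of("a", "b"), List.of(new Put("b", "b2"))),
                new Commit(true, Set.of("a", "b"), List.of(new Put("a", "a1"))),
                new Commit(true, Set.of("b"), List.of(new Put("b", "b0"))));
        System.out.println(retainedContentIds(log)); // prints [a3, b2, a1]
    }
}
```

In the sample log, a1 is kept even though its commit is expired, because it is still the value of key a at the first expired commit; b0 is never visited, since both heads have been found by then.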

Example 20 with ContentKey

use of org.projectnessie.model.ContentKey in project nessie by projectnessie.

the class ITUpgradePath method keysUpgradeAddCommits.

@Test
@Order(203)
void keysUpgradeAddCommits() throws Exception {
    keysUpgradeBranch = (Branch) api.getReference().refName(keysUpgradeBranch.getName()).get();
    Map<ContentKey, IcebergTable> currentKeyValues = keysUpgradeAtHash.getOrDefault(keysUpgradeBranch.getHash(), Collections.emptyMap());
    for (int i = 0; i < keysUpgradeCommitsPerVersion; i++) {
        ContentKey key = ContentKey.of("keys.upgrade.table" + i);
        if ((i % 10) == 9) {
            keysUpgradeBranch = commitMaybeRetry(api.commitMultipleOperations().branch(keysUpgradeBranch).commitMeta(CommitMeta.fromMessage("Commit #" + i + "/delete from Nessie version " + version)).operation(Delete.of(key)));
            Map<ContentKey, IcebergTable> newKeyValues = new HashMap<>(currentKeyValues);
            newKeyValues.remove(key);
            keysUpgradeAtHash.put(keysUpgradeBranch.getHash(), newKeyValues);
        }
        Content currentContent = api.getContent().refName(keysUpgradeBranch.getName()).key(key).get().get(key);
        String cid = currentContent == null ? "table-" + i + "-" + version : currentContent.getId();
        IcebergTable newContent = IcebergTable.of("pointer-" + version + "-commit-" + i, keysUpgradeSequence++, i, i, i, cid);
        Put put = currentContent != null ? Put.of(key, newContent, currentContent) : Put.of(key, newContent);
        keysUpgradeBranch = commitMaybeRetry(api.commitMultipleOperations().branch(keysUpgradeBranch).commitMeta(CommitMeta.fromMessage("Commit #" + i + "/put from Nessie version " + version)).operation(put));
        Map<ContentKey, IcebergTable> newKeyValues = new HashMap<>(currentKeyValues);
        newKeyValues.remove(key);
        keysUpgradeAtHash.put(keysUpgradeBranch.getHash(), newKeyValues);
    }
}
Also used : ContentKey(org.projectnessie.model.ContentKey) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap) Content(org.projectnessie.model.Content) IcebergTable(org.projectnessie.model.IcebergTable) Put(org.projectnessie.model.Operation.Put) Order(org.junit.jupiter.api.Order) TestMethodOrder(org.junit.jupiter.api.TestMethodOrder) Test(org.junit.jupiter.api.Test)

Aggregations

ContentKey (org.projectnessie.model.ContentKey): 35
Branch (org.projectnessie.model.Branch): 20
Test (org.junit.jupiter.api.Test): 15
Content (org.projectnessie.model.Content): 15
IcebergTable (org.projectnessie.model.IcebergTable): 15
ParameterizedTest (org.junit.jupiter.params.ParameterizedTest): 10
CommitMeta (org.projectnessie.model.CommitMeta): 10
Put (org.projectnessie.model.Operation.Put): 7
Map (java.util.Map): 6
Entry (org.projectnessie.model.EntriesResponse.Entry): 6
List (java.util.List): 5
Collectors (java.util.stream.Collectors): 5
NessieApiV1 (org.projectnessie.client.api.NessieApiV1): 5
NessieNotFoundException (org.projectnessie.error.NessieNotFoundException): 5
LogResponse (org.projectnessie.model.LogResponse): 5
Reference (org.projectnessie.model.Reference): 5
HashMap (java.util.HashMap): 4
EnumSource (org.junit.jupiter.params.provider.EnumSource): 4
NessieReferenceNotFoundException (org.projectnessie.error.NessieReferenceNotFoundException): 4
LogEntry (org.projectnessie.model.LogResponse.LogEntry): 4