Example 16 with Content

use of org.projectnessie.model.Content in project nessie by projectnessie.

the class IdentifyContentsPerExecutor method handleCommitForExpiredContents.

private static void handleCommitForExpiredContents(Reference reference, LogResponse.LogEntry logEntry, Map<String, ContentBloomFilter> liveContentsBloomFilterMap, IdentifiedResult result) {
    if (logEntry.getOperations() != null) {
        logEntry.getOperations().stream().filter(operation -> operation instanceof Operation.Put).forEach(operation -> {
            Content content = ((Operation.Put) operation).getContent();
            ContentBloomFilter bloomFilter = liveContentsBloomFilterMap.get(content.getId());
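            // Content is expired when no live filter exists for its id or the filter does not contain it.
            // A Bloom filter has no false negatives, so live content is never reported as expired.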
            if (bloomFilter == null || !bloomFilter.mightContain(content)) {
                result.addContent(reference.getName(), content);
            }
        });
    }
}
Also used : Operation(org.projectnessie.model.Operation) Detached(org.projectnessie.model.Detached) LogResponse(org.projectnessie.model.LogResponse) Predicate(java.util.function.Predicate) Set(java.util.Set) HashMap(java.util.HashMap) Instant(java.time.Instant) OptionalInt(java.util.OptionalInt) Reference(org.projectnessie.model.Reference) Serializable(java.io.Serializable) NessieApiV1(org.projectnessie.client.api.NessieApiV1) HashSet(java.util.HashSet) Consumer(java.util.function.Consumer) FetchOption(org.projectnessie.api.params.FetchOption) Stream(java.util.stream.Stream) StreamingUtil(org.projectnessie.client.StreamingUtil) Map(java.util.Map) Content(org.projectnessie.model.Content) ContentKey(org.projectnessie.model.ContentKey) Function(org.apache.spark.api.java.function.Function) CommitMeta(org.projectnessie.model.CommitMeta) MutableBoolean(org.apache.commons.lang3.mutable.MutableBoolean) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException) SparkSession(org.apache.spark.sql.SparkSession)
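
The expiry check in Example 16 can be tried without a Nessie server. The following is a minimal sketch, assuming a hand-rolled two-probe Bloom filter over a BitSet in place of ContentBloomFilter; the names ExpiredContentSketch, SimpleBloomFilter, Item and findExpired are hypothetical and not part of Nessie.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExpiredContentSketch {

    /** Hypothetical stand-in for ContentBloomFilter: two hash probes into a BitSet. */
    static final class SimpleBloomFilter {
        private final BitSet bits;
        private final int size;

        SimpleBloomFilter(int size) {
            this.size = size;
            this.bits = new BitSet(size);
        }

        private int firstIndex(String value) {
            return Math.floorMod(value.hashCode(), size);
        }

        private int secondIndex(String value) {
            return Math.floorMod(value.hashCode() * 31 + 17, size);
        }

        void put(String value) {
            bits.set(firstIndex(value));
            bits.set(secondIndex(value));
        }

        /** May return a false positive, never a false negative. */
        boolean mightContain(String value) {
            return bits.get(firstIndex(value)) && bits.get(secondIndex(value));
        }
    }

    /** Hypothetical simplified content: a content id plus a snapshot marker. */
    static final class Item {
        final String contentId;
        final String snapshot;

        Item(String contentId, String snapshot) {
            this.contentId = contentId;
            this.snapshot = snapshot;
        }
    }

    /** Returns the items that no live filter claims to contain. */
    static List<Item> findExpired(List<Item> committed, Map<String, SimpleBloomFilter> liveFilters) {
        List<Item> expired = new ArrayList<>();
        for (Item item : committed) {
            SimpleBloomFilter filter = liveFilters.get(item.contentId);
            // Same condition as in the example: no filter for the id, or the filter misses the item.
            if (filter == null || !filter.mightContain(item.snapshot)) {
                expired.add(item);
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        Map<String, SimpleBloomFilter> liveFilters = new HashMap<>();
        liveFilters.computeIfAbsent("table-a", id -> new SimpleBloomFilter(1024)).put("snap-2");
        List<Item> committed = List.of(new Item("table-a", "snap-2"), new Item("table-b", "snap-1"));
        System.out.println("expired items: " + findExpired(committed, liveFilters).size());
    }
}
```

Because a Bloom filter can report false positives, a check like this may keep some expired content, but it should not flag live content as expired.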

Example 17 with Content

use of org.projectnessie.model.Content in project nessie by projectnessie.

the class IdentifyContentsPerExecutor method handleLiveCommit.

private void handleLiveCommit(GCStateParamsPerTask gcStateParamsPerTask, LogResponse.LogEntry logEntry, Map<String, ContentBloomFilter> bloomFilterMap, MutableBoolean foundAllLiveCommitHeadsBeforeCutoffTime, Set<ContentKey> liveContentKeys) {
    if (logEntry.getOperations() != null) {
        boolean isExpired = !gcStateParamsPerTask.getLiveCommitPredicate().test(logEntry.getCommitMeta());
        if (isExpired && liveContentKeys.isEmpty()) {
            // First expired commit: time travel is supported back to this state, so record every key visible at it.
            try {
                gcStateParamsPerTask.getApi().getEntries().refName(Detached.REF_NAME).hashOnRef(logEntry.getCommitMeta().getHash()).get().getEntries().forEach(entries -> liveContentKeys.add(entries.getName()));
            } catch (NessieNotFoundException e) {
                throw new RuntimeException(e);
            }
        }
        logEntry.getOperations().stream().filter(operation -> operation instanceof Operation.Put).forEach(operation -> {
            boolean addContent;
            if (liveContentKeys.contains(operation.getKey())) {
                // commit head of this key
                addContent = true;
                liveContentKeys.remove(operation.getKey());
                if (liveContentKeys.isEmpty()) {
                    // found all the live commit heads before cutoff time.
                    foundAllLiveCommitHeadsBeforeCutoffTime.setTrue();
                }
            } else {
                addContent = !isExpired;
            }
            if (addContent) {
                Content content = ((Operation.Put) operation).getContent();
                bloomFilterMap.computeIfAbsent(content.getId(), k -> new ContentBloomFilter(gcStateParamsPerTask.getBloomFilterSize(), gcParams.getBloomFilterFpp())).put(content);
            }
        });
    }
}
Also used : Operation(org.projectnessie.model.Operation) Detached(org.projectnessie.model.Detached) LogResponse(org.projectnessie.model.LogResponse) Predicate(java.util.function.Predicate) Set(java.util.Set) HashMap(java.util.HashMap) Instant(java.time.Instant) OptionalInt(java.util.OptionalInt) Reference(org.projectnessie.model.Reference) Serializable(java.io.Serializable) NessieApiV1(org.projectnessie.client.api.NessieApiV1) HashSet(java.util.HashSet) Consumer(java.util.function.Consumer) FetchOption(org.projectnessie.api.params.FetchOption) Stream(java.util.stream.Stream) StreamingUtil(org.projectnessie.client.StreamingUtil) Map(java.util.Map) Content(org.projectnessie.model.Content) ContentKey(org.projectnessie.model.ContentKey) Function(org.apache.spark.api.java.function.Function) CommitMeta(org.projectnessie.model.CommitMeta) MutableBoolean(org.apache.commons.lang3.mutable.MutableBoolean) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException) SparkSession(org.apache.spark.sql.SparkSession)
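
The walk in Example 17 can be hard to follow inline, so here is a minimal sketch of the same idea using only the JDK. It assumes commits are visited newest first and that each commit carries the set of keys visible at it. The types and names (LiveCommitWalkSketch, Commit, collectLiveValues) are hypothetical. The sketch tracks "heads captured" with a boolean flag and stops with a break, which simplifies the example's isEmpty() check and MutableBoolean.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LiveCommitWalkSketch {

    /** Hypothetical simplified commit: commit time, key -> value puts, and all keys visible at this commit. */
    static final class Commit {
        final Instant time;
        final Map<String, String> puts;
        final Set<String> visibleKeys;

        Commit(Instant time, Map<String, String> puts, Set<String> visibleKeys) {
            this.time = time;
            this.puts = puts;
            this.visibleKeys = visibleKeys;
        }
    }

    /** Walks commits newest-first and returns the values that must be retained. */
    static List<String> collectLiveValues(List<Commit> newestFirst, Instant cutoff) {
        List<String> live = new ArrayList<>();
        Set<String> pendingHeads = new HashSet<>();
        boolean headsCaptured = false;
        for (Commit commit : newestFirst) {
            boolean expired = commit.time.isBefore(cutoff);
            if (expired && !headsCaptured) {
                // First expired commit: every key visible here still needs its latest value.
                pendingHeads.addAll(commit.visibleKeys);
                headsCaptured = true;
            }
            for (Map.Entry<String, String> put : commit.puts.entrySet()) {
                if (pendingHeads.remove(put.getKey())) {
                    // Newest put for a key that is visible at the cutoff state: keep it.
                    live.add(put.getValue());
                } else if (!expired) {
                    // Any put in a commit newer than the cutoff is live.
                    live.add(put.getValue());
                }
            }
            if (headsCaptured && pendingHeads.isEmpty()) {
                break; // all heads before the cutoff were found
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Instant cutoff = Instant.ofEpochSecond(100);
        List<Commit> log = List.of(
            new Commit(Instant.ofEpochSecond(130), Map.of("a", "a3"), Set.of("a", "b")),
            new Commit(Instant.ofEpochSecond(90), Map.of("a", "a2"), Set.of("a", "b")),
            new Commit(Instant.ofEpochSecond(80), Map.of("b", "b1"), Set.of("a", "b")));
        System.out.println(collectLiveValues(log, cutoff));
    }
}
```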

Example 18 with Content

use of org.projectnessie.model.Content in project nessie by projectnessie.

the class AbstractRestGC method fillExpectedContents.

void fillExpectedContents(Branch branch, int numCommits, IdentifiedResult expected) throws NessieNotFoundException {
    fetchLogEntries(branch, numCommits).stream().map(LogEntry::getOperations).filter(Objects::nonNull).flatMap(Collection::stream).filter(op -> op instanceof Put).forEach(op -> {
        Content content = ((Put) op).getContent();
        expected.addContent(branch.getName(), content);
    });
}
Also used : Put(org.projectnessie.model.Operation.Put) Assertions.assertThat(org.assertj.core.api.Assertions.assertThat) HashMap(java.util.HashMap) Reference(org.projectnessie.model.Reference) NessieConflictException(org.projectnessie.error.NessieConflictException) ArrayList(java.util.ArrayList) Duration(java.time.Duration) Map(java.util.Map) Content(org.projectnessie.model.Content) CommitMeta(org.projectnessie.model.CommitMeta) SparkSession(org.apache.spark.sql.SparkSession) Operation(org.projectnessie.model.Operation) ImmutableMap(com.google.common.collect.ImmutableMap) Collection(java.util.Collection) Branch(org.projectnessie.model.Branch) LogEntry(org.projectnessie.model.LogResponse.LogEntry) Instant(java.time.Instant) NotNull(javax.validation.constraints.NotNull) Objects(java.util.Objects) List(java.util.List) FetchOption(org.projectnessie.api.params.FetchOption) IcebergView(org.projectnessie.model.IcebergView) IcebergTable(org.projectnessie.model.IcebergTable) CONF_NESSIE_URI(org.projectnessie.client.NessieConfigConstants.CONF_NESSIE_URI) AbstractRest(org.projectnessie.jaxrs.AbstractRest) ContentKey(org.projectnessie.model.ContentKey) Comparator(java.util.Comparator) CommitMultipleOperationsBuilder(org.projectnessie.client.api.CommitMultipleOperationsBuilder) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException)
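
The stream pipeline in Example 18 (skip entries without operations, flatten, keep only puts) can be reproduced with plain JDK types. This is a minimal sketch with hypothetical stand-ins (Op, Put, Delete, Entry, putContents) for the Nessie model classes. It is not the Nessie API.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class PutCollectorSketch {

    /** Hypothetical stand-in for org.projectnessie.model.Operation. */
    interface Op {}

    static final class Put implements Op {
        final String key;
        final String content;

        Put(String key, String content) {
            this.key = key;
            this.content = content;
        }
    }

    static final class Delete implements Op {
        final String key;

        Delete(String key) {
            this.key = key;
        }
    }

    /** Hypothetical log entry; operations may be null, as in the example. */
    static final class Entry {
        final List<Op> operations;

        Entry(List<Op> operations) {
            this.operations = operations;
        }
    }

    /** Collects the content of every put operation, in log order. */
    static List<String> putContents(List<Entry> log) {
        return log.stream()
            .map(entry -> entry.operations)
            .filter(Objects::nonNull)          // entries fetched without operations
            .flatMap(Collection::stream)       // one stream of all operations
            .filter(op -> op instanceof Put)   // ignore deletes
            .map(op -> ((Put) op).content)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Entry> log = Arrays.asList(
            new Entry(Arrays.asList(new Put("a", "c1"), new Delete("b"))),
            new Entry(null),
            new Entry(Arrays.asList(new Put("c", "c2"))));
        System.out.println(putContents(log));
    }
}
```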

Example 19 with Content

use of org.projectnessie.model.Content in project nessie by projectnessie.

the class ITUpgradePath method keysUpgradeAddCommits.

@Test
@Order(203)
void keysUpgradeAddCommits() throws Exception {
    keysUpgradeBranch = (Branch) api.getReference().refName(keysUpgradeBranch.getName()).get();
    Map<ContentKey, IcebergTable> currentKeyValues = keysUpgradeAtHash.getOrDefault(keysUpgradeBranch.getHash(), Collections.emptyMap());
    for (int i = 0; i < keysUpgradeCommitsPerVersion; i++) {
        ContentKey key = ContentKey.of("keys.upgrade.table" + i);
        if ((i % 10) == 9) {
            keysUpgradeBranch = commitMaybeRetry(api.commitMultipleOperations().branch(keysUpgradeBranch).commitMeta(CommitMeta.fromMessage("Commit #" + i + "/delete from Nessie version " + version)).operation(Delete.of(key)));
            Map<ContentKey, IcebergTable> newKeyValues = new HashMap<>(currentKeyValues);
            newKeyValues.remove(key);
            keysUpgradeAtHash.put(keysUpgradeBranch.getHash(), newKeyValues);
        }
        Content currentContent = api.getContent().refName(keysUpgradeBranch.getName()).key(key).get().get(key);
        String cid = currentContent == null ? "table-" + i + "-" + version : currentContent.getId();
        IcebergTable newContent = IcebergTable.of("pointer-" + version + "-commit-" + i, keysUpgradeSequence++, i, i, i, cid);
        Put put = currentContent != null ? Put.of(key, newContent, currentContent) : Put.of(key, newContent);
        keysUpgradeBranch = commitMaybeRetry(api.commitMultipleOperations().branch(keysUpgradeBranch).commitMeta(CommitMeta.fromMessage("Commit #" + i + "/put from Nessie version " + version)).operation(put));
        Map<ContentKey, IcebergTable> newKeyValues = new HashMap<>(currentKeyValues);
        newKeyValues.remove(key);
        keysUpgradeAtHash.put(keysUpgradeBranch.getHash(), newKeyValues);
    }
}
Also used : ContentKey(org.projectnessie.model.ContentKey) HashMap(java.util.HashMap) LinkedHashMap(java.util.LinkedHashMap) Content(org.projectnessie.model.Content) IcebergTable(org.projectnessie.model.IcebergTable) Put(org.projectnessie.model.Operation.Put) Order(org.junit.jupiter.api.Order) TestMethodOrder(org.junit.jupiter.api.TestMethodOrder) Test(org.junit.jupiter.api.Test)

Example 20 with Content

use of org.projectnessie.model.Content in project nessie by projectnessie.

the class ITUpgradePath method commitLog.

@Test
@Order(104)
void commitLog() {
    assertThat(api.getAllReferences().get().getReferences().stream().filter(r -> r.getName().startsWith(VERSION_BRANCH_PREFIX))).isNotEmpty().allSatisfy(ref -> {
        String versionFromRef = ref.getName().substring(VERSION_BRANCH_PREFIX.length());
        LogResponse commitLog = api.getCommitLog().refName(ref.getName()).get();
        String commitMessage = "hello world " + versionFromRef;
        assertThat(commitLog.getLogEntries()).hasSize(1).map(LogEntry::getCommitMeta).map(CommitMeta::getMessage).containsExactly(commitMessage);
    }).allSatisfy(ref -> {
        String versionFromRef = ref.getName().substring(VERSION_BRANCH_PREFIX.length());
        ContentKey key = ContentKey.of("my", "tables", "table_name");
        IcebergTable content = IcebergTable.of("metadata-location", 42L, 43, 44, 45, "content-id-" + versionFromRef);
        Map<ContentKey, Content> contents = api.getContent().reference(ref).key(key).get();
        assertThat(contents).containsExactly(entry(key, content));
    });
}
Also used : BeforeEach(org.junit.jupiter.api.BeforeEach) NessieVersion(org.projectnessie.tools.compatibility.api.NessieVersion) InstanceOfAssertFactories(org.assertj.core.api.InstanceOfAssertFactories) Assertions.assertThat(org.assertj.core.api.Assertions.assertThat) Order(org.junit.jupiter.api.Order) NessieConflictException(org.projectnessie.error.NessieConflictException) AfterAll(org.junit.jupiter.api.AfterAll) ExtendWith(org.junit.jupiter.api.extension.ExtendWith) BeforeAll(org.junit.jupiter.api.BeforeAll) Map(java.util.Map) Content(org.projectnessie.model.Content) Branch(org.projectnessie.model.Branch) Set(java.util.Set) LogEntry(org.projectnessie.model.LogResponse.LogEntry) Collectors(java.util.stream.Collectors) NessieApiV1(org.projectnessie.client.api.NessieApiV1) Test(org.junit.jupiter.api.Test) NessieUpgradesExtension(org.projectnessie.tools.compatibility.internal.NessieUpgradesExtension) List(java.util.List) Delete(org.projectnessie.model.Operation.Delete) StreamingUtil(org.projectnessie.client.StreamingUtil) Entry(java.util.Map.Entry) DatabaseAdapterConfig(org.projectnessie.versioned.persist.adapter.DatabaseAdapterConfig) ContentKey(org.projectnessie.model.ContentKey) CommitMultipleOperationsBuilder(org.projectnessie.client.api.CommitMultipleOperationsBuilder) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException) NonTransactionalDatabaseAdapterConfig(org.projectnessie.versioned.persist.nontx.NonTransactionalDatabaseAdapterConfig) Version(org.projectnessie.tools.compatibility.api.Version) IntStream(java.util.stream.IntStream) LogResponse(org.projectnessie.model.LogResponse) Put(org.projectnessie.model.Operation.Put) RefLogResponseEntry(org.projectnessie.model.RefLogResponse.RefLogResponseEntry) HashMap(java.util.HashMap) OptionalInt(java.util.OptionalInt) Reference(org.projectnessie.model.Reference) ArrayList(java.util.ArrayList) VersionCondition(org.projectnessie.tools.compatibility.api.VersionCondition) HashSet(java.util.HashSet) LinkedHashMap(java.util.LinkedHashMap) NessieReferenceConflictException(org.projectnessie.error.NessieReferenceConflictException) CommitMeta(org.projectnessie.model.CommitMeta) NessieAPI(org.projectnessie.tools.compatibility.api.NessieAPI) Tuple(org.assertj.core.groups.Tuple) TestMethodOrder(org.junit.jupiter.api.TestMethodOrder) Assumptions.assumeThat(org.assertj.core.api.Assumptions.assumeThat) Assertions.tuple(org.assertj.core.api.Assertions.tuple) Assertions.entry(org.assertj.core.api.Assertions.entry) MethodOrderer(org.junit.jupiter.api.MethodOrderer) AfterEach(org.junit.jupiter.api.AfterEach) IcebergTable(org.projectnessie.model.IcebergTable) Collections(java.util.Collections)

Aggregations

Content (org.projectnessie.model.Content): 32
ContentKey (org.projectnessie.model.ContentKey): 16
CommitMeta (org.projectnessie.model.CommitMeta): 11
IcebergTable (org.projectnessie.model.IcebergTable): 11
Test (org.junit.jupiter.api.Test): 9
NessieNotFoundException (org.projectnessie.error.NessieNotFoundException): 9
Branch (org.projectnessie.model.Branch): 8
Map (java.util.Map): 7
ByteString (com.google.protobuf.ByteString): 6
IcebergView (org.projectnessie.model.IcebergView): 6
HashMap (java.util.HashMap): 5
List (java.util.List): 5
ParameterizedTest (org.junit.jupiter.params.ParameterizedTest): 5
NessieApiV1 (org.projectnessie.client.api.NessieApiV1): 5
Reference (org.projectnessie.model.Reference): 5
Instant (java.time.Instant): 4
ArrayList (java.util.ArrayList): 4
LogResponse (org.projectnessie.model.LogResponse): 4
Put (org.projectnessie.model.Operation.Put): 4
HashSet (java.util.HashSet): 3