Example 1 with Operation

use of org.projectnessie.model.Operation in project iceberg by apache.

the class NessieCatalog method renameTable.

@Override
public void renameTable(TableIdentifier from, TableIdentifier toOriginal) {
    reference.checkMutable();
    TableIdentifier to = NessieUtil.removeCatalogName(toOriginal, name());
    IcebergTable existingFromTable = table(from);
    if (existingFromTable == null) {
        throw new NoSuchTableException("table %s doesn't exist", from.name());
    }
    IcebergTable existingToTable = table(to);
    if (existingToTable != null) {
        throw new AlreadyExistsException("table %s already exists", to.name());
    }
    CommitMultipleOperationsBuilder operations = api.commitMultipleOperations()
        .commitMeta(NessieUtil.buildCommitMetadata(
            String.format("Iceberg rename table from '%s' to '%s'", from, to), catalogOptions))
        .operation(Operation.Put.of(NessieUtil.toKey(to), existingFromTable, existingFromTable))
        .operation(Operation.Delete.of(NessieUtil.toKey(from)));
    try {
        Tasks.foreach(operations)
            .retry(5)
            .stopRetryOn(NessieNotFoundException.class)
            .throwFailureWhenFinished()
            .onFailure((o, exception) -> refresh())
            .run(ops -> {
            Branch branch = ops.branch(reference.getAsBranch()).commit();
            reference.updateReference(branch);
        }, BaseNessieClientServerException.class);
    } catch (NessieNotFoundException e) {
        // The reference itself (not the table) could no longer be found.
        throw new RuntimeException("Failed to rename table as ref is no longer valid.", e);
    } catch (BaseNessieClientServerException e) {
        throw new CommitFailedException(e, "Failed to rename table: the current reference is not up to date.");
    } catch (HttpClientException ex) {
        // Intentionally just "throw through" Nessie's HttpClientException here and do not "special case"
        // just the "timeout" variant to propagate all kinds of network errors (e.g. connection reset).
        // Network code implementation details and all kinds of network devices can induce unexpected
        // behavior. So better be safe than sorry.
        throw new CommitStateUnknownException(ex);
    }
}
Also used : TableIdentifier(org.apache.iceberg.catalog.TableIdentifier) AlreadyExistsException(org.apache.iceberg.exceptions.AlreadyExistsException) HttpClientBuilder(org.projectnessie.client.http.HttpClientBuilder) CatalogUtil(org.apache.iceberg.CatalogUtil) ImmutableMap(org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap) CommitStateUnknownException(org.apache.iceberg.exceptions.CommitStateUnknownException) LoggerFactory(org.slf4j.LoggerFactory) HadoopFileIO(org.apache.iceberg.hadoop.HadoopFileIO) Function(java.util.function.Function) Reference(org.projectnessie.model.Reference) NessieClientBuilder(org.projectnessie.client.NessieClientBuilder) HttpClientException(org.projectnessie.client.http.HttpClientException) NessieConflictException(org.projectnessie.error.NessieConflictException) CatalogProperties(org.apache.iceberg.CatalogProperties) TableOperations(org.apache.iceberg.TableOperations) NoSuchNamespaceException(org.apache.iceberg.exceptions.NoSuchNamespaceException) Map(java.util.Map) Configuration(org.apache.hadoop.conf.Configuration) BaseMetastoreCatalog(org.apache.iceberg.BaseMetastoreCatalog) NoSuchTableException(org.apache.iceberg.exceptions.NoSuchTableException) Namespace(org.apache.iceberg.catalog.Namespace) Content(org.projectnessie.model.Content) Configurable(org.apache.hadoop.conf.Configurable) SupportsNamespaces(org.apache.iceberg.catalog.SupportsNamespaces) CommitFailedException(org.apache.iceberg.exceptions.CommitFailedException) Operation(org.projectnessie.model.Operation) Logger(org.slf4j.Logger) TableIdentifier(org.apache.iceberg.catalog.TableIdentifier) Branch(org.projectnessie.model.Branch) Set(java.util.Set) Collectors(java.util.stream.Collectors) Joiner(org.apache.iceberg.relocated.com.google.common.base.Joiner) NessieApiV1(org.projectnessie.client.api.NessieApiV1) List(java.util.List) Stream(java.util.stream.Stream) NessieConfigConstants(org.projectnessie.client.NessieConfigConstants) 
IcebergTable(org.projectnessie.model.IcebergTable) Tasks(org.apache.iceberg.util.Tasks) DynMethods(org.apache.iceberg.common.DynMethods) Preconditions(org.apache.iceberg.relocated.com.google.common.base.Preconditions) Tag(org.projectnessie.model.Tag) BaseNessieClientServerException(org.projectnessie.error.BaseNessieClientServerException) ContentKey(org.projectnessie.model.ContentKey) FileIO(org.apache.iceberg.io.FileIO) CommitMultipleOperationsBuilder(org.projectnessie.client.api.CommitMultipleOperationsBuilder) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException) VisibleForTesting(org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting) TableReference(org.projectnessie.model.TableReference)
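
The rename above is expressed as one commit holding two operations: a put under the new key and a delete of the old key. The following is a minimal sketch of that idea over an in-memory map. It does not use the Nessie client, and `RenameSketch`, `Op`, `commit`, `rename` and `get` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a catalog; this is not the Nessie API. */
class RenameSketch {

    /** One change to the key space: a put (value != null) or a delete (value == null). */
    static final class Op {
        final String key;
        final String value;

        private Op(String key, String value) {
            this.key = key;
            this.value = value;
        }

        static Op put(String key, String value) {
            return new Op(key, value);
        }

        static Op delete(String key) {
            return new Op(key, null);
        }
    }

    private final Map<String, String> tables = new HashMap<>();

    String get(String key) {
        return tables.get(key);
    }

    /** Applies every operation to a copy first, so a failing batch changes nothing. */
    void commit(List<Op> ops) {
        Map<String, String> next = new HashMap<>(tables);
        for (Op op : ops) {
            if (op.value == null) {
                if (next.remove(op.key) == null) {
                    throw new IllegalStateException("missing key: " + op.key);
                }
            } else {
                next.put(op.key, op.value);
            }
        }
        // Only reached when all operations succeeded.
        tables.clear();
        tables.putAll(next);
    }

    /** Rename = put under the new key + delete of the old key, in a single commit. */
    void rename(String from, String to) {
        String existing = tables.get(from);
        if (existing == null) {
            throw new IllegalArgumentException("table " + from + " doesn't exist");
        }
        if (tables.containsKey(to)) {
            throw new IllegalStateException("table " + to + " already exists");
        }
        List<Op> ops = new ArrayList<>();
        ops.add(Op.put(to, existing));
        ops.add(Op.delete(from));
        commit(ops);
    }
}
```

Bundling both operations into one commit means a reader of the map never observes the table under both names or under neither, which is presumably why the original builds a single multi-operation commit rather than two separate ones.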

Example 2 with Operation

use of org.projectnessie.model.Operation in project iceberg by apache.

the class NessieCatalog method dropTable.

@Override
public boolean dropTable(TableIdentifier identifier, boolean purge) {
    reference.checkMutable();
    IcebergTable existingTable = table(identifier);
    if (existingTable == null) {
        return false;
    }
    if (purge) {
        LOG.info("Purging data for table {} was set to true but is ignored", identifier.toString());
    }
    CommitMultipleOperationsBuilder commitBuilderBase = api.commitMultipleOperations()
        .commitMeta(NessieUtil.buildCommitMetadata(
            String.format("Iceberg delete table %s", identifier), catalogOptions))
        .operation(Operation.Delete.of(NessieUtil.toKey(identifier)));
    // We try to drop the table. Simple retry after ref update.
    boolean threw = true;
    try {
        Tasks.foreach(commitBuilderBase)
            .retry(5)
            .stopRetryOn(NessieNotFoundException.class)
            .throwFailureWhenFinished()
            .onFailure((o, exception) -> refresh())
            .run(commitBuilder -> {
            Branch branch = commitBuilder.branch(reference.getAsBranch()).commit();
            reference.updateReference(branch);
        }, BaseNessieClientServerException.class);
        threw = false;
    } catch (NessieConflictException e) {
        LOG.error("Cannot drop table: failed after retry (update ref and retry)", e);
    } catch (NessieNotFoundException e) {
        LOG.error("Cannot drop table: ref is no longer valid.", e);
    } catch (BaseNessieClientServerException e) {
        LOG.error("Cannot drop table: unknown error", e);
    }
    return !threw;
}
Also used : AlreadyExistsException(org.apache.iceberg.exceptions.AlreadyExistsException) HttpClientBuilder(org.projectnessie.client.http.HttpClientBuilder) CatalogUtil(org.apache.iceberg.CatalogUtil) ImmutableMap(org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap) CommitStateUnknownException(org.apache.iceberg.exceptions.CommitStateUnknownException) LoggerFactory(org.slf4j.LoggerFactory) HadoopFileIO(org.apache.iceberg.hadoop.HadoopFileIO) Function(java.util.function.Function) Reference(org.projectnessie.model.Reference) NessieClientBuilder(org.projectnessie.client.NessieClientBuilder) HttpClientException(org.projectnessie.client.http.HttpClientException) NessieConflictException(org.projectnessie.error.NessieConflictException) CatalogProperties(org.apache.iceberg.CatalogProperties) TableOperations(org.apache.iceberg.TableOperations) NoSuchNamespaceException(org.apache.iceberg.exceptions.NoSuchNamespaceException) Map(java.util.Map) Configuration(org.apache.hadoop.conf.Configuration) BaseMetastoreCatalog(org.apache.iceberg.BaseMetastoreCatalog) NoSuchTableException(org.apache.iceberg.exceptions.NoSuchTableException) Namespace(org.apache.iceberg.catalog.Namespace) Content(org.projectnessie.model.Content) Configurable(org.apache.hadoop.conf.Configurable) SupportsNamespaces(org.apache.iceberg.catalog.SupportsNamespaces) CommitFailedException(org.apache.iceberg.exceptions.CommitFailedException) Operation(org.projectnessie.model.Operation) Logger(org.slf4j.Logger) TableIdentifier(org.apache.iceberg.catalog.TableIdentifier) Branch(org.projectnessie.model.Branch) Set(java.util.Set) Collectors(java.util.stream.Collectors) Joiner(org.apache.iceberg.relocated.com.google.common.base.Joiner) NessieApiV1(org.projectnessie.client.api.NessieApiV1) List(java.util.List) Stream(java.util.stream.Stream) NessieConfigConstants(org.projectnessie.client.NessieConfigConstants) IcebergTable(org.projectnessie.model.IcebergTable) Tasks(org.apache.iceberg.util.Tasks) 
DynMethods(org.apache.iceberg.common.DynMethods) Preconditions(org.apache.iceberg.relocated.com.google.common.base.Preconditions) Tag(org.projectnessie.model.Tag) BaseNessieClientServerException(org.projectnessie.error.BaseNessieClientServerException) ContentKey(org.projectnessie.model.ContentKey) FileIO(org.apache.iceberg.io.FileIO) CommitMultipleOperationsBuilder(org.projectnessie.client.api.CommitMultipleOperationsBuilder) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException) VisibleForTesting(org.apache.iceberg.relocated.com.google.common.annotations.VisibleForTesting) TableReference(org.projectnessie.model.TableReference)
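
Both catalog methods above share one pattern: attempt the commit, refresh the reference after a failure, retry a bounded number of times, and give up at once when the reference no longer exists. The sketch below shows that control flow using only the JDK. It is not Iceberg's `Tasks` utility, and `RetrySketch`, `ConflictException`, `RefNotFoundException` and `commitWithRetry` are hypothetical names.

```java
/** Hypothetical retry helper; a plain-JDK illustration, not Iceberg's Tasks class. */
class RetrySketch {

    /** Stand-in for a commit that lost a race against another writer. */
    static class ConflictException extends RuntimeException {
    }

    /** Stand-in for a reference that no longer exists; retrying cannot help. */
    static class RefNotFoundException extends RuntimeException {
    }

    /**
     * Runs the commit once, then up to maxRetries more times after conflicts.
     * The refresh callback runs after every conflict that will be retried.
     *
     * @return the number of attempts that were made, including the successful one
     */
    static int commitWithRetry(Runnable commit, Runnable refresh, int maxRetries) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                commit.run();
                return attempts;
            } catch (RefNotFoundException e) {
                // Stop immediately: the reference is gone, so a retry would fail the same way.
                throw e;
            } catch (ConflictException e) {
                if (attempts > maxRetries) {
                    throw e;
                }
                // Pick up the latest state of the reference before the next attempt.
                refresh.run();
            }
        }
    }
}
```

A caller would pass the commit as the first lambda and the reference refresh as the second, for example `RetrySketch.commitWithRetry(() -> doCommit(), () -> refresh(), 5)`, where `doCommit` and `refresh` are whatever the surrounding class provides.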

Example 3 with Operation

use of org.projectnessie.model.Operation in project nessie by projectnessie.

the class AbstractRestRefLog method testReflog.

@Test
public void testReflog() throws BaseNessieClientServerException {
    String tagName = "tag1_test_reflog";
    String branch1 = "branch1_test_reflog";
    String branch2 = "branch2_test_reflog";
    String branch3 = "branch3_test_reflog";
    String root = "ref_name_test_reflog";
    List<Tuple> expectedEntries = new ArrayList<>(12);
    // reflog 1: creating the default branch0
    Branch branch0 = createBranch(root);
    expectedEntries.add(Tuple.tuple(root, "CREATE_REFERENCE"));
    // reflog 2: create tag1
    Reference createdTag = getApi().createReference().sourceRefName(branch0.getName()).reference(Tag.of(tagName, branch0.getHash())).create();
    expectedEntries.add(Tuple.tuple(tagName, "CREATE_REFERENCE"));
    // reflog 3: create branch1
    Reference createdBranch1 = getApi().createReference().sourceRefName(branch0.getName()).reference(Branch.of(branch1, branch0.getHash())).create();
    expectedEntries.add(Tuple.tuple(branch1, "CREATE_REFERENCE"));
    // reflog 4: create branch2
    Reference createdBranch2 = getApi().createReference().sourceRefName(branch0.getName()).reference(Branch.of(branch2, branch0.getHash())).create();
    expectedEntries.add(Tuple.tuple(branch2, "CREATE_REFERENCE"));
    // reflog 5: create branch3
    Branch createdBranch3 = (Branch) getApi().createReference().sourceRefName(branch0.getName()).reference(Branch.of(branch3, branch0.getHash())).create();
    expectedEntries.add(Tuple.tuple(branch3, "CREATE_REFERENCE"));
    // reflog 6: commit on default branch0
    IcebergTable meta = IcebergTable.of("meep", 42, 42, 42, 42);
    branch0 = getApi().commitMultipleOperations().branchName(branch0.getName()).hash(branch0.getHash()).commitMeta(CommitMeta.builder().message("dummy commit log").properties(ImmutableMap.of("prop1", "val1", "prop2", "val2")).build()).operation(Operation.Put.of(ContentKey.of("meep"), meta)).commit();
    expectedEntries.add(Tuple.tuple(root, "COMMIT"));
    // reflog 7: assign tag
    getApi().assignTag().tagName(tagName).hash(createdTag.getHash()).assignTo(branch0).assign();
    expectedEntries.add(Tuple.tuple(tagName, "ASSIGN_REFERENCE"));
    // reflog 8: assign ref
    getApi().assignBranch().branchName(branch1).hash(createdBranch1.getHash()).assignTo(branch0).assign();
    expectedEntries.add(Tuple.tuple(branch1, "ASSIGN_REFERENCE"));
    // reflog 9: merge
    getApi().mergeRefIntoBranch().branchName(branch2).hash(createdBranch2.getHash()).fromRefName(branch1).fromHash(branch0.getHash()).merge();
    expectedEntries.add(Tuple.tuple(branch2, "MERGE"));
    // reflog 10: transplant
    getApi().transplantCommitsIntoBranch().hashesToTransplant(ImmutableList.of(Objects.requireNonNull(branch0.getHash()))).fromRefName(branch1).branch(createdBranch3).transplant();
    expectedEntries.add(Tuple.tuple(branch3, "TRANSPLANT"));
    // reflog 11: delete branch
    getApi().deleteBranch().branchName(branch1).hash(branch0.getHash()).delete();
    expectedEntries.add(Tuple.tuple(branch1, "DELETE_REFERENCE"));
    // reflog 12: delete tag
    getApi().deleteTag().tagName(tagName).hash(branch0.getHash()).delete();
    expectedEntries.add(Tuple.tuple(tagName, "DELETE_REFERENCE"));
    // The reflog output lists the newest entry first, so reverse the expected list.
    Collections.reverse(expectedEntries);
    RefLogResponse refLogResponse = getApi().getRefLog().get();
    // verify reflog entries
    assertThat(refLogResponse.getLogEntries().subList(0, 12)).extracting(RefLogResponse.RefLogResponseEntry::getRefName, RefLogResponse.RefLogResponseEntry::getOperation).isEqualTo(expectedEntries);
    // verify pagination (limit and token)
    RefLogResponse refLogResponse1 = getApi().getRefLog().maxRecords(2).get();
    assertThat(refLogResponse1.getLogEntries()).isEqualTo(refLogResponse.getLogEntries().subList(0, 2));
    assertThat(refLogResponse1.isHasMore()).isTrue();
    RefLogResponse refLogResponse2 = getApi().getRefLog().pageToken(refLogResponse1.getToken()).get();
    // should start from the token.
    assertThat(refLogResponse2.getLogEntries().get(0).getRefLogId()).isEqualTo(refLogResponse1.getToken());
    assertThat(refLogResponse2.getLogEntries().subList(0, 10)).isEqualTo(refLogResponse.getLogEntries().subList(2, 12));
    // verify startHash and endHash
    RefLogResponse refLogResponse3 = getApi().getRefLog().fromHash(refLogResponse.getLogEntries().get(10).getRefLogId()).get();
    assertThat(refLogResponse3.getLogEntries().subList(0, 2)).isEqualTo(refLogResponse.getLogEntries().subList(10, 12));
    RefLogResponse refLogResponse4 = getApi().getRefLog().fromHash(refLogResponse.getLogEntries().get(3).getRefLogId()).untilHash(refLogResponse.getLogEntries().get(5).getRefLogId()).get();
    assertThat(refLogResponse4.getLogEntries()).isEqualTo(refLogResponse.getLogEntries().subList(3, 6));
    // use invalid reflog id f1234d75178d892a133a410355a5a990cf75d2f33eba25d575943d4df632f3a4
    // computed using Hash.of(
    // UnsafeByteOperations.unsafeWrap(newHasher().putString("invalid",
    // StandardCharsets.UTF_8).hash().asBytes()));
    assertThatThrownBy(() -> getApi().getRefLog().fromHash("f1234d75178d892a133a410355a5a990cf75d2f33eba25d575943d4df632f3a4").get()).isInstanceOf(NessieRefLogNotFoundException.class).hasMessageContaining("RefLog entry for 'f1234d75178d892a133a410355a5a990cf75d2f33eba25d575943d4df632f3a4' does not exist");
    // verify source hashes for assign reference
    assertThat(refLogResponse.getLogEntries().get(4).getSourceHashes()).isEqualTo(Collections.singletonList(createdBranch1.getHash()));
    // verify source hashes for merge
    assertThat(refLogResponse.getLogEntries().get(3).getSourceHashes()).isEqualTo(Collections.singletonList(branch0.getHash()));
    // verify source hashes for transplant
    assertThat(refLogResponse.getLogEntries().get(2).getSourceHashes()).isEqualTo(Collections.singletonList(branch0.getHash()));
    // test filter with stream
    List<RefLogResponse.RefLogResponseEntry> filteredResult = StreamingUtil.getReflogStream(getApi(), builder -> builder.filter("reflog.operation == 'ASSIGN_REFERENCE' " + "&& reflog.refName == 'tag1_test_reflog'"), OptionalInt.empty()).collect(Collectors.toList());
    assertThat(filteredResult.size()).isEqualTo(1);
    assertThat(filteredResult.get(0)).extracting(RefLogResponse.RefLogResponseEntry::getRefName, RefLogResponse.RefLogResponseEntry::getOperation).isEqualTo(expectedEntries.get(5).toList());
}
Also used : Operation(org.projectnessie.model.Operation) Tuple(org.assertj.core.groups.Tuple) ImmutableMap(com.google.common.collect.ImmutableMap) Assertions.assertThat(org.assertj.core.api.Assertions.assertThat) Branch(org.projectnessie.model.Branch) OptionalInt(java.util.OptionalInt) Collectors(java.util.stream.Collectors) Reference(org.projectnessie.model.Reference) NessieRefLogNotFoundException(org.projectnessie.error.NessieRefLogNotFoundException) ArrayList(java.util.ArrayList) Objects(java.util.Objects) Test(org.junit.jupiter.api.Test) List(java.util.List) Assertions.assertThatThrownBy(org.assertj.core.api.Assertions.assertThatThrownBy) ImmutableList(com.google.common.collect.ImmutableList) StreamingUtil(org.projectnessie.client.StreamingUtil) IcebergTable(org.projectnessie.model.IcebergTable) Tag(org.projectnessie.model.Tag) BaseNessieClientServerException(org.projectnessie.error.BaseNessieClientServerException) ContentKey(org.projectnessie.model.ContentKey) CommitMeta(org.projectnessie.model.CommitMeta) Collections(java.util.Collections) RefLogResponse(org.projectnessie.model.RefLogResponse)

Example 4 with Operation

use of org.projectnessie.model.Operation in project nessie by projectnessie.

the class TreeApiImpl method filterCommitLog.

/**
 * Applies different filters to the {@link Stream} of commits based on the filter.
 *
 * @param logEntries The commit log that different filters will be applied to
 * @param filter The filter to filter by
 * @return A potentially filtered {@link Stream} of commits based on the filter
 */
private Stream<LogEntry> filterCommitLog(Stream<LogEntry> logEntries, String filter) {
    if (Strings.isNullOrEmpty(filter)) {
        return logEntries;
    }
    final Script script;
    try {
        script = SCRIPT_HOST.buildScript(filter).withContainer(CONTAINER).withDeclarations(COMMIT_LOG_DECLARATIONS).withTypes(COMMIT_LOG_TYPES).build();
    } catch (ScriptException e) {
        throw new IllegalArgumentException(e);
    }
    return logEntries.filter(logEntry -> {
        try {
            List<Operation> operations = logEntry.getOperations();
            if (operations == null) {
                operations = Collections.emptyList();
            }
            // ContentKey has some @JsonIgnore attributes, which would otherwise not be accessible.
            List<Object> operationsForCel = operations.stream().map(CELUtil::forCel).collect(Collectors.toList());
            return script.execute(Boolean.class, ImmutableMap.of(VAR_COMMIT, logEntry.getCommitMeta(), VAR_OPERATIONS, operationsForCel));
        } catch (ScriptException e) {
            throw new RuntimeException(e);
        }
    });
}
Also used : Script(org.projectnessie.cel.tools.Script) ScriptException(org.projectnessie.cel.tools.ScriptException) Operation(org.projectnessie.model.Operation)

Example 5 with Operation

use of org.projectnessie.model.Operation in project nessie by projectnessie.

the class IdentifyContentsPerExecutor method handleCommitForExpiredContents.

private static void handleCommitForExpiredContents(Reference reference, LogResponse.LogEntry logEntry, Map<String, ContentBloomFilter> liveContentsBloomFilterMap, IdentifiedResult result) {
    if (logEntry.getOperations() != null) {
        logEntry.getOperations().stream().filter(operation -> operation instanceof Operation.Put).forEach(operation -> {
            Content content = ((Operation.Put) operation).getContent();
            ContentBloomFilter bloomFilter = liveContentsBloomFilterMap.get(content.getId());
            // Live contents must never be considered expired, so only report content absent from the live filter.
            if (bloomFilter == null || !bloomFilter.mightContain(content)) {
                result.addContent(reference.getName(), content);
            }
        });
    }
}
Also used : Operation(org.projectnessie.model.Operation) Detached(org.projectnessie.model.Detached) LogResponse(org.projectnessie.model.LogResponse) Predicate(java.util.function.Predicate) Set(java.util.Set) HashMap(java.util.HashMap) Instant(java.time.Instant) OptionalInt(java.util.OptionalInt) Reference(org.projectnessie.model.Reference) Serializable(java.io.Serializable) NessieApiV1(org.projectnessie.client.api.NessieApiV1) HashSet(java.util.HashSet) Consumer(java.util.function.Consumer) FetchOption(org.projectnessie.api.params.FetchOption) Stream(java.util.stream.Stream) StreamingUtil(org.projectnessie.client.StreamingUtil) Map(java.util.Map) Content(org.projectnessie.model.Content) ContentKey(org.projectnessie.model.ContentKey) Function(org.apache.spark.api.java.function.Function) CommitMeta(org.projectnessie.model.CommitMeta) MutableBoolean(org.apache.commons.lang3.mutable.MutableBoolean) NessieNotFoundException(org.projectnessie.error.NessieNotFoundException) SparkSession(org.apache.spark.sql.SparkSession)
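
The expiry check in Example 5 relies on a property of Bloom filters: `mightContain` can return a false positive but never a false negative, so anything the live filter rejects is definitely not live. Below is a minimal sketch of that decision with a toy filter. `BloomSketch`, its hash scheme and `isExpired` are assumptions made for illustration and do not reflect how Nessie's `ContentBloomFilter` is implemented.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.Map;

/** Toy Bloom filter over strings; hypothetical, not Nessie's ContentBloomFilter. */
class BloomSketch {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    BloomSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    /** Simple seeded polynomial hash, mapped to a bit index in [0, size). */
    private int index(String value, int seed) {
        int h = seed;
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            h = 31 * h + b;
        }
        return Math.floorMod(h, size);
    }

    void put(String value) {
        for (int i = 0; i < hashes; i++) {
            bits.set(index(value, i + 1));
        }
    }

    /** False means "definitely absent"; true means "possibly present". */
    boolean mightContain(String value) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(value, i + 1))) {
                return false;
            }
        }
        return true;
    }

    /**
     * Mirrors the condition in the example: content is reported as expired when
     * there is no live filter for its id, or the live filter definitely lacks it.
     */
    static boolean isExpired(Map<String, BloomSketch> liveFilters, String contentId, String content) {
        BloomSketch filter = liveFilters.get(contentId);
        return filter == null || !filter.mightContain(content);
    }
}
```

Because false positives are possible, a check like this can keep some expired content around longer than necessary, but it cannot report live content as expired.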

Aggregations

Operation (org.projectnessie.model.Operation): 10
ContentKey (org.projectnessie.model.ContentKey): 8
Map (java.util.Map): 7
Stream (java.util.stream.Stream): 6
NessieApiV1 (org.projectnessie.client.api.NessieApiV1): 6
BaseNessieClientServerException (org.projectnessie.error.BaseNessieClientServerException): 6
Branch (org.projectnessie.model.Branch): 6
Reference (org.projectnessie.model.Reference): 6
List (java.util.List): 5
Collectors (java.util.stream.Collectors): 5
CommitMeta (org.projectnessie.model.CommitMeta): 5
IcebergTable (org.projectnessie.model.IcebergTable): 5
Set (java.util.Set): 4
FetchOption (org.projectnessie.api.params.FetchOption): 4
CommitMultipleOperationsBuilder (org.projectnessie.client.api.CommitMultipleOperationsBuilder): 4
NessieNotFoundException (org.projectnessie.error.NessieNotFoundException): 4
Content (org.projectnessie.model.Content): 4
LogResponse (org.projectnessie.model.LogResponse): 4
Function (java.util.function.Function): 3
Predicate (java.util.function.Predicate): 3