Search in sources :

Example 1 with MetadataEntry

use of co.cask.cdap.data2.metadata.dataset.MetadataEntry in project cdap by caskdata.

the class DefaultMetadataStore method removeProperties.

/**
 * Removes the specified properties of the {@link NamespacedEntityId}.
 */
@Override
public void removeProperties(final MetadataScope scope, final NamespacedEntityId namespacedEntityId, final String... keys) {
    final AtomicReference<MetadataRecord> previousRef = new AtomicReference<>();
    final ImmutableMap.Builder<String, String> deletesBuilder = ImmutableMap.builder();
    execute(new TransactionExecutor.Procedure<MetadataDataset>() {

        @Override
        public void apply(MetadataDataset input) throws Exception {
            previousRef.set(new MetadataRecord(namespacedEntityId, scope, input.getProperties(namespacedEntityId), input.getTags(namespacedEntityId)));
            for (String key : keys) {
                MetadataEntry record = input.getProperty(namespacedEntityId, key);
                if (record == null) {
                    continue;
                }
                deletesBuilder.put(record.getKey(), record.getValue());
            }
            input.removeProperties(namespacedEntityId, keys);
        }
    }, scope);
    publishAudit(previousRef.get(), new MetadataRecord(namespacedEntityId, scope), new MetadataRecord(namespacedEntityId, scope, deletesBuilder.build(), EMPTY_TAGS));
}
Also used : MetadataDataset(co.cask.cdap.data2.metadata.dataset.MetadataDataset) MetadataEntry(co.cask.cdap.data2.metadata.dataset.MetadataEntry) AtomicReference(java.util.concurrent.atomic.AtomicReference) TransactionExecutor(org.apache.tephra.TransactionExecutor) MetadataRecord(co.cask.cdap.common.metadata.MetadataRecord) ImmutableMap(com.google.common.collect.ImmutableMap) BadRequestException(co.cask.cdap.common.BadRequestException) DatasetManagementException(co.cask.cdap.api.dataset.DatasetManagementException) IOException(java.io.IOException)

Example 2 with MetadataEntry

use of co.cask.cdap.data2.metadata.dataset.MetadataEntry in project cdap by caskdata.

the class DefaultMetadataStore method search.

private MetadataSearchResponse search(Set<MetadataScope> scopes, String namespaceId, String searchQuery, Set<EntityTypeSimpleName> types, SortInfo sortInfo, int offset, int limit, int numCursors, String cursor, boolean showHidden, Set<EntityScope> entityScope) throws BadRequestException {
    if (offset < 0) {
        throw new IllegalArgumentException("offset must not be negative");
    }
    if (limit < 0) {
        throw new IllegalArgumentException("limit must not be negative");
    }
    List<MetadataEntry> results = new LinkedList<>();
    List<String> cursors = new LinkedList<>();
    for (MetadataScope scope : scopes) {
        SearchResults searchResults = getSearchResults(scope, namespaceId, searchQuery, types, sortInfo, offset, limit, numCursors, cursor, showHidden, entityScope);
        results.addAll(searchResults.getResults());
        cursors.addAll(searchResults.getCursors());
    }
    // sort if required
    Set<NamespacedEntityId> sortedEntities = getSortedEntities(results, sortInfo);
    int total = sortedEntities.size();
    // pagination is not performed at the dataset level, because:
    // 1. scoring is needed for DEFAULT sort info. So perform it here for now.
    // 2. Even when using custom sorting, we need to remove elements from the beginning to the offset and the cursors
    // at the end
    // TODO: Figure out how all of this can be done server (HBase) side
    int startIndex = Math.min(offset, sortedEntities.size());
    // Account for overflow
    int endIndex = (int) Math.min(Integer.MAX_VALUE, (long) offset + limit);
    endIndex = Math.min(endIndex, sortedEntities.size());
    // add 1 to maxIndex because end index is exclusive
    sortedEntities = new LinkedHashSet<>(ImmutableList.copyOf(sortedEntities).subList(startIndex, endIndex));
    // Fetch metadata for entities in the result list
    // Note: since the fetch is happening in a different transaction, the metadata for entities may have been
    // removed. It is okay not to have metadata for some results in case this happens.
    Map<NamespacedEntityId, Metadata> systemMetadata = fetchMetadata(sortedEntities, MetadataScope.SYSTEM);
    Map<NamespacedEntityId, Metadata> userMetadata = fetchMetadata(sortedEntities, MetadataScope.USER);
    return new MetadataSearchResponse(sortInfo.getSortBy() + " " + sortInfo.getSortOrder(), offset, limit, numCursors, total, addMetadataToEntities(sortedEntities, systemMetadata, userMetadata), cursors, showHidden, entityScope);
}
Also used : Metadata(co.cask.cdap.data2.metadata.dataset.Metadata) MetadataSearchResponse(co.cask.cdap.proto.metadata.MetadataSearchResponse) SearchResults(co.cask.cdap.data2.metadata.dataset.SearchResults) LinkedList(java.util.LinkedList) NamespacedEntityId(co.cask.cdap.proto.id.NamespacedEntityId) MetadataEntry(co.cask.cdap.data2.metadata.dataset.MetadataEntry) MetadataScope(co.cask.cdap.api.metadata.MetadataScope)

Example 3 with MetadataEntry

use of co.cask.cdap.data2.metadata.dataset.MetadataEntry in project cdap by caskdata.

the class DefaultMetadataStore method getSortedEntities.

private Set<NamespacedEntityId> getSortedEntities(List<MetadataEntry> results, SortInfo sortInfo) {
    // in this case, the backing storage is expected to return results in the expected order.
    if (SortInfo.SortOrder.WEIGHTED != sortInfo.getSortOrder()) {
        Set<NamespacedEntityId> entities = new LinkedHashSet<>(results.size());
        for (MetadataEntry metadataEntry : results) {
            entities.add(metadataEntry.getTargetId());
        }
        return entities;
    }
    // if sort order is weighted, score results by weight, and return in descending order of weights
    // Score results
    final Map<NamespacedEntityId, Integer> weightedResults = new HashMap<>();
    for (MetadataEntry metadataEntry : results) {
        Integer score = weightedResults.get(metadataEntry.getTargetId());
        score = (score == null) ? 0 : score;
        weightedResults.put(metadataEntry.getTargetId(), score + 1);
    }
    // Sort the results by score
    List<Map.Entry<NamespacedEntityId, Integer>> resultList = new ArrayList<>(weightedResults.entrySet());
    Collections.sort(resultList, SEARCH_RESULT_DESC_SCORE_COMPARATOR);
    Set<NamespacedEntityId> result = new LinkedHashSet<>(resultList.size());
    for (Map.Entry<NamespacedEntityId, Integer> entry : resultList) {
        result.add(entry.getKey());
    }
    return result;
}
Also used : LinkedHashSet(java.util.LinkedHashSet) NamespacedEntityId(co.cask.cdap.proto.id.NamespacedEntityId) MetadataEntry(co.cask.cdap.data2.metadata.dataset.MetadataEntry) HashMap(java.util.HashMap) ArrayList(java.util.ArrayList) MetadataEntry(co.cask.cdap.data2.metadata.dataset.MetadataEntry) Map(java.util.Map) ImmutableMap(com.google.common.collect.ImmutableMap) HashMap(java.util.HashMap)

Example 4 with MetadataEntry

use of co.cask.cdap.data2.metadata.dataset.MetadataEntry in project cdap by caskdata.

the class MetadataDataset method write.

private void write(NamespacedEntityId targetId, MetadataEntry entry, Set<Indexer> indexers) {
    String key = entry.getKey();
    MDSKey mdsValueKey = MdsKey.getMDSValueKey(targetId, key);
    Put put = new Put(mdsValueKey.getKey());
    // add the metadata value
    put.add(Bytes.toBytes(VALUE_COLUMN), Bytes.toBytes(entry.getValue()));
    indexedTable.put(put);
    storeIndexes(targetId, key, indexers, entry);
    writeHistory(targetId);
}
Also used : MDSKey(co.cask.cdap.data2.dataset2.lib.table.MDSKey) Put(co.cask.cdap.api.dataset.table.Put)

Example 5 with MetadataEntry

use of co.cask.cdap.data2.metadata.dataset.MetadataEntry in project cdap by caskdata.

the class MetadataDataset method storeIndexes.

/**
 * Store indexes for a {@link MetadataEntry}
 *
 * @param targetId the {@link NamespacedEntityId} from which the metadata indexes has to be stored
 * @param metadataKey the metadata key for which the indexes are to be stored
 * @param indexers {@link Set<String>} of {@link Indexer indexers} for this {@link MetadataEntry}
 * @param metadataEntry {@link MetadataEntry} for which indexes are to be stored
 */
private void storeIndexes(NamespacedEntityId targetId, String metadataKey, Set<Indexer> indexers, MetadataEntry metadataEntry) {
    // Delete existing indexes for targetId-key
    deleteIndexes(targetId, metadataKey);
    for (Indexer indexer : indexers) {
        Set<String> indexes = indexer.getIndexes(metadataEntry);
        String indexColumn = getIndexColumn(metadataKey, indexer.getSortOrder());
        for (String index : indexes) {
            // store just the index value
            indexedTable.put(getIndexPut(targetId, metadataKey, index, indexColumn));
        }
    }
}
Also used : ValueOnlyIndexer(co.cask.cdap.data2.metadata.indexer.ValueOnlyIndexer) DefaultValueIndexer(co.cask.cdap.data2.metadata.indexer.DefaultValueIndexer) InvertedValueIndexer(co.cask.cdap.data2.metadata.indexer.InvertedValueIndexer) SchemaIndexer(co.cask.cdap.data2.metadata.indexer.SchemaIndexer) Indexer(co.cask.cdap.data2.metadata.indexer.Indexer) InvertedTimeIndexer(co.cask.cdap.data2.metadata.indexer.InvertedTimeIndexer)

Aggregations

MetadataEntry (co.cask.cdap.data2.metadata.dataset.MetadataEntry)7 Test (org.junit.Test)6 Indexer (co.cask.cdap.data2.metadata.indexer.Indexer)4 InvertedValueIndexer (co.cask.cdap.data2.metadata.indexer.InvertedValueIndexer)4 NamespacedEntityId (co.cask.cdap.proto.id.NamespacedEntityId)4 Schema (co.cask.cdap.api.data.schema.Schema)3 Row (co.cask.cdap.api.dataset.table.Row)3 BadRequestException (co.cask.cdap.common.BadRequestException)3 DatasetId (co.cask.cdap.proto.id.DatasetId)3 ImmutableMap (com.google.common.collect.ImmutableMap)3 ArrayList (java.util.ArrayList)3 TransactionExecutor (org.apache.tephra.TransactionExecutor)3 Scanner (co.cask.cdap.api.dataset.table.Scanner)2 MDSKey (co.cask.cdap.data2.dataset2.lib.table.MDSKey)2 DefaultValueIndexer (co.cask.cdap.data2.metadata.indexer.DefaultValueIndexer)2 InvertedTimeIndexer (co.cask.cdap.data2.metadata.indexer.InvertedTimeIndexer)2 SchemaIndexer (co.cask.cdap.data2.metadata.indexer.SchemaIndexer)2 ValueOnlyIndexer (co.cask.cdap.data2.metadata.indexer.ValueOnlyIndexer)2 HashMap (java.util.HashMap)2 Map (java.util.Map)2