Example 11 with DocumentSet

use of org.exist.dom.persistent.DocumentSet in project exist by eXist-db.

the class NativeStructuralIndexWorkerTest method documentIdSet.

private DocumentSet documentIdSet(final List<Integer> documentIds) {
    final DocumentSet mockDocumentSet = createMock(DocumentSet.class);
    final List<DocumentImpl> docs = documentIds.stream().map(id -> {
        final DocumentImpl mockDocument = createMock(DocumentImpl.class);
        expect(mockDocument.getDocId()).andReturn(id).anyTimes();
        return mockDocument;
    }).collect(Collectors.toList());
    expect(mockDocumentSet.getDocumentIterator()).andReturn(docs.iterator());
    replay(mockDocumentSet);
    docs.forEach(EasyMock::replay);
    return mockDocumentSet;
}
Also used : Arrays(java.util.Arrays) List(java.util.List) RunWith(org.junit.runner.RunWith) DocumentImpl(org.exist.dom.persistent.DocumentImpl) Test(org.junit.Test) EasyMock(org.easymock.EasyMock) DocumentSet(org.exist.dom.persistent.DocumentSet) ParallelRunner(com.googlecode.junittoolbox.ParallelRunner) Collectors(java.util.stream.Collectors) Assert.assertEquals(org.junit.Assert.assertEquals)

Example 12 with DocumentSet

use of org.exist.dom.persistent.DocumentSet in project exist by eXist-db.

the class NativeStructuralIndexWorkerTest method getDocIdRanges_singleId.

@Test
public void getDocIdRanges_singleId() {
    final NativeStructuralIndexWorker indexWorker = new NativeStructuralIndexWorker(null);
    final DocumentSet docs = documentIdSet(Arrays.asList(6574));
    final List<NativeStructuralIndexWorker.Range> ranges = indexWorker.getDocIdRanges(docs);
    assertEquals(1, ranges.size());
    assertEquals(6574, ranges.get(0).start);
    assertEquals(6574, ranges.get(0).end);
}
Also used : DocumentSet(org.exist.dom.persistent.DocumentSet) Test(org.junit.Test)
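
The test above only pins down the single-id case: one document id yields one range whose start and end are both that id. As a rough illustration of how a sorted list of ids might collapse into inclusive ranges, here is a minimal, self-contained sketch. The names DocIdRangeSketch, Range and toRanges are hypothetical and this is not eXist-db's implementation of getDocIdRanges; the input is assumed to be sorted ascending with no duplicates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DocIdRangeSketch {

    /** Inclusive range of document ids (hypothetical stand-in, not NativeStructuralIndexWorker.Range). */
    public static final class Range {
        public final int start;
        public final int end;

        public Range(final int start, final int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Collapses ids into inclusive ranges of consecutive values.
     * Assumption: sortedIds is ascending and contains no duplicates.
     */
    public static List<Range> toRanges(final List<Integer> sortedIds) {
        final List<Range> ranges = new ArrayList<>();
        boolean open = false;
        int start = 0;
        int prev = 0;
        for (final int id : sortedIds) {
            if (!open) {
                // first id opens the first range
                start = id;
                open = true;
            } else if (id != prev + 1) {
                // gap found: close the current range and open a new one
                ranges.add(new Range(start, prev));
                start = id;
            }
            prev = id;
        }
        if (open) {
            // close the last range
            ranges.add(new Range(start, prev));
        }
        return ranges;
    }

    public static void main(final String[] args) {
        for (final Range r : toRanges(Arrays.asList(1, 2, 3, 7, 9, 10))) {
            System.out.println(r.start + ".." + r.end);
        }
    }
}
```

With a single id such as 6574, this sketch returns one range with start and end both equal to 6574, which matches what the test asserts about the real method.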

Example 13 with DocumentSet

use of org.exist.dom.persistent.DocumentSet in project exist by eXist-db.

the class LuceneIndexTest method reindex.

@Test
public void reindex() throws EXistException, CollectionConfigurationException, PermissionDeniedException, SAXException, LockException, IOException, QName.IllegalQNameException {
    final DocumentSet docs = configureAndStore(COLLECTION_CONFIG1, XML1, "dropDocument.xml");
    final BrokerPool pool = existEmbeddedServer.getBrokerPool();
    final TransactionManager transact = pool.getTransactionManager();
    try (final DBBroker broker = pool.get(Optional.of(pool.getSecurityManager().getSystemSubject()));
        final Txn transaction = transact.beginTransaction()) {
        broker.reindexCollection(transaction, TestConstants.TEST_COLLECTION_URI);
        checkIndex(docs, broker, new QName[] { new QName("head") }, "title", 1);
        final Occurrences[] o = checkIndex(docs, broker, new QName[] { new QName("p") }, "with", 1);
        assertEquals(2, o[0].getOccurrences());
        checkIndex(docs, broker, new QName[] { new QName("hi") }, "just", 1);
        checkIndex(docs, broker, null, "in", 1);
        final QName attrQN = new QName("rend", XMLConstants.NULL_NS_URI, ElementValue.ATTRIBUTE);
        checkIndex(docs, broker, new QName[] { attrQN }, null, 2);
        checkIndex(docs, broker, new QName[] { attrQN }, "center", 1);
        transaction.commit();
    }
}
Also used : DBBroker(org.exist.storage.DBBroker) TransactionManager(org.exist.storage.txn.TransactionManager) QName(org.exist.dom.QName) DefaultDocumentSet(org.exist.dom.persistent.DefaultDocumentSet) DocumentSet(org.exist.dom.persistent.DocumentSet) MutableDocumentSet(org.exist.dom.persistent.MutableDocumentSet) Txn(org.exist.storage.txn.Txn) BrokerPool(org.exist.storage.BrokerPool)

Example 14 with DocumentSet

use of org.exist.dom.persistent.DocumentSet in project exist by eXist-db.

the class NGramSearch method eval.

@Override
public Sequence eval(Sequence contextSequence, Item contextItem) throws XPathException {
    if (contextItem != null)
        contextSequence = contextItem.toSequence();
    NodeSet result;
    if (preselectResult == null) {
        Sequence input = getArgument(0).eval(contextSequence, contextItem);
        if (input.isEmpty())
            result = NodeSet.EMPTY_SET;
        else {
            long start = System.currentTimeMillis();
            NodeSet inNodes = input.toNodeSet();
            DocumentSet docs = inNodes.getDocumentSet();
            NGramIndexWorker index = (NGramIndexWorker) context.getBroker().getIndexController().getWorkerByIndexId(NGramIndex.ID);
            // Alternate design
            // NGramIndexWorker index =
            // (NGramIndexWorker)context.getBroker().getBrokerPool().getIndexManager().getIndexById(NGramIndex.ID).getWorker();
            String key = getArgument(1).eval(contextSequence, contextItem).getStringValue();
            List<QName> qnames = null;
            if (contextQName != null) {
                qnames = new ArrayList<>(1);
                qnames.add(contextQName);
            }
            result = processMatches(index, docs, qnames, key, inNodes, NodeSet.ANCESTOR);
            if (context.getProfiler().traceFunctions()) {
                // report index use
                context.getProfiler().traceIndexUsage(context, "ngram", this, PerformanceStats.BASIC_INDEX, System.currentTimeMillis() - start);
            }
        }
    } else {
        contextStep.setPreloadedData(contextSequence.getDocumentSet(), preselectResult);
        result = getArgument(0).eval(contextSequence).toNodeSet();
    }
    return result;
}
Also used : NodeSet(org.exist.dom.persistent.NodeSet) EmptyNodeSet(org.exist.dom.persistent.EmptyNodeSet) QName(org.exist.dom.QName) NGramIndexWorker(org.exist.indexing.ngram.NGramIndexWorker) WildcardedExpressionSequence(org.exist.xquery.modules.ngram.query.WildcardedExpressionSequence) DocumentSet(org.exist.dom.persistent.DocumentSet) FixedString(org.exist.xquery.modules.ngram.query.FixedString)

Example 15 with DocumentSet

use of org.exist.dom.persistent.DocumentSet in project exist by eXist-db.

the class NGramSearch method fixedStringSearch.

public NodeSet fixedStringSearch(final NGramIndexWorker index, final DocumentSet docs, final List<QName> qnames, final String query, final NodeSet nodeSet, final int axis) throws XPathException {
    String[] ngrams = NGramSearch.getDistinctNGrams(query, index.getN());
    // Nothing to search for? Then find nothing.
    if (ngrams.length == 0)
        return new EmptyNodeSet();
    String firstNgramm = ngrams[0];
    LOG.trace("First NGRAM: {}", firstNgramm);
    NodeSet result = index.search(getExpressionId(), docs, qnames, firstNgramm, firstNgramm, context, nodeSet, axis);
    for (int i = 1; i < ngrams.length; i++) {
        String ngram = ngrams[i];
        int len = ngram.codePointCount(0, ngram.length());
        int fillSize = index.getN() - len;
        String filledNgram = ngram;
        // If this ngram is shorter than n, pad it with characters from the previous ngram:
        // too-short ngrams lead to a considerable performance loss.
        if (fillSize > 0) {
            String filler = ngrams[i - 1];
            StringBuilder buf = new StringBuilder();
            int pos = filler.offsetByCodePoints(0, len);
            for (int j = 0; j < fillSize; j++) {
                int codepoint = filler.codePointAt(pos);
                pos += Character.charCount(codepoint);
                buf.appendCodePoint(codepoint);
            }
            buf.append(ngram);
            filledNgram = buf.toString();
            LOG.debug("Filled: {}", filledNgram);
        }
        NodeSet nodes = index.search(getExpressionId(), docs, qnames, filledNgram, ngram, context, nodeSet, axis);
        final NodeSet nodesContainingFirstINgrams = result;
        result = NodeSets.transformNodes(nodes, proxy -> Optional.ofNullable(nodesContainingFirstINgrams.get(proxy)).map(before -> getContinuousMatches(before, proxy)).orElse(null));
    }
    return result;
}
Also used : NodeSet(org.exist.dom.persistent.NodeSet) EmptyNodeSet(org.exist.dom.persistent.EmptyNodeSet) Match(org.exist.dom.persistent.Match) EvaluatableExpression(org.exist.xquery.modules.ngram.query.EvaluatableExpression) java.util(java.util) QName(org.exist.dom.QName) NodeProxy(org.exist.dom.persistent.NodeProxy) org.exist.xquery.value(org.exist.xquery.value) Wildcard(org.exist.xquery.modules.ngram.query.Wildcard) EmptyExpression(org.exist.xquery.modules.ngram.query.EmptyExpression) org.exist.xquery(org.exist.xquery) NodeProxies(org.exist.xquery.modules.ngram.utils.NodeProxies) NGramIndex(org.exist.indexing.ngram.NGramIndex) Matcher(java.util.regex.Matcher) NodeSets(org.exist.xquery.modules.ngram.utils.NodeSets) ElementValue(org.exist.storage.ElementValue) Error(org.exist.xquery.util.Error) DocumentSet(org.exist.dom.persistent.DocumentSet) AlternativeStrings(org.exist.xquery.modules.ngram.query.AlternativeStrings) StartAnchor(org.exist.xquery.modules.ngram.query.StartAnchor) NGramIndexWorker(org.exist.indexing.ngram.NGramIndexWorker) Logger(org.apache.logging.log4j.Logger) FixedString(org.exist.xquery.modules.ngram.query.FixedString) EndAnchor(org.exist.xquery.modules.ngram.query.EndAnchor) WildcardedExpressionSequence(org.exist.xquery.modules.ngram.query.WildcardedExpressionSequence) Pattern(java.util.regex.Pattern) WildcardedExpression(org.exist.xquery.modules.ngram.query.WildcardedExpression) LogManager(org.apache.logging.log4j.LogManager)
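
The padding step inside the loop of fixedStringSearch can be tried in isolation. The sketch below lifts that logic into a standalone helper so its handling of code points (rather than chars) is easier to see. The class NGramFillSketch and the method fill are hypothetical names, and the helper assumes the previous ngram holds at least n code points, as the original loop also appears to.

```java
public class NGramFillSketch {

    /**
     * Pads a short ngram on the left with trailing code points of the previous ngram
     * until it holds n code points. Assumption: previous has at least n code points.
     */
    public static String fill(final String previous, final String ngram, final int n) {
        final int len = ngram.codePointCount(0, ngram.length());
        final int fillSize = n - len;
        if (fillSize <= 0) {
            // already long enough, nothing to pad
            return ngram;
        }
        final StringBuilder buf = new StringBuilder();
        // skip the first len code points of the previous ngram
        int pos = previous.offsetByCodePoints(0, len);
        for (int j = 0; j < fillSize; j++) {
            final int codepoint = previous.codePointAt(pos);
            // advance by 1 or 2 chars depending on whether this is a surrogate pair
            pos += Character.charCount(codepoint);
            buf.appendCodePoint(codepoint);
        }
        buf.append(ngram);
        return buf.toString();
    }

    public static void main(final String[] args) {
        System.out.println(fill("abc", "d", 3));  // prints "bcd"
        System.out.println(fill("abc", "de", 3)); // prints "cde"
    }
}
```

Counting in code points means a supplementary character in the previous ngram is copied whole instead of being split across its surrogate pair.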

Aggregations

DocumentSet (org.exist.dom.persistent.DocumentSet): 50
QName (org.exist.dom.QName): 20
DefaultDocumentSet (org.exist.dom.persistent.DefaultDocumentSet): 18
Sequence (org.exist.xquery.value.Sequence): 18
MutableDocumentSet (org.exist.dom.persistent.MutableDocumentSet): 16
NodeSet (org.exist.dom.persistent.NodeSet): 14
DBBroker (org.exist.storage.DBBroker): 14
BrokerPool (org.exist.storage.BrokerPool): 13
CompiledXQuery (org.exist.xquery.CompiledXQuery): 12
XQuery (org.exist.xquery.XQuery): 12
IOException (java.io.IOException): 9
Txn (org.exist.storage.txn.Txn): 9
TransactionManager (org.exist.storage.txn.TransactionManager): 8
Test (org.junit.Test): 7
DocumentImpl (org.exist.dom.persistent.DocumentImpl): 6
InputSource (org.xml.sax.InputSource): 6
StringReader (java.io.StringReader): 5
LuceneIndexWorker (org.exist.indexing.lucene.LuceneIndexWorker): 5
XPathException (org.exist.xquery.XPathException): 5
VirtualNodeSet (org.exist.dom.persistent.VirtualNodeSet): 4