Search in sources :

Example 51 with DirectoryTaxonomyWriter

use of org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter in project lucene-solr by apache.

the class TestTaxonomyFacetSumValueSource method testWrongIndexFieldName.

public void testWrongIndexFieldName() throws Exception {
    Directory dir = newDirectory();
    Directory taxoDir = newDirectory();
    // Writes facet ords to a separate directory from the
    // main index:
    DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir, IndexWriterConfig.OpenMode.CREATE);
    FacetsConfig config = new FacetsConfig();
    config.setIndexFieldName("a", "$facets2");
    RandomIndexWriter writer = new RandomIndexWriter(random(), dir);
    Document doc = new Document();
    doc.add(new NumericDocValuesField("num", 10));
    doc.add(new FacetField("a", "foo1"));
    writer.addDocument(config.build(taxoWriter, doc));
    // NRT open
    IndexSearcher searcher = newSearcher(writer.getReader());
    writer.close();
    // NRT open
    TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
    taxoWriter.close();
    FacetsCollector c = new FacetsCollector();
    searcher.search(new MatchAllDocsQuery(), c);
    TaxonomyFacetSumValueSource facets = new TaxonomyFacetSumValueSource(taxoReader, config, c, DoubleValuesSource.fromIntField("num"));
    // Ask for top 10 labels for any dims that have counts:
    List<FacetResult> results = facets.getAllDims(10);
    assertTrue(results.isEmpty());
    expectThrows(IllegalArgumentException.class, () -> {
        facets.getSpecificValue("a");
    });
    expectThrows(IllegalArgumentException.class, () -> {
        facets.getTopChildren(10, "a");
    });
    IOUtils.close(searcher.getIndexReader(), taxoReader, dir, taxoDir);
}
Also used : IndexSearcher(org.apache.lucene.search.IndexSearcher) FacetsConfig(org.apache.lucene.facet.FacetsConfig) DirectoryTaxonomyReader(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader) FacetField(org.apache.lucene.facet.FacetField) Document(org.apache.lucene.document.Document) MatchAllDocsQuery(org.apache.lucene.search.MatchAllDocsQuery) FacetsCollector(org.apache.lucene.facet.FacetsCollector) DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) NumericDocValuesField(org.apache.lucene.document.NumericDocValuesField) FacetResult(org.apache.lucene.facet.FacetResult) RandomIndexWriter(org.apache.lucene.index.RandomIndexWriter) Directory(org.apache.lucene.store.Directory) DirectoryTaxonomyReader(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader)

Example 52 with DirectoryTaxonomyWriter

use of org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter in project lucene-solr by apache.

the class TestTaxonomyCombined method testReaderParent.

/**  Tests for TaxonomyReader's getParent() method.
    We check it by comparing its results to those we could have gotten by
    looking at the category string paths (where the parentage is obvious).
    Note that after testReaderBasic(), we already know we can trust the
    ordinal &lt;=&gt; category conversions.
    
    Note: At the moment, the parent methods in the reader are deprecated,
    but this does not mean they should not be tested! Until they are
    removed (*if* they are removed), these tests should remain to see
    that they still work correctly.
   */
@Test
public void testReaderParent() throws Exception {
    Directory indexDir = newDirectory();
    TaxonomyWriter tw = new DirectoryTaxonomyWriter(indexDir);
    fillTaxonomy(tw);
    tw.close();
    TaxonomyReader tr = new DirectoryTaxonomyReader(indexDir);
    // check that the parent of the root ordinal is the invalid ordinal:
    int[] parents = tr.getParallelTaxonomyArrays().parents();
    assertEquals(TaxonomyReader.INVALID_ORDINAL, parents[0]);
    // check parent of non-root ordinals:
    for (int ordinal = 1; ordinal < tr.getSize(); ordinal++) {
        FacetLabel me = tr.getPath(ordinal);
        int parentOrdinal = parents[ordinal];
        FacetLabel parent = tr.getPath(parentOrdinal);
        if (parent == null) {
            fail("Parent of " + ordinal + " is " + parentOrdinal + ", but this is not a valid category.");
        }
        // verify that the parent is indeed my parent, according to the strings
        if (!me.subpath(me.length - 1).equals(parent)) {
            fail("Got parent " + parentOrdinal + " for ordinal " + ordinal + " but categories are " + showcat(parent) + " and " + showcat(me) + " respectively.");
        }
    }
    tr.close();
    indexDir.close();
}
Also used : DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) DirectoryTaxonomyReader(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader) SlowRAMDirectory(org.apache.lucene.facet.SlowRAMDirectory) Directory(org.apache.lucene.store.Directory) DirectoryTaxonomyReader(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader) Test(org.junit.Test)

Example 53 with DirectoryTaxonomyWriter

use of org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter in project lucene-solr by apache.

the class TestTaxonomyCombined method testChildrenArraysInvariants.

/**
   * Similar to testChildrenArrays, except rather than look at
   * expected results, we test for several "invariants" that the results
   * should uphold, e.g., that a child of a category indeed has this category
   * as its parent. This sort of test can more easily be extended to larger
   * example taxonomies, because we do not need to build the expected list
   * of categories like we did in the above test.
   */
@Test
public void testChildrenArraysInvariants() throws Exception {
    Directory indexDir = newDirectory();
    TaxonomyWriter tw = new DirectoryTaxonomyWriter(indexDir);
    fillTaxonomy(tw);
    tw.close();
    TaxonomyReader tr = new DirectoryTaxonomyReader(indexDir);
    ParallelTaxonomyArrays ca = tr.getParallelTaxonomyArrays();
    int[] children = ca.children();
    assertEquals(tr.getSize(), children.length);
    int[] olderSiblingArray = ca.siblings();
    assertEquals(tr.getSize(), olderSiblingArray.length);
    // test that the "youngest child" of every category is indeed a child:
    int[] parents = tr.getParallelTaxonomyArrays().parents();
    for (int i = 0; i < tr.getSize(); i++) {
        int youngestChild = children[i];
        if (youngestChild != TaxonomyReader.INVALID_ORDINAL) {
            assertEquals(i, parents[youngestChild]);
        }
    }
    // (it can also be INVALID_ORDINAL, which is lower than any ordinal)
    for (int i = 0; i < tr.getSize(); i++) {
        assertTrue("olderSiblingArray[" + i + "] should be <" + i, olderSiblingArray[i] < i);
    }
    // (they share the same parent)
    for (int i = 0; i < tr.getSize(); i++) {
        int sibling = olderSiblingArray[i];
        if (sibling == TaxonomyReader.INVALID_ORDINAL) {
            continue;
        }
        assertEquals(parents[i], parents[sibling]);
    }
    // miss the first children in the chain)
    for (int i = 0; i < tr.getSize(); i++) {
        // Find the really youngest child:
        int j;
        for (j = tr.getSize() - 1; j > i; j--) {
            if (parents[j] == i) {
                // found youngest child
                break;
            }
        }
        if (j == i) {
            // no child found
            j = TaxonomyReader.INVALID_ORDINAL;
        }
        assertEquals(j, children[i]);
    }
    // middle or the end of the chain).
    for (int i = 0; i < tr.getSize(); i++) {
        // Find the youngest older sibling:
        int j;
        for (j = i - 1; j >= 0; j--) {
            if (parents[j] == parents[i]) {
                // found youngest older sibling
                break;
            }
        }
        if (j < 0) {
            // no sibling found
            j = TaxonomyReader.INVALID_ORDINAL;
        }
        assertEquals(j, olderSiblingArray[i]);
    }
    tr.close();
    indexDir.close();
}
Also used : DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) DirectoryTaxonomyReader(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader) SlowRAMDirectory(org.apache.lucene.facet.SlowRAMDirectory) Directory(org.apache.lucene.store.Directory) DirectoryTaxonomyReader(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader) Test(org.junit.Test)

Example 54 with DirectoryTaxonomyWriter

use of org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter in project lucene-solr by apache.

the class TestTaxonomyCombined method testChildrenArraysGrowth.

/**
   * Test how getChildrenArrays() deals with the taxonomy's growth:
   */
@Test
public void testChildrenArraysGrowth() throws Exception {
    Directory indexDir = newDirectory();
    TaxonomyWriter tw = new DirectoryTaxonomyWriter(indexDir);
    tw.addCategory(new FacetLabel("hi", "there"));
    tw.commit();
    TaxonomyReader tr = new DirectoryTaxonomyReader(indexDir);
    ParallelTaxonomyArrays ca = tr.getParallelTaxonomyArrays();
    assertEquals(3, tr.getSize());
    assertEquals(3, ca.siblings().length);
    assertEquals(3, ca.children().length);
    assertTrue(Arrays.equals(new int[] { 1, 2, -1 }, ca.children()));
    assertTrue(Arrays.equals(new int[] { -1, -1, -1 }, ca.siblings()));
    tw.addCategory(new FacetLabel("hi", "ho"));
    tw.addCategory(new FacetLabel("hello"));
    tw.commit();
    // Before refresh, nothing changed..
    ParallelTaxonomyArrays newca = tr.getParallelTaxonomyArrays();
    // we got exactly the same object
    assertSame(newca, ca);
    assertEquals(3, tr.getSize());
    assertEquals(3, ca.siblings().length);
    assertEquals(3, ca.children().length);
    // After the refresh, things change:
    TaxonomyReader newtr = TaxonomyReader.openIfChanged(tr);
    assertNotNull(newtr);
    tr.close();
    tr = newtr;
    ca = tr.getParallelTaxonomyArrays();
    assertEquals(5, tr.getSize());
    assertEquals(5, ca.siblings().length);
    assertEquals(5, ca.children().length);
    assertTrue(Arrays.equals(new int[] { 4, 3, -1, -1, -1 }, ca.children()));
    assertTrue(Arrays.equals(new int[] { -1, -1, -1, 2, 1 }, ca.siblings()));
    tw.close();
    tr.close();
    indexDir.close();
}
Also used : DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) DirectoryTaxonomyReader(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader) SlowRAMDirectory(org.apache.lucene.facet.SlowRAMDirectory) Directory(org.apache.lucene.store.Directory) DirectoryTaxonomyReader(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader) Test(org.junit.Test)

Example 55 with DirectoryTaxonomyWriter

use of org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter in project lucene-solr by apache.

the class TestTaxonomyCombined method testWriterTwice3.

/**
   * testWriterTwice3 is yet another test which tests creating a taxonomy
   * in two separate writing sessions. This test used to fail because of
   * a bug involving commit(), explained below, and now should succeed.
   */
@Test
public void testWriterTwice3() throws Exception {
    Directory indexDir = newDirectory();
    // First, create and fill the taxonomy
    TaxonomyWriter tw = new DirectoryTaxonomyWriter(indexDir);
    fillTaxonomy(tw);
    tw.close();
    // Now, open the same taxonomy and add the same categories again.
    // After a few categories, the LuceneTaxonomyWriter implementation
    // will stop looking for each category on disk, and rather read them
    // all into memory and close its reader. The bug was that it closed
    // the reader, but forgot that it did (because it didn't set the reader
    // reference to null).
    tw = new DirectoryTaxonomyWriter(indexDir);
    fillTaxonomy(tw);
    // Add one new category, just to make commit() do something:
    tw.addCategory(new FacetLabel("hi"));
    // Do a commit(). Here was a bug - if tw had a reader open, it should
    // be reopened after the commit. However, in our case the reader should
    // not be open (as explained above) but because it was not set to null,
    // we forgot that, tried to reopen it, and got an AlreadyClosedException.
    tw.commit();
    assertEquals(expectedCategories.length + 1, tw.getSize());
    tw.close();
    indexDir.close();
}
Also used : DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) DirectoryTaxonomyWriter(org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter) SlowRAMDirectory(org.apache.lucene.facet.SlowRAMDirectory) Directory(org.apache.lucene.store.Directory) Test(org.junit.Test)

Aggregations

DirectoryTaxonomyWriter (org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter)85 Directory (org.apache.lucene.store.Directory)73 DirectoryTaxonomyReader (org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader)52 Document (org.apache.lucene.document.Document)46 RandomIndexWriter (org.apache.lucene.index.RandomIndexWriter)45 FacetsConfig (org.apache.lucene.facet.FacetsConfig)35 FacetField (org.apache.lucene.facet.FacetField)31 Test (org.junit.Test)28 IndexSearcher (org.apache.lucene.search.IndexSearcher)27 MockAnalyzer (org.apache.lucene.analysis.MockAnalyzer)26 IndexWriter (org.apache.lucene.index.IndexWriter)25 Facets (org.apache.lucene.facet.Facets)22 SlowRAMDirectory (org.apache.lucene.facet.SlowRAMDirectory)21 FacetsCollector (org.apache.lucene.facet.FacetsCollector)17 IndexWriterConfig (org.apache.lucene.index.IndexWriterConfig)15 MatchAllDocsQuery (org.apache.lucene.search.MatchAllDocsQuery)15 FacetResult (org.apache.lucene.facet.FacetResult)14 TaxonomyReader (org.apache.lucene.facet.taxonomy.TaxonomyReader)13 DirectoryReader (org.apache.lucene.index.DirectoryReader)12 TaxonomyWriter (org.apache.lucene.facet.taxonomy.TaxonomyWriter)9