
Example 6 with FrequentItemsetsResult

Use of de.lmu.ifi.dbs.elki.result.FrequentItemsetsResult in project elki by elki-project.

From the class EclatTest, method testLarge.

@Test
public void testLarge() {
    Database db = loadTransactions(UNITTEST + "itemsets/zutaten.txt.gz", 16401);
    FrequentItemsetsResult res = new ELKIBuilder<>(Eclat.class)
        .with(Eclat.Parameterizer.MINSUPP_ID, 200)
        .build().run(db);
    assertEquals("Size not as expected.", 184, res.getItemsets().size());
}
Also used: Database (de.lmu.ifi.dbs.elki.database.Database), FrequentItemsetsResult (de.lmu.ifi.dbs.elki.result.FrequentItemsetsResult), Test (org.junit.Test)

Example 7 with FrequentItemsetsResult

Use of de.lmu.ifi.dbs.elki.result.FrequentItemsetsResult in project elki by elki-project.

From the class Eclat, method run.

/**
 * Run the Eclat algorithm
 *
 * @param db Database to process
 * @param relation Bit vector relation
 * @return Frequent patterns found
 */
public FrequentItemsetsResult run(Database db, final Relation<BitVector> relation) {
    // TODO: implement with resizable arrays, to not need dim.
    final int dim = RelationUtil.dimensionality(relation);
    final VectorFieldTypeInformation<BitVector> meta = RelationUtil.assumeVectorField(relation);
    // Compute absolute minsupport
    final int minsupp = getMinimumSupport(relation.size());
    LOG.verbose("Build 1-dimensional transaction lists.");
    Duration ctime = LOG.newDuration(STAT + "eclat.transposition.time").begin();
    DBIDs[] idx = buildIndex(relation, dim, minsupp);
    LOG.statistics(ctime.end());
    FiniteProgress prog = LOG.isVerbose() ? new FiniteProgress("Building frequent itemsets", idx.length, LOG) : null;
    Duration etime = LOG.newDuration(STAT + "eclat.extraction.time").begin();
    final List<Itemset> solution = new ArrayList<>();
    for (int i = 0; i < idx.length; i++) {
        LOG.incrementProcessed(prog);
        extractItemsets(idx, i, minsupp, solution);
    }
    LOG.ensureCompleted(prog);
    Collections.sort(solution);
    LOG.statistics(etime.end());
    LOG.statistics(new LongStatistic(STAT + "frequent-itemsets", solution.size()));
    return new FrequentItemsetsResult("Eclat", "eclat", solution, meta, relation.size());
}
Also used: BitVector (de.lmu.ifi.dbs.elki.data.BitVector), LongStatistic (de.lmu.ifi.dbs.elki.logging.statistics.LongStatistic), ArrayModifiableDBIDs (de.lmu.ifi.dbs.elki.database.ids.ArrayModifiableDBIDs), DBIDs (de.lmu.ifi.dbs.elki.database.ids.DBIDs), HashSetDBIDs (de.lmu.ifi.dbs.elki.database.ids.HashSetDBIDs), FiniteProgress (de.lmu.ifi.dbs.elki.logging.progress.FiniteProgress), ArrayList (java.util.ArrayList), Duration (de.lmu.ifi.dbs.elki.logging.statistics.Duration), FrequentItemsetsResult (de.lmu.ifi.dbs.elki.result.FrequentItemsetsResult)
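
The run method above delegates the real work to buildIndex and extractItemsets, which are not shown here. As a rough illustration of the general Eclat idea (transpose the transactions into one transaction-id list per item, then grow itemsets depth-first by intersecting those lists), here is a minimal standalone sketch. It is not ELKI's implementation: the class EclatSketch, the methods mine and extend, and the use of java.util.BitSet in place of ELKI's DBIDs are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical standalone sketch of Eclat; not the ELKI classes shown above.
public class EclatSketch {
  /**
   * Find all itemsets contained in at least minsupp transactions.
   * Each transaction lists item ids in the range [0, dim).
   * Returns a map from itemset (ascending item ids) to its support.
   */
  public static Map<List<Integer>, Integer> mine(int[][] transactions, int dim, int minsupp) {
    // Transposition: idx[i] holds the ids of all transactions containing item i.
    BitSet[] idx = new BitSet[dim];
    for (int i = 0; i < dim; i++) {
      idx[i] = new BitSet(transactions.length);
    }
    for (int t = 0; t < transactions.length; t++) {
      for (int item : transactions[t]) {
        idx[item].set(t);
      }
    }
    Map<List<Integer>, Integer> solution = new LinkedHashMap<>();
    for (int i = 0; i < dim; i++) {
      // Only frequent single items can start a frequent itemset.
      if (idx[i].cardinality() >= minsupp) {
        List<Integer> prefix = new ArrayList<>();
        prefix.add(i);
        extend(idx, prefix, idx[i], i + 1, minsupp, solution);
      }
    }
    return solution;
  }

  // Record the current prefix, then try to append every later item.
  private static void extend(BitSet[] idx, List<Integer> prefix, BitSet tids, int start,
      int minsupp, Map<List<Integer>, Integer> solution) {
    solution.put(new ArrayList<>(prefix), tids.cardinality());
    for (int j = start; j < idx.length; j++) {
      // Transactions containing the prefix and item j.
      BitSet both = (BitSet) tids.clone();
      both.and(idx[j]);
      if (both.cardinality() >= minsupp) {
        prefix.add(j);
        extend(idx, prefix, both, j + 1, minsupp, solution);
        prefix.remove(prefix.size() - 1);
      }
    }
  }

  public static void main(String[] args) {
    int[][] tx = { { 0, 1, 2 }, { 0, 1 }, { 0, 2 }, { 1, 2 }, { 0, 1, 2 } };
    mine(tx, 3, 3).forEach((items, support) -> System.out.println(items + " support=" + support));
  }
}
```

Because the support of an itemset can only shrink as items are added, a branch is abandoned as soon as an intersection falls below minsupp. Like the run method above, this sketch needs the dimensionality up front to allocate the per-item index.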
