Example 1 with Schema

Use of org.hillview.table.Schema in project hillview by vmware.

In class SampleDistinctRowsSketch, method create.

@Override
public MinKSet<RowSnapshot> create(@Nullable ITable data) {
    IndexComparator comp = this.recordOrder.getIndexComparator(Converters.checkNull(data));
    Schema schema = this.recordOrder.toSchema();
    VirtualRowSnapshot vw = new VirtualRowSnapshot(data, schema);
    MinKRows mkRows = new MinKRows(numSamples);
    LongHashFunction hash = LongHashFunction.xx(this.seed);
    IRowIterator rowIt = data.getRowIterator();
    int currRow = rowIt.getNextRow();
    int maxRow, minRow;
    if (currRow == -1)
        return this.zero();
    else {
        minRow = currRow;
        maxRow = currRow;
    }
    while (currRow != -1) {
        vw.setRow(currRow);
        mkRows.push(hash.hashLong(vw.hashCode()), currRow);
        if (comp.compare(minRow, currRow) > 0)
            minRow = currRow;
        if (comp.compare(maxRow, currRow) < 0)
            maxRow = currRow;
        currRow = rowIt.getNextRow();
    }
    // Materialize the sampled row indexes into RowSnapshots, keyed by their hash.
    Long2ObjectRBTreeMap<RowSnapshot> hMap = new Long2ObjectRBTreeMap<>();
    for (long hashKey : mkRows.hashMap.keySet()) {
        hMap.put(hashKey, new RowSnapshot(data, mkRows.hashMap.get(hashKey), schema));
    }
    RowSnapshot minRS = new RowSnapshot(data, minRow, schema);
    RowSnapshot maxRS = new RowSnapshot(data, maxRow, schema);
    return new MinKSet<RowSnapshot>(numSamples, hMap, this.recordOrder.getRowComparator(), minRS, maxRS, data.getNumOfRows(), 0);
}
Also used: VirtualRowSnapshot (org.hillview.table.rows.VirtualRowSnapshot), RowSnapshot (org.hillview.table.rows.RowSnapshot), Long2ObjectRBTreeMap (it.unimi.dsi.fastutil.longs.Long2ObjectRBTreeMap), IndexComparator (org.hillview.table.api.IndexComparator), Schema (org.hillview.table.Schema), MinKRows (org.hillview.sketches.results.MinKRows), MinKSet (org.hillview.sketches.results.MinKSet), IRowIterator (org.hillview.table.api.IRowIterator), LongHashFunction (net.openhft.hashing.LongHashFunction)
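
The loop in Example 1 pushes a (hash, row index) pair for every row and keeps only numSamples of them. The internals of MinKRows are not shown here, so the following is only a minimal sketch of the general "keep the k smallest hashes" idea, assuming a sorted map as the container. The class MinKSampleSketch, its methods, and the mix function are hypothetical names for illustration and are not part of Hillview.

```java
import java.util.TreeMap;

/** Hypothetical stand-in for a min-k sampler; NOT Hillview's MinKRows. */
public class MinKSampleSketch {
    private final int k;
    // Sorted by hash, so the largest retained hash is always lastKey().
    private final TreeMap<Long, Integer> smallest = new TreeMap<>();

    public MinKSampleSketch(int k) {
        if (k <= 0) throw new IllegalArgumentException("k must be positive");
        this.k = k;
    }

    /** Offer a (hash, row) pair; it is kept only if its hash is among the k smallest seen. */
    public void push(long hash, int row) {
        if (smallest.containsKey(hash)) return;   // equal hash: treated as a repeated row
        if (smallest.size() < k) {
            smallest.put(hash, row);
        } else if (hash < smallest.lastKey()) {
            smallest.pollLastEntry();             // evict the current largest hash
            smallest.put(hash, row);
        }
    }

    public TreeMap<Long, Integer> contents() {
        return smallest;
    }

    /** SplitMix64 finalizer, used here only as a placeholder for a real hash function. */
    static long mix(long x) {
        x = (x ^ (x >>> 30)) * 0xbf58476d1ce4e5b9L;
        x = (x ^ (x >>> 27)) * 0x94d049bb133111ebL;
        return x ^ (x >>> 31);
    }

    public static void main(String[] args) {
        int[] values = {7, 3, 7, 9, 3, 1, 4, 7};  // repeated values hash identically
        MinKSampleSketch sketch = new MinKSampleSketch(3);
        for (int row = 0; row < values.length; row++) {
            sketch.push(mix(values[row]), row);
        }
        System.out.println(sketch.contents());    // at most 3 entries, one per distinct value
    }
}
```

Because equal row contents produce equal hashes, each distinct value can occupy at most one slot, which is what makes the retained set a sample of distinct rows rather than of all rows.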

Example 2 with Schema

Use of org.hillview.table.Schema in project hillview by vmware.

In class DPAccuracyBenchmarks, method benchmarkHistogramL1Accuracy.

public void benchmarkHistogramL1Accuracy() throws IOException {
    HillviewLogger.instance.setLogLevel(Level.OFF);
    @Nullable IDataSet<ITable> table = this.loadData();
    if (table == null) {
        System.out.println("Skipping test: no data");
        return;
    }
    Schema schema = this.loadSchema(table);
    List<String> cols = schema.getColumnNames();
    PrivacySchema mdSchema = PrivacySchema.loadFromFile(ontime_directory + privacy_metadata_name);
    Assert.assertNotNull(mdSchema);
    Assert.assertNotNull(mdSchema.quantization);
    HashMap<String, ArrayList<Double>> results = new HashMap<String, ArrayList<Double>>();
    int iterations = 10;
    for (String col : cols) {
        ColumnQuantization quantization = mdSchema.quantization.get(col);
        Assert.assertNotNull(quantization);
        double epsilon = mdSchema.epsilon(col);
        Pair<Double, Double> res = this.computeSingleColumnAccuracy(col, mdSchema.getColumnIndex(col), quantization, epsilon, table, iterations);
        System.out.println("Averaged absolute error over " + iterations + " iterations: " + res.first);
        // for JSON parsing convenience
        ArrayList<Double> resArr = new ArrayList<Double>();
        // noise
        resArr.add(res.first);
        // stdev
        resArr.add(res.second);
        results.put(col, resArr);
    }
    Gson resultsGson = new GsonBuilder().create();
    // try-with-resources flushes and closes the writer even if write() throws
    try (FileWriter writer = new FileWriter(histogram_results_filename)) {
        writer.write(resultsGson.toJson(results));
    }
}
Also used: HashMap (java.util.HashMap), PrivacySchema (org.hillview.table.PrivacySchema), Schema (org.hillview.table.Schema), FileWriter (java.io.FileWriter), ArrayList (java.util.ArrayList), ITable (org.hillview.table.api.ITable), ColumnQuantization (org.hillview.table.columns.ColumnQuantization), StringColumnQuantization (org.hillview.table.columns.StringColumnQuantization), DoubleColumnQuantization (org.hillview.table.columns.DoubleColumnQuantization), Nullable (javax.annotation.Nullable)
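
The benchmark above stores two numbers per column, an averaged absolute error and a standard deviation, but the helper that computes them is not shown. As a rough sketch of what such a summary could look like, the code below averages a per-bucket absolute (L1) error and then takes the mean and standard deviation across iterations. The class AccuracySummary and its methods are hypothetical, and the choices of a per-bucket average and a population standard deviation are assumptions, not something the source confirms.

```java
import java.util.Arrays;

/** Hypothetical helper; the actual Hillview accuracy computation is not shown in the source. */
public class AccuracySummary {
    /** Mean absolute difference between an exact and a noisy histogram (assumed per-bucket average). */
    static double meanAbsoluteError(double[] exact, double[] noisy) {
        if (exact.length != noisy.length || exact.length == 0)
            throw new IllegalArgumentException("histograms must be non-empty and of equal length");
        double total = 0;
        for (int i = 0; i < exact.length; i++) {
            total += Math.abs(exact[i] - noisy[i]);
        }
        return total / exact.length;
    }

    /** Arithmetic mean of the per-iteration errors. */
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    /** Population standard deviation (divides by n); a sample estimate would divide by n - 1. */
    static double stdev(double[] xs) {
        double m = mean(xs);
        double sumSquares = 0;
        for (double x : xs) {
            sumSquares += (x - m) * (x - m);
        }
        return Math.sqrt(sumSquares / xs.length);
    }

    public static void main(String[] args) {
        double[] exact = {10, 20, 30};
        // Each inner array plays the role of one noisy run of the same histogram.
        double[][] noisyRuns = {{11, 20, 28}, {9, 22, 30}, {10, 19, 31}};
        double[] errors = new double[noisyRuns.length];
        for (int i = 0; i < noisyRuns.length; i++) {
            errors[i] = meanAbsoluteError(exact, noisyRuns[i]);
        }
        System.out.println("mean error: " + mean(errors) + ", stdev: " + stdev(errors));
    }
}
```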

Example 3 with Schema

Use of org.hillview.table.Schema in project hillview by vmware.

In class DPAccuracyBenchmarks, method benchmarkHeatmapL1Accuracy.

public void benchmarkHeatmapL1Accuracy() throws IOException {
    HillviewLogger.instance.setLogLevel(Level.OFF);
    @Nullable IDataSet<ITable> table = this.loadData();
    if (table == null) {
        System.out.println("Skipping test: no data");
        return;
    }
    Schema schema = this.loadSchema(table);
    List<String> cols = schema.getColumnNames();
    PrivacySchema mdSchema = PrivacySchema.loadFromFile(ontime_directory + privacy_metadata_name);
    Assert.assertNotNull(mdSchema);
    Assert.assertNotNull(mdSchema.quantization);
    HashMap<String, ArrayList<Double>> results = new HashMap<String, ArrayList<Double>>();
    int iterations = 5;
    for (String col1 : cols) {
        for (String col2 : cols) {
            if (col1.equals(col2))
                continue;
            ColumnQuantization q1 = mdSchema.quantization.get(col1);
            Assert.assertNotNull(q1);
            ColumnQuantization q2 = mdSchema.quantization.get(col2);
            Assert.assertNotNull(q2);
            String key = mdSchema.getKeyForColumns(col1, col2);
            double epsilon = mdSchema.epsilon(key);
            Pair<Double, Double> res = this.computeHeatmapAccuracy(col1, q1, col2, q2, mdSchema.getColumnIndex(col1, col2), epsilon, table, iterations);
            System.out.println("Averaged absolute error over " + iterations + " iterations: " + res.first);
            // for JSON parsing convenience
            ArrayList<Double> resArr = new ArrayList<Double>();
            // noise
            resArr.add(res.first);
            // stdev
            resArr.add(res.second);
            System.out.println("Key: " + key + ", mean: " + res.first);
            results.put(key, resArr);
        }
    }
    Gson resultsGson = new GsonBuilder().create();
    // try-with-resources flushes and closes the writer even if write() throws
    try (FileWriter writer = new FileWriter(heatmap_results_filename)) {
        writer.write(resultsGson.toJson(results));
    }
}
Also used: HashMap (java.util.HashMap), PrivacySchema (org.hillview.table.PrivacySchema), Schema (org.hillview.table.Schema), FileWriter (java.io.FileWriter), ArrayList (java.util.ArrayList), ITable (org.hillview.table.api.ITable), ColumnQuantization (org.hillview.table.columns.ColumnQuantization), StringColumnQuantization (org.hillview.table.columns.StringColumnQuantization), DoubleColumnQuantization (org.hillview.table.columns.DoubleColumnQuantization), Nullable (javax.annotation.Nullable)

Example 4 with Schema

Use of org.hillview.table.Schema in project hillview by vmware.

In class PrivateTableTarget, method project.

@HillviewRpc
public void project(RpcRequest request, RpcRequestContext context) {
    Schema proj = request.parseArgs(Schema.class);
    ProjectMap map = new ProjectMap(proj);
    this.runMap(this.table, map, (d, c) -> new PrivateTableTarget(d, c, this.wrapper, this.metadataDirectory), request, context);
}
Also used: ProjectMap (org.hillview.maps.ProjectMap), PrivacySchema (org.hillview.table.PrivacySchema), QuantizationSchema (org.hillview.table.QuantizationSchema), Schema (org.hillview.table.Schema)

Example 5 with Schema

Use of org.hillview.table.Schema in project hillview by vmware.

In class JdbcDatabase, method getSchema.

public Schema getSchema() {
    try {
        Schema result = new Schema();
        ResultSetMetaData meta = this.getTableSchema();
        for (int i = 0; i < meta.getColumnCount(); i++) {
            ColumnDescription cd = JdbcDatabase.getDescription(meta, i);
            result.append(cd);
        }
        return result;
    } catch (SQLException e) {
        throw new RuntimeException(e);
    }
}
Also used: ColumnDescription (org.hillview.table.ColumnDescription), Schema (org.hillview.table.Schema)
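
Example 5 iterates from 0 and hands the index to JdbcDatabase.getDescription, whose body is not shown, so whether it shifts the index is unknown here. In JDBC itself, ResultSetMetaData columns are numbered from 1 to getColumnCount(). The sketch below shows that 1-based loop in isolation; JdbcColumnsSketch, describe, and demo are hypothetical names, and the in-memory RowSetMetaDataImpl is used only so that the example can run without a database.

```java
import java.sql.JDBCType;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Types;
import java.util.ArrayList;
import java.util.List;
import javax.sql.rowset.RowSetMetaDataImpl;

/** Hypothetical illustration of walking JDBC column metadata; not Hillview code. */
public class JdbcColumnsSketch {
    /** Returns "name:TYPE" for every column; JDBC column indexes are 1-based. */
    static List<String> describe(ResultSetMetaData meta) throws SQLException {
        List<String> columns = new ArrayList<>();
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            String type = JDBCType.valueOf(meta.getColumnType(i)).getName();
            columns.add(meta.getColumnName(i) + ":" + type);
        }
        return columns;
    }

    /** Builds two-column metadata in memory and describes it. */
    static List<String> demo() {
        try {
            RowSetMetaDataImpl meta = new RowSetMetaDataImpl();
            meta.setColumnCount(2);
            meta.setColumnName(1, "id");
            meta.setColumnType(1, Types.INTEGER);
            meta.setColumnName(2, "name");
            meta.setColumnType(2, Types.VARCHAR);
            return describe(meta);
        } catch (SQLException e) {
            // Same wrapping strategy as getSchema() in Example 5.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With real database metadata, obtained for instance from ResultSet.getMetaData(), the describe method would be called the same way.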

Aggregations

Schema (org.hillview.table.Schema): 27
ColumnDescription (org.hillview.table.ColumnDescription): 12
ITable (org.hillview.table.api.ITable): 12
Test (org.junit.Test): 9
LazySchema (org.hillview.table.LazySchema): 7
BaseTest (org.hillview.test.BaseTest): 7
Table (org.hillview.table.Table): 6
Nullable (javax.annotation.Nullable): 5
IRowIterator (org.hillview.table.api.IRowIterator): 5
RowSnapshot (org.hillview.table.rows.RowSnapshot): 5
ArrayList (java.util.ArrayList): 4
IColumn (org.hillview.table.api.IColumn): 4
File (java.io.File): 3
HashMap (java.util.HashMap): 3
FilterMap (org.hillview.maps.FilterMap): 3
PrivacySchema (org.hillview.table.PrivacySchema): 3
JsonElement (com.google.gson.JsonElement): 2
CsvFormat (com.univocity.parsers.csv.CsvFormat): 2
FileWriter (java.io.FileWriter): 2
IOException (java.io.IOException): 2