
Example 6 with AccumuloProperties

Use of uk.gov.gchq.gaffer.accumulostore.AccumuloProperties in project Gaffer by gchq.

From the class TableUtilsTest, the method shouldThrowExceptionIfTableNameIsNotSpecifiedWhenCreatingAGraph:

@Test(expected = AccumuloRuntimeException.class)
public void shouldThrowExceptionIfTableNameIsNotSpecifiedWhenCreatingAGraph() {
    // Given
    final Schema schema = new Schema.Builder()
            .type("int", Integer.class)
            .type("string", String.class)
            .type("boolean", Boolean.class)
            .edge("EDGE", new SchemaEdgeDefinition.Builder()
                    .source("string")
                    .destination("string")
                    .directed("boolean")
                    .build())
            .build();
    final AccumuloProperties properties = new AccumuloProperties();
    properties.setStoreClass(SingleUseMockAccumuloStore.class.getName());
    // When
    new Graph.Builder().addSchema(schema).storeProperties(properties).build();
    fail("The expected exception was not thrown.");
}
Also used: SingleUseMockAccumuloStore(uk.gov.gchq.gaffer.accumulostore.SingleUseMockAccumuloStore) Graph(uk.gov.gchq.gaffer.graph.Graph) AccumuloProperties(uk.gov.gchq.gaffer.accumulostore.AccumuloProperties) Schema(uk.gov.gchq.gaffer.store.schema.Schema) Test(org.junit.Test)
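
For contrast, a minimal sketch of the passing case, assuming this version of AccumuloProperties exposes a setTable setter (the table name is exactly what the test above omits):

    // Hypothetical happy path: same schema, but the table name is supplied.
    final AccumuloProperties properties = new AccumuloProperties();
    properties.setStoreClass(SingleUseMockAccumuloStore.class.getName());
    properties.setTable("myTable"); // assumed setter; omitting this is what triggers AccumuloRuntimeException
    final Graph graph = new Graph.Builder()
            .addSchema(schema)
            .storeProperties(properties)
            .build();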

Example 7 with AccumuloProperties

Use of uk.gov.gchq.gaffer.accumulostore.AccumuloProperties in project Gaffer by gchq.

From the class AccumuloStoreRelationTest, the method testBuildScanSpecifyColumnsAndFiltersWithView:

private void testBuildScanSpecifyColumnsAndFiltersWithView(final String name, final View view,
        final String[] requiredColumns, final Filter[] filters, final Predicate<Element> returnElement)
        throws OperationException, StoreException {
    // Given
    final SQLContext sqlContext = getSqlContext(name);
    final Schema schema = getSchema();
    final AccumuloProperties properties = AccumuloProperties.loadStoreProperties(
            getClass().getResourceAsStream("/store.properties"));
    final SingleUseMockAccumuloStore store = new SingleUseMockAccumuloStore();
    store.initialise(schema, properties);
    addElements(store);
    // When
    final AccumuloStoreRelation relation = new AccumuloStoreRelation(sqlContext, Collections.emptyList(), view, store, new User());
    final RDD<Row> rdd = relation.buildScan(requiredColumns, filters);
    final Row[] returnedElements = (Row[]) rdd.collect();
    // Then
    //  - Actual results are:
    final Set<Row> results = new HashSet<>(Arrays.asList(returnedElements));
    //  - Expected results are:
    final SchemaToStructTypeConverter schemaConverter =
            new SchemaToStructTypeConverter(schema, view, new ArrayList<>());
    final ConvertElementToRow elementConverter = new ConvertElementToRow(
            new LinkedHashSet<>(Arrays.asList(requiredColumns)),
            schemaConverter.getPropertyNeedsConversion(),
            schemaConverter.getConverterByProperty());
    final Set<Row> expectedRows = new HashSet<>();
    StreamSupport.stream(getElements().spliterator(), false)
            .filter(returnElement)
            .map(elementConverter::apply)
            .forEach(expectedRows::add);
    assertEquals(expectedRows, results);
    sqlContext.sparkContext().stop();
}
Also used: SingleUseMockAccumuloStore(uk.gov.gchq.gaffer.accumulostore.SingleUseMockAccumuloStore) User(uk.gov.gchq.gaffer.user.User) AccumuloProperties(uk.gov.gchq.gaffer.accumulostore.AccumuloProperties) Schema(uk.gov.gchq.gaffer.store.schema.Schema) ConvertElementToRow(uk.gov.gchq.gaffer.spark.operation.dataframe.ConvertElementToRow) Row(org.apache.spark.sql.Row) SchemaToStructTypeConverter(uk.gov.gchq.gaffer.spark.operation.dataframe.converter.schema.SchemaToStructTypeConverter) SQLContext(org.apache.spark.sql.SQLContext) HashSet(java.util.HashSet) LinkedHashSet(java.util.LinkedHashSet)
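
A hypothetical call to this helper, showing the shape of its arguments; the group name, property name and filter value below are illustrative, not taken from the test schema:

    // Hypothetical invocation; "BasicEntity" and "count" are made-up names.
    final View view = new View.Builder()
            .entity("BasicEntity")
            .build();
    final String[] requiredColumns = {"count"};
    final Filter[] filters = {new EqualTo("count", 2)}; // org.apache.spark.sql.sources.EqualTo
    testBuildScanSpecifyColumnsAndFiltersWithView("filteredScan", view, requiredColumns, filters,
            element -> Integer.valueOf(2).equals(element.getProperty("count")));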

Example 8 with AccumuloProperties

Use of uk.gov.gchq.gaffer.accumulostore.AccumuloProperties in project Gaffer by gchq.

From the class AccumuloStoreRelationTest, the method testBuildScanSpecifyColumnsWithView:

private void testBuildScanSpecifyColumnsWithView(final String name, final View view,
        final String[] requiredColumns, final Predicate<Element> returnElement)
        throws OperationException, StoreException {
    // Given
    final SQLContext sqlContext = getSqlContext(name);
    final Schema schema = getSchema();
    final AccumuloProperties properties = AccumuloProperties.loadStoreProperties(
            getClass().getResourceAsStream("/store.properties"));
    final SingleUseMockAccumuloStore store = new SingleUseMockAccumuloStore();
    store.initialise(schema, properties);
    addElements(store);
    // When
    final AccumuloStoreRelation relation = new AccumuloStoreRelation(sqlContext, Collections.emptyList(), view, store, new User());
    final RDD<Row> rdd = relation.buildScan(requiredColumns);
    final Row[] returnedElements = (Row[]) rdd.collect();
    // Then
    //  - Actual results are:
    final Set<Row> results = new HashSet<>(Arrays.asList(returnedElements));
    //  - Expected results are:
    final SchemaToStructTypeConverter schemaConverter =
            new SchemaToStructTypeConverter(schema, view, new ArrayList<>());
    final ConvertElementToRow elementConverter = new ConvertElementToRow(
            new LinkedHashSet<>(Arrays.asList(requiredColumns)),
            schemaConverter.getPropertyNeedsConversion(),
            schemaConverter.getConverterByProperty());
    final Set<Row> expectedRows = new HashSet<>();
    StreamSupport.stream(getElements().spliterator(), false)
            .filter(returnElement)
            .map(elementConverter::apply)
            .forEach(expectedRows::add);
    assertEquals(expectedRows, results);
    sqlContext.sparkContext().stop();
}
Also used: SingleUseMockAccumuloStore(uk.gov.gchq.gaffer.accumulostore.SingleUseMockAccumuloStore) User(uk.gov.gchq.gaffer.user.User) AccumuloProperties(uk.gov.gchq.gaffer.accumulostore.AccumuloProperties) Schema(uk.gov.gchq.gaffer.store.schema.Schema) ConvertElementToRow(uk.gov.gchq.gaffer.spark.operation.dataframe.ConvertElementToRow) Row(org.apache.spark.sql.Row) SchemaToStructTypeConverter(uk.gov.gchq.gaffer.spark.operation.dataframe.converter.schema.SchemaToStructTypeConverter) SQLContext(org.apache.spark.sql.SQLContext) HashSet(java.util.HashSet) LinkedHashSet(java.util.LinkedHashSet)
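
The only difference from Example 7 is the buildScan overload: no Spark Filters are pushed down here, so any restriction comes from the Gaffer View alone. A hypothetical call, with illustrative names:

    // Hypothetical invocation; only the view and the required columns restrict the scan.
    final View view = new View.Builder()
            .edge("BasicEdge")
            .build();
    testBuildScanSpecifyColumnsWithView("columnScan", view, new String[] {"count"},
            element -> "BasicEdge".equals(element.getGroup()));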

Example 9 with AccumuloProperties

Use of uk.gov.gchq.gaffer.accumulostore.AccumuloProperties in project Gaffer by gchq.

From the class InputFormatTest, the method shouldReturnCorrectDataToMapReduceJob:

private void shouldReturnCorrectDataToMapReduceJob(final Schema schema, final KeyPackage kp,
        final List<Element> data, final View view, final User user, final String instanceName,
        final Set<String> expectedResults) throws Exception {
    final AccumuloStore store = new MockAccumuloStore();
    final AccumuloProperties properties = AccumuloProperties.loadStoreProperties(StreamUtil.storeProps(getClass()));
    switch (kp) {
        case BYTE_ENTITY_KEY_PACKAGE:
            properties.setKeyPackageClass(ByteEntityKeyPackage.class.getName());
            properties.setInstance(instanceName + "_BYTE_ENTITY");
            break;
        case CLASSIC_KEY_PACKAGE:
            properties.setKeyPackageClass(ClassicKeyPackage.class.getName());
            properties.setInstance(instanceName + "_CLASSIC");
            break;
    }
    try {
        store.initialise(schema, properties);
    } catch (StoreException e) {
        fail("StoreException thrown: " + e);
    }
    setupGraph(store, data);
    // Set up local conf
    final JobConf conf = new JobConf();
    conf.set("fs.default.name", "file:///");
    conf.set("mapred.job.tracker", "local");
    final FileSystem fs = FileSystem.getLocal(conf);
    // Update configuration with instance, table name, etc.
    store.updateConfiguration(conf, view, user);
    // Run Driver
    final File outputFolder = testFolder.newFolder();
    FileUtils.deleteDirectory(outputFolder);
    final Driver driver = new Driver(outputFolder.getAbsolutePath());
    driver.setConf(conf);
    driver.run(new String[] {});
    // Read results and check correct
    final SequenceFile.Reader reader = new SequenceFile.Reader(fs, new Path(outputFolder + "/part-m-00000"), conf);
    final Text text = new Text();
    final Set<String> results = new HashSet<>();
    while (reader.next(text)) {
        results.add(text.toString());
    }
    reader.close();
    assertEquals(expectedResults, results);
    FileUtils.deleteDirectory(outputFolder);
}
Also used: Path(org.apache.hadoop.fs.Path) ClassicKeyPackage(uk.gov.gchq.gaffer.accumulostore.key.core.impl.classic.ClassicKeyPackage) MockAccumuloStore(uk.gov.gchq.gaffer.accumulostore.MockAccumuloStore) AccumuloProperties(uk.gov.gchq.gaffer.accumulostore.AccumuloProperties) Text(org.apache.hadoop.io.Text) ByteEntityKeyPackage(uk.gov.gchq.gaffer.accumulostore.key.core.impl.byteEntity.ByteEntityKeyPackage) StoreException(uk.gov.gchq.gaffer.store.StoreException) SequenceFile(org.apache.hadoop.io.SequenceFile) FileSystem(org.apache.hadoop.fs.FileSystem) AccumuloStore(uk.gov.gchq.gaffer.accumulostore.AccumuloStore) JobConf(org.apache.hadoop.mapred.JobConf) File(java.io.File) HashSet(java.util.HashSet)
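
The store.properties resource loaded above is not shown; a minimal sketch of the programmatic equivalent, assuming the usual AccumuloProperties setters and with placeholder connection values:

    // Hypothetical equivalent of the loaded store.properties; all values are placeholders.
    final AccumuloProperties properties = new AccumuloProperties();
    properties.setStoreClass(MockAccumuloStore.class.getName());
    properties.setInstance("someInstance");  // assumed setter for accumulo.instance
    properties.setZookeepers("aZookeeper");  // assumed setter for accumulo.zookeepers
    properties.setUser("user01");            // assumed setter for accumulo.user
    properties.setPassword("password");      // assumed setter for accumulo.password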

Example 10 with AccumuloProperties

Use of uk.gov.gchq.gaffer.accumulostore.AccumuloProperties in project Gaffer by gchq.

From the class AccumuloAddElementsFromHdfsJobFactoryTest, the method shouldSetNoMoreThanMaxNumberOfReducersSpecified:

@Test
public void shouldSetNoMoreThanMaxNumberOfReducersSpecified() throws IOException, StoreException, OperationException {
    // Given
    final MockAccumuloStore store = new MockAccumuloStore();
    final Schema schema = Schema.fromJson(StreamUtil.schemas(AccumuloAddElementsFromHdfsJobFactoryTest.class));
    final AccumuloProperties properties = AccumuloProperties.loadStoreProperties(
            StreamUtil.storeProps(AccumuloAddElementsFromHdfsJobFactoryTest.class));
    store.initialise(schema, properties);
    final JobConf localConf = createLocalConf();
    final FileSystem fs = FileSystem.getLocal(localConf);
    fs.mkdirs(new Path(outputDir));
    fs.mkdirs(new Path(splitsDir));
    try (final BufferedWriter writer = new BufferedWriter(new FileWriter(splitsFile))) {
        for (int i = 100; i < 200; i++) {
            writer.write(i + "\n");
        }
    }
    final SplitTable splitTable = new SplitTable.Builder().inputPath(splitsFile).build();
    store.execute(splitTable, new User());
    final AccumuloAddElementsFromHdfsJobFactory factory = new AccumuloAddElementsFromHdfsJobFactory();
    final Job job = Job.getInstance(localConf);
    // When
    AddElementsFromHdfs operation = new AddElementsFromHdfs.Builder()
            .outputPath(outputDir)
            .mapperGenerator(TextMapperGeneratorImpl.class)
            .option(AccumuloStoreConstants.OPERATION_BULK_IMPORT_MAX_REDUCERS, "10")
            .option(AccumuloStoreConstants.OPERATION_HDFS_SPLITS_FILE_PATH, "target/data/splits.txt")
            .build();
    factory.setupJobConf(localConf, operation, store);
    factory.setupJob(job, operation, store);
    // Then
    assertTrue(job.getNumReduceTasks() <= 10);
    // When
    operation = new AddElementsFromHdfs.Builder()
            .outputPath(outputDir)
            .mapperGenerator(TextMapperGeneratorImpl.class)
            .option(AccumuloStoreConstants.OPERATION_BULK_IMPORT_MAX_REDUCERS, "100")
            .option(AccumuloStoreConstants.OPERATION_HDFS_SPLITS_FILE_PATH, "target/data/splits.txt")
            .build();
    factory.setupJobConf(localConf, operation, store);
    factory.setupJob(job, operation, store);
    // Then
    assertTrue(job.getNumReduceTasks() <= 100);
    // When
    operation = new AddElementsFromHdfs.Builder()
            .outputPath(outputDir)
            .mapperGenerator(TextMapperGeneratorImpl.class)
            .option(AccumuloStoreConstants.OPERATION_BULK_IMPORT_MAX_REDUCERS, "1000")
            .option(AccumuloStoreConstants.OPERATION_HDFS_SPLITS_FILE_PATH, "target/data/splits.txt")
            .build();
    factory.setupJobConf(localConf, operation, store);
    factory.setupJob(job, operation, store);
    // Then
    assertTrue(job.getNumReduceTasks() <= 1000);
}
Also used: Path(org.apache.hadoop.fs.Path) AddElementsFromHdfs(uk.gov.gchq.gaffer.hdfs.operation.AddElementsFromHdfs) User(uk.gov.gchq.gaffer.user.User) MockAccumuloStore(uk.gov.gchq.gaffer.accumulostore.MockAccumuloStore) AccumuloProperties(uk.gov.gchq.gaffer.accumulostore.AccumuloProperties) Schema(uk.gov.gchq.gaffer.store.schema.Schema) FileWriter(java.io.FileWriter) BufferedWriter(java.io.BufferedWriter) FileSystem(org.apache.hadoop.fs.FileSystem) SplitTable(uk.gov.gchq.gaffer.accumulostore.operation.hdfs.operation.SplitTable) Job(org.apache.hadoop.mapreduce.Job) JobConf(org.apache.hadoop.mapred.JobConf) Test(org.junit.Test)
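
The reducer count the factory requests is driven by the table's split points (the 100 splits written above) and then capped by the max-reducers option. Assuming AccumuloStoreConstants also defines a companion OPERATION_BULK_IMPORT_MIN_REDUCERS option, a sketch of bounding the count from both sides:

    // Hypothetical: bounding reducers from both sides; the min-reducers constant is assumed.
    final AddElementsFromHdfs bounded = new AddElementsFromHdfs.Builder()
            .outputPath(outputDir)
            .mapperGenerator(TextMapperGeneratorImpl.class)
            .option(AccumuloStoreConstants.OPERATION_BULK_IMPORT_MIN_REDUCERS, "10")
            .option(AccumuloStoreConstants.OPERATION_BULK_IMPORT_MAX_REDUCERS, "100")
            .option(AccumuloStoreConstants.OPERATION_HDFS_SPLITS_FILE_PATH, "target/data/splits.txt")
            .build();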

Aggregations

AccumuloProperties (uk.gov.gchq.gaffer.accumulostore.AccumuloProperties): 15 uses
Schema (uk.gov.gchq.gaffer.store.schema.Schema): 13 uses
MockAccumuloStore (uk.gov.gchq.gaffer.accumulostore.MockAccumuloStore): 11 uses
Test (org.junit.Test): 9 uses
SingleUseMockAccumuloStore (uk.gov.gchq.gaffer.accumulostore.SingleUseMockAccumuloStore): 9 uses
User (uk.gov.gchq.gaffer.user.User): 7 uses
FileSystem (org.apache.hadoop.fs.FileSystem): 5 uses
Path (org.apache.hadoop.fs.Path): 5 uses
JobConf (org.apache.hadoop.mapred.JobConf): 5 uses
AccumuloStore (uk.gov.gchq.gaffer.accumulostore.AccumuloStore): 5 uses
BufferedWriter (java.io.BufferedWriter): 4 uses
HashSet (java.util.HashSet): 4 uses
Job (org.apache.hadoop.mapreduce.Job): 4 uses
AddElementsFromHdfs (uk.gov.gchq.gaffer.hdfs.operation.AddElementsFromHdfs): 4 uses
FileWriter (java.io.FileWriter): 3 uses
EnumSet (java.util.EnumSet): 3 uses
LinkedHashSet (java.util.LinkedHashSet): 3 uses
Row (org.apache.spark.sql.Row): 3 uses
SQLContext (org.apache.spark.sql.SQLContext): 3 uses
SplitTable (uk.gov.gchq.gaffer.accumulostore.operation.hdfs.operation.SplitTable): 3 uses