Example 1 with ImportJavaRDDOfElements

Use of uk.gov.gchq.gaffer.spark.operation.javardd.ImportJavaRDDOfElements in project Gaffer by gchq.

The example below is taken from the class ImportJavaRDDOfElementsHandlerTest, method checkImportJavaRDDOfElements.

@Test
public void checkImportJavaRDDOfElements() throws OperationException, IOException, InterruptedException {
    final Graph graph1 = new Graph.Builder()
            .config(new GraphConfig.Builder().graphId("graphId").build())
            .addSchema(getClass().getResourceAsStream("/schema/elements.json"))
            .addSchema(getClass().getResourceAsStream("/schema/types.json"))
            .addSchema(getClass().getResourceAsStream("/schema/serialisation.json"))
            .storeProperties(PROPERTIES)
            .build();
    // Build 10 entities and 20 edges to import
    final List<Element> elements = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
        final Entity entity = new Entity.Builder().group(TestGroups.ENTITY).vertex("" + i).build();
        final Edge edge1 = new Edge.Builder().group(TestGroups.EDGE).source("" + i).dest("B").directed(false).property(TestPropertyNames.COUNT, 2).build();
        final Edge edge2 = new Edge.Builder().group(TestGroups.EDGE).source("" + i).dest("C").directed(false).property(TestPropertyNames.COUNT, 4).build();
        elements.add(edge1);
        elements.add(edge2);
        elements.add(entity);
    }
    final User user = new User();
    final SparkSession sparkSession = SparkSessionProvider.getSparkSession();
    // Create Hadoop configuration and serialise to a string
    final Configuration configuration = new Configuration();
    final String configurationString = AbstractGetRDDHandler.convertConfigurationToString(configuration);
    // tempDir is assumed to be a JUnit @TempDir Path field declared on the test class
    final String outputPath = tempDir.resolve("output").toAbsolutePath().toString();
    final String failurePath = tempDir.resolve("failure").toAbsolutePath().toString();
    // Parallelise the elements into a JavaRDD and import them into the graph
    final JavaRDD<Element> elementJavaRDD = JavaSparkContext.fromSparkContext(sparkSession.sparkContext()).parallelize(elements);
    final ImportJavaRDDOfElements addRdd = new ImportJavaRDDOfElements.Builder()
            .input(elementJavaRDD)
            .option("outputPath", outputPath)
            .option("failurePath", failurePath)
            .build();
    graph1.execute(addRdd, user);
    // Check all elements were added by fetching everything back as a JavaRDD
    final GetJavaRDDOfAllElements rddQuery = new GetJavaRDDOfAllElements.Builder()
            .option(AbstractGetRDDHandler.HADOOP_CONFIGURATION_KEY, configurationString)
            .build();
    final JavaRDD<Element> rdd = graph1.execute(rddQuery, user);
    if (rdd == null) {
        fail("No RDD returned");
    }
    final Set<Element> results = new HashSet<>(rdd.collect());
    assertEquals(elements.size(), results.size());
}
Also used:
Entity (uk.gov.gchq.gaffer.data.element.Entity)
User (uk.gov.gchq.gaffer.user.User)
SparkSession (org.apache.spark.sql.SparkSession)
Configuration (org.apache.hadoop.conf.Configuration)
Element (uk.gov.gchq.gaffer.data.element.Element)
ArrayList (java.util.ArrayList)
GetJavaRDDOfAllElements (uk.gov.gchq.gaffer.spark.operation.javardd.GetJavaRDDOfAllElements)
Graph (uk.gov.gchq.gaffer.graph.Graph)
ImportJavaRDDOfElements (uk.gov.gchq.gaffer.spark.operation.javardd.ImportJavaRDDOfElements)
Edge (uk.gov.gchq.gaffer.data.element.Edge)
HashSet (java.util.HashSet)
Test (org.junit.jupiter.api.Test)
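
Distilled from the test above, the following is a minimal sketch of the import step on its own: parallelise a List of Elements into a JavaRDD, wrap it in an ImportJavaRDDOfElements operation with the outputPath and failurePath options used in the test, and execute it against the graph. The helper method name and its parameters are illustrative only, not part of the Gaffer API; the required imports match the "Also used" list above.

// Minimal sketch (not from the Gaffer codebase); assumes an already-built Graph,
// a User, a SparkSession, and the imports listed in "Also used" above.
public static void importElements(final Graph graph,
                                  final User user,
                                  final SparkSession sparkSession,
                                  final List<Element> elements,
                                  final String outputPath,
                                  final String failurePath) throws OperationException {
    // Turn the local list of elements into a JavaRDD
    final JavaRDD<Element> elementRdd = JavaSparkContext
            .fromSparkContext(sparkSession.sparkContext())
            .parallelize(elements);
    // Build the import operation with the same options as the test above
    final ImportJavaRDDOfElements importOp = new ImportJavaRDDOfElements.Builder()
            .input(elementRdd)
            .option("outputPath", outputPath)
            .option("failurePath", failurePath)
            .build();
    // Execute against the graph to add the elements
    graph.execute(importOp, user);
}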
