
Example 1 with BigQueryRelation

Use of com.google.cloud.spark.bigquery.BigQueryRelation in the OpenLineage project.

From the class BigQueryNodeVisitor, method bigQuerySupplier:

private Optional<Supplier<BigQueryRelation>> bigQuerySupplier(LogicalPlan plan) {
    // SaveIntoDataSourceCommand is a special case because it references a CreatableRelationProvider.
    // Every other write instance references a LogicalRelation(BigQueryRelation, _, _, _).
    SQLContext sqlContext = context.getSparkSession().get().sqlContext();
    if (plan instanceof SaveIntoDataSourceCommand) {
        SaveIntoDataSourceCommand saveCommand = (SaveIntoDataSourceCommand) plan;
        CreatableRelationProvider relationProvider = saveCommand.dataSource();
        if (relationProvider instanceof BigQueryRelationProvider) {
            return Optional.of(
                () -> (BigQueryRelation) ((BigQueryRelationProvider) relationProvider)
                    .createRelation(sqlContext, saveCommand.options(), saveCommand.schema()));
        }
    } else if (plan instanceof LogicalRelation
        && ((LogicalRelation) plan).relation() instanceof BigQueryRelation) {
        return Optional.of(() -> (BigQueryRelation) ((LogicalRelation) plan).relation());
    }
    return Optional.empty();
}
Also used: LogicalRelation (org.apache.spark.sql.execution.datasources.LogicalRelation), BigQueryRelation (com.google.cloud.spark.bigquery.BigQueryRelation), SaveIntoDataSourceCommand (org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand), CreatableRelationProvider (org.apache.spark.sql.sources.CreatableRelationProvider), BigQueryRelationProvider (com.google.cloud.spark.bigquery.BigQueryRelationProvider), SQLContext (org.apache.spark.sql.SQLContext)
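
The Optional<Supplier<BigQueryRelation>> return type is worth noting: the Optional answers cheaply whether the plan node is BigQuery-related, while the Supplier defers the potentially expensive createRelation call until a caller actually needs the relation. Below is a minimal, self-contained sketch of the same pattern; all names here are illustrative and not from OpenLineage or the BigQuery connector.

import java.util.Optional;
import java.util.function.Supplier;

public class LazyRelationLookup {
    // Illustrative stand-in for materializing a BigQueryRelation; not the real connector call.
    static String createRelation() {
        System.out.println("materializing relation...");
        return "bigquery-public-data.samples.shakespeare";
    }

    // Same shape as bigQuerySupplier: Optional.empty() means "not a BigQuery node";
    // a present Supplier defers the expensive construction until get() is invoked.
    static Optional<Supplier<String>> lookup(boolean isBigQueryNode) {
        if (!isBigQueryNode) {
            return Optional.empty();
        }
        return Optional.of(LazyRelationLookup::createRelation);
    }

    public static void main(String[] args) {
        lookup(false).ifPresent(s -> System.out.println(s.get())); // no match: nothing materialized
        lookup(true).ifPresent(s -> System.out.println(s.get()));  // materializes only on get()
    }
}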

Example 2 with BigQueryRelation

Use of com.google.cloud.spark.bigquery.BigQueryRelation in the OpenLineage project.

From the class LogicalPlanSerializerTest, method testSerializeBigQueryPlan:

@Test
public void testSerializeBigQueryPlan() throws IOException {
    String query = "SELECT date FROM bigquery-public-data.google_analytics_sample.test";
    System.setProperty("GOOGLE_CLOUD_PROJECT", "test_serialization");
    SparkBigQueryConfig config = SparkBigQueryConfig.from(
        ImmutableMap.of("query", query, "dataset", "test-dataset", "maxparallelism", "2", "partitionexpirationms", "2"),
        ImmutableMap.of(), new Configuration(), 10, SQLConf.get(), "", Optional.empty());
    BigQueryRelation bigQueryRelation = new BigQueryRelation(
        config,
        TableInfo.newBuilder(TableId.of("dataset", "test"), new TestTableDefinition()).build(),
        mock(SQLContext.class));
    LogicalRelation logicalRelation = new LogicalRelation(
        bigQueryRelation,
        Seq$.MODULE$.<AttributeReference>newBuilder()
            .$plus$eq(new AttributeReference("name", StringType$.MODULE$, false, Metadata.empty(), ExprId.apply(1L), Seq$.MODULE$.<String>empty()))
            .result(),
        Option.empty(),
        false);
    InsertIntoDataSourceCommand command = new InsertIntoDataSourceCommand(logicalRelation, logicalRelation, false);
    Map<String, Object> commandActualNode =
        objectMapper.readValue(logicalPlanSerializer.serialize(command), mapTypeReference);
    Map<String, Object> bigqueryActualNode =
        objectMapper.readValue(logicalPlanSerializer.serialize(logicalRelation), mapTypeReference);
    Path expectedCommandNodePath = Paths.get("src", "test", "resources", "test_data", "serde", "insertintods-node.json");
    Path expectedBigQueryRelationNodePath = Paths.get("src", "test", "resources", "test_data", "serde", "bigqueryrelation-node.json");
    Map<String, Object> expectedCommandNode = objectMapper.readValue(expectedCommandNodePath.toFile(), mapTypeReference);
    Map<String, Object> expectedBigQueryRelationNode =
        objectMapper.readValue(expectedBigQueryRelationNodePath.toFile(), mapTypeReference);
    // "exprId" is non-deterministic across runs, so it is excluded from the recursive comparison.
    assertThat(commandActualNode).satisfies(new MatchesMapRecursively(expectedCommandNode, Collections.singleton("exprId")));
    assertThat(bigqueryActualNode).satisfies(new MatchesMapRecursively(expectedBigQueryRelationNode, Collections.singleton("exprId")));
}
Also used: Path (java.nio.file.Path), SparkBigQueryConfig (com.google.cloud.spark.bigquery.SparkBigQueryConfig), Configuration (org.apache.hadoop.conf.Configuration), AttributeReference (org.apache.spark.sql.catalyst.expressions.AttributeReference), InsertIntoDataSourceCommand (org.apache.spark.sql.execution.datasources.InsertIntoDataSourceCommand), LogicalRelation (org.apache.spark.sql.execution.datasources.LogicalRelation), BigQueryRelation (com.google.cloud.spark.bigquery.BigQueryRelation), SQLContext (org.apache.spark.sql.SQLContext), Test (org.junit.jupiter.api.Test)
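
The test compares the serialized plan against fixture JSON with MatchesMapRecursively, skipping keys such as exprId whose values change between runs. A rough, self-contained sketch of what such a recursive comparison could look like follows; this is an assumption about its behavior, not the actual OpenLineage implementation.

import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class RecursiveMapMatcher {
    // Assumed behavior (not OpenLineage's code): every key in `expected` must be
    // present in `actual` with an equal value, descending into nested maps, while
    // keys in `ignored` (e.g. "exprId") are skipped at every level.
    @SuppressWarnings("unchecked")
    static boolean matches(Map<String, Object> expected, Map<String, Object> actual, Set<String> ignored) {
        for (Map.Entry<String, Object> entry : expected.entrySet()) {
            if (ignored.contains(entry.getKey())) {
                continue;
            }
            Object expectedValue = entry.getValue();
            Object actualValue = actual.get(entry.getKey());
            if (expectedValue instanceof Map && actualValue instanceof Map) {
                if (!matches((Map<String, Object>) expectedValue, (Map<String, Object>) actualValue, ignored)) {
                    return false;
                }
            } else if (!Objects.equals(expectedValue, actualValue)) {
                return false;
            }
        }
        return true;
    }
}

Excluding volatile keys this way keeps the fixture files stable: the expected JSON can be regenerated once, and the comparison still passes even though Spark assigns a fresh exprId to each attribute on every run.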

Aggregations

BigQueryRelation (com.google.cloud.spark.bigquery.BigQueryRelation): 2
SQLContext (org.apache.spark.sql.SQLContext): 2
LogicalRelation (org.apache.spark.sql.execution.datasources.LogicalRelation): 2
BigQueryRelationProvider (com.google.cloud.spark.bigquery.BigQueryRelationProvider): 1
SparkBigQueryConfig (com.google.cloud.spark.bigquery.SparkBigQueryConfig): 1
Path (java.nio.file.Path): 1
Configuration (org.apache.hadoop.conf.Configuration): 1
AttributeReference (org.apache.spark.sql.catalyst.expressions.AttributeReference): 1
InsertIntoDataSourceCommand (org.apache.spark.sql.execution.datasources.InsertIntoDataSourceCommand): 1
SaveIntoDataSourceCommand (org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand): 1
CreatableRelationProvider (org.apache.spark.sql.sources.CreatableRelationProvider): 1
Test (org.junit.jupiter.api.Test): 1