Example 11 with TableReference

use of com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference in project java-bigquerystorage by googleapis.

the class ITBigQueryStorageLongRunningTest method testLongRunningReadSession.

@Test
public void testLongRunningReadSession() throws InterruptedException, ExecutionException {
    // This test reads a larger table with the goal of doing a simple validation of timeout settings
    // for a longer running session.
    TableReference tableReference = TableReference.newBuilder()
            .setProjectId("bigquery-public-data")
            .setDatasetId("samples")
            .setTableId("wikipedia")
            .build();
    ReadSession session = client.createReadSession(
            /* tableReference = */ tableReference,
            /* parent = */ parentProjectId,
            /* requestedStreams = */ 5);
    assertEquals(
            String.format("Did not receive expected number of streams for table reference '%s' CreateReadSession response:%n%s", TextFormat.shortDebugString(tableReference), session.toString()),
            5,
            session.getStreamsCount());
    List<Callable<Long>> tasks = new ArrayList<>(session.getStreamsCount());
    for (final Stream stream : session.getStreamsList()) {
        tasks.add(new Callable<Long>() {

            @Override
            public Long call() throws Exception {
                return readAllRowsFromStream(stream);
            }
        });
    }
    ExecutorService executor = Executors.newFixedThreadPool(tasks.size());
    long rowCount = 0;
    try {
        List<Future<Long>> results = executor.invokeAll(tasks);
        for (Future<Long> result : results) {
            rowCount += result.get();
        }
    } finally {
        // Release the worker threads even if a read task fails.
        executor.shutdown();
    }
    assertEquals(313_797_035, rowCount);
}
Also used : ReadSession(com.google.cloud.bigquery.storage.v1beta1.Storage.ReadSession) ArrayList(java.util.ArrayList) Callable(java.util.concurrent.Callable) IOException(java.io.IOException) ExecutionException(java.util.concurrent.ExecutionException) TableReference(com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference) ExecutorService(java.util.concurrent.ExecutorService) Future(java.util.concurrent.Future) Stream(com.google.cloud.bigquery.storage.v1beta1.Storage.Stream) ServerStream(com.google.api.gax.rpc.ServerStream) Test(org.junit.Test)
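
The fan-out pattern in Example 11 (one Callable per stream, a fixed thread pool sized to the task count, then summing the Future results) can be tried without a BigQuery connection. The following is a minimal sketch under stated assumptions: ParallelRowCountSketch, countRows and sumInParallel are hypothetical names, and countRows is only a stand-in for readAllRowsFromStream that returns the number it is given.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelRowCountSketch {

    // Hypothetical stand-in for readAllRowsFromStream(stream):
    // pretends the stream held exactly rowsInStream rows.
    static long countRows(long rowsInStream) {
        return rowsInStream;
    }

    // Runs one task per "stream" in parallel and returns the summed row count.
    static long sumInParallel(List<Long> rowsPerStream)
            throws InterruptedException, ExecutionException {
        if (rowsPerStream.isEmpty()) {
            // Executors.newFixedThreadPool rejects a pool size of 0.
            return 0L;
        }
        List<Callable<Long>> tasks = new ArrayList<>(rowsPerStream.size());
        for (final long rows : rowsPerStream) {
            tasks.add(() -> countRows(rows));
        }
        ExecutorService executor = Executors.newFixedThreadPool(tasks.size());
        try {
            long total = 0;
            // invokeAll blocks until every task has completed.
            for (Future<Long> result : executor.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } finally {
            // Release the worker threads whether or not a task failed.
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumInParallel(Arrays.asList(10L, 20L, 30L)));
    }
}
```

Sizing the pool to the number of tasks mirrors the example above; for a large number of streams a bounded pool would likely be a safer choice.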

Example 12 with TableReference

use of com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference in project presto by prestodb.

the class ReadSessionCreator method create.

public Storage.ReadSession create(TableId table, ImmutableList<String> selectedFields, Optional<String> filter, int parallelism) {
    TableInfo tableDetails = bigQueryClient.getTable(table);
    TableInfo actualTable = getActualTable(tableDetails, selectedFields, new String[] {});
    try (BigQueryStorageClient bigQueryStorageClient = bigQueryStorageClientFactory.createBigQueryStorageClient()) {
        ReadOptions.TableReadOptions.Builder readOptions = ReadOptions.TableReadOptions.newBuilder().addAllSelectedFields(selectedFields);
        filter.ifPresent(readOptions::setRowRestriction);
        TableReferenceProto.TableReference tableReference = toTableReference(actualTable.getTableId());
        Storage.ReadSession readSession = bigQueryStorageClient.createReadSession(
                Storage.CreateReadSessionRequest.newBuilder()
                        .setParent("projects/" + bigQueryClient.getProjectId())
                        .setFormat(Storage.DataFormat.AVRO)
                        .setRequestedStreams(parallelism)
                        .setReadOptions(readOptions)
                        .setTableReference(tableReference)
                        .setShardingStrategy(Storage.ShardingStrategy.BALANCED)
                        .build());
        return readSession;
    }
}
Also used : Storage(com.google.cloud.bigquery.storage.v1beta1.Storage) BigQueryStorageClient(com.google.cloud.bigquery.storage.v1beta1.BigQueryStorageClient) TableReferenceProto(com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto) TableInfo(com.google.cloud.bigquery.TableInfo)

Example 13 with TableReference

use of com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference in project urban-eureka by errir503.

the class ReadSessionCreator method create.

public Storage.ReadSession create(TableId table, ImmutableList<String> selectedFields, Optional<String> filter, int parallelism) {
    TableInfo tableDetails = bigQueryClient.getTable(table);
    TableInfo actualTable = getActualTable(tableDetails, selectedFields, new String[] {});
    try (BigQueryStorageClient bigQueryStorageClient = bigQueryStorageClientFactory.createBigQueryStorageClient()) {
        ReadOptions.TableReadOptions.Builder readOptions = ReadOptions.TableReadOptions.newBuilder().addAllSelectedFields(selectedFields);
        filter.ifPresent(readOptions::setRowRestriction);
        TableReferenceProto.TableReference tableReference = toTableReference(actualTable.getTableId());
        Storage.ReadSession readSession = bigQueryStorageClient.createReadSession(
                Storage.CreateReadSessionRequest.newBuilder()
                        .setParent("projects/" + bigQueryClient.getProjectId())
                        .setFormat(Storage.DataFormat.AVRO)
                        .setRequestedStreams(parallelism)
                        .setReadOptions(readOptions)
                        .setTableReference(tableReference)
                        .setShardingStrategy(Storage.ShardingStrategy.BALANCED)
                        .build());
        return readSession;
    }
}
Also used : Storage(com.google.cloud.bigquery.storage.v1beta1.Storage) BigQueryStorageClient(com.google.cloud.bigquery.storage.v1beta1.BigQueryStorageClient) TableReferenceProto(com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto) TableInfo(com.google.cloud.bigquery.TableInfo)

Example 14 with TableReference

use of com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference in project java-bigquerystorage by googleapis.

the class BigQueryStorageClientTest method createReadSessionExceptionTest.

@Test
@SuppressWarnings("all")
public void createReadSessionExceptionTest() throws Exception {
    StatusRuntimeException exception = new StatusRuntimeException(Status.INVALID_ARGUMENT);
    mockBigQueryStorage.addException(exception);
    try {
        TableReference tableReference = TableReference.newBuilder().build();
        String parent = "parent-995424086";
        int requestedStreams = 1017221410;
        client.createReadSession(tableReference, parent, requestedStreams);
        Assert.fail("No exception raised");
    } catch (InvalidArgumentException e) {
        // Expected exception
    }
}
Also used : TableReference(com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference) InvalidArgumentException(com.google.api.gax.rpc.InvalidArgumentException) StatusRuntimeException(io.grpc.StatusRuntimeException) Test(org.junit.Test)
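
Example 14 uses the try / fail / catch idiom to check that a call raises a specific exception type. The same check can be factored into a small helper, sketched here in plain Java with no JUnit or gax dependency. ExpectedExceptionSketch and expectThrows are hypothetical names introduced for illustration, not part of the project shown above.

```java
import java.util.concurrent.Callable;

class ExpectedExceptionSketch {

    // Runs the action and returns the thrown exception if it has the expected type.
    // Throws AssertionError if nothing is thrown or if a different exception type is thrown.
    static <T extends Exception> T expectThrows(Class<T> expected, Callable<?> action) {
        try {
            action.call();
        } catch (Exception e) {
            if (expected.isInstance(e)) {
                return expected.cast(e);
            }
            throw new AssertionError("Unexpected exception type: " + e.getClass().getName(), e);
        }
        // Reached only when the action completed normally.
        throw new AssertionError("No exception raised");
    }

    public static void main(String[] args) {
        IllegalArgumentException e = expectThrows(
                IllegalArgumentException.class,
                () -> { throw new IllegalArgumentException("bad argument"); });
        System.out.println(e.getMessage());
    }
}
```

Returning the caught exception lets the caller make further checks on its message or cause, which the bare catch block in the example does not do.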

Example 15 with TableReference

use of com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference in project java-bigquerystorage by googleapis.

the class ITBigQueryStorageTest method testFilter.

@Test
public void testFilter() throws IOException {
    TableReference tableReference = TableReference.newBuilder()
            .setProjectId("bigquery-public-data")
            .setDatasetId("samples")
            .setTableId("shakespeare")
            .build();
    TableReadOptions options = TableReadOptions.newBuilder()
            .setRowRestriction("word_count > 100")
            .build();
    CreateReadSessionRequest request = CreateReadSessionRequest.newBuilder()
            .setParent(parentProjectId)
            .setRequestedStreams(1)
            .setTableReference(tableReference)
            .setReadOptions(options)
            .setFormat(DataFormat.AVRO)
            .build();
    ReadSession session = client.createReadSession(request);
    assertEquals(
            String.format("Did not receive expected number of streams for table reference '%s' CreateReadSession response:%n%s", TextFormat.shortDebugString(tableReference), session.toString()),
            1,
            session.getStreamsCount());
    StreamPosition readPosition = StreamPosition.newBuilder().setStream(session.getStreams(0)).build();
    ReadRowsRequest readRowsRequest = ReadRowsRequest.newBuilder().setReadPosition(readPosition).build();
    SimpleRowReader reader = new SimpleRowReader(new Schema.Parser().parse(session.getAvroSchema().getSchema()));
    long rowCount = 0;
    ServerStream<ReadRowsResponse> stream = client.readRowsCallable().call(readRowsRequest);
    for (ReadRowsResponse response : stream) {
        rowCount += response.getRowCount();
        reader.processRows(response.getAvroRows(), new SimpleRowReader.AvroRowConsumer() {

            @Override
            public void accept(GenericData.Record record) {
                Long wordCount = (Long) record.get("word_count");
                assertWithMessage("Row not matching expectations: %s", record.toString()).that(wordCount).isGreaterThan(100L);
            }
        });
    }
    assertEquals(1_333, rowCount);
}
Also used : AvroRowConsumer(com.google.cloud.bigquery.storage.v1beta1.it.SimpleRowReader.AvroRowConsumer) ReadSession(com.google.cloud.bigquery.storage.v1beta1.Storage.ReadSession) StreamPosition(com.google.cloud.bigquery.storage.v1beta1.Storage.StreamPosition) ReadRowsRequest(com.google.cloud.bigquery.storage.v1beta1.Storage.ReadRowsRequest) GenericData(org.apache.avro.generic.GenericData) TableReference(com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference) ReadRowsResponse(com.google.cloud.bigquery.storage.v1beta1.Storage.ReadRowsResponse) TableReadOptions(com.google.cloud.bigquery.storage.v1beta1.ReadOptions.TableReadOptions) CreateReadSessionRequest(com.google.cloud.bigquery.storage.v1beta1.Storage.CreateReadSessionRequest) Test(org.junit.Test)

Aggregations

TableReference (com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto.TableReference) 14
Test (org.junit.Test) 14
GenericData (org.apache.avro.generic.GenericData) 8
ReadSession (com.google.cloud.bigquery.storage.v1beta1.Storage.ReadSession) 7
ReadRowsRequest (com.google.cloud.bigquery.storage.v1beta1.Storage.ReadRowsRequest) 5
ReadRowsResponse (com.google.cloud.bigquery.storage.v1beta1.Storage.ReadRowsResponse) 5
StreamPosition (com.google.cloud.bigquery.storage.v1beta1.Storage.StreamPosition) 5
Schema (org.apache.avro.Schema) 5
Utf8 (org.apache.avro.util.Utf8) 5
CreateReadSessionRequest (com.google.cloud.bigquery.storage.v1beta1.Storage.CreateReadSessionRequest) 4
AvroRowConsumer (com.google.cloud.bigquery.storage.v1beta1.it.SimpleRowReader.AvroRowConsumer) 4
ArrayList (java.util.ArrayList) 3
Field (com.google.cloud.bigquery.Field) 2
TableId (com.google.cloud.bigquery.TableId) 2
TableInfo (com.google.cloud.bigquery.TableInfo) 2
BigQueryStorageClient (com.google.cloud.bigquery.storage.v1beta1.BigQueryStorageClient) 2
TableReadOptions (com.google.cloud.bigquery.storage.v1beta1.ReadOptions.TableReadOptions) 2
Storage (com.google.cloud.bigquery.storage.v1beta1.Storage) 2
TableReferenceProto (com.google.cloud.bigquery.storage.v1beta1.TableReferenceProto) 2
InvalidArgumentException (com.google.api.gax.rpc.InvalidArgumentException) 1