Example 6 with Job

use of com.google.api.services.bigquery.model.Job in project beam by apache.

the class FakeJobService method pollJob.

@Override
public Job pollJob(JobReference jobRef, int maxAttempts) throws InterruptedException {
    BackOff backoff = BackOffAdapter.toGcpBackOff(FluentBackoff.DEFAULT.withMaxRetries(maxAttempts).withInitialBackoff(Duration.millis(10)).withMaxBackoff(Duration.standardSeconds(1)).backoff());
    Sleeper sleeper = Sleeper.DEFAULT;
    try {
        do {
            Job job = getJob(jobRef);
            if (job != null) {
                JobStatus status = job.getStatus();
                if (status != null && status.getState() != null && (status.getState().equals("DONE") || status.getState().equals("FAILED"))) {
                    return job;
                }
            }
        } while (BackOffUtils.next(sleeper, backoff));
    } catch (IOException e) {
        return null;
    }
    return null;
}
Also used : JobStatus(com.google.api.services.bigquery.model.JobStatus) Sleeper(com.google.api.client.util.Sleeper) IOException(java.io.IOException) Job(com.google.api.services.bigquery.model.Job) BackOff(com.google.api.client.util.BackOff)
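
The polling loop in pollJob can be illustrated without the Google client libraries. The following is a minimal sketch, not the Beam implementation: the class JobPollSketch, the method pollUntilTerminal, and the LongConsumer sleeper are hypothetical names, and the doubling backoff is an assumption standing in for whatever schedule FluentBackoff actually produces. It makes one initial attempt plus up to maxRetries retries, and returns null if no terminal state is seen, as the fake does.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.LongConsumer;
import java.util.function.Supplier;

// Hypothetical sketch of "poll until a terminal state, with capped backoff".
class JobPollSketch {

    /**
     * Polls stateSource until it reports "DONE" or "FAILED".
     * Returns that state, or null once the retries are exhausted.
     */
    static String pollUntilTerminal(
            Supplier<String> stateSource,
            int maxRetries,
            long initialBackoffMillis,
            long maxBackoffMillis,
            LongConsumer sleeper) {
        long backoff = initialBackoffMillis;
        // One initial attempt plus up to maxRetries retries.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String state = stateSource.get();
            if ("DONE".equals(state) || "FAILED".equals(state)) {
                return state;
            }
            if (attempt < maxRetries) {
                // Wait before the next attempt; the sleeper is injected so tests need not sleep.
                sleeper.accept(backoff);
                // Assumption: plain doubling, capped at maxBackoffMillis.
                backoff = Math.min(backoff * 2, maxBackoffMillis);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Iterator<String> states = Arrays.asList("PENDING", "RUNNING", "DONE").iterator();
        // A no-op sleeper keeps the demo instantaneous.
        String result = pollUntilTerminal(states::next, 5, 10, 1000, millis -> { });
        System.out.println(result);
    }
}
```

Passing the sleeper as a parameter mirrors the way the tests in this listing swap in a fake Sleeper so that retries do not slow the test run.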

Example 7 with Job

use of com.google.api.services.bigquery.model.Job in project beam by apache.

the class BigQueryServicesImplTest method testStartLoadJobRetry.

/**
 * Tests that {@link BigQueryServicesImpl.JobServiceImpl#startLoadJob} succeeds with a retry.
 */
@Test
public void testStartLoadJobRetry() throws IOException, InterruptedException {
    Job testJob = new Job();
    JobReference jobRef = new JobReference();
    jobRef.setJobId("jobId");
    jobRef.setProjectId("projectId");
    testJob.setJobReference(jobRef);
    // First response is 403 rate limited, second response has valid payload.
    when(response.getContentType()).thenReturn(Json.MEDIA_TYPE);
    when(response.getStatusCode()).thenReturn(403).thenReturn(200);
    when(response.getContent()).thenReturn(toStream(errorWithReasonAndStatus("rateLimitExceeded", 403))).thenReturn(toStream(testJob));
    Sleeper sleeper = new FastNanoClockAndSleeper();
    JobServiceImpl.startJob(testJob, new ApiErrorExtractor(), bigquery, sleeper, BackOffAdapter.toGcpBackOff(FluentBackoff.DEFAULT.backoff()));
    verify(response, times(2)).getStatusCode();
    verify(response, times(2)).getContent();
    verify(response, times(2)).getContentType();
}
Also used : JobReference(com.google.api.services.bigquery.model.JobReference) FastNanoClockAndSleeper(org.apache.beam.sdk.util.FastNanoClockAndSleeper) MockSleeper(com.google.api.client.testing.util.MockSleeper) Sleeper(com.google.api.client.util.Sleeper) Job(com.google.api.services.bigquery.model.Job) ApiErrorExtractor(com.google.cloud.hadoop.util.ApiErrorExtractor) Test(org.junit.Test)
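
The behaviour that testStartLoadJobRetry checks (a first call rejected as rate limited, a second call succeeding, two calls in total) can be sketched in plain Java. This is an illustration under stated assumptions, not the code in BigQueryServicesImpl: RetrySketch, callWithRetry, and RateLimitedException are hypothetical names, and the sketch omits the backoff sleep between attempts to stay short.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch of "retry a call while the failure is classified as retryable".
class RetrySketch {

    /** Hypothetical stand-in for a 403 rateLimitExceeded error. */
    static class RateLimitedException extends RuntimeException {
    }

    /**
     * Invokes call, retrying up to maxRetries times while isRetryable accepts the failure.
     * Non-retryable failures, and the last retryable one, are rethrown.
     */
    static <T> T callWithRetry(
            Supplier<T> call, Predicate<RuntimeException> isRetryable, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (!isRetryable.test(e)) {
                    throw e; // Not a rate-limit style error: fail immediately.
                }
                last = e; // Retryable: remember it and try again.
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(
                () -> {
                    // First call fails as rate limited, second call succeeds.
                    if (calls[0]++ == 0) {
                        throw new RateLimitedException();
                    }
                    return "ok";
                },
                e -> e instanceof RateLimitedException,
                3);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

The predicate plays the role that an error classifier plays in the test above: it decides which failures are worth retrying, so other errors surface on the first attempt.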

Example 8 with Job

use of com.google.api.services.bigquery.model.Job in project beam by apache.

the class BigQueryServicesImplTest method testPollJobUnknown.

/**
 * Tests that {@link BigQueryServicesImpl.JobServiceImpl#pollJob} returns null when the job state
 * is unknown (the returned status has no state and the backoff stops immediately).
 */
@Test
public void testPollJobUnknown() throws IOException, InterruptedException {
    Job testJob = new Job();
    testJob.setStatus(new JobStatus());
    when(response.getContentType()).thenReturn(Json.MEDIA_TYPE);
    when(response.getStatusCode()).thenReturn(200);
    when(response.getContent()).thenReturn(toStream(testJob));
    BigQueryServicesImpl.JobServiceImpl jobService = new BigQueryServicesImpl.JobServiceImpl(bigquery);
    JobReference jobRef = new JobReference().setProjectId("projectId").setJobId("jobId");
    Job job = jobService.pollJob(jobRef, Sleeper.DEFAULT, BackOff.STOP_BACKOFF);
    assertEquals(null, job);
    verify(response, times(1)).getStatusCode();
    verify(response, times(1)).getContent();
    verify(response, times(1)).getContentType();
}
Also used : JobStatus(com.google.api.services.bigquery.model.JobStatus) JobReference(com.google.api.services.bigquery.model.JobReference) JobServiceImpl(org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.JobServiceImpl) Job(com.google.api.services.bigquery.model.Job) Test(org.junit.Test)

Example 9 with Job

use of com.google.api.services.bigquery.model.Job in project beam by apache.

the class BigQueryIOTest method testBigQueryQuerySourceInitSplit.

@Test
public void testBigQueryQuerySourceInitSplit() throws Exception {
    TableReference dryRunTable = new TableReference();
    Job queryJob = new Job();
    JobStatistics queryJobStats = new JobStatistics();
    JobStatistics2 queryStats = new JobStatistics2();
    queryStats.setReferencedTables(ImmutableList.of(dryRunTable));
    queryJobStats.setQuery(queryStats);
    queryJob.setStatus(new JobStatus()).setStatistics(queryJobStats);
    Job extractJob = new Job();
    JobStatistics extractJobStats = new JobStatistics();
    JobStatistics4 extractStats = new JobStatistics4();
    extractStats.setDestinationUriFileCounts(ImmutableList.of(1L));
    extractJobStats.setExtract(extractStats);
    extractJob.setStatus(new JobStatus()).setStatistics(extractJobStats);
    FakeJobService fakeJobService = new FakeJobService();
    FakeDatasetService fakeDatasetService = new FakeDatasetService();
    FakeBigQueryServices fakeBqServices = new FakeBigQueryServices().withJobService(fakeJobService).withDatasetService(fakeDatasetService);
    List<TableRow> expected = ImmutableList.of(new TableRow().set("name", "a").set("number", 1L), new TableRow().set("name", "b").set("number", 2L), new TableRow().set("name", "c").set("number", 3L), new TableRow().set("name", "d").set("number", 4L), new TableRow().set("name", "e").set("number", 5L), new TableRow().set("name", "f").set("number", 6L));
    PipelineOptions options = PipelineOptionsFactory.create();
    BigQueryOptions bqOptions = options.as(BigQueryOptions.class);
    bqOptions.setProject("project");
    String stepUuid = "testStepUuid";
    TableReference tempTableReference = createTempTableReference(bqOptions.getProject(), createJobIdToken(bqOptions.getJobName(), stepUuid));
    fakeDatasetService.createDataset(bqOptions.getProject(), tempTableReference.getDatasetId(), "", "");
    fakeDatasetService.createTable(new Table().setTableReference(tempTableReference).setSchema(new TableSchema().setFields(ImmutableList.of(new TableFieldSchema().setName("name").setType("STRING"), new TableFieldSchema().setName("number").setType("INTEGER")))));
    Path baseDir = Files.createTempDirectory(tempFolder, "testBigQueryQuerySourceInitSplit");
    String query = FakeBigQueryServices.encodeQuery(expected);
    BoundedSource<TableRow> bqSource = BigQueryQuerySource.create(stepUuid, StaticValueProvider.of(query), true /* flattenResults */, true /* useLegacySql */, fakeBqServices);
    options.setTempLocation(baseDir.toString());
    TableReference queryTable = new TableReference().setProjectId(bqOptions.getProject()).setDatasetId(tempTableReference.getDatasetId()).setTableId(tempTableReference.getTableId());
    fakeJobService.expectDryRunQuery(bqOptions.getProject(), query, new JobStatistics().setQuery(new JobStatistics2().setTotalBytesProcessed(100L).setReferencedTables(ImmutableList.of(queryTable))));
    List<TableRow> read = SourceTestUtils.readFromSource(bqSource, options);
    assertThat(read, containsInAnyOrder(Iterables.toArray(expected, TableRow.class)));
    SourceTestUtils.assertSplitAtFractionBehavior(bqSource, 2, 0.3, ExpectedSplitOutcome.MUST_BE_CONSISTENT_IF_SUCCEEDS, options);
    List<? extends BoundedSource<TableRow>> sources = bqSource.split(100, options);
    assertEquals(2, sources.size());
    BoundedSource<TableRow> actual = sources.get(0);
    assertThat(actual, CoreMatchers.instanceOf(TransformingSource.class));
}
Also used : Path(java.nio.file.Path) JobStatistics(com.google.api.services.bigquery.model.JobStatistics) JobStatistics2(com.google.api.services.bigquery.model.JobStatistics2) HashBasedTable(com.google.common.collect.HashBasedTable) Table(com.google.api.services.bigquery.model.Table) JobStatistics4(com.google.api.services.bigquery.model.JobStatistics4) TableSchema(com.google.api.services.bigquery.model.TableSchema) JsonSchemaToTableSchema(org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.JsonSchemaToTableSchema) BigQueryHelpers.toJsonString(org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.toJsonString) TableFieldSchema(com.google.api.services.bigquery.model.TableFieldSchema) JobStatus(com.google.api.services.bigquery.model.JobStatus) BigQueryHelpers.createTempTableReference(org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.createTempTableReference) TableReference(com.google.api.services.bigquery.model.TableReference) PipelineOptions(org.apache.beam.sdk.options.PipelineOptions) TableRow(com.google.api.services.bigquery.model.TableRow) Job(com.google.api.services.bigquery.model.Job) Test(org.junit.Test)

Example 10 with Job

use of com.google.api.services.bigquery.model.Job in project beam by apache.

the class BigQueryIOTest method testBigQueryNoTableQuerySourceInitSplit.

@Test
public void testBigQueryNoTableQuerySourceInitSplit() throws Exception {
    TableReference dryRunTable = new TableReference();
    Job queryJob = new Job();
    JobStatistics queryJobStats = new JobStatistics();
    JobStatistics2 queryStats = new JobStatistics2();
    queryStats.setReferencedTables(ImmutableList.of(dryRunTable));
    queryJobStats.setQuery(queryStats);
    queryJob.setStatus(new JobStatus()).setStatistics(queryJobStats);
    Job extractJob = new Job();
    JobStatistics extractJobStats = new JobStatistics();
    JobStatistics4 extractStats = new JobStatistics4();
    extractStats.setDestinationUriFileCounts(ImmutableList.of(1L));
    extractJobStats.setExtract(extractStats);
    extractJob.setStatus(new JobStatus()).setStatistics(extractJobStats);
    FakeDatasetService datasetService = new FakeDatasetService();
    FakeJobService jobService = new FakeJobService();
    FakeBigQueryServices fakeBqServices = new FakeBigQueryServices().withJobService(jobService).withDatasetService(datasetService);
    PipelineOptions options = PipelineOptionsFactory.create();
    BigQueryOptions bqOptions = options.as(BigQueryOptions.class);
    bqOptions.setProject("project");
    String stepUuid = "testStepUuid";
    TableReference tempTableReference = createTempTableReference(bqOptions.getProject(), createJobIdToken(bqOptions.getJobName(), stepUuid));
    List<TableRow> expected = ImmutableList.of(new TableRow().set("name", "a").set("number", 1L), new TableRow().set("name", "b").set("number", 2L), new TableRow().set("name", "c").set("number", 3L), new TableRow().set("name", "d").set("number", 4L), new TableRow().set("name", "e").set("number", 5L), new TableRow().set("name", "f").set("number", 6L));
    datasetService.createDataset(tempTableReference.getProjectId(), tempTableReference.getDatasetId(), "", "");
    Table table = new Table().setTableReference(tempTableReference).setSchema(new TableSchema().setFields(ImmutableList.of(new TableFieldSchema().setName("name").setType("STRING"), new TableFieldSchema().setName("number").setType("INTEGER"))));
    datasetService.createTable(table);
    String query = FakeBigQueryServices.encodeQuery(expected);
    jobService.expectDryRunQuery("project", query, new JobStatistics().setQuery(new JobStatistics2().setTotalBytesProcessed(100L).setReferencedTables(ImmutableList.of(table.getTableReference()))));
    Path baseDir = Files.createTempDirectory(tempFolder, "testBigQueryNoTableQuerySourceInitSplit");
    BoundedSource<TableRow> bqSource = BigQueryQuerySource.create(stepUuid, StaticValueProvider.of(query), true /* flattenResults */, true /* useLegacySql */, fakeBqServices);
    options.setTempLocation(baseDir.toString());
    List<TableRow> read = convertBigDecimaslToLong(SourceTestUtils.readFromSource(bqSource, options));
    assertThat(read, containsInAnyOrder(Iterables.toArray(expected, TableRow.class)));
    SourceTestUtils.assertSplitAtFractionBehavior(bqSource, 2, 0.3, ExpectedSplitOutcome.MUST_BE_CONSISTENT_IF_SUCCEEDS, options);
    List<? extends BoundedSource<TableRow>> sources = bqSource.split(100, options);
    assertEquals(2, sources.size());
    BoundedSource<TableRow> actual = sources.get(0);
    assertThat(actual, CoreMatchers.instanceOf(TransformingSource.class));
}
Also used : Path(java.nio.file.Path) JobStatistics(com.google.api.services.bigquery.model.JobStatistics) JobStatistics2(com.google.api.services.bigquery.model.JobStatistics2) HashBasedTable(com.google.common.collect.HashBasedTable) Table(com.google.api.services.bigquery.model.Table) JobStatistics4(com.google.api.services.bigquery.model.JobStatistics4) TableSchema(com.google.api.services.bigquery.model.TableSchema) JsonSchemaToTableSchema(org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.JsonSchemaToTableSchema) BigQueryHelpers.toJsonString(org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.toJsonString) TableFieldSchema(com.google.api.services.bigquery.model.TableFieldSchema) JobStatus(com.google.api.services.bigquery.model.JobStatus) BigQueryHelpers.createTempTableReference(org.apache.beam.sdk.io.gcp.bigquery.BigQueryHelpers.createTempTableReference) TableReference(com.google.api.services.bigquery.model.TableReference) PipelineOptions(org.apache.beam.sdk.options.PipelineOptions) TableRow(com.google.api.services.bigquery.model.TableRow) Job(com.google.api.services.bigquery.model.Job) Test(org.junit.Test)

Aggregations

Job (com.google.api.services.bigquery.model.Job) 29
JobStatus (com.google.api.services.bigquery.model.JobStatus) 16
JobReference (com.google.api.services.bigquery.model.JobReference) 15
Test (org.junit.Test) 14
IOException (java.io.IOException) 8
JobConfiguration (com.google.api.services.bigquery.model.JobConfiguration) 7
JobStatistics (com.google.api.services.bigquery.model.JobStatistics) 6
TableReference (com.google.api.services.bigquery.model.TableReference) 6
TableRow (com.google.api.services.bigquery.model.TableRow) 5
JobServiceImpl (org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.JobServiceImpl) 5
Sleeper (com.google.api.client.util.Sleeper) 4
JobConfigurationQuery (com.google.api.services.bigquery.model.JobConfigurationQuery) 4
JobStatistics2 (com.google.api.services.bigquery.model.JobStatistics2) 4
Table (com.google.api.services.bigquery.model.Table) 4
MockSleeper (com.google.api.client.testing.util.MockSleeper) 3
JobStatistics4 (com.google.api.services.bigquery.model.JobStatistics4) 3
TableFieldSchema (com.google.api.services.bigquery.model.TableFieldSchema) 3
TableSchema (com.google.api.services.bigquery.model.TableSchema) 3
ApiErrorExtractor (com.google.cloud.hadoop.util.ApiErrorExtractor) 3
HashBasedTable (com.google.common.collect.HashBasedTable) 3