BigQuery: QueryJob().done() method gets stuck #7831

Closed
@igor47

My environment:

 $ python --version
Python 3.7.1
 $ cat requirements.txt | grep bigquery
google-cloud-bigquery==1.8.0

here's a code sample:

    while self.backgrounded_jobs_remain:
      # find one ready job
      for q in self.query_jobs:
        self.logger.debug(f"checking to see if job {q['job']} is done...")
        if q['job'].done():
          self.logger.info(f"job {q['job']} is complete! returning it.")
          return (q['job'], q['memo'])

      # if we're still here, we need to wait some more
      self.logger.info(f'waiting for one of {len(self.query_jobs)} pending jobs to complete...')

in the usual case, the output looks something like this:

DEBUG datapipes.bqreader : checking to see if job <google.cloud.bigquery.job.QueryJob object at 0x7f186aac9898> is done...
DEBUG urllib3.connectionpool : https://www.googleapis.com:443 "GET /bigquery/v2/projects/myproject/queries/c14e18c5-3ef2-472e-9eab-34ea85fbc2ee?maxResults=0&timeoutMs=0&location=US HTTP/1.1" 200 None
INFO datapipes.bqreader : waiting for one of 10 pending jobs to complete...
DEBUG datapipes.bqreader : checking to see if job <google.cloud.bigquery.job.QueryJob object at 0x7f186b18feb8> is done...
DEBUG urllib3.connectionpool : https://www.googleapis.com:443 "GET /bigquery/v2/projects/myproject/queries/58d2e451-c372-41ac-acb6-395f48eff6e1?maxResults=0&timeoutMs=0&location=US HTTP/1.1" 200 None

however, sometimes it looks like this:

DEBUG datapipes.bqreader : checking to see if job <google.cloud.bigquery.job.QueryJob object at 0x7f186aac9898> is done...
DEBUG urllib3.connectionpool : https://www.googleapis.com:443 "GET /bigquery/v2/projects/myproject/queries/c14e18c5-3ef2-472e-9eab-34ea85fbc2ee?maxResults=0&timeoutMs=0&location=US HTTP/1.1" 200 None
INFO datapipes.bqreader : waiting for one of 10 pending jobs to complete...
DEBUG datapipes.bqreader : checking to see if job <google.cloud.bigquery.job.QueryJob object at 0x7f186b18feb8> is done...

and then the output proceeds no further.
as best i can tell, the .done() call is getting stuck somewhere before it ever issues the HTTP request (there is no urllib3 log line for the second job).
when this happens, my task just becomes stuck indefinitely.
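
One possible mitigation, sketched here under assumptions rather than taken from the library: run each `.done()` call in a worker thread and stop waiting for it after a deadline, so a hung call is reported instead of blocking the polling loop. The names `done_with_deadline`, `deadline_s` and `FakeJob` are hypothetical; `FakeJob` only stands in for a `QueryJob` so the sketch runs without BigQuery.

```python
import concurrent.futures
import time


def done_with_deadline(job, executor, deadline_s=30.0):
    """Return job.done(), or None if the call has not returned within deadline_s."""
    future = executor.submit(job.done)
    try:
        return future.result(timeout=deadline_s)
    except concurrent.futures.TimeoutError:
        # The worker thread is still blocked inside done(); it cannot be
        # killed from here, so the caller should log this and move on.
        return None


class FakeJob:
    """Hypothetical stand-in for a QueryJob; done() blocks for delay_s seconds."""

    def __init__(self, delay_s):
        self.delay_s = delay_s

    def done(self):
        time.sleep(self.delay_s)
        return True
```

A `None` result means "unknown, the call is still blocked", which is different from `False`. Each hung call keeps occupying a worker thread, so this only keeps the loop responsive; it does not explain or fix the underlying hang.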

i'm running this in containers, so i don't have access to good debug tools but might be willing to install them if necessary.
i'm going to try to read the code for .done() and see if i can spot what's happening, but wanted to open the issue to get additional eyes on it.
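
Since extra debug tools are hard to get into the container, the standard-library `faulthandler` module may be enough to show where the call is blocked. The sketch below is only an illustration: the helper name `dump_stacks_if_stuck` and its parameters are hypothetical, and the 60-second default is an arbitrary assumption.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def dump_stacks_if_stuck(timeout_s=60.0, stream=None):
    """Dump the tracebacks of all threads if the body runs longer than timeout_s."""
    if stream is None:
        stream = sys.stderr
    # Arms a watchdog thread; the stream must expose a real file descriptor.
    faulthandler.dump_traceback_later(timeout_s, file=stream)
    try:
        yield
    finally:
        # Disarm the watchdog once the body has returned (or raised).
        faulthandler.cancel_dump_traceback_later()
```

Wrapping the check as `with dump_stacks_if_stuck(60): q['job'].done()` would then write the stack of every thread to stderr if the call has not returned after a minute, without stopping the process.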

this happens pretty sporadically -- like, maybe one out of 50 jobs gets stuck -- which makes debugging more difficult.

Labels: api: bigquery, type: feature request
