Description
My environment:
$ python --version
Python 3.7.1
$ cat requirements.txt | grep bigquery
google-cloud-bigquery==1.8.0
here's a code sample:
while self.backgrounded_jobs_remain:
    # find one ready job
    for q in self.query_jobs:
        self.logger.debug(f"checking to see if job {q['job']} is done...")
        if q['job'].done():
            self.logger.info(f"job {q['job']} is complete! returning it.")
            return (q['job'], q['memo'])
        # if we're still here, we need to wait some more
        self.logger.info(f'waiting for one of {len(self.query_jobs)} pending jobs to complete...')
in the usual case, the output looks something like this:
DEBUG datapipes.bqreader : checking to see if job <google.cloud.bigquery.job.QueryJob object at 0x7f186aac9898> is done...
DEBUG urllib3.connectionpool : https://www.googleapis.com:443 "GET /bigquery/v2/projects/myproject/queries/c14e18c5-3ef2-472e-9eab-34ea85fbc2ee?maxResults=0&timeoutMs=0&location=US HTTP/1.1" 200 None
INFO datapipes.bqreader : waiting for one of 10 pending jobs to complete...
DEBUG datapipes.bqreader : checking to see if job <google.cloud.bigquery.job.QueryJob object at 0x7f186b18feb8> is done...
DEBUG urllib3.connectionpool : https://www.googleapis.com:443 "GET /bigquery/v2/projects/myproject/queries/58d2e451-c372-41ac-acb6-395f48eff6e1?maxResults=0&timeoutMs=0&location=US HTTP/1.1" 200 None
however, sometimes it looks like this:
DEBUG datapipes.bqreader : checking to see if job <google.cloud.bigquery.job.QueryJob object at 0x7f186aac9898> is done...
DEBUG urllib3.connectionpool : https://www.googleapis.com:443 "GET /bigquery/v2/projects/myproject/queries/c14e18c5-3ef2-472e-9eab-34ea85fbc2ee?maxResults=0&timeoutMs=0&location=US HTTP/1.1" 200 None
INFO datapipes.bqreader : waiting for one of 10 pending jobs to complete...
DEBUG datapipes.bqreader : checking to see if job <google.cloud.bigquery.job.QueryJob object at 0x7f186b18feb8> is done...
and then the output proceeds no further.
as best as i can tell, the .done() function is getting stuck somewhere, without ever even issuing the urllib3 query (since there is no urllib3 log line).
when this happens, my task just becomes stuck indefinitely.
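one possible stopgap, as a rough sketch only: run the check in a daemon thread and give up after a deadline, so the polling loop can log the stall and move on instead of hanging forever. the names call_with_timeout and CallTimedOut are made up for this example, and the timeout you pick is a guess. this does not fix whatever is blocking inside .done(); the stuck thread is simply abandoned, and it assumes the job object tolerates being polled from another thread, which i have not verified.

```python
import queue
import threading


class CallTimedOut(Exception):
    """Raised when the wrapped call does not return within the deadline."""


def call_with_timeout(fn, timeout_s):
    """Run fn() in a daemon thread; return its result or raise CallTimedOut.

    Hypothetical helper: the worker thread is not killed on timeout, it is
    only abandoned (daemon=True keeps it from blocking interpreter exit).
    """
    result_box = queue.Queue(maxsize=1)

    def runner():
        try:
            result_box.put((True, fn()))
        except BaseException as exc:  # hand any failure back to the caller
            result_box.put((False, exc))

    threading.Thread(target=runner, daemon=True).start()
    try:
        ok, value = result_box.get(timeout=timeout_s)
    except queue.Empty:
        raise CallTimedOut(f"call did not return within {timeout_s}s") from None
    if ok:
        return value
    raise value
```

in the loop above this would look like call_with_timeout(q['job'].done, 30) in place of q['job'].done(), with CallTimedOut caught and logged so the job gets checked again on the next pass.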
i'm running this in containers, so i don't have access to good debug tools but might be willing to install them if necessary.
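one option that needs nothing installed in the container is the standard library's faulthandler module, which can dump every thread's stack if a call takes too long. the sketch below is only an illustration: call_with_watchdog and the 30 second default are invented for this example. it arms a timer before the call and cancels it afterwards, so a traceback only appears when the call is slow, and it should show where the call is blocked.

```python
import faulthandler
import sys


def call_with_watchdog(fn, timeout_s=30.0, out=None):
    """Call fn(); if it has not returned after timeout_s seconds, dump the
    stacks of all threads to `out` (defaults to stderr) and keep waiting.

    Hypothetical helper: `out` must be a real file with a file descriptor.
    """
    target = sys.stderr if out is None else out
    # exit=False: only report, do not kill the process
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=target)
    try:
        return fn()
    finally:
        # disarm the timer once the call returns (or raises)
        faulthandler.cancel_dump_traceback_later()
```

wrapping the check as call_with_watchdog(q['job'].done) would leave the normal path unchanged and write a "Timeout" header plus per-thread tracebacks to stderr when the hang occurs.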
i'm going to try to read the code for .done() and see if i can spot what's happening, but wanted to open the issue to get additional eyes on it.
this happens pretty sporadically -- like, maybe one out of 50 jobs gets stuck -- which makes debugging more difficult.