Working with files
Connecting to data sources
Coding and analysis tools
Security and privacy
Deepnote for education
Google BigQuery is a fully managed, serverless data warehouse with built-in machine learning capabilities.
What can you do with the Google BigQuery integration?
As with all of Deepnote's SQL-related integrations, the Google BigQuery integration allows you to query databases and explore the structure of your warehouse. Think "SQL editor" but with all the modern productivity boosters that come with Deepnote's notebook. For example,
- Write native SQL and Python in the same notebook
- Search your entire warehouse instantly via the integrated schema explorer
- Get intelligent autocomplete for columns, tables, and databases
- Interactively explore data without writing any additional code
Available on Enterprise plan
How to connect to Google BigQuery
Authenticating with a service account
A service account will provide a shared connection to Google BigQuery. That is, all collaborators with least Editor privileges will be able run queries against databases provisioned in the service account.
To use the service account authentication, you will need to supply a JSON service account key. Click here for a guide on creating a JSON service account key. Your service account key will be encrypted and stored in Deepnote's database.
Make sure to enable BigQuery API for your GCP project. The service account needs sufficient permissions for the resources you want to use. If the authorisation process fails, we recommend visiting BigQuery's access control page.
Grant sufficient permissions to your service account. The minimum required permissions are: BigQuery Job User, BigQuery Read Session User, and BigQuery Data Viewer. The best practice is to only grant this last role for specific datasets or tables you wish to explore in Deepnote.
Authenticating to BigQuery with Google OAuth
Available on the Enterprise plan
With BigQuery's Google OAuth authentication you can give every member of your Deepnote workspace their own set of credentials. This ensures greater security by using short-lived tokens and enabling the use of multi-factor authentication. Follow the principle of least privilege and use granular access control for various BigQuery resources to ensure users can only access the data they need. Click here to learn how to set up BigQuery's Google OAuth authentication in Deepnote.
Working with data from Google BigQuery
Now that you are connected to your Google BigQuery can do the following actions in Deepnote:
Click the newly created integration in the Integrations section to open the schema browser. Click here to learn more about the schema browser.
Create an SQL block that points to your warehouse and begin querying your data. Autocomplete for columns, tables, and databases will be at your fingertips as you type. Click here to learn more about SQL blocks.
Explore column distributions as well as the sorting and filtering capabilities on the results set. All results are displayed as an interactive Pandas DataFrame. Click here to learn about interactive DataFrame output.
Pipe the results of your query into a chart block for rapid interactive data visualization. Click here to learn more about chart blocks.
Using pure Python to connect to BigQuery
To go beyond querying (like listing tables, creating datasets, etc.), you may need to use the official Python client library (docs).
Use this code snippet to authenticate the python client using the integration's service account:
import json import os from google.oauth2 import service_account from google.cloud import bigquery bq_credentials = service_account.Credentials.from_service_account_info( json.loads(os.environ['INTEGRATION_NAME_SERVICE_ACCOUNT'])) client = bigquery.Client(credentials=bq_credentials, project=bq_credentials.project_id)
Just replace the
INTEGRATION_NAME with an uppercased, underscore-connected name of your integration. If you have trouble finding it, run this one-liner to list environment variables that contain service accounts:
[var for var in os.environ if '_SERVICE_ACCOUNT' in var]
Once the BigQuery
client is initialized, you can use it to run queries and materialize the results as dataframes like this:
sql = """ SELECT * FROM `bigquery-public-data.usa_names.usa_1910_current` LIMIT 1000 """ df = client.query(sql).to_dataframe() ``