Sunday 2 p.m.–2:40 p.m.

Analysing Data with Python and BigQuery

Tom Clark

Audience level:
Intermediate

Description

BigQuery is a Google tool for quickly analysing large datasets. A web console and CLI tools are available, but we can also use BigQuery's remote API and Python libraries. In this talk we will introduce BigQuery and use Python to manage a project and analyse some data. Most of the examples found online are either too basic or very advanced. We hope to land somewhere between those extremes.

Abstract

My aim is to present enough information to show how to run a complete BigQuery project that is not completely trivial, but that doesn't require any machinery beyond Python and the BigQuery tools. My basic outline is:

  1. Intro to BigQuery
  2. Tools
    • web console
    • CLI tools
    • Python library
  3. Connecting to BigQuery (with Python)
  4. Creating a dataset (also with Python)
  5. Uploading data (with... you get the idea)
  6. Querying
  7. Exporting data and cleaning up after a project
  8. Some best practices
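
To give a flavour of items 3 to 7, here is a minimal sketch, assuming the `google-cloud-bigquery` client library is installed and application default credentials are configured. The project ID, the dataset name `demo_dataset`, the table name `people`, the sample rows, and the optional Cloud Storage bucket are all hypothetical placeholders, not material from the talk itself.

```python
def table_path(project, dataset, table):
    """Build a fully qualified `project.dataset.table` identifier."""
    return f"{project}.{dataset}.{table}"


def run_project(project, dataset="demo_dataset", table="people", bucket=None):
    """Walk through connect, create, upload, query, export, and clean up.

    All argument values are hypothetical; pass your own project ID and,
    if you want the export step, a Cloud Storage bucket you can write to.
    """
    # Imported here so this module still loads without the library installed.
    from google.cloud import bigquery

    # 3. Connect: the client picks up application default credentials.
    client = bigquery.Client(project=project)

    # 4. Create the dataset (no error if it already exists).
    dataset_id = f"{project}.{dataset}"
    client.create_dataset(dataset_id, exists_ok=True)

    # 5. Upload a few made-up rows, letting BigQuery infer the schema.
    rows = [{"name": "Ada", "age": 36}, {"name": "Grace", "age": 45}]
    target = table_path(project, dataset, table)
    job_config = bigquery.LoadJobConfig(autodetect=True)
    client.load_table_from_json(rows, target, job_config=job_config).result()

    # 6. Query the table and collect each row as a plain dict.
    sql = f"SELECT name, age FROM `{target}` WHERE age > 40"
    results = [dict(row.items()) for row in client.query(sql).result()]

    # 7a. Export to Cloud Storage, only if a bucket name was supplied.
    if bucket:
        client.extract_table(target, f"gs://{bucket}/{table}-*.csv").result()

    # 7b. Clean up: drop the dataset and every table inside it.
    client.delete_dataset(dataset_id, delete_contents=True, not_found_ok=True)
    return results
```

Calling `run_project("my-project-id")` with a real project ID runs the whole cycle. Note that the load, query, and export steps are each a job, and `.result()` blocks until that job finishes.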