BigQuery and Public Dataset

BigQuery is a serverless, highly scalable cloud data warehouse. It is extremely fast and can process terabytes of data in seconds and petabytes in minutes. BigQuery also comes with a number of useful public datasets that can be used to learn BigQuery and its associated tools.

Sandbox

The BigQuery sandbox is the quickest way to start using BigQuery without any payment. See the official Google Cloud blog post “Query without a credit card: introducing BigQuery sandbox” for more information. Here we focus on getting you started and querying as quickly as possible.

BigQuery UI

The BigQuery UI is the main web interface for interacting with BigQuery. If you have already signed up for a Google Cloud account, BigQuery should already be available. If not, you can use the BigQuery sandbox, a BigQuery-specific initiative that provides easy access to BigQuery without requiring a credit card.

COVID-19 Dataset Free Usage

As a special case, the COVID-19 dataset is free to query even outside the free tier (until September 2020). If you join the COVID-19 data against other datasets, the first 1 TB of querying per month against those other datasets is free and is included in the sandbox program.
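To get a feel for how far 1 TB of free querying per month goes, the arithmetic can be sketched in Python. This is a minimal sketch, not an official calculator. It assumes the quota is counted as 2**40 bytes per month, and the helper names (`free_tier_share`, `queries_per_month`, `estimate_bytes`) are made up for illustration. `estimate_bytes` additionally assumes the `google-cloud-bigquery` package is installed and credentials are configured; it uses a dry run, which reports how many bytes a query would scan without actually running it.

```python
# Assumption: the monthly free query quota is 1 TB counted as 2**40 bytes.
FREE_TIER_BYTES = 2**40


def free_tier_share(bytes_processed: int, free_bytes: int = FREE_TIER_BYTES) -> float:
    """Return the fraction of the monthly free quota that one scan would use."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / free_bytes


def queries_per_month(bytes_per_query: int, free_bytes: int = FREE_TIER_BYTES) -> int:
    """Return how many scans of the given size fit inside the free quota."""
    if bytes_per_query <= 0:
        raise ValueError("bytes_per_query must be positive")
    return free_bytes // bytes_per_query


def estimate_bytes(sql: str) -> int:
    """Ask BigQuery how many bytes a query would scan, without running it."""
    # Imported here so the arithmetic helpers work without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses the default project and credentials
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed
```

For example, a query that scans 100 MiB could be run a little over ten thousand times a month before the assumed quota is used up: `queries_per_month(100 * 2**20)`.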

Public Dataset

Google pays for the storage of these datasets and provides public access to the data through a Google Cloud project. You pay only for the queries you run on the data, and only for usage that exceeds the free quota. The free tier covers 1 TB of querying per month, which makes getting started easy. The following are some of the popular public datasets to explore.

Exploring the Datasets



Preview and Query data


First Query
Query to get cases for all regions of the world
SELECT
  country_region,
  SUM(confirmed) AS confirmed,
  SUM(deaths) AS deaths,
  SUM(recovered) AS recovered,
  SUM(active) AS active
FROM `bigquery-public-data.covid19_jhu_csse.summary`
GROUP BY country_region
ORDER BY confirmed DESC
LIMIT 1000
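The query above sums every row in the table. If the table holds one cumulative snapshot per day, which is an assumption about its layout and worth checking in the table preview, each day's running total is counted again and the sums overstate the real figures. The sketch below builds a variant restricted to the most recent date. It assumes the table has a `date` column, and the helper names `latest_snapshot_query` and `run_query` are illustrative. `run_query` also assumes the `google-cloud-bigquery` package is installed and credentials are configured.

```python
TABLE = "bigquery-public-data.covid19_jhu_csse.summary"


def latest_snapshot_query(table: str = TABLE, limit: int = 1000) -> str:
    """Build a per-country query restricted to the latest date in the table."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    # Assumption: `date` exists and each date holds one cumulative snapshot.
    return f"""
SELECT
  country_region,
  SUM(confirmed) AS confirmed,
  SUM(deaths) AS deaths
FROM `{table}`
WHERE date = (SELECT MAX(date) FROM `{table}`)
GROUP BY country_region
ORDER BY confirmed DESC
LIMIT {limit}
""".strip()


def run_query(sql: str) -> list:
    """Run a query with the BigQuery client and return rows as dictionaries."""
    # Imported here so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses the default project and credentials
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    # Print the SQL so it can also be pasted into the BigQuery UI.
    print(latest_snapshot_query(limit=10))
```

The printed SQL can be pasted straight into the BigQuery UI, so the client library is only needed for `run_query`.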
Query Result
