DuckDB

Overview

DuckDB UI is a browser-based interface for running DuckDB locally. Once your Decode GA4 data is exported to GCS as Parquet files (using the events_external template), DuckDB can query it directly — no data loading or ETL pipelines required.

Requirements

  • DuckDB v1.2 or later
  • Decode GA4 data exported to a GCS bucket using the events_external template
  • A GCS HMAC key — created under Cloud Storage → Settings → Interoperability → Create a key for a service account

Installation

Install DuckDB using your preferred method:

macOS (Homebrew)

brew install duckdb

Python

pip install duckdb

Windows (winget)

winget install DuckDB.cli

For other platforms, download the latest binary from duckdb.org/docs/installation.

Connection Setup

1. Create a GCS HMAC key

In the Google Cloud Console, navigate to Cloud Storage → Settings → Interoperability and create an HMAC key for the service account that has read access to your Decode GA4 bucket.

Save the Access ID and Secret — the secret is only shown once.

2. Launch the DuckDB UI

From your terminal, start DuckDB with the UI flag:

duckdb -ui

This opens the DuckDB UI in your browser at http://localhost:4213.

3. Register GCS credentials

In the DuckDB UI SQL editor, run:

CREATE PERSISTENT SECRET (
    TYPE GCS,
    KEY_ID 'YOUR_HMAC_ACCESS_ID',
    SECRET 'YOUR_HMAC_SECRET'
);

Replace YOUR_HMAC_ACCESS_ID and YOUR_HMAC_SECRET with the values from step 1.

Using PERSISTENT stores the secret in ~/.duckdb/stored_secrets/ so you only need to do this once. To update the credentials later, use CREATE OR REPLACE PERSISTENT SECRET.

4. Verify access

Confirm that DuckDB can reach your bucket:

SELECT count(*) FROM 'gcs://YOUR_BUCKET_NAME/YOUR_DATASET/events/*.parquet';

Replace YOUR_BUCKET_NAME and YOUR_DATASET with your actual values.

Example

Query events directly

SELECT
    partition_date AS event_date,
    event_param.page_location,
    COUNT(*) AS pageviews
FROM 'gcs://YOUR_BUCKET_NAME/YOUR_DATASET/events/**/*.parquet'
WHERE event_name = 'page_view'
GROUP BY partition_date, page_location
ORDER BY partition_date, pageviews DESC;

Create a persistent view

CREATE OR REPLACE VIEW decode_events AS
SELECT *
FROM 'gcs://YOUR_BUCKET_NAME/YOUR_DATASET/events/**/*.parquet';

You can then query decode_events like a regular table across all DuckDB sessions that use the same database file.

Further Reading