MotherDuck

Overview

MotherDuck is a cloud-based DuckDB service that lets you run fast analytical queries directly on your data. Once your Decode GA4 data is exported to GCS as Parquet files (using the events_external template), MotherDuck can query it directly — no data loading or ETL pipelines required.

Requirements

  • A MotherDuck account
  • Decode GA4 data exported to a GCS bucket using the events_external template
  • A GCS HMAC key — created under Cloud Storage → Settings → Interoperability → Create a key for a service account

Connection Setup

1. Create a GCS HMAC key

In the Google Cloud Console, navigate to Cloud Storage → Settings → Interoperability and create an HMAC key for the service account that has read access to your Decode GA4 bucket.

Save the Access ID and Secret — the secret is only shown once.

2. Register GCS credentials in MotherDuck

In MotherDuck, open a SQL editor and run:

CREATE SECRET IN MOTHERDUCK (
    TYPE GCS,
    KEY_ID 'YOUR_HMAC_ACCESS_ID',
    SECRET 'YOUR_HMAC_SECRET'
);

Replace YOUR_HMAC_ACCESS_ID and YOUR_HMAC_SECRET with the values from step 1.

This creates a persistent secret stored in your MotherDuck account — you only need to do this once. To update the credentials later, use CREATE OR REPLACE SECRET IN MOTHERDUCK.

3. Verify access

Confirm that MotherDuck can reach your bucket:

SELECT count(*) FROM 'gcs://YOUR_BUCKET_NAME/YOUR_DATASET/events/*.parquet';

Replace YOUR_BUCKET_NAME and YOUR_DATASET with your actual values.

Example

Query events directly

SELECT
    partition_date AS event_date,
    event_param.page_location,
    COUNT(*) AS pageviews
FROM 'gcs://YOUR_BUCKET_NAME/YOUR_DATASET/events/**/*.parquet'
WHERE event_name = 'page_view'
GROUP BY partition_date, page_location
ORDER BY partition_date, pageviews DESC;

Create a persistent view

CREATE OR REPLACE VIEW decode_pageviews AS
SELECT
    partition_date AS event_date,
    event_param.page_location,
    COUNT(*) AS pageviews
FROM 'gcs://YOUR_BUCKET_NAME/YOUR_DATASET/events/**/*.parquet'
WHERE event_name = 'page_view'
GROUP BY partition_date, page_location;

You can then query decode_pageviews like a regular table across all MotherDuck sessions.

Further Reading