Quickstart

Activate

Decode GA4 must be subscribed to on a per-project basis via the Google Cloud Marketplace.

Deploy

Deployment is performed by running one of the deployment scripts below in BigQuery Studio, which deploys the custom installer function into your GA4 dataset. Note that the transform_config_template value determines the destination type (external or date-partitioned).

External Output Table

This is the recommended architecture, as it results in lower storage costs and automated schema evolution for new source data columns/sub-columns and for newly observed event parameter, user property and item parameter values. It also unlocks cross-cloud and external compute access to transformed event data.

However, it does require you to create a Google Cloud Storage bucket to store the data, and because each date is processed into a separate date-folder file set, the initial export will be slower than with the partitioned table transform configuration.

DECLARE options JSON;

SET options = JSON '''
    {
        "ga4_dataset_id": "project_id.ga4_dataset_name",
        "transform_config_template": "events_external",
        "gcs_bucket_name": "bucketname"
    }
    ''';

EXECUTE IMMEDIATE (
    SELECT `decode-ga4.eu.deploy_installer`(options)
    );

Partitioned Output Table

This is the quickest approach as it does not require a GCS bucket and can process all of the data in one query. However, this architecture does not benefit from the lower storage costs or automated schema evolution of the external table architecture. Use this approach for rapid inspection of the output structure; for long-term usage, the external table architecture is recommended for its cost and capability advantages.

DECLARE options JSON;

SET options = JSON '''
    {
        "ga4_dataset_id": "project_id.ga4_dataset_name",
        "transform_config_template": "events_partitioned"
    }
    ''';

EXECUTE IMMEDIATE (
    SELECT `decode-ga4.eu.deploy_installer`(options)
    );

Installation

The above scripts deploy the install_decode_ga4 routine into the GA4 dataset identified by ga4_dataset_id. Execute the installation by running:

CALL `project_id.ga4_dataset_name.install_decode_ga4`();

This will deploy Decode GA4 resources into a new dataset called decode_analytics_xxxxxxxx, where analytics_xxxxxxxx is the name of your GA4 dataset.
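
To confirm that the installer completed, you can list the routines and tables it created. The following is a minimal sketch using standard BigQuery INFORMATION_SCHEMA views; the dataset name decode_analytics_xxxxxxxx is a placeholder for your own deployment.

-- List the routines deployed into the new Decode GA4 dataset (placeholder name).
SELECT routine_name, routine_type
FROM `project_id.decode_analytics_xxxxxxxx.INFORMATION_SCHEMA.ROUTINES`;

-- List the tables created in the same dataset.
SELECT table_name, table_type
FROM `project_id.decode_analytics_xxxxxxxx.INFORMATION_SCHEMA.TABLES`;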

Run

To run the transform with the default options and create your output table, simply call the routine:

CALL `project_id.decode_analytics_xxxxxxxx.RUN`();

This will execute the transformation using the transform configuration set in transform_config.

Subsequent executions will be incremental by default, so no data will be processed unless new, unprocessed data is detected.
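
To see which dates have already been processed, one option (a sketch that assumes the date-partitioned output configuration) is to inspect the partition metadata of the decode dataset via the standard INFORMATION_SCHEMA.PARTITIONS view:

-- Inspect output table partitions to see which dates have been processed
-- and when they were last modified (date-partitioned configuration only;
-- dataset name is a placeholder for your own deployment).
SELECT table_name, partition_id, total_rows, last_modified_time
FROM `project_id.decode_analytics_xxxxxxxx.INFORMATION_SCHEMA.PARTITIONS`
ORDER BY last_modified_time DESC;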

Schedule

The RUN routine can be invoked manually, on a schedule or on an event-driven basis. Note that if it is invoked and no new data is detected, only negligible metadata and date-scan query costs are incurred.

As GA4 data arrives in BigQuery at unpredictable times, a common approach is to schedule the RUN routine as a BigQuery Scheduled Query multiple times a day (e.g. every 3 hours).