Execution

Overview

Transforms are executed via the RUN function, which is unique to each property installation and is deployed to your deployment dataset during installation. This function contains all execution logic and should not be edited.

The RUN function takes a single JSON argument, run_options, which determines the type of execution. Some run modes require additional arguments to configure the execution behaviour:

Run Mode      Behaviour
auto          Process any new partitions identified
full          Process all partitions
incremental   Process any new partitions identified after the latest destination partition date
range         Process any partitions in the specified date range
first         Process the first n date partitions
last          Process the last n date partitions

Invoking the RUN function from the deployment dataset generates the following code, which executes with the default run mode (incremental):

DECLARE run_options JSON DEFAULT NULL;
CALL `[project_id].decode_[ga4_dataset_name].RUN`(run_options);

This is equivalent to the more compact syntax:

CALL `[project_id].decode_[ga4_dataset_name].RUN`(NULL);
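When the call is issued from a script or scheduler rather than typed by hand, the statement can be assembled programmatically. The following Python sketch is illustrative only: build_run_call is a hypothetical helper, not part of the installation, and the routine path passed to it is a placeholder. It serialises a dict of options into the JSON literal form used on this page, and passes NULL when no options are given.

```python
import json


def build_run_call(routine, run_options=None):
    """Return a CALL statement for the RUN routine (hypothetical helper).

    routine     -- fully qualified routine path, e.g. "my_project.my_dataset.RUN"
    run_options -- dict of options, or None to use the default run mode
    """
    if run_options is None:
        # No options: pass NULL so the default run mode applies.
        argument = "NULL"
    else:
        payload = json.dumps(run_options)
        # The JSON literal is written as a single-quoted SQL string,
        # so escape backslashes and single quotes inside the payload.
        payload = payload.replace("\\", "\\\\").replace("'", "\\'")
        argument = "JSON'" + payload + "'"
    return "CALL `" + routine + "`(" + argument + ");"


if __name__ == "__main__":
    print(build_run_call("my_project.my_dataset.RUN"))
    print(build_run_call("my_project.my_dataset.RUN", {"run_mode": "full"}))
```

Building the argument from a dict keeps the option names in one place and avoids hand-editing quoted JSON.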

Run Mode

The optional run_options JSON argument lets you override the run_mode at execution time.

The run_mode determines the execution logic, and can be one of the following: auto, full, incremental (default), range, first or last. Each run_mode has a small set of additional configuration options which, when set, override the configuration for every transformation in the run.

Setting the auto_partition_detection boolean to true means that any detected modifications to upstream partitions will trigger an automatic refresh of dependent downstream partitions. The recommended configuration is to set "auto_partition_detection": true with "run_mode": "auto" for a fully automated, efficient pipeline.

Auto

Option                     Value                     Description
run_mode                   auto                      Any new partitions
auto_partition_detection   true (default) or false   Any changed partitions

Executing the RUN function in auto run mode with auto_partition_detection set to true transforms any partitions detected in the source tables which are not detected anywhere in the destination tables.

It will also check for any partitions which have been modified since the last run on that partition, and reprocess them too.

This is useful because, although Google states that it may update data up to 72 hours after delivery, in practice this is rare, so these recent partitions do not normally need reprocessing. We have also observed many occasions where data changed significantly after the stated 72-hour window.

CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "auto", "auto_partition_detection": "true"}'
  )
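The selection rule described for auto mode can be modelled in a few lines of Python. This is an illustrative sketch under stated assumptions, not the RUN function's actual implementation: the function name select_auto and its inputs (dicts mapping a partition date to a last-modified or last-processed timestamp) are hypothetical.

```python
from datetime import date, datetime


def select_auto(source_modified, destination_processed,
                auto_partition_detection=True):
    """Model of auto-mode partition selection (hypothetical).

    source_modified       -- {partition date: when the source partition last changed}
    destination_processed -- {partition date: when that partition was last processed}
    """
    selected = set()
    for partition, modified_at in source_modified.items():
        processed_at = destination_processed.get(partition)
        if processed_at is None:
            # Present in the source but nowhere in the destination: new.
            selected.add(partition)
        elif auto_partition_detection and modified_at > processed_at:
            # Changed upstream since the last run on this partition.
            selected.add(partition)
    return sorted(selected)


if __name__ == "__main__":
    source = {
        date(2025, 11, 1): datetime(2025, 11, 2, 6),
        date(2025, 11, 2): datetime(2025, 11, 5, 6),  # modified late
        date(2025, 11, 3): datetime(2025, 11, 4, 6),  # not yet processed
    }
    destination = {
        date(2025, 11, 1): datetime(2025, 11, 2, 8),
        date(2025, 11, 2): datetime(2025, 11, 3, 8),
    }
    print(select_auto(source, destination))
```

In this model, only partitions that are new or have changed since their last run are selected, so unchanged recent partitions are not reprocessed on a fixed schedule.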

To disable auto_partition_detection, simply omit the option or set the option to false.

CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "auto"}'
  )

Explicitly setting the option value to false might be preferable for future clarity.

CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "auto", "auto_partition_detection": "false"}'
  )

Full

Option     Value   Description
run_mode   full    All partitions

Full refresh of all partitions:

CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "full"}')

Incremental (default)

Option                     Value                     Description
run_mode                   incremental               Any new partitions after the latest destination partition date
auto_partition_detection   true (default) or false   Any changed partitions

Transforms any partitions detected in the source tables that are later than the latest partition in the destination tables:

CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "incremental", "auto_partition_detection": "true"}')

Note that in incremental run_mode, setting auto_partition_detection to true will also detect any unexpectedly modified partitions and reprocess them, regardless of the latest source partition date.
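The difference between incremental and auto selection can be made concrete with a small Python model. This is a sketch under assumptions, not the RUN function's actual logic: select_incremental and its dict inputs are hypothetical, and the treatment of a partition that is missing from the destination but dated before the latest destination partition (it is skipped here) is an assumption drawn from the description above.

```python
from datetime import date, datetime


def select_incremental(source_modified, destination_processed,
                       auto_partition_detection=True):
    """Model of incremental-mode partition selection (hypothetical).

    source_modified       -- {partition date: when the source partition last changed}
    destination_processed -- {partition date: when that partition was last processed}
    """
    latest = max(destination_processed) if destination_processed else None
    selected = set()
    for partition, modified_at in source_modified.items():
        if latest is None or partition > latest:
            # Later than the latest destination partition: new.
            selected.add(partition)
        elif auto_partition_detection:
            processed_at = destination_processed.get(partition)
            if processed_at is not None and modified_at > processed_at:
                # Older partition that changed since it was last processed.
                selected.add(partition)
    return sorted(selected)


if __name__ == "__main__":
    source = {
        date(2025, 11, 1): datetime(2025, 11, 6),  # modified after processing
        date(2025, 11, 2): datetime(2025, 11, 3),  # gap: never processed
        date(2025, 11, 3): datetime(2025, 11, 4),
        date(2025, 11, 4): datetime(2025, 11, 5),  # new
    }
    destination = {
        date(2025, 11, 1): datetime(2025, 11, 2, 12),
        date(2025, 11, 3): datetime(2025, 11, 4, 12),
    }
    print(select_incremental(source, destination))
```

Under this model, a gap earlier than the latest destination partition is not picked up by an incremental run, which is one reason to prefer auto or range mode when backfilling.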

Range

Option       Value        Description
run_mode     range        Any partitions in specified date range
start_date   YYYY-MM-DD   Date range start
end_date     YYYY-MM-DD   Date range end

Transforms all partitions between the start_date and end_date:

CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "range", "start_date": "2025-11-01", "end_date": "2025-11-30"}')
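Because the dates are passed as plain strings, it can help to validate them before the call is made. The Python sketch below is one possible approach, not part of the installation: range_run_options is a hypothetical helper that checks both dates parse as YYYY-MM-DD and that the start does not fall after the end, then returns the JSON text for run_options.

```python
import json
from datetime import date


def range_run_options(start_date, end_date):
    """Return run_options JSON text for range mode (hypothetical helper)."""
    # date.fromisoformat raises ValueError for malformed or impossible dates.
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError("start_date must not be later than end_date")
    return json.dumps({
        "run_mode": "range",
        "start_date": start.isoformat(),
        "end_date": end.isoformat(),
    })


if __name__ == "__main__":
    print(range_run_options("2025-11-01", "2025-11-30"))
```

The returned text can be placed inside the JSON'...' literal of the CALL statement.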

First

Option        Value          Description
run_mode      first          First n date partitions
window_days   90 (default)   Number of date partitions

Transforms the first observed n partitions, set by the window_days option:

CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "first", "window_days": 30}')

Last

Option        Value          Description
run_mode      last           Last n date partitions
window_days   90 (default)   Number of date partitions

Transforms the last observed n partitions, set by the window_days option:

CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "last", "window_days": 30}')
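The first and last modes mirror each other, which a short Python model makes explicit. This is a sketch, not the actual implementation: select_window is a hypothetical function, and it assumes that window_days counts observed date partitions rather than calendar days.

```python
from datetime import date


def select_window(partitions, run_mode, window_days=90):
    """Model of first/last partition selection (hypothetical).

    partitions  -- iterable of observed partition dates, in any order
    run_mode    -- "first" or "last"
    window_days -- number of date partitions to select (n)
    """
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    ordered = sorted(partitions)
    if run_mode == "first":
        return ordered[:window_days]   # earliest n partitions
    if run_mode == "last":
        return ordered[-window_days:]  # most recent n partitions
    raise ValueError("run_mode must be 'first' or 'last'")


if __name__ == "__main__":
    observed = [date(2025, 11, d) for d in (3, 1, 2, 5, 4)]
    print(select_window(observed, "first", window_days=2))
    print(select_window(observed, "last", window_days=2))
```

If fewer than n partitions exist, this model simply returns all of them.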