Test
Objectives
Installing Decode GA4 does not incur any consumption costs, and gives you immediate access
to a number of useful metadata resources. However, it is not until you have called the RUN function that your data is transformed and your output tables are created.
You can use the pricing calculator to estimate the volume and cost for monthly automation and backfill of historic data.
However, if you want to sample a subset of data before committing to a full backfill, the following approach is recommended.
Approach
This approach installs Decode GA4 and runs it on a recent subset of data, with the option to run a full or targeted backfill after verification.
Test Installation
- Subscribe to Decode GA4 via Google Cloud Marketplace.
- Log into the Installer App by clicking Manage on provider or by going directly to ga4.decodedata.io.
- Upon login you will be taken to the GA4 Properties page. Select the Google Cloud Project and GA4 dataset from the dropdowns.
- Select the Installation Type. If the Quick Install type is External Storage Destination, then you need to input the GCS Bucket Name too. You will now see the installation options.
- Ensure that the Run on installation option is checked.
- Set the number of days to backfill: a minimum of 7 days is required, with 30–90 days recommended for a meaningful sample.
- Click Install Decode GA4.
- Navigate to BigQuery Studio, where you will find both the linked dataset and the newly created dataset and resources.
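Beyond browsing in BigQuery Studio, you can confirm what the installation created from SQL. A minimal sketch, assuming your deployment dataset is named `deployment_dataset_id` (substitute your own dataset name):

```sql
-- List the tables and views created by the installation
-- in the deployment dataset (name is illustrative).
SELECT table_name, table_type
FROM `deployment_dataset_id.INFORMATION_SCHEMA.TABLES`
ORDER BY table_name;
```

This is a read-only check and does not incur Decode GA4 consumption costs.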
Verification
Take some time to review the output structure and verify it meets your expectations. If needed, revise your configuration — for example, to exclude any parameters that are not relevant to your analysis.
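One way to sanity-check the initial run is to compare the output against the daily event volumes in the GA4 export itself. A sketch of such a check, assuming a standard GA4 export dataset (the project and dataset names are placeholders) and a 7-day test window:

```sql
-- Count events per day in the GA4 export for the test window,
-- to confirm each backfilled date partition had source data.
-- Replace your_project and analytics_123456789 with your own values.
SELECT
  _TABLE_SUFFIX AS event_date,
  COUNT(*) AS events
FROM `your_project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN
  FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY))
  AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
GROUP BY event_date
ORDER BY event_date;
```

Dates with events here but no corresponding rows in your output tables would suggest the backfill window or configuration needs revisiting.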
Backfill
Upon verification, you may want to run a full or targeted backfill. The commands and configurations are presented below, and the full documentation is available in the run section.
Full
To process all unprocessed data in one go, set the run_mode to auto as a one-time operation.
CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "auto"}'
)

Executing this procedure will fill in any unprocessed date partitions from the source to the destination.
Range
To process a specific date range, set the range with the desired start_date and end_date values.
CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "range", "start_date": "YYYY-MM-DD", "end_date": "YYYY-MM-DD"}'
)

Executing this procedure will process or reprocess any date partitions in the date range from the source to the destination.
Automation
Once you are satisfied with the output, schedule the function as per the options in the automate section.
For automation, the run_mode is typically set to auto or incremental, which processes only unprocessed date partitions.
Depending on your downstream use case, you may need to set the auto_partition_detection, auto_schema_evolution and auto_parameter_evolution booleans to configure behaviour in response to changes in the input contents or schema.
Full documentation is provided in the run section.
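Putting the automation options together, a sketch of a scheduled call, assuming an incremental run with all three evolution booleans enabled (whether each should be true depends on your downstream use case; check the run documentation for defaults):

```sql
-- Example scheduled call: process only unprocessed partitions,
-- and adapt automatically to input partition, schema and
-- parameter changes (boolean values are illustrative).
CALL `deployment_dataset_id.RUN`(
  JSON'{"run_mode": "incremental", "auto_partition_detection": true, "auto_schema_evolution": true, "auto_parameter_evolution": true}'
)
```

This statement can be run on a schedule, for example via a BigQuery scheduled query, per the options in the automate section.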