Events

The events table is a non-destructive, unfiltered, zero-assumption transformation of raw events data. This means that the row count by date will exactly match the row count for each corresponding date shard of the source events data.
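
As a rough illustration of the row-count parity described above, the sketch below compares per-date row counts. The function name and the input dictionaries are hypothetical; in practice the counts would come from queries against the source date shards and the output events table.

```python
from datetime import date


def find_row_count_mismatches(
    source_counts: dict[date, int], output_counts: dict[date, int]
) -> dict[date, tuple[int, int]]:
    """Return {date: (source_rows, output_rows)} for every date whose counts differ."""
    all_dates = set(source_counts) | set(output_counts)
    return {
        d: (source_counts.get(d, 0), output_counts.get(d, 0))
        for d in sorted(all_dates)
        if source_counts.get(d, 0) != output_counts.get(d, 0)
    }


# Hypothetical counts, for illustration only.
source = {date(2024, 1, 1): 1200, date(2024, 1, 2): 980}
output = {date(2024, 1, 1): 1200, date(2024, 1, 2): 980}
mismatches = find_row_count_mismatches(source, output)
print(mismatches)  # an empty dict means every date matches
```

A date missing from either side is treated as zero rows, so a dropped or extra date is reported as a mismatch rather than silently ignored.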

Partitioning

The output events table is date-partitioned, which permits optimized date queries from downstream systems, is easy for humans or agents to work with, and fits well within BigQuery's limits on table partitions.

If you have installed using the events_external template, the output table is hive-partitioned on the partition_date DATE column. This means that partition_date should be used in the WHERE clause of subsequent filtering statements.

The events_partitioned template will result in a date-partitioned Native Table in BigQuery, partitioned on the event_date DATE column, which should be used in subsequent WHERE clause filters.
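
The choice of filter column can be sketched as a small helper that a downstream script or agent might use to build a date-filtered query. This is a minimal sketch, not part of Decode GA4: the function name and the table name are hypothetical, and only the template names and partition columns come from the description above.

```python
from datetime import date

# Partition column for each installation template, as described above.
PARTITION_COLUMN = {
    "events_external": "partition_date",
    "events_partitioned": "event_date",
}


def build_date_filtered_query(table: str, template: str, start: date, end: date) -> str:
    """Return a query that filters on the partition column for the given template."""
    column = PARTITION_COLUMN[template]
    return (
        f"SELECT COUNT(*) AS row_count\n"
        f"FROM `{table}`\n"
        f"WHERE {column} BETWEEN DATE '{start.isoformat()}' AND DATE '{end.isoformat()}'"
    )


# Hypothetical table name, for illustration only.
query = build_date_filtered_query(
    "my_project.my_dataset.events", "events_partitioned", date(2024, 1, 1), date(2024, 1, 7)
)
print(query)
```

Filtering on the partition column lets BigQuery prune partitions outside the requested date range instead of scanning the whole table.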

Schema

The output schema is composed of existing and converted columns from the source data, restructured columns, as well as additional data and metadata columns.

Augmentation

Alongside the event_count, event_param and user_property STRUCT columns, the following fields are added to support downstream modelling. SQL definitions are included in the SQL Definitions section.

Metadata

Column | Data Type | Description
event_id | STRING | Unique ID identifying each event
ga_session_id | STRING | Extracted from the ga_session_id event parameter, corresponding to the timestamp of the session start
ga4_dataset_id | STRING | The dataset ID (project_id.dataset_name) of the GA4 property
session_id | STRING | Session ID derived from a combination of stream_id, user_pseudo_id and ga_session_id

Column | Data Type | Description
ga_session_id_is_null | BOOL | Flags whether the ga_session_id is null
user_pseudo_id_is_null | BOOL | Flags whether the user_pseudo_id is null
consent_status | STRING | Consent status for each event

Localization

The IANA tz database timezone in which each event occurred is derived from the geo.country, geo.region and geo.city columns. This allows the observed timestamps to be offset to the precise local time of day at which each event occurred, supporting more accurate user journey time-delta metric modelling.

Column | Field Path | Data Type | Description
local | local.timezone_id | STRING | Timezone ID (e.g. Europe/Madrid)
local | local.timezone_name | STRING | Timezone name (e.g. Central European Standard Time)
local | local.country_code | STRING | ISO 3166-1 two-letter country code
local | local.timezone_source | STRING | city, region or country depending on the matching level of the geo fields
local | local.latitude | STRING | Latitude of the identified location
local | local.longitude | STRING | Longitude of the identified location
local | local.event_date | DATE | Locally-adjusted event_date column
local | local.event_timestamp | DATE | Locally-adjusted event_timestamp column
local | local.event_previous_timestamp | DATE | Locally-adjusted event_previous_timestamp column
local | local.user_first_touch_timestamp | DATE | Locally-adjusted user_first_touch_timestamp column

Approximate latitude and longitude are also included for versatile mapping applications.
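
To illustrate why local adjustment matters, the sketch below converts a UTC timestamp to wall-clock time in a given IANA timezone. It is not the Decode GA4 implementation: the function name is hypothetical, and it assumes the timestamp is expressed in microseconds since the Unix epoch, as in the GA4 BigQuery export.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # standard library from Python 3.9

EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_local_datetime(event_timestamp_micros: int, timezone_id: str) -> datetime:
    """Convert a UTC microsecond timestamp to wall-clock time in an IANA timezone."""
    utc_dt = EPOCH_UTC + timedelta(microseconds=event_timestamp_micros)
    return utc_dt.astimezone(ZoneInfo(timezone_id))


# 2023-12-31 23:30:00 UTC, expressed in microseconds since the Unix epoch.
event_timestamp = 1_704_065_400_000_000
local_dt = to_local_datetime(event_timestamp, "Europe/Madrid")
print(local_dt.isoformat())  # 2024-01-01T00:30:00+01:00
```

In this example the event falls on 31 December in UTC but on 1 January in Madrid, so both the local date and the local time of day differ from the observed values.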

SQL Definitions

Default Configuration

The default configurations for web and app streams are aligned to the Automatically Collected Events documentation. Additional detected values will also be automatically reflected in the output schema.

These default values can be excluded upon installation by setting the include_default_events and/or include_default_event_params BOOL options to false.

Metadata

Column Descriptions

Note that the events table does not contain column descriptions. This is intentional, as Decode GA4 builds a foundation model which will require subsequent transformation in order to improve downstream analytics performance and agentic understanding.

Since column descriptions do not propagate automatically through subsequent transformation steps, the recommended approach is to add column descriptions to the final analytics/agent-ready downstream tables only, where they act as valuable context for users, BI tools, Semantic Layers, Agents or MCPs.

The input schema and descriptions should be provided as context to the LLM that writes the column descriptions. Human verification before publication is always advised.