Events
The events table is a non-destructive, unfiltered, zero-assumption transformation of raw events data. This means that the row count by date will exactly match the row count for each corresponding date shard of the source events data.
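Because the transformation is non-destructive, row-count parity with the source can be verified directly. A minimal sketch, assuming an illustrative source GA4 export dataset `project.analytics_123456` and output table `project.output.events` (substitute your own names; the partition column depends on which template you installed):

```sql
-- Compare the row count of one source date shard with the
-- corresponding date in the output events table.
SELECT
  (SELECT COUNT(*)
   FROM `project.analytics_123456.events_20240101`) AS source_rows,
  (SELECT COUNT(*)
   FROM `project.output.events`
   WHERE event_date = '2024-01-01') AS output_rows;
```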
Partitioning
The output events table is date-partitioned, which enables optimized date-bounded queries from downstream systems, is straightforward for humans and agents to work with, and fits well within BigQuery's limits on the number of partitions per table.
If you have installed using the events_external template, the output table is hive-partitioned on the partition_date DATE column. This means that partition_date should be used in the WHERE clause of subsequent filtering statements.
The events_partitioned template will result in a date-partitioned Native Table in BigQuery, partitioned on the event_date DATE column, which should be used in subsequent WHERE clause filters.
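For example, downstream queries should filter on the partition column appropriate to the installed template (table names are illustrative):

```sql
-- Native table (events_partitioned template): filter on event_date
SELECT event_id, event_name, session_id
FROM `project.output.events`
WHERE event_date BETWEEN '2024-01-01' AND '2024-01-31';

-- External table (events_external template): filter on partition_date
SELECT event_id, event_name, session_id
FROM `project.output.events`
WHERE partition_date BETWEEN '2024-01-01' AND '2024-01-31';
```

Filtering on the partition column allows BigQuery to prune partitions, so only the relevant dates are scanned.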
Schema
The output schema is composed of existing and converted columns from the source data, restructured columns, as well as additional data and metadata columns.
Augmentation
In addition to the event_count, event_param and user_property STRUCT columns, the following additional fields are added to support downstream modelling. SQL definitions are included in the SQL Definitions section.
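For context, this is the kind of repeated UNNEST extraction against the raw GA4 export that the restructured STRUCT columns are designed to replace (dataset name illustrative):

```sql
-- Extracting ga_session_id from the raw GA4 export requires
-- unnesting event_params on every query.
SELECT
  event_name,
  (SELECT value.int_value
   FROM UNNEST(event_params)
   WHERE key = 'ga_session_id') AS ga_session_id
FROM `project.analytics_123456.events_20240101`;
```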
Metadata
| Column | Data Type | Description |
|---|---|---|
| event_id | STRING | Unique ID identifying each event |
| ga_session_id | STRING | Extracted from the ga_session_id key of event_params, corresponding to the timestamp of the session start |
| ga4_dataset_id | STRING | The dataset ID (project_id.dataset_name) of the source GA4 property |
| session_id | STRING | Session ID derived from a combination of stream_id, user_pseudo_id and ga_session_id |
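A hypothetical sketch of how the session_id composition described above could look; the actual expression is given in the SQL Definitions section and may differ:

```sql
-- Hypothetical: hash the three identifying inputs into one session_id
SELECT
  TO_HEX(MD5(CONCAT(
    CAST(stream_id AS STRING), '|',
    user_pseudo_id, '|',
    CAST(ga_session_id AS STRING)))) AS session_id
FROM `project.output.events`;
```

Combining all three inputs is necessary because ga_session_id alone is not guaranteed to be unique across users or streams.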
Consent
| Column | Data Type | Description |
|---|---|---|
| ga_session_id_is_null | BOOL | Flag whether the ga_session_id is null |
| user_pseudo_id_is_null | BOOL | Flag whether the user_pseudo_id is null |
| consent_status | STRING | Consent status for each event |
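As an illustration only, a consent classification could be derived from the two null-flags along these lines; the status labels and logic here are hypothetical, and the actual definition is given in the SQL Definitions section:

```sql
-- Hypothetical consent classification from null identifier flags
SELECT
  user_pseudo_id IS NULL AS user_pseudo_id_is_null,
  ga_session_id IS NULL AS ga_session_id_is_null,
  CASE
    WHEN user_pseudo_id IS NULL THEN 'no_consent'      -- assumed label
    WHEN ga_session_id IS NULL THEN 'partial_consent'  -- assumed label
    ELSE 'consented'                                   -- assumed label
  END AS consent_status
FROM `project.output.events`;
```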
Localization
The IANA tz database timezone in which each event occurred is derived from the geo.country, geo.region and geo.city columns. This supports offsetting the observed timestamps to compute the precise local time-of-day at which each event occurred, for more accurate user journey time-delta metric modelling.
| Column | Field Path | Data Type | Description |
|---|---|---|---|
| local | local.timezone_id | STRING | Timezone ID (e.g. Europe/Madrid) |
| local | local.timezone_name | STRING | Timezone name (e.g. Central European Standard Time) |
| local | local.country_code | STRING | ISO 3166-1 two-letter country code |
| local | local.timezone_source | STRING | city, region or country depending on the matching level of the geo fields |
| local | local.latitude | STRING | Latitude of the identified location |
| local | local.longitude | STRING | Longitude of the identified location |
| local | local.event_date | DATE | Locally-adjusted event_date column |
| local | local.event_timestamp | TIMESTAMP | Locally-adjusted event_timestamp column |
| local | local.event_previous_timestamp | TIMESTAMP | Locally-adjusted event_previous_timestamp column |
| local | local.user_first_touch_timestamp | TIMESTAMP | Locally-adjusted user_first_touch_timestamp column |
Approximate latitude and longitude are also included for versatile mapping applications.
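With a timezone_id in hand, the local wall-clock time can be recovered from the UTC event timestamp. A sketch assuming the raw microsecond-precision event_timestamp and an illustrative output table name:

```sql
-- Convert the raw microsecond UTC timestamp to local time-of-day
-- using the derived IANA timezone ID.
SELECT
  DATETIME(TIMESTAMP_MICROS(event_timestamp),
           local.timezone_id) AS local_datetime,
  EXTRACT(HOUR FROM
    DATETIME(TIMESTAMP_MICROS(event_timestamp),
             local.timezone_id)) AS local_hour
FROM `project.output.events`
WHERE event_date = '2024-01-01';
```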
SQL Definitions
Default Configuration
The default configurations for web and app streams are aligned to the Automatically Collected Events documentation. Additional detected values will also be automatically reflected in the output schema.
These default values can be excluded upon installation by setting the include_default_events and/or include_default_event_params BOOL options to false.
Metadata
Column Descriptions
Note that the events table does not contain column descriptions. This is intentional, as Decode GA4 builds a foundational model which will require subsequent transformation to improve downstream analytics performance and agentic understanding.
Since column descriptions do not propagate automatically through subsequent transformation steps, the recommended approach is to add column descriptions to the final analytics/agent-ready downstream tables only, where they act as valuable context for users, BI tools, Semantic Layers, Agents or MCPs.
The input schema and descriptions should be provided as context to the LLM which is writing the column descriptions, however human verification before publication is always advised.
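Descriptions can then be attached to the downstream table with standard BigQuery DDL (table, column and description text here are illustrative):

```sql
-- Attach a description to a column on the final downstream table
ALTER TABLE `project.marts.events_enriched`
  ALTER COLUMN session_id
  SET OPTIONS (
    description = 'Unique session identifier derived from stream_id, user_pseudo_id and ga_session_id'
  );
```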