How to Track User Journeys in GA4 BigQuery with Session Stitching

Why Session Stitching Matters in GA4

When analyzing user behavior in GA4 BigQuery exports, you’ll quickly realize: sessions are not always clean, continuous, or unified. With Consent Mode, device switching, and delayed user identification (user_id or gclid), GA4 often fragments what should be a single journey into multiple separate sessions.

This is where session stitching comes in — a method to programmatically piece together related sessions into one coherent user journey.

What is Session Stitching?

Session stitching is the process of linking fragmented sessions from the same user into a single logical journey based on time, identifiers, or behavioral patterns.

In GA4 BigQuery exports, a single user journey can be split due to:

Stitching sessions allows you to answer questions like:

How GA4 Fragments Sessions

GA4 defines a session as a group of events with the same ga_session_id, scoped to a user_pseudo_id.

However, issues arise when:

ScenarioResult
Consent is denied at firstGA4 assigns a temporary user_pseudo_id
Consent granted laterGA4 reassigns new session under a real user_pseudo_id
Identifier appears lateEvents stitched later don’t share original session ID
Cross-device interactionEach device = different user_pseudo_id/session_id
BigQuery SQL: Stitching Sessions Based on User ID

Here’s a simplified example that stitches sessions based on user_id when available, and falls back to user_pseudo_id when it’s not:

WITH all_events AS (
  SELECT
    event_date,
    user_pseudo_id,
    user_id,
    event_timestamp,
    traffic_source.medium,
    ga_session_id,
    event_name
  FROM
    `your_project.your_dataset.events_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20250701' AND '20250728'
),

with_session_keys AS (
  SELECT *,
    COALESCE(user_id, user_pseudo_id) AS stitched_user,
    CONCAT(COALESCE(user_id, user_pseudo_id), '_', ga_session_id) AS stitched_session_id
  FROM all_events
)

SELECT
  stitched_user,
  stitched_session_id,
  MIN(event_timestamp) AS session_start,
  MAX(event_timestamp) AS session_end,
  COUNT(DISTINCT event_name) AS total_events,
  COUNTIF(event_name = 'purchase') AS purchases,
  ARRAY_AGG(DISTINCT traffic_source.medium IGNORE NULLS) AS session_mediums
FROM with_session_keys
GROUP BY stitched_user, stitched_session_id
ORDER BY stitched_user, session_start
What the Output Reveals

This query gives you a reconstructed view of user sessions, unified by either user_id or user_pseudo_id. From the results, you can:

Tips for Storing Stitched Sessions

If you’re running this regularly:

  1. Materialize the stitched session table into a persistent table (e.g., session_stitched_daily)
  2. Partition it by event_date for performance
  3. Join it with your CRM or ad platform data to extend the journey
Related Posts
TLDR

Comments

3 responses

Discover more from GA4 BigQuery

Subscribe now to keep reading and get access to the full archive.

Continue reading