How to Track User Journeys in GA4 BigQuery with Session Stitching

Why Session Stitching Matters in GA4

When analyzing user behavior in GA4 BigQuery exports, you’ll quickly realize: sessions are not always clean, continuous, or unified. With Consent Mode, device switching, and delayed user identification (user_id or gclid), GA4 often fragments what should be a single journey into multiple separate sessions.

This is where session stitching comes in — a method to programmatically piece together related sessions into one coherent user journey.

What is Session Stitching?

Session stitching is the process of linking fragmented sessions from the same user into a single logical journey based on time, identifiers, or behavioral patterns.

In GA4 BigQuery exports, a single user journey can be split due to:

  • Lack of initial consent (analytics_storage=’denied’)
  • Cross-device browsing (mobile → desktop)
  • Late arrival of identifiers (user_id, gclid, session_id)

Stitching sessions allows you to answer questions like:

  • What sources led to a final conversion?
  • How many sessions did a user take before purchasing?
  • Where in the funnel did drop-offs happen across days or devices?
How GA4 Fragments Sessions

GA4 defines a session as a group of events with the same ga_session_id, scoped to a user_pseudo_id.

However, issues arise when:

ScenarioResult
Consent is denied at firstGA4 assigns a temporary user_pseudo_id
Consent granted laterGA4 reassigns new session under a real user_pseudo_id
Identifier appears lateEvents stitched later don’t share original session ID
Cross-device interactionEach device = different user_pseudo_id/session_id
BigQuery SQL: Stitching Sessions Based on User ID

Here’s a simplified example that stitches sessions based on user_id when available, and falls back to user_pseudo_id when it’s not:

WITH all_events AS (
  SELECT
    event_date,
    user_pseudo_id,
    user_id,
    event_timestamp,
    traffic_source.medium,
    ga_session_id,
    event_name
  FROM
    `your_project.your_dataset.events_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20250701' AND '20250728'
),

with_session_keys AS (
  SELECT *,
    COALESCE(user_id, user_pseudo_id) AS stitched_user,
    CONCAT(COALESCE(user_id, user_pseudo_id), '_', ga_session_id) AS stitched_session_id
  FROM all_events
)

SELECT
  stitched_user,
  stitched_session_id,
  MIN(event_timestamp) AS session_start,
  MAX(event_timestamp) AS session_end,
  COUNT(DISTINCT event_name) AS total_events,
  COUNTIF(event_name = 'purchase') AS purchases,
  ARRAY_AGG(DISTINCT traffic_source.medium IGNORE NULLS) AS session_mediums
FROM with_session_keys
GROUP BY stitched_user, stitched_session_id
ORDER BY stitched_user, session_start
What the Output Reveals

This query gives you a reconstructed view of user sessions, unified by either user_id or user_pseudo_id. From the results, you can:

  • See how many stitched sessions led to conversion
  • Analyze time between first touch and final action
  • Attribute sessions to channels with greater accuracy
Tips for Storing Stitched Sessions

If you’re running this regularly:

  1. Materialize the stitched session table into a persistent table (e.g., session_stitched_daily)
  2. Partition it by event_date for performance
  3. Join it with your CRM or ad platform data to extend the journey
Related Posts
TLDR
  • GA4 often splits user sessions due to consent and device fragmentation
  • Session stitching allows you to reassemble those journeys using SQL
  • Use COALESCE(user_id, user_pseudo_id) to group by user
  • Store your stitched sessions for ongoing reporting or modeling

Discover more from

Subscribe to get the latest posts sent to your email.

Discover more from

Subscribe now to keep reading and get access to the full archive.

Continue reading