Unlock the power of Google Analytics with Big Query

Cross-Device Session Stitching and Multi-Touch Attribution in GA4 Using BigQuery

In today’s multi-device world, users often start a journey on mobile, continue on desktop, and convert on a tablet. GA4 UI can deduplicate users but splits sessions across devices, making session counts and marketing attribution unreliable.

This creates a major challenge for marketers: Which touchpoint truly drove the conversion?

BigQuery provides the ability to reconstruct sessions across devices using session stitching rules, giving accurate insights into user journeys, engagement, and campaign ROI.


The Problem: Session Counts and Attribution Don’t Align

GA4 sessions can be split due to:

  1. Device Changes: Mobile → desktop → tablet sessions are counted separately.
  2. Inactivity Timeout: Sessions end after 30 minutes of inactivity.
  3. Traffic Source Changes: Switching campaigns mid-session triggers a new session.
  4. Midnight Cutoff: Sessions crossing midnight are split for daily reporting.

Impact:


Use Case: Multi-Touch Attribution Across Devices

Scenario:
An e-commerce site runs multiple campaigns (email, paid search, social, affiliates). Users may interact with several devices and channels before purchasing.

Problem:

Solution:

Outcome:


Simple SQL Example: Count Session Starts

For basic session counts, you can start with the session_start event:

SELECT
  user_pseudo_id,
  COUNT(event_name) AS sessions
FROM
  `your_project.analytics_XXXX.events_*`
WHERE
  event_name = 'session_start'
GROUP BY user_pseudo_id
ORDER BY sessions DESC;

Explanation:
Counts every session_start event per user. Simple, but doesn’t handle multi-device or session-splitting scenarios.


Deep Dive SQL: Cross-Device Session Stitching

Here’s a full session-stitching SQL for GA4 in BigQuery:

-- Step 1: Collect events and previous context per user
WITH events_with_lag AS (
  SELECT
    user_pseudo_id,
    user_id, -- Use for cross-device tracking
    event_timestamp,
    event_name,
    traffic_source.source AS traffic_source,
    traffic_source.medium AS traffic_medium,
    device.category AS device_category,
    LAG(event_timestamp) OVER(PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS prev_ts,
    LAG(traffic_source.source) OVER(PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS prev_source,
    LAG(device.category) OVER(PARTITION BY user_pseudo_id ORDER BY event_timestamp) AS prev_device
  FROM
    `your_project.analytics_XXXX.events_*`
),

-- Step 2: Identify session boundaries
session_flags AS (
  SELECT
    *,
    CASE
      WHEN TIMESTAMP_DIFF(TIMESTAMP_MICROS(event_timestamp), TIMESTAMP_MICROS(prev_ts), MINUTE) > 30 THEN 1
      WHEN traffic_source != prev_source THEN 1
      WHEN EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp)) != EXTRACT(DATE FROM TIMESTAMP_MICROS(prev_ts)) THEN 1
      WHEN device_category != prev_device THEN 1
      ELSE 0
    END AS new_session_flag
  FROM events_with_lag
),

-- Step 3: Assign session numbers per user
sessionized_events AS (
  SELECT
    *,
    SUM(new_session_flag) OVER (PARTITION BY user_pseudo_id ORDER BY event_timestamp ROWS UNBOUNDED PRECEDING) + 1 AS session_number
  FROM session_flags
)

-- Step 4: Aggregate sessions and map conversions
SELECT
  user_pseudo_id,
  user_id,
  session_number,
  MIN(TIMESTAMP_MICROS(event_timestamp)) AS session_start_time,
  MAX(TIMESTAMP_MICROS(event_timestamp)) AS session_end_time,
  COUNT(*) AS events_in_session,
  ARRAY_AGG(DISTINCT traffic_source) AS traffic_sources_in_session,
  ARRAY_AGG(DISTINCT device_category) AS devices_in_session,
  SUM(CASE WHEN event_name = 'purchase' THEN 1 ELSE 0 END) AS conversions_in_session
FROM sessionized_events
GROUP BY user_pseudo_id, user_id, session_number
ORDER BY user_pseudo_id, session_start_time;

Key Features:


Key Takeaways


TL;DR

Discover more from GA4 BigQuery

Subscribe now to keep reading and get access to the full archive.

Continue reading