Unlock the power of Google Analytics with Big Query

Why GA4 User Counts Don’t Match Between BigQuery and the GA4 Interface

If you’ve ever compared GA4 user metrics in the interface with BigQuery exports, you might have noticed discrepancies. For example, Active Users, New Users, or Total Users in the GA4 UI often don’t match what you calculate from the raw BigQuery data.

Understanding why this happens is crucial for accurate reporting, marketing attribution, and data-driven decision-making. This guide explains the problem and offers detailed solutions to align your BigQuery metrics with GA4 reporting logic.


The Problem: Discrepancies Between GA4 UI and BigQuery

MetricGA4 InterfaceBigQuery ExportWhy It Differs
Active UsersCounts users with engaged sessionsCounted via user_pseudo_id or user_id in eventsGA4 filters by engagement events, session timeout, and deduplicates using blended identity
New UsersFirst-time user in a propertyCalculated from earliest event per userGA4 uses first_visit/first_open with UI-level session stitching; BigQuery needs custom SQL
Returning UsersUsers with previous sessionsUsers with prior events in BigQueryGA4 UI automatically excludes first sessions; raw data requires joins and filters
Total UsersDeduplicated by GA4 identity rulesCOUNT(DISTINCT user_pseudo_id/user_id)BigQuery counts may overcount if identity stitching not applied

Key Insight: GA4 UI metrics apply deduplication, identity prioritization, filters, and session logic automatically, while BigQuery exports are raw event-level data.


The Solution: Aligning BigQuery Metrics with GA4

To ensure your BigQuery user counts match GA4, follow these detailed steps:

1. Deduplicate Users Correctly

Example:

COUNT(DISTINCT COALESCE(user_id, user_pseudo_id)) AS active_users

This ensures every unique person is counted only once, even if they visit multiple times or from multiple devices.


2. Filter for Engaged Sessions

Example:

WHERE event_name = 'user_engagement'

This mimics GA4 UI logic for Active Users, avoiding inflated counts from short or non-interactive visits.


3. Match Date Ranges and Timezones

4. Use GA4 Identity Spaces

5. Implement Session Stitching

Example:

SELECT
  COUNT(DISTINCT COALESCE(user_id, user_pseudo_id)) AS active_users
FROM
  `project.dataset.events_*`
WHERE
  event_name = 'user_engagement'
GROUP BY session_id

This ensures Active Users match GA4’s cross-device session logic.


6. Optional: Adjust for Sampling & Data Thresholds


Quick SQL Example: Counting Active Users

SELECT
  event_date,
  COUNT(DISTINCT COALESCE(user_id, user_pseudo_id)) AS active_users
FROM
  `project.dataset.events_*`
WHERE
  event_name = 'user_engagement'
  AND event_date BETWEEN '2025-08-01' AND '2025-08-07'
GROUP BY
  event_date
ORDER BY
  event_date;

What’s happening in this SQL:

  1. Deduplicates logged-in and anonymous users with COALESCE.
  2. Filters only user_engagement events to mimic GA4 Active Users.
  3. Counts unique users per day for daily Active Users.

Key Takeaways

For a deeper dive into GA4 user metrics and BigQuery SQL, see: Mastering GA4 Users: Metrics, Identity, and BigQuery Analysis.

Discover more from GA4 BigQuery

Subscribe now to keep reading and get access to the full archive.

Continue reading