In this guide, we’ll walk you through SQL queries to calculate Active, New, and Returning Users, explore cross-device behavior, and share best practices for deduplication and segmentation. By the end, you’ll have a complete toolkit to analyze GA4 users in BigQuery with precision—perfect for building reports, dashboards, or feeding insights into your CRM or marketing tools.
Understanding GA4 User Identifiers in BigQuery
| Identifier | Description | Notes / Use Case |
|---|---|---|
| user_pseudo_id | Anonymous ID tied to device/browser | Default in GA4 exports; general user analysis |
| user_id | Unique ID assigned to logged-in users | Enables cross-device tracking; links to CRM or internal systems |
| COALESCE(user_id, user_pseudo_id) | Combines both for deduplicated analysis | Best practice for Active, New, Returning user counts |
Always deduplicate users before counting metrics to match GA4 logic more closely.
1. Counting Active Users
Active Users are users who had an engaged session or were new in the timeframe.
WITH daily_events AS (
SELECT
event_date,
COALESCE(user_id, user_pseudo_id) AS unique_user_id
FROM
`project.dataset.events_*`
WHERE
event_name = 'user_engagement'
AND event_date BETWEEN '2025-08-01' AND '2025-08-07'
)
SELECT
event_date,
COUNT(DISTINCT unique_user_id) AS active_users
FROM
daily_events
GROUP BY
event_date
ORDER BY
event_date;
What’s happening in this SQL:
- CTE
daily_events: Selects events of typeuser_engagementand replaces nulluser_idwithuser_pseudo_idto capture all users. - COUNT(DISTINCT unique_user_id): Counts each user only once per day.
- GROUP BY event_date: Aggregates counts by day for daily reporting.
2. Counting New Users
WITH first_events AS (
SELECT
COALESCE(user_id, user_pseudo_id) AS unique_user_id,
MIN(event_date) AS first_seen_date
FROM
`project.dataset.events_*`
GROUP BY
unique_user_id
)
SELECT
first_seen_date,
COUNT(unique_user_id) AS new_users
FROM
first_events
WHERE
first_seen_date BETWEEN '2025-08-01' AND '2025-08-07'
GROUP BY
first_seen_date
ORDER BY
first_seen_date;
What’s happening in this SQL:
- CTE
first_events: Finds the earliest event date for each user, treating that as their first visit. - COUNT(unique_user_id): Counts each user’s first session exactly once.
- WHERE first_seen_date BETWEEN …: Filters only those users who were new in the selected date range.
3. Counting Returning Users
WITH first_events AS (
SELECT
COALESCE(user_id, user_pseudo_id) AS unique_user_id,
MIN(event_date) AS first_seen_date
FROM
`project.dataset.events_*`
GROUP BY
unique_user_id
)
SELECT
event_date,
COUNT(DISTINCT COALESCE(user_id, user_pseudo_id)) AS returning_users
FROM
`project.dataset.events_*` e
JOIN
first_events f
ON
COALESCE(e.user_id, e.user_pseudo_id) = f.unique_user_id
WHERE
e.event_date BETWEEN '2025-08-01' AND '2025-08-07'
AND e.event_date > f.first_seen_date
GROUP BY
event_date
ORDER BY
event_date;
What’s happening in this SQL:
- CTE
first_events: Calculates the first_seen_date for each user. - JOIN
events_*with first_events: Matches all events to their user’s first activity. - WHERE e.event_date > f.first_seen_date: Ensures only returning users (not first-time users) are counted.
- COUNT(DISTINCT …): Deduplicates users per day.
4. Cross-Device Analysis
WITH user_devices AS (
SELECT
COALESCE(user_id, user_pseudo_id) AS unique_user_id,
device.category AS device_type
FROM
`project.dataset.events_*`
WHERE
event_date BETWEEN '2025-08-01' AND '2025-08-07'
GROUP BY
unique_user_id, device_type
)
SELECT
COUNT(DISTINCT unique_user_id) AS multi_device_users
FROM
user_devices
GROUP BY
unique_user_id
HAVING
COUNT(DISTINCT device_type) > 1;
What’s happening in this SQL:
- CTE
user_devices: Collects unique users and the type of device they used. - COUNT(DISTINCT device_type) > 1: Filters users who used multiple device types.
- COUNT(DISTINCT unique_user_id): Totals multi-device users.
Best Practices
| Practice | Reason |
|---|---|
Deduplicate with COALESCE(user_id, user_pseudo_id) | Avoid double-counting across devices |
Filter for user_engagement events | Matches GA4 Active Users metric |
| Match GA4 date ranges and timezone | Ensures consistency with GA4 reports |
| Use BigQuery for custom segmentation | Funnels, cohorts, and combining with CRM data |
| Document and modularize SQL | Reusable queries for daily, weekly, and monthly reporting |
Key Takeaways
- GA4 user metrics differ from UA, but BigQuery allows precise replication and extension.
- With
user_idanduser_pseudo_id, you can analyze Active, New, Returning, and cross-device users. - BigQuery enables custom metrics, cohort analysis, advanced segmentation, and integration with other systems.
- Combine this with GA4 UI insights for a full 360° view of user behavior.
For conceptual understanding of GA4 user metrics and identity, see: Mastering GA4 Users: Metrics, Identity, and BigQuery Analysis.
TL;DR
- Active Users = engaged users or new users; filter
user_engagementevents. - New Users = first-ever session per user;
MIN(event_date)identifies them. - Returning Users = users with previous sessions; join with first_seen_date.
- Cross-Device Users = track logged-in and anonymous users across multiple devices.
- Deduplicate using
COALESCE(user_id, user_pseudo_id)for accurate counts. - BigQuery enables advanced segmentation, cohort analysis, and CRM integration beyond GA4 UI.
