
Stop Using the GA4 API — Use BigQuery Export Instead

TranX AI Team · 7 min read

If you're pulling ecommerce data from GA4, you're probably using the Data API — through Looker Studio, a custom script, or a third-party connector. It works. Until it doesn't.

One day your product report shows an "(other)" row swallowing half your catalog. Your dashboard hits a quota limit and stops refreshing. Your funnel numbers look different every time you change the date range. And you start wondering: is this data even accurate?

The short answer: probably not. The GA4 Data API silently degrades your data in ways most teams never notice. BigQuery Export gives you the raw, unsampled events, and the export itself is free.

The five ways the GA4 API quietly ruins your data

The GA4 Data API isn't broken — it's just designed for simplicity, not precision. And for ecommerce brands making real money decisions, that gap is expensive.

1. Sampling kicks in faster than you think

Once a query touches more than 10 million events (the limit for standard properties), GA4 takes a sample and scales it up. For a mid-size ecommerce store doing 50K sessions a day with full ecommerce event tracking, that threshold can be hit with just a few weeks of data.

Standard Reports are unsampled, but they're rigid — you can't customize them. Explorations give you flexibility but introduce sampling. The API inherits the same problem. You get convenience or accuracy, not both.
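
As a rough back-of-the-envelope check, the sketch below estimates how many days of data fit under the 10 million event threshold. The events-per-session figure is an assumption for illustration, not a GA4 constant; substitute your own property's average.

```python
import math

SAMPLING_THRESHOLD_EVENTS = 10_000_000  # threshold quoted in the text above


def days_until_sampling(sessions_per_day: int, events_per_session: float) -> int:
    """Return how many whole days of data stay at or under the threshold."""
    events_per_day = sessions_per_day * events_per_session
    return math.floor(SAMPLING_THRESHOLD_EVENTS / events_per_day)


# 50K sessions/day; 15 events per session is a hypothetical average.
print(days_until_sampling(50_000, 15))  # 13
```

Under that assumption, a report spanning more than about two weeks would already be large enough to trigger sampling.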

2. Data thresholds hide your small segments

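GA4 applies "data thresholds" to prevent identifying individual users. If a segment has fewer than roughly 40–50 users (Google does not publish the exact cutoff), GA4 withholds the entire row. The row is not grouped into another bucket; it is simply absent, and nothing in the report tells you which rows were dropped.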

For ecommerce, this kills niche analysis. Want to see how your VIP customers in a specific city behave? Or how a small-but-profitable demographic responds to a new campaign? GA4 silently withholds that data. You can't turn this off.

3. The "(other)" row eats your product catalog

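GA4 treats any dimension with more than 500 unique values in a single day as high-cardinality. Dimensions like "Item name" or "Page path" easily cross that mark, and they push reports toward the table row limit (roughly 50,000 rows per day on standard properties). Once a table exceeds that limit, GA4 collapses the long tail into a catch-all (other) row.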

If you sell 5,000 SKUs and break them down by even one more dimension, a large share of your product-level data can become invisible. You can see your top sellers, but the mid-tier and long-tail products, often where the real margin insights live, are lumped together and lost.

4. Row and dimension limits box you in

The API returns 10,000 rows per request by default (the limit can be raised, up to 250,000) and allows only 9 dimensions per query. Need item name + category + campaign + source + medium + landing page + device + geography + a custom dimension? That's already 9. Add one more and you're splitting into multiple queries and trying to stitch them back together.

When a result set is larger than a single request can return, you paginate, and each additional request burns quota tokens. Which brings us to the next problem.
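
To see how quickly paginated requests add up, here is a minimal sketch that computes the offset and limit windows needed to walk a large result set. It only does the arithmetic; it does not call the API. The 10,000-row page size follows the figure above, and the 45,000-row total is a hypothetical example.

```python
from typing import Iterator, Tuple


def page_windows(total_rows: int, page_size: int = 10_000) -> Iterator[Tuple[int, int]]:
    """Yield (offset, limit) pairs that cover total_rows in page_size chunks."""
    for offset in range(0, total_rows, page_size):
        yield offset, min(page_size, total_rows - offset)


windows = list(page_windows(total_rows=45_000))
print(len(windows))  # 5 requests
print(windows[-1])   # (40000, 5000): the final, partial page
```

Each window corresponds to one API request, so a 45,000-row report costs five requests' worth of tokens instead of one.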

5. Quota throttling kills your dashboards

Standard GA4 properties get 200,000 tokens per day and roughly 40,000 tokens per hour. A single complex query with wide date ranges and multiple dimensions can cost hundreds of tokens. A Looker Studio dashboard with 15 widgets, each hitting the API separately, can exhaust your hourly quota in minutes.

When the quota is gone, your dashboards go blank until it resets. Your marketing team sees stale data — or worse, they don't realize the data stopped updating.
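
The quota arithmetic can be sketched in a few lines. The token budgets are the figures quoted above; the cost of 100 tokens per widget query is an assumption chosen for illustration, since real costs vary with date range and dimensions.

```python
HOURLY_TOKENS = 40_000   # per-hour budget quoted above
DAILY_TOKENS = 200_000   # per-day budget quoted above


def refreshes_before_exhaustion(widgets: int, tokens_per_widget: int, budget: int) -> int:
    """Return how many full dashboard loads fit inside a token budget."""
    tokens_per_refresh = widgets * tokens_per_widget
    return budget // tokens_per_refresh


# 15 widgets at a hypothetical 100 tokens each.
print(refreshes_before_exhaustion(15, 100, HOURLY_TOKENS))  # 26
print(refreshes_before_exhaustion(15, 100, DAILY_TOKENS))   # 133
```

With a handful of people opening the dashboard and changing filters, 26 loads per hour disappears quickly.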

What BigQuery Export gives you instead

BigQuery Export sends your raw GA4 event data directly into Google BigQuery. No pre-aggregation, no sampling, no artificial limits. Here's what that means in practice.

100% of your events, zero sampling

Every single event (page views, add-to-carts, purchases, custom events) lands in BigQuery as an individual row. Query a year of data across billions of events and get exact numbers. No estimation, no scaling. This used to cost $150K+/year with Universal Analytics 360. With GA4, the export is free for every property; you pay only for BigQuery storage and queries beyond the free tier.

No "(other)" row — ever

BigQuery stores raw event rows, not pre-aggregated tables. There is no cardinality limit and no row cap. Every SKU, every page path, every UTM parameter is individually queryable. Your entire product catalog is visible, not just the top performers.

Full SQL flexibility

No 9-dimension limit. No rigid report structure. Write any SQL query you want — joins, window functions, CTEs, subqueries. Build custom attribution models. Create cohort analyses. Calculate true customer lifetime value by acquisition source. Do things the API literally cannot do.
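
As an illustration, the sketch below builds a BigQuery SQL string that sums purchase revenue for every item, with no cardinality cap. It relies on the standard GA4 export layout (date-sharded `events_*` tables with a repeated `items` field); the helper name and the project and dataset names are hypothetical.

```python
def build_item_revenue_query(table: str, start: str, end: str) -> str:
    """Return BigQuery SQL that sums purchase units and revenue per item.

    table: fully qualified wildcard table, e.g. "my-project.analytics_123456.events_*"
    start, end: inclusive dates as YYYYMMDD strings, matched against _TABLE_SUFFIX.
    """
    for value in (start, end):
        if not (len(value) == 8 and value.isdigit()):
            raise ValueError(f"expected YYYYMMDD, got {value!r}")
    return f"""
SELECT
  item.item_id,
  item.item_name,
  SUM(item.quantity) AS units,
  SUM(item.item_revenue) AS revenue
FROM `{table}`, UNNEST(items) AS item
WHERE event_name = 'purchase'
  AND _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
GROUP BY item.item_id, item.item_name
ORDER BY revenue DESC
""".strip()


# Hypothetical project and dataset names.
sql = build_item_revenue_query("my-project.analytics_123456.events_*", "20240101", "20240131")
print(sql)
```

With the google-cloud-bigquery client installed and credentials configured, the string could be passed to `bigquery.Client().query(sql).result()`; that step is left out so the sketch stays runnable offline.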

Join with your other data sources

This is where BigQuery becomes transformative. You can JOIN your GA4 behavioral data with:

  • Shopify / your OMS — match GA4 sessions to actual orders and calculate real revenue per channel, not platform-estimated revenue.
  • Ad platform exports — combine Meta, Google, and TikTok spend data with GA4 touchpoints for unified cross-channel attribution.
  • CRM data — connect customer records to behavioral data using user_id for LTV analysis, churn prediction, and segmentation.
  • Inventory / product data — join product margins with GA4 purchase events to see which campaigns drive the most profit, not just revenue.

The GA4 API can't do any of this. It only sees what GA4 collected.
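
The first join in the list above can be sketched in plain Python before writing any SQL. It assumes your store sends its order ID to GA4 as the purchase `transaction_id`; the field names `channel`, `order_id`, and `net_revenue` are hypothetical stand-ins for whatever your own tables use.

```python
from collections import defaultdict
from typing import Dict, List


def revenue_by_channel(ga4_purchases: List[dict], orders: List[dict]) -> Dict[str, float]:
    """Attribute each order's net revenue to the channel of the matching GA4 purchase."""
    channel_by_txn = {p["transaction_id"]: p["channel"] for p in ga4_purchases}
    totals: Dict[str, float] = defaultdict(float)
    for order in orders:
        # Orders GA4 never saw (ad blockers, consent declined) land in "(unmatched)".
        channel = channel_by_txn.get(order["order_id"], "(unmatched)")
        totals[channel] += order["net_revenue"]
    return dict(totals)


ga4_purchases = [
    {"transaction_id": "A1", "channel": "paid_social"},
    {"transaction_id": "A2", "channel": "email"},
]
orders = [
    {"order_id": "A1", "net_revenue": 120.0},
    {"order_id": "A2", "net_revenue": 80.0},
    {"order_id": "A3", "net_revenue": 50.0},
]
print(revenue_by_channel(ga4_purchases, orders))
```

The "(unmatched)" bucket is a deliberate design choice: it shows how much real revenue GA4 never recorded, which the API alone cannot tell you.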

No quota throttling

Once data is in BigQuery, you query it with BigQuery's own pricing — the first 1 TB per month is free. Your Looker Studio dashboards, dbt models, and scheduled queries all run against BigQuery directly. No GA4 tokens consumed. No hourly caps. No blank dashboards.

Real ecommerce use cases you can't do with the API

Here's what becomes possible once your GA4 data lives in BigQuery:

  • True LTV by acquisition channel. Join GA4 first-touch data with 12 months of order history from your OMS. See which channels bring customers that keep buying — not just which ones get the first click.
  • Full product catalog analytics. Query purchase and add-to-cart events for all 10,000+ SKUs without hitting cardinality limits. Find your hidden winners.
  • Cross-platform attribution. Merge GA4 touchpoint data with actual ad spend from every platform. Calculate unified ROAS using your own revenue numbers.
  • Custom funnel analysis. Build any funnel from raw events — not just the pre-defined funnels GA4 supports. See exactly where users drop off, by segment, by device, by campaign.
  • Margin-based optimization. Join product cost data with purchase events to optimize for profit instead of revenue. A $100 sale with 60% margin beats a $150 sale with 10% margin.
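
The margin comparison in the last bullet works out as follows. This is a minimal sketch of the arithmetic only; in practice the margin would come from joining product cost data to purchase events.

```python
def profit(sale_amount: float, margin_pct: int) -> float:
    """Return profit for a sale given its margin as a whole-number percentage."""
    return sale_amount * margin_pct / 100


high_margin_sale = profit(100, 60)  # $100 sale at 60% margin
low_margin_sale = profit(150, 10)   # $150 sale at 10% margin

print(high_margin_sale)  # 60.0
print(low_margin_sale)   # 15.0
```

The smaller sale earns four times the profit, which is why ranking campaigns by revenue alone can point spend in the wrong direction.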

The one caveat: the daily export limit

Standard GA4 properties have a 1 million events/day limit on the daily batch export. If you consistently exceed this, the export pauses.

Two ways to manage this:

  1. Filter your export. In GA4's BigQuery export settings, exclude high-volume but low-value events like scroll, file_download, or video_progress. Export only the events that matter for your analysis.
  2. Use streaming export. The intraday streaming tables have no daily event cap and update within minutes. For high-traffic stores, this is often the better option anyway.

For very high-volume sites (10M+ events/day), GA4 360 raises the daily export limit into the billions of events.
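
To judge whether filtering is enough to stay under the limit, a quick check like the one below can help. The daily event counts are made-up numbers for illustration; you would take real ones from your own GA4 event report.

```python
from typing import Dict, Set

DAILY_EXPORT_LIMIT = 1_000_000  # daily batch export limit quoted above


def exported_events(daily_counts: Dict[str, int], excluded: Set[str]) -> int:
    """Return the number of events exported per day after excluding some event names."""
    return sum(count for name, count in daily_counts.items() if name not in excluded)


# Hypothetical daily volumes per event name.
daily_counts = {
    "page_view": 400_000,
    "scroll": 300_000,
    "view_item": 250_000,
    "video_progress": 150_000,
    "add_to_cart": 60_000,
    "purchase": 5_000,
}

unfiltered = exported_events(daily_counts, excluded=set())
filtered = exported_events(daily_counts, excluded={"scroll", "video_progress"})
print(unfiltered, unfiltered > DAILY_EXPORT_LIMIT)  # 1165000 True
print(filtered, filtered > DAILY_EXPORT_LIMIT)      # 715000 False
```

In this example, dropping two low-value events brings the property comfortably back under the limit.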

How TranX AI makes this easy

BigQuery Export is powerful — but writing SQL against GA4's nested event schema isn't exactly beginner-friendly. The events table uses nested arrays for event_params and items that require UNNEST() operations, and building even a basic ecommerce funnel query can take dozens of lines of SQL.
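
To make the nesting concrete, here is a small Python sketch that mirrors the shape of one exported event row and pulls a single parameter out of event_params, which is roughly what an UNNEST() subquery does in SQL. The sample row and the `get_param` helper are hypothetical; the key/value layout follows the GA4 export schema.

```python
from typing import Any, Optional

# Each parameter stores its value in exactly one of these typed fields.
VALUE_FIELDS = ("string_value", "int_value", "float_value", "double_value")


def get_param(event: dict, key: str) -> Optional[Any]:
    """Return the first non-null typed value for the given event_params key."""
    for param in event.get("event_params", []):
        if param["key"] == key:
            for field in VALUE_FIELDS:
                value = param["value"].get(field)
                if value is not None:
                    return value
    return None


# A hypothetical event row, shaped like a record in the events table.
event = {
    "event_name": "add_to_cart",
    "event_params": [
        {"key": "page_location", "value": {"string_value": "https://example.com/cart"}},
        {"key": "ga_session_id", "value": {"int_value": 1700000000}},
    ],
}

print(get_param(event, "page_location"))
print(get_param(event, "ga_session_id"))
```

Every metric that depends on a parameter needs this kind of lookup, which is why even simple funnel queries grow long.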

TranX AI connects directly to your BigQuery warehouse and lets you ask questions in plain English:

  • "What's my conversion rate by landing page this month?"
  • "Which products have the highest add-to-cart rate but lowest purchase rate?"
  • "Show me customer LTV by acquisition source over the last 6 months."
  • "What campaigns drove the most profit after accounting for product costs?"

TranX AI generates the SQL, runs it against your BigQuery data, and shows you the answer — with the full query visible so you can verify, tweak, and rerun. No sampling. No "(other)" row. No quota limits. Just the real numbers.

