The Google Analytics 4 transition has been less than enjoyable for many a marketer and analyst. But one of the biggest (remaining) headaches has got to be data thresholding.

A screenshot of the Google Analytics 4 interface showing the header "Conversions: Event name" with a red triangle around an exclamation point next to it.
Forget the Bermuda Triangle, this is the red triangle of death for marketers.

What is data thresholding?

Google Analytics has applied thresholding to this card and will only display the data when the data meets the minimum aggregation thresholds.

Say what now?

Thresholding is applied when the data could be granular enough for an analyst to theoretically identify individuals from the reporting. It's new to GA4 because the transition from Universal Analytics (GA3?) signaled a shift from session-first reporting to user-first reporting.


Sampling Side Note

If you click the red triangle of death dropdown, you may notice a mention of data sampling (hopefully via an item that says "unsampled card"). What is data sampling?

Analytics uses data sampling when the number of events returned by an exploration or funnel report exceeds the limit for your property type (i.e., a standard or 360 property). 
The quota limit is 10 million events for users of the free Google Analytics product and up to 1 billion events for Google Analytics 360 users.
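
As a quick sanity check, the limits quoted above can be turned into a few lines of Python. This is only a sketch: the function name and the property-type labels are made up for illustration, and the 360 figure is an upper bound ("up to 1 billion"), so your actual limit may be lower.

```python
# Event limits quoted above. The dictionary keys are labels chosen for this sketch.
SAMPLING_LIMITS = {
    "standard": 10_000_000,   # free Google Analytics
    "360": 1_000_000_000,     # Google Analytics 360 (upper bound)
}


def will_be_sampled(event_count: int, property_type: str = "standard") -> bool:
    """Return True if a query touching event_count events exceeds the limit."""
    return event_count > SAMPLING_LIMITS[property_type]


# Hypothetical exploration touching 12.5 million events.
print(will_be_sampled(12_500_000))          # standard property
print(will_be_sampled(12_500_000, "360"))   # 360 property
```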

If you're racking up a lot of events, consider either shelling out for 360 or investigating GA alternatives.

Now, back to our regularly scheduled goblin...


Data Thresholding In Practice

If you see the red triangle of death at the top of your report, know that you aren't seeing all the data. This is especially discomforting if the report you're looking at is related to a metric your performance is measured on.

The quickest way I've seen to get hit by the thresholding goblin is by using a short time frame or by using dimensions closely tied to user information (demographic data and search query information are 2 of the 3 threshold triggers mentioned by Google).

But it can be sneaky too. If an event didn't fire often enough in the time frame you chose (maybe because it's new), it may get hidden by the thresholding goblin. And you're left to wonder if you set it up wrong or if no one is taking the action that triggers the event.
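
If you pull reports through the GA4 Data API rather than the interface, one way to tell "hidden" from "never happened" is to look at the response metadata. The sketch below assumes a `runReport` response in its JSON form, with `eventName` as the first dimension and a `subjectToThresholding` flag inside `metadata`. The function name, the sample response, and the event names are invented for illustration.

```python
def diagnose_event(report: dict, event_name: str) -> str:
    """Classify an event as reported, possibly thresholded, or absent."""
    # Collect the first dimension value (assumed to be eventName) of every row.
    reported_names = {
        row["dimensionValues"][0]["value"]
        for row in report.get("rows", [])
    }
    if event_name in reported_names:
        return "reported"
    # The event is missing: see whether the response flags thresholding.
    if report.get("metadata", {}).get("subjectToThresholding", False):
        return "possibly withheld by thresholding"
    return "not reported, no thresholding flagged"


# Hand-written sample response, not real data.
sample_report = {
    "rows": [
        {
            "dimensionValues": [{"value": "page_view"}],
            "metricValues": [{"value": "1200"}],
        }
    ],
    "metadata": {"subjectToThresholding": True},
}

print(diagnose_event(sample_report, "page_view"))
print(diagnose_event(sample_report, "newsletter_signup"))
```

A "possibly withheld" result doesn't prove the event fired, only that the report can't rule it out.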

How To Fix Thresholding

The easiest way is to "adjust the date range" (as mentioned above).

Data may be withheld when viewing a report or exploration within a narrow date range if you have low user or event counts in that date range. Expanding the date range may increase the number of users who triggered an event, enabling you to see the previously thresholded data.

You can also export to BigQuery. This works because the export doesn't carry Google Signals data, and Google Signals is the true source of this goblin.
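
For the BigQuery route, here is a rough sketch of the kind of query you might run against the export to count events yourself. It assumes the standard daily export tables (`events_YYYYMMDD`) with the `event_name` and `user_pseudo_id` columns. The helper name and the project and dataset in the usage line are placeholders, so swap in your own.

```python
def build_event_count_query(table_prefix: str, start: str, end: str) -> str:
    """Build a BigQuery SQL string counting events and users per event name.

    table_prefix is "<project>.<dataset>"; start and end are YYYYMMDD strings.
    """
    for value in (start, end):
        # Only accept 8-digit dates so nothing unexpected lands in the SQL.
        if not (len(value) == 8 and value.isdigit()):
            raise ValueError(f"expected YYYYMMDD, got {value!r}")
    return f"""
SELECT
  event_name,
  COUNT(*) AS event_count,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM `{table_prefix}.events_*`
WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
GROUP BY event_name
ORDER BY event_count DESC
""".strip()


# Placeholder project and dataset names.
print(build_event_count_query("my-project.analytics_123456789", "20240101", "20240107"))
```

Paste the printed SQL into the BigQuery console, or hand it to whichever client you already use.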

Back in the UA days, Google Signals was a set-and-forget toggle in the admin to (theoretically) make your Google Ads better by tying your site activity to logged-in Google user accounts. Essentially Google's version of the Meta pixel. But GA4's user-centric approach to measurement causes Signals to unleash the thresholding goblin.

Google just recently launched a simple solution: turn off "Include Google signals in reporting identity" in the admin.

A screenshot of the GA4 interface showing the Google Signals Data Collection pane with the toggle to disable including the data in reporting.
Admin > Data Collection > turn off "Include Google signals in reporting identity"

That's it. It's that simple. I wrote over 500 words for the payoff of one screenshot telling you to click one toggle. Isn't GA4 fun?

If for some reason that approach doesn't work for you, or you want to keep that data included in reporting, or you just want to delve deeper into GA4's settings, there is another option: change your Reporting Identity.

You can find this in the Admin under Data Display. The default is Blended, which includes all the toys Google can pack in. But hidden under the nondescript "Show all" at the bottom right of the 2-option table is a third option: Device Based.

I'll try for a deeper dive on the differences between these 3 options at a later date, but prior to the Google Signals reporting toggle, this was Google's recommended way around the thresholding goblin. The recommendation included switching back to one of the other 2 options after you were finished with your reporting needs, but it's not clear why you would need to do that if you are just going to switch back to device-based every time you open GA4 to pull data.

Anyway, TL;DR:

Data thresholding occurs when your reporting view could be used to identify individual users either because you don't have enough data in the time frame or you're using dimensions closely tied to users' personally identifiable information (PII).

Data thresholding can prevent data from showing in impacted reports, meaning you don't have a full view of what's happening on your site (which can drive you insane if entire events are "missing").

The best way to avoid the thresholding goblin is to turn off "Include Google signals in reporting identity" in the data collection section of the admin.

Here's to better reporting and analysis.