Quick Walkthrough: How to Track AI Traffic on Your Website Using Google Analytics 4
If you’re short on time and just need to get this sorted, here’s the fast track:
Step-by-Step: Spotting AI Referrers in GA4
1. On your GA4 dashboard, go to the left-hand menu and click Reports.
2. Under Life cycle, choose Acquisition → Traffic acquisition.
This is where you’ll find your general traffic stats and sources.
3. Click Add comparison at the top.
4. Add a filter for Referral & affiliates traffic.
5. Under Dimension, search for Session source / medium.
6. Under Match type, select matches regex.
7. Paste the following regular expression into the field:
.*(gpt|chatgpt|openai|neeva|writesonic|nimble|outrider|perplexity|google.bard|bard|edgeservices|gemini.google).*
This expression is designed to match known AI platforms that may visit your site, even if they appear in slightly different formats.
8. You’ll now have a filtered view of traffic from these sources, showing:
- Which AI tool or platform triggered the session
- What pages they accessed
- When these sessions occurred
- Whether they were repeated or one-off hits
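Before pasting the regex into GA4, it can help to sanity-check it against a few example source/medium strings. A quick sketch in Python using the same pattern from step 7 (the sample values are made up for illustration; `re.IGNORECASE` is added to be safe about casing):

```python
import re

# The same pattern used in the GA4 comparison filter (step 7).
AI_SOURCES = re.compile(
    r".*(gpt|chatgpt|openai|neeva|writesonic|nimble|outrider|"
    r"perplexity|google.bard|bard|edgeservices|gemini.google).*",
    re.IGNORECASE,
)

# Hypothetical "Session source / medium" values for illustration.
samples = [
    "chatgpt.com / referral",
    "perplexity.ai / referral",
    "gemini.google.com / referral",
    "google / organic",
]

for source in samples:
    label = "AI" if AI_SOURCES.search(source) else "other"
    print(f"{source:30} -> {label}")
```

The first three samples match; plain organic Google search does not, which is exactly the separation the GA4 comparison gives you.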
Need help setting this up? Book a free GA4 audit with The Munro Agency and we’ll help you get it sorted.
What’s All This Fuss about AI Traffic?
If you’ve noticed some funny spikes in your web analytics lately — strange referrers, ultra-short sessions, or unusually high bounce rates — it’s not just you. More often than not, it’s not even a person. It’s AI.
As AI-powered tools like ChatGPT, Perplexity, Gemini, and Claude continue to scour the web for training data and user answers, they’re regularly landing on your site. Sometimes intentionally. Sometimes not. And this traffic, while not inherently bad, can seriously muddy your data.
In this guide, we’ll walk you through how to spot and track AI traffic in Google Analytics 4, so you can keep your data accurate and actionable.
What Is AI Traffic (and Why Should You Care)?
AI traffic refers to sessions initiated by artificial intelligence bots — typically large language models (LLMs) or their assistants — that crawl, index, or retrieve web content.
Common Sources of AI Traffic:
- ChatGPT Browsing (OpenAI)
- Perplexity AI
- Bing Copilot
- Google Gemini / Bard
- Anthropic Claude
- Neeva, Writesonic, and others
Why It Matters:
- Distorted Metrics: Bots skew your average time on page and bounce rate.
- Misdirected Optimisation: You may start optimising for traffic that will never convert.
- Unreliable Funnels: These sessions can appear in user flows, suggesting interest where there is none.
How GA4 Handles AI Traffic by Default
To be frank, GA4 isn’t brilliant at this out of the box. It does have a basic bot filter under Admin > Data Settings > Data Filters, but it’s mostly geared towards known web crawlers and spam.
The kind of AI traffic we’re talking about — LLMs scraping for context or bots fetching summaries — is still relatively new and can slip through undetected. That’s where a bit of custom setup and analysis comes in.
Digging Deeper: How to Identify AI Traffic in GA4
If you’ve followed the Quick Walkthrough above, you’ve already taken a solid first step. But if you want to go further and really understand the shape of AI traffic on your site, here’s how:
- Use Session Source/Medium Filters in GA4
  Apply the regex filter in GA4 to pinpoint sessions from known AI tools like ChatGPT, Perplexity, or Bard. This lets you explore which pages they accessed, when those visits occurred, and how little they actually engaged (usually just a quick hit and exit).
  - Landing pages: See which blog posts or site pages AI bots are targeting most often.
  - Session timestamps: Identify whether the visits cluster around odd hours, often a clue it's not human traffic.
  - Engagement patterns: Bots usually bounce immediately, with no clicks, scrolls, or time spent on page.
- Compare Regions and Devices
  - AI sessions often originate from unusual locations or appear with system-default language settings (e.g., “en-us”).
  - Most show up as desktop traffic, even when mobile is dominant on your site.
- Check for Behavioural Red Flags
  AI bots:
  - Rarely click beyond the first page
  - Often show 0 seconds on site
  - Sometimes hit the same blog post repeatedly
  - Usually have no referrer, or show up as direct / (none)
  - Don’t interact: no clicks, no scrolls, no events
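Taken together, these red flags can be combined into a simple scoring heuristic. A minimal sketch, assuming session data exported from GA4 into plain dictionaries (the field names here are illustrative, not GA4's own):

```python
# Rough heuristic combining the red flags above.
# Field names (duration_seconds, pages_viewed, referrer, event_count)
# are illustrative -- adapt them to however you export session data.

def looks_like_bot(session: dict) -> bool:
    score = 0
    if session.get("duration_seconds", 0) == 0:
        score += 1                      # 0 seconds on site
    if session.get("pages_viewed", 0) <= 1:
        score += 1                      # never clicks past the first page
    if session.get("referrer") in (None, "", "(direct)"):
        score += 1                      # no referrer / direct-(none)
    if session.get("event_count", 0) == 0:
        score += 1                      # no clicks, scrolls, or events
    return score >= 3                   # three or more flags = likely bot

# Example: a zero-engagement, single-page, direct hit.
print(looks_like_bot({"duration_seconds": 0, "pages_viewed": 1,
                      "referrer": "", "event_count": 0}))  # True
```

No single flag is conclusive on its own (plenty of humans bounce too), which is why the sketch requires several flags before labelling a session.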
Comparison Table: AI vs Real Visitor Behaviour
Here’s a snapshot of what sets bots apart from humans in your data.
| Metric | Typical Human | Typical AI Bot |
|---|---|---|
| Session Duration | 30s–5 mins | 0s |
| Pages per Session | 2–5 | 1 |
| Time of Visit | Daytime hours | Unusual hours (e.g. 3am) |
| Referrer Source | Varied | openai.com, gemini.google.com |
| Event Interactions | Clicks, scrolls, form fill | None |
When you see patterns that match the AI column, it’s a strong sign you’re looking at machine traffic — not potential customers. Keep this in mind as you filter or segment your data.
Complementary Tools to Support GA4
GA4 alone isn’t always enough, especially if you want to be precise. These additional tools can help:
Google Tag Manager (GTM)
- Create custom triggers for user-agent patterns (e.g., contains “GPTBot”)
- Fire events or push tags when certain bots land
Server Log Analysis
- Use log data to extract raw IPs and headers
- Cross-reference with known bot IPs (e.g., OpenAI publishes theirs)
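The two steps above can be sketched together: scan an access log (combined format) for AI crawler user agents, then cross-check each hit's IP against published ranges. The user-agent tokens are real crawler names, but the CIDR range below is only an example; pull current ranges from the bot operators themselves (OpenAI publishes GPTBot's):

```python
import ipaddress
import re

# Crawler user-agent tokens to look for in the log's user-agent field.
BOT_AGENTS = re.compile(r"GPTBot|ChatGPT-User|PerplexityBot|ClaudeBot", re.I)

# Example CIDR only -- replace with the ranges the bot operators publish.
KNOWN_RANGES = [ipaddress.ip_network("20.15.240.64/28")]

# Combined log format: IP ... [timestamp] "request" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d+ \S+ "[^"]*" "([^"]*)"')

def scan(lines):
    """Return (ip, user_agent, ip_verified) for each AI-crawler hit."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m:
            continue
        ip, agent = m.group(1), m.group(2)
        if BOT_AGENTS.search(agent):
            verified = any(ipaddress.ip_address(ip) in net for net in KNOWN_RANGES)
            hits.append((ip, agent, verified))
    return hits

sample = ['20.15.240.70 - - [01/May/2025:03:12:01 +0000] '
          '"GET /blog/post HTTP/1.1" 200 512 "-" "GPTBot/1.0"']
print(scan(sample))
```

The `verified` flag matters because user agents are trivially spoofed; an IP inside a published range is much stronger evidence than the UA string alone.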
GA4 Custom Dimensions
- Set up a custom dimension called Traffic Type
- Use GTM to populate this with values like ai_bot or human based on detection rules
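If you'd rather set the dimension server-side, GA4's Measurement Protocol accepts custom event parameters. A sketch that only builds the payload, without sending it; the measurement ID and API secret are placeholders, and `traffic_type` must be registered as a custom dimension in GA4 before it shows up in reports:

```python
import json

# Sketch: build a GA4 Measurement Protocol payload that tags a hit with
# a custom "traffic_type" parameter. MEASUREMENT_ID and API_SECRET are
# placeholders -- find yours under Admin in GA4.
MP_ENDPOINT = "https://www.google-analytics.com/mp/collect"
MEASUREMENT_ID = "G-XXXXXXX"    # placeholder
API_SECRET = "your-api-secret"  # placeholder

def build_payload(client_id: str, traffic_type: str) -> dict:
    return {
        "client_id": client_id,
        "events": [{
            "name": "page_view",
            "params": {"traffic_type": traffic_type},
        }],
    }

payload = build_payload("555.123", "ai_bot")
print(json.dumps(payload, indent=2))
# POST this JSON to:
#   f"{MP_ENDPOINT}?measurement_id={MEASUREMENT_ID}&api_secret={API_SECRET}"
```

With values like `ai_bot` or `human` in that parameter, you can segment any GA4 report on the dimension rather than re-applying regex filters each time.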
What to Do with AI Traffic: Exclude or Tag
Once you’ve spotted it, the next question is: should you exclude AI traffic from your reports or keep it tagged?
Here’s how to handle it:
Exclude in Explorations
- Use “Add Comparison” to exclude regex-matched sources
- Create a bot-free version of key reports (great for board meetings)
Tag It for Context
- Rather than exclude, mark it with a custom event parameter
- That way, you preserve the data but can separate it easily
Best Practices for a Cleaner Analytics Setup
- Review traffic sources weekly: Regularly scan your reports for unusual patterns — like zero-duration sessions or unfamiliar referrers — to catch new AI bots early.
- Stay updated with AI agents: New bots emerge constantly. Keep your regex and filters up to date by following known AI platforms and their published user-agent lists.
- Work with your dev or SEO team: They can help implement deeper filters — like IP blocks or GTM triggers — without affecting your core tracking setup.
- Split your reports: Create two views — one with all traffic and another filtered for human users. This helps maintain clean data for decision-making without losing sight of AI visits.
A Note on Limitations
The AI landscape moves fast. Some bots may soon mimic human patterns, use proxies, or obfuscate their identity. GA4 might catch up — or it might not. Either way, keeping a critical eye on your traffic is going to be part of the job going forward.
Final Thoughts
In an age where traffic can come from people, programs, or something in between, it’s vital to know who (or what) is really visiting your website. With a bit of setup in GA4, plus some clever filtering, you can make sure your analytics reflect genuine user behaviour — not just curious machines.
Want Expert Eyes on This?
If you’re feeling uncertain about where to begin — or just want to sanity-check your GA4 data — get in touch with The Munro Agency. We’ll get your analytics future-proofed and ready to handle whatever AI throws your way.