Table of Contents
- Why analytics assumptions are so tempting (and so expensive)
- Assumption #1: “I can track everyone”
- Assumption #2: “Users = people”
- Assumption #3: “Sessions are visits, and they’re comparable across tools (and time)”
- Assumption #4: “Direct traffic means they typed the URL”
- Assumption #5: “Attribution reports tell me what caused the conversion”
- Assumption #6: “The numbers are precise (so my conclusions can be precise)”
- Assumption #7: “Bots and internal traffic don’t matter (or the tool handles it)”
- Assumption #8: “UTM tags are optional (and naming consistency is ‘nice to have’)”
- Assumption #9: “Cross-domain tracking is optional”
- Assumption #10: “Dashboards answer questions”
- A quick “assumption audit” checklist you can run this week
- Conclusion: trade assumptions for hypotheses
- Field Notes: 5 Real-World Analytics “Oops” Moments (and What They Taught Me)
- 1) The “Direct Traffic Growth Strategy” that was actually… a typo
- 2) The checkout domain that quietly doubled “new users” overnight
- 3) The “retention crash” that turned out to be a browser privacy cliff
- 4) “Engagement improved!” said the dashboard, which had just changed the rules
- 5) The bot problem that hid inside “normal-looking” traffic
Analytics is a little like a security camera: it feels objective until you realize the lens is smudged, the
timestamp is wrong, and your cat just triggered the motion sensor for the 47th time. And yet, we still use
dashboards to make real decisions: budgets, headcount, product roadmaps, campaign strategy, and occasionally
the fate of someone’s “Q2 Growth Experiment” slide deck.
The problem isn’t that analytics is “bad.” The problem is that we treat it like a census when it’s closer to a
well-intentioned estimate. Between privacy changes, cookie limits, ad blockers, tracking prevention, and the
fact that humans use more than one device (rude), measurement comes with blind spots. Those blind spots become
dangerous when we quietly paper them over with assumptions.
Let’s take the most common analytics assumptions, many highlighted in the Moz Whiteboard Friday conversation,
and turn them into something far more useful: testable statements, better instrumentation, and reporting you
can defend without sweating through your shirt.
Why analytics assumptions are so tempting (and so expensive)
Assumptions happen for two reasons. First, analytics tools present numbers with the confidence of a fortune
teller who just found your wallet. Second, teams need answers fast. When a CEO asks, “Why did revenue dip?”
nobody wants to say, “Because Safari, consent prompts, and a mislabeled event created a measurement black hole.”
But assumptions compound. A slightly-wrong “users” number becomes a wildly-wrong CAC calculation. A “direct
traffic” spike turns into a bad brand narrative. A misread attribution model shifts spend away from the channels
that actually create demand. Before you know it, you’re optimizing for the dashboard instead of the business.
Assumption #1: “I can track everyone”
This is the granddaddy assumption, and it’s almost never true. Modern measurement depends on a chain of events:
tags firing correctly, browsers allowing storage, users consenting (where required), networks not blocking calls,
and platforms successfully stitching identity across sessions.
Reality check: tracking is partial by design
- Browser privacy features can limit how long identifiers last, which affects returning user measurement and attribution windows.
- Consent choices can reduce observable data, pushing tools toward aggregated or modeled reporting.
- Ad blockers can prevent tags from loading or block requests to common analytics endpoints.
- Cross-device behavior fragments journeys unless you have strong identity stitching (like authenticated User-ID).
Practical takeaway: treat analytics as a “high-quality sample,” not a perfect mirror of reality. Your job is to
understand the bias in the sample (who gets counted reliably and who doesn’t), then make decisions with that bias
in mind.
Assumption #2: “Users = people”
“Users” sounds like actual humans. But most analytics platforms identify users through a mix of device IDs,
cookies, and (sometimes) modeled identity signals. That means one person can become multiple users (phone + laptop
+ tablet + incognito + new phone), and multiple people can become one user (shared family iPad, kiosk devices,
classrooms, internal demo machines).
What to do instead
- Segment users by authentication state (logged-in vs. anonymous) if your product supports it.
- Implement User-ID for signed-in experiences so cross-device journeys stitch more accurately.
- Use users as a directional metric, and validate with backend counts (accounts created, purchases, subscriptions, leads).
If someone asks, “How many people visited?” a better answer is: “Here’s how many unique devices we observed, and
here’s the range we expect based on login rates and known measurement gaps.”
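One way to put a number on "validate with backend counts" is a simple coverage ratio: how many of the outcomes your backend recorded did analytics also report? The sketch below is a minimal illustration, not a standard formula. The function name and the example figures are hypothetical.

```python
def coverage_ratio(analytics_count: int, backend_count: int) -> float:
    """Share of backend-recorded outcomes that analytics also reported."""
    if backend_count <= 0:
        raise ValueError("backend_count must be a positive number")
    return analytics_count / backend_count


# Hypothetical month: analytics reported 820 purchases, the order system recorded 1,000.
ratio = coverage_ratio(analytics_count=820, backend_count=1000)
print(f"Analytics observed {ratio:.0%} of backend orders")  # Analytics observed 82% of backend orders
```

A ratio well below 1.0 suggests under-counting (blocked tags, declined consent). A ratio above 1.0 suggests double-firing events. Either way, it is a cue to investigate, not a correction factor to apply blindly.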
Assumption #3: “Sessions are visits, and they’re comparable across tools (and time)”
Sessions feel simple: a visit to your site. But session definitions vary by platform and configuration. In GA4,
bounce rate is the inverse of engagement rate, and a session counts as engaged if it lasts longer than 10 seconds
(by default), includes a key event, or has two or more page or screen views. If you compare those engagement-based
metrics to older reports without acknowledging definition changes, you’re basically comparing apples to… apple pie.
The classic trap: reporting “improvement” caused by definitions, not behavior
Example: a team migrates to a new analytics setup and celebrates a huge drop in bounce rate. The truth: the metric
definition changed (or the default events changed), not the audience’s behavior.
Better approach: use a stable “north star” metric tied to business outcomes (qualified leads, trial-to-paid, repeat
purchase rate). Then use sessions, engagement rate, and similar metrics as diagnostic signals, not the final verdict.
Assumption #4: “Direct traffic means they typed the URL”
Ah yes, “direct”: the junk drawer of acquisition. Sometimes it’s true brand demand. Sometimes it’s your email
newsletter. Sometimes it’s a social app that refuses to pass referral data. Sometimes it’s a redirect chain that
quietly dropped the referrer like it was hot. And sometimes it’s a UTM tagging mistake so small you’ll miss it
even when you’re staring directly at it (which, ironically, becomes… direct traffic).
Why “direct” gets inflated
- Missing or broken campaign tagging (UTMs not applied, inconsistent naming, wrong parameter casing)
- “Dark social” sharing (links pasted in texts, DMs, Slack, etc.)
- Apps and privacy restrictions that reduce referrer visibility
- Cross-domain gaps that break sessions and attribution paths
- Attribution rules that treat “direct” differently than other channels
How to troubleshoot a direct traffic spike (without panic-baking banana bread)
- Check recent campaign launches: did any new email, influencer, QR code, or partner push go out without UTMs?
- Audit redirects: especially from link shorteners, vanity domains, and paid landing pages.
- Inspect UTM conventions: one platform using “utmSource” instead of “utm_source” can quietly break attribution in some setups.
- Review cross-domain measurement if the journey spans multiple domains (checkout, booking engines, subdomains).
The goal isn’t to “eliminate direct.” The goal is to make “direct” honest: brand demand + true unknowns, not “we
forgot to tag the thing.”
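The "inspect UTM conventions" step can be partly automated. Here is a minimal sketch that scans a link for query keys that look like UTM parameters but are not written in the expected lowercase, underscore form. The function name and the `EXPECTED_UTM_KEYS` set are assumptions for illustration. Extend the set to match whatever parameters your own setup recognizes.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed list of accepted keys; adjust to your own tagging standard.
EXPECTED_UTM_KEYS = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}


def suspicious_utm_keys(url: str) -> list[str]:
    """Return query keys that resemble UTM parameters but are not in the expected form."""
    keys = [key for key, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)]
    flagged = []
    for key in keys:
        # Strip casing and separators so "utmSource" and "UTM-Medium" are both recognized.
        squashed = key.lower().replace("_", "").replace("-", "")
        if squashed.startswith("utm") and key not in EXPECTED_UTM_KEYS:
            flagged.append(key)
    return flagged


print(suspicious_utm_keys("https://example.com/?utmSource=news&utm_medium=email"))  # ['utmSource']
```

Running something like this over a campaign link export before launch is cheaper than explaining a direct traffic spike afterward.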
Assumption #5: “Attribution reports tell me what caused the conversion”
Attribution is not a truth machine. It’s a credit assignment system based on the data you collected and the model
you chose. Different models answer different questions:
- First-touch asks: “What introduced someone to us?”
- Last-touch asks: “What closed the deal (or was last observed)?”
- Linear asks: “What supported the journey across touches?”
- Data-driven tries to infer influence patterns from observed paths (with caveats based on data volume and observability).
The most common assumption: “Attribution = causation.” It’s not. If your analytics can’t observe a touchpoint
(privacy limits, walled-garden views, offline interactions), it can’t credit it, no matter how influential it was.
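To see why attribution is credit assignment and not causation, it helps to run one observed path through the rule-based models listed above. The sketch below is a simplified illustration with a hypothetical function name and channel labels. Real tools add lookback windows and special handling for direct traffic, and data-driven models are not shown.

```python
from collections import defaultdict


def assign_credit(path: list[str], model: str) -> dict[str, float]:
    """Split one conversion's credit across the observed touchpoints in `path`."""
    if not path:
        return {}
    credit: dict[str, float] = defaultdict(float)
    if model == "first_touch":
        credit[path[0]] = 1.0
    elif model == "last_touch":
        credit[path[-1]] = 1.0
    elif model == "linear":
        share = 1.0 / len(path)
        for channel in path:
            credit[channel] += share  # repeated channels accumulate credit
    else:
        raise ValueError(f"unknown model: {model!r}")
    return dict(credit)


# Hypothetical observed journey for a single conversion.
path = ["paid_social", "email", "organic_search", "email"]
for model in ("first_touch", "last_touch", "linear"):
    print(model, assign_credit(path, model))
```

Same journey, three different answers. Any touchpoint missing from `path` (a podcast mention, a conversation with sales) gets zero credit under every model.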
What to do instead
- Match the model to the decision: use first-touch for demand gen strategy, last-touch for conversion optimization, and multi-touch for budget debates.
- Validate with experiments: geo holdouts, incrementality tests, lift studies, or controlled campaign pauses.
- Bring in offline reality when relevant: sales cycles, call tracking, in-store conversions, and CRM stages.
Assumption #6: “The numbers are precise (so my conclusions can be precise)”
Many teams treat analytics numbers like accounting. But analytics is often subject to rounding, aggregation,
privacy thresholds, sampling in exploratory analysis, and reporting constraints designed to protect user privacy.
In plain English: sometimes you’re looking at a cleaned-up approximation.
Three silent culprits
- Sampling: using a subset of data to speed up complex reports, which can shift totals and distributions.
- Thresholding: withholding granular rows when data could expose individual user info in certain report contexts.
- Cardinality limits: collapsing or grouping high-variation dimensions (think: millions of unique page titles or parameters).
Practical takeaway: when you’re making a big bet (budget shifts, channel cuts, product changes), use multiple
lenses (analytics plus backend data, plus controlled tests) so you’re not betting the farm on a dashboard that’s
intentionally smoothing details.
Assumption #7: “Bots and internal traffic don’t matter (or the tool handles it)”
Tools try to filter known bots, but “known” is doing a lot of work in that sentence. Monitoring tools, scraping,
click fraud, spam referrals, and internal teams QA’ing the site can all skew engagement metrics, especially on
low-traffic sites where a few thousand junk events can look like a “growth story.”
How to harden your data quality
- Exclude internal and developer traffic using supported filters and a clear process for IP ranges and testing states.
- Watch for anomaly patterns: 0-second engagement time, 100% bounce-like behavior, weird geos, or sudden spikes to obscure pages.
- Validate key events server-side where possible (purchases, subscriptions, lead submissions) to reduce reliance on front-end signals alone.
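The anomaly patterns above can be turned into a rough daily check. The sketch below flags days where sessions spike well above the median while engagement time collapses. The field names, thresholds, and sample data are all assumptions for illustration. Tune them against your own traffic before trusting the output.

```python
from statistics import median


def flag_suspicious_days(daily, spike_factor=2.0, min_engagement_seconds=5.0):
    """Return dates where sessions spike above the median AND engagement is very low.

    `daily` is a list of dicts with 'date', 'sessions', and 'avg_engagement_seconds'.
    """
    baseline = median(day["sessions"] for day in daily)
    return [
        day["date"]
        for day in daily
        if day["sessions"] > spike_factor * baseline
        and day["avg_engagement_seconds"] < min_engagement_seconds
    ]


# Hypothetical export from an analytics tool.
daily = [
    {"date": "2024-05-01", "sessions": 100, "avg_engagement_seconds": 42.0},
    {"date": "2024-05-02", "sessions": 110, "avg_engagement_seconds": 39.5},
    {"date": "2024-05-03", "sessions": 95, "avg_engagement_seconds": 44.1},
    {"date": "2024-05-04", "sessions": 400, "avg_engagement_seconds": 1.2},
    {"date": "2024-05-05", "sessions": 105, "avg_engagement_seconds": 41.0},
    {"date": "2024-05-06", "sessions": 300, "avg_engagement_seconds": 40.0},
]
print(flag_suspicious_days(daily))  # ['2024-05-04']
```

Note that the 300-session day is not flagged: a spike with healthy engagement looks more like a real campaign than a bot visit. Requiring both conditions keeps the check from crying wolf on every good day.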
Assumption #8: “UTM tags are optional (and naming consistency is ‘nice to have’)”
UTM parameters are the labels on your analytics filing cabinet. Without labels, you still have papers; you just
can’t find anything when it matters. With messy labels, you end up with “Email,” “email,” “e-mail,” “newsletter,”
and “NEWSLETTER!!!” all living as separate channels like a weird reality show.
UTM rules that save careers
- Standardize utm_source and utm_medium (approved values only; keep a simple naming dictionary).
- Tag everything you control (email, QR codes, partner links, paid social when needed).
- Don’t tag internal links (it overwrites the visitor’s original source and creates false campaigns).
- Be boring on purpose: lowercase, hyphens, no spaces, no creative spelling.
Bonus: if you build a lightweight “campaign intake” form for UTMs, you’ll catch errors before they ship, and you’ll
spend less time playing detective in “direct traffic.”
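A campaign intake step can be as small as a helper that normalizes values and rejects anything outside the naming dictionary. The sketch below assumes a hypothetical `APPROVED_MEDIUMS` list and helper names. Swap in your own approved values.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical naming dictionary; replace with your team's approved values.
APPROVED_MEDIUMS = {"email", "paid-social", "cpc", "qr", "partner"}


def normalize_value(value: str) -> str:
    """Lowercase, trim, and collapse anything that is not a-z or 0-9 into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", value.strip().lower()).strip("-")


def tag_url(base_url: str, source: str, medium: str, campaign: str) -> str:
    """Append normalized UTM parameters, refusing mediums outside the dictionary."""
    medium = normalize_value(medium)
    if medium not in APPROVED_MEDIUMS:
        raise ValueError(f"unapproved utm_medium: {medium!r}")
    parts = urlsplit(base_url)
    params = parse_qsl(parts.query, keep_blank_values=True)  # keep existing query params
    params += [
        ("utm_source", normalize_value(source)),
        ("utm_medium", medium),
        ("utm_campaign", normalize_value(campaign)),
    ]
    return urlunsplit(parts._replace(query=urlencode(params)))


print(tag_url("https://example.com/landing", "Spring Newsletter", "Email", "Q2 Launch"))
# https://example.com/landing?utm_source=spring-newsletter&utm_medium=email&utm_campaign=q2-launch
```

Normalizing fixes casing and spacing, but it cannot know that "e-mail" and "email" mean the same thing. That is the dictionary's job, which is why `tag_url(..., medium="E-mail", ...)` raises an error here and forces a human decision.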
Assumption #9: “Cross-domain tracking is optional”
If your customer journey spans multiple domains (marketing site to app, checkout on a separate domain, third-party
booking engines), then cross-domain measurement is not a luxury feature. Without it, users can appear as “new” when
they move between domains, sessions can break, and attribution can get scrambled.
The result is a familiar horror story: “Why did our conversion rate drop?” It didn’t. Your measurement just split
one journey into two unrelated journeys and then shrugged.
Symptoms you need cross-domain fixes
- Self-referrals (your own domains showing up as referral sources)
- Sudden increases in new users after a domain change or checkout migration
- Conversion paths that “start over” at checkout
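The first symptom, self-referrals, is easy to check in a referrer export. The sketch below flags referrer URLs whose host belongs to a list of your own domains. The `OWN_DOMAINS` values and the function name are hypothetical placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical list of domains that make up one customer journey.
OWN_DOMAINS = {"example.com", "checkout.example-pay.com"}


def is_self_referral(referrer_url: str, own_domains=OWN_DOMAINS) -> bool:
    """True when the referrer's host is one of our domains or a subdomain of one."""
    host = (urlsplit(referrer_url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in own_domains)


referrers = [
    "https://www.example.com/cart",
    "https://checkout.example-pay.com/step-1",
    "https://news.ycombinator.com/",
]
print([url for url in referrers if is_self_referral(url)])
```

If a meaningful share of referral traffic passes this check, the journey is probably being split at a domain boundary, and the cross-domain configuration is the first place to look.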
Assumption #10: “Dashboards answer questions”
Dashboards show metrics. Questions require interpretation. If your dashboard has 42 charts and none of them can
answer “Are we attracting the right customers?” you don’t have a dashboard; you have a data-themed wallpaper.
Upgrade: from dashboards to decisions
- Define the decision: “Should we increase paid search budget?” “Is onboarding improving retention?”
- Define the success metric: a measurable outcome tied to the business (not just clicks).
- List measurement risks: tracking prevention, missing UTMs, cross-domain gaps, consent rates.
- Choose validation data: CRM stages, payment processor, warehouse events, customer support tags.
- Decide the threshold for action: what change is meaningful enough to act on?
When you do this, analytics becomes a decision-support system, not a scoreboard that can be “won” by tagging your
own internal links.
A quick “assumption audit” checklist you can run this week
- Users: Do you know what percentage of sessions are logged-in? Do you have User-ID for authenticated users?
- Sessions: Are your session and engagement definitions documented and consistent across reports?
- Direct: Do you have a UTM standard? Do you tag email, QR codes, affiliates, and partnerships?
- Attribution: Do you know which model your team is using in which report? Do stakeholders understand what it does and doesn’t mean?
- Data quality: Are internal/developer filters active? Do you monitor bot-like anomalies?
- Cross-domain: Does the full conversion path live on one domain? If not, is cross-domain measurement configured?
- Reality check: Do you reconcile key outcomes with backend sources (orders, signups, qualified leads)?
Conclusion: trade assumptions for hypotheses
The best analytics teams don’t demand perfect data; they design around imperfect data. They know where measurement
is strong, where it’s weak, and how to confirm big decisions with additional evidence. Most importantly, they
treat assumptions as hypotheses:
“We believe this channel drives new demand because we see it early in journeys and in lift tests.”
“We believe direct traffic rose because a partner campaign launched without UTMs, and the timing aligns.”
“We believe retention dipped because Safari cookie limits shortened user recognition, and backend cohorts stayed flat.”
Analytics will always have gaps. Your job is to make the gaps visible, quantify the risk, and build a measurement
practice that still leads to confident decisions. Or, at the very least, prevents “direct traffic” from becoming
your brand’s most powerful marketing channel.
Field Notes: 5 Real-World Analytics “Oops” Moments (and What They Taught Me)
The fastest way to understand common analytics assumptions is to watch them break in the wild, preferably on a
Tuesday, so you have the rest of the week to recover. Here are five experiences that show how “obvious” metrics
can turn into practical comedy (the kind where you laugh, then quietly update your tagging document).
1) The “Direct Traffic Growth Strategy” that was actually… a typo
A team celebrated a sudden jump in direct traffic and declared, with straight faces, that “brand awareness is up.”
The timing even matched a big social push. The story was beautiful, until we looked at the campaign links.
Someone had used a nonstandard UTM format (mixed casing and a couple of parameters that didn’t get recognized).
The social push did drive traffic, yes. But the system couldn’t categorize it properly, so it slid into “direct”
like a raccoon sneaking into an open garage.
Lesson: if direct traffic rises, don’t write the press release yet. Audit UTMs, redirects, and channel rules first.
“Direct” is not a personality trait; it’s a symptom.
2) The checkout domain that quietly doubled “new users” overnight
Another team moved checkout to a separate domain for security and platform reasons. Good move. But within days,
acquisition reports looked like a miracle: new users skyrocketed, conversion rates dropped, and the funnel looked
“leakier.” The business didn’t change; the measurement did. Without cross-domain measurement, users were being
counted as “new” again when they hit checkout, and the journey fractured into multiple sessions.
Lesson: if your journey spans domains, cross-domain setup isn’t optional. Otherwise your funnel becomes a
before-and-after photo of a haircut… taken from two different people.
3) The “retention crash” that turned out to be a browser privacy cliff
A subscription site saw returning users drop dramatically in certain browsers. Panic followed. Teams brainstormed
content changes, pricing changes, even existential rebranding. But backend subscriptions and logins were stable.
The real culprit was measurement: tracking prevention and cookie limits shortened recognition windows, so returning
visitors started appearing as “new” after enough time passed.
Lesson: retention metrics need a second opinion. If your analytics says loyalty died, validate with login cohorts,
subscription renewals, or CRM activity before you hold a funeral.
4) “Engagement improved!” said the dashboard, which had just changed the rules
A marketing team changed its event setup and started firing more engagement-related events by default. Engagement
rate rose. Bounce rate dropped. Champagne almost happened. But user behavior didn’t meaningfully change; measurement
did. The team hadn’t documented the instrumentation change, so they compared “before” and “after” like it was one
continuous metric.
Lesson: write down changes. Annotate launches. Document event definitions. Otherwise, your reports will take
credit for improvements you didn’t earn (which is fun until someone asks you to repeat the miracle next quarter).
5) The bot problem that hid inside “normal-looking” traffic
One site experienced a steady increase in pageviews and sessions with oddly low engagement time and suspiciously
consistent behavior: same pages, same cadence, same everything. The built-in bot filtering caught some of it, but
not all. A portion came from automated tools, uptime monitors, and low-grade scrapers that looked “human enough.”
The team had been making content decisions based on what was essentially an army of polite robots.
Lesson: bot filtering is necessary, not sufficient. Build anomaly monitoring, filter internal/dev traffic, and
sanity-check engagement patterns, especially if you’re a smaller site where bots can skew the entire story.
The common thread across all five stories is simple: analytics assumptions create confident narratives. Confident
narratives drive decisions. And decisions made on untested assumptions are just expensive guesses dressed up in
charts. The fix isn’t paranoia; it’s process: consistent tagging, documented definitions, validation against
backend reality, and a team culture that treats “I’m not sure” as the start of analysis, not the end of a meeting.