written by Antonny Santos (Solutions Architect Lead)

Attribution Is Not a Model. It’s a System.

Most attribution discussions start with a question like: “Should we use last-click, first-touch, or multi-touch?”

That question is already wrong.

Attribution isn’t just a choice of model. It’s a system design problem operating under a causal constraint that most teams never make explicit. And it doesn’t live neatly in any single discipline. Attribution lives in the messy middle—where product analytics, marketing execution, identity resolution, and data governance all meet.

When attribution fails, it usually has nothing to do with model choice. It fails because the system itself isn’t built to hold the complexity it’s meant to explain.

A Quick Look Back: How Did We Get Here?

It’s worth remembering that most attribution tools evolved from simpler models. First-click, last-click, and linear attribution models were built for a time when journeys were shorter, data was messier, and the number of touchpoints was smaller.

As user journeys became more complex and cross-channel by default, teams began layering in more data and more sophisticated models—like algorithmic or data-driven attribution.

But even the most advanced models still rely on the same flawed assumption: that what we observe can cleanly explain what caused a behavior. The models became more complex, but the foundational issues remained unaddressed.

And so attribution kept breaking in more subtle and expensive ways.

The Core Attribution Constraint: Observation vs. Intervention

Attribution systems are built almost entirely on observational data. We see that a user was exposed to a campaign and that a conversion happened later. We see a sequence of events, and we try to assign meaning.

But the question attribution is meant to inform (“Should we invest more here?”) isn’t observational. It’s counterfactual. What we really want to know is:

What would have happened if this exposure had not occurred?

This is an interventional question. And that distinction matters.

In formal terms, most attribution systems rely on formulas like:

P(Y∣X)

Where X is the exposure, and Y is the conversion event. But what we’re actually trying to answer is something closer to:

P(Y∣do(X))

The do(·) operator signals an intervention—something that deliberately changes the system, independently of user behavior, intent, or context.

That gap between P(Y∣X) and P(Y∣do(X)) isn’t a tooling issue. It’s a fundamental property of causal systems. Every attribution model—from last-click to multi-touch to data-driven—is just an approximation that tries (and usually fails) to bridge it.

Most stacks don’t acknowledge this. They just hide it behind weightings.
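To see how wide that gap can get, here’s a small, purely illustrative simulation in Python (every number is made up): high-intent users are both more likely to be targeted and more likely to convert anyway, so the naive observed lift, P(Y∣X=1) minus P(Y∣X=0), lands well above the true incremental effect.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Hidden confounder: user intent. High-intent users are both more likely
# to be targeted by the campaign and more likely to convert anyway.
high_intent = rng.random(n) < 0.3

# Observational world: exposure depends on intent (targeting, retargeting, etc.).
p_exposed = np.where(high_intent, 0.8, 0.2)
exposed = rng.random(n) < p_exposed

# True causal effect of exposure: +5 percentage points, regardless of intent.
base_rate = np.where(high_intent, 0.20, 0.02)
converted = rng.random(n) < (base_rate + 0.05 * exposed)

# P(Y | X): what attribution models read off observational data.
observed_lift = converted[exposed].mean() - converted[~exposed].mean()
print(f"Observed lift: {observed_lift:.3f}")  # ~0.15, roughly three times the true effect

# P(Y | do(X)): force exposure on or off independently of intent (an intervention).
converted_do_x = rng.random(n) < (base_rate + 0.05)
converted_do_not_x = rng.random(n) < base_rate
print(f"Causal lift:   {converted_do_x.mean() - converted_do_not_x.mean():.3f}")  # ~0.05
```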

Let’s pause here for a quick step back.

Where does this idea of do(X) even come from?

It comes from the field of causal inference, particularly the work of Judea Pearl. In simple terms, observational data tells us what did happen. But causal questions ask us to imagine what could have happened if we had changed one variable in isolation.

Attribution tools try to answer causal questions with observational data. But without randomized experiments (where we deliberately control exposure), we can’t truly isolate cause and effect. That’s the challenge. It’s not a flaw in the tool—it’s a limitation of the data and the system design behind it.
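The practical way around that limitation is to create interventional data deliberately. Here is a sketch of what that looks like, again with hypothetical numbers: a randomized holdout, where exposure is assigned by coin flip, so a simple difference in conversion rates estimates the incremental effect without having to model intent at all.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

# Randomized holdout: exposure is assigned by coin flip, so it is
# independent of intent, history, or any other confounder.
treated = rng.random(n) < 0.5

# User heterogeneity still exists and is still unobserved...
high_intent = rng.random(n) < 0.3
base_rate = np.where(high_intent, 0.20, 0.02)

# ...but because assignment was random, the simple difference in conversion
# rates is an unbiased estimate of the incremental (interventional) effect.
converted = rng.random(n) < (base_rate + 0.05 * treated)
lift = converted[treated].mean() - converted[~treated].mean()
print(f"Estimated incremental lift: {lift:.3f}")  # ~0.05
```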

Why Attribution Breaks in Practice

This causal limitation doesn’t operate in isolation. It compounds with three structural problems that show up almost everywhere.

First, identity resolution is incomplete or implicit.

Attribution assumes a stable, unified notion of “user.” But in practice, the concept of “user” is fractured.

Marketing sees cookies and device IDs. Product sees accounts and logged-in sessions. CRM sees emails and customer records. Paid media works with yet another layer of identifiers, sometimes probabilistic.

So when we ask, “Are these interactions from the same person?” we often can’t say for sure.

When identity is weak, journeys fragment. Attribution windows collapse. Cross-channel analysis becomes fiction. This isn’t an edge case—it’s the default state.
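To make the fragmentation concrete, here’s a toy sketch (hypothetical identifiers, deterministic matching only) that stitches records into “users” by shared identifiers. One missing link, like a new device that never logs in, and a single journey silently splits in two.

```python
from collections import defaultdict

# Toy event records from different systems, each carrying whatever
# identifiers that system happens to know (all values are hypothetical).
records = [
    {"source": "ads",     "cookie": "ck_123"},
    {"source": "product", "cookie": "ck_123", "account": "acct_9"},
    {"source": "product", "account": "acct_9", "email": "ana@example.com"},
    {"source": "crm",     "email": "ana@example.com"},
    {"source": "ads",     "cookie": "ck_456"},  # same person on a new device, but nothing links it
]

# Deterministic stitching: records that share any identifier value are
# merged into one "user" with a simple union-find.
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

for i, record in enumerate(records):
    for key in ("cookie", "account", "email"):
        if key in record:
            union(f"rec:{i}", f"{key}:{record[key]}")

users = defaultdict(list)
for i in range(len(records)):
    users[find(f"rec:{i}")].append(i)

print(list(users.values()))  # [[0, 1, 2, 3], [4]] -- one journey shows up as two "users"
```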

Second, event data is not governed.

Attribution assumes events are cleanly defined and consistently tracked. But in the real world, events are renamed, repurposed, re-instrumented. Properties drift. Marketing and product often emit different versions of the same truth.

Without governance, attribution becomes non-reproducible. Historical comparisons fall apart. And models end up optimizing noise. No model can solve for undefined or unstable data.
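A lightweight guardrail helps here, though what follows is only a sketch of the idea, not a governance framework: declare a tracking plan and validate events against it before they feed attribution. The event names and properties below are hypothetical.

```python
# A minimal, illustrative tracking-plan check: events must match a declared
# schema before attribution trusts them (event names and fields are hypothetical).
TRACKING_PLAN = {
    "signup_completed": {
        "required": {"user_id", "plan"},
        "allowed": {"user_id", "plan", "referrer"},
    },
    "campaign_click": {
        "required": {"anonymous_id", "campaign_id"},
        "allowed": {"anonymous_id", "campaign_id", "channel"},
    },
}

def validate_event(name: str, properties: dict) -> list[str]:
    """Return a list of governance violations for a single event."""
    spec = TRACKING_PLAN.get(name)
    if spec is None:
        return [f"unknown event '{name}' (renamed or never declared?)"]
    errors = []
    missing = spec["required"] - set(properties)
    unexpected = set(properties) - spec["allowed"]
    if missing:
        errors.append(f"'{name}' is missing required properties: {sorted(missing)}")
    if unexpected:
        errors.append(f"'{name}' carries undeclared properties: {sorted(unexpected)}")
    return errors

# Drift in action: one codebase quietly renamed 'plan' to 'plan_type'.
print(validate_event("signup_completed", {"user_id": "u_1", "plan_type": "pro"}))
```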

Third, marketing and product optimize different realities.

Marketing looks at reach, clicks, CPA, channel performance. Product looks at activation, retention, LTV, feature usage. Attribution sits between them—but is usually owned by neither.

The result? Marketing reports success that product never sees. Product sees drop-offs that marketing can’t explain. Attribution becomes less of a learning tool, more of a political artifact.

This organizational mismatch is as damaging as any technical flaw.

The Real Attribution Failure Mode

Put it all together, and attribution doesn’t fail because someone picked the wrong model. It fails because it can’t reliably answer four foundational questions:

  1. Who is this user, really? (identity resolution)

  2. What actually happened, and in what order? (event quality and sequencing)

  3. Which actions were merely correlated, and which plausibly caused change? (the observational vs. interventional gap)

  4. What outcome actually matters for the business? (product truth vs. channel metrics)

If any one of these breaks down, attribution becomes credit assignment theater.

What Comes Next

This piece stops at the problem, by design. Attribution is often framed as a modeling decision, but the real failure modes are upstream—in how we handle identity, data, organizational boundaries, and the limits of observation.

In the next blog posts, we’ll look at how teams operate inside those constraints: reducing bias in practice, understanding the difference between owned and paid behavior, and using product analytics tools not to “solve” attribution, but to reason about it more honestly.

Seen Your Attribution Break Like This?

If any of this sounds familiar — the fragmented identities, the event chaos, the disconnect between what marketing celebrates and what product sees — you’re not alone.

We help teams build attribution systems that reflect how growth actually works. If you need help shaping your strategy or just want to sanity-check what you’re seeing, get in touch with us. We’d love to hear your story.