Clearbit's marketing attribution model: how we built a single view of our customers
Here at Clearbit, we measure the dollar impact of our marketing activities with a single-source attribution model, which was built by Julie Beynon, our Head of Analytics. But before she joined, we often found it difficult to see which of our campaigns drove the most revenue and leads.
Our earliest analytics tools never quite captured the full picture. Our data was scattered across various marketing channels and across Clearbit's suite of free and paid products. We couldn't report on it in a standardized way to see where our revenue was really coming from. It was tough to know what worked, what we should improve, and where to reinvest our money in the future.
Today, Julie's model is the source of truth that shows us which campaigns drove the most leads, marketing qualified leads (MQLs), sales qualified leads (SQLs), and subscriptions. Our system also helps our marketers do intent scoring for leads, and enables our sales and success teams to zoom in on individual customer journeys in Salesforce. This lets them have successful conversations, because they can quickly see a customer's history with Clearbit and their converting touchpoints, like which of our ebooks they've read, which webinars they've joined, and how often they use our free products.
The model combines data from more than 10 different sources, ranging from ads to email to event attendance. In essence, it has the capacity to capture any lead behavior in any way we want it. But all this incredible data is less fun if it's locked up in a data warehouse where only a few analysts can access it.
Julie's main objective was to bring answers and information to the Clearbit teams who needed it, regardless of their technical skill.
To achieve this, the model pushes behavioral data from our database into our everyday sales and marketing tools, like Salesforce. In those platforms, it's easy for more people to access behavioral intelligence without relying on Julie's SQL (Structured Query Language) wizardry, which means they have the data to do their jobs better, whether they're in sales, CS, or marketing. This often means a better experience for leads, too, because the emails, ads, and outreach they receive from Clearbit are more personalized.
That's the beauty of the model, if you ask us: it's powerful enough to handle incredibly complex data inputs at scale, but can simplify them and weave them into the fabric of everyday life at Clearbit.
Read on to:
- learn what types of lead and customer behavior we capture
- see the attribution touchpoint journey view in Salesforce
- understand the data pipeline architecture underpinning the attribution model
- see how we define events using Redshift and dbt
- explore the final attribution model and how we use it every day
Capturing behaviors along the buyer's journey and customer lifecycle
The model is able to pick up a lead's (or customer's) behaviors across virtually any channel or Clearbit product — ranging from our Clearbit Enrichment feature to email interactions to clearbit.com website analytics (which also uses Clearbit Reveal to identify visiting companies).
Because our model is custom-built, we have a lot of flexibility in how we gather information. Put simply, it gives us the most comprehensive view of a lead's journey we've ever had.
Showing a lead's journey in Salesforce
Before we share how the model is built, let's look at one of the end results: what shows up in Salesforce.
When a teammate clicks on a contact in Salesforce, Salesforce breaks out a detailed list of all the marketing channels and content that the contact has interacted with. It also marks the converting touchpoints for lead, MQL, and SQL.
For example, imagine you're a Clearbit salesperson about to hop on a phone call with a lead, and you quickly need to do some research on them. You might see that they entered the Clearbit universe on September 24, 2020, and subsequently, they:
- attended a webinar via paid social, which was the converting touchpoint for both lead and MQL, but then went cold
- returned to engage with Clearbit emails for a few months, then opened our "demo request" page and asked to speak to Sales
- read a few more marketing emails from us and checked out our TAM Calculator
- visited a few Clearbit website pages directly, like our Prospector product page
This granularity makes conversations more efficient and tailored. And for outreach at scale, this information is also available in the email tool we use at Clearbit, Customer.io, so that we can personalize our email campaigns.
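To make the journey view concrete, here is a minimal Python sketch of how converting touchpoints could be flagged. It is not Clearbit's implementation: the `Touchpoint` class, the `flag_converting_touchpoints` function, and the rule that the converting touchpoint is the most recent touchpoint at or before the moment a stage was reached are all assumptions for illustration. Only the September 24, 2020 start date comes from the example above; the other timestamps are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class Touchpoint:
    occurred_at: datetime
    channel: str  # e.g. "paid social", "email", "direct"
    event: str    # e.g. "webinar attended"
    converted: List[str] = field(default_factory=list)  # stages this touchpoint converted


def flag_converting_touchpoints(
    touchpoints: List[Touchpoint],
    stage_reached_at: Dict[str, datetime],
) -> List[Touchpoint]:
    """Return the journey in time order, tagging one touchpoint per stage.

    Assumption: the converting touchpoint for a stage is the most recent
    touchpoint at or before the moment the contact reached that stage.
    Touchpoints are tagged in place.
    """
    journey = sorted(touchpoints, key=lambda t: t.occurred_at)
    for stage, reached_at in stage_reached_at.items():
        candidates = [t for t in journey if t.occurred_at <= reached_at]
        if candidates:
            candidates[-1].converted.append(stage)
    return journey


# Illustrative data only.
journey = flag_converting_touchpoints(
    [
        Touchpoint(datetime(2020, 12, 1, 9), "email", "demo requested"),
        Touchpoint(datetime(2020, 9, 24, 15), "paid social", "webinar attended"),
        Touchpoint(datetime(2020, 11, 3, 10), "email", "email clicked"),
    ],
    {
        "lead": datetime(2020, 9, 24, 15),
        "MQL": datetime(2020, 9, 24, 16),
        "SQL": datetime(2020, 12, 1, 9),
    },
)
for t in journey:
    print(t.occurred_at.date(), t.channel, t.event, t.converted)
```

In this sketch a single touchpoint can convert more than one stage, which matches the webinar in the example above counting for both lead and MQL.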
Here's another cool bit of data we can use: once a lead becomes a customer and starts to use Clearbit products, the model will track their usage and show whether an account is hitting its subscription quotas — or, conversely, if engagement is lagging. This allows sales and success reps to reach out proactively, prevent churn, and better monitor our customers' usage.
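As a rough illustration of that quota check, the Python sketch below classifies an account by the share of its monthly quota it has used. The `AccountUsage` fields and the 25% and 90% cut-offs are hypothetical placeholders, not Clearbit's actual schema or thresholds.

```python
from dataclasses import dataclass


@dataclass
class AccountUsage:
    account: str
    monthly_quota: int     # usage included in the contract (hypothetical field)
    used_this_month: int   # usage recorded so far this month (hypothetical field)


def usage_status(usage: AccountUsage, low: float = 0.25, high: float = 0.9) -> str:
    """Classify an account by the share of its monthly quota used so far.

    The default cut-offs are placeholders chosen for this example.
    """
    if usage.monthly_quota <= 0:
        return "no quota"
    share = usage.used_this_month / usage.monthly_quota
    if share >= high:
        return "near or over quota"
    if share < low:
        return "lagging"
    return "on track"


accounts = [
    AccountUsage("acme", 1000, 950),
    AccountUsage("globex", 1000, 100),
    AccountUsage("initech", 1000, 500),
]
statuses = {a.account: usage_status(a) for a in accounts}
print(statuses)
```

A "lagging" status could prompt a re-engagement email, while "near or over quota" could prompt a conversation about upgrading.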
The data pipeline architecture
Now, let's look at the machinery under the hood. When Julie set out to create the model two years ago, she couldn't find one tool that met all of Clearbit's needs perfectly; it was difficult to track all behaviors across all Clearbit products and marketing channels in a standardized way.
So, she custom-built the model instead, connecting a number of tools around Redshift as the data warehouse:
- Data warehouse: Our data warehouse, Redshift, holds all the raw data about user interactions.
- Data collection: Segment sends behavioral data from our website, products, and marketing channels to Redshift, while Stitch loads behavioral and other data from Salesforce and Customer.io.
- Data transformation: dbt sits on top of Redshift and does the hard work of transforming the raw data. It combines all data sources into one attribution model table.
- Making data available and useful: Clearbit teammates can query that table (we use Mode), while Census pushes the data into destination tools like Salesforce and Customer.io. Census also handles field mapping to ensure that data is formatted properly to sync to those tools.
The attribution model is connected to a separate tech stack for lead qualification, owned by the Clearbit RevOps team — which is centered around Salesforce, and uses tools like MadKudu to score new leads and LeanData to route them.
Defining user interactions in Redshift and dbt
As Julie set up the model in dbt, she had conversations with our marketers to figure out what types of personalized campaigns they wanted to run, and which user behaviors they needed to track to achieve this (for example, tracking webinar attendance). Then, she created a SQL file for each user event in dbt. Examples include "webinar registered," "webinar attended," "eBook form submitted," and "blog signup." For each type of event, there is a table in Redshift and a corresponding model in dbt; right now, we track around 30 events.
Segment feeds in events as they occur in the real world, and dbt transforms all that data and combines all touchpoint files into one nice, clean table — our master attribution model table.
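In dbt, this combining step would be written in SQL. The Python sketch below only illustrates the shape of the idea: each event table is mapped onto a shared touchpoint schema, and the results are concatenated (the equivalent of a UNION ALL). The column names, sample rows, and helper functions are hypothetical.

```python
from datetime import datetime

# Hypothetical per-event rows. Each event type has its own columns,
# the way each event has its own table in the warehouse.
webinar_attended = [
    {"email": "ada@example.com", "attended_at": datetime(2020, 9, 24, 15), "webinar": "ABM 101"},
]
ebook_form_submitted = [
    {"email": "ada@example.com", "submitted_at": datetime(2020, 10, 2, 11), "ebook": "Data Guide"},
]


def standardize(rows, event, time_column, detail_column):
    """Map one event table onto the shared touchpoint schema."""
    return [
        {
            "email": row["email"],
            "event": event,
            "occurred_at": row[time_column],
            "detail": row[detail_column],
        }
        for row in rows
    ]


def build_touchpoint_table(*event_tables):
    """Concatenate standardized event tables and sort by user, then time."""
    combined = [row for table in event_tables for row in table]
    return sorted(combined, key=lambda r: (r["email"], r["occurred_at"]))


touchpoints = build_touchpoint_table(
    standardize(ebook_form_submitted, "eBook form submitted", "submitted_at", "ebook"),
    standardize(webinar_attended, "webinar attended", "attended_at", "webinar"),
)
for row in touchpoints:
    print(row["email"], row["occurred_at"], row["event"], row["detail"])
```

Under this design, tracking a new kind of touchpoint means adding one more `standardize(...)` call, which mirrors adding one more small SQL file in dbt.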
When we want to track new touchpoints, Julie adds new SQL files in dbt for them. It just takes a few lines of code, so it's a relatively painless operation. Plus, we can define user events in dbt exactly how we want them — a huge advantage of a custom-built solution.
The final attribution model and how we use it
Our final attribution model table shows everything that's happened in a user's journey, flagging their converting touchpoints and tracking their subscription revenue. It also ties in firmographic and product usage data to tell us whether a lead is an MQL, a product qualified lead (PQL), or a match for our ideal customer profile (ICP). The MQL/PQL distinction guides sales outreach by indicating a lead's purchase or sales readiness, based on whether they've had more interactions with our products or with our emails, eBooks, and events. We can also query the table to see which marketing channels were most effective at driving MQLs, PQLs, and revenue in aggregate.
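The aggregate channel question can be pictured as a simple group-and-count over converting touchpoints. The Python sketch below assumes a hypothetical slice of the table with one row per converting touchpoint; the row shape, channel names, and `conversions_by_channel` function are made up for illustration, and in practice this would be a SQL query against the warehouse.

```python
from collections import Counter, defaultdict

# Hypothetical slice of the attribution table: one row per converting touchpoint.
attribution_rows = [
    {"email": "a@example.com", "channel": "paid social", "stage": "MQL"},
    {"email": "b@example.com", "channel": "paid social", "stage": "MQL"},
    {"email": "c@example.com", "channel": "email", "stage": "MQL"},
    {"email": "d@example.com", "channel": "organic search", "stage": "PQL"},
    {"email": "e@example.com", "channel": "email", "stage": "PQL"},
]


def conversions_by_channel(rows):
    """Count converting touchpoints per channel, grouped by stage."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["stage"]][row["channel"]] += 1
    return counts


report = conversions_by_channel(attribution_rows)
for stage, channels in report.items():
    print(stage, channels.most_common())
```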
This attribution model has enabled us to start building an intent scoring model in dbt, which gives each user a trailing 30-day engagement score to indicate how much they've interacted with Clearbit in the last month. Armed with this information, we can see the percentage of our target audience that has been active lately, and how that's changing over time with our marketing and sales efforts. We can also start acting on it, as this scoring data can be pulled into tools like Salesforce and Customer.io via Census.
At a high level, each event is assigned a point value (for example, an email click earns the lead 1 point, while an ebook download earns 13). These point values are somewhat arbitrary, because we're still on the first version of this model. We'll validate its performance, and then reassign points to each behavior so the next generation becomes more accurate. If we decide a "blog post read" is more valuable than we originally thought, we can change its point value, and it'll update the historical scores for leads who've read blog posts. The original record of behavioral data doesn't change, but the scoring model layers over it and allows us to re-weight behaviors over time. (Stay tuned!)
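A minimal Python sketch of this kind of scoring follows. The email click (1 point) and ebook download (13 points) values come from the text above; the "blog post read" values, the sample events, and the `engagement_score` function are assumptions for illustration. Because the point table is kept separate from the event log, changing a weight re-scores history without touching the recorded behavior.

```python
from datetime import datetime, timedelta

# Version 1 weights. The "blog post read" value is a made-up placeholder.
POINTS_V1 = {"email click": 1, "ebook download": 13, "blog post read": 2}


def engagement_score(events, as_of, points, window_days=30):
    """Sum the points for events inside the trailing window ending at `as_of`."""
    window_start = as_of - timedelta(days=window_days)
    return sum(
        points.get(name, 0)
        for name, occurred_at in events
        if window_start < occurred_at <= as_of
    )


# The raw event log never changes; only the weights layered over it do.
events = [
    ("email click", datetime(2021, 3, 1)),
    ("blog post read", datetime(2021, 3, 10)),
    ("ebook download", datetime(2021, 3, 20)),
    ("email click", datetime(2021, 1, 5)),  # older than 30 days, so ignored
]
as_of = datetime(2021, 3, 25)

score_v1 = engagement_score(events, as_of, POINTS_V1)

# Re-weight blog reads and recompute from the same events.
POINTS_V2 = {**POINTS_V1, "blog post read": 5}
score_v2 = engagement_score(events, as_of, POINTS_V2)

print(score_v1, score_v2)  # 16 19
```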
How Clearbit teammates use our model every day
We take advantage of attribution data in many ways, but our favorite is that teammates can see rich data about marketing interactions, subscriptions, and product usage. It's instantly available in plain English, with no need to speak SQL/Klingon.
- Full revenue and subscription reporting. When Julie joined Clearbit, we were using a subscription table in Redshift that only showed the current value of customer subscriptions, not past value. Now, we can use dbt's archiving feature to save subscription charges and changes, enabling us to see subscription revenue over time.
- Our growth team can see which marketing campaigns generated the most leads, MQLs, or SQLs during the previous quarter, so we can continue to invest in the ones that are working and improve, stop, or rethink the ones that aren't.
- Anyone who knows how to run a Salesforce or Mode report can get instant insights or create a dashboard.
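The subscription history described in the first bullet can be pictured as a table of versions, each with a validity range. dbt's snapshot feature (the successor to archives) records ranges like these in `dbt_valid_from` and `dbt_valid_to` columns. The Python sketch below uses simplified, hypothetical field names and amounts to show how revenue at any past date could be recovered from such a history.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class SubscriptionVersion(NamedTuple):
    account: str
    monthly_amount: int        # dollars per month (illustrative values)
    valid_from: date
    valid_to: Optional[date]   # None means this version is still current


def revenue_as_of(history: List[SubscriptionVersion], day: date) -> int:
    """Sum the monthly amount of every version that was in effect on `day`."""
    return sum(
        v.monthly_amount
        for v in history
        if v.valid_from <= day and (v.valid_to is None or day < v.valid_to)
    )


history = [
    SubscriptionVersion("acme", 500, date(2020, 1, 1), date(2020, 7, 1)),
    SubscriptionVersion("acme", 900, date(2020, 7, 1), None),  # upgrade in July
    SubscriptionVersion("globex", 300, date(2020, 3, 15), None),
]

print(revenue_as_of(history, date(2020, 2, 1)))  # 500
print(revenue_as_of(history, date(2020, 4, 1)))  # 800
print(revenue_as_of(history, date(2020, 8, 1)))  # 1200
```

A table that stores only the current subscription value could answer the last question but not the first two, which is the gap the saved history closes.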
Understanding individual journeys:
- A salesperson can tailor a phone conversation by looking at a lead's previous interactions with Clearbit right in Salesforce. They might see, for example, that the lead has gobbled up all our blog posts about site personalization — in which case, the salesperson can bring up Clearbit's Reveal product in conversation, which helps with personalization. Or if the lead attended a webinar, we can send them a related ebook over email to keep nurturing them.
- Our ABM efforts are supported by granular interaction data. For example, when looking at a lead, we can also check out other contacts at the same account to see how the lead's colleagues have been interacting with Clearbit. It's a simple matter to discover what resonates, understand account dynamics, and see whether Clearbit is making an impression on a buying committee.
- We can predict churn risk when someone isn't using Clearbit enough each month relative to their contract's quota, and send an email to re-engage them.
- We can trigger highly personalized, contextual emails all along the lifecycle.
- Our treasure trove of attribution data allowed us to easily fix a mistake we'd been making before: our emails sometimes treated the wrong hand-raisers as if they were "new" to the Clearbit universe — even though they'd been using our free Clearbit Connect tool for years. That made no sense, so today our model helps us keep track of who is net-new and who isn't. By passing this information on to our email tool, we can make sure we're sending relevant messages.
Today, our single-view attribution model helps us better understand our users and get smarter about where we spend resources. Just as our marketing activities produce behavioral data, so will our attribution data inform our next marketing moves — a continuous loop that results in a smarter system where everyone at Clearbit is empowered to create great experiences for our customers.