Audiences Testing & Experimentation Guide
How to test and measure Audiences the right way
This guide will help you get accurate results when testing Freshpaint Audiences and avoid the most common testing mistakes that lead to misleading conclusions.
First: what Freshpaint Audiences actually does
Freshpaint Audiences isn’t meant to replace Meta or Google’s native targeting.
Think of it like this:
Freshpaint helps you identify the right people to target using your first-party data and pass those identifiers to Meta and Google in a compliant manner
Meta & Google handle ad delivery and optimization using their in-platform machine learning
Freshpaint gives those platforms better, cleaner, compliant inputs. It’s not designed to out-optimize their delivery engines.
Because of that, direct 50/50 A/B tests between audiences in Freshpaint and platform-native audiences can be misleading.
The right way to frame success
Instead of asking:
“Did this Freshpaint audience beat our Meta or Google audience?”
Look at:
Are we now able to run compliant retargeting we couldn’t safely run before?
Are we reducing wasted spend by excluding existing members or patients?
Does overall campaign ROI improve when Freshpaint audiences are layered into your strategy?
Are our lookalike seeds higher quality?
Audiences works best as part of your overall acquisition strategy, not as a standalone ad set.
Why audiences in Freshpaint can look “smaller”
Freshpaint audiences are often:
More precise
Higher intent
Built from real first-party behavioral or CRM/EHR data
Smaller, higher-intent audiences naturally:
Have higher frequency
Can look more expensive in isolation
Convert better per person
This is expected and often a sign of quality, not a problem.
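To make that concrete, here is a worked example with entirely hypothetical numbers (the spend, CPM, reach, and conversion counts are assumptions chosen for illustration, not benchmarks). The smaller audience shows higher frequency and a higher ad-set CPA, yet converts several times better per person reached:

```python
# Hypothetical numbers for illustration only -- not benchmarks.
CPM = 25  # assumed cost per 1,000 impressions, identical for both audiences

audiences = {
    "Broad native audience": {"spend": 10_000, "reach": 200_000, "conversions": 100},
    "Smaller Freshpaint audience": {"spend": 10_000, "reach": 50_000, "conversions": 60},
}

for name, a in audiences.items():
    impressions = a["spend"] / CPM * 1_000
    frequency = impressions / a["reach"]        # avg. times each person is reached
    cpa = a["spend"] / a["conversions"]         # cost per acquisition for this ad set
    conv_rate = a["conversions"] / a["reach"]   # conversions per person reached
    print(f"{name}: frequency {frequency:.1f}, CPA ${cpa:,.0f}, conv. rate {conv_rate:.2%}")

# Broad native audience: frequency 2.0, CPA $100, conv. rate 0.05%
# Smaller Freshpaint audience: frequency 8.0, CPA $167, conv. rate 0.12%
```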
What to test instead (better experiments)
Rather than testing:
A Freshpaint audience vs. a native Meta or Google audience, head-to-head
Focus on:
Total campaign CPA before and after exclusions
Overall ROI when Freshpaint audiences are layered in
Changes in call center or lead handling volume
Lookalike quality when seeded with audiences in Freshpaint
You’re measuring system impact, not ad-set performance in a vacuum.
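For the first item on that list, a before/after comparison of total campaign CPA might look like the sketch below. Every figure is a hypothetical assumption, including the simplification that spend freed up by exclusions is re-spent at the campaign’s marginal CPA:

```python
# Illustrative before/after comparison for an exclusion test.
# All figures are hypothetical assumptions, not Freshpaint benchmarks.

# Before: part of the budget is spent reaching existing patients/members.
before_spend = 50_000
before_conversions = 400          # new patients/members acquired
wasted_spend = 8_000              # estimated spend on people you already serve

# After: exclusions suppress that audience, and the freed budget reaches
# net-new people at roughly the campaign's marginal CPA (an assumption).
after_spend = 50_000
marginal_cpa = 150
after_conversions = before_conversions + wasted_spend / marginal_cpa

print(f"Blended CPA before exclusions: ${before_spend / before_conversions:,.2f}")
print(f"Blended CPA after exclusions:  ${after_spend / after_conversions:,.2f}")
# Blended CPA before exclusions: $125.00
# Blended CPA after exclusions:  $110.29
```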
How to Run a Valid Audiences Experiment (Overview)
Pick one clear problem to fix
Wasted spend → Exclusions
Scaling efficiently → Lookalikes
Lost demand → Retargeting
Layer, don’t replace
Keep your existing native campaigns running
Add Freshpaint audiences into the strategy
Measure blended performance
Look at overall CPA, ROI, and operational impact
Not just which ad set “won”
Let campaigns normalize
You’re changing who you reach
Platform learning needs time to adjust
How to Run a Valid Audiences Experiment (step-by-step)
This section walks through how to design, run, and evaluate a Freshpaint Audiences test so you get a real, accurate read on impact and avoid false negatives.
Step 1: Define Your Control and Your Test
Before you touch any campaigns, decide:
Control (your baseline)
Your existing campaign setup that does not use any audiences built in Freshpaint. This should reflect how you’re running acquisition today.
Examples:
Native Meta / Google audiences
Any current targeting and exclusions
Your normal budget and bidding strategy
This is your “business as usual” benchmark.
Test (your Freshpaint layer)
Your Test should use the exact same campaign structure as your Control, with one change:
Audiences built in Freshpaint are layered into the targeting.
What stays the same:
Budget
Creative
Bidding strategy
Geo
Conversion goals
What changes:
Add Freshpaint exclusions, retargeting, or lookalikes based on the use case you’re testing
This isolates Freshpaint’s impact.
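One way to keep yourself honest about “same structure, one change” is to write both arms down explicitly. The sketch below is illustrative only; the field names, audience names, and values are assumptions, not a Freshpaint or ad-platform API:

```python
from dataclasses import dataclass, replace
from typing import Optional

# Illustrative description of a campaign arm. Field and audience names are
# hypothetical -- this is not a Freshpaint or ad-platform API.
@dataclass(frozen=True)
class CampaignArm:
    name: str
    daily_budget: float
    creative_set: str
    bidding_strategy: str
    geo: str
    conversion_goal: str
    freshpaint_exclusion: Optional[str] = None     # e.g. an "existing patients" audience
    freshpaint_retargeting: Optional[str] = None
    freshpaint_lookalike_seed: Optional[str] = None

control = CampaignArm(
    name="control",
    daily_budget=500.0,
    creative_set="evergreen-video-v2",
    bidding_strategy="lowest-cost",
    geo="US-Northeast",
    conversion_goal="appointment_booked",
)

# The Test arm copies every Control setting and changes exactly one thing:
# the Freshpaint layer for the use case under test (exclusions, in this example).
test = replace(control, name="test", freshpaint_exclusion="existing-patients")
```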
Step 2: Split Traffic Cleanly
You must ensure that the Control and Test arms are not competing with each other for the same users.
Use one of the following:
Platform experiments (recommended)
Geo splits (two similar markets)
Time-based splits (if geo splitting isn’t possible)
Goal: Control and Test should receive equivalent traffic and opportunity.
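If you go the geo-split route, it’s worth confirming that the two markets really did perform comparably before the test. A minimal sketch, assuming you can pull recent weekly conversion counts for each market (the data and the 10% tolerance are assumptions, not official thresholds):

```python
# Minimal parity check for a geo split (hypothetical data).
control_geo_weekly_conversions = [118, 124, 131, 120, 127, 122]
test_geo_weekly_conversions = [121, 119, 128, 125, 130, 118]

def weekly_average(values: list[int]) -> float:
    return sum(values) / len(values)

control_avg = weekly_average(control_geo_weekly_conversions)
test_avg = weekly_average(test_geo_weekly_conversions)
gap = abs(control_avg - test_avg) / control_avg

if gap <= 0.10:
    print(f"Markets look comparable ({gap:.1%} gap in baseline conversions).")
else:
    print(f"Markets differ by {gap:.1%}; consider a platform experiment instead.")
```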
Step 3: Run the Test Long Enough
Audiences changes who you’re reaching, which means platform learning needs time to stabilize.
Minimum guidance:
Run at least 4–6 weeks
Or until you’ve seen a meaningful volume of conversions in both arms
Avoid drawing conclusions in the first 1–2 weeks.
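A simple way to enforce both conditions is to gate your analysis on them. The week threshold mirrors the guidance above; the minimum conversions per arm is an assumption you should set based on your own volume:

```python
# Gate analysis on both duration and conversion volume (thresholds are illustrative).
MIN_WEEKS = 4
MIN_CONVERSIONS_PER_ARM = 100   # assumption -- choose a number meaningful for your volume

def ready_to_evaluate(weeks_elapsed: int, control_conversions: int, test_conversions: int) -> bool:
    enough_time = weeks_elapsed >= MIN_WEEKS
    enough_volume = min(control_conversions, test_conversions) >= MIN_CONVERSIONS_PER_ARM
    return enough_time and enough_volume

print(ready_to_evaluate(weeks_elapsed=2, control_conversions=90, test_conversions=70))    # False: too early
print(ready_to_evaluate(weeks_elapsed=5, control_conversions=180, test_conversions=140))  # True
```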
Step 4: What Metrics to Use (and what to ignore)
Do not evaluate based on:
Which ad set “won”
Single-ad-set CPA in isolation
Early learning-phase metrics
Instead, track:
Overall CPA (blended)
Total conversions
Conversion quality
Waste reduction (suppressed traffic, reduced internal calls, etc.)
Lookalike expansion efficiency (if applicable)
You’re evaluating system lift, not ad-set competition.
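In practice, that means rolling every ad set in each arm up to one blended number. The sketch below uses hypothetical spend and conversion figures; notice that the Freshpaint retargeting ad set looks more expensive than the native one in isolation, yet the Test arm’s blended CPA is lower:

```python
# Blended CPA across every ad set in each arm (all figures are hypothetical).
control_ad_sets = [
    {"name": "prospecting-broad", "spend": 30_000, "conversions": 180},
    {"name": "retargeting-native", "spend": 10_000, "conversions": 120},
]
test_ad_sets = [
    {"name": "prospecting-broad", "spend": 28_000, "conversions": 210},   # exclusions cut waste
    {"name": "retargeting-freshpaint", "spend": 12_000, "conversions": 110},
]

def blended_cpa(ad_sets: list[dict]) -> float:
    total_spend = sum(a["spend"] for a in ad_sets)
    total_conversions = sum(a["conversions"] for a in ad_sets)
    return total_spend / total_conversions

print(f"Control blended CPA: ${blended_cpa(control_ad_sets):,.2f}")   # $133.33
print(f"Test blended CPA:    ${blended_cpa(test_ad_sets):,.2f}")      # $125.00
```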
Step 5: How to Read Results
Ask:
Did overall CPA improve?
Did total ROI improve?
Did we reduce wasted or redundant spend?
Did conversion quality improve?
Did we unlock compliant strategies we couldn’t run before?
If yes → Audiences is working, even if one individual ad set looks “more expensive.”
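To check the ROI question alongside CPA, you can extend the same blended view with a revenue assumption. The revenue per conversion below is a placeholder; substitute your own visit or lifetime value:

```python
# Blended ROI comparison (hypothetical figures; revenue per conversion is an assumption).
REVENUE_PER_CONVERSION = 600

arms = {
    "control": {"spend": 40_000, "conversions": 300},
    "test": {"spend": 40_000, "conversions": 320},
}

for name, arm in arms.items():
    revenue = arm["conversions"] * REVENUE_PER_CONVERSION
    roi = (revenue - arm["spend"]) / arm["spend"]
    print(f"{name}: blended ROI {roi:.0%}")

# control: blended ROI 350%
# test: blended ROI 380%
```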
Step 6: Expand Beyond a Single Use Case
The strongest results don’t come from one audience.
After your first test:
Start with exclusions
Add lookalikes
Then layer retargeting
Build a system over time
This is how teams see consistent ROI, and why single-use-case tests often understate the value.
Step 7: When to Iterate
Only change one variable at a time:
Add a new Freshpaint audience
Or expand into a new use case
Then repeat the same structure: Control vs Test → Run → Measure blended impact → Expand
Why testing only one use case often understates the value
Teams that only run:
One audience
One campaign
One short test
Often miss the bigger value of Audiences.
Teams that see the strongest results usually implement some combination of:
Start with exclusions
Add lookalikes
Then layer retargeting
Build a system over time
Common testing questions (and what they usually mean)
| What you might notice | What’s happening | How to think about it |
| --- | --- | --- |
| CPAs look higher | Audiences are more precise | Look at blended ROI |
| Audiences are smaller | They’re higher intent | Pair with native reach |
| A/B tests look worse | Audiences were treated as replacements | Evaluate blended performance |
What good testing looks like
Good tests answer:
Did wasted spend go down?
Did overall ROI improve?
Did conversion quality improve?
Did Audiences unlock strategies you couldn’t safely run before?
Not:
Which ad set “won.”