Forget A/B Testing, Do This Instead

Siim Pettai

Retention marketer for eCommerce brands

Imagine this: you’re in charge of email marketing for an 8-figure eCom brand.

Your goal is to optimize channel revenue, so you set up an A/B test around call-to-action button placement.

One campaign has a CTA button above the fold (in the hero section).

The other campaign has a CTA button at the bottom.

Variation A results in a 3.25% click rate, $6,500 revenue.

Variation B results in a 2.50% click rate, $5,000 revenue.

You gasp.

Did you just find a “winner”?

But more importantly…

Have you been missing out on tens of thousands of dollars in revenue because your previous campaigns never had a CTA button above the fold?

Let me explain … 

These A/B tests tend to look good on the surface, but there’s a fundamental flaw in how they’re measured.

3 Problems With Email A/B Testing

1) You Don’t Actually Know The Impact Of These Tests

While your A/B test may seem like it resulted in more revenue, you don’t actually know whether the lift was random or caused by the variable you changed. You can only validate the incremental revenue by analyzing your sales data in Shopify (and that requires some manual work).

2) Most A/B Tests Never Reach Statistical Significance 

Most brands just don’t have the audience size or transactional volume to run valid A/B tests. Your A/B test needs to run long enough (or collect enough data) to achieve statistical significance.

If you split test a campaign sent to 10,000 subscribers, the revenue results you get will almost always be noise. Here’s an example:

[Image: Email A/B test statistical significance]

With an email list of 10k subscribers, you’ll never know if your split tests actually led to more revenue, but you can optimize for higher engagement, e.g. click rates. 
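To make the gap between click significance and revenue significance concrete, here is a rough sketch of a two-proportion z-test applied to the example from the intro. The intro only gives rates and revenue, so the even 5,000-per-variation split and the $100 average order value are assumptions, and the function name is just illustrative.

```python
import math

def two_proportion_z_test(rate_a, n_a, rate_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    # Pooled rate under the null hypothesis that both variations perform the same
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

N_PER_VARIATION = 5_000  # ASSUMPTION: a 10k list split evenly
ASSUMED_AOV = 100.0      # ASSUMPTION: average order value, not given in the example

# Click rates from the example: 3.25% vs 2.50%
z_clicks, p_clicks = two_proportion_z_test(
    0.0325, N_PER_VARIATION, 0.0250, N_PER_VARIATION
)

# Convert revenue to order counts using the assumed AOV
orders_a = 6_500 / ASSUMED_AOV
orders_b = 5_000 / ASSUMED_AOV
z_orders, p_orders = two_proportion_z_test(
    orders_a / N_PER_VARIATION, N_PER_VARIATION,
    orders_b / N_PER_VARIATION, N_PER_VARIATION,
)

print(f"clicks: z={z_clicks:.2f}, p={p_clicks:.3f}")
print(f"orders: z={z_orders:.2f}, p={p_orders:.3f}")
```

Under these assumptions the click-rate difference clears the usual 0.05 threshold while the order-rate difference does not. A different order value or split would change the numbers, so treat this as an illustration of the pattern rather than a verdict on the example.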

3) Aggressive Promos/Urgency Always Win

Aggressive promos will always win against full-price campaigns. And if you add time pressure (e.g. urgency + scarcity), it becomes impossible to isolate whether your actual variable had any impact at all. You’re back to square one: you don’t know what actually moved the needle.

What To Do Instead: Holdout Testing

A holdout test is a technique where you deliberately prevent a small group of users from seeing your email campaign.

In simple terms, the test group receives your email campaign, while the holdout group receives nothing at all.

Here’s what that looks like in a graph:

[Image: Email holdout test]

In practice, you’d take 10-20% of the segment you’re targeting and exclude them from your campaigns.

Holdout tests are good if you want to validate tactics that look good in your ESP. They do require a bit more effort, since you can’t measure the results directly in Klaviyo (or any ESP).
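One way to build the holdout group is to assign customers deterministically from a hash of their ID, so the same person always lands in the same group for a given test. This is a minimal sketch, assuming you can export customer IDs and upload the resulting lists as segments. The function name, test name, and ID format are hypothetical.

```python
import hashlib

def assign_group(customer_id: str, test_name: str, holdout_pct: float = 0.20) -> str:
    """Deterministically place a customer in 'test' or 'holdout'."""
    # Salting with the test name gives each test its own independent split
    key = f"{test_name}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the first 8 hex characters to a number in [0, 1)
    bucket = int(digest[:8], 16) / 0x100000000
    return "holdout" if bucket < holdout_pct else "test"

# Hypothetical customer IDs, for illustration only
customer_ids = [f"cust_{i}" for i in range(10_000)]
groups = [assign_group(cid, "post_purchase_cadence") for cid in customer_ids]

holdout_share = groups.count("holdout") / len(groups)
print(f"holdout share: {holdout_share:.1%}")
```

Hashing instead of random sampling means you can re-run the assignment later (for example, when new customers enter the segment) without reshuffling anyone who is already in the test.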

Below are some holdout tests I’d be looking at running this year.

These are all relatively high impact, and easy to set up. 

3 Retention Holdout Tests To Run in 2026

#1 Post-Purchase Cadence

Double your current post-purchase email cadence. If you’re currently sending 3 emails, send 6. Split one-time buyers: 80% receive the new cadence, 20% receive your current cadence. 

To track true incrementality, measure repeat purchase rates over 30 days using Shopify data. Validate if doubling your email volume actually drives more repeat purchases.
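The 30-day comparison can be sketched as below, assuming you have exported each customer's test entry date and later order dates from Shopify. The record layout, field names, and sample customers are made up for illustration.

```python
from datetime import date, timedelta

def repeat_purchase_rate(customers, window_days=30):
    """Share of customers with at least one order within the window after entering the test."""
    if not customers:
        return 0.0
    window = timedelta(days=window_days)
    repeaters = sum(
        1
        for c in customers
        if any(c["entered"] < d <= c["entered"] + window for d in c["later_orders"])
    )
    return repeaters / len(customers)

# Toy records (hypothetical): 'entered' is the date the customer joined the test
new_cadence = [
    {"entered": date(2026, 1, 5), "later_orders": [date(2026, 1, 20)]},  # inside window
    {"entered": date(2026, 1, 5), "later_orders": []},
    {"entered": date(2026, 1, 6), "later_orders": [date(2026, 3, 1)]},   # outside window
    {"entered": date(2026, 1, 7), "later_orders": [date(2026, 2, 1)]},   # inside window
]
current_cadence = [
    {"entered": date(2026, 1, 5), "later_orders": []},
    {"entered": date(2026, 1, 6), "later_orders": [date(2026, 1, 30)]},  # inside window
    {"entered": date(2026, 1, 6), "later_orders": []},
    {"entered": date(2026, 1, 8), "later_orders": []},
]

new_rate = repeat_purchase_rate(new_cadence)
current_rate = repeat_purchase_rate(current_cadence)
print(f"new cadence: {new_rate:.0%}, current cadence: {current_rate:.0%}, "
      f"lift: {new_rate - current_rate:+.0%}")
```

With real data you would run this on the full 80% and 20% groups. Eight toy customers say nothing about significance.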

#2 Direct Mail

Another simple holdout test is postcards. Again, I’d test this on one-time buyers. For example, if you send 80% of your one-time buyers a postcard on day 15, and the remaining 20% receive nothing, does it lead to incremental lift in revenue over 30 days?

This should be particularly easy to implement with PostPilot. 
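Because the two groups differ in size, compare revenue per customer rather than raw totals. A rough sketch of that arithmetic follows. The function name and all figures are hypothetical placeholders, not results from a real test.

```python
def incremental_revenue(test_revenue, test_size, holdout_revenue, holdout_size):
    """Estimate revenue the campaign added, using the holdout as the baseline."""
    test_rpc = test_revenue / test_size            # revenue per customer, postcard group
    holdout_rpc = holdout_revenue / holdout_size   # revenue per customer, no postcard
    # Extra revenue per customer, scaled to everyone who received the postcard
    return (test_rpc - holdout_rpc) * test_size

# Hypothetical 30-day numbers for an 80/20 split of 10,000 one-time buyers
lift = incremental_revenue(
    test_revenue=40_000, test_size=8_000,
    holdout_revenue=8_000, holdout_size=2_000,
)
print(f"estimated incremental revenue: ${lift:,.0f}")
```

In this made-up case the postcard group earns $5.00 per customer against $4.00 in the holdout, so the estimate is $1.00 times 8,000 recipients.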

#3 Plain Text At Risk

Figure out when churn happens in your brand. By what point have more than 50% of your repeat purchases happened? Is it by day 30, 60, or 200 after the first order?
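If you can export the number of days between each customer's first and second order, finding that cutoff is a short calculation. The sketch below assumes such an export exists. The function name and sample values are hypothetical.

```python
import math

def day_majority_repurchased(days_to_second_order, share=0.5):
    """Smallest day count by which more than `share` of repeat purchases have happened."""
    if not days_to_second_order:
        raise ValueError("need at least one repeat purchase")
    ordered = sorted(days_to_second_order)
    # Number of purchases needed to exceed the share strictly
    needed = math.floor(share * len(ordered)) + 1
    return ordered[needed - 1]

# Hypothetical days between first and second order for ten repeat buyers
sample_days = [12, 25, 30, 41, 45, 58, 60, 90, 120, 200]
cutoff = day_majority_repurchased(sample_days)
print(f"more than half of repeat purchases happen by day {cutoff}")
```

Customers who pass that day without a second order are a reasonable starting definition of "at risk" for the sequence.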

Then create a plain-text 3-email sequence targeting customers at risk of churning.

Again, one segment receives the automation, the other one doesn’t. 

After a month, see if the group that received the plain text sequence had higher repeat purchase rates than the holdout group.

Remember: the bigger the audience, the bigger the impact.

All these tests I just mentioned are aimed at re-converting one-time buyers. The reason is simple: you’re more likely to see a revenue impact by trying to re-convert 50,000 people who have bought from you once vs 3,000 VIP customers. It’s just a numbers game. 

Good luck!
