Understanding Test Results

How to read your A/B test results and decide when to act.

The Results Dashboard

Click on any A/B test to open its results. The main numbers you'll see are:

Visitors — the total number of people who have been part of the test for each variant
Conversions — how many of those visitors completed the goal (e.g. made a purchase or signed up)
Conversion rate — the percentage of visitors who converted (conversions ÷ visitors × 100)
Uplift — the difference in conversion rate between Variant B and Variant A, shown as a percentage
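
To make the arithmetic behind these numbers concrete, here is a minimal sketch in Python. The function names and the visitor counts are hypothetical, chosen only for illustration. The text above does not say whether uplift is shown as a percentage-point gap or as a relative change, so the sketch computes both rather than assuming one.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Conversion rate as a percentage: conversions / visitors * 100."""
    if visitors == 0:
        return 0.0  # no visitors yet, so there is nothing to measure
    return conversions / visitors * 100


def uplift(rate_a: float, rate_b: float) -> tuple[float, float]:
    """Return (percentage-point difference, relative change in percent) of B over A."""
    points = rate_b - rate_a
    # Relative change is undefined when Variant A has a 0% conversion rate.
    relative = points / rate_a * 100 if rate_a else float("nan")
    return points, relative


# Hypothetical numbers, for illustration only.
rate_a = conversion_rate(conversions=50, visitors=1000)
rate_b = conversion_rate(conversions=60, visitors=1000)
points, relative = uplift(rate_a, rate_b)
print(f"Variant A: {rate_a:.1f}%  Variant B: {rate_b:.1f}%")
print(f"Difference: {points:+.1f} percentage points, {relative:+.1f}% relative")
```

The two figures can look very different for the same test, so it is worth checking which one a dashboard reports before quoting it.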

What is the Confidence Level?

The confidence level tells you how strongly the data rules out random chance as the explanation for the difference you're seeing.

Below 80% — too early to draw conclusions, keep the test running
80–94% — there's a trend, but not enough data yet
95% or above — this result is statistically significant; you can act on it with confidence

Tip

A 95% confidence level means that if the two variants truly performed the same, a difference this large would show up by chance only about 1 time in 20. This is a widely used threshold in research and marketing.
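
For readers curious where a confidence figure can come from, the sketch below uses a two-sided two-proportion z-test and maps the result onto the three bands listed above. This is an assumption for illustration: the article does not state which statistical method the dashboard uses, and the function names and sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_level(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Confidence in percent (1 minus the two-sided p-value) from a pooled z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0  # no conversions (or all conversions) in both variants
    z = abs(p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(z))
    return (1 - p_value) * 100


def status(confidence: float) -> str:
    """Map a confidence percentage to the bands described in this article."""
    if confidence >= 95:
        return "statistically significant"
    if confidence >= 80:
        return "trend, not enough data yet"
    return "too early, keep running"


# Hypothetical numbers: the same 5% vs 6% rates at two traffic levels.
for visitors in (1_000, 10_000):
    conf = confidence_level(visitors // 20, visitors, visitors * 6 // 100, visitors)
    print(f"{visitors} visitors per variant: {conf:.1f}% -> {status(conf)}")
```

The same gap in conversion rate lands in different bands depending on how many visitors each variant has seen, which is why early results can be misleading.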

When to Stop a Test Early

In most cases, let the test run until it reaches 95% confidence. However, you may want to stop a test early if:

Variant B is clearly hurting performance — if it has substantially lower sales than Variant A after a few hundred visitors per variant, it's fine to stop and revert
A major event is happening — sales events or seasonal spikes can skew results, so pause tests during Black Friday or similar periods
You need the traffic for something else — you can pause and resume tests

Declaring a Winner

Open the test from A/B Tests
Once a variant reaches 95% confidence, click Declare Winner
Confirm your choice — the winning variant becomes the default for all visitors
The other variant is archived

If neither variant reaches 95% confidence after several weeks, the test is inconclusive. This means any difference between the two versions is too small to detect with the traffic you have. You can pick either one, or try a more dramatic difference next time.

Common Mistakes to Avoid

Testing too many things at once — change only one thing between variants so you know what caused the difference. If you change the hero image, headline, and button color all at once, you won't know which change made the difference.

Stopping too soon — getting excited about early results is tempting, but a test needs enough visitors before the numbers are reliable. Aim for at least a few hundred visitors per variant.
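
How many visitors count as "enough" depends on your current conversion rate and on how big a difference you hope to detect. The sketch below uses a standard textbook sample-size approximation for comparing two proportions. It is not necessarily what the product calculates, and the function name, the 80% power default, and the example rates are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline_rate: float, expected_rate: float,
                         confidence: float = 0.95, power: float = 0.80) -> int:
    """Approximate visitors needed per variant to detect the given change.

    Rates are fractions (0.05 means 5%). Uses the normal approximation
    for a two-sided comparison of two proportions.
    """
    effect = expected_rate - baseline_rate
    if effect == 0:
        raise ValueError("expected_rate must differ from baseline_rate")
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical scenarios: a large change versus a subtle one.
print(visitors_per_variant(0.05, 0.10))  # 5% -> 10%: a few hundred
print(visitors_per_variant(0.05, 0.06))  # 5% -> 6%: several thousand
```

Under these assumptions, a few hundred visitors per variant is only enough when the change is dramatic. Subtle differences need far more traffic before the numbers settle.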

Ignoring seasonal effects — if your test starts on a Monday and you evaluate it on Friday, your results reflect only weekday traffic and miss weekend visitors, who may behave differently. Let tests run for complete weekly cycles.
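
One simple way to plan for complete weekly cycles is to round the planned duration up to whole weeks before picking an evaluation date. The helper below is a hypothetical sketch of that idea, not a feature of the product.

```python
from datetime import date, timedelta
from math import ceil


def evaluation_date(start: date, min_days: int) -> date:
    """Earliest date that covers min_days and completes a whole number of weeks."""
    weeks = max(1, ceil(min_days / 7))  # always run at least one full week
    return start + timedelta(weeks=weeks)


# Hypothetical example: a test starting on a Monday that needs at least 10 days.
start = date(2024, 1, 1)
end = evaluation_date(start, min_days=10)
print(f"Start: {start:%A %d %B %Y}")
print(f"Evaluate on: {end:%A %d %B %Y}")
```

Because the duration is a multiple of seven days, the evaluation date falls on the same weekday as the start date, so every day of the week is represented equally in the results.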

Next Steps

A/B Testing overview — set up your first test
Business rules — automate store actions based on behavior