Understanding Test Results
How to read your A/B test results and decide when to act.
The Results Dashboard
Click on any A/B test to open its results. The main numbers you'll see are:
- Visitors — the total number of people who have been part of the test for each variant
- Conversions — how many of those visitors completed the goal (e.g. made a purchase or signed up)
- Conversion rate — the percentage of visitors who converted (conversions ÷ visitors × 100)
- Uplift — the difference in conversion rate between Variant B and Variant A, shown as a percentage
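As a rough illustration of how these numbers relate, the sketch below computes conversion rate and uplift from raw counts. The function names and visitor counts are made up for the example. It reports uplift both as a percentage-point difference and as a relative change, since the description above could mean either.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Conversion rate as a percentage: conversions / visitors * 100."""
    if visitors == 0:
        return 0.0
    return conversions / visitors * 100


def uplift(rate_a: float, rate_b: float) -> tuple[float, float]:
    """Return (absolute uplift in percentage points, relative uplift in percent)."""
    absolute = rate_b - rate_a
    # Relative uplift is undefined when Variant A has no conversions
    relative = absolute / rate_a * 100 if rate_a else float("nan")
    return absolute, relative


# Hypothetical counts, not real data
rate_a = conversion_rate(conversions=40, visitors=1000)
rate_b = conversion_rate(conversions=50, visitors=1000)
absolute, relative = uplift(rate_a, rate_b)
print(f"A: {rate_a:.1f}%  B: {rate_b:.1f}%")
print(f"Uplift: {absolute:+.1f} points ({relative:+.1f}% relative)")
```

With these example counts, Variant A converts at 4% and Variant B at 5%. That is a gain of 1 percentage point, or 25% relative to Variant A.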
What is the Confidence Level?
The confidence level tells you how likely it is that the difference you're seeing is real and not just random chance.
- Below 80%: too early to draw conclusions, keep the test running
- 80% to just under 95%: there's a trend, but not enough data yet
- 95% or above: this result is statistically significant; you can act on it with confidence
A 95% confidence level means that if the two variants truly performed the same, a difference this large would show up by chance only about 1 time in 20. It is the threshold most commonly used by researchers and marketers.
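The exact calculation behind the dashboard's confidence level isn't described here. One common approach is a two-sided two-proportion z-test, sketched below as an assumption rather than the product's actual method. The function name and the counts are hypothetical.

```python
import math


def confidence_level(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Confidence (in percent) that the two conversion rates differ,
    computed as 1 minus the two-sided p-value of a pooled z-test."""
    if n_a == 0 or n_b == 0:
        return 0.0
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate: the shared conversion rate if both variants were identical
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0
    z = abs(p_b - p_a) / std_err
    # For a standard normal Z, P(|Z| <= z) = erf(z / sqrt(2))
    return math.erf(z / math.sqrt(2)) * 100


# Same 4% vs 5% conversion rates, at two different traffic levels
print(f"{confidence_level(40, 1000, 50, 1000):.1f}%")
print(f"{confidence_level(400, 10000, 500, 10000):.1f}%")
```

Under this assumed method, the same 4% versus 5% gap gives a confidence of roughly 72% with 1,000 visitors per variant and over 99% with 10,000 per variant. The same uplift can therefore be inconclusive early on and significant later.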
When to Stop a Test Early
In most cases, let the test run until it reaches 95% confidence. However, you may want to stop a test early if:
- Variant B is clearly hurting performance — if one version has significantly lower sales after a decent number of visitors, it's fine to stop and revert
- A major event is happening — sales events or seasonal spikes can skew results, so pause tests during Black Friday or similar periods
- You need the traffic for something else — you can pause and resume tests
Declaring a Winner
- Open the test from A/B Tests
- Once a variant reaches 95% confidence, click Declare Winner
- Confirm your choice — the winning variant becomes the default for all visitors
- The other variant is archived
If neither variant reaches 95% confidence after several weeks, the test is inconclusive. This means any difference between the two versions is too small to detect with the traffic you have. You can pick either one, or try a more dramatic change next time.
Common Mistakes to Avoid
Testing too many things at once: change only one thing between variants. If you change the hero image, headline, and button color all at once, you won't know which change caused the difference.
Stopping too soon: early results are exciting, but the numbers aren't reliable until enough visitors have been through the test. Treat a few hundred visitors per variant as a minimum; small differences in conversion rate need many more.
Ignoring seasonal effects: if your test starts on a Monday and you evaluate on Friday, you have only measured weekday traffic, and weekend visitors may behave differently. Let tests run for complete weekly cycles.
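To make the last two points concrete, here is a minimal sketch that estimates how many visitors a test might need and rounds the run time up to whole weeks. It uses the standard sample-size approximation for a two-proportion z-test. The 80% power setting, the function names, and the traffic figures are assumptions made for illustration; they do not come from the product.

```python
import math
from statistics import NormalDist


def visitors_per_variant(base_rate: float, target_rate: float,
                         confidence: float = 0.95, power: float = 0.80) -> int:
    """Approximate visitors needed per variant to detect a change from
    base_rate to target_rate with a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = base_rate * (1 - base_rate) + target_rate * (1 - target_rate)
    n = (z_alpha + z_beta) ** 2 * variance / (target_rate - base_rate) ** 2
    return math.ceil(n)


def weeks_to_run(visitors_needed: int, daily_visitors_per_variant: int) -> int:
    """Round the run time up to whole weeks so every day of the week
    is represented equally."""
    days = math.ceil(visitors_needed / daily_visitors_per_variant)
    return max(1, math.ceil(days / 7))


# Hypothetical store: 4% baseline, hoping for 5%, 300 visitors per variant per day
needed = visitors_per_variant(base_rate=0.04, target_rate=0.05)
print(needed, "visitors per variant,", weeks_to_run(needed, 300), "week(s)")
```

Under these assumptions, detecting a jump from 4% to 8% needs on the order of 550 visitors per variant, while detecting 4% to 5% needs several thousand. A few hundred visitors is therefore a floor, not a target.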
Next Steps
- A/B Testing overview — set up your first test
- Business rules — automate store actions based on behavior