Confidence Level Calculator

Enter the quantity mailed and number of gifts for each of your two test panels, and then click on "Calculate." If you've never used this calculator before, please read the explanation below.

 

                                  Test A    Test B
Quantity mailed
Gifts received
Response Rate                          %         %
Confidence Level for this test         %

 

What does it mean?


Confidence Level is the likelihood - expressed as a percentage - that the results of a test are real and repeatable, and not just random. The idea is based on the concept of the "normal distribution curve," which shows that variation in almost any data (such as the heights of all fourth-graders, or the amount of rainfall in January) tends to be clustered around an average value, with relatively few individual measurements at the extremes.


So if your confidence level is, say, 92%, that means, according to probability theory, there's a 92% chance that you'd see similar results in a repeat of the test. (It does not mean you'd receive the same number of gifts, or that the difference between the packages would be the same. It only means that the package that received more gifts in the first test would be likely to receive more gifts in the second as well - unless, of course, other significant factors have changed.)


A confidence level of 50% would mean the difference is truly random, with only a 50-50 chance that you'd see the same results in a repeat of the test. Even at 75% the odds are not good - there's a one in four chance that your results are meaningless. Some statisticians consider 90% to be the minimum confidence level for statistically significant results, and that's reportedly the standard used in many election polls. Others insist on a minimum of 95% before they'll consider results significant. And in medical research, for obvious reasons, there's a strong preference for even higher levels of confidence.
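
For readers curious about the arithmetic, here is a minimal sketch in Python of one common way to arrive at a figure like this: a one-sided, pooled two-proportion z-test. This page does not state the formula the calculator actually uses, so the method, the function name confidence_level, and the sample quantities are all assumptions made for illustration. The one-sided form is used because it yields exactly 50% when the two panels perform identically, which matches the description above.

```python
from math import erf, sqrt


def confidence_level(mailed_a, gifts_a, mailed_b, gifts_b):
    """Return (rate_a, rate_b, confidence), all expressed as percentages.

    Assumed method: one-sided, pooled two-proportion z-test.
    This is an illustration, not the calculator's confirmed formula.
    """
    rate_a = gifts_a / mailed_a
    rate_b = gifts_b / mailed_b

    # Pooled response rate: what we'd expect if both packages performed the same.
    pooled = (gifts_a + gifts_b) / (mailed_a + mailed_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / mailed_a + 1 / mailed_b))

    if std_err == 0:
        # No gifts at all (or a gift from every piece): nothing to tell apart.
        return rate_a * 100, rate_b * 100, 50.0

    # How many standard errors separate the two response rates.
    z = abs(rate_a - rate_b) / std_err

    # Normal distribution curve (cumulative): 50% at z = 0, rising toward 100%.
    confidence = 0.5 * (1 + erf(z / sqrt(2)))
    return rate_a * 100, rate_b * 100, confidence * 100


# Hypothetical test: 10,000 pieces mailed per panel, 100 gifts vs. 130 gifts.
rate_a, rate_b, conf = confidence_level(10_000, 100, 10_000, 130)
print(f"Test A: {rate_a:.2f}%  Test B: {rate_b:.2f}%  Confidence: {conf:.1f}%")
```

Under these assumptions, two panels with identical results score 50%, and the score climbs toward 100% as the gap between response rates grows relative to the quantities mailed. The panels need not be the same size for the arithmetic to work.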


It's important to remember that we're talking about probabilities, and there's no magic number that guarantees your results will be repeatable. While it's always best to have a confidence level of 95% or higher, you shouldn't ignore results in the 80% to 90% range. Those results may indicate trends and provide clues about how to improve your mailings; at the very least, they're worth re-testing, preferably in larger quantities. (In any test, a larger sample size will generally give more reliable results.)
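
To see why re-testing in larger quantities helps, the sketch below holds two hypothetical response rates fixed (1.0% and 1.2%) and varies only the quantity mailed per panel. It relies on the same assumed method as any one-sided, pooled two-proportion z-test; the function name confidence, the rates, and the quantities are illustrative and not taken from the calculator itself.

```python
from math import erf, sqrt


def confidence(mailed, rate_a, rate_b):
    """One-sided confidence (percent) for two equal-size panels.

    Assumed method: pooled two-proportion z-test, for illustration only.
    Rates are fractions, e.g. 0.010 for a 1.0% response rate.
    """
    pooled = (rate_a + rate_b) / 2  # equal panel sizes, so a simple average
    std_err = sqrt(pooled * (1 - pooled) * 2 / mailed)
    z = abs(rate_a - rate_b) / std_err
    return 50 * (1 + erf(z / sqrt(2)))


# Same hypothetical response rates, increasingly large test panels.
for mailed in (2_500, 5_000, 10_000, 25_000):
    print(f"{mailed:>6} per panel: {confidence(mailed, 0.010, 0.012):.1f}%")
```

With these assumed numbers, the identical difference in response rate scores below 80% at 2,500 pieces per panel and above 95% at 25,000, which is why a promising but inconclusive result is worth mailing again in larger quantities.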


Finally, a few words of caution: These confidence levels are only valid when you're comparing test panels that can be thought of as a single event. Don't try to adapt them to a situation that changes over time (such as your total number of active donors), or use them to compare appeals that were mailed at different times (there are far too many uncontrolled variables in that case). You'll also notice that the formula allows you to use test panels of different sizes - but if you do, make sure your merge/purge house is extremely careful about producing statistically equivalent lists of names. And, of course, this calculator only addresses the question of response rate - not average gift, acquisition cost, or other variables. As with any other mailing, you'll need to evaluate the results of your tests in terms of your overall fundraising goals.


 This article is excerpted from Testing, Testing, 1, 2, 3: Raise More Money Through Direct Mail Tests, by Mal Warwick (Jossey-Bass Publishers, 2003). Copyright (c) 2002 by Mal Warwick.