In hypothesis testing, study results lead the reader to reject or not reject a null hypothesis; in estimation, by contrast, the reader can assess whether a result is strong or weak, definitive or not. A confidence interval is calculated from the observed result and the size of the sample. It gives a range of values within which the true value would be expected to lie 95% or 90% of the time, depending on the confidence level chosen.
The confidence interval also provides a way of determining whether the sample is large enough to make the trial definitive. If the lower boundary of a confidence interval is above the threshold considered clinically significant, then the trial is positive and definitive; if the lower boundary is somewhat below the threshold, the trial is positive, but studies with larger samples are needed. Similarly, if the upper boundary of a confidence interval is below the threshold considered clinically significant, the trial is negative and definitive. However, a negative result with a confidence interval that crosses the threshold means that trials with larger samples are needed to make a definitive determination of clinical importance.
In a positive trial - one that establishes that the effect of treatment is greater than zero - look at the lower boundary of the confidence interval to determine whether the size of the sample is adequate. The lower boundary represents the smallest plausible treatment effect compatible with the data. If it is greater than the smallest difference that is clinically important, the sample size is adequate and the trial definitive. However, if it is less than this smallest important difference, the trial is not definitive and further trials are required. In a negative trial - the results of which do not exclude the possibility that treatment has no effect - look at the upper boundary of the confidence interval to determine whether the size of the sample is adequate. If the upper boundary - the largest treatment effect compatible with the data - is less than the smallest difference that is clinically important, the size is adequate, and the trial definitively negative. If the upper boundary exceeds the smallest difference considered important, there may be an important positive treatment effect, the trial is not definitive, and further trials are required.
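The decision rule described above can be written out as a short sketch. It assumes that the treatment effect is expressed so that larger values mean greater benefit, that zero means no effect, and that the smallest clinically important difference is a positive number chosen by the reader. The function and parameter names are illustrative and do not come from the original article.

```python
def interpret_trial(lower, upper, smallest_important_difference):
    """Classify a trial from the confidence interval of its treatment effect.

    Assumptions (not from the source): effects are scaled so that larger
    means more benefit, 0 means no effect, and the smallest clinically
    important difference is greater than 0.
    """
    if lower > upper:
        raise ValueError("lower boundary must not exceed upper boundary")

    # A positive trial excludes "no effect": the whole interval lies above 0.
    if lower > 0:
        direction = "positive"
        # Definitive only if even the smallest plausible effect is important.
        definitive = lower > smallest_important_difference
    else:
        direction = "negative"
        # Definitive only if even the largest plausible effect is unimportant.
        definitive = upper < smallest_important_difference

    verdict = "definitive" if definitive else "not definitive"
    return direction, verdict


if __name__ == "__main__":
    # Hypothetical intervals, with 0.05 as the smallest important difference.
    print(interpret_trial(0.08, 0.20, 0.05))   # positive, definitive
    print(interpret_trial(0.02, 0.20, 0.05))   # positive, not definitive
    print(interpret_trial(-0.03, 0.04, 0.05))  # negative, definitive
    print(interpret_trial(-0.03, 0.12, 0.05))  # negative, not definitive
```

The numbers in the usage lines are made up for illustration; only the comparisons against zero and against the smallest important difference reflect the reasoning in the text.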
The point estimate is the value we have actually observed (for example, the proportion of heads in a series of coin tosses) - but what is the plausible range within which the true value may lie? The confidence interval answers this question. The coin toss example illustrates how the confidence interval tells us whether the sample is large enough - 100 tosses will give a 95% confidence interval within about 10 percentage points of the point estimate, but 1000 are needed for a confidence interval within about 3 percentage points. To obtain greater precision, you need more measurements - in clinical research, enrol more subjects, or increase the number of measurements in each enrolled subject.
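The coin toss figures can be checked with the usual normal-approximation (Wald) interval for a proportion. The sketch below assumes a fair coin (observed proportion 0.5) and a 95% confidence level (z = 1.96). The function names are illustrative, and the original article may have used a different method of calculation.

```python
import math


def wald_margin(p_hat, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion.

    p_hat: observed proportion (the point estimate)
    n:     number of observations (tosses, subjects)
    z:     normal quantile; 1.96 corresponds to a 95% confidence level
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def wald_interval(p_hat, n, z=1.96):
    """Lower and upper boundary, clipped to the valid range [0, 1]."""
    margin = wald_margin(p_hat, n, z)
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)


if __name__ == "__main__":
    for tosses in (100, 1000):
        margin = wald_margin(0.5, tosses)
        low, high = wald_interval(0.5, tosses)
        print(f"n={tosses}: margin={margin:.3f}, interval=({low:.3f}, {high:.3f})")
    # n=100:  margin=0.098, interval=(0.402, 0.598)
    # n=1000: margin=0.031, interval=(0.469, 0.531)
```

Because the margin shrinks with the square root of the sample size, a tenfold increase in tosses narrows the interval only by a factor of about three, which is why much larger samples are needed for modest gains in precision.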
This article is based on:
G. Guyatt, R. Jaeschke, N. Heddle, D. Cook, H. Shannon, and S. Walter. Basic statistics for clinicians: 2. Interpreting study results: confidence intervals.
Can. Med. Assoc. J., Jan 1995; 152: 169 - 173