From the Department of Optometry and Vision Sciences, The University of Melbourne, Carlton, Victoria, Australia.
| Introduction |
|---|
It could be argued that studies using small sample sizes are not meant to quantify general performance within a population but merely to document the existence of an effect, and so the number of subjects is less important. However, the fact that investigators bother to perform replications in such studies implies a wish to demonstrate that their findings are not aberrant and should be taken as representing the performance of the population at large. Why, therefore, is the ability of these studies to predict the population's performance not considered? Can an author justify the extra costs (in time and money) in testing four subjects, when he or she may just as well test only two (or even one)?
This issue becomes even more important when considering that large subpopulations can exist within a population. An obvious case is gender. A naive investigator could perform an experiment on three randomly selected subjects and arrive at the conclusion that all people are female. Although such an example may seem ridiculous, it highlights the effects that sampling artifacts can have, especially when subpopulations exist. Therefore, the question that begs consideration is: what sample size is required to ensure, to a specified confidence, that the results are indicative of the general population?
We will consider the situation in which the presence of a previously undocumented effect is to be investigated. The following assumptions are made:
If assumption 1 is taken to be correct, then the probability of the effect being present can be described by a binomial distribution. Even if the effect is, in fact, part of a continuum, it will typically be rendered binomial by some criterion based on statistical testing (that is, findings are either significant or nonsignificant). For example, a study may investigate the effect of exercise on pulse rate. Although pulse rates represent a continuum (as might the effects of exercise), subjects will either show significantly altered rates or not. In a well-designed study, it is likely that the presence of the effect in each subject will be confirmed using a number of experimental paradigms and rigorous statistical analysis.
Assumption 2 is reasonable and realistic, given that the majority of studies using small sample numbers report serial successes. The situation in which subjects who do not show the effect are present is necessarily more complex and will not be discussed, except to say that any departure within a small sample necessitates a more thorough investigation with enlarged sample numbers.
Assumption 3 needs further consideration. The term selectively normal is used, because many studies have selection criteria for their subjects (e.g., criteria for general health, color vision, visual acuity). As such, subjects are not sampled from the entire population, but from a criterion-determined subpopulation (a selectively normal population). However, it is important to note that samples are often a more narrow subset than stated. Selection from undergraduate or postgraduate students, for example, will result in an overrepresentation of young, educated, myopic subjects, even if age, educational status, and refractive error are not specified as selection criteria. Similar sampling artifacts can unwittingly manifest in animal studies as well.11
If we accept these underlying assumptions, then π can be used to describe the proportion of the selectively normal population that shows the effect being investigated. For any number of serial successes (N) in the sample group, this result is always consistent with π = 1; that is, the entire population shows the effect. This defines the upper limit on the population proportion, π.
What is more important is to find the smallest population proportion that is consistent with the observed number of serial successes. Taking the common statistical criterion of P = 0.05, the lower limit for π provides the minimum population proportion for the effect, with 95% confidence, given a number of serial successes, N. Stated another way, if the population proportion were any smaller than the lower limit on π, there would be a less than 1 in 20 chance that all N subjects would show the effect (that is, at least one failure would be expected more than 19 times in 20).
The following equation describes the range of values π can take:
$$0.05 \le \pi^{N} \le 1$$
where π is the population proportion (as a fraction), N is the number of serial successes (and is equivalent to the sample size), and 0.05 is the level of confidence (1 in 20). The equation is derived from that given by Clopper and Pearson12 for the calculation of binomial distribution confidence limits. Solving for the minimum value of π (π_min, as a percentage) gives the column headed π_min (P = 0.05) in Table 1.
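The lower limit can be illustrated numerically. The following is a minimal sketch, assuming the limit takes the closed form π_min = 0.05^(1/N), which follows from requiring that the probability of N serial successes, π^N, be at least 0.05. The function name `pi_min` and the range of N printed are illustrative choices, and the output is not a reproduction of the published Table 1.

```python
def pi_min(n_successes: int, alpha: float = 0.05) -> float:
    """Smallest population proportion consistent with n serial successes.

    Assumption: the one-sided bound pi**n >= alpha, so pi_min = alpha**(1/n).
    """
    if n_successes < 1:
        raise ValueError("n_successes must be a positive integer")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha ** (1.0 / n_successes)


if __name__ == "__main__":
    # Print pi_min as a percentage for sample sizes 1 to 10 at P = 0.05.
    print(" N   pi_min (%)")
    for n in range(1, 11):
        print(f"{n:2d}   {100 * pi_min(n):6.1f}")
```

Under this form a single subject bounds the population proportion only at 5%, and the bound rises steeply over the first few subjects before flattening out.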
What should π_min be? For an unknown effect, a useful starting point is that an effect must be present in the majority of the population if it is to be classified as "normal"; that is, π_min must be at least 50%. Using this assumption (as well as assumptions 1–3), a sample size of N = 5, all showing the effect, is required to say confidently (P = 0.05) that the population proportion for the effect is greater than 50%. The sample size must be increased if subjects who do not show the effect are present (that is, serial successes are not achieved). For completeness, Table 1 also lists the relationship between π_min and sample size for P = 0.10 and P = 0.01. Using these criteria, sample sizes of four and seven, respectively, are required to be consistent with a population proportion of at least 50%.
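The sample sizes quoted above can be checked with a short search. This is a sketch under the same assumption as the closed form π_min = P^(1/N) (the lower limit for N serial successes at criterion P); the function name `min_serial_successes` is hypothetical and chosen only for illustration.

```python
def min_serial_successes(target_proportion: float, alpha: float = 0.05) -> int:
    """Smallest N for which alpha**(1/N) >= target_proportion.

    Assumption: every one of the N subjects shows the effect (serial
    successes), so the lower limit on the proportion is alpha**(1/N).
    """
    if not 0.0 < target_proportion < 1.0:
        raise ValueError("target_proportion must lie strictly between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = 1
    # The lower limit increases with n, so step upward until it reaches the target.
    while alpha ** (1.0 / n) < target_proportion:
        n += 1
    return n


if __name__ == "__main__":
    for alpha in (0.10, 0.05, 0.01):
        n = min_serial_successes(0.50, alpha)
        print(f"P = {alpha:.2f}: N = {n} serial successes for pi_min >= 50%")
```

A simple upward search is used rather than a logarithm-and-ceiling formula to avoid floating-point edge cases when the bound falls exactly on the target.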
To provide more confident estimates of the population proportion, much larger numbers are needed. For example, to be confident (P = 0.05) that the population proportion is at least 95%, 59 subjects showing the effect would be required. Such studies, however, are rarely performed. Instead, it is more common for data to be collected on a smaller sample, whose size is determined by a power analysis, and mean values for the magnitude of the effect compared with conventional statistical analyses (e.g., t-tests). It should be noted, however, that these latter types of analyses determine whether a significant effect exists in the population on average and provide no estimate of the population proportion, π. Such analyses may be successfully used on small-sample-size psychophysical data.13
It should also be noted that a study may not be designed to quantify the performance of a normal population, but that of a disease group instead.5 The model outlined herein is identical, however, except that the predicted values for π_min now relate to the population of observers with a particular disease, instead of the normal population.
It is possible that the model can be improved. Often, an investigated effect is shown to be dependent on, or correlate with, a previously documented effect. In such cases, the estimated population proportion of this previously documented effect provides additional information about the population proportion of the investigated effect, and so a more confident estimation of π may be made than that given in Table 1. As such, it may be possible to use reduced numbers of subjects to clarify aspects of documented "normal" effects. However, there are also instances in which the outcomes of similar experiments differ between authors. In such cases, the estimated population proportion of the previously documented effect provides additional knowledge that reduces our confidence in our estimation of π. It should be emphasized, however, that the reliability of such previous studies depends on the number of subjects investigated and the soundness of the studies' experimental designs.
It is possible that some form of Bayesian logic could be used to combine the results of previous small-sample-size studies with new studies, in a way similar to that proposed for clinical decision making.14 Until the validity of such a model has been established for the type of data discussed in this article, the approach outlined herein provides a starting point for determining the general applicability of studies making use of small sample sizes. Despite criticisms,1 a sample size of five may well be useful in scientific research.
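As a purely illustrative sketch of what one such Bayesian combination might look like (not a validated model, and not a method proposed in this article), suppose a uniform Beta(1, 1) prior on the population proportion, and suppose an earlier study and a new study both sample the same population and report only serial successes. The posterior is then Beta(1 + n_prev + n_new, 1), whose cumulative distribution is x^(1 + n_prev + n_new), so a one-sided lower credible bound has a closed form. The function name `bayes_lower_bound`, the choice of prior, and the pooling of studies are all assumptions made for this example.

```python
def bayes_lower_bound(n_prev: int, n_new: int, alpha: float = 0.05) -> float:
    """One-sided lower credible bound on the population proportion.

    Assumptions (illustrative only): uniform Beta(1, 1) prior, both studies
    drawn from the same population, and every subject showed the effect.
    The posterior is Beta(1 + n_prev + n_new, 1), with CDF x**(1 + n_prev + n_new).
    """
    if n_prev < 0 or n_new < 0:
        raise ValueError("subject counts must be non-negative")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    shape = 1 + n_prev + n_new
    # Invert the CDF: x**shape = alpha  =>  x = alpha**(1/shape).
    return alpha ** (1.0 / shape)


if __name__ == "__main__":
    alone = bayes_lower_bound(0, 3)
    pooled = bayes_lower_bound(3, 3)
    print(f"new study alone (3 subjects): lower bound = {100 * alone:.1f}%")
    print(f"pooled with 3 earlier subjects: lower bound = {100 * pooled:.1f}%")
```

In this toy setting, pooling consistent earlier results raises the lower bound, whereas conflicting earlier results would require a model that admits failures and is beyond this sketch.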
In summary, the model outlined allows predictions to be made from experimental data obtained from limited numbers of samples. Our approach is appropriate for studies documenting the presence of an effect in each of a small number of subjects and allows inferences to be made regarding the proportion of the population expected to show the same effect. As such, the model may be usefully employed in small-sample-size psychophysical investigations, so that the general applicability of results may be predicted. In addition, the model may be used to estimate the number of subjects needed to determine, to a desired statistical confidence, the prevalence of an effect. Our approach is not applicable to analyzing the magnitude of a particular effect within a population, however; conventional power analyses and statistical testing are available for this task.
| Footnotes |
|---|
Corresponding author: Algis Jonas Vingrys, Department of Optometry and Vision Sciences, The University of Melbourne 3010, Keppel and Cardigan Streets, Carlton, Victoria 3053, Australia. a.vingrys{at}optometry.unimelb.edu.au