>>Sample Size Optimization
Sampling is the foundation of all research. Reliable sampling helps
you make business decisions with confidence.
There are two main components in determining whom you will interview.
The first is deciding what kind of people to interview.
Researchers often call this group the target population. If you conduct an
employee attitude survey or an association membership survey, the population is
obvious. If you are trying to determine the likely success of a product, the
target population may be less obvious. Correctly determining the target
population is critical. If you do not interview the right kinds of people, you
will not successfully meet your goals.
The next step is to decide how many people you need to interview. Statisticians
know that a small, representative sample will reflect the opinions and behavior
of the group from which it was drawn. The larger the sample, the more precisely
it represents the target group. However, the rate of improvement in precision
decreases as your sample size increases. For example, increasing a sample from
250 to 1,000 quadruples its size but only doubles the precision. You must decide
on your sample size based on factors such as the time available, your budget,
and the necessary degree of precision.
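The diminishing return can be checked numerically. The sketch below assumes the usual normal-approximation formula for the margin of error of a percentage, z * sqrt(p(1-p)/n), with z = 1.96 for the 95% confidence level and p = 0.5. The formula and the function name margin_of_error are illustrative assumptions, not something the text above specifies.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the interval for a proportion (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

moe_250 = margin_of_error(250)
moe_1000 = margin_of_error(1000)

# Quadrupling the sample (250 -> 1,000) halves the margin of error,
# which is the same as doubling the precision.
print(f"n=250:  +/-{moe_250 * 100:.1f} points")
print(f"n=1000: +/-{moe_1000 * 100:.1f} points")
print(f"ratio:  {moe_250 / moe_1000:.2f}")
```

Under these assumptions the margin shrinks with the square root of the sample size, which is why each additional interview buys less precision than the one before.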
Three factors determine the size of the confidence interval for a given
confidence level: the sample size, the percentage of the sample that picked a
particular answer, and the population size.
The larger your sample, the more sure you can be that its answers truly reflect
the opinion of the population. This means that, for a given confidence level,
the larger your sample size, the smaller your confidence interval. However, the
relationship is not linear (i.e., doubling the sample size does not halve the
confidence interval; halving the interval takes roughly four times the sample).
Your accuracy also depends on the percentage of your sample that picks a
particular answer. If 99% of your sample said "Yes" and 1% said "No," the
chances of error are remote for almost any sample size. However, if the
percentages are 51% and 49%, the chances of error are much greater. It is easier
to be sure of extreme answers than of middle-of-the-road ones.
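To see how much the answer split matters, the sketch below compares the term sqrt(p(1-p)) for several splits. It assumes the standard normal-approximation formula, in which the interval width is proportional to this term at a fixed sample size and confidence level; the helper name spread is hypothetical.

```python
import math

def spread(p):
    """sqrt(p(1-p)): the part of the interval width that depends on the split."""
    return math.sqrt(p * (1 - p))

for share in (0.99, 0.90, 0.51, 0.50):
    print(f"p={share:.2f}  sqrt(p(1-p))={spread(share):.3f}")

# At the same sample size, a 51/49 split gives an interval roughly
# five times as wide as a 99/1 split.
ratio = spread(0.51) / spread(0.99)
print(f"51/49 vs 99/1 width ratio: {ratio:.1f}")
```

The term peaks at a 50/50 split, which is why 50% serves as the worst case.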
When determining the sample size needed for a given level of accuracy, you must
use the worst-case percentage (50%). You should also use this percentage if you
want to determine a general level of accuracy for a sample you already have. To
determine the confidence interval for a specific answer your sample has given,
you can use the percentage picking that answer and get a smaller interval.
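Both uses of the percentage can be sketched as follows. The code assumes the normal-approximation formula n = z^2 * p(1-p) / e^2 for a large population, with z = 1.96 for the 95% confidence level; the function names and the 10% answer are hypothetical choices made for illustration.

```python
import math

def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest n whose margin of error is at most `margin` (large population)."""
    return math.ceil(z**2 * p * (1 - p) / margin**2)

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the interval for a proportion (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

# Planning: use the worst case (50%) to size the sample for +/-4 points.
n_needed = required_sample_size(0.04)

# After the survey: an answer picked by 10% of that sample gets a
# smaller interval than the worst-case figure.
general = margin_of_error(n_needed)
specific = margin_of_error(n_needed, p=0.10)

print(n_needed)
print(f"general accuracy: +/-{general * 100:.1f} points")
print(f"10% answer:       +/-{specific * 100:.1f} points")
```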
How many people does your sample represent? This may be the number of people in a
city you are studying, the number of people who buy new cars, etc. Often you
may not know the exact population size. This is not a problem. The mathematics
of probability shows that the size of the population is irrelevant unless the
size of the sample exceeds a few percent of the total population you are
examining. This means that a sample of 500 people is equally useful in examining
the opinions of a state of 15,000,000 as it would be for a city of 100,000. For
this reason, the population size is ignored when it is "large" or unknown.
Population size is only likely to be a factor when you work with a relatively
small and known group of people (e.g., the members of an association).
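The text gives no formula for the population effect. One common way to model it is the finite population correction, sqrt((N - n) / (N - 1)), applied to the usual margin of error. The sketch below uses that correction as an assumption, with a sample of 500, z = 1.96 and the worst-case 50%; the population of 2,000 stands in for a hypothetical association.

```python
import math

def margin_of_error_fpc(n, population, p=0.5, z=1.96):
    """Margin of error with the finite population correction applied."""
    fpc = math.sqrt((population - n) / (population - 1))
    return z * math.sqrt(p * (1 - p) / n) * fpc

results = {}
for population in (15_000_000, 100_000, 2_000):
    results[population] = margin_of_error_fpc(500, population)
    print(f"N={population:>10,}: +/-{results[population] * 100:.2f} points")
```

For the state and the city the two figures are practically identical. The correction becomes noticeable only for the small group, where 500 interviews cover a quarter of the population.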
The confidence interval calculations assume you have a genuine random sample of the
relevant population. If your sample is not truly random, you cannot rely
on the intervals. Non-random samples usually result from some flaw in the
sampling procedure. An example of such a flaw is calling people only during the
day, which misses almost everyone who works. For most purposes, the non-working
population cannot be assumed to accurately represent the entire (working and
non-working) population.
The confidence interval is the plus-or-minus figure usually reported in
newspaper or television opinion poll results. For example, if you use a
confidence interval of 4 and 47% of your sample picks an answer, you can be
"sure" that, if you had asked the question of the entire relevant population,
between 43% (47-4) and 51% (47+4) would have picked that answer.
The confidence level tells you how sure you can be. It is expressed as a
percentage and represents how often
the true percentage of the population who would pick an answer lies within the
confidence interval. The 95% confidence level means you can be 95% certain; the
99% confidence level means you can be 99% certain. Most
researchers use the 95% confidence level.
When you put the confidence level and the confidence interval together, you can say that you are
95% sure that the true percentage of the population is between 43% and 51%.
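The link between the two quantities can be sketched with Python's standard library. The example assumes the normal approximation and a hypothetical sample of 600 respondents, a size the text does not state, chosen because it yields roughly the plus-or-minus 4 used above.

```python
import math
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided critical value of the standard normal for a confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2)

def confidence_interval(p, n, level=0.95):
    """(low, high) interval for a sample proportion p from n responses."""
    half_width = z_for_confidence(level) * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Hypothetical sample: 600 respondents, 47% picked the answer.
low95, high95 = confidence_interval(0.47, 600, 0.95)
low99, high99 = confidence_interval(0.47, 600, 0.99)

print(f"95%: {low95:.1%} to {high95:.1%}")
print(f"99%: {low99:.1%} to {high99:.1%}")
```

Asking for 99% instead of 95% certainty from the same sample widens the interval, since the critical value grows from about 1.96 to about 2.58.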
The wider the confidence interval
you are willing to accept, the more certain you can be that the whole
population's answers would be within that range. For example, if you asked a
sample of 100 people in a city which brand of cola they preferred, and 60%
said Brand A, you can be very certain that between 50% and 70% of all the
people in the city actually do prefer that brand, but you cannot be sure
that between 59% and 61% of the people in the city prefer the brand.
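The cola example can be turned around: given the sample and a desired interval, how certain can you be? The sketch below assumes the normal approximation and a genuinely random sample; the function name confidence_for_margin is hypothetical.

```python
import math
from statistics import NormalDist

def confidence_for_margin(p, n, margin):
    """Confidence level at which p +/- margin holds (normal approximation)."""
    standard_error = math.sqrt(p * (1 - p) / n)
    z = margin / standard_error
    return 2 * NormalDist().cdf(z) - 1

wide = confidence_for_margin(0.60, 100, 0.10)    # 50% to 70%
narrow = confidence_for_margin(0.60, 100, 0.01)  # 59% to 61%

print(f"+/-10 points: {wide:.0%} confidence")
print(f"+/-1 point:   {narrow:.0%} confidence")
```

Under these assumptions the wide interval holds with a confidence a little above 95%, while the narrow one holds far less than half the time.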