So now you have the “right” questions – questions that drive meaningful responses – and you are ready to go. The next step is sampling.

**Consider the following famous example:** There are two hospitals: in the first, 120 babies are born every day; in the other, only 12. On average, the ratio of baby boys to baby girls born every day in each hospital is 50/50. However, one day, in one of those hospitals, twice as many baby girls were born as baby boys. In which hospital was it more likely to happen?

The answer is obvious to a statistician but, as research shows, not so obvious to a lay person: it is much more likely to happen in the small hospital. The reason is that the probability of a large random deviation from the expected 50/50 ratio decreases as the sample size increases.
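
To put numbers on the hospital example, here is a minimal Python sketch that computes the exact binomial probabilities. It assumes each birth is an independent 50/50 event and reads “twice as many girls as boys” as “at least two thirds of the day's births are girls”; the function name `prob_at_least` is made up for this illustration.

```python
from math import comb


def prob_at_least(n_births: int, min_girls: int, p_girl: float = 0.5) -> float:
    """Probability of seeing at least `min_girls` girls among `n_births` births."""
    return sum(
        comb(n_births, k) * p_girl**k * (1 - p_girl) ** (n_births - k)
        for k in range(min_girls, n_births + 1)
    )


# Assumption: "twice as many girls as boys" = girls are at least 2/3 of births.
p_small = prob_at_least(12, 8)     # small hospital: 8+ girls out of 12
p_large = prob_at_least(120, 80)   # large hospital: 80+ girls out of 120

print(f"Small hospital (12 births a day):  {p_small:.4f}")
print(f"Large hospital (120 births a day): {p_large:.6f}")
```

Under these assumptions the small hospital sees such a day with a probability of roughly 19%, while for the large hospital the probability is well below 0.1%.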

Sampling is the foundation of all research and, if done correctly, should yield valid and reliable information.

**The sample size depends on a number of factors:**

**Population Size** – How many people does your sample represent? This may be the number of people in a city you are studying, the number of people who buy smartphones, etc. Often you will not know the exact population size; it can be ignored when it is “large” or unknown.

**Confidence interval (error rate)** – the plus-or-minus figure usually reported in newspaper or television opinion poll results. For example, if you use a confidence interval of 5 and 90% of your sample answered that they “like Fridays more than other days of the week”, you can be “sure” that if you had asked the question of the entire relevant population, between 85% (90-5) and 95% (90+5) would have “liked Fridays” as well.

**Confidence level** – expressed as a percentage, it represents how often the true percentage of the population who would pick an answer lies within the confidence interval. A 95% confidence level means that if you repeated the survey 100 times, in about 95 of them the true population percentage would fall within the confidence interval around your result. It gives you an idea of how sure you can be of your results.

Your accuracy also depends on **the percentage of your sample that picks a particular answer**. If 99% of your sample said “Yes” and 1% said “No”, the chances of error are small for almost any reasonable sample size. However, if the percentages are 51% and 49%, the chances of error are much greater.
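
The factors above can be tied together with the standard textbook formulas for a simple random sample. The sketch below is only an illustration under that assumption (it is not necessarily how any particular calculator works): the function names, the `Z_SCORES` table and the finite population correction are my own choices for the example.

```python
from math import ceil, sqrt

# Standard normal z-scores for common confidence levels (rounded).
Z_SCORES = {90: 1.645, 95: 1.96, 99: 2.576}


def margin_of_error(n, p=0.5, confidence=95, population=None):
    """Plus-or-minus error (as a fraction) for a sample of size n.

    p is the share of the sample picking the answer; 0.5 is the worst case.
    """
    moe = Z_SCORES[confidence] * sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: matters only when n is a
        # noticeable share of the whole population.
        moe *= sqrt((population - n) / (population - 1))
    return moe


def required_sample_size(margin, p=0.5, confidence=95, population=None):
    """Smallest sample size that achieves the requested margin of error."""
    n0 = Z_SCORES[confidence] ** 2 * p * (1 - p) / margin**2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return ceil(n0)


print(required_sample_size(0.05))                    # large or unknown population
print(required_sample_size(0.05, population=1000))   # small, known population
print(f"{margin_of_error(400, p=0.99):.3f}")         # lopsided 99/1 split
print(f"{margin_of_error(400, p=0.51):.3f}")         # close 51/49 split
```

With these formulas, a ±5% error rate at a 95% confidence level needs 385 respondents for a large population but only 278 for a population of 1,000, and a 99/1 split carries a far smaller error than a 51/49 split at the same sample size.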

**Here is what I read in a respectable newspaper. It said:**

“…Research findings clearly indicate that the majority of the entire adult population will purchase the new product.

The research was conducted among 390 adults, where 53% of the respondents said they would definitely or probably purchase the new product….”

**Is there a problem?**

A sample of 390 adults gives results with an error rate of about ±5% (at the commonly used 95% confidence level). It means that, in reality, this 53% could be anywhere between 48% (53-5) and 58% (53+5). Because that range includes values below 50%, it is incorrect to conclude that “the majority of the entire adult population will purchase the new product”.
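
The same check can be done in a few lines of Python. This is a rough sketch that assumes a simple random sample and a 95% confidence level (z = 1.96); the newspaper quote does not state either, so treat both as assumptions.

```python
from math import sqrt

n = 390      # respondents, from the newspaper quote
p = 0.53     # share who would "definitely or probably" buy
z = 1.96     # assumed: 95% confidence level

moe = z * sqrt(p * (1 - p) / n)
low, high = p - moe, p + moe

# A "majority" claim only holds if the whole interval sits above 50%.
majority_supported = low > 0.5

print(f"Error rate: ±{moe:.1%}")
print(f"Plausible range: {low:.1%} to {high:.1%}")
print(f"Majority claim supported: {majority_supported}")
```

The lower end of the range falls just below 50%, so the data cannot rule out that fewer than half of adults would buy the product.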

So does the sample size matter? Yes and no. The larger the sample size, the smaller the chance of error, but the sheer size of a sample does not guarantee its ability to accurately represent a target population. Large unrepresentative samples can lead to wrong conclusions in the same way as small ones.
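
A small simulation makes this point concrete. Everything in it is hypothetical: a population where 40% would buy the product, a random sample of 400 people, and a much larger sample of 50,000 in which buyers are assumed to be twice as likely to respond as non-buyers.

```python
import random


def compare_samples(true_share=0.40, seed=0):
    """Return (estimate from small random sample, estimate from large biased one)."""
    rng = random.Random(seed)

    # Small but random: everyone has the same chance of being included.
    small = [rng.random() < true_share for _ in range(400)]

    # Large but biased: buyers respond 60% of the time, non-buyers 30%
    # (made-up response rates, chosen only to illustrate the effect).
    large = []
    while len(large) < 50_000:
        is_buyer = rng.random() < true_share
        response_rate = 0.6 if is_buyer else 0.3
        if rng.random() < response_rate:
            large.append(is_buyer)

    return sum(small) / len(small), sum(large) / len(large)


small_estimate, large_estimate = compare_samples()
print(f"True share of buyers:        {0.40:.1%}")
print(f"Random sample of 400:        {small_estimate:.1%}")
print(f"Biased sample of 50,000:     {large_estimate:.1%}")
```

In this setup the biased sample settles at around 57%, far from the true 40%, and adding more respondents does not help; the small random sample typically lands within a few points of the truth.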

Next time we will talk about data analysis and see how critical it can be for delivering accurate insights and actionable recommendations.

Please visit the MaCorr Research website to download a free sample size calculator. You will also find more details about sampling there.