The Proper Way to Conduct Stratified Sampling

Authored by Dr. Kelvin Goh, PhD in Physics

In an ideal world, our sampling frame would cover the entire population, and random sampling alone would give us a representative sample. In practice, however, a sampling frame never covers 100% of the population.

To ensure that our sample profile matches the population as closely as possible, we need to stratify our sample according to the population census. For example, as per the Malaysian census: Gender: Male 52%, Female 48%; Race: Bumiputra 62%, Chinese 21%, Indian 6%, Others 11%.

Sampling in Malaysia is especially challenging due to the high variance in demographics among states. For example, Penang has a 39% Chinese population while Terengganu has only 2%. Stratifying every state's sample by the national average of 21% Chinese will not be accurate, because you risk oversampling Chinese respondents in Terengganu and undersampling them in Penang, unless your survey method has a large sampling frame.
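The mismatch can be made concrete with a small sketch. It takes the Chinese population shares quoted above and compares the quota that a national-average stratification would set with the quota that a state-level stratification would set. The sample of 500 respondents per state and the function name `quota_gap` are assumptions made purely for illustration.

```python
# Chinese population shares quoted in the text above.
NATIONAL_CHINESE_SHARE = 0.21
STATE_CHINESE_SHARE = {"Penang": 0.39, "Terengganu": 0.02}


def quota_gap(state_sample_size, state_share, national_share=NATIONAL_CHINESE_SHARE):
    """Return (national-based quota) minus (state-based quota) for one state.

    Positive: the national average oversamples the group in that state.
    Negative: the national average undersamples it.
    """
    national_quota = round(state_sample_size * national_share)
    state_quota = round(state_sample_size * state_share)
    return national_quota - state_quota


if __name__ == "__main__":
    # Hypothetical sample of 500 respondents per state.
    for state, share in STATE_CHINESE_SHARE.items():
        print(state, quota_gap(500, share))
```

With 500 respondents per state, the national figure sets a quota of 105 Chinese respondents, against 195 for Penang and 10 for Terengganu.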

For nationwide sampling in Malaysia, it is recommended to stratify the sample using more granular, interlocked segments (e.g. Female x Malay: 25%, Male x Chinese: 10%). Higher-order stratification allows for more precise representation of the population. However, it requires a larger sample size and makes the sampling fieldwork more difficult, expensive and time-consuming.
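To see why finer segments push the sample size up, here is a rough sketch that computes the smallest total sample giving every interlocked cell a minimum number of respondents. The two cell shares come from the example above; the minimum of 30 respondents per cell and the function names are assumptions for illustration, not a recommended threshold.

```python
def min_total_sample(cell_share_pct, min_per_cell):
    """Smallest total sample that gives one cell at least `min_per_cell` respondents.

    `cell_share_pct` is the cell's population share as a whole-number percentage.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    if not 0 < cell_share_pct <= 100:
        raise ValueError("cell_share_pct must be between 1 and 100")
    # Ceiling division: ceil(min_per_cell * 100 / cell_share_pct).
    return -(-min_per_cell * 100 // cell_share_pct)


def required_sample(cells_pct, min_per_cell):
    """Total sample needed so that every listed cell reaches the minimum."""
    return max(min_total_sample(pct, min_per_cell) for pct in cells_pct.values())


if __name__ == "__main__":
    # Cell shares from the example in the text.
    cells = {"Female x Malay": 25, "Male x Chinese": 10}
    # Assumed minimum of 30 respondents per cell.
    print(required_sample(cells, min_per_cell=30))
```

The smallest cell drives the requirement: the smaller its share of the population, the larger the total sample has to be.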

To reduce the need for such precise representation, researchers often narrow their focus to specific target markets, such as online shoppers in Kuala Lumpur or regular consumers of fresh milk. However, to conduct stratified sampling on Kuala Lumpur’s online shopper population, one needs to know its representative strata (e.g. 44% Male and 56% Female), which may differ from the population census. Unlike census strata, the representative strata of a target market are often not known, and obtaining them requires running an Incidence Rate Measurement on the general population.
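As a minimal sketch of what an Incidence Rate Measurement yields, the snippet below derives the overall incidence rate and the target-market strata from screening counts. All counts are hypothetical, chosen only so that the result reproduces the 44% Male and 56% Female example; the function names are likewise assumptions.

```python
def incidence_rate(qualified, screened):
    """Share of screened respondents who belong to the target market."""
    if screened <= 0:
        raise ValueError("screened must be positive")
    return qualified / screened


def target_strata(qualified_by_stratum):
    """Share of each stratum among qualifying respondents only."""
    total = sum(qualified_by_stratum.values())
    if total == 0:
        raise ValueError("no qualifying respondents")
    return {name: count / total for name, count in qualified_by_stratum.items()}


if __name__ == "__main__":
    # Hypothetical general-population screening sample in Kuala Lumpur.
    screened = {"Male": 520, "Female": 480}
    # Hypothetical number of online shoppers found in each group.
    qualified = {"Male": 220, "Female": 280}

    print(incidence_rate(sum(qualified.values()), sum(screened.values())))
    print(target_strata(qualified))
```

Note that the strata of the target market (44% and 56%) differ from the gender split of the screened sample (52% and 48%), which is why census figures cannot simply be reused.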

The importance of Incidence Rate Measurement in stratified sampling is often underestimated

Stratified sampling relies on matching the sample strata to the representative strata to accurately reflect your target market. If your sample strata are matched against the wrong strata, your sample will not be representative of the target market, hence making your insights inaccurate. Therefore, it is very important that your Incidence Rate Measurement on your target market is done properly to ensure that your survey sample is stratified to accurately reflect your target population.

Because Incidence Rate Measurement surveys are usually conducted on the general population, a large sample size and a large sampling frame are needed to obtain a high confidence level. Incidence Rate Measurement surveys are typically very short, with no more than 5 questions. Therefore, to reduce cost, researchers often bundle the Incidence Rate Measurement questions of clients from various industries into an omnibus survey that samples the general population with a large sample size. However, omnibus surveys do not run frequently, so it may take a while to obtain the results of your Incidence Rate Measurement.
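The link between sample size and confidence can be sketched with the standard normal-approximation margin of error for a proportion. This assumes simple random sampling and a 95% confidence level (z = 1.96); it is an illustration of the general principle, not necessarily the calculation used for any particular omnibus survey. The sample sizes and the incidence rate of 50% are assumed values.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for an estimated proportion.

    p: estimated incidence rate (between 0 and 1)
    n: number of respondents, assumed to be a simple random sample
    z: z-score for the confidence level (1.96 for roughly 95%)
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Assumed incidence rate of 50%, the worst case for the margin of error.
    for n in (1_000, 20_000):
        print(n, round(margin_of_error(0.5, n), 4))
```

Because the margin shrinks with the square root of the sample size, a 20-fold increase in respondents narrows it by a factor of about 4.5.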

Example of Stratified Sampling

Now that I have explained the steps needed to conduct stratified sampling, let me summarize the process using our earlier example of a brand health survey on online shoppers in Kuala Lumpur. The first step is to determine the representative strata of online shoppers in Kuala Lumpur, which can be done by conducting an Incidence Rate Measurement on the general population of Kuala Lumpur. Once the strata of the online shoppers are obtained, we can run the brand health survey on online shoppers in Kuala Lumpur, stratifying the sample according to the strata found by the Incidence Rate Measurement. This ensures that the data obtained from the brand health survey is highly representative of online shoppers in Kuala Lumpur.
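The second step, turning measured strata into fieldwork quotas, can be sketched as follows. The 44% Male and 56% Female strata are the example figures used earlier; the sample size of 500 and the largest-remainder rounding rule are assumptions chosen for this illustration, and other rounding rules are equally possible.

```python
def allocate_quotas(strata_pct, sample_size):
    """Split `sample_size` across strata in proportion to whole-number percentages.

    Uses largest-remainder rounding so that the quotas always add up to
    exactly `sample_size`.
    """
    if sum(strata_pct.values()) != 100:
        raise ValueError("strata percentages must sum to 100")

    quotas, remainders = {}, {}
    for name, pct in strata_pct.items():
        # Whole respondents for this stratum, plus the fractional part left over.
        quotas[name], remainders[name] = divmod(pct * sample_size, 100)

    # Hand out the respondents lost to rounding, largest remainder first.
    leftover = sample_size - sum(quotas.values())
    for name in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[name] += 1
    return quotas


if __name__ == "__main__":
    # Strata from the Incidence Rate Measurement example in the text.
    online_shopper_strata = {"Male": 44, "Female": 56}
    # Hypothetical brand health survey of 500 online shoppers.
    print(allocate_quotas(online_shopper_strata, 500))
```

For 500 respondents this yields quotas of 220 male and 280 female online shoppers.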

However, it is worth noting that stratified sampling only reduces the sampling bias caused by a small sampling frame, by matching the sample strata of a handful of controlled variables (i.e. demographics) to the census. It does not resolve the sampling bias due to other, unknown variables. That bias can only be resolved by a high degree of random sampling or a large sampling frame, which traditional online panels do not have.

The Vodus Solution

Because of Vodus's unique ability to draw large sample sizes from the general population in a very short period of time, we can help clients significantly shorten their fieldwork timeline by completing their Incidence Rate Measurement with huge sample sizes (N = 20K) within a day. Furthermore, such huge sample sizes can be obtained at a significantly lower cost than with traditional survey methods.

By deriving the strata of their target market from Incidence Rate Measurement with huge sample sizes and an online panel with huge sampling frames, clients can be assured that their stratified sample will be truly representative of their target population.

For more info on how Vodus can help you gain massive sample sizes at a low cost, please reach out to us at contact@vodus.com. 
