Learn why your sample needs to be stable, homogeneous, and representative of the batch from which it is taken.
Developing a precise and accurate analytical method is important, but it is not enough. The sample itself plays a role in determining the quality of your results. Your sample needs to be stable, homogeneous, and representative of the batch from which it is taken. Below, we discuss each of these requirements in more detail and what they mean for the quality of results obtained in cannabis analysis.
The term sample is defined as “a part of anything presented . . . as evidence of the quality of the whole” (1). Samples are something we encounter in our everyday lives. We may nibble on a sample of food presented to us at a grocery store. Your doctor may collect a sample of your blood. Come election time, we are bombarded with opinion polls that talk about something called “sample size” and that pesky “margin of error.” Ideally, all these samples represent the greater whole from which they are drawn. Hopefully, the product you buy because of a food sample will be as tasty as the nibble that originally enticed you. Your blood sample needs to be representative of your health, or your doctor might make a wrong diagnosis. If a political poll is biased, it can be misleading. A sample taken improperly, one that does not represent the quality of the whole, is called an unrepresentative sample, and unrepresentative sampling leads to what is called sampling error.
What’s all this got to do with cannabis analysis? Many other industries must collect representative samples to perform chemical analyses, and the cannabis industry is no different. In fact, in California the cannabis analysis laboratories themselves are tasked with collecting representative samples for compliance testing (2). Most cannabis businesses also collect samples for testing for their own purposes. The purpose of this article, then, is to discuss the importance of representative sampling, offer some practical tips on how to ensure representative sampling, and show how to quantitate and minimize sampling error.
Why should we care about any of this? Because an unrepresentative sample can give incorrect or misleading results. For example, it has been shown that cannabis buds harvested from the top, middle, and bottom of a plant can vary in potency by several percent (3), and that the buds at the top tend to have the highest potency. Now, if it’s your job to collect a representative sample from a field of plants, how would you go about doing it? It might be tempting to grab buds from the top of the plants because that is easiest and would give you the highest potency reading. This would also be wrong because it would not give you an accurate picture of the potency of your crop. You can’t possibly optimize the use of water, fertilizer, and light if you don’t have a representative way of sampling your grow. So, how do we know if the sample we’ve taken is truly representative of the whole? How do we go about collecting a sample to ensure this? Please keep reading.
The term aliquot is defined as “comprising a known fraction of a whole” (4). Unless you’re a chemist, an aliquot is probably something you haven’t heard of. An aliquot is similar to a sample, but in analytical chemistry an aliquot is generally the portion of the sample that is actually analyzed. An aliquot, then, is a sample of the sample, and it too must be as representative of the whole as the original sample was.
We are forced to take aliquots because of the needs of our instruments. For example, the state of California requires the collection of a 50 lb harvest batch of cannabis plant material prior to analysis (2). We can’t shove 50 lbs of material through a chromatograph or spectrometer. In fact, today’s analytical instruments are so sensitive that only small amounts of sample are required for a successful analysis. For example, most high performance liquid chromatography (HPLC) potency measurement methods only require 1 g or so of sample (5–7). The 1-g portion analyzed in this case is an aliquot. Our same concerns about samples expressed above also apply to aliquots. How do we know if the aliquot is truly representative of the sample? How do we go about collecting an aliquot properly? How do we quantify aliquot homogeneity?
In sampling, homogeneity is our friend and inhomogeneity is our enemy. Homogeneous materials are the same throughout the whole; inhomogeneous materials vary throughout the whole. Not surprisingly, then, collecting representative samples can range from easy to difficult depending upon how homogeneous the whole is. A perfect whole from which to collect a sample would be homogeneous and convenient to sample. Liquids fit this bill. Our enemy in sampling liquids is concentration gradients, an inhomogeneous distribution of molecules. Because molecules can diffuse in liquids, concentration gradients tend to decrease naturally over time, but this happens more readily in nonviscous liquids than in viscous ones: think about the difference in viscosity between water and cannabis distillate. Fortunately, any concerns about liquid inhomogeneity can be overcome by simply mixing or stirring the sample. I performed a small experiment to confirm this. I was given a vial of unwinterized cannabis extract and asked to measure its potency. I used a mid-infrared (IR) method because it was fast and convenient (8,9). The sample was a thick, brown liquid with yellow streaks in it, obviously not uniform. I initially took three separate aliquots from the sample, analyzed them for total tetrahydrocannabinol (THC), and observed a spread in the values. I then simply stirred the sample to homogenize it, measured three more aliquots for total THC, and found the spread in measurements had been reduced by a factor of 2. This is a strong argument for always stirring liquids before analyzing them.
Now, what if you are tasked with collecting a representative sample from multiple containers of liquid? If the liquids are not too viscous, and if you have a large enough container, combining the contents of all the containers in one vessel, stirring, and then taking samples makes sense. But what if you have five jars of thick, viscous cannabis distillate to sample? These materials are not amenable to pouring, as they have the consistency of honey or even taffy. What do you do then? Because this material is sticky, you can take a long rod, such as a glass rod or metal spatula, plunge it deep into each jar of distillate, and then pull it out. The material that adheres to the length of the rod comes from different depths in the jar and is a good representative sample of what is in the jar. Next, you collect the material that adhered to each rod and combine it into a single sample; for thick materials like cannabis distillate, gentle heating to liquefy the material so it can be removed from the rod, followed by stirring, can be used to homogenize it. Then take and analyze representative aliquots of the combined sample.
Representative sampling is more difficult with solids because the molecules generally do not diffuse to iron out concentration gradients. One can stir or shake solids, but this can actually worsen inhomogeneity. Do you remember when, as a kid, you opened a cereal box for the prize inside and it was always at the top? There is physics behind this: any time a collection of particles of different sizes is shaken, the big particles rise to the top. Thus, the instinct to shake solid samples to homogenize them is counterproductive.
So, what do we do? One way to homogenize solids is to grind them. Any number of grinding devices exist, from mortars and pestles to expensive, high-speed grinders (10,11). In a perfect world, a grinder takes a sample and produces a finely divided powder of uniform particle size and shape. Whether this is achieved depends, of course, on the sample and the grinder.
An example of a substance that is particularly difficult to grind homogeneously is cannabis plant material. Even well-trimmed cannabis bud is inhomogeneous because buds contain stems and leaves. The trim used to make extracts is worse, containing buds, stems, leaves, seeds, and sometimes foreign matter. Worse still, individual buds are themselves inhomogeneous, as can be seen under a microscope. The cover of a recent issue of this magazine illustrates my point (12).
My own experience is that a mortar and pestle does not work on cannabis plant material because the material is not dry enough to pulverize into a powder. The bud grinders commonly sold by dispensaries are insufficient because they produce large chunks of inhomogeneous material. Coffee grinders by themselves are also problematic: in my experience, when buds are ground this way you end up with a chunky, gooey mess. The problem is that the friction of grinding heats the sample and melts the resin, causing the mess. What I have found works well is placing a 1-in. diameter piece of dry ice into the coffee grinder with the plant material. The dry ice absorbs the heat of grinding, prevents the resin from melting, and hardens the plant material so it pulverizes better. I have also heard that an industrial spice grinder works well on cannabis plant material (13). Regardless of what grinding method you choose, the resulting material should be as uniform as possible.
Regardless of our efforts to homogenize samples, a truly 100% homogeneous sample is rare, particularly with ground samples. How then do we get around this problem? The answer is: by taking multiple aliquots, analyzing them, and averaging the results. The more aliquots you analyze, the more likely it is that any variation in composition in your sample will be captured in your analyses, giving you an accurate picture of your sample. This is the “sample size” you might have heard about in opinion polls. So, how many aliquots should you analyze? As many as possible. Of course, in the real world, time, money, and other constraints put limits on this. In general, three aliquots are the minimum number that should be collected and analyzed to have even a hope of obtaining a representative sample.
There are two reasons to maximize the number of aliquots analyzed. The first has to do with reducing random error and maximizing the signal-to-noise ratio (SNR) of a measurement. Recall from my last column (14) that random error can be reduced, and accuracy improved, by making multiple observations and averaging them together. The SNR of the measurement improves as the square root of the number of observations averaged:
SNR ∝ N^(1/2)     [1]
where SNR is the signal-to-noise ratio and N is the number of observations averaged together.
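As a rough illustration of this square-root relationship, here is a short sketch in Python using simulated Gaussian noise. The true value, noise level, and function name are arbitrary assumptions for the demonstration, not part of any real potency method:

```python
import random
import statistics

def noise_after_averaging(n_obs, trials=2000, true_value=20.0, noise_sd=1.0):
    """Simulate averaging n_obs noisy readings many times and return the
    spread (standard deviation) of the averaged results across trials."""
    averages = []
    for _ in range(trials):
        readings = [random.gauss(true_value, noise_sd) for _ in range(n_obs)]
        averages.append(sum(readings) / n_obs)
    return statistics.stdev(averages)

random.seed(42)
# The residual noise shrinks roughly as 1/sqrt(N), so SNR grows as sqrt(N):
# averaging 4 readings roughly halves the noise, 16 readings quarters it.
for n in (1, 4, 16):
    print(n, round(noise_after_averaging(n), 3))
```

Running this shows the spread of the averaged results dropping by about a factor of 2 each time the number of observations is quadrupled, which is exactly the N^(1/2) behavior of equation 1.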
The fact that we cannot analyze the whole of something and have to sample is the cause of sampling error. Sampling error is a type of random error, and its effects can be reduced like any other random error by taking multiple aliquots, analyzing them, and then calculating an average reading.
Recall (14) that random error has an equal probability of having a positive or negative value, and is expressed as a result ±x, where x is the random error in the measurement. Sample inhomogeneity can be a source of random error and contribute to its magnitude. How do we quantitate this? How do we measure whether a sample is homogeneous or heterogeneous?
Imagine you are tasked with determining the potency of a bag of cannabis buds, and you decide to experiment with two different grinding techniques. You select five aliquots of each ground sample, follow your standard potency method, and obtain five results. Let’s say the results for grinding method 1 are 18%, 19%, 20%, 21%, and 22%. The average of these five numbers is 20%.
Imagine the results for the five aliquots using grinding method 2 are as follows: 16%, 18%, 20%, 22%, and 24%. The average here is still 20%. Since these two sets of measurements gave the same average, are the grinding methods of equivalent quality? The answer is no. Because we sampled the same material here, and assuming the potency method was performed the same way on all aliquots, the spread in the two sets of readings tells us how homogeneous the samples are, which is another way of expressing the sampling error. The first method is superior to the second because its tighter spread of readings shows it produced a more homogeneous sample. How then do we quantitate the spread in a set of data and get a handle on the size of the sampling error?
We can quantitate the amount of scatter in a dataset, and thus obtain a numerical measure of dataset quality, by calculating its standard deviation (15):
σ = [Σ(x – x′)² / (n – 1)]^(1/2)     [2]
where σ is the standard deviation, x is a measured value, x’ is the average of a set of values, and n is the number of observations in the dataset.
To calculate the standard deviation for a set of values, calculate the average, subtract the average from each measured value, square each of those differences, add them together, divide by the number of observations minus 1, and then take the square root of that number. The standard deviation comes out in the same units as x; thus, if x is in units of weight percent (wt.%) THC, so is the standard deviation. Simply put, the standard deviation is the average deviation between a set of individual readings and their average. A dataset with a large amount of scatter will have a larger standard deviation than a dataset with a small amount of scatter. The dataset from grinding method 1 above has a standard deviation of 1.58 wt.% THC, while the grinding method 2 dataset has a standard deviation of 3.16 wt.% THC. Even though both datasets have an average of 20 wt.%, the first grinding method is superior to the second because its standard deviation is smaller by a factor of 2. Simply put, grinding method 1 produces a more homogeneous sample and thus has less sampling error than grinding method 2.
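The grinding-method comparison above can be reproduced in a few lines of Python; the standard library’s `statistics.stdev` applies the n – 1 formula of equation 2 directly:

```python
import statistics

# Potency results (wt.% THC) for five aliquots from each grinding method
method_1 = [18, 19, 20, 21, 22]
method_2 = [16, 18, 20, 22, 24]

for name, data in (("method 1", method_1), ("method 2", method_2)):
    mean = statistics.mean(data)   # both methods average 20 wt.%
    sd = statistics.stdev(data)    # sample standard deviation, n - 1 in denominator
    print(f"{name}: mean = {mean:.1f} wt.%, std dev = {sd:.2f} wt.%")
# method 1: mean = 20.0 wt.%, std dev = 1.58 wt.%
# method 2: mean = 20.0 wt.%, std dev = 3.16 wt.%
```

The identical means and factor-of-2 difference in standard deviation match the hand calculation in the text.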
In the bigger picture, there are many sources of random error, as discussed in my previous column (14). When multiple observations of a quantity are made, the standard deviation measures the amount of random error in the measurement. Recall (14) that accuracy is a measure of how far away you are from the true value. Imagine x′ in equation 2 is the true value of a quantity, obtained from a standard reference material, and the x values are the set of values determined by your instrument. Inserting these values into equation 2 yields the accuracy of your instrument (16). Datasets with narrower distributions will always have less random error, a smaller standard deviation, and higher accuracy than datasets with broader distributions.
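This use of equation 2 with a known true value can be sketched as follows. The certified potency and the instrument readings here are entirely hypothetical numbers chosen for illustration:

```python
# Hypothetical example: a standard reference material with a certified
# potency (the "true value"), measured five times on an instrument.
true_value = 20.0                            # certified wt.% THC (assumed)
readings = [19.5, 20.3, 19.8, 20.6, 19.9]    # hypothetical instrument readings

# Substitute the true value for the average in equation 2: the result is
# the spread of the readings relative to truth, a measure of accuracy.
n = len(readings)
accuracy = (sum((x - true_value) ** 2 for x in readings) / (n - 1)) ** 0.5
print(f"accuracy estimate: {accuracy:.2f} wt.% THC")
# accuracy estimate: 0.43 wt.% THC
```

The only change from an ordinary standard deviation is that the deviations are taken from the certified value rather than from the dataset’s own average.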
How then do we minimize sampling error, and random error in general? We achieve this by maximizing the number of aliquots analyzed. According to equation 1, increasing the number of aliquots increases the SNR of a measurement. According to equation 2, because n, the number of observations made, appears in the denominator, σ goes down as n goes up. Both equations, then, argue powerfully for analyzing as many aliquots as possible when sampling.
Let me be clear about something, though. Taking a single aliquot of a sample and analyzing it is by definition not representative sampling, even if your state regulations say it is acceptable (2). All samples contain some variation, and a single aliquot by itself cannot capture the variation in a sample or in the whole the sample represents. I am thus distressed by the persistent trend in the cannabis analysis industry of analyzing only one aliquot of a sample. The minimum number of aliquots you should test is three, and ideally many more, particularly when faced with a complex, inhomogeneous, natural material such as the cannabis plant. This argues, then, for the development in this industry of analytical techniques that are fast, easy, and inexpensive, so that multiple aliquots can be analyzed in a timely and efficient manner (8,9) to obtain representative sampling.
We have seen that a sample ideally is a representative portion of a whole, and that an aliquot is a representative portion of the sample that is actually analyzed. The challenges of sampling liquids, solids, and powders were discussed. The standard deviation was introduced as a measure of random error and sample inhomogeneity, and its relationship to accuracy was discussed. To maximize the accuracy of a dataset, the number of aliquots analyzed from a sample should be as large as possible. Measuring a single aliquot by definition does not provide a representative sample and should be avoided.
Brian C. Smith, PhD, is Founder, CEO, and Chief Technical Officer of Big Sur Scientific in Capitola, California. Dr. Smith has more than 40 years of experience as an industrial analytical chemist having worked for such companies as Xerox, IBM, Waters Associates, and Princeton Instruments. For 20 years he ran Spectros Associates, an analytical chemistry training and consulting firm where he taught thousands of people around the world how to improve their chemical analyses. Dr. Smith has written three books on infrared spectroscopy, and earned his PhD in physical chemistry from Dartmouth College.
B.C. Smith, Cannabis Science and Technology 2(1), 14–19 (2019).