Degree of Variance in THC Levels in Samples Taken from Single Lots of Medical Cannabis

In the work presented here, the author reports on test results collected from the New Jersey State laboratory (the Department of Health’s Environmental and Chemical Laboratory Services), which has conducted testing for the state’s medicinal program. He documented an average variance of 20% in potency levels between different samples of the same test lot.

Cannabis products are commonly labeled for sale with a single number as an indicator of potency, or one number for each cannabinoid listed. These potency values are derived from testing that is done on multiple samples for each production lot, but information on the range of potencies found in those samples is generally not reported. By collecting reports from the New Jersey State laboratory (the Department of Health’s Environmental and Chemical Laboratory Services), which has conducted testing for the state’s medicinal program, I documented an average variance of 20% in potency levels from different samples of the same test lot. Surprisingly, a greater degree of variance was found in strains that had mid-level potency (<15% THC) than in higher potency strains.

Many companies that sell cannabis products advertise the consistency of their products, especially for medicinal use. Rigorous quality control testing is needed to back up such claims, and that testing should be done on several randomly chosen samples of the product. This has long been the standard for manufactured and processed products, but is even more important for natural, unprocessed products.

All natural agricultural products can be expected to have plant-to-plant variation in their properties. Because of variable growing conditions, this is even true of plants that are clones of the same mother plant. In recognition of this natural variation, most state guidelines for cannabis testing specify the number of separate samples that must be compared to produce potency ratings, with more samples needed for larger lot sizes.

While testing guidelines are written to consider natural variation between different plants, the literature on this subject appears to lack well-controlled documentation of how much variation should be expected. In a research document detailing recommendations for a sampling program for the state of Washington, Sexton and Ziskind (1) documented variation in the quoted tetrahydrocannabinol (THC) levels between different production lots of the same strains but cited no evidence for variation within each lot. In fact, they wrote that, “... there is little literature to tell us that plants of the same strain grown under identical conditions can be assumed homogenous, or to what extent they differ.” Experienced growers and testers are surely familiar with the degree of variation that exists, but a well-controlled study may provide some further insight.

The variance across samples does not present a methodological problem for determining a single representative value: the samples can be ground together to produce a composite with greater homogeneity, just as consumers are able to do when they receive the product. Although testing laboratories have ways to derive a single representative number for a batch, that does not take away the variance between different portions (buds) of the product that consumers will confront when they acquire it.

A different kind of variation, that between ratings made by different laboratories on the same batch (2), has been a recurring concern in the industry, but it is a different problem from the variation inherent in the plant. There are ways to address concerns about different laboratories producing different results, but having the means to derive a “right” answer should not obscure the inherent variances in the plant material itself. Consumers should be given the most accurate picture of the product, including how much variance they might expect from bud to bud.

One benefit of a careful examination of the sample-to-sample variability inherent in cannabis products may be as a guide for regulations that result in more informative labeling. While every product is labeled with its THC content, this single value does not convey the variation that exists between different portions of the same lot. Rather, the numbers on labels, particularly when they quote cannabinoid percentages with one or two decimal places, likely give the consumer a false sense of accuracy. No one familiar with the production of these products could confidently state that the detailed THC levels on the labels of flower product are applicable to each portion of each package. Perhaps a detailed examination of sample variance will provide a useful alternative.

Experimental

Data Source

A good data set for examining sample-to-sample variability in cannabis comes from New Jersey’s medical program. All of the growing is done under highly controlled conditions, and for five years, all of the product testing has been done by a single, publicly funded laboratory (the Department of Health’s Environmental and Chemical Laboratory Services). This laboratory has publicly posted strain test results for patients to review (3).

The test results posted from June 2016 through March 2019 included results for each of the five samples for a given strain, as well as results for a composite made of the five samples. Six concentration levels were thus reported for all eight cannabinoids being measured (after March 2019, the posted results only contain a single measure for each cannabinoid). Reviewing all the reports currently posted, 100+ sets of data points were recovered which included cannabinoid levels on each of five samples of the same batch, plus the composite. Six producers were active in New Jersey throughout 2019, and at least 10 reports were collected from each. Previous studies on this data set (4) reported on the relative frequency of strain types (overwhelmingly THC-dominant) and the covariance of cannabinoids.

Analysis

In the present study, only the THC-dominant strains were examined, reducing the data set to 98 entries. From each of the five sample reports plus the composite test, the THC percentage and the tetrahydrocannabinolic acid (THCA) percentage were recorded. As a first step in analysis, the THC and THCA values were summed for a total THC value (after the 0.877 weight correction for THCA). To analyze each strain report, a calculation was made of the average of the five total THC values as well as the standard deviation across the five values. In addition, the relative standard deviation (RSD) was calculated by dividing the standard deviation by the average value. In this way, the amount of variation can be compared across strains with a range of potencies.
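The per-lot calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the study: the function names and the example values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev

THCA_TO_THC = 0.877  # weight correction for THCA cited in the text


def total_thc(thc_pct, thca_pct):
    """Total THC (%) = THC + 0.877 * THCA."""
    return thc_pct + THCA_TO_THC * thca_pct


def summarize_lot(samples):
    """samples: list of (THC %, THCA %) pairs, one pair per sample of a lot."""
    totals = [total_thc(thc, thca) for thc, thca in samples]
    avg = mean(totals)
    sd = stdev(totals)  # sample SD (n - 1); an assumption, see the lead-in
    return {"totals": totals, "mean": avg, "sd": sd, "rsd": sd / avg}


# Hypothetical values for illustration only, not taken from the data set.
example_lot = [(0.5, 18.0), (0.6, 20.0), (0.4, 17.0), (0.5, 21.0), (0.7, 19.0)]
summary = summarize_lot(example_lot)
print(f"mean={summary['mean']:.2f}%  sd={summary['sd']:.2f}  rsd={summary['rsd']:.1%}")
```

Because the RSD divides the standard deviation by the mean, it lets a 12% strain and a 24% strain be compared on the same relative scale.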

A new calculation was also made that perhaps best characterizes the information consumers would find useful: how much variance can be expected between those portions with the highest concentrations and those with the lowest? This high–low variance was calculated by dividing the highest value of the five samples by the lowest, minus 1. Importantly, this measure of variance is relative to the expected concentration. If a consumer is expecting an 11% product and instead gets 14% (a 27% variance), that will likely be much more noticeable than expecting 22% and getting 25% (about a 14% variance). The high–low variance, being relative to overall potency, might reflect the degree of subjective variance that patients could experience.
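The high–low variance is simple enough to express as a single function. The sketch below follows the definition in the text (highest value divided by lowest, minus 1); the function name is hypothetical, and the guard against a non-positive lowest value is an added safety check rather than part of the described method.

```python
def high_low_variance(totals):
    """Highest total THC value divided by the lowest, minus 1."""
    lowest = min(totals)
    if lowest <= 0:
        raise ValueError("lowest sample value must be positive")
    return max(totals) / lowest - 1


# The two cases discussed in the text: 11% vs. 14%, and 22% vs. 25%.
print(f"{high_low_variance([11.0, 14.0]):.1%}")  # 27.3%
print(f"{high_low_variance([22.0, 25.0]):.1%}")  # 13.6%
```

The same three-point difference yields roughly half the relative variance at the higher potency, which is the point of normalizing to the lowest value.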

To present data on individual strains that contain several data points, a charting tool was adopted from the world of finance: the stock price candlestick chart. These charts are typically used to graphically present the several values that a stock price has during a single day: the stock’s high price for the day, its low price, its opening price, and its closing price all in a single image. I have adopted that format to show, for a single strain, the highest total THC concentration recorded from one sample, the lowest sample value, the average of the five sample values, and the concentration measured in the composite of the five samples (Figure 1).
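For readers who want to reproduce this kind of chart, the four values behind each candlestick entry can be assembled as follows. This is only a sketch: the `Candle` class and `make_candle` function are hypothetical names, and the color convention (white when the composite is above the average, black otherwise) is an assumption inferred from the later description of anomalous entries, not a documented rule.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Candle:
    high: float       # highest of the five sample values (top of the wick)
    low: float        # lowest of the five sample values (bottom of the wick)
    average: float    # mean of the five sample values (one end of the body)
    composite: float  # value measured on the composite (other end of the body)

    @property
    def body_color(self):
        # Assumed convention: white when the composite is above the average.
        return "white" if self.composite > self.average else "black"


def make_candle(sample_totals, composite_total):
    """Build one chart entry from five sample totals and the composite total."""
    return Candle(
        high=max(sample_totals),
        low=min(sample_totals),
        average=mean(sample_totals),
        composite=composite_total,
    )


# Hypothetical values for illustration only.
candle = make_candle([18.0, 20.0, 17.0, 21.0, 19.0], 19.5)
print(candle, candle.body_color)
```

Any plotting library that draws a thin high–low line with a thicker bar between the average and composite values can render these entries.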

Results and Discussion

A reasonable hypothesis one might have about the degree of sample-to-sample variation in THC levels is that it would be higher for strains with higher THC values. The evidence does not support this idea. Figure 2 shows two depictions of the information from one of the six growers operating in New Jersey. The strain data are the same in both but are arranged differently on the x-axis: in the first they are arranged by the average potency of the strains, and in the second they are arranged by the standard deviation for each set of samples. In the first chart, it is apparent that the bars move up from left to right, but there is no trend toward a greater spread between the high and low entries, nor toward a greater difference between average and composite values. In the second chart, arranged by increasing standard deviation, the entries with the wider gaps are clustered to the right, but there is no trend toward higher potency values: the higher potency strains are at the mid-range of standard deviation.

The data from just one grower is displayed in Figure 2, and it appears that there are nearly the same number of white bars as black bars (that is, as many cases of the average being above or below the measure of the composite of five samples). This is not the case for the entire data set, however. Out of the 98 entries, in 57 cases the average value was greater than the composite, and for 41 it was less. There was no clear relationship between either overall potency or spread and whether the composite or the average was greater.

Another feature of the candlestick charts is that they draw attention to anomalous entries, where the central bar is at the extreme of the five sample values. In each case, either the composite value is lower than the lowest sample value (or nearly so), resulting in a black bar touching the bottom, or the composite value is higher than the highest sample (or nearly so), resulting in a white bar reaching to the top of the entry. Remarkably, in this data set of 98 entries, there were 16 examples in which the total THC value for the composite of the five samples was outside the range of the samples themselves (eight cases higher than the highest sample and eight lower than the lowest; each of these instances was double-checked against the original posting of laboratory data). The conclusion to draw may only be that the laboratory’s process for compositing the samples was not very thorough, but it also supports the idea that the mathematical average of the samples, taking equal account of the high and low samples, is the best single number to represent one set of samples.
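A check for composites that fall outside the range of their own samples might look like the following. The function name and the example lots are hypothetical and serve only to illustrate the comparison; they are not drawn from the posted laboratory data.

```python
from collections import Counter


def composite_position(sample_totals, composite_total):
    """Classify the composite relative to the range of the sample values."""
    if composite_total > max(sample_totals):
        return "above"
    if composite_total < min(sample_totals):
        return "below"
    return "within"


# Hypothetical lots: (five sample totals, composite total).
lots = [
    ([17.0, 18.5, 19.0, 20.2, 21.0], 19.4),  # composite within the range
    ([17.0, 18.5, 19.0, 20.2, 21.0], 21.6),  # composite above the highest
    ([12.0, 12.8, 13.5, 14.1, 15.0], 11.7),  # composite below the lowest
]
counts = Counter(composite_position(samples, comp) for samples, comp in lots)
print(dict(counts))
```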

Relative Standard Deviation and High–Low Variance

Relative standard deviation (RSD) is a common measure of the divergence of values in a sample set. Similar to the high–low variance measure described above, it provides added information by normalizing the variance to the base level of the sample set (a deviation of five units has a very different meaning if the base value is 10 rather than 100). The RSD values for the subset from the single grower used in Figure 2 were plotted against the high–low variance for those same samples, resulting in the scatter plot in Figure 3.

With the high correlation (r² = 0.95) of these two measures of divergence, a choice can be made based on which measure is more meaningful for this purpose. The high–low variance measure has the advantage that it is likely more intuitive for the interested audience: a 30% variance between extreme samples has more practical meaning than a measure of standard deviation. In addition, the high–low variance measure is not predicated on there being a “right answer” from which the values deviate; it simply describes the observed natural variation.
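An r² value of this kind can be computed as the square of the Pearson correlation between the two measures, which for a simple linear fit equals the coefficient of determination. The sketch below assumes that interpretation; the function name and the paired values are hypothetical and are not the figures plotted in Figure 3.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-strain values: RSD and high-low variance as fractions.
rsd_values = [0.03, 0.06, 0.08, 0.12, 0.18]
high_low_values = [0.07, 0.15, 0.22, 0.30, 0.52]
r_squared = pearson_r(rsd_values, high_low_values) ** 2
print(f"r^2 = {r_squared:.2f}")
```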

High–Low Variance and Potency

With the higher standard deviations seen for strains with mid-level potencies, it is not surprising that the degree of variance between high and low samples, in percentage terms, was also found to be higher in the middle range of potencies. This is illustrated for the same single-grower data subset used in Figure 2. The range of high–low variance in this set of 13 strains was from 6.4% at the low end to two cases of variances over 50%.

A view of the entire data set of 98 strain tests bears out the impression from the subset: greater variance is not found in the highest potency strains. A candlestick graph for all strains, arranged on the x-axis by the high–low variance, shows a trend toward lower potencies toward the right side of the chart, with higher potency strains toward the left (lower variance).

The degree of variance between samples of a single lot of cannabis flower may come as a surprise to those who have not worked in producing or testing the material. Fully 30% of the strains in this data set have a variance between high and low samples of 25% or greater, and 9% of strains had greater than 50% variance. The median variance is 19.8% (the mid-point along the x-axis seen in Figure 4), meaning that a randomly chosen strain could be expected to have nearly 20% variance in the potency of individual portions; if one chooses lower potency strains, the chances of having higher than 20% variance increase.
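Summary figures like these (the median, the share at 25% or greater, and the share above 50%) follow directly from the list of per-strain high–low variances. The sketch below uses a hypothetical function name and a short made-up list, so its output does not reproduce the numbers reported above.

```python
from statistics import median


def summarize_variances(variances):
    """variances: per-strain high-low variances expressed as fractions."""
    n = len(variances)
    return {
        "median": median(variances),
        "share_25_or_more": sum(v >= 0.25 for v in variances) / n,
        "share_over_50": sum(v > 0.50 for v in variances) / n,
    }


# Hypothetical values for illustration only.
example = [0.06, 0.11, 0.15, 0.19, 0.21, 0.24, 0.27, 0.33, 0.41, 0.55]
print(summarize_variances(example))
```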

High–Low Variance by Grower

The degree of intersample variance might not be a matter of chance, however, and the factors that influence it bear further investigation. One possible source of variance was investigated and is reported here, but it does not prove to be helpful. Six licensed growers of medical cannabis in New Jersey were the source of all the strains tested. It may be that some growers consistently had better or worse control of the variance in their product. A graph of the median variance for each grower, plotted with the number of strain tests that they contributed to this data set, shows a comparable degree of variance among them. It does not appear that growers with a greater number of strains tested over this period produced tighter variances.
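The per-grower comparison amounts to grouping the high–low variances by grower and taking the count and median of each group. A minimal sketch, with a hypothetical function name and placeholder grower labels and values:

```python
from collections import defaultdict
from statistics import median


def median_variance_by_grower(records):
    """records: iterable of (grower, high-low variance) pairs.

    Returns {grower: (number of strain tests, median variance)}.
    """
    by_grower = defaultdict(list)
    for grower, variance in records:
        by_grower[grower].append(variance)
    return {g: (len(vals), median(vals)) for g, vals in by_grower.items()}


# Hypothetical records; grower labels and values are placeholders.
records = [
    ("Grower A", 0.12), ("Grower A", 0.20), ("Grower A", 0.31),
    ("Grower B", 0.18), ("Grower B", 0.22),
]
for grower, (n_tests, med) in median_variance_by_grower(records).items():
    print(f"{grower}: n={n_tests}, median={med:.1%}")
```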

The flip side to the observation that mid-potency strains had higher variances is that high-potency strains had, on average, lower variances. If this observation is confirmed in larger data sets, it may be thought to reflect a property of the plant: the combination of strain type and growing conditions necessary to yield high potency also yields high consistency.

Supporting Evidence

The evidence presented shows that there is considerable inter-sample variation in THC levels from single lots of medical cannabis tested for the New Jersey program. The possibility that the variation is an artifact of poor consistency in the testing methods, rather than in the plants themselves, was investigated by comparing strain results obtained on the same day by the laboratory. For four strains that had samples prepared and high-performance liquid chromatography (HPLC) tests run on the same day, even in the same hour, the high–low variances were 10%, 25%, 51%, and 53%. It does not appear that method or machine performance can be looked at as a factor producing such disparate results. Other single-day results also showed a spread of variances that mirrored the data set as a whole. Finally, if the variances were a product of laboratory performance rather than natural variation, we should not expect to see the observed relationship between potency and high–low variance.

Many laboratories and growers must have data that can support or refute these observations and conclusions, but few are publicly available. One other source of public data that allows inter-sample comparisons is the published test results generated in Washington State between June 2014 and May 2017 (5). This set includes 140,000 entries for flower product. Included with each entry is a unique identification number, strain name, the name of the grower and the testing laboratory, the test date, and reported values for total THC and total cannabidiol (CBD) concentrations. While the data set does not identify entries as being samples of a single lot, there are numerous cases in which entries with the same strain name, or with an identifier variant (for example, BigStrain A1 and BigStrain A2), from the same grower were tested by the same laboratory on the same day, time stamped within an hour of each other (6). A quick assessment of 12 sets of entries, which had at least five entries per strain name, revealed a median high–low variance of 28%.
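One way such sets of entries could be pulled from a table of test records is sketched below. The dictionary keys, the function name, and the example entries are all hypothetical and do not reflect the actual column names of the Washington data; the sketch also matches on exact strain name only, so it does not handle identifier variants such as BigStrain A1 and BigStrain A2.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def candidate_lots(entries, window=timedelta(hours=1), min_size=5):
    """Group entries by grower, lab, strain, and test day; keep groups of at
    least min_size entries whose time stamps all fall within the window."""
    groups = defaultdict(list)
    for e in entries:
        key = (e["grower"], e["lab"], e["strain"], e["tested_at"].date())
        groups[key].append(e)
    lots = []
    for group in groups.values():
        times = [e["tested_at"] for e in group]
        if len(group) >= min_size and max(times) - min(times) <= window:
            lots.append([e["total_thc"] for e in group])
    return lots


# Hypothetical entries: five tests of one strain, ten minutes apart.
start = datetime(2016, 1, 5, 10, 0)
entries = [
    {"grower": "G1", "lab": "L1", "strain": "BigStrain",
     "tested_at": start + timedelta(minutes=10 * i), "total_thc": 15.0 + i}
    for i in range(5)
]
for totals in candidate_lots(entries):
    print(totals, f"high-low variance = {max(totals) / min(totals) - 1:.1%}")
```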

Another source of data pointing to high sample-to-sample variance comes from the test of the consistency of different laboratories run by Gieringer and Hazekamp (2). Though they used blended composite samples, they reported that “lab results were consistent to within 20% of each other. To some degree, the differences in results might be explained by natural variations in the consistency of the cannabis samples used; to some degree, by differences in lab procedures.” The current study removes the variable of laboratory performance by drawing all the data from a single laboratory, leaving the observed variance to be attributed only to the natural variance inherent in the product.

Regulatory Implications

The degree of variation seen in this collection of strain results from New Jersey’s medical program might be considered a best, or lowest, case for natural variation between samples. The product was all grown in highly controlled, indoor conditions by tightly regulated growers; the testing was all done by a single, highly professional laboratory with consistent, simultaneous preparation of each sample before testing. Within the strains that the laboratory tested on the same day can be found the full range of inter-sample variance.

This information might be regarded as “well-known” by growers and laboratories, but it has not been well documented, and it seems not to have influenced the way that legal cannabis is labeled for sale. It is not uncommon for package labels to be very specific about THC levels, to a degree of specificity (19.62%) that cannot be supported based on the natural variance that can be expected from most strains. In contrast, in the Netherlands, products regulated by the Office of Medicinal Cannabis are labeled with an “approximate” THC level, with an allowed test variance of 20% from the label. That Dutch maximum allowed variance is essentially the same as the median of high–low variance seen in the New Jersey data set.

An alternative labeling scheme that would provide a more accurate picture for consumers would be to label the product in a way that informs them of the degree of variance that was found at testing. It would require no more space on a package label to report the THC level as “15-17%” than to provide false confidence with a label that reads “16.53%.” A similar suggestion was made to authorities in the state of Washington by Sexton and Ziskind (1).
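A range label of this kind could be generated directly from the sample results. The sketch below is one possible approach, not a proposed standard: the function name is hypothetical, and rounding the lowest and highest sample values to whole percentages is an assumption (a regulator might instead prefer rounding outward, or a range around the mean).

```python
def range_label(sample_totals):
    """Format the lowest and highest sample values as a whole-percent range."""
    low = round(min(sample_totals))
    high = round(max(sample_totals))
    return f"{low}%" if low == high else f"{low}-{high}%"


# Hypothetical sample totals for illustration only.
print(range_label([15.2, 15.9, 16.5, 16.7, 16.9]))
```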

Conclusions

Analysis of medical cannabis test results shows that there is a considerable range of potencies between different samples of the same production lot. Consumers might benefit from having information about how much variation they might expect in different portions of a product that is labeled with a single but misleadingly precise number.

Acknowledgement

I am grateful to Dr. Cheryl Fitzer-Attas for useful discussions of this topic.

References

(1) M. Sexton and J. Ziskind, lcb.wa.gov/marijuana/botec_reports (2013).
(2) D. Gieringer and A. Hazekamp, O’Shaughnessy’s, Autumn (2011).
(3) https://njmmp.nj.gov/njmmp/jsp/marijuanaStrainDocsForptlogin.jsp.
(4) T.A. Coogan, Journal of Cannabis Research 1(11), doi.org/10.1186/s42238-019-0011-z (2019).
(5) N. Jikomes and M. Zoorob, Scientific Reports 8, 4519 (2018). https://doi.org/10.7910/DVN/E8TQSD.
(6) T.A. Coogan, Cannabis Science and Technology 3(2), 32–39 (2020).