Strikingly similar results have been reported from a wide range of studies on the ratio of tetrahydrocannabinol (THC) to cannabidiol (CBD) concentrations in strains of cannabis. Whether the source has been legalized markets in the west, medical markets in the U.S. and Canada, or collections from law enforcement and researchers, three easily distinguishable types of plant have consistently been found: THC-dominant strains (with less than 1% CBD); CBD-dominant strains (less than 1% THC); and balanced strains with comparable concentrations of both substances. Another consistent finding of these studies, carried out in a variety of laboratory settings, is a positive correlation between THC and CBD levels in those plants that can make substantial quantities (>1%) of each. The correlation between THC and CBD quantities in these varied populations suggests that there is a fundamental property of the plant that makes some combinations impossible, for instance, >15% THC and also >5% CBD. Such results have never shown up in published data sets of carefully, consistently tested samples, but those were all relatively small collections. A much larger data set has been released by the state of Washington (140,000 flower samples), and this has been scrutinized for evidence of consistently propagated strains with higher than a 2-to-1 ratio.
The conspicuousness of the points between clusters can be misleading in a plot with so many points. Just as a single star is more noticeable in a dark part of a sky than if it were in the midst of a dense galaxy, so the isolation of the points between clusters can appear more prominent when there are so many data points and the clusters are especially dense. In addition, the deletion of the entries with CBD >1% has substantially thinned the THC-dominant cluster at the bottom of the chart.
To isolate those in-between data points, a new measure was calculated: the percentage of the sum of THC and CBD that is from CBD. A plot of this percentage shows a very interesting pattern (Figure 7). The bar chart of 6818 items is so dense that it appears to be a curve, and its shape provides a new way to identify clusters: beginning at lower left, a long flat area with low CBD percentage represents the THC-dominant strains; a steep upward slope leads to a rising plateau, which represents the balanced strains; and a smaller plateau with the highest CBD percentage represents the CBD-dominant strains.
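The measure described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: the function name `cbd_share` and the sample values are hypothetical, and the bars are assumed to be ordered by ascending value, as the description of the chart suggests.

```python
def cbd_share(thc: float, cbd: float) -> float:
    """Percent of the combined THC + CBD that is contributed by CBD."""
    total = thc + cbd
    if total <= 0:
        raise ValueError("THC + CBD must be positive")
    return 100.0 * cbd / total


# Hypothetical (THC %, CBD %) pairs, one per cluster; NOT taken from the data set.
samples = [(20.0, 0.1), (6.0, 9.0), (0.5, 14.0)]

# Sorting by CBD share gives the left-to-right bar order assumed for Figure 7.
shares = sorted(cbd_share(thc, cbd) for thc, cbd in samples)
```

Plotting `shares` as bars for thousands of entries would be expected to give a curve-like outline of the kind described, with plateaus where many entries share a similar CBD percentage.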
The in-between strains lie on the slope, at values between 11% and 45% for CBD as a percentage of the combined cannabinoids. This subgroup was further reduced by calculating the ratio of THC to CBD and selecting those entries with ratios between 2 and 8; too many of the entries with a ratio of 1–2 were of very low CBD concentration and are not in the set we are seeking. A scatter plot of just the resulting 465 “slope” entries (Figure 8) shows that this data reduction has captured the very points we wish to investigate: do these represent plants that are exceptions to the defined clusters, or do they simply represent testing noise?
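The two-step selection can be written as a single predicate. The sketch below is illustrative only: the function name `is_slope_entry` is hypothetical, and treating both windows as inclusive at their endpoints is an assumption, since the text does not say how boundary values were handled.

```python
def is_slope_entry(thc: float, cbd: float) -> bool:
    """Return True if an entry passes both 'slope' filters described in the text.

    Assumption: both windows are inclusive at their endpoints.
    """
    if cbd <= 0 or thc < 0:
        return False
    share = 100.0 * cbd / (thc + cbd)   # CBD as percent of THC + CBD
    ratio = thc / cbd                   # THC-to-CBD ratio
    return 11.0 <= share <= 45.0 and 2.0 <= ratio <= 8.0


# Hypothetical entries (THC %, CBD %), for illustration only.
examples = [(12.0, 3.0), (6.0, 4.0), (20.0, 0.5)]
selected = [pair for pair in examples if is_slope_entry(*pair)]
```

Because the CBD share equals 100 / (1 + ratio), a ratio window of 2 to 8 corresponds to CBD shares of roughly 11.1% to 33.3%. Under these assumptions the ratio filter is the tighter of the two, which is consistent with its use to trim the subgroup further.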
With the data set reduced to this manageable size by selecting the “slope” entries, a closer scrutiny of each entry is feasible. For many entries, this examination revealed duplicate samples: unique identification numbers for the same strain name from the same grower tested on the same day by the same laboratory. Duplicate samples with equivalent cannabinoid profile do not add to our understanding of the validity of the slope entries, so duplicates were removed.
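A duplicate check of this kind might look like the following sketch. The `Entry` fields, the `drop_duplicates` name, and the tolerance `tol` used to decide that two cannabinoid profiles are "equivalent" are all assumptions made for illustration; the article does not state the exact criterion that was applied.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    sample_id: str   # unique identification number
    strain: str
    grower: str
    lab: str
    test_date: str
    thc: float       # percent
    cbd: float       # percent


def drop_duplicates(entries: List[Entry], tol: float = 0.05) -> List[Entry]:
    """Keep the first entry of each strain/grower/lab/date group whose
    THC and CBD values agree within `tol` (a hypothetical tolerance)."""
    kept: List[Entry] = []
    for e in entries:
        is_dup = any(
            (k.strain, k.grower, k.lab, k.test_date)
            == (e.strain, e.grower, e.lab, e.test_date)
            and abs(k.thc - e.thc) <= tol
            and abs(k.cbd - e.cbd) <= tol
            for k in kept
        )
        if not is_dup:
            kept.append(e)
    return kept


# Hypothetical records, for illustration only.
entries = [
    Entry("001", "Strain A", "Grower 1", "Lab X", "day 1", 10.00, 3.00),
    Entry("002", "Strain A", "Grower 1", "Lab X", "day 1", 10.02, 3.01),
    Entry("003", "Strain A", "Grower 1", "Lab Y", "day 1", 10.00, 3.00),
]
unique_entries = drop_duplicates(entries)
```

The pairwise comparison is quadratic in the number of entries, which is acceptable for a subset of a few hundred rows but would need a different approach for the full data set.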
Removing samples with identical profiles, though, raises another topic that, along with variation between laboratories, was a focus of the Jikomes and Zoorob study: strain consistency. Close examination of the data set after removal of duplicate samples turned up a number of anomalous results. For 19 sets of entries, covering 49 entries in total, different tests of the same strain from the same grower, conducted on the same day by the same laboratory (usually recorded within an hour of each other), had such discrepancies in the reported values as to call them into question. An extreme example is shown in Figure 9a: five samples of the same strain tested on the same day show THC values from 3.9% to 11.7% and CBD values from 2.5% to 8.5%. There is no way now to go back and determine which of these is the most accurate number. These 19 sets of entries also raise questions about other results that are outliers but for which there is no comparator value that would allow us to identify an anomalous result. This sort of data adds to the concern that testing noise contributes to the subset of results that fall between clusters.
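One way to surface such sets automatically is to group entries by strain, grower, laboratory, and test date, then flag groups whose values spread too widely. The sketch below is an assumption-laden illustration: the function name, the tuple layout, the placeholder labels, and the `max_spread` threshold of 2 percentage points are hypothetical and are not taken from the article. The first two rows reuse the extremes reported for Figure 9a, but pairing the lowest THC with the lowest CBD is assumed purely for the example.

```python
from collections import defaultdict


def flag_discrepant_groups(rows, max_spread=2.0):
    """Map each same-strain/grower/lab/date group to its (THC, CBD) spread
    when either spread exceeds `max_spread` (a hypothetical threshold)."""
    groups = defaultdict(list)
    for strain, grower, lab, test_date, thc, cbd in rows:
        groups[(strain, grower, lab, test_date)].append((thc, cbd))

    flagged = {}
    for key, values in groups.items():
        if len(values) < 2:
            continue  # no comparator value, so nothing can be flagged
        thc_values = [thc for thc, _ in values]
        cbd_values = [cbd for _, cbd in values]
        thc_spread = round(max(thc_values) - min(thc_values), 2)
        cbd_spread = round(max(cbd_values) - min(cbd_values), 2)
        if thc_spread > max_spread or cbd_spread > max_spread:
            flagged[key] = (thc_spread, cbd_spread)
    return flagged


rows = [
    # Extremes reported for Figure 9a; the pairing of values is assumed.
    ("Strain A", "Grower 1", "Lab X", "day 1", 3.9, 2.5),
    ("Strain A", "Grower 1", "Lab X", "day 1", 11.7, 8.5),
    # Hypothetical well-behaved pair for contrast.
    ("Strain B", "Grower 1", "Lab X", "day 1", 18.0, 0.2),
    ("Strain B", "Grower 1", "Lab X", "day 1", 18.4, 0.2),
]
flagged = flag_discrepant_groups(rows)
```

A check like this can only flag entries that have a same-day comparator; a lone outlier passes through unexamined, which is the limitation the paragraph above points out.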
A different sort of anomalous result is shown in Figure 9b: rather than individual results showing too much discrepancy, these results show more uniformity than should be expected. In this case, 10 different strains from one grower were tested on the same day by one laboratory, and the results are curiously consistent. While the THC values range from 5.7% to 11.8%, the CBD values are in a very tight range, from 1.1% to 1.2%, with only one as high as 1.5%. Such a result is not impossible, of course, and might be plausible if each of these strains were proprietary to the grower (leaving aside the business logic of growing 10 strains with the same profile). A check on the strains in question, though, shows several strain names that are very common in this data set (Dutch Treat and Snoop’s Dream). The values in this suspect set are unlike the results of hundreds of other entries that have the same strain name but are from other growers. This is another set of points between the clusters that we would conclude is not reliable evidence for plants with unique properties.
A different type of anomalous result, the flip side to the curious consistency in the CBD values of entries with different names, is illustrated in Figure 9c: a set of 12 tests of a single strain name, done over 3 months for one grower, with striking consistency in the THC values (10 results between 15.1 and 15.9), but with CBD values that increase from 1.08 to 4.2. The increase in the reported CBD values occurs sequentially over the 3 months of testing with only minor deviation. For one strain from one grower to shift, steadily and smoothly, from a THC–CBD ratio of 14.7 to 3.7 in 3 months, without a change in the THC value, is a remarkable result.
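The ratios quoted above can be checked with simple arithmetic. The sketch below uses only the figures given in the text (THC between 15.1 and 15.9, CBD rising from 1.08 to 4.2); the variable and function names are hypothetical, and because only 10 of the 12 THC results are stated to fall in that range, the bounds are an approximation.

```python
# Values quoted in the text for Figure 9c.
thc_low, thc_high = 15.1, 15.9
cbd_first, cbd_last = 1.08, 4.2


def ratio_range(cbd: float) -> tuple:
    """THC-to-CBD ratio at the low and high end of the quoted THC range."""
    return (thc_low / cbd, thc_high / cbd)


start = ratio_range(cbd_first)  # about 14.0 to 14.7
end = ratio_range(cbd_last)     # about 3.6 to 3.8
```

Both reported ratios, 14.7 and 3.7, fall inside these ranges, so the roughly fourfold drop in the ratio is accounted for by the change in reported CBD alone.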
Putting aside those results where there is a clear question about reliability, we are left with a set of 354 “slope” entries that fall between the clusters, and we can now investigate whether they contain consistent results or are better regarded as noise. The next step in looking for a reliable finding was to investigate strain names that are over-represented in this set. Several of the most common strain names in the full data set are also the most common in this “slope” subset, for example Harlequin and Cannatonic. As pointed out by Jikomes and Zoorob in their study, there is considerable evidence that common strain names have been applied to plants with very dissimilar characteristics. This is evident in a scatter plot of cannabinoid concentrations for all of the 287 entries (in the set of 6818) with the strain name “Harlequin” (Figure 9d). The majority of entries are in the balanced cluster at the center of the plot, but there is wide divergence, making this plot appear as a microcosm of the population as a whole (compare Figure 6).
- A. Schwabe and M. McGlaughlin, J. Cannabis Res. 1, 3 https://doi.org/10.1186/s42238-019-0001-1 (2019).
- J. Sawler, J. Stout, K. Gardner, D. Hudson, J. Vidmar, L. Butler, J. Page, and S. Myles, PLoS One 10(8), e0133292 https://doi.org/10.1371/journal.pone.0133292 (2015).
- E.M. Mudge, S.J. Murch, and P.N. Brown, Sci. Rep. 8, 13090 https://doi.org/10.1038/s41598-018-31120-2 (2018).
- U. Reimann-Philipp, M. Speck, C. Orser, S. Johnson, A. Hilyard, H. Turner, A. Stokes, and A. Small-Howard, Cannabis and Cannabinoid Research https://doi.org/10.1089/can.2018.0063 (2019).
- E. de Meijer, M. Bagatta, A. Carboni, P. Crucitti, V. Moliterni, P. Ranalli, and G. Mandolin, Genetics 163, 335–346 (2003).
- K. Hillig and P. Mahlberg, Amer. J. Bot. 91, 966–75 (2004).
- T. Coogan, J. Cannabis Res. 1, 11 https://doi.org/10.1186/s42238-019-0011-z (2019).
- N. Jikomes and M. Zoorob, Sci. Rep. 8, 4519 https://doi.org/10.1038/s41598-018-22755-2 (2018).
About the Author
Thomas A. Coogan, PhD, is an Academic and Research Liaison with the New Jersey Cannabis Industry Association. Direct correspondence to: [email protected]
How to Cite this Article
T.A. Coogan, Cannabis Science and Technology 3(2), 32–39 (2020).