Evaluation of genetic markers for the analysis of THC levels of Cannabis sativa samples using principal component analysis – A preliminary study

Published: October 06, 2022. DOI: https://doi.org/10.1016/j.fsigss.2022.10.004

      Abstract

      Cannabis sativa is a worldwide commercial plant used for medicinal purposes, food and fiber production, and also as a recreational drug. Therefore, the identification and differentiation between legal and illegal C. sativa is of great importance for forensic investigations. In this study, principal component analysis (PCA), an exploratory data analysis technique, was tested to correlate the specific genotype with the concentration of tetrahydrocannabinol (THC) in the samples. C. sativa samples were obtained from legal growers in Piedmont, Italy, and from illegal drug seizures in the Turin region. DNA was extracted, quantified, amplified with a 13-loci multiplex STR and finally analyzed with an automated sequencer. The results showed a trend in the analyzed samples as they differed by their THC content and allele profiles. PCA yielded two clusters of samples that differed by specific allele profiles and THC concentrations. Further validation studies are needed, but this study could provide a new approach to forensic investigation and be a valuable aid to law enforcement in significant marijuana seizures or in tracing illicit drug trafficking routes.

      Abbreviations:

      THC (Tetrahydrocannabinol), STRs (Short tandem repeats), PCA (Principal Component Analysis)

      1. Introduction

      Cannabis sativa is used worldwide as a multipurpose plant, serving as a source of fiber, food, and oil, and is also commonly used as a medicine and recreational drug due to its Δ9-tetrahydrocannabinol (Δ9-THC) content [
      • Small E.
      • Cronquist A.
      A practical and natural taxonomy for Cannabis.
      ,
      • Bonini S.A.
      • Premoli M.
      • et al.
      Cannabis sativa: a comprehensive ethnopharmacological review of a medicinal plant with a long history.
      ]. Despite the wide range of legal uses for cannabis, its cultivation, possession, and sale are still prohibited by law in several countries. In Italy, the legal limit for Δ9-THC content is 0.2 %, based on D.P.R. 309/1990 Testo Unico on psychoactive drugs. Recently, a tolerance threshold of 0.6 % was introduced for industrial purposes (Legge n. 242 del 2 dicembre 2016) [

      Legge 2 dicembre 2016, n. 242. Disposizioni per la promozione della coltivazione e della filiera agroindustriale della canapa. 〈https://www.gazzettaufficiale.it/atto/stampa/serie_generale/originario〉 (accessed April 30, 2020).

      ], to account for the variability that can occur during cultivation, so that plants with a Δ9-THC content within 0.6 % are not treated as illegal (and are therefore not seized and destroyed). However, interest in the 0.6 % threshold also stems from the fact that products with a Δ9-THC content greater than 0.2 % are sold illegally and turn up in seizures, where investigative agencies aim to assess the effective psychoactive potency of the plants; such potency is commonly attributed to products with a Δ9-THC content greater than 0.5–0.6 %. In this study, a multidisciplinary procedure based on principal component analysis (PCA) was applied to C. sativa samples to correlate STR profiles with Δ9-THC concentration, using 0.6 % Δ9-THC as the threshold.

      2. Materials and methods

      C. sativa samples were obtained from legal growers in Piedmont (Italy) and from illegal drug seizures in the Turin region and analyzed at the Centro Regionale Antidoping e di Tossicologia "A. Bertinaria" in Orbassano (Turin, Italy). Gas chromatography coupled with mass spectrometry (GC-MS) [
      • Pourseyed Lazarjani M.
      • Torres S.
      • et al.
      Methods for quantification of cannabinoids: a narrative review.
      ] was used to quantify the Δ9-THC content of the samples, while a 13-loci STR multiplex system [
      • Alghanim H.J.
      • Almirall J.R.
      Development of microsatellite markers in Cannabis sativa for DNA typing and genetic relatedness analyses.
      ,
      • Houston R.
      • Birck M.
      • et al.
      Evaluation of a 13-loci STR multiplex system for Cannabis sativa genetic identification.
      ,
      • Houston R.
      • Birck M.
      • et al.
      Developmental and internal validation of a novel 13 loci STR multiplex method for Cannabis sativa DNA profiling.
      ,
      • Di Nunzio M.
      • Agostini V.
      • et al.
      A Ge.F.I. – ISFG European collaborative study on DNA identification of Cannabis sativa samples using a 13-locus multiplex STR method.
      ] was tested to obtain their STR profiles. Both methods were fully optimized and validated. Subsequently, a sparse version of PCA [
      • Bro R.
      • Smilde A.K.
      Principal component analysis.
      ] was applied to explore the collected STR data and to evaluate the behavior of the samples in terms of their collection category and Δ9-THC content. C. sativa samples were classified into two categories (i.e., 'lower' and 'higher') based on their Δ9-THC content, using a threshold of 0.6 % w/w. PCA was computed in the R environment (version 4.1.3) [

      R Core Team, R: A language and environment for statistical computing, (2014). 〈http://www.r-project.org/〉.

      ]. For this purpose, the following R packages were used: dplyr [

      H. Wickham, R. François, L. Henry, K. Müller, dplyr: A Grammar of Data Manipulation, (2020). 〈https://cran.r-project.org/package=dplyr〉.

      ], mixOmics [
      • Rohart F.
      • Gautier B.
      • Singh A.
      • Lê Cao K.A.
      mixOmics: an R package for ‘omics feature selection and multiple data integration.
      ] and plotly [

      C. Sievert, plotly for R, (2018). 〈https://plotly-r.com〉.

      ].
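
The study does not state how the STR genotypes were encoded before PCA. As a minimal sketch in Python (not the R/mixOmics workflow used in the study), and assuming two alleles per locus, one option is to count the copies of each observed allele per sample and to label samples with the 0.6 % w/w threshold. The locus names, sample values, and function names below are hypothetical, and the treatment of a sample at exactly 0.6 % is an assumption.

```python
THC_THRESHOLD = 0.6  # % w/w, the threshold named in the study

# Hypothetical toy data: sample -> (THC % w/w, {locus: (allele, allele)}).
# Locus names and allele sizes are placeholders, not real markers or results.
SAMPLES = {
    "S1": (0.15, {"locusA": (12, 14), "locusB": (7, 7)}),
    "S2": (0.45, {"locusA": (12, 12), "locusB": (7, 9)}),
    "S3": (8.20, {"locusA": (15, 16), "locusB": (9, 10)}),
}


def thc_class(thc_percent, threshold=THC_THRESHOLD):
    """Label a sample 'higher' above the threshold, otherwise 'lower'.

    Assumption: a value exactly at the threshold is labeled 'lower'.
    """
    return "higher" if thc_percent > threshold else "lower"


def allele_columns(samples):
    """Return the sorted list of every (locus, allele) pair seen in the data."""
    seen = set()
    for _thc, profile in samples.values():
        for locus, alleles in profile.items():
            for allele in alleles:
                seen.add((locus, allele))
    return sorted(seen)


def encode(samples):
    """Return (columns, rows); each cell counts copies of an allele (0, 1 or 2)."""
    columns = allele_columns(samples)
    rows = {}
    for name, (_thc, profile) in samples.items():
        rows[name] = [profile.get(locus, ()).count(allele)
                      for locus, allele in columns]
    return columns, rows


columns, rows = encode(SAMPLES)
labels = {name: thc_class(thc) for name, (thc, _profile) in SAMPLES.items()}
```

The resulting numeric matrix (one row per sample, one column per allele) is the kind of input a PCA routine expects, and `labels` can be used to color the score plot.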

      3. Results and discussion

      Sparse PCA was used to project the collected data and to assess the presence of subgroups or clusters within the observations; the samples separated into two main groups, legal and illegal. Along the PC1 axis (x-axis), most illegal samples are located on the left side of the graph, with the most negative PC1 values, while legal samples have higher PC1 values on average (Fig. 1A). The relationship between PCA scores and Δ9-THC % values is shown in Fig. 1B. Again, a trend along PC1 can be observed, with most of the legal samples on the right side of the graph, while the illegal samples have, on average, the lowest PC1 values.
      Fig. 1. PCA score plots with samples colored according to Δ9-THC % values determined by quantitative GC-MS analysis. In Fig. 1A, samples with Δ9-THC % levels greater than 0.6 % w/w are labeled as "higher" (red circles), while samples with Δ9-THC % levels less than 0.6 % w/w are labeled as "lower" (blue circles). In Fig. 1B, a color scale is used to represent the C. sativa samples according to their measured Δ9-THC % content.
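
The idea of a PC1 axis that separates the two groups can be illustrated on toy data. The following Python sketch is a stand-in for the study's R/mixOmics analysis, not a reproduction of it: it computes a first principal component by power iteration and makes the loadings sparse by soft-thresholding, which is one common formulation of sparse PCA and not necessarily the algorithm mixOmics applies. The matrix, labels, penalty `lam`, and function names are all hypothetical.

```python
import math


def center(X):
    """Subtract each column mean so the component describes variation only."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def normalize(v):
    """Scale v to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def soft_threshold(v, lam):
    """Shrink each entry toward zero by lam; entries below lam become zero."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in v]


def sparse_pc1(X, lam=0.0, n_iter=100):
    """Return (loadings, scores) for the first component.

    lam = 0 gives ordinary PCA; a larger lam zeroes out weak loadings.
    If lam is too large, every loading becomes zero.
    """
    Xc = center(X)
    # Start from the longest centered row, which lies in the row space of Xc.
    v = normalize(max(Xc, key=lambda r: sum(x * x for x in r)))
    for _ in range(n_iter):
        u = normalize([sum(x * w for x, w in zip(row, v)) for row in Xc])
        w = [sum(row[j] * ui for row, ui in zip(Xc, u)) for j in range(len(v))]
        v = normalize(soft_threshold(w, lam))
    scores = [sum(x * w for x, w in zip(row, v)) for row in Xc]
    return v, scores


# Hypothetical allele-count matrix: rows are samples, columns are alleles.
X = [[2, 0, 1], [2, 0, 1], [2, 0, 0],
     [0, 2, 1], [0, 2, 0], [0, 2, 0]]
labels = ["lower", "lower", "lower", "higher", "higher", "higher"]

dense_loadings, _ = sparse_pc1(X, lam=0.0)
sparse_loadings, scores = sparse_pc1(X, lam=1.0)
for label, score in zip(labels, scores):
    print(label, round(score, 3))
```

In this toy matrix the first two columns carry the group difference and the third is only weakly related to it, so the penalty removes the third loading while the PC1 scores still separate the two labels. The sign of a principal component is arbitrary, so only the separation matters, not which group falls on the negative side.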

      4. Conclusions

      The objective of this proof-of-concept study was to develop a multidisciplinary method to analyze the correlation between STR profiles and Δ9-THC concentration in C. sativa samples. Genetic profiles and measured Δ9-THC concentrations were combined and analyzed with a sparse PCA approach. Trends were observed that separate legal from illegal samples according to the 0.6 % w/w threshold. In the future, additional machine learning approaches and more C. sativa samples will be tested to build more robust models. This approach could be a useful tool to help police trace the trafficking routes of cannabis samples and associate a cannabis plant with a specific geographic area.

      Conflict of interest statement

      None.

      References

        • Small E.
        • Cronquist A.
        A practical and natural taxonomy for Cannabis.
        Taxon. 1976; 25: 405-435. https://doi.org/10.2307/1220524
        • Bonini S.A.
        • Premoli M.
        • et al.
        Cannabis sativa: a comprehensive ethnopharmacological review of a medicinal plant with a long history.
        J. Ethnopharmacol. 2018; 227: 300-315. https://doi.org/10.1016/j.jep.2018.09.004
        Legge 2 dicembre 2016, n. 242. Disposizioni per la promozione della coltivazione e della filiera agroindustriale della canapa. 〈https://www.gazzettaufficiale.it/atto/stampa/serie_generale/originario〉 (accessed April 30, 2020).

        • Pourseyed Lazarjani M.
        • Torres S.
        • et al.
        Methods for quantification of cannabinoids: a narrative review.
        J. Cannabis Res. 2020; 2: 35. https://doi.org/10.1186/s42238-020-00040-2
        • Alghanim H.J.
        • Almirall J.R.
        Development of microsatellite markers in Cannabis sativa for DNA typing and genetic relatedness analyses.
        Anal. Bioanal. Chem. 2003; 376: 1225-1233. https://doi.org/10.1007/s00216-003-1984-0
        • Houston R.
        • Birck M.
        • et al.
        Evaluation of a 13-loci STR multiplex system for Cannabis sativa genetic identification.
        Int. J. Leg. Med. 2016; 130: 635-647. https://doi.org/10.1007/s00414-015-1296-x
        • Houston R.
        • Birck M.
        • et al.
        Developmental and internal validation of a novel 13 loci STR multiplex method for Cannabis sativa DNA profiling.
        Leg. Med. 2017; 26: 33-40. https://doi.org/10.1016/j.legalmed.2017.03.001
        • Di Nunzio M.
        • Agostini V.
        • et al.
        A Ge.F.I. – ISFG European collaborative study on DNA identification of Cannabis sativa samples using a 13-locus multiplex STR method.
        Forensic Sci. Int. 2021; 329: 111053. https://doi.org/10.1016/j.forsciint.2021.111053
        • Bro R.
        • Smilde A.K.
        Principal component analysis.
        Anal. Methods. 2014; 6: 2812-2831. https://doi.org/10.1039/c3ay41907j
        R Core Team, R: A language and environment for statistical computing, (2014). 〈http://www.r-project.org/〉.

        H. Wickham, R. François, L. Henry, K. Müller, dplyr: A Grammar of Data Manipulation, (2020). 〈https://cran.r-project.org/package=dplyr〉.

        • Rohart F.
        • Gautier B.
        • Singh A.
        • Lê Cao K.A.
        mixOmics: an R package for ‘omics feature selection and multiple data integration.
        PLoS Comput. Biol. 2017; 13: e1005752. https://doi.org/10.1371/journal.pcbi.1005752
        C. Sievert, plotly for R, (2018). 〈https://plotly-r.com〉.