Statistical analysis tools of mixture DNA samples: When the same software provides different results

  • Camila Costa
    Correspondence
    Correspondence to: i3S, Rua Alfredo Allen, 208, 4200-135 Porto, Portugal.
    Affiliations
    FCUP – Faculdade de Ciências da Universidade do Porto, Portugal

    i3S – Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Portugal
  • Carolina Figueiredo
    Affiliations
    FCUP – Faculdade de Ciências da Universidade do Porto, Portugal

    i3S – Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Portugal
  • António Amorim
    Affiliations
    FCUP – Faculdade de Ciências da Universidade do Porto, Portugal

    i3S – Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Portugal

    IPATIMUP – Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Portugal
  • Lourdes Prieto
    Affiliations
    Grupo de Medicina Xenómica, Instituto de Ciencias Forenses, Universidad de Santiago de Compostela, Santiago de Compostela, Spain

    Comisaría General de Policía Científica, Laboratorio ADN, Madrid, Spain
  • Sandra Costa
    Affiliations
    LPC-PJ – Biologia, Laboratório de Polícia Científica da Polícia Judiciária, Lisboa, Portugal
  • Paulo Miguel Ferreira
    Affiliations
    LPC-PJ – Biologia, Laboratório de Polícia Científica da Polícia Judiciária, Lisboa, Portugal
  • Nádia Pinto
    Affiliations
    i3S – Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Portugal

    IPATIMUP – Instituto de Patologia e Imunologia Molecular da Universidade do Porto, Portugal

    CMUP – Centro de Matemática da Universidade do Porto, Portugal
Published: September 24, 2022
DOI: https://doi.org/10.1016/j.fsigss.2022.09.014

      Abstract

      The high complexity of the genetic analysis of crime scene samples is mainly related to the unknown number of contributors, low DNA quantity and quality, and the associated stochastic effects. The difficulty and subjectivity of interpreting casework samples motivated the development of software to mitigate these conditions and allow the quantification of the genetic evidence. Currently, there are several tools for the statistical analysis of mixture samples, based on either qualitative or quantitative models. The former consider only the electropherograms’ qualitative information, while the latter also consider the associated quantitative information. The main goal of this work was to evaluate the effect that variation in parameter settings, specifically the drop-in frequency, may have on the likelihood ratio (LR) computation. To that end, one qualitative tool (LRmix Studio) and two quantitative tools (STRmix™ and EuroForMix) were considered, and an intra-software analysis was performed using real casework samples as input. Varying the drop-in frequency had an impact, leading to differences greater than four units (log10 scale) for some pairs of samples. In addition, in some cases no comparison could be performed, either because the tool computed a null LR value or because it displayed an error message. Thus, this work reinforces the importance of proper parameter modeling and estimation in forensic casework evaluation.

      1. Introduction

      DNA samples recovered from crime scenes often have characteristics that make them difficult and complex for the expert to interpret and analyze. DNA mixtures, i.e., samples containing contributions from more than one donor (exact number unknown), are typically encountered in this context. Beyond the unknown number of contributors (NOC), the high complexity also results from allele sharing between contributors, contributions in different proportions, stutters that may be confounded with alleles of a minor contributor (or vice versa), stochastic amplification effects (heterozygote imbalance, drop-in, and drop-out) due to low-template DNA, or degradation [Walsh et al., 1996; Gill, 2001; Schneider et al., 2004; Gill et al., 2006; Fondevila et al., 2008; Balding and Buckleton, 2009; Gibb et al., 2009; Westen et al., 2009; Freire-Aradas et al., 2012; Gittelson et al., 2014; Steele et al., 2016; Dash et al., 2020].
      To deal with the abovementioned conditions and quantify the genetic evidence, several probabilistic genotyping software tools were developed. These tools are based on either qualitative or quantitative models. The former consider only the qualitative information of the electropherogram (observed alleles), while the latter also consider the quantitative information (height of the detected peaks) [Alladio et al., 2018; Coble and Bright, 2019]. Both models quantify the genetic evidence through the computation of a Likelihood Ratio (LR) [Gill et al., 2006], which compares the probabilities of observing the genetic evidence under two alternative and mutually exclusive hypotheses. This calculation depends on several parameters, related to population and analytical factors, that are introduced by the user, such as NOC, drop-in, co-ancestry coefficient, and detection threshold.
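      In symbols, writing E for the genetic evidence and H1 and H2 for the two competing hypotheses (notation chosen here for illustration, not taken from the cited recommendations), the quantity being computed can be sketched as:

```latex
\mathrm{LR} = \frac{\Pr(E \mid H_1)}{\Pr(E \mid H_2)},
\qquad
\log_{10}\mathrm{LR} = \log_{10}\Pr(E \mid H_1) - \log_{10}\Pr(E \mid H_2)
```

      On the log10 scale, a difference of one unit between two LR values corresponds to a tenfold change in the LR.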
      The main goal of this work is to evaluate the impact that the estimation of the drop-in frequency parameter may have on the quantification of the evidence, using casework samples. Both a qualitative tool, LRmix Studio v.2.1.3 [Haned et al., 2012], and quantitative tools, EuroForMix v.3.4.0 [Bleka et al., 2016] and STRmix™ v.2.7 [Taylor et al., 2013], were considered. A drop-in allele is a spurious allele in the problem sample profile, i.e., an allele that shows up in the profile but cannot be explained by any of the contributors [Balding and Buckleton, 2009]. This stochastic effect generates discordances between the mixture and the contributors’ profiles, which can result in misinterpretations and misleading conclusions. Understanding its impact is therefore extremely important [Gittelson et al., 2014; Steele et al., 2016].

      2. Material and methods

      Drawing on evidence from former cases of the Laboratório de Polícia Científica da Polícia Judiciária, a set of 156 irreversibly anonymized mixture/single-contributor sample pairs was selected. For each case, the single-contributor sample was either from a sampled individual or a single-source problem sample profile obtained in the same case as the analyzed mixture. It was assumed to belong to the person of interest (POI), although its donor was not necessarily known. The mixture samples considered were estimated to have either two or three contributors. Casework samples were chosen because of their unknown composition, their uniqueness, and the impossibility of predicting or replicating them, which make them much more complex than mock samples.
      For each pair selected, an LR value was computed under the alternative hypotheses “The POI is a contributor of the mixture” and “The POI is unrelated to any contributor of the mixture”. The LR values were computed using both qualitative (LRmix Studio v.2.1.3 [Haned et al., 2012]) and quantitative (EuroForMix v.3.4.0 [Bleka et al., 2016] and STRmix™ v.2.7 [Taylor et al., 2013]) tools. To understand the impact that the drop-in frequency may have on the LR computation, the frequency was varied to a lower (0.00) and a higher (0.10) value relative to the established default (0.05) [Haned et al., 2015]. Afterward, an intra-software analysis was carried out by comparing, for each pair of samples, the values computed under the default and varied conditions.
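      The intra-software comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the function names, the bin labels, and the conventions of using `None` for "no value returned" and `0.0` for a null LR are assumptions, and the LR values in the example are made up.

```python
import math
from typing import Optional


def log10_difference(lr_default: Optional[float],
                     lr_varied: Optional[float]) -> Optional[float]:
    """Return |log10(LR_default) - log10(LR_varied)|, or None if not comparable.

    None models a run in which the software returned no value, and a
    non-positive number models a null LR; both are conventions of this sketch.
    """
    for lr in (lr_default, lr_varied):
        if lr is None or lr <= 0:
            return None
    return abs(math.log10(lr_default) - math.log10(lr_varied))


def summarize(pairs):
    """Count comparisons by size of the log10 difference."""
    summary = {"within_1": 0, "between_1_and_4": 0,
               "above_4": 0, "not_compared": 0}
    for lr_default, lr_varied in pairs:
        diff = log10_difference(lr_default, lr_varied)
        if diff is None:
            summary["not_compared"] += 1
        elif diff <= 1:
            summary["within_1"] += 1
        elif diff > 4:
            summary["above_4"] += 1
        else:
            summary["between_1_and_4"] += 1
    return summary


# Made-up (default, varied) LR values, purely illustrative; not study data.
example_pairs = [(1e12, 3e12), (1e9, 1e3), (5e6, 0.0), (2e4, None)]
print(summarize(example_pairs))
```

      Working on the log10 scale keeps the comparison symmetric: an LR that is ten times larger or ten times smaller than the default both count as a one-unit difference.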
      The allele frequencies for the Caucasian population from the National Institute of Standards and Technology (NIST) database were used [Hill et al., 2013].

      3. Results and discussion

      In general, varying the drop-in frequency did not have a great impact on the LR values provided by the same software, as most of the computed differences were within one unit on a log10 scale. For the qualitative software (LRmix Studio), no calculated difference exceeded one log10 unit, reflecting its lower sensitivity to variation in this parameter compared with the quantitative tools.
      For both quantitative tools, differences greater than four log10 units were observed: in EuroForMix when the drop-in frequency was set to zero (one case involving a mixture with two estimated contributors and one involving a mixture with three), and in STRmix™ when the drop-in frequency was set either to zero or to 0.10 (two cases for each comparison, all involving mixtures with three estimated contributors).
      The impact of parameter variation also shows in the cases for which no comparison could be performed, because the software either computed a null LR value (STRmix™) or did not compute a value at all (LRmix Studio and EuroForMix). This was observed when a null drop-in frequency was considered. For LRmix Studio and EuroForMix, this occurred in 25% and 27% (respectively) of the cases involving mixtures with two estimated contributors, and in 4% and 5% (respectively) of the cases involving mixtures with three estimated contributors. In STRmix™, this occurred in only 1% of the cases, for mixtures with both two and three estimated contributors.
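      A toy single-locus calculation may help to see how a drop-in frequency of zero can leave a tool without a usable LR. The sketch below is a simplified qualitative likelihood written for illustration only. It is not the model implemented in LRmix Studio, EuroForMix, or STRmix™, and the function name, the drop-out probability, and the allele frequency are all assumptions. In it, an observed allele that no proposed contributor carries can only be explained as drop-in, so its probability becomes zero when the drop-in frequency is zero.

```python
def locus_likelihood(observed, contributor_alleles, allele_freqs,
                     drop_in, drop_out=0.1):
    """Toy likelihood of the observed alleles at one locus (illustrative only).

    Simplifications assumed here: one drop-out probability per distinct
    allele, no homozygote or allele-sharing corrections, no peak heights.
    """
    observed = set(observed)
    carried = set(contributor_alleles)
    likelihood = 1.0
    # Alleles carried by the proposed contributors either appear or drop out.
    for allele in carried:
        likelihood *= (1.0 - drop_out) if allele in observed else drop_out
    # Observed alleles that nobody carries can only be explained as drop-in.
    unexplained = observed - carried
    if unexplained:
        for allele in unexplained:
            likelihood *= drop_in * allele_freqs[allele]
    else:
        likelihood *= 1.0 - drop_in
    return likelihood


# Hypothetical locus: allele "17" is not carried by any proposed contributor.
observed = {"14", "15", "17"}
carried = {"14", "15"}
freqs = {"17": 0.2}  # made-up allele frequency

for c in (0.0, 0.05, 0.10):
    print(c, locus_likelihood(observed, carried, freqs, drop_in=c))
```

      In this toy setting the likelihood is exactly zero at a drop-in frequency of 0.00 and positive at 0.05 and 0.10. A zero in the numerator would give a null LR, and a zero in the denominator would leave the ratio undefined; whether this mechanism explains the cases reported above is not stated in the source.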

      4. Conclusion

      Before performing any computation, the expert needs to set values for some software parameters, such as NOC, FST, drop-in, or detection threshold. It is therefore crucial to understand the impact those parameters may have on the resulting LR. Focusing on the drop-in frequency parameter, LR values were computed under both the established default and varied values of the drop-in frequency for each analyzed pair of samples. The results obtained with LRmix Studio, EuroForMix, and STRmix™ were compared, and the impact within each software was evaluated.
      LRmix Studio proved less sensitive to variation in this parameter than the quantitative tools, whereas in both quantitative tools differences of more than four units (log10 scale) were observed. Furthermore, for all software, there were cases for which no comparison could be performed because a null LR value was computed or an error message was displayed.
      This work emphasizes the importance of experts understanding the models incorporated in the existing tools and of properly estimating their parameters, specifically in the case of drop-in.

      Conflict of interest statement

      The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

      Acknowledgements

      This project was supported by the Laboratório de Polícia Científica da Polícia Judiciária (LPC-PJ), Instituto de Patologia e Imunologia Molecular da Universidade do Porto (IPATIMUP) and Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Portugal (i3S). This work was partially financed by FEDER – Fundo Europeu de Desenvolvimento Regional funds through the COMPETE 2020 – Operational Program for Competitiveness and Internationalization (POCI), Portugal 2020, and by Portuguese funds through FCT – Fundação para a Ciência e a Tecnologia/Ministério da Ciência, Tecnologia e Inovação in the framework of the project “Institute for Research and Innovation in Health Sciences” (POCI-01-0145-FEDER-007274). NP is supported by FCT, under the program contract provided in Decree-Law no. 57/2016 of August 29. CC is funded by an FCT doctoral grant (2021.05655.BD).

      References

        • Walsh P.S.
        • Fildes N.J.
        • Reynolds R.
        Sequence analysis and characterization of stutter products at the tetranucleotide repeat locus vWA.
        Nucleic Acids Res. 1996; 24: 2807-2812
        • Gill P.
        Application of low copy number DNA profiling.
        Croat. Med. J. 2001; 42: 229-232
        • Schneider P.M.
        • Bender K.
        • Mayr W.R.
        • Parson W.
        • Hoste B.
        • Decorte R.
        • Cordonnier J.
        • Vanek D.
        • Morling N.
        • Karjalainen M.
        • et al.
        STR analysis of artificially degraded DNA-results of a collaborative European exercise.
        Forensic Sci. Int. 2004; 139: 123-134
        • Gill P.
        • Brenner C.H.
        • Buckleton J.S.
        • Carracedo A.
        • Krawczak M.
        • Mayr W.R.
        • Morling N.
        • Prinz M.
        • Schneider P.M.
        • Weir B.S.
        • et al.
        DNA commission of the International Society of Forensic Genetics: recommendations on the interpretation of mixtures.
        Forensic Sci. Int. 2006; 160: 90-101
        • Fondevila M.
        • Phillips C.
        • Naverán N.
        • Cerezo M.
        • Rodríguez A.
        • Calvo R.
        • Fernández L.M.
        • Carracedo Á.
        • Lareu M.V.
        Challenging DNA: assessment of a range of genotyping approaches for highly degraded forensic samples.
        Forensic Sci. Int. Genet. Suppl. Ser. 2008; 1: 26-28
        • Balding D.J.
        • Buckleton J.
        Interpreting low template DNA profiles.
        Forensic Sci. Int. Genet. 2009; 4: 1-10
        • Gibb A.J.
        • Huell A.L.
        • Simmons M.C.
        • Brown R.M.
        Characterisation of forward stutter in the AmpFlSTR SGM Plus PCR.
        Sci. Justice. 2009; 49: 24-31
        • Westen A.A.
        • Nagel J.H.
        • Benschop C.C.
        • Weiler N.E.
        • de Jong B.J.
        • Sijen T.
        Higher capillary electrophoresis injection settings as an efficient approach to increase the sensitivity of STR typing.
        J. Forensic Sci. 2009; 54: 591-598
        • Freire-Aradas A.
        • Fondevila M.
        • Kriegel A.K.
        • Phillips C.
        • Gill P.
        • Prieto L.
        • Schneider P.M.
        • Carracedo A.
        • Lareu M.V.
        A new SNP assay for identification of highly degraded human DNA.
        Forensic Sci. Int. Genet. 2012; 6: 341-349
        • Gittelson S.
        • Biedermann A.
        • Bozza S.
        • Taroni F.
        Decision analysis for the genotype designation in low-template-DNA profiles.
        Forensic Sci. Int. Genet. 2014; 9: 118-133
        • Steele C.D.
        • Greenhalgh M.
        • Balding D.J.
        Evaluation of low-template DNA profiles using peak heights.
        Stat. Appl. Genet. Mol. Biol. 2016; 15: 431-445
        • Dash H.R.
        • Shrivastava P.
        • Das S.
        Analysis of capillary electrophoresis results by GeneMapper® ID-X v 1.5 Software.
        Principles and Practices of DNA Analysis: A Laboratory Manual for Forensic DNA Typing. Springer, 2020: 213-237
        • Alladio E.
        • Omedei M.
        • Cisana S.
        • D'Amico G.
        • Caneparo D.
        • Vincenti M.
        • Garofano P.
        DNA mixtures interpretation - a proof-of-concept multi-software comparison highlighting different probabilistic methods' performances on challenging samples.
        Forensic Sci. Int. Genet. 2018; 37: 143-150
        • Coble M.D.
        • Bright J.A.
        Probabilistic genotyping software: an overview.
        Forensic Sci. Int. Genet. 2019; 38: 219-224
        • Haned H.
        • Slooten K.
        • Gill P.
        Exploratory data analysis for the interpretation of low template DNA mixtures.
        Forensic Sci. Int. Genet. 2012; 6: 762-774
        • Bleka O.
        • Storvik G.
        • Gill P.
        EuroForMix: an open source software based on a continuous model to evaluate STR DNA profiles from a mixture of contributors with artefacts.
        Forensic Sci. Int. Genet. 2016; 21: 35-44
        • Taylor D.
        • Bright J.A.
        • Buckleton J.
        The interpretation of single source and mixed DNA profiles.
        Forensic Sci. Int. Genet. 2013; 7: 516-528
        • Haned H.
        • Benschop C.C.
        • Gill P.D.
        • Sijen T.
        Complex DNA mixture analysis in a forensic context: evaluating the probative value using a likelihood ratio model.
        Forensic Sci. Int. Genet. 2015; 16: 17-25
        • Hill C.R.
        • Duewer D.L.
        • Kline M.C.
        • Coble M.D.
        • Butler J.M.
        US population data for 29 autosomal STR loci.
        Forensic Sci. Int. Genet. 2013; 7: e82-e83