Research Article | Volume 5, e104-e106, December 2015

Development of new peak-height models for a continuous method of mixture interpretation

  • Sho Manabe
    Correspondence
    Corresponding author at: Department of Forensic Medicine, Kyoto University Graduate School of Medicine, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan. Fax: +81 75 761 9591.
    Affiliations
    Department of Forensic Medicine, Kyoto University Graduate School of Medicine, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan

    Research Fellow of Japan Society for the Promotion of Science, Japan
  • Yuya Hamano
    Affiliations
    Department of Forensic Medicine, Kyoto University Graduate School of Medicine, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan

    Forensic Science Laboratory, Kyoto Prefectural Police Headquarters, 85-3, 85-4, Yabunouchi-cho, Kamigyo-ku, Kyoto 602-8550, Japan
  • Chihiro Kawai
    Affiliations
    Department of Forensic Medicine, Kyoto University Graduate School of Medicine, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
  • Chie Morimoto
    Affiliations
    Department of Forensic Medicine, Kyoto University Graduate School of Medicine, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
  • Keiji Tamaki
    Affiliations
    Department of Forensic Medicine, Kyoto University Graduate School of Medicine, Yoshida-Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
Published: September 18, 2015
DOI: https://doi.org/10.1016/j.fsigss.2015.09.042

      Abstract

      DNA mixture interpretation based on a continuous model is an effective strategy for calculating rigorous likelihood ratios using peak heights and considering stochastic effects. Such a model would require the elucidation of various biological parameters affecting the expected peak heights. In the present study, we estimated the distributions of locus-specific amplification efficiency, heterozygote balance, and stutter ratio in 15 commercially available short tandem repeat (STR) loci using 234 single-source DNA samples. Our data suggested that the locus-specific amplification efficiency followed a normal distribution, whereas the heterozygote balance followed a log-normal distribution for each locus. We modeled log-normal distributions for stutter ratios with allele-specific mean values, which exhibited a positive correlation with allele repeat numbers. However, with the D8S1179, D21S11, and D2S1338 loci, the log-normal distribution did not fit our data because of the complex repeat structures involved. Therefore, an alternative model for each of these three loci will need to be incorporated into a software program based on a continuous model.

      1. Introduction

      DNA mixture interpretation using short tandem repeat (STR) loci has conventionally been based on a binary model that does not account for peak-height information in DNA profiles. In recent years, some countries have begun to use continuous models, which use peak heights and account for stochastic effects (e.g., allele drop-out), to calculate rigorous likelihood ratios [Taylor et al., 2013]. Such models can avoid some of the criticisms regarding the subjectivity of DNA mixture interpretation.
      Appropriate use of the continuous model requires application of some biological parameters affecting the probability of the peak heights given all the possible genotype combinations of the contributors. In the present study, we estimated the distributions of three parameters (i.e., locus-specific amplification efficiency, heterozygote balance, and stutter ratio) in 15 commercially available STR loci using single-source DNA samples.

      2. Materials and methods

      2.1 STR typing

      Buccal samples were collected from 276 individuals using a Buccal DNA Collector (Bode Technology, Lorton, VA). DNA was extracted from the buccal cells using a BioRobot® EZ1 (Qiagen, Hilden, Germany) with the EZ1 DNA Investigator Kit according to standard protocols. Extracted DNA was amplified using an AmpFlSTR® Identifiler® Plus PCR Amplification Kit (Life Technologies, Carlsbad, CA) following the manufacturer’s instructions. PCR products were separated on an ABI 3130xl Genetic Analyzer (Life Technologies), and the data were analyzed using GeneMapper™ ID version 3.2.1 (Life Technologies) with a limit of detection of 30 relative fluorescence units (RFU). We excluded 42 DNA samples from our estimation of the distributions of the parameters because of primer-binding site mutations (n = 4), tri-allelic patterns (n = 1), off-ladder alleles (n = 5), and pull-up peaks stacked on the stutter peaks (n = 33, including one sample in which we also detected an off-ladder allele). The remaining 234 DNA samples were used to estimate the distributions of the three parameters.

      2.2 Calculation of the three parameters

      Locus-specific amplification efficiency ($A_l$) was defined as follows:
      $$A_l = \frac{T_l}{\bar{T}}$$
      where $T_l$ denotes the sum of all allelic and stutter peak heights in locus $l$ ($l = 1, 2, \ldots, L$), and $\bar{T}$ denotes the mean value of $T_l$ (i.e., $\bar{T} = \sum_{l=1}^{L} T_l / L$). We calculated $A_l$ values in each locus of all 234 experimental profiles.
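      The definition above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the RFU values are assumptions made for the example and are not taken from the study.

```python
from statistics import mean

def amplification_efficiency(locus_totals):
    """Return A_l = T_l / T_bar for each locus of one profile.

    locus_totals maps a locus name to T_l, the sum of all allelic and
    stutter peak heights (RFU) observed at that locus.
    """
    t_bar = mean(locus_totals.values())  # T_bar = sum of T_l over loci / L
    return {locus: t_l / t_bar for locus, t_l in locus_totals.items()}

# Hypothetical three-locus profile (illustrative RFU values, not study data).
profile = {"D8S1179": 3000.0, "D18S51": 1500.0, "TH01": 1500.0}
efficiencies = amplification_efficiency(profile)
print(efficiencies)
```

      Because each total is divided by the profile mean, the efficiencies of a single profile always average to one, so a value above one marks a locus that amplified better than the profile average.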
      Heterozygote balance (Hb) was defined as follows:
      $$Hb = \frac{O_{a-1} + O_a}{O_{a'-1} + O_{a'}}$$
      where $O_a$ refers to the height of the low-molecular-weight allele, and $O_{a'}$ is the height of the high-molecular-weight allele. $O_{a-1}$ and $O_{a'-1}$ denote the stutter peak heights of alleles $a$ and $a'$, respectively. To implement the Hb distribution in a software program based on a continuous model, the effects of stutter ratio should be eliminated from the Hb calculation. Thus, we defined Hb as the ratio of total allelic products (i.e., the sum of the allele peak height and stutter peak height), not the ratio of allele peak heights. If the stutter peak of the high-molecular-weight allele was masked by the low-molecular-weight allele (i.e., $a = a' - 1$), we did not calculate the Hb value.
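      A minimal sketch of this calculation follows. Several points are assumptions of the sketch rather than statements from the study: alleles are represented by integer repeat numbers, a stutter peak that was not detected contributes a height of zero, and the ratio is oriented as the low-molecular-weight total over the high-molecular-weight total. The function name and peak heights are hypothetical.

```python
def heterozygote_balance(a, a_prime, heights):
    """Hb from total allelic products, or None when it is not calculated.

    a, a_prime: repeat numbers of the low- and high-molecular-weight alleles.
    heights: dict mapping repeat number -> observed peak height (RFU).
    Assumption: an undetected stutter peak counts as 0 RFU.
    """
    if a_prime - a == 1:
        return None  # stutter of a' sits under allele a, so Hb is skipped
    total_low = heights.get(a - 1, 0.0) + heights[a]
    total_high = heights.get(a_prime - 1, 0.0) + heights[a_prime]
    # Assumed orientation: low-molecular-weight total in the numerator.
    return total_low / total_high

# Hypothetical heterozygote 12,15 with stutter peaks at 11 and 14.
peaks = {11: 80.0, 12: 1000.0, 14: 100.0, 15: 980.0}
hb = heterozygote_balance(12, 15, peaks)
masked = heterozygote_balance(12, 13, {11: 80.0, 12: 1000.0, 13: 950.0})
print(hb, masked)
```

      In the example the allele peaks differ (1000 vs. 980 RFU), but the totals including stutter are equal, which is the situation the total-product definition is meant to capture.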
      Stutter ratio (SR) was calculated as follows:
      $$SR = \frac{O_{a-1}}{O_a}$$
      where $O_a$ refers to the height of allele $a$, and $O_{a-1}$ refers to the stutter peak height of allele $a$. If the stutter position was the same as another allelic position in a heterozygous locus, we did not calculate the SR value.
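      The stutter ratio and its exclusion rule can be sketched as below. The function name and data are hypothetical, alleles are assumed to be integer repeat numbers, and returning None when no stutter peak was detected is an assumption of the sketch (the handling of undetected stutter is not described above).

```python
def stutter_ratio(allele, genotype, heights):
    """SR = O_{a-1} / O_a, or None when SR is not calculated.

    genotype: set of allele repeat numbers at the locus.
    heights: dict mapping repeat number -> observed peak height (RFU).
    """
    stutter_pos = allele - 1
    if stutter_pos in genotype:
        return None  # stutter position coincides with the other allele
    if stutter_pos not in heights:
        return None  # assumption: skip alleles with no detected stutter peak
    return heights[stutter_pos] / heights[allele]

# Hypothetical heterozygote 12,15 (illustrative RFU values).
peaks = {11: 80.0, 12: 1000.0, 14: 98.0, 15: 980.0}
sr_12 = stutter_ratio(12, {12, 15}, peaks)
sr_15 = stutter_ratio(15, {12, 15}, peaks)
# Adjacent alleles 12,13: the stutter of 13 falls on allele 12.
sr_masked = stutter_ratio(13, {12, 13}, {11: 80.0, 12: 1000.0, 13: 950.0})
print(sr_12, sr_15, sr_masked)
```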

      3. Results and discussion

      Fig. 1 shows the distribution of the $A_l$ values in each locus. The D8S1179 locus had the highest median $A_l$ (1.37), whereas the D18S51 locus had the lowest (0.757). We assumed that $A_l$ followed a normal distribution because the data were symmetrically distributed, and we checked this assumption using quantile–quantile (Q–Q) plots, in which the observed $A_l$ values agreed well with the quantiles of the normal distribution.
      Fig. 1. Distributions of the $A_l$ values at each locus.
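      A Q–Q check of this kind can be sketched with the Python standard library alone. The plotting positions $(i - 0.5)/n$, the use of a correlation coefficient as a numerical summary, the function names, and the data are all assumptions of this sketch; the study itself reports only a visual comparison.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(values):
    """Pair each sorted observation with its theoretical normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

def pearson_r(points):
    """Pearson correlation between the two coordinates of each point."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical A_l values for one locus (illustrative, not study data).
a_l_values = [0.60, 0.68, 0.73, 0.76, 0.79, 0.84, 0.92]
points = normal_qq_points(a_l_values)
r = pearson_r(points)
print(f"Q-Q correlation: {r:.3f}")
```

      Points lying close to a straight line (a correlation near one) support the distributional assumption. To check a log-normal assumption, the same functions can be applied to the logarithms of the values.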
      In the same way, we investigated the distribution of the Hb values at each locus. The median values were nearly equal to one for all loci. We assumed that Hb followed a log-normal distribution because the data were symmetrically distributed on a logarithmic scale. The Q–Q plots of the log-normal distribution showed good agreement with the observed Hb values.
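      The link between a median near one and the log-normal assumption can be written out briefly. The symbols $\mu$ and $\sigma$ and the choice of the natural logarithm are notation assumed here for illustration; they do not appear in the text above.

```latex
\ln Hb \sim \mathcal{N}(\mu, \sigma^2)
\;\Longrightarrow\;
\operatorname{median}(Hb) = e^{\mu},
\qquad
\operatorname{median}(Hb) \approx 1
\;\Longrightarrow\;
\mu \approx 0 .

\text{If } \mu = 0:\quad
\ln\!\left(\frac{1}{Hb}\right) = -\ln Hb \sim \mathcal{N}(0, \sigma^2) .
```

      In the idealized case $\mu = 0$, Hb and its reciprocal therefore share the same distribution, so the spread $\sigma$ is the only quantity left to estimate for each locus.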
      Fig. 2 shows the distribution of SR values at the D18S51 locus. The SR values were positively correlated with allele repeat numbers. We observed this trend in 11 loci but not in D8S1179, D21S11, TH01, or D2S1338. Following a previous report, we assumed that the SR values in the 11 loci followed a log-normal distribution with allele-specific mean values [Bright et al., 2013]. Q–Q plots showed that this assumption predicted the observed SR values well.
      Fig. 2. Plots of SR vs. allele repeat number in D18S51.
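      One way to obtain allele-specific parameters for such a log-normal model is to average the logarithm of SR within each allele, as sketched below. The grouping approach, the function name, and the data are assumptions made for illustration; the estimation procedure actually used in the study is not described here.

```python
import math
from collections import defaultdict
from statistics import mean

def allele_specific_log_means(observations):
    """Mean of ln(SR) for each allele repeat number.

    observations: iterable of (repeat_number, stutter_ratio) pairs.
    """
    grouped = defaultdict(list)
    for repeat_number, sr in observations:
        grouped[repeat_number].append(math.log(sr))
    return {r: mean(logs) for r, logs in sorted(grouped.items())}

# Hypothetical D18S51-like observations (illustrative, not study data).
observations = [
    (13, 0.060), (13, 0.066),
    (15, 0.078), (15, 0.084),
    (17, 0.098), (17, 0.104),
]
log_means = allele_specific_log_means(observations)
# exp(mean of ln SR) is the geometric mean, which equals the median
# of the log-normal distribution fitted to that allele.
medians = {r: math.exp(m) for r, m in log_means.items()}
print(medians)
```

      With these made-up values the fitted medians rise with repeat number, mirroring the positive correlation described above.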
      In the TH01 locus, the SR values of allele 9.3 were close to those of allele 6. Bright et al. showed that the longest uninterrupted stretch (LUS) is a more reliable predictor of SR than the allele repeat number [Bright et al., 2013]. The LUS value of allele 9.3 is 6 according to a previous sequence analysis [Butler and Reeder, STRBase]. When allele 9.3 was treated as having a repeat number of 6, the SR values of the TH01 locus also followed the log-normal distribution.
      However, LUS values could not be determined for D8S1179, D21S11, and D2S1338. For example, there are two types of repeat structures in the D8S1179 locus (i.e., $[\mathrm{TCTA}]_a$ and $\mathrm{TCTA}\ \mathrm{TCTG}\ [\mathrm{TCTA}]_{a-2}$ for allele $a$) [Butler and Reeder, STRBase], so the LUS cannot be inferred from the allele designation alone. Therefore, for these three loci, an alternative model is required and must be incorporated into a software program based on a continuous model.
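      The D8S1179 example can be made concrete with a short sketch that counts the longest uninterrupted run of a motif in a sequence. The function name and the choice of allele 13 are assumptions for illustration; the two sequences are built directly from the repeat structures quoted above.

```python
import re

def longest_uninterrupted_stretch(sequence, motif):
    """Largest number of consecutive, uninterrupted copies of motif."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)

a = 13  # hypothetical allele designation
simple = "TCTA" * a                            # [TCTA]a
compound = "TCTA" + "TCTG" + "TCTA" * (a - 2)  # TCTA TCTG [TCTA]a-2

lus_simple = longest_uninterrupted_stretch(simple, "TCTA")
lus_compound = longest_uninterrupted_stretch(compound, "TCTA")
print(lus_simple, lus_compound)
```

      Both sequences contain 13 repeat units and would receive the same allele designation from fragment length, yet their LUS values differ by two, which is why length-based typing alone cannot supply the LUS at this locus.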

      Role of funding

      None.

      Conflict of interest

      None.

      Acknowledgment

      This work was supported by a Grant-in-Aid for JSPS Fellows (JSPS KAKENHI grant number 14J03372).

      References

        1. Taylor D., Bright J.-A., Buckleton J. The interpretation of single source and mixed DNA profiles. Forensic Sci. Int. Genet. 2013; 7: 516-528
        2. Bright J.-A., Taylor D., Curran J.M., Buckleton J.S. Developing allelic and stutter peak height models for a continuous method of DNA interpretation. Forensic Sci. Int. Genet. 2013; 7: 296-304
        3. Butler J.M., Reeder D.J. Short tandem repeat DNA internet database. Available from: www.cstl.nist.gov/biotech/strbase.