Taking peak area information into account when analysing STR DNA mixtures is acknowledged to be a difficult task. There have been a number of non-probabilistic approaches proposed in the literature, and some have been incorporated into computer systems, but comparatively little has been published from a probabilistic perspective. Here we briefly review our previous work on using Bayesian networks to analyse two-person mixtures within a probabilistic framework, and present preliminary results obtained for analysing two-person and three-person mixtures that combine peak area information from multiple independent samples.
In previous work (see References) we have presented a probabilistic methodology for analysing peak area information from DNA mixtures based on Bayesian networks. A representative fragment of these networks is shown in Fig. 1 for a two-person mixture. It represents peak area information on three alleles, denoted by a, b and c, of some marker system. At the top are two nodes representing the genotypes of the contributors $p_1$ and $p_2$. On the next layer are nodes such as $n_{ia}$ that count the number of alleles of type $a$ that person $p_i$ has. These nodes take values in the set $\{0, 1, 2\}$. They depend on the genotypes of the persons; this dependence is represented by the directed arrows from the genotype nodes to the count nodes. The node to the left represents the relative proportions of DNA in the mixture from each contributor prior to PCR amplification, so that the proportion from person $p_i$ is $\theta_i$, with $\theta_1 + \theta_2 = 1$. From the proportions and the allele count nodes we calculate the mean $\mu_a = (\theta_1 n_{1a} + \theta_2 n_{2a})/2$, with similar formulae for the mean nodes $\mu_b$ and $\mu_c$. These are the fractions of alleles of type a, b and c for the marker in the mixture prior to PCR amplification. The bottom layer of nodes represents the peak areas of the individual alleles as measured by the PCR apparatus after amplification of the mixture sample. We model the stochastic variations in these areas by Gamma distributions, where the Gamma distribution of the area for allele a depends on the mean $\mu_a$ and has expectation proportional to $\mu_a$; similarly for alleles b and c. For further details of the Gamma model and Bayesian networks, and how the probability calculations are performed, see the References.
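The mean and peak-area layers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `theta` (contributor proportions), `counts` (allele counts per contributor) and `mu` (mean fraction) are chosen here, the shape/scale parameterisation of the Gamma distribution is one simple way of making the expected area proportional to the mean, and the values of `rho` and `eta` are hypothetical rather than fitted.

```python
import math

def mean_fraction(theta, counts):
    """Fraction of one allele in the mixture before amplification.

    theta:  proportions of DNA from each contributor (assumed to sum to 1).
    counts: number of copies (0, 1 or 2) of the allele held by each contributor.
    """
    return sum(t * n for t, n in zip(theta, counts)) / 2.0

def gamma_logpdf(x, shape, scale):
    """Log density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def peak_area_loglik(area, mu, rho=100.0, eta=10.0):
    """Log-likelihood of an observed peak area for an allele with mean fraction mu > 0.

    Assumption: shape = rho * mu and scale = eta, so the expected area is
    rho * eta * mu, i.e. proportional to mu. rho and eta are hypothetical values.
    """
    return gamma_logpdf(area, rho * mu, eta)

# Hypothetical two-person example: person 1 has genotype (a, a), person 2 has (a, b).
theta = (0.7, 0.3)
counts = {"a": (2, 1), "b": (0, 1), "c": (0, 0)}
mu = {allele: mean_fraction(theta, n) for allele, n in counts.items()}
print(mu)
```

An allele that neither contributor carries has mean fraction zero; the sketch leaves that case out of `peak_area_loglik`, since it does not model dropout or drop-in.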
In previous work we have analysed peak-area data on two-person mixtures taken from a variety of publications. Here we illustrate the power of our methodology for combining peak area information from two independent samples that each have the same contributors.
In our first example there are two individuals. Two mixtures were prepared in a laboratory, with each mixture having approximately the same amount of DNA from each person. We separated each mixture individually, and also separated the pair of mixtures together.
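With exactly equal proportions it should not be possible to separate the mixtures, because the peak areas then carry no information about which contributor each allele came from. That we are able to do so indicates that the effective fraction from each contributor was not exactly one half.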
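The following sketch illustrates why an exact 1:1 mixture cannot be separated from mean peak areas alone. It assumes that the pre-amplification fraction of an allele is the proportion-weighted allele count divided by two; the genotypes and the function name `mean_fractions` are hypothetical and chosen only for illustration.

```python
def mean_fractions(theta, genotypes, alleles=("a", "b", "c")):
    """Pre-amplification allele fractions for given proportions and genotypes.

    theta:     proportion of DNA from each contributor.
    genotypes: one pair of alleles per contributor, e.g. ("a", "b").
    """
    return {
        al: sum(t * g.count(al) for t, g in zip(theta, genotypes)) / 2.0
        for al in alleles
    }

# Two different hypothetical genotype assignments with the same total allele counts.
pair_1 = (("a", "a"), ("b", "b"))
pair_2 = (("a", "b"), ("a", "b"))

# Exactly equal proportions: both assignments predict the same fractions.
balanced_1 = mean_fractions((0.5, 0.5), pair_1)
balanced_2 = mean_fractions((0.5, 0.5), pair_2)
print(balanced_1 == balanced_2)

# Slightly unequal proportions: the predictions differ, so separation becomes possible.
unbalanced_1 = mean_fractions((0.6, 0.4), pair_1)
unbalanced_2 = mean_fractions((0.6, 0.4), pair_2)
print(unbalanced_1 == unbalanced_2)
```

Under these assumptions, any departure of the effective proportions from one half makes the predicted fractions depend on who carries which allele, which is what allows the genotypes to be separated.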
Our results are shown in Table 1. Using only the first mixture, the genotypes of both contributors are correctly identified on all markers. Using only the second mixture, the profiles on two markers were not identified correctly (as indicated by italics). When the two traces are combined, both profiles are correctly identified on all markers, with probabilities increased on all but one of the marker profiles. Note especially the increase in the probabilities of the profiles for markers D3 and D19, which were incorrectly identified when analysing the second mixture by itself.
Table 1. Profile separation of a pair of two-person mixtures

| Marker | First trace only (correct all markers) | Second trace only (correct 9 out of 11 markers) | Both traces combined (correct all markers) |
|---|---|---|---|
| Amelogenin | 0.6668 | 0.6392 | 0.7772 |
| D2 | 0.4582 | 0.3838 | 0.6956 |
| D3 | 0.8152 | *0.4854* | 0.8531 |
| D8 | 0.6471 | 0.4831 | 0.7357 |
| D16 | 0.6078 | 0.7534 | 0.7877 |
| D18 | 0.4095 | 0.3574 | 0.6872 |
| D19 | 0.4994 | *0.2928* | 0.6605 |
| D21 | 0.7480 | 0.7485 | 0.8592 |
| FGA | 0.6727 | 0.6058 | 0.7701 |
| TH01 | 1 | 1 | 1 |
| VWA | 0.3529 | 0.7656 | 0.7457 |

Each mixture was prepared in a 1:1 ratio. The mixtures were analysed both individually and together, assuming common contributors. Posterior probabilities shown are for the correct profile, with incorrect identifications italicized.
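The idea of combining traces that share contributors can be sketched as follows. This is a simplified stand-in rather than the Bayesian network computation itself: it enumerates unordered genotype pairs on three alleles with a uniform prior, gives each trace its own unknown mixture proportion (averaged over a grid), and multiplies the per-trace likelihoods because the traces are assumed independent given the genotypes. The peak areas, the parameters `rho` and `eta`, the grid, and all function names are hypothetical.

```python
import math
from itertools import combinations_with_replacement

ALLELES = ("a", "b", "c")
GENOTYPES = list(combinations_with_replacement(ALLELES, 2))   # 6 genotypes
PAIRS = list(combinations_with_replacement(GENOTYPES, 2))     # 21 unordered pairs
THETA_GRID = [i / 100 for i in range(1, 100)]                 # assumed grid for theta_1

def gamma_logpdf(x, shape, scale):
    """Log density of a Gamma(shape, scale) distribution at x > 0."""
    return ((shape - 1.0) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))

def trace_loglik(areas, g1, g2, rho=1000.0, eta=1.0):
    """log P(areas | g1, g2) for one trace, averaged over the theta grid."""
    terms = []
    for th in THETA_GRID:
        ll = 0.0
        for al in ALLELES:
            mu = (th * g1.count(al) + (1.0 - th) * g2.count(al)) / 2.0
            x = areas[al]
            if mu == 0.0 or x <= 0.0:
                # No dropout or drop-in in this sketch: an allele has zero
                # area exactly when neither contributor carries it.
                ll += 0.0 if (mu == 0.0 and x <= 0.0) else -math.inf
            else:
                ll += gamma_logpdf(x, rho * mu, eta)
        terms.append(ll)
    m = max(terms)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(t - m) for t in terms) / len(terms))

def posterior(traces):
    """Posterior over genotype pairs (uniform prior) given independent traces."""
    logw = {(g1, g2): sum(trace_loglik(t, g1, g2) for t in traces)
            for g1, g2 in PAIRS}
    m = max(logw.values())
    w = {k: math.exp(v - m) for k, v in logw.items()}
    total = sum(w.values())
    return {k: v / total for k, v in w.items()}

# Hypothetical peak areas for contributors with genotypes (a, b) and (b, c).
trace_1 = {"a": 260.0, "b": 500.0, "c": 240.0}   # close to 1:1
trace_2 = {"a": 350.0, "b": 500.0, "c": 150.0}   # clearly unbalanced
TRUE_PAIR = (("a", "b"), ("b", "c"))

single_1 = posterior([trace_1])
combined = posterior([trace_1, trace_2])
print(single_1[TRUE_PAIR], combined[TRUE_PAIR])
```

In the sketch, hypotheses are unordered pairs because swapping the two contributors while exchanging their proportions leaves the likelihood unchanged. A realistic analysis would replace the uniform prior with population allele frequencies.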
In our second example, we consider three-person mixtures. We analyse two laboratory-prepared mixtures of differing proportions, using the known profile of one of the contributors. Our results are shown in Table 2. Incorrect classifications are shown in italics. Using only the first mixture, only 3 of the 14 markers were correctly identified, whilst using the second mixture by itself 3 marker profiles were incorrectly identified, the correct profile having a low posterior probability in each case. However, when using both mixtures together all marker profiles are correctly identified. Note in particular the increase in probabilities for the profiles on markers D5, D16 and TH01, none of which were correctly identified with a single mixture analysis.
Table 2. Profile separation of two three-person mixtures, each mixture taken separately and then together assuming common contributors, using the profile of one contributor in all three separations

| Marker | First trace only 1:1:1 (correct 3 out of 14 markers) | Second trace only 1:5:2 (correct 11 out of 14 markers) | Both traces combined (correct all markers) |
|---|---|---|---|
| CSF | 0.145 | 1.000 | 1.000 |
| D2 | 0.178 | 1.000 | 1.000 |
| D3 | 0.285 | 0.768 | 0.987 |
| D5 | 0.432 | *0.190* | 0.883 |
| D7 | 0.179 | 0.930 | 0.975 |
| D8 | 0.270 | 0.739 | 0.776 |
| D16 | 0.171 | *0.299* | 0.967 |
| D18 | 0.126 | 0.999 | 0.999 |
| D19 | 0.360 | 0.927 | 1.000 |
| D21 | 0.154 | 0.997 | 0.997 |
| FGA | 0.400 | 0.892 | 1.000 |
| TH01 | 0.009 | *0.212* | 0.529 |
| TPOX | 0.496 | 0.525 | 0.985 |
| VWA | 0.179 | 0.985 | 0.982 |

Posterior probabilities shown are for the correct profile. Italics mark the incorrect identifications from the second trace (D5, D16 and TH01); the first trace was incorrect on 11 of the 14 markers.
We have presented preliminary results from applying a simple probabilistic model-based approach to mixture peak area values. We believe this is a novel example of combining peak area information from independent mixture samples that contain DNA from the same set of contributors in order to enhance profile separation. Our results show the power and flexibility of the Bayesian network approach. We intend to expand on our findings elsewhere. In addition, the same approach can deal with stutter peaks, and also with possible kinship relationships between contributors to mixtures; again, we intend to publish more details on these additional possibilities elsewhere. In our previous publications we have also shown how the same methodology may be used to find likelihoods and likelihood ratios of hypotheses concerning the contributors to a mixture.
In the future we intend to fine-tune the parameters in our model for better performance, and to analyse data with stutter peaks. We also intend to develop methods to take into account the possibility of dropout.
Conflict of interest
None.
Acknowledgements
We would like to thank the UK Forensic Science Service for providing the data on two-person mixtures analysed in Table 1. We would also like to thank G. Lago of the Raggruppamento Carabinieri Investigazioni Scientifiche, Rome, Italy, for providing the data on three-person mixtures analysed in Table 2.
References
Cowell R.G., Lauritzen S.L., Mortera J. MAIES: a tool for DNA mixture analysis. In: Dechter R., Richardson T. (Eds.), Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence (UAI 2006). 2006: 90-97.