The number of contributors is hard to determine in DNA mixture profiles. Here, we deal with the special but frequent case that either two or three contributors are possible. In fact, it might happen that two contributors can explain the number of alleles seen but that three contributors are necessary if a specific person of interest is to be included in the mixture. Then the likelihood ratio assuming two contributors will be zero while the likelihood ratio for three contributors may be large. We evaluate this situation and offer suggestions on how to arrive at an overall likelihood ratio. To exemplify our line of reasoning we use an example proposed by Biedermann, Taroni and Thompson.
In DNA mixtures it is often not clear which number of contributors (NoC) to choose. There has been an ongoing debate on whether the NoC should be the same for the two hypotheses compared in a likelihood ratio (LR) calculation [
] where it was clarified under which conditions the same or different NoC should be selected. In this study we focus on a situation that arises frequently in casework with mixed DNA profiles: the trace profile shows at most four alleles at each locus. In a common (though overly simplified) approach, two contributors might then be assumed. Later, the profile of a person of interest (PoI) is obtained, and it turns out that, if this PoI is a contributor, three contributors are necessary to explain the mixed trace. Is it then reasonable to calculate the LR for three contributors (as the prosecution might favour), or must two contributors be chosen (as the defense might insist)? We consider this problem in the following and give practical advice on how to handle this scenario. In this manuscript, we frequently refer to a highly illustrative example given by Biedermann, Taroni and Thompson [
] in the following way: “Two brothers B and C were prosecuted for murder. According to the prosecution theory, the brothers entered a store where B wrestled with the clerk and C shot the clerk with a handgun. A video surveillance tape showed the crime occurring. Although the faces of the assailants could not be seen, the video revealed that the shooter had worn a hat that was found at the crime scene. Key evidence in the case was the DNA profile for a mixed stain found on the hat […].” [
] This seven-locus stain had three or four alleles at each locus, so a NoC of two might be acceptable. When comparing suspect C to the stain, however, it was found that C was homozygous for allele 13 at locus D5S818, whereas the trace consisted of alleles {8, 11, 12, 13}. If we denote by Hp the hypothesis that C contributed to the trace and by D the profile data, then three contributors must be assumed under Hp to obtain a non-zero likelihood L(D) = P(D|Hp) (under a simple model that ignores peak heights and allows no drop-in or drop-out). A similar problem was noted for locus D21S11.
] the NoC was treated as a nuisance parameter in Bayesian fashion. Let us suppose that we have a minimal NoC nmin and a maximal NoC nmax and that all NoCs between nmin and nmax are possible under both Hp and Hd. Then the overall LR LR(D) = P(D|Hp)/P(D|Hd) can be written as a weighted average over the LR(n)s for n contributors:
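\[
\mathrm{LR}(D) = \sum_{n=n_{\min}}^{n_{\max}} \mathrm{LR}(n)\, P(\mathrm{NoC}=n \mid D, H_d)\, \frac{P(\mathrm{NoC}=n \mid H_p)}{P(\mathrm{NoC}=n \mid H_d)} \qquad (1)
\]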
Here, P(NoC = n|D, Hd) is the posterior probability of having n contributors, given the data D, when the PoI is not contributing. The expressions P(NoC = n|Hp) and P(NoC = n|Hd) are the prior probabilities of having n contributors under Hp and Hd, respectively. These prior probabilities do not take the genetic information D into account and are determined solely by other case circumstances. Although they might differ between Hp and Hd, this is sensible only in some cases [
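] and we will restrict ourselves to the case that P(NoC = n|Hp) = P(NoC = n|Hd). Then the prior probabilities in Eq. (1) cancel out and we obtain the simplified equation
\[
\mathrm{LR}(D) = \sum_{n=n_{\min}}^{n_{\max}} \mathrm{LR}(n)\, P(\mathrm{NoC}=n \mid D, H_d)
\]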
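The weighted-average form can be traced back to the law of total probability. The following is a sketch of the intermediate steps, not a reproduction of the original derivation; it assumes that LR(n) denotes P(D|NoC = n, Hp)/P(D|NoC = n, Hd) and that every NoC between nmin and nmax has non-zero prior probability under Hd.

```latex
\begin{align*}
P(D \mid H_p) &= \sum_{n} P(D \mid \mathrm{NoC}=n, H_p)\, P(\mathrm{NoC}=n \mid H_p)
  = \sum_{n} \mathrm{LR}(n)\, P(D \mid \mathrm{NoC}=n, H_d)\, P(\mathrm{NoC}=n \mid H_p) \\
\mathrm{LR}(D) &= \frac{P(D \mid H_p)}{P(D \mid H_d)}
  = \sum_{n} \mathrm{LR}(n)\,
    \frac{P(D \mid \mathrm{NoC}=n, H_d)\, P(\mathrm{NoC}=n \mid H_d)}{P(D \mid H_d)}\,
    \frac{P(\mathrm{NoC}=n \mid H_p)}{P(\mathrm{NoC}=n \mid H_d)} \\
  &= \sum_{n} \mathrm{LR}(n)\, P(\mathrm{NoC}=n \mid D, H_d)\,
    \frac{P(\mathrm{NoC}=n \mid H_p)}{P(\mathrm{NoC}=n \mid H_d)}
\end{align*}
```

The middle fraction in the second line is Bayes' theorem for the NoC under Hd. With equal priors under Hp and Hd the last fraction equals one, which leaves the sum of LR(n) P(NoC = n|D, Hd) over n.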
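Let us now return to the case of either two or three contributors. For simplicity we assume that other NoCs are not possible. Then the sum consists of only two terms:
\[
\mathrm{LR}(D) = \mathrm{LR}(2)\, P(\mathrm{NoC}=2 \mid D, H_d) + \mathrm{LR}(3)\, P(\mathrm{NoC}=3 \mid D, H_d)
\]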
We are interested in the situation that under Hp only three contributors are possible. That means that LR(2) = 0 and we obtain the central equation
\[
\mathrm{LR}(D) = \mathrm{LR}(3)\, P(\mathrm{NoC}=3 \mid D, H_d) \qquad (2)
\]
which we will investigate further to obtain guidance on how to deal with the ambiguity in the NoC.
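As an illustration, the weighted average over the NoC can be sketched in a few lines of Python. The function name `overall_lr` and the numerical values are assumptions made for this example only; they are not taken from any case. The sketch assumes equal NoC priors under Hp and Hd, so that only the posterior probabilities P(NoC = n|D, Hd) enter.

```python
def overall_lr(lr_by_noc, posterior_by_noc):
    """Overall LR as the average of LR(n), weighted by P(NoC = n | D, Hd).

    Both arguments are dicts keyed by the number of contributors n.
    Assumes equal NoC priors under Hp and Hd (the simplified equation).
    """
    if set(lr_by_noc) != set(posterior_by_noc):
        raise ValueError("both dicts must cover the same NoC values")
    if abs(sum(posterior_by_noc.values()) - 1.0) > 1e-9:
        raise ValueError("posterior probabilities must sum to 1")
    return sum(lr_by_noc[n] * posterior_by_noc[n] for n in lr_by_noc)


# Hypothetical values: LR(2) = 0 because the PoI cannot be included
# with two contributors, LR(3) = 1000, and a posterior of 0.6 for NoC = 3.
example = overall_lr({2: 0.0, 3: 1000.0}, {2: 0.4, 3: 0.6})
print(example)
```

Because LR(2) = 0, only the NoC = 3 term contributes, which is exactly the structure of Eq. (2).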
3. Results
When handling the two versus three contributors situation, we have two LRs to consider: LR(2) = 0 and LR(3). The latter might be large. Which LR should be reported? Looking at the overall LR of Eq. (2), we see that LR(3) is weighted by P(NoC = 3|D, Hd), the probability of having three contributors given the trace data when the PoI has not contributed. This probability is between 0 and 1, so the overall LR is between 0 and LR(3). To evaluate whether the overall LR is nearer to 0 or to LR(3), the weighting factor P(NoC = 3|D, Hd) has to be estimated, at least roughly. To do that, we apply Bayes' theorem and, since only two or three contributors are possible, use P(NoC = 2) = 1 − P(NoC = 3) to write this probability in a different form:
\[
P(\mathrm{NoC}=3 \mid D, H_d) = \frac{\mathrm{LR}_d\, P(\mathrm{NoC}=3)}{\mathrm{LR}_d\, P(\mathrm{NoC}=3) + 1 - P(\mathrm{NoC}=3)}
\]
where LRd = P(D|NoC = 3, Hd)/P(D|NoC = 2, Hd). We see that P(NoC = 3|D, Hd) depends on the two terms LRd and P(NoC = 3) and we will consider these separately in the following.
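The weighting factor can be computed directly from LRd and the prior. The sketch below assumes, as in the text, that only two or three contributors are possible, so that P(NoC = 2) = 1 − P(NoC = 3), and applies Bayes' theorem under Hd. The function name `posterior_noc3` is a hypothetical helper introduced for illustration.

```python
def posterior_noc3(lr_d, prior_noc3):
    """P(NoC = 3 | D, Hd) from LRd and the prior P(NoC = 3).

    lr_d       = P(D | NoC = 3, Hd) / P(D | NoC = 2, Hd)
    prior_noc3 = P(NoC = 3); P(NoC = 2) is taken as 1 - prior_noc3.
    """
    if lr_d < 0:
        raise ValueError("lr_d must be non-negative")
    if not 0.0 <= prior_noc3 <= 1.0:
        raise ValueError("prior_noc3 must lie in [0, 1]")
    numerator = lr_d * prior_noc3
    denominator = numerator + (1.0 - prior_noc3)
    if denominator == 0:
        raise ValueError("posterior undefined for lr_d = 0 and prior_noc3 = 1")
    return numerator / denominator


# If the data do not discriminate between the NoCs (LRd = 1),
# the posterior equals the prior.
print(posterior_noc3(1.0, 0.3))
# A very large LRd pushes the weight towards one.
print(posterior_noc3(1e6, 0.5))
```

The two calls illustrate the limiting behaviour: an uninformative LRd leaves the prior unchanged, whereas a large LRd makes the overall LR approach LR(3).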
3.1 The likelihood ratio LRd
LRd = P(D|NoC = 3, Hd)/P(D|NoC = 2, Hd) compares the probability of the trace data under three versus two contributors when the PoI is not part of the mixture. LRd will be large if the mixture data are much more probable under three contributors than under two. It is therefore not surprising that for large values of LRd the weighting factor P(NoC = 3|D, Hd) also becomes large, i.e. close to one, and the overall LR is close to LR(3). LRd is an LR for two different propositions, namely two or three contributors, and can therefore be readily calculated from the data.
3.2 The prior probability P(NoC=3)
Whereas the mixture profile D is essential for LRd, P(NoC=3) is the prior probability of having three contributors to the trace without using the genetic information. The larger the prior probability for three contributors, the nearer the overall LR will be to LR(3). P(NoC=3) can only be derived from the non-genetic evidence and case circumstances. In contrast to LRd, it therefore cannot be calculated and can at best be loosely estimated. Depending on the information available for a specific case, it might be possible to infer the order of magnitude of P(NoC=3), e.g. from eyewitness evidence or when it is known that only two persons have access to a particular item (such as victim and PoI). In other circumstances, no information about P(NoC=3) might be available. Then a prior of 1/2 could be one possibility. For sensitivity analysis, a number of sensible priors should be taken into account.
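A sensitivity analysis of this kind can be sketched as follows. The values of LR(3) and LRd and the grid of priors are hypothetical and chosen only for illustration; the helper names are likewise assumptions. The sketch again assumes that only two or three contributors are possible and that LR(2) = 0.

```python
def overall_lr_for_prior(lr3, lr_d, prior_noc3):
    """Overall LR = LR(3) * P(NoC = 3 | D, Hd), assuming LR(2) = 0."""
    denominator = lr_d * prior_noc3 + (1.0 - prior_noc3)
    if denominator == 0:
        raise ValueError("undefined for lr_d = 0 and prior_noc3 = 1")
    return lr3 * lr_d * prior_noc3 / denominator


def sensitivity_table(lr3, lr_d, priors):
    """Return (prior, overall LR) pairs for a range of priors P(NoC = 3)."""
    return [(p, overall_lr_for_prior(lr3, lr_d, p)) for p in priors]


# Hypothetical inputs, not taken from any case.
HYPOTHETICAL_LR3 = 1000.0
HYPOTHETICAL_LRD = 5.0
PRIORS = (0.1, 0.25, 0.5, 0.75, 0.9)

table = sensitivity_table(HYPOTHETICAL_LR3, HYPOTHETICAL_LRD, PRIORS)
for prior, lr in table:
    print(f"P(NoC=3) = {prior:4.2f}  ->  overall LR = {lr:7.1f}")
```

Reporting the whole range, or a value near its lower end, makes explicit how strongly the conclusion depends on the non-genetic assumptions.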
3.3 Returning to the Hat Example
Let us now return to the Hat Example of Biedermann, Taroni and Thompson and try to apply our previous considerations. For this mixture stain, we have LR(2) = 0 and LR(3) ≈ 10,000 [
]. The overall LR therefore is between 0 and 10,000. To assess the weighting factor P(NoC = 3|D, Hd), we have to consider LRd and P(NoC=3). As noted before, LRd can be calculated from the data and is 2.27 in this case [
]. As usual, the determination of the prior probability P(NoC=3) is much more difficult. Because the stain is on a hat, which may have been touched by many or by few people, no real prior information is available. Both two and three contributors are possible. The dependence of the overall LR on the prior P(NoC=3) is shown in Fig. 1. If we take P(NoC=3) = 1/2, then an LR of around 7000 results. Because the prior distribution is uncertain, and to be conservative, it would make sense to arrive at an overall LR which is somewhat lower, e.g. around 5000.
Fig. 1. LR as a function of P(NoC=3) for the Hat Example, where LR(2) = 0, LR(3) ≈ 10,000, LRd = 2.27.
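The numbers of the Hat Example can be checked with a short calculation. The sketch below uses the values stated in the text (LR(2) = 0, LR(3) ≈ 10,000, LRd = 2.27); the function names are hypothetical, and the inverse function, which asks what prior corresponds to a given overall LR, is an addition for illustration rather than part of the original analysis.

```python
LR3 = 10_000.0  # LR(3) for the Hat Example, as stated in the text
LR_D = 2.27     # LRd for the Hat Example, as stated in the text


def hat_overall_lr(prior_noc3):
    """Overall LR for the Hat Example as a function of the prior P(NoC = 3)."""
    weight = LR_D * prior_noc3 / (LR_D * prior_noc3 + (1.0 - prior_noc3))
    return LR3 * weight


def prior_for_target_lr(target_lr):
    """Prior P(NoC = 3) that yields a given overall LR (0 <= target_lr <= LR3)."""
    if not 0.0 <= target_lr <= LR3:
        raise ValueError("target_lr must lie between 0 and LR(3)")
    w = target_lr / LR3  # required weight P(NoC = 3 | D, Hd)
    return w / (LR_D * (1.0 - w) + w)


# A prior of 1/2 gives about 6,942, i.e. "around 7000".
print(round(hat_overall_lr(0.5)))
# An overall LR of 5000 corresponds to a prior of 1 / (1 + LRd), about 0.31.
print(round(prior_for_target_lr(5000.0), 2))
```

Under these assumptions, the more conservative value of around 5000 corresponds to a prior P(NoC=3) of roughly 0.31 instead of 1/2.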
This manuscript deals with a special case in the NoC discussion to make the results clear and accessible also to non-statisticians. This is also the reason why we apply the results to the elucidating Hat Example of Biedermann, Taroni and Thompson [
]. Of course, other NoCs are possible in general. The Hat Example regards a simple discrete model without drop-out or drop-in. If fully continuous models are considered (so-called ‘probabilistic genotyping’), the situation changes somewhat because, if too many contributors are chosen, their contribution might be modelled as nearly zero. Another important simplification in this manuscript is that only one prior distribution for the NoC is applied, the same for Hp and Hd. Although this is a reasonable approximation for many cases, there are also situations where it is not appropriate. For more information on these subjects see [
Because drop-out, drop-in and minor contributions are possible for DNA mixtures, the NoC can never be determined without uncertainty. Therefore, calculation of the LR for several NoCs is required. The probabilities of the data for different NoCs under Hd and the prior distribution of the NoC then influence the overall LR.
Conflict of interest statement
The author of this manuscript declares no conflict of interest.
Acknowledgements
The author would like to thank Klaas Slooten for many happy and fruitful discussions.
References
Biedermann A., Taroni F., Thompson W.C. Using graphical probability analysis (Bayes Nets) to evaluate a conditional DNA inclusion.