Research Article| Volume 7, ISSUE 1, P234-236, December 2019

# Two or three contributors of DNA mixtures? – Some practical considerations

Published: October 16, 2019

## Abstract

The number of contributors is hard to determine in DNA mixture profiles. Here, we deal with the special but frequent case that either two or three contributors are possible. In fact, it might happen that two contributors can explain the number of alleles seen but that three contributors are necessary if a specific person of interest is to be included in the mixture. Then the likelihood ratio assuming two contributors will be zero while the likelihood ratio for three contributors may be large. We evaluate this situation and offer suggestions on how to arrive at an overall likelihood ratio. To exemplify our line of reasoning we use an example proposed by Biedermann, Taroni and Thompson.

## 1. Introduction

In DNA mixtures it is often unclear which number of contributors (NoC) to choose. There has been an ongoing debate on whether the NoC should be the same for the two hypotheses compared in a likelihood ratio (LR) calculation [Brenner, 2017; Evett, 2017; Gill, 2017; Presciuttini and Egeland, 2016]. This subject has been treated thoroughly in [Slooten and Caliebe, 2018], where it was clarified under which conditions the same or different NoCs should be selected. In this study we focus on a situation that arises frequently in casework for mixed DNA profiles: the trace profile shows at most four alleles at each locus. Then, in a common (though overly simplified) approach, two contributors might be assumed. Later, the profile of a person of interest (PoI) is obtained and it turns out that, if this PoI is a contributor, three contributors are necessary to explain the mixed trace. Is it then reasonable to calculate the LR for three contributors (as the prosecution might favour), or must two contributors be chosen (as the defense might insist)? This problem is considered in the following, and practical advice on how to handle this scenario is given. In this manuscript, we frequently refer to a highly illustrative example given by Biedermann, Taroni and Thompson [Biedermann et al., 2011], which is stated now.

### 1.1 Hat Example of Biedermann, Taroni and Thompson

The so-called ‘Hat Example’ was described in [Biedermann et al., 2011] in the following way: “Two brothers B and C were prosecuted for murder. According to the prosecution theory, the brothers entered a store where B wrestled with the clerk and C shot the clerk with a handgun. A video surveillance tape showed the crime occurring. Although the faces of the assailants could not be seen, the video revealed that the shooter had worn a hat that was found at the crime scene. Key evidence in the case was the DNA profile for a mixed stain found on the hat […].” [Biedermann et al., 2011] This seven-locus stain had three or four alleles at each locus, so a NoC of two might be acceptable. When comparing suspect C to the stain, however, it was found that C was homozygous for allele 13 at locus D5S818, whereas the trace consisted of alleles {8, 11, 12, 13}. Since C accounts only for allele 13, the remaining alleles 8, 11 and 12 cannot all come from a single further contributor. If Hp denotes the hypothesis that C contributed to the trace and D denotes the profile data, then three contributors must be assumed under Hp for a non-zero likelihood L(D) = P(D|Hp) (under a simple model that does not use peak heights and allows no drop-in or drop-out). A similar problem was noted for locus D21S11.
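The allele-counting argument for locus D5S818 can be sketched in a few lines of Python. This is only an illustration under the simple model named above (no drop-in, no drop-out, no peak heights); the function name `min_contributors` and its interface are hypothetical and not taken from the cited work.

```python
from math import ceil


def min_contributors(trace_alleles, known_genotypes=()):
    """Smallest NoC that can explain one locus without drop-in or drop-out.

    trace_alleles:   alleles observed in the trace at this locus.
    known_genotypes: genotypes (allele pairs) of persons assumed to contribute.
    Returns None if an assumed contributor carries an allele missing from the
    trace, because without drop-out that person cannot be a contributor.
    """
    trace = set(trace_alleles)
    explained = set()
    for genotype in known_genotypes:
        if not set(genotype) <= trace:
            return None
        explained |= set(genotype)
    unexplained = trace - explained
    # Every unknown contributor carries at most two distinct alleles.
    unknowns = ceil(len(unexplained) / 2)
    return len(known_genotypes) + unknowns


d5s818 = {8, 11, 12, 13}
print(min_contributors(d5s818))              # C not assumed to contribute
print(min_contributors(d5s818, [(13, 13)]))  # C (13, 13) assumed to contribute
```

For a whole profile, the minimal NoC would be the maximum of this value over all loci.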

## 2. Methods

In [Slooten and Caliebe, 2018] the NoC was treated as a nuisance parameter in a Bayesian fashion. Let us suppose that we have a minimal NoC nmin and a maximal NoC nmax and that all NoCs between nmin and nmax are possible under both Hp and Hd. Then the overall LR, LR(D) = P(D|Hp)/P(D|Hd), can be written as a weighted average over the LR(n)s for n contributors:
$LR(D)=\sum_{n=n_{\min}}^{n_{\max}} LR^{(n)}(D)\,P(\mathrm{NoC}=n\mid D,H_d)\,\frac{P(\mathrm{NoC}=n\mid H_p)}{P(\mathrm{NoC}=n\mid H_d)}.$
(1)

Here, P(NoC = n|D, Hd) is the posterior probability of having n contributors, given the data D, when the PoI is not a contributor. The expressions P(NoC = n|Hp) and P(NoC = n|Hd) are the prior probabilities of having n contributors under Hp and Hd, respectively. These prior probabilities do not take the genetic information D into account; they are determined solely by other case circumstances. Although they might differ between Hp and Hd, this is sensible only in some cases [Slooten and Caliebe, 2018], and we restrict ourselves to the case that P(NoC = n|Hp) = P(NoC = n|Hd). Then the prior probabilities in Eq. (1) cancel out and we obtain the simplified equation
$LR(D)=\sum_{n=n_{\min}}^{n_{\max}} LR^{(n)}(D)\,P(\mathrm{NoC}=n\mid D,H_d).$
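As a minimal sketch, the weighted average can be written as a small Python function. The name `overall_lr`, the dictionary-based interface and the numbers in the usage lines are assumptions made for illustration only. If no priors are passed, the priors under Hp and Hd are taken to be equal and the prior ratio is 1, which gives the simplified equation.

```python
def overall_lr(lr_by_noc, posterior_hd, prior_hp=None, prior_hd=None):
    """Overall LR as a weighted average of the per-NoC LRs, as in Eq. (1).

    lr_by_noc:    {n: LR(n)(D)}
    posterior_hd: {n: P(NoC = n | D, Hd)}; the values should sum to 1.
    prior_hp:     optional {n: P(NoC = n | Hp)}
    prior_hd:     optional {n: P(NoC = n | Hd)}
    """
    total = 0.0
    for n, lr in lr_by_noc.items():
        ratio = 1.0  # equal priors under Hp and Hd unless both are supplied
        if prior_hp is not None and prior_hd is not None:
            ratio = prior_hp[n] / prior_hd[n]
        total += lr * posterior_hd[n] * ratio
    return total


# Hypothetical values: LR(2) = 0, LR(3) = 1000, P(NoC = 3 | D, Hd) = 0.6.
example = overall_lr({2: 0.0, 3: 1000.0}, {2: 0.4, 3: 0.6})
print(example)
```

With LR(2) = 0 only the term for three contributors survives, so the result is LR(3) scaled by the posterior weight of three contributors.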

Let us now return to the case of either two or three contributors. For simplicity we assume that other NoCs are not possible. Then the sum consists only of two terms
$LR(D)=LR^{(2)}\,P(\mathrm{NoC}=2\mid D,H_d)+LR^{(3)}\,P(\mathrm{NoC}=3\mid D,H_d).$

We are interested in the situation that under Hp only three contributors are possible. That means that LR(2) = 0 and we obtain the central equation
$LR(D)=LR^{(3)}\,P(\mathrm{NoC}=3\mid D,H_d)$
(2)

which we will investigate further to obtain guidance on how to deal with the ambiguity in the NoC.

## 3. Results

When handling the two versus three contributors situation, we have two LRs to consider: LR(2) = 0 and LR(3). The latter might be large. Which LR should be reported? Looking at the overall LR of Eq. (2), we see that LR(3) is weighted by P(NoC = 3|D, Hd), the probability of having three contributors derived from the trace data when the PoI has not contributed. This probability lies between 0 and 1, so the overall LR lies between 0 and LR(3). To evaluate whether the overall LR is nearer to 0 or to LR(3), the weighting factor P(NoC = 3|D, Hd) has to be estimated, at least roughly. To do that, we write this probability in a different form:
$P(\mathrm{NoC}=3\mid D,H_d)=\frac{P(\mathrm{NoC}=3)\,LR_d}{P(\mathrm{NoC}=3)\,(LR_d-1)+1}$

where LRd = P(D|NoC = 3, Hd)/P(D|NoC = 2, Hd). We see that P(NoC = 3|D, Hd) depends on the two terms LRd and P(NoC = 3) and we will consider these separately in the following.
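The step from Bayes' theorem to this form is short and can be spelled out as follows. This sketch assumes, as stated in the Methods, that only two or three contributors are possible, so that P(NoC = 2) = 1 − P(NoC = 3), and that P(D|NoC = 2, Hd) is non-zero.

$P(\mathrm{NoC}=3\mid D,H_d)=\frac{P(D\mid \mathrm{NoC}=3,H_d)\,P(\mathrm{NoC}=3)}{P(D\mid \mathrm{NoC}=3,H_d)\,P(\mathrm{NoC}=3)+P(D\mid \mathrm{NoC}=2,H_d)\,\bigl(1-P(\mathrm{NoC}=3)\bigr)}$

Dividing numerator and denominator by P(D|NoC = 2, Hd) gives

$P(\mathrm{NoC}=3\mid D,H_d)=\frac{P(\mathrm{NoC}=3)\,LR_d}{P(\mathrm{NoC}=3)\,LR_d+1-P(\mathrm{NoC}=3)}=\frac{P(\mathrm{NoC}=3)\,LR_d}{P(\mathrm{NoC}=3)\,(LR_d-1)+1}.$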

### 3.1 The likelihood ratio LRd

LRd = P(D|NoC = 3, Hd)/P(D|NoC = 2, Hd) compares the probability of the trace data under three and under two contributors in the situation where the PoI is not part of the mixture. LRd will be large if, judging from the mixture alone, three contributors are much more likely than two. It is therefore not surprising that for large values of LRd the weighting factor P(NoC = 3|D, Hd) also becomes large, i.e. close to one, and the overall LR is close to LR(3). LRd is an LR for two different propositions, namely three versus two contributors, and can therefore be readily calculated from the data.
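The dependence of the weighting factor on LRd can be illustrated with a short Python sketch. The function name `posterior_three`, the fixed prior of 1/2 and the LRd values in the loop are arbitrary choices for illustration and are not taken from any case.

```python
def posterior_three(prior_three, lr_d):
    """Weighting factor P(NoC = 3 | D, Hd) from the prior P(NoC = 3) and LRd."""
    return prior_three * lr_d / (prior_three * (lr_d - 1.0) + 1.0)


# Illustrative LRd values with the prior held fixed at 1/2.
weights = {lr_d: posterior_three(0.5, lr_d) for lr_d in (0.1, 1.0, 10.0, 100.0, 1000.0)}
for lr_d, weight in weights.items():
    print(f"LRd = {lr_d:>7}: P(NoC = 3 | D, Hd) = {weight:.4f}")
```

For LRd = 1 the data do not discriminate between two and three contributors and the weight equals the prior. As LRd grows, the weight approaches one.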

### 3.2 The prior probability P(NoC=3)

Whereas the mixture profile D is essential for LRd, P(NoC=3) is the prior probability of having three contributors to the trace without using the genetic information. The larger the prior probability for three contributors, the nearer the overall LR will be to LR(3). P(NoC=3) can only be derived from the non-genetic evidence and case circumstances. As such, and in contrast to LRd, it cannot be calculated and can at best be loosely estimated. Depending on the information available for a specific case, it might be possible to make inferences about the order of magnitude of P(NoC=3), e.g. from eyewitness evidence or when it is known that only two persons had access to a particular item (such as victim and PoI). In other circumstances, no information about P(NoC=3) might be available. Then a prior of 1/2 could be one possibility. For a sensitivity analysis, a number of sensible priors should be taken into account.
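One way such a sensitivity analysis might look is sketched below in Python. The helper names `posterior_three` and `lr_range`, the grid of priors and the values LR(3) = 1000 and LRd = 5 are all hypothetical and serve only to show the mechanics.

```python
def posterior_three(prior_three, lr_d):
    """Weighting factor P(NoC = 3 | D, Hd) from the prior P(NoC = 3) and LRd."""
    return prior_three * lr_d / (prior_three * (lr_d - 1.0) + 1.0)


def lr_range(lr_three, lr_d, priors):
    """Overall LR of Eq. (2) for each prior, plus the smallest and largest value."""
    values = {p: lr_three * posterior_three(p, lr_d) for p in priors}
    return values, min(values.values()), max(values.values())


# Hypothetical inputs: LR(3) = 1000, LRd = 5, and a coarse grid of priors.
values, low, high = lr_range(1000.0, 5.0, (0.1, 0.25, 0.5, 0.75, 0.9))
for prior, lr in values.items():
    print(f"P(NoC = 3) = {prior:.2f}: overall LR = {lr:.0f}")
print(f"range over the grid: {low:.0f} to {high:.0f}")
```

Reporting the range over a set of defensible priors, rather than a single value, makes the influence of the prior visible.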

### 3.3 Returning to the Hat Example

Let us now return to the Hat Example of Biedermann, Taroni and Thompson and apply our previous considerations. For this mixture stain, we have LR(2) = 0 and LR(3) ≈ 10,000 [Biedermann et al., 2011]. The overall LR is therefore between 0 and 10,000. To assess the weighting factor P(NoC = 3|D, Hd), we have to consider LRd and P(NoC=3). As noted before, LRd can be calculated from the data and is 2.27 in this case [Biedermann et al., 2011]. As usual, the determination of the prior probability P(NoC=3) is much more difficult. Because the stain is on a hat, which may have been touched by many or by few people, no real prior information is available. Both two and three contributors are possible. The dependence of the overall LR on the prior P(NoC=3) is shown in Fig. 1. If we take P(NoC=3)=1/2, then an LR of around 7000 results. Because the prior distribution is uncertain, and to be conservative, it would make sense to arrive at an overall LR which is somewhat lower, e.g. around 5000.
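The numbers for the Hat Example can be checked with a few lines of Python. The values LR(3) ≈ 10,000 and LRd = 2.27 are those quoted above. The helper `prior_for_weight`, which inverts the weighting formula, is an addition made here for illustration and is not part of the cited analysis.

```python
LR_THREE = 10_000.0  # LR(3), approximate value quoted in the text
LR_D = 2.27          # LRd quoted in the text


def posterior_three(prior_three, lr_d):
    """Weighting factor P(NoC = 3 | D, Hd) from the prior P(NoC = 3) and LRd."""
    return prior_three * lr_d / (prior_three * (lr_d - 1.0) + 1.0)


def prior_for_weight(weight, lr_d):
    """Prior P(NoC = 3) that yields a given weighting factor (inverse of the above)."""
    return weight / (lr_d - weight * (lr_d - 1.0))


overall = LR_THREE * posterior_three(0.5, LR_D)
print(round(overall))        # 6942, i.e. around 7000

prior_5000 = prior_for_weight(5000.0 / LR_THREE, LR_D)
print(round(prior_5000, 2))  # 0.31
```

Under these inputs, an overall LR of around 5000 corresponds to a prior P(NoC=3) of roughly 0.31, which gives a sense of how much the more conservative figure departs from the prior of 1/2.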

## 4. Discussion

This manuscript deals with a special case in the NoC discussion in order to make the results clear and accessible to non-statisticians as well. This is also why we apply the results to the instructive Hat Example of Biedermann, Taroni and Thompson [Biedermann et al., 2011]. Of course, other NoCs are possible in general. The Hat Example uses a simple discrete model without drop-out or drop-in. If fully continuous models are considered (so-called ‘probabilistic genotyping’), the situation changes somewhat because, if too many contributors are chosen, the contribution of the superfluous ones might be modelled as nearly zero. Another important simplification in this manuscript is that only one prior distribution for the NoC is applied, the same for Hp and Hd. Although this is a reasonable approximation in many cases, there are also situations where it is not appropriate. For more information on these subjects see [Biedermann et al., 2011; Slooten and Caliebe, 2018].

## 5. Conclusion

Because drop-out, drop-in and minor contributions are possible in DNA mixtures, the NoC can never be determined without uncertainty. Therefore, the LR has to be calculated for several NoCs. The probabilities of the data for different NoCs under Hd and the prior distributions of the NoC then influence the overall LR.

## Conflict of interest statement

The author of this manuscript declares no conflict of interest.

## Acknowledgements

The author would like to thank Klaas Slooten for many happy and fruitful discussions.

## References

• Biedermann A., Taroni F., Thompson W.C. Using graphical probability analysis (Bayes Nets) to evaluate a conditional DNA inclusion. Law Probab. Risk. 2011; 10: 89-121
• Brenner C.H. Fairness in evaluating DNA mixtures. Forensic Sci. Int. Genet. 2017; 27: 186
• Evett I.W. Another response to “About the number of contributors to a forensic sample”. Forensic Sci. Int. Genet. 2017; 28: e11
• Gill P. A response to “About the number of contributors to a forensic sample”. Forensic Sci. Int. Genet. 2017; 26: e9-e13
• Presciuttini S., Egeland T. About the number of contributors to a forensic sample. Forensic Sci. Int. Genet. 2016; 25: e18-e19
• Slooten K., Caliebe A. Contributors are a nuisance (parameter) for DNA mixture evidence evaluation. Forensic Sci. Int. Genet. 2018; 37: 116-125