Genetic peopling of Pakistan: Influence of consanguinity on population structure and forensic evaluation of traces

Published: September 26, 2019. DOI: https://doi.org/10.1016/j.fsigss.2019.09.089

      Abstract

      Pakistan is one of the most consanguineous countries in the world [
      • Hamamy H.
      Consanguineous marriages.
      ], where cousin marriages account for more than half of all unions. To investigate the genetic evidence of consanguinity in Pakistani populations, 1020 samples were collected from volunteers belonging to four different populations. The magnitude of the effect of population structure and consanguinity on the calculation of likelihood ratios was also explored.

      1. Introduction

      Pakistan is located at an important crossroads of human history. It has nurtured many civilizations and has seen several mass migrations. Historical records have shown that humans from Africa traveled to India, China and as far as Australia, passing through present-day Pakistan [
      • Bae C.J.
      • Douka K.
      • Petraglia M.D.
      On the origin of modern humans: Asian perspectives.
      ]. It therefore retains a unique mixture of ancient and modern civilizations. Pakistan is the sixth largest country by population, and its people speak around 74 different languages [

      Pakistan, (Date Accessed 16 September 2019). https://www.ethnologue.com/country/PK.

      ]. The language barrier and other factors lead people to marry within their ethnicity. In Pakistan, 56% of marriages are between first or second cousins [
      • Hamamy H.
      Consanguineous marriages.
      ,

      Pakistan demographic and Health Survey, (Date Accessed 16 September 2019). https://dhsprogram.com/pubs/pdf/FR290/FR290.pdf.

      ], which makes it interesting to study its genetic composition and population structure.
      There have been some genetic studies on Pakistani populations, but with constraints such as the limited number of samples and loci analyzed and the limited range of populations considered. An earlier study reported the genetic distance between three co-resident Pakistani populations to be 13% and suggested using this value as a correction factor when calculating match probabilities [
      • Zhivotovsky L.A.
      • Ahmed S.
      • Wang W.
      • Bittles A.H.
      The forensic DNA implications of genetic differentiation between endogamous communities.
      ]. Another study soon showed, however, that these results contradicted published guidelines. In the meantime, the first study was quoted in many court cases in which the defense suggested using an FST of 0.13 as a correction factor. In one such case, the paternity index fell substantially, from 1300 to just 350 [
      • Curran J.M.
      • Buckleton J.
      The appropriate use of subpopulation corrections for differences in endogamous communities.
      ].
      There was therefore a need to study the genetic composition and structure of Pakistani populations thoroughly, with robust sampling of an extended number of populations that together cover most of the country's population. Such a study allows a more reliable and representative database of allele frequencies to be established for use in the forensic evaluation of biological evidence. Genetic and forensic parameters of interest, such as heterozygosity, the coefficient of co-ancestry (FST) and the coefficient of inbreeding (FIS), can be used to explain the evolutionary development of these populations and in the evidential assessment of DNA results in a forensic context.
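      Furthermore, the ENFSI guidelines [ENFSI, (Date Accessed 16 September 2019). http://enfsi.eu/wp-content/uploads/2016/09/m1_guideline.pdf.] and the NRC recommendations [NRC II, (Date Accessed 16 September 2019). https://www.ncbi.nlm.nih.gov/books/NBK232610/pdf/Bookshelf_NBK232610.pdf.] suggest the use of the likelihood ratio and the use of parameters, respectively, to assess the probative value of forensic evidence. When the genetic profiles obtained from the evidence material and the reference samples correspond, the assessment of the value of that correspondence should be supported by underlying data describing the relevant population genetically.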

      2. Materials and methods

      To study the structure of Pakistani populations and the impact of migrations and consanguinity, 1020 healthy individuals belonging to the Punjabi, Saraiki, Pakhtun and Sindhi populations were typed at 15 autosomal STR loci (AmpFlSTR Identifiler® kit) after giving written consent to participate in this study. Genetic analyses were performed using the FORSTAT [
      • Ristow P.G.
      • D’Amato M.E.
      Forensic statistics analysis toolbox (FORSTAT): a streamlined workflow for forensic statistics.
      ] program and the hierfstat [
      • Goudet J.
      Hierfstat, a package for R to compute and test hierarchical F-statistics.
      ] package for the R language.

      3. Results

      Allele frequencies were calculated and population databases were established for each sub-population as well as for the whole country. To check the suitability of these STR markers for human identification, heterozygosity was calculated at each locus for each sub-population. The mean observed heterozygosity (HO) ranged from 0.7267 in the Sindhi population to 0.7805 in the Pakhtun population, while the mean gene diversity (HS) ranged from 0.7978 in the Sindhi to 0.8038 in the Saraiki population. The population-specific coefficient of co-ancestry (FST) was -0.003 in the Pakhtun population, 0.002 in both the Punjabi and Saraiki populations, and 0.005 in the Sindhi population. These FST values contrast with the one published earlier [
      • Zhivotovsky L.A.
      • Ahmed S.
      • Wang W.
      • Bittles A.H.
      The forensic DNA implications of genetic differentiation between endogamous communities.
      ]. The pairwise genetic distance based on FST was lowest (0.0007) between the Pakhtun and Sindhi populations and highest (0.0024) between the Punjabi and Sindhi populations.
      To explore the extent of consanguinity in Pakistani populations and to look for genetic evidence of it, the coefficient of inbreeding (FIS) was calculated. Mean values of FIS ranged from 0.029 to 0.091, which is quite high for human populations.
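      As an illustration of how these quantities relate, the sketch below computes allele frequencies, observed heterozygosity, expected heterozygosity and FIS for a single locus, taking FIS as the relative deficit of heterozygotes (1 - HO/HS). The genotypes are hypothetical, the function names are introduced here for illustration only, and no sample-size correction is applied, so the numbers need not match those produced by the software used in this study. Plugging the reported Sindhi means (HO = 0.7267, HS = 0.7978) into the same ratio gives roughly 0.089, close to the upper end of the reported FIS range, although a ratio of means is only an approximation of a mean of per-locus values.

```python
from collections import Counter


def locus_summary(genotypes):
    """Summarise one STR locus in one population.

    genotypes: list of (allele, allele) tuples, one per individual.
    Returns (allele_freqs, h_obs, h_exp).
    """
    n = len(genotypes)
    # Each individual contributes two allele copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = {allele: c / (2 * n) for allele, c in counts.items()}
    # Observed heterozygosity: share of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity (gene diversity) under random mating,
    # without any sample-size correction (an assumption of this sketch).
    h_exp = 1.0 - sum(p * p for p in freqs.values())
    return freqs, h_obs, h_exp


def f_is(h_obs, h_exp):
    """Inbreeding coefficient as the relative deficit of heterozygotes."""
    return 1.0 - h_obs / h_exp


# Hypothetical genotypes at a single locus (repeat numbers are made up).
toy_genotypes = [
    (12, 12), (12, 13), (12, 13), (13, 13), (12, 14),
    (13, 14), (14, 14), (12, 12), (13, 14), (12, 13),
]
toy_freqs, toy_h_obs, toy_h_exp = locus_summary(toy_genotypes)
toy_f_is = f_is(toy_h_obs, toy_h_exp)

# Mean values reported in the text for the Sindhi population.
sindhi_f_is = f_is(0.7267, 0.7978)

print(toy_freqs, toy_h_obs, toy_h_exp, toy_f_is)
print(sindhi_f_is)
```

      A positive value indicates fewer heterozygotes than expected under random mating, which is the signal that consanguinity is expected to leave.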
      These genetic parameters can influence the assessment of a forensic case. To quantify the impact of the coefficient of co-ancestry (FST) and the coefficient of inbreeding (FIS) on the likelihood ratio for a match, we used the formulas suggested by Balding and Nichols [
      • Balding D.J.
      • Nichols R.A.
      DNA profile match probability calculation: how to allow for population stratification, relatedness, database selection and single bands.
      ] and Ayres & Overall [
      • Ayres K.L.
      • Overall A.D.J.
      Allowing for within-subpopulation inbreeding in forensic match probabilities.
      ]. It can be shown that integrating the FST and FIS values into the likelihood ratio calculation strongly reduces the value of the evidence. This underlines the need for genetic parameters such as FST and FIS when assessing DNA profiles in populations where a high degree of consanguinity is present along with population substructure.
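      The direction of this effect can be sketched with the co-ancestry correction as it is commonly stated for the Balding and Nichols match probabilities, where theta stands for FST. The example below is an illustration under stated assumptions, not the calculation performed in this study: the allele frequencies and the three-locus profile are hypothetical, the likelihood ratio is taken as the reciprocal of the match probability for a single-source matching profile, loci are multiplied as if independent, and the additional FIS adjustment of Ayres & Overall is not reproduced. The theta values are 0 (no correction), 0.005 (the highest population-specific FST reported above) and 0.13 (the value from the earlier study).

```python
def match_prob_homozygote(p, theta):
    """Conditional match probability for genotype AA with allele frequency p."""
    numerator = (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p)
    return numerator / ((1 + theta) * (1 + 2 * theta))


def match_prob_heterozygote(p_a, p_b, theta):
    """Conditional match probability for genotype AB."""
    numerator = 2 * (theta + (1 - theta) * p_a) * (theta + (1 - theta) * p_b)
    return numerator / ((1 + theta) * (1 + 2 * theta))


def profile_likelihood_ratio(profile, theta):
    """Multiply 1 / match probability over loci (independence assumed).

    profile: list of (p_a, p_b) allele-frequency pairs, one per locus;
    p_b is None for a homozygous locus.
    """
    lr = 1.0
    for p_a, p_b in profile:
        if p_b is None:
            lr /= match_prob_homozygote(p_a, theta)
        else:
            lr /= match_prob_heterozygote(p_a, p_b, theta)
    return lr


# Hypothetical allele frequencies for a three-locus profile.
hypothetical_profile = [(0.10, 0.20), (0.15, None), (0.05, 0.25)]

lr_by_theta = {
    theta: profile_likelihood_ratio(hypothetical_profile, theta)
    for theta in (0.0, 0.005, 0.13)
}

for theta, lr in lr_by_theta.items():
    print(f"theta = {theta}: LR = {lr:.1f}")
```

      With theta set to 0 the two functions reduce to the Hardy-Weinberg products p² and 2·pA·pB, and any positive theta raises the match probability and therefore lowers the likelihood ratio.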

      4. Discussion

      Pakistani populations have lived together for ages. New alleles have been introduced into these populations through migrations, catastrophes, wars and the search for better socioeconomic opportunities. Yet these populations remained admixed with each other while maintaining their unique genetic and geographical structure. The high prevalence of first- and second-cousin marriages may have contributed to this pattern by keeping each gene pool intact within social boundaries. Ease of communication, unique cultural heritage and religious priorities have also led these populations to marry within families. These intra-family marriages have produced greater similarities between the individuals of these populations. These similarities should be quantified and taken into account in every evidence evaluation involving DNA profiles.

      5. Conclusion

      Pakistan is a country with an important demographic history. Pakistani populations are highly consanguineous and have retained their gene pool by keeping the tradition of endogamous marriages intact, as is evident from the high FIS values. This argues for the use of FIS values along with FST when calculating conditional match probabilities, which avoids overstating the value of the evidence.

      Declaration of Competing Interest

      The authors declare that they have no conflict of interest.

      Acknowledgements

      This research was supported by the Swiss Excellence Government Scholarship for PhD studies of foreign students in Switzerland. The authors would like to thank Shahid Hussain, Attaur Rehman, Manzoor Hussain, Jacques Linden, Lorenzo Gaborini and Tacha Hicks for their technical help in carrying out this project.

      References

        • Hamamy H.
        Consanguineous marriages.
        J. Community Genet. 2011; 3: 185-192. https://doi.org/10.1007/s12687-011-0072-y
        • Bae C.J.
        • Douka K.
        • Petraglia M.D.
        On the origin of modern humans: Asian perspectives.
        Science. 2017; 358: eaai9067. https://doi.org/10.1126/science.aai9067
        • Pakistan, (Date Accessed 16 September 2019). https://www.ethnologue.com/country/PK.
        • Pakistan Demographic and Health Survey, (Date Accessed 16 September 2019). https://dhsprogram.com/pubs/pdf/FR290/FR290.pdf.
        • Zhivotovsky L.A.
        • Ahmed S.
        • Wang W.
        • Bittles A.H.
        The forensic DNA implications of genetic differentiation between endogamous communities.
        Forensic Sci. Int. 2001; 119: 269-272. https://doi.org/10.1016/s0379-0738(00)00442-4
        • Curran J.M.
        • Buckleton J.
        The appropriate use of subpopulation corrections for differences in endogamous communities.
        Forensic Sci. Int. 2007; 168: 106-111. https://doi.org/10.1016/j.forsciint.2006.06.073
        • ENFSI, (Date Accessed 16 September 2019). http://enfsi.eu/wp-content/uploads/2016/09/m1_guideline.pdf.
        • NRC II, (Date Accessed 16 September 2019). https://www.ncbi.nlm.nih.gov/books/NBK232610/pdf/Bookshelf_NBK232610.pdf.
        • Ristow P.G.
        • D’Amato M.E.
        Forensic statistics analysis toolbox (FORSTAT): a streamlined workflow for forensic statistics.
        Forensic Sci. Int. Genet. Suppl. Ser. 2017; 6: e52-e54. https://doi.org/10.1016/j.fsigss.2017.09.006
        • Goudet J.
        Hierfstat, a package for R to compute and test hierarchical F-statistics.
        Mol. Ecol. Notes. 2005; 5: 184-186. https://doi.org/10.1111/j.1471-8286.2004.00828.x
        • Balding D.J.
        • Nichols R.A.
        DNA profile match probability calculation: how to allow for population stratification, relatedness, database selection and single bands.
        Forensic Sci. Int. 1994; 64: 125-140. https://doi.org/10.1016/0379-0738(94)90222-4
        • Ayres K.L.
        • Overall A.D.J.
        Allowing for within-subpopulation inbreeding in forensic match probabilities.
        Forensic Sci. Int. 1999; 103: 207-216. https://doi.org/10.1016/s0379-0738(99)00087-0