
GENETIC PORTRAIT OF THE PUNJABI POPULATION FROM PAKISTAN USING THE PRECISION ID ANCESTRY PANEL

  • Muhammad Adnan Shan
    Correspondence
    Corresponding author at: Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.
    Affiliations
    Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark

    Centre for Applied Molecular Biology, University of the Punjab, Lahore, Pakistan
  • Mie Refn
    Affiliations
    Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  • Niels Morling
    Affiliations
    Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  • Claus Børsting
    Affiliations
    Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
  • Vania Pereira
    Affiliations
    Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
Published: October 17, 2019
DOI: https://doi.org/10.1016/j.fsigss.2019.09.034

      Abstract

      Prediction of geographical ancestry using genetic markers has great potential in forensic genetics and may be used as an investigative lead in crime casework or missing person identification. Exploration of ancestry informative markers (AIMs) in Pakistan is interesting due to the distinct subpopulations with multidirectional ancestry from different groups. In the current study, 87 individuals from the Punjabi population of Pakistan were investigated using the Precision ID Ancestry Panel (Thermo Fisher Scientific) to assess whether it was possible to differentiate Punjabi individuals from other populations. The panel showed that Punjabis are admixed and cannot be distinguished from other populations in South Central Asia and the Middle East.


      1. Introduction

      Genetic markers that present marked allele frequency differences among populations can be used as ancestry informative markers (AIMs). Prediction of geographical ancestry using AIMs has great potential in forensic genetics and may be used as an investigative lead in crime casework or missing person identification [Halder et al., 2008; Pereira et al., 2012; Phillips, 2015]. Exploration of AIMs in Pakistan is interesting due to the distinct subpopulations with multidirectional ancestry from neighbouring states and native groups [Petraglia and Allchin, 2007; Wright, 2009]. Punjab is the second largest province by area and the most populous of the four provinces of Pakistan. It has a population of more than 90 million, corresponding to 46 % of the Pakistani population [2017 Census]. Punjab is located in the north-western part of the Indian plate at the Indus River system. The Punjabi population is the largest ethnic group in Pakistan. It is a heterogeneous population group with various tribes, clans, and communities. The native language of the province is Punjabi. Various ethnic groups settled in this region and formed the Indus Valley Civilization in the Bronze Age, 3,300 to 1,300 BCE [Petraglia and Allchin, 2007; Wright, 2009].
      In this work, the Precision ID Ancestry Panel (Thermo Fisher Scientific) [Pereira et al., 2017; Themudo et al., 2016] was used to genotype Punjabi individuals and estimate their ancestry. The panel includes 165 autosomal AIMs for genogeographic prediction. The marker set combines 55 SNPs from the Kidd AISNP panel [Kidd et al., 2014] and 123 AISNPs from the Seldin panel [Kosoy et al., 2009; Nassir et al., 2009], with 13 SNPs overlapping (55 + 123 - 13 = 165) [Thermo Fisher Scientific, 2015]. The most likely population of origin was investigated with GenoGeographer [Tvedebrink et al., 2018; Mogensen et al., 2019], a tool that calculates the likelihood of a profile in each reference population included in the database (Sub-Saharan Africa, Somalia, North Africa, Europe, Middle East, South Central Asia, and Greenland).

      1.1 Materials and methods

      A total of 87 unrelated Punjabi individuals were typed for 165 ancestry informative markers using the Precision ID Ancestry Panel (Thermo Fisher Scientific). The Arlequin v.3.5.2.2 software [Excoffier and Lischer, 2010] was used to test for deviations from Hardy-Weinberg equilibrium (HWE). Data were compared to those of populations typed for the same markers (data kindly provided by the Kidd Lab, assembled from publicly available data). The SNP rs10954737 was excluded from the inter-population comparisons because data were lacking for the reference populations. Principal component analysis (PCA) was performed using an in-house Python script. The software GenoGeographer was used to calculate z-scores, likelihoods, and likelihood ratios to infer the most likely population of origin of the Punjabi individuals.
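
      The in-house PCA script is not part of this paper. As a rough illustration only, the sketch below shows one common way to run a PCA on SNP genotypes coded as allele counts (0, 1, or 2). The function name `pca_genotypes`, the 0/1/2 coding, and the simulated toy data are assumptions made for this example and are not taken from the study.

```python
import numpy as np

def pca_genotypes(genotypes, n_components=2):
    """Project individuals onto the top principal components.

    genotypes: (n_individuals, n_snps) array of allele counts coded 0, 1, or 2
    (hypothetical coding; the study's own script may differ).
    Returns (scores, explained_variance_ratio).
    """
    X = np.asarray(genotypes, dtype=float)
    # Centre each SNP column so components describe variation around the mean.
    X = X - X.mean(axis=0)
    # Drop monomorphic SNPs (zero variance); they carry no information.
    X = X[:, X.std(axis=0) > 0]
    # Thin SVD: U * S gives the coordinates of each individual on each component.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Simulated toy data (not study data): two groups with different allele frequencies.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.05, 0.95, size=50)
freq_b = np.clip(freq_a + rng.normal(0, 0.3, size=50), 0.05, 0.95)
group_a = rng.binomial(2, freq_a, size=(20, 50))
group_b = rng.binomial(2, freq_b, size=(20, 50))
scores, explained = pca_genotypes(np.vstack([group_a, group_b]))
```

      Each row of `scores` is one individual; plotting the first column against the second gives a plot of the kind described for Fig. 1. Missing genotypes are not handled in this sketch.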

      2. Results and Discussion

      All 165 AIMs studied were in HWE after Bonferroni correction for multiple testing, except for one locus (rs310644; p-value_0.0001). Punjabis clustered with individuals from South Central Asia and the Middle East, as visualised in the PCA plot (Fig. 1). The GenoGeographer tool was used to infer ancestry and calculate the weight of the evidence (examples in Table 1). Likelihood ratios, LR = P(DNA | H1)/P(DNA | H2), were calculated for 74 of the 87 tested individuals (z-score ≤ 1.64). For the remaining 13 Punjabis (14.9%), no appropriate reference population was found in the database (z-score > 1.64), so ancestry inference could not be performed. Of the 74 individuals with z-score ≤ 1.64, the most likely population of origin was South Central Asia (n = 71) or the Middle East (n = 3).
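
      To make the multiple-testing step concrete, the sketch below tests one biallelic SNP for deviation from HWE and compares the p-value with a Bonferroni-corrected threshold for 165 loci. It is a simplified stand-in, not the study's procedure: the HWE test in the study was run in Arlequin, whereas this example substitutes a plain chi-square goodness-of-fit test. The significance level of 0.05 and the genotype counts are assumptions chosen for illustration.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 d.f.) for HWE at a biallelic SNP.

    Returns (chi2, p_value). This is a substitute for the test run in
    Arlequin, used here only to keep the example short.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

# alpha = 0.05 is an assumed level; 165 is the number of AIMs in the panel.
threshold = bonferroni_threshold(0.05, 165)
# Hypothetical genotype counts for 87 individuals (not study data).
chi2, p_value = hwe_chi_square(30, 27, 30)
deviates = p_value < threshold
```

      With these hypothetical counts, the locus is nominally significant at 0.05 but does not pass the corrected threshold of roughly 0.0003, which is the situation the Bonferroni correction is meant to handle.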
      Fig. 1. PCA plot of the Punjabi individuals (shown in brown) and selected reference populations. Each coloured symbol represents an individual according to population membership. Abbreviations: SC Asia, South-Central Asia; SS Africa, Sub-Saharan Africa.
      Table 1. Likelihood ratios for 16 randomly selected individuals typed with the Precision ID Ancestry Panel.
      Population abbreviations: SC Asia, South-Central Asia; ME, Middle East; N Africa, North Africa; E. Asia, East Asia.
      Columns with LRs were coloured according to size: bright red, low LR; dark red, high LR.
      *Based on the databases available in GenoGeographer [5;6].
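
      The likelihood ratio LR = P(DNA | H1)/P(DNA | H2) compares how probable a profile is in two candidate populations. The sketch below is a minimal illustration under simplifying assumptions: loci are treated as independent, genotype probabilities follow HWE, and allele frequencies are taken as known without error. It is not the GenoGeographer implementation, which also computes z-scores and accounts for uncertainty in the reference data. All function names and frequencies are hypothetical.

```python
import math

def genotype_probability(genotype, freq):
    """HWE probability of a genotype coded as the count (0, 1, 2) of one allele."""
    if genotype == 2:
        return freq * freq
    if genotype == 1:
        return 2 * freq * (1 - freq)
    return (1 - freq) ** 2

def log10_likelihood(profile, freqs):
    """log10 P(DNA | population), assuming independent loci.

    Frequencies must lie strictly between 0 and 1 to avoid log(0).
    """
    return sum(math.log10(genotype_probability(g, f))
               for g, f in zip(profile, freqs))

def log10_lr(profile, freqs_h1, freqs_h2):
    """log10 of LR = P(DNA | H1) / P(DNA | H2)."""
    return log10_likelihood(profile, freqs_h1) - log10_likelihood(profile, freqs_h2)

# Hypothetical three-SNP profile and allele frequencies (not study data).
profile = [2, 1, 0]
freqs_h1 = [0.8, 0.5, 0.1]   # population under H1
freqs_h2 = [0.2, 0.5, 0.6]   # population under H2
lr = 10 ** log10_lr(profile, freqs_h1, freqs_h2)
```

      Here the profile probability is 0.2592 under H1 and 0.0032 under H2, so the LR is 81 in favour of H1. Working in log10 avoids numerical underflow when the product runs over all 165 markers.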

      3. Conclusions

      The present study is an attempt to evaluate the genetic composition of the Punjabi population. PCA indicated that the Punjabi population is admixed, with genetic ancestry components similar to those of other South Central Asian populations and Middle Eastern populations. The same was observed with GenoGeographer: the LRs in Table 1 showed that Punjabis are more closely related to South Central Asian and Middle Eastern populations than to East Asian populations. Distinguishing Punjabis from these populations will most likely require a larger set of SNPs and/or a second-tier panel specifically designed for these populations.

      References

        • Halder I., Shriver M., Thomas M., et al. A panel of ancestry informative markers for estimating individual biogeographical ancestry and admixture from four continents: utility and applications. Hum. Mutat. 2008; 29: 648-658
        • Pereira R., Phillips C., Pinto N., et al. Straightforward inference of ancestry and admixture proportions through ancestry-informative insertion deletion multiplexing. PLoS One. 2012; 7: e29684
        • Phillips C. Forensic genetic analysis of bio-geographical ancestry. Forensic Sci. Int. Genet. 2015; 18: 49-65
        • Petraglia M.D., Allchin B. The Evolution and History of Human Populations in South Asia: Inter-disciplinary Studies in Archaeology, Biological Anthropology, Linguistics and Genetics. Springer Science & Business Media, 2007: 6 (ISBN 978-1-4020-5562-1)
        • Wright R.P. The Ancient Indus: Urbanism, Economy, and Society. Cambridge University Press, 2009: 44-51 (ISBN 978-0-521-57652-9)
        • 2017 Census. Archived 15 October 2017 at the Wayback Machine.
        • Pereira V., Mogensen H.S., Børsting C., et al. Evaluation of the Precision ID Ancestry Panel for crime casework: a SNP typing assay developed for typing of 165 ancestral informative markers. Forensic Sci. Int. Genet. 2017; 28: 138-145
        • Themudo G.E., Mogensen H.S., Børsting C., et al. Frequencies of HID-ion ampliseq ancestry panel markers among Greenlanders. Forensic Sci. Int. Genet. 2016; 24: 60-64
        • Kidd K.K., Speed W.C., Pakstis A.J., et al. Progress toward an efficient panel of SNPs for ancestry inference. Forensic Sci. Int. Genet. 2014; 10: 23-32
        • Kosoy R., Nassir R., Tian C., et al. Ancestry informative marker sets for determining continental origin and admixture proportions in common populations in America. Hum. Mutat. 2009; 30: 69-78
        • Nassir R., Kosoy R., Tian C., et al. An ancestry informative marker set for determining continental origin: validation and extension using human genome diversity panels. BMC Genet. 2009; 10: 39
        • Thermo Fisher Scientific. Ion AmpliSeq™ library preparation for human identification applications. Thermo Fisher Scientific Inc., Carlsbad, 2015
        • Tvedebrink T., Eriksen P.S., Mogensen H.S., et al. Weight of the evidence of genetic investigations of ancestry informative markers. Theor. Popul. Biol. 2018; 120: 1-10
        • Mogensen H.S., Tvedebrink T., Børsting C., et al. Ancestry prediction efficiency of the software GenoGeographer using a z-score method and the ancestry informative markers in the Precision ID Ancestry Panel. Forensic Sci. Int. Genet. 2019; (in press)
        • Excoffier L., Lischer H.E.L. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol. Ecol. Resour. 2010; 10: 564-567