Sample collection strategies when building mitochondrial DNA forensic databases

Published: September 30, 2022. DOI: https://doi.org/10.1016/j.fsigss.2022.09.033

      Abstract

      For databases intended to capture the existing diversity of a population, the sample collection strategy is a determining factor, and care must be taken when choosing a suitable approach. Many researchers restrict sampling to individuals with three generations of ancestry in a specific geographic location. However, the appropriate database in a forensic context is the one that represents the current population. We analyzed mtDNA composition across generations in populations from Colombia, Ecuador, and Paraguay. Overall genetic homogeneity was detected, with statistically significant differences in macrohaplogroup frequencies for a few departments/regions.

      1. Introduction

      In a forensic context, the statistical weight of a match between two identical mtDNA profiles depends on the frequency of the haplotype in a database. The high diversity of mtDNA haplotypes reported worldwide highlights the importance of large, representative databases that capture the existing variability and allow population substructure to be evaluated [1]. In these circumstances, the sample collection strategy is a crucial factor. Many authors restrict sample collection to individuals with residence and proven ancestry for at least three generations in a specific geographic location. Such a sample does not necessarily represent the relevant reference population in most forensic scenarios, particularly in populations subject to recent migration.
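      To illustrate how database size feeds into the weight of a match, the following minimal sketch computes a plain counting frequency and, for a haplotype never observed in the database, a one-sided upper bound obtained by solving (1 - p)^N = alpha. The function names, the 95% level and the database sizes are assumptions made for this example and are not taken from the study.

```python
def counting_frequency(observed: int, database_size: int) -> float:
    """Plain counting estimate: observations divided by database size."""
    if database_size <= 0:
        raise ValueError("database_size must be positive")
    if not 0 <= observed <= database_size:
        raise ValueError("observed must lie between 0 and database_size")
    return observed / database_size


def upper_bound_unobserved(database_size: int, alpha: float = 0.05) -> float:
    """One-sided (1 - alpha) upper bound on the frequency of a haplotype
    that was not seen in a database of the given size.

    Solves (1 - p) ** N = alpha for p, the exact binomial bound for x = 0.
    """
    if database_size <= 0:
        raise ValueError("database_size must be positive")
    return 1.0 - alpha ** (1.0 / database_size)


if __name__ == "__main__":
    # Hypothetical database sizes, for illustration only.
    for n in (50, 200, 1000):
        print(n, round(upper_bound_unobserved(n), 4))
```

      The bound shrinks as the database grows, which is one reason large databases matter; whether the database also represents the relevant population is a separate question that the bound does not address.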
      The current genetic diversity in South American populations is mainly attributed to admixture events during the colonial period. More recent immigration from Europe and Asia is also influencing the genetic composition of these populations, as is the continuous movement of individuals between and within countries. Admixture between individuals from different continental backgrounds has proceeded differently across South America, resulting in admixture patterns that vary throughout the subcontinent [e.g., 2,3]. Considering this, the aim of this work was to evaluate to what extent sampling strategies affect how well the existing diversity in South America is captured. Accordingly, the maternal genetic background of admixed populations was studied to evaluate whether the genetic composition of populations differs across close generations.

      2. Materials and methods

      Detailed information on the samples used is given in Table 1.
      Table 1. Origin and number of the samples used in the present study.
      Countries: Colombia (a), Paraguay (b), Ecuador (a)
      Region/department columns, in order: Colombia: Antioquia, Boyacá, Caldas, Cundinamarca, Huíla, Norte Santander, Risaralda, Santander, Tolima; Paraguay: Alto Paraná, Caaguazú, Caazapá, Capital, Central, Concepción, Cordillera, Guaíra, Itapúa, Misiones, Paraguarí; Ecuador: Central, North, South
      Living placeØØØØØØØØØ2152030610311176137ØØØ
      Birthplace42493756474940203521122315913172210567939687042
      Mother birthplace333828404124321414927303052331642113642072222820
      Grandmother birthplace3043332739222313853ØØØØØØØØØØØ311728
      All generations*263318213718161263625101314028048127161615
      Legend: Ø, information not available; *, individuals with ancestry in that geographic location for three generations; (a), unpublished; (b), Simão et al. [4].
      Note 1: The total number of samples for each country varies across generations because information was not available for all individuals.
      Note 2: Only datasets with more than 15 samples were used in the analyses.
      MtDNA haplotypes from Paraguay were retrieved from Simão et al. [4], and haplotypes from Ecuador and Colombia were obtained with the same methodologies as in Simão et al. [4]. Values of haplotype diversity (H) and analysis of molecular variance (AMOVA) were obtained with Arlequin [5]. The exclusion power (mtCE) was calculated according to Simão et al. [6]. Analyses were performed after discarding indels at homopolymeric tracts.
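      As a rough illustration of the haplotype diversity statistic, the sketch below computes Nei's unbiased estimator, H = n/(n-1) * (1 - sum of squared haplotype frequencies), from a list of haplotype labels. It is a minimal sketch and not Arlequin's implementation; the function name and the haplotype labels are hypothetical, and the exclusion power (mtCE) of Simão et al. is not reproduced here.

```python
from collections import Counter
from typing import Hashable, Iterable


def haplotype_diversity(haplotypes: Iterable[Hashable]) -> float:
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum(p_i^2))."""
    counts = Counter(haplotypes)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two samples are needed")
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Hypothetical haplotype labels; real input would be full mtDNA haplotypes.
    sample = ["h1", "h1", "h2", "h3", "h4", "h4", "h5", "h6"]
    print(round(haplotype_diversity(sample), 4))
```

      H reaches 1 when every sampled haplotype is unique and drops towards 0 as haplotypes are shared, so small subsets with many unique haplotypes can show H = 1 regardless of how diverse the underlying population is.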

      3. Results

      The values of H and mtCE obtained for each department/region did not change significantly across the generation subsets established (Table 2). Nonetheless, in some departments/regions a loss of diversity can be detected for individuals with three generations born in the same location when compared with the values obtained from the individuals' birthplace alone (Table 2).
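      Table 2. Values of haplotype diversity (H), exclusion power (mtCE) and AMOVA obtained for three generations in Paraguay, Colombia, and Ecuador.
      Columns for H and for mtCE: Living place, Birthplace, Mother birthplace, Grandmother birthplace, All generations, highest minus lowest. Columns for AMOVA: Among populations, Within populations, FST, P-value. Each row lists the H and mtCE values in column order for the generation subsets that have a value.
      Colombia
      Antioquia: H 0.9849, 0.9811, 0.9747, 0.9846 (highest minus lowest 0.010); mtCE 0.9512, 0.9545, 0.9333, 0.9600 (highest minus lowest 0.018); AMOVA: among populations -2.69, within populations 102.69, FST -0.02691, P-value 1.00000 ± 0.00000
      Boyacá: H 0.9651, 0.9744, 0.9767, 0.9754 (highest minus lowest 0.012); mtCE 0.9048, 0.9317, 0.9358, 0.9318 (highest minus lowest 0.031); AMOVA: among populations -1.82, within populations 101.82, FST -0.01816, P-value 1.00000 ± 0.00000
      Caldas: H 0.9685, 0.9735, 0.9394, 0.9542 (highest minus lowest 0.034); mtCE 0.9174, 0.9048, 0.8769, 0.8889 (highest minus lowest 0.041); AMOVA: among populations -1.36, within populations 101.36, FST -0.01357, P-value 0.86911 ± 0.00320
      Cundinamarca: H 0.9864, 0.9859, 0.9858, 0.9857 (highest minus lowest 0.001); mtCE 0.9316, 0.9436, 0.9316, 0.9286 (highest minus lowest 0.012); AMOVA: among populations -1.78, within populations 101.78, FST -0.01776, P-value 0.99980 ± 0.00014
      Huíla: H 0.9833, 0.9805, 0.9798, 0.9775 (highest minus lowest 0.003); mtCE 0.9636, 0.9634, 0.9636, 0.9595 (highest minus lowest 0.000); AMOVA: among populations -2.22, within populations 102.22, FST -0.02223, P-value 1.00000 ± 0.00000
      Norte Santander: H 0.9872, 0.9891, 0.9870, 0.9935 (highest minus lowest 0.000); mtCE 0.9307, 0.9457, 0.9307, 0.9739 (highest minus lowest 0.015); AMOVA: among populations -2.25, within populations 102.25, FST -0.0225, P-value 0.99624 ± 0.00059
      Risaralda: H 0.9359, 0.9435, 0.9407, 0.9333 (highest minus lowest 0.005); mtCE 0.8090, 0.8306, 0.8142, 0.8167 (highest minus lowest 0.022); AMOVA: among populations -2.42, within populations 102.42, FST -0.02425, P-value 0.99901 ± 0.00030
      Santander: H 0.9773, 0.9786, 0.9780, 0.9777 (highest minus lowest 0.001); mtCE 0.9258, 0.9338, 0.9364, 0.9389 (highest minus lowest 0.011); AMOVA: among populations -0.49, within populations 100.49, FST -0.00493, P-value 1.00000 ± 0.00000
      Tolima: H 0.9842, 0.9847, 0.9877, 0.9857 (highest minus lowest 0.004); mtCE 0.9637, 0.9541, 0.9637, 0.9635 (highest minus lowest 0.010); AMOVA: among populations -1.18, within populations 101.18, FST -0.01176, P-value 0.99158 ± 0.00094
      Paraguay
      Alto Paraná: H 0.9927, 0.9907, 0.9858, 0.9867 (highest minus lowest 0.007); mtCE 0.9839, 0.9904, 0.9829, 0.9833 (highest minus lowest 0.008); AMOVA: among populations -0.65, within populations 100.65, FST -0.00645, P-value 0.99970 ± 0.00017
      Caaguazú: H 0.9921, 0.9839 (highest minus lowest 0.008); mtCE 0.9960, 0.9747 (highest minus lowest 0.021); AMOVA: among populations -2.22, within populations 102.22, FST -0.02217, P-value 0.95772 ± 0.00210
      Caazapá: H 0.9905, 0.9816 (highest minus lowest 0.009); mtCE 0.9714, 0.9724 (highest minus lowest 0.001); AMOVA: among populations -2.9, within populations 102.9, FST -0.02905, P-value 0.94386 ± 0.00252
      Capital: H 0.9931, 0.9956, 0.9932 (highest minus lowest 0.003); mtCE 0.9425, 0.9812, 0.9857, 0.9615 (highest minus lowest 0.043); AMOVA: among populations 0.78, within populations 99.22, FST 0.00778, P-value 0.12139 ± 0.00366
      Central: H 0.9945, 0.9978, 0.9905 (highest minus lowest 0.007); mtCE 0.9869, 0.9892, 0.9886, 0.9890 (highest minus lowest 0.002); AMOVA: among populations -1.7, within populations 101.7, FST -0.01703, P-value 1.00000 ± 0.00000
      Concepción: 0.9917, 0.9833
      Cordillera: H 1.0000, 0.9884 (highest minus lowest 0.012); mtCE 0.9481, 0.9733 (highest minus lowest 0.025); AMOVA: among populations -1.94, within populations 101.94, FST -0.01943, P-value 0.95059 ± 0.00219
      Guaíra: H 0.9758, 0.9762, 0.9817, 0.9753 (highest minus lowest 0.006); mtCE 0.9613, 0.9599, 0.9663, 0.9557 (highest minus lowest 0.006); AMOVA: among populations -0.74, within populations 100.74, FST -0.00743, P-value 1.00000 ± 0.00000
      Itapúa: H 0.9965, 0.9946, 0.9906, 0.9929 (highest minus lowest 0.006); mtCE 0.9877, 0.9869, 0.9767, 0.9805 (highest minus lowest 0.011); AMOVA: among populations -1.33, within populations 101.33, FST -0.01328, P-value 1.00000 ± 0.00000
      Misiones: 0.9895, 0.9789
      Paraguarí: H 0.9700, 0.9811, 0.9922, 0.9772 (highest minus lowest 0.022); mtCE 0.9580, 0.9703, 0.9832, 0.9658 (highest minus lowest 0.025); AMOVA: among populations -1.28, within populations 101.28, FST -0.01281, P-value 0.99505 ± 0.00067
      Ecuador
      Central: H 0.9934, 0.9957, 0.9935, 1.0000 (highest minus lowest 0.002); mtCE 0.9781, 0.9957, 0.9914, 1.0000 (highest minus lowest 0.018); AMOVA: among populations -1.45, within populations 101.45, FST -0.01452, P-value 0.98455 ± 0.00118
      North: H 0.9983, 1.0000, 1.0000, 1.0000 (highest minus lowest 0.002); mtCE 0.9963, 0.9947, 1.0000, 1.0000 (highest minus lowest 0.005); AMOVA: among populations -1.25, within populations 101.25, FST -0.01248, P-value 0.92554 ± 0.00256
      South: H 0.9930, 0.9947, 0.9921, 0.9905 (highest minus lowest 0.003); mtCE 0.9907, 0.9842, 0.9894, 0.9810 (highest minus lowest 0.006); AMOVA: among populations -1.59, within populations 101.59, FST -0.01595, P-value 0.96535 ± 0.00202
      Note 1: AMOVA results were obtained after 10100 permutations.
      Note 2: The "highest minus lowest" columns give the difference between the highest and lowest diversity values calculated, regardless of generation.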
      The AMOVA performed for each department/region (after grouping samples according to the three generations established) showed no statistically significant differentiation among generations (data not shown), in both haplotype and haplogroup composition.
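      The AMOVA columns of Table 2 are linked: with a single level of grouping, the percentage of variation among populations is FST multiplied by 100, and the within-population percentage is the remainder (for example, -2.69% among populations goes with 102.69% within and FST close to -0.027). The sketch below shows this relation under assumed variance components. The function name and the input values are hypothetical, and neither Arlequin's estimation of the components nor its permutation test is reproduced.

```python
def amova_summary(var_among: float, var_within: float) -> dict:
    """Turn two variance components into AMOVA percentages and FST.

    FST = var_among / (var_among + var_within); the percentages are the same
    ratio scaled to 100. Estimated components can be slightly negative when
    the true differentiation is close to zero.
    """
    total = var_among + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    fst = var_among / total
    return {
        "among_pct": 100.0 * fst,
        "within_pct": 100.0 * (1.0 - fst),
        "fst": fst,
    }


if __name__ == "__main__":
    # Hypothetical variance components chosen for illustration only.
    print(amova_summary(var_among=-0.012, var_within=0.458))
```

      Slightly negative estimates of this kind are usually read as no detectable differentiation among the groups compared.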
      A Fisher test was performed to assess the presence of statistically significant differences in macro-haplogroup proportions across generations (after grouping samples into A, B, C, D, Eurasian and African). Haplogroup frequencies were constant over the generations, with some exceptions. The frequency of haplogroup B was higher in the group of individuals living in Capital (Paraguay) (57%) than in the subsets of individuals and mothers born in the region (30% and 35%, respectively). Statistically significant differences (p < 0.05) were also detected in haplogroup C from Capital, and European lineages from Alto Paraná and South Ecuador (data not shown).
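      For readers who want to reproduce this kind of comparison, the following minimal sketch computes a two-sided Fisher exact p-value for a 2×2 table (one haplogroup versus all others, in two subsets). It assumes the 2×2 form of the test; the software and exact table layout used in the study are not stated, and the counts in the example are hypothetical values chosen only to give proportions near 57% and 30%.

```python
from math import comb


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: haplogroup B vs. other lineages in two subsets.
    living_b, living_other = 17, 13   # about 57% B
    born_b, born_other = 48, 111      # about 30% B
    print(fisher_exact_2x2(living_b, living_other, born_b, born_other))
```

      When several haplogroups and regions are tested, as here, some p-values below 0.05 are expected by chance, so a multiple-testing correction may be worth considering when interpreting isolated significant results.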

      4. Discussion

      Overall genetic homogeneity was detected, although residual differentiation seems to exist in some departments/regions. If changes occurred in the maternal composition over recent generations, the high mtDNA diversity in South America [e.g., 4,7] may have hampered the detection of differences among the subsets established. Based on genealogical data, these results can also be explained by gene flow among populations that are not significantly different. For example, in Paraguay there was high migration over the last three generations among departments with similar mtDNA genetic backgrounds [4].

      5. Conclusion

      Haplotype frequency databases built to disclose population history are often used for forensic purposes. However, such databases do not always represent the current genetic diversity of the population, but only the fraction of individuals with three-generation ancestry in a geographic region. Therefore, to assess whether databases built for different purposes are interchangeable, it is crucial to consider demographic and historical data that may point to genetic differences between generations. This aspect is particularly important when building databases of South American populations, which have complex population dynamics, with different levels of recent immigration and isolation.

      Financial support

      L.G. and F.S. were supported by FAPERJ, Brazil (CNE-2022 and E-26/202.275/2019). L.G. was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico–CNPq, Brazil (ref. 306342/2019-7). A.C. was supported by Vicerectoría de Investigación y Extensión, Universidad Industrial de Santander (internal financing code 2488 of 2019). G.B. was supported by MED.GBF.20.07, funded by DGIV from Universidad de Las Américas, Quito, Ecuador.

      Conflict of interest

      None.

      References

        [1] Parson W., Gusmão L., Hares D.R., Irwin J.A., Mayr W.R., Morling N., Pokorak E., Prinz M., Salas A., Schneider P.M., Parsons T.J. DNA Commission of the International Society for Forensic Genetics: Revised and extended guidelines for mitochondrial DNA typing. Forensic Sci. Int. Genet. 2014; 13: 134-142
        [2] Saloum de Neves Manta F., Pereira R., Vianna R., Rodolfo Beuttenmüller de Araújo A., Leite Góes Gitaí D., Aparecida da Silva D., de Vargas Wolfgramm E., da Mota Pontes I., Ivan Aguiar J., Ozório Moraes M., Fagundes de Carvalho E., Gusmão L. Revisiting the genetic ancestry of Brazilians using autosomal AIM-Indels. PLoS One. 2013; 8: e75145
        [3] Corach D., Lao O., Bobillo C., van Der Gaag K., Zuniga S., Vermeulen M., van Duijn K., Goedbloed M., Vallone P.M., Parson W., De Knijff P., Kayser M. Inferring continental ancestry of Argentineans from autosomal, Y-chromosomal and mitochondrial DNA. Ann. Hum. Genet. 2010; 74: 65-76
        [4] Simão F., Ribeiro J., Vullo C., Catelli L., Gomes V., Xavier C., Huber G., Bodner M., Quiroz A., Ferreira A.P., Carvalho E.F., Parson W., Gusmão L. The ancestry of Eastern Paraguay: a typical South American profile with a unique pattern of admixture. Genes. 2021; 12
        [5] Excoffier L., Laval G., Schneider S. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol. Bioinform. Online. 2005; 1: 47-50
        [6] Simão F., Ferreira A.P., de Carvalho E.F., Parson W., Gusmão L. Defining mtDNA origins and population stratification in Rio de Janeiro. Forensic Sci. Int. Genet. 2018; 34: 97-104
        [7] Gómez-Carballa A., Moreno F., Álvarez-Iglesias V., Martinón-Torres F., García-Magariños M., Pantoja-Astudillo J.A., Aguirre-Morales E., Bustos P., Salas A. Revealing latitudinal patterns of mitochondrial DNA diversity in Chileans. Forensic Sci. Int. Genet. 2016; 20: 81-88