DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil; IPATIMUP (Institute of Pathology and Molecular Immunology from the University of Porto), Porto, Portugal; Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Portugal
Ancestry informative markers (AIMs) are useful for estimating individual and population ancestries, providing important information for forensic investigations. Several AIM sets have been described and evaluated by comparison with GWAS data. Given that an efficient set of AIMs should give identical results for full siblings, and that GWAS are not easily performed, we aimed to test whether the accuracy of ancestry estimates is correlated with the differences obtained between siblings. Pairs of siblings from Brazil were genotyped for 83 InDels, and values of African, European and Native American contributions were compared using different sets of markers. The comparison of ancestry in siblings was only meaningful for markers with high inter-population variation. The lowest average differences between siblings were obtained for the complete set of 83 InDels, even though it includes markers with low inter-population variation.
1. Introduction
Ancestry informative markers (AIMs) show significant differences in allele frequencies among ancestral or geographically distant populations. They can be successfully used to estimate ancestry at both individual and population levels, providing important information for forensic investigations [ ]. Several sets of markers have been described as useful for determining ancestry, and their efficiency in estimating accurate ancestry proportions is generally evaluated by comparison with data generated by genome-wide association studies (GWAS) [ ]. Determining ancestry in siblings may be a good strategy for evaluating marker performance, given that an efficient set of markers should give identical ancestry results for both siblings.
The aim of this study was to compare ancestry values among siblings for different groups of InDel markers. More precisely, we assessed how inter-population diversity and the number of markers affect the accuracy of siblings’ ancestry estimates and the differences between them.
2. Materials and methods
A total of 26 pairs of siblings were selected from kinship cases investigated at the DNA Diagnostic Laboratory of the State University of Rio de Janeiro, Brazil. Written informed consent was obtained from all participants for cooperation in this study under strictly confidential conditions. DNA was extracted with Chelex [ ]. Samples were genotyped for 83 InDels with different degrees of diversity and inter-population variation, using two previously described multiplex PCR protocols [ ]. Capillary electrophoresis and detection were performed on a 3500 Genetic Analyser using POP-7™ polymer (Applied Biosystems). Genotypes were assigned using GeneMapper ID v4.1 software (Applied Biosystems).
The apportionment of genetic ancestral contributions was estimated for all samples using STRUCTURE v2.3.3 [ ]. A supervised analysis was performed using prior information on the geographic origin of the reference samples, assuming an essentially tri-hybrid contribution from Native Americans, Europeans and Africans (i.e., K = 3). STRUCTURE runs consisted of 100,000 burn-in steps followed by 100,000 Markov chain Monte Carlo (MCMC) iterations. The option “Use population Information to test for migrants” was used with the Admixture model. Allele frequencies were correlated and updated using only individuals with POPFLAG = 1 (in this case, the HGDP-CEPH samples used as reference).
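For readers who run STRUCTURE from parameter files rather than the graphical front end, the settings described above could be written roughly as follows. This is a minimal sketch, not the authors' actual configuration: the mapping of the reported options onto the parameter names MAXPOPS, BURNIN, NUMREPS, NOADMIX, USEPOPINFO, PFROMPOPFLAGONLY and FREQSCORR is an assumption based on the STRUCTURE documentation, and `render_params` is a hypothetical helper. Data-file parameters (number of individuals, loci, file names) are not reported in the text and are left out.

```python
# Assumed mapping of the run settings reported in the text onto
# STRUCTURE parameter names; the authors' own files are not available.
RUN_SETTINGS = {
    "MAXPOPS": 3,           # K = 3: Native American, European, African
    "BURNIN": 100000,       # burn-in steps
    "NUMREPS": 100000,      # MCMC iterations after burn-in
    "NOADMIX": 0,           # 0 = admixture model
    "USEPOPINFO": 1,        # use prior population information
    "PFROMPOPFLAGONLY": 1,  # update allele frequencies from POPFLAG = 1 only
    "FREQSCORR": 1,         # correlated allele frequencies
}


def render_params(settings):
    """Hypothetical helper: format settings as '#define NAME value' lines."""
    return "\n".join(f"#define {name} {value}" for name, value in settings.items())


print(render_params(RUN_SETTINGS))
```

Keeping the settings in one dictionary makes it easy to rerun the same configuration on each marker subset, with only the input file changing.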
3. Results and discussion
To test the effect of using markers with low vs. high levels of population differentiation, African, European and Native American contributions were calculated for 26 pairs of siblings from the admixed population of Rio de Janeiro, using the 30 markers with the lowest (Set 1) and the 30 with the highest (Set 2) inter-population variation (Fig. 1A and B). Ancestry estimates for the three contributing populations (both within and between the 26 pairs) were very similar for Set 1, in contrast with the higher variation shown by Set 2. Despite the low efficiency of the first set of markers in estimating ancestry, the differences between siblings were lower than those obtained with the second set (Set 1 and Set 2 in Table 1). This is because markers with low levels of population differentiation tend to produce similar errors. Indeed, ancestry values below 0.33 for Set 2 were always overestimated by Set 1, and higher values were underestimated. This non-random deviation of the Set 1 estimates means that a comparative analysis in siblings is not useful for that set.
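The pattern described above (values below 0.33 overestimated, higher values underestimated) is what one would expect if weakly informative markers pull every estimate toward an equal three-way split. The following toy sketch illustrates that idea only. The linear shrinkage model, the `informativeness` parameter and the two sibling values are all assumptions made for illustration, not part of the study's analysis.

```python
UNINFORMATIVE = 1.0 / 3.0  # equal split across K = 3 source populations


def shrunk_estimate(reference_q, informativeness):
    """Toy model (assumption): pull an ancestry proportion toward 1/3.

    informativeness = 1.0 returns the reference value unchanged;
    informativeness = 0.0 returns 1/3 whatever the reference value is.
    """
    return UNINFORMATIVE + informativeness * (reference_q - UNINFORMATIVE)


# Hypothetical estimates for two siblings from a highly informative set.
sib_a, sib_b = 0.10, 0.20

for lam in (1.0, 0.2):
    est_a = shrunk_estimate(sib_a, lam)
    est_b = shrunk_estimate(sib_b, lam)
    print(f"informativeness={lam}: {est_a:.3f} vs {est_b:.3f}, "
          f"difference={abs(est_a - est_b):.3f}")
```

Under this toy model the difference between siblings shrinks in proportion to `informativeness`, so a small difference can reflect a shared bias rather than an accurate estimate.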
Fig. 1. Ancestry estimates for markers included in Set 1 (A), Set 2 (B), 46 AIMs (C) and for the 83 InDels (D).
Table 1. Values of African (AFR), European (EUR) and Native American (NAM) ancestry estimated in the whole data set using different groups of markers, together with the sum and the average differences observed between siblings.
Results were also compared for Set 2, the 46 AIMs and the full set of 83 InDels (Fig. 1B–D). The differences observed between the siblings in each pair were apparently random, which makes the comparison of siblings meaningful. Although the ancestry proportions were not significantly different among the three sets (Table 1), the highest differences between siblings were found for the 30-marker set, followed by the 46 AIMs, and the lowest for the full set of 83 markers. These results support a better performance of the complete set in ancestry estimation, even though it includes markers with low inter-population variation.
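The sibling comparison summarised in Table 1 can be sketched as below. This is an illustration under stated assumptions: the text does not say how the differences were computed, so absolute differences per ancestry component are assumed here; `sibling_differences` is a hypothetical helper, and the two example pairs are made-up values, not data from the study.

```python
from statistics import mean

COMPONENTS = ("AFR", "EUR", "NAM")


def sibling_differences(pairs):
    """Summarise differences between siblings for one marker set.

    pairs: list of (sib1, sib2), each a dict mapping component -> proportion.
    Returns {component: (sum of absolute differences, mean absolute difference)}.
    Using absolute differences is an assumption, not the authors' stated method.
    """
    summary = {}
    for comp in COMPONENTS:
        diffs = [abs(s1[comp] - s2[comp]) for s1, s2 in pairs]
        summary[comp] = (sum(diffs), mean(diffs))
    return summary


# Hypothetical ancestry estimates for two sibling pairs.
example_pairs = [
    ({"AFR": 0.30, "EUR": 0.60, "NAM": 0.10},
     {"AFR": 0.25, "EUR": 0.65, "NAM": 0.10}),
    ({"AFR": 0.50, "EUR": 0.40, "NAM": 0.10},
     {"AFR": 0.45, "EUR": 0.40, "NAM": 0.15}),
]

for comp, (total, average) in sibling_differences(example_pairs).items():
    print(f"{comp}: sum={total:.3f}, average={average:.3f}")
```

Running the same function on the estimates from each marker set would give one row of sums and averages per set, which is the kind of comparison made between the 30-marker, 46-AIM and 83-InDel sets.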
4. Conclusion
The approach followed in this study, evaluating the performance of groups of genetic markers using pairs of siblings, proved not to be adequate when markers with very different inter-population variation are compared. Although some markers have low efficiency in producing accurate ancestry estimates, they produce the same type of errors and therefore reduce the differences observed between siblings.
The deviations of the estimates obtained for groups of markers with high inter-population variation were apparently random, making the comparison of ancestry in siblings relevant. Using this strategy, we observed that the average differences between siblings decrease as more markers are added, supporting a better performance of large sets of markers, independently of their individual performance.
Financial support
Financial support was granted by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and DNA Program – State University and Justice Court of Rio de Janeiro, Brazil. IPATIMUP integrates the i3S Research Unit, which is partially supported by FCT, the Portuguese Foundation for Science and Technology.
Conflict of interest
None.
References
Pereira R., Phillips C., Pinto N., et al. Straightforward inference of ancestry and admixture proportions through ancestry-informative insertion deletion multiplexing.
Investigation of the STR locus HUMTH01 using PCR and two electrophoresis formats: UK and Galician Caucasian population surveys and usefulness in paternity investigations.