STRAND: A Cloud expert system for non-human DNA analysis

Published: September 24, 2019
DOI: https://doi.org/10.1016/j.fsigss.2019.09.057

      Abstract

      The aim of this paper is to present the STRAND (STR ANimal Database) cloud expert system for non-human DNA analysis. The cloud expert system (CES) combines the cross-referenced registries of STR markers for different species and a DNA database for comparison of DNA profiles, with a repository of scientific papers and a dashboard for unpublished data, protocols, negative results and announcements related to animal DNA typing.

      Keywords

      1. Introduction

      Poaching animals for use in traditional medicines and illegal trade has a large and negative impact on the critically endangered animals that are protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). DNA barcoding techniques are established methods [Linacre and Tobe, 2011] that, despite some pending problems [Collins and Cruickshank, 2013], provide information about the identification of the specimen within a tested sample by offering qualitative information about the matching organism [Ratnasingham and Hebert, 2013]. The investigation of wildlife crime based solely on specimen identification is not sufficient in many cases because individual identification or paternity is often the key information that enables the case to be solved. Human STR-based DNA identification for forensic purposes is a well-established method that offers a comparison of DNA profiles nation- and worldwide [Birdwell et al., 1999; Niezgoda, 1998; van der Beek, 2017] and allows specialists to perform family searches [Ge et al., 2011] using CODIS software. Forensic animal DNA typing, on the other hand, is still far from the standardization level of human DNA identification. The recommendations of ISFG [Linacre et al., 2011] serve as a good framework for the development of STR typing systems, but unfortunately, there are few scientific papers describing fully validated STR typing multiplexes or CODIS-like databases for CITES organisms involved in wildlife crime. Additionally, there is a lack of a knowledge base, such as the NIST STRBase [Butler, 2008], for the field of animal DNA typing.

      2. Basic layout of STRAND CES

      The STRAND CES uses an HTTPS-secured connection, a standard .info domain for data sources (www.nhDNAdb.info) and a virtual server in the Czech Radiocommunications governmental data center. Access to the STRAND CES requires log-in credentials that can be provided by the STRAND CES administrators upon request. The STRAND cloud expert system supports three levels of users: registered users, DNA laboratories, and reference laboratories. The basic idea is that every species has one selected reference laboratory that defines and validates the set(s) of STR loci, manages the population studies and provides scientific support to laboratories submitting DNA profiles to the STRAND database. This set of STR markers should follow the ISFG recommendations for animal DNA typing (allelic ladders, preferably 4 bp STR repeats, population studies of allelic frequencies, species-specific primers, ISO/IEC 17025 accreditation, multiplexes, etc.) [Linacre et al., 2011]. The standardized set of validated STRs allows interlaboratory comparison, and thus, the reference laboratory should either use published primer sequences or be able to provide kit(s) to other laboratories. The reference laboratory should also have previous experience with forensic cases and have the capacity for sample retesting in disputed cases.

      3. Data privacy and access to cloud expert system resources

      Laboratories own their data and can mark their stored DNA profiles as private, shared or public in the STRAND CES. Public data sets can be viewed and searched by all registered laboratories. Only registered collaborating laboratories are allowed access to DNA profiles. DNA profiles can also be marked as references for population studies. The DNA database enables the addition of more STR loci to the existing sets. Registered users can browse the bibliographic records, the list of taxa that have DNA profiles and the short reports, and can submit the following queries: “Show me the STR systems used/recommended for a particular species”, “Who can compare my forensic unknown DNA profile with a registry of DNA profiles”, “Show me all publications for this species”, “Show me all authors who have published about this species” (see Fig. 1), “Show me the (reference) labs that work with a particular species”, and “Show me WHO did WHAT within the STRAND CES so I can ask them”.
      Fig. 1. Expert System/DNA Bibliography (example listing for Panthera tigris). You can directly access the PDF/abstract of the article and the Wikipedia/Google entry for the taxon.
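
      The private/shared/public marking described in this section can be illustrated with a minimal Python sketch. It is not the STRAND implementation: the class and function names are hypothetical, and the meaning assigned to "shared" (visible to the owner plus an explicit list of collaborating laboratories) is an assumption, since the paper does not define it.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"  # assumed: only the owning laboratory
    SHARED = "shared"    # assumed: owner plus listed collaborating laboratories
    PUBLIC = "public"    # all registered laboratories (as stated in the paper)


@dataclass
class DnaProfile:
    """Hypothetical record; field names are illustrative only."""
    profile_id: str
    owner_lab: str
    visibility: Visibility = Visibility.PRIVATE
    shared_with: set[str] = field(default_factory=set)
    is_population_reference: bool = False  # flag for population studies


def can_view(profile: DnaProfile, lab: str) -> bool:
    """Return True if `lab` may see `profile` under the assumed rules."""
    if lab == profile.owner_lab:
        return True
    if profile.visibility is Visibility.PUBLIC:
        return True
    if profile.visibility is Visibility.SHARED:
        return lab in profile.shared_with
    return False


shared = DnaProfile("P-001", "lab_a", Visibility.SHARED, {"lab_b"})
public = DnaProfile("P-002", "lab_a", Visibility.PUBLIC)
private = DnaProfile("P-003", "lab_a")

print(can_view(shared, "lab_b"), can_view(shared, "lab_c"))
print(can_view(public, "lab_c"), can_view(private, "lab_c"))
```

      Keeping the owner check first reflects the statement that laboratories own their data; everything else in the rule set is a design choice made for this illustration.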

      4. The structure of the cloud expert system

      The CES supports cross-referencing between all modules (the taxon registry, the registry of STR markers, reference database of DNA profiles, bibliography, dashboard). The CES offers the following modules:

      4.1 The taxon registry

      The taxon registry has a tree structure (class/order/family/genus/species). DNA profile records are always linked to a particular species. Bibliography records can be linked at all levels of the tree structure (class/order/family/genus/species). The taxon registry contains CITES flags.
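
      As a rough illustration of such a registry, the following Python sketch models the five-level tree, attaches bibliography records at any level and restricts DNA profiles to species nodes. All class, field and method names are hypothetical, and the single boolean CITES flag is a simplifying assumption; the paper does not describe the actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass, field

RANKS = ("class", "order", "family", "genus", "species")


@dataclass(eq=False)
class Taxon:
    """Hypothetical node of the taxon tree."""
    name: str
    rank: str
    cites_listed: bool = False  # simplified CITES flag (assumption)
    parent: Taxon | None = field(default=None, repr=False)
    children: list[Taxon] = field(default_factory=list, repr=False)
    bibliography: list[str] = field(default_factory=list)
    profile_ids: list[str] = field(default_factory=list)

    def add_child(self, name: str, cites_listed: bool = False) -> Taxon:
        """Create a child one rank below this node."""
        if self.rank == "species":
            raise ValueError("a species node cannot have children")
        child_rank = RANKS[RANKS.index(self.rank) + 1]
        child = Taxon(name, child_rank, cites_listed, parent=self)
        self.children.append(child)
        return child

    def add_profile(self, profile_id: str) -> None:
        """DNA profiles are always linked to a species."""
        if self.rank != "species":
            raise ValueError("DNA profiles must be linked to a species")
        self.profile_ids.append(profile_id)

    def lineage(self) -> list[str]:
        """Names from the class level down to this node."""
        names, node = [], self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]

    def publications(self) -> list[str]:
        """Bibliography linked to this node or to any of its ancestors."""
        records, node = [], self
        while node is not None:
            records.extend(node.bibliography)
            node = node.parent
        return records


mammalia = Taxon("Mammalia", "class")
felidae = mammalia.add_child("Carnivora").add_child("Felidae")
tiger = felidae.add_child("Panthera").add_child("Panthera tigris", cites_listed=True)
felidae.bibliography.append("family-level paper (placeholder)")
tiger.bibliography.append("species-level paper (placeholder)")
tiger.add_profile("P-001")

print(" / ".join(tiger.lineage()))
print(tiger.publications())
```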

      4.2 STR markers registry (markers/species)

      The basic idea is that every taxon has a defined set of validated STR markers at the species level (Fig. 2 supplemental material). This set of STR markers should follow the ISFG recommendations and thus enable interlaboratory comparisons. Each STR locus is linked with allelic frequency data derived from the population data of unrelated individuals and thus allows the calculation of the probability of identity [Waits et al., 2001] and of the probability of parentage exclusion [Jamieson and Taylor, 1997].
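
      To make the link between allele frequencies and these statistics concrete, here is a minimal Python sketch of the per-locus probability of identity for unrelated individuals and for siblings (in the form given by Waits et al.) and of the parentage exclusion probability when one parent is known (one of the three formulae compared by Jamieson and Taylor). The function names, the locus names and the frequencies are invented for illustration, and the multiplication across loci assumes independent loci; the paper does not state how STRAND computes these values.

```python
from math import isclose, prod


def _power_sum(freqs: list[float], k: int) -> float:
    return sum(p ** k for p in freqs)


def _check(freqs: list[float]) -> None:
    if not isclose(sum(freqs), 1.0, abs_tol=1e-6):
        raise ValueError("allele frequencies at a locus must sum to 1")


def probability_of_identity(freqs: list[float]) -> float:
    """PI, unrelated individuals: sum p_i^4 + sum_{i<j} (2 p_i p_j)^2."""
    _check(freqs)
    n = len(freqs)
    hetero = sum((2 * freqs[i] * freqs[j]) ** 2
                 for i in range(n) for j in range(i + 1, n))
    return _power_sum(freqs, 4) + hetero


def probability_of_identity_sibs(freqs: list[float]) -> float:
    """PI among siblings: 0.25 + 0.5*S2 + 0.5*S2^2 - 0.25*S4."""
    _check(freqs)
    s2, s4 = _power_sum(freqs, 2), _power_sum(freqs, 4)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def exclusion_one_parent_known(freqs: list[float]) -> float:
    """Exclusion probability for a wrongly alleged parent, other parent known."""
    _check(freqs)
    s2, s3, s4, s5 = (_power_sum(freqs, k) for k in (2, 3, 4, 5))
    return 1 - 2 * s2 + s3 + 2 * s4 - 3 * s5 - 2 * s2 ** 2 + 3 * s2 * s3


def combined_identity(per_locus: list[float]) -> float:
    """Multi-locus PI, assuming independent loci."""
    return prod(per_locus)


def combined_exclusion(per_locus: list[float]) -> float:
    """Multi-locus exclusion probability, assuming independent loci."""
    return 1 - prod(1 - pe for pe in per_locus)


# Invented frequencies for two hypothetical loci.
loci = {
    "LOCUS_A": [0.5, 0.5],
    "LOCUS_B": [0.4, 0.3, 0.2, 0.1],
}
pi_values = [probability_of_identity(f) for f in loci.values()]
pe_values = [exclusion_one_parent_known(f) for f in loci.values()]
print(combined_identity(pi_values), combined_exclusion(pe_values))
```

      For a locus with two equally frequent alleles the genotype frequencies are 0.25, 0.5 and 0.25, so the probability of identity is 0.25² + 0.5² + 0.25² = 0.375, which is a convenient check on the first function.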

      4.3 DNA profile reference database

      The CES allows users to compare unknown DNA profiles with database records, with the closest matches listed at the top (Fig. 3 supplemental material), to compare two or more database records against each other, to perform basic population data statistics and to perform paternity comparisons (finding candidate parents for an unknown individual or finding candidate offspring for given parents).
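
      One simple way to produce a "closest matches first" listing and a candidate-parent screen is sketched below in Python. This is an assumption about how such a comparison could work, not a description of the STRAND algorithm: the scoring rule (number of shared alleles over the loci typed in both profiles), the function names and the example profiles are all hypothetical, and the parent screen ignores mutations, null alleles and genotyping errors.

```python
from collections import Counter

Genotype = tuple[int, int]      # two allele calls (repeat numbers) at one locus
Profile = dict[str, Genotype]   # locus name -> genotype


def shared_alleles(g1: Genotype, g2: Genotype) -> int:
    """Number of alleles two genotypes have in common (0, 1 or 2)."""
    return sum((Counter(g1) & Counter(g2)).values())


def match_score(query: Profile, record: Profile) -> tuple[int, int]:
    """Shared alleles and maximum possible over loci typed in both profiles."""
    loci = query.keys() & record.keys()
    shared = sum(shared_alleles(query[l], record[l]) for l in loci)
    return shared, 2 * len(loci)


def rank_matches(query: Profile, database: dict[str, Profile]):
    """Database records sorted so that the closest matches come first."""
    rows = []
    for profile_id, record in database.items():
        shared, possible = match_score(query, record)
        fraction = shared / possible if possible else 0.0
        rows.append((profile_id, shared, possible, fraction))
    return sorted(rows, key=lambda r: (r[3], r[1]), reverse=True)


def is_candidate_parent(child: Profile, adult: Profile) -> bool:
    """True if the adult shares at least one allele with the child at every common locus."""
    loci = child.keys() & adult.keys()
    return bool(loci) and all(
        shared_alleles(child[l], adult[l]) >= 1 for l in loci
    )


# Invented example data with hypothetical locus names.
database = {
    "ref-001": {"L1": (10, 12), "L2": (8, 8), "L3": (15, 17)},
    "ref-002": {"L1": (10, 11), "L2": (8, 9), "L3": (14, 17)},
    "ref-003": {"L1": (13, 14), "L2": (6, 7), "L3": (11, 12)},
}
unknown = {"L1": (10, 12), "L2": (8, 8), "L3": (15, 17)}

for row in rank_matches(unknown, database):
    print(row)
print(is_candidate_parent(unknown, database["ref-002"]))
```

      A production system would weight matches by allele frequencies (for example with likelihood ratios) rather than by raw allele counts; the count-based score is used here only because it is easy to follow.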

      4.4 Bibliography

      The users of the CES Bibliography module are DNA scientists, experienced users (governmental agencies) and lay people. This module is not only a repository of scientific papers but also contains additional primary registries such as authors, departments (labs) and journals. These registries are linked to the DOI index and an abstract/full paper. This database contains internal linkages to STR markers, DNA profiles and taxa (Fig. 1).

      4.5 Short communication dashboard

      This module serves as a dashboard for unpublished data, protocols, negative results and announcements related to animal DNA typing.

      5. Conclusions

      The STRAND CES enables powerful cooperation among all parties (law enforcement, environmental agencies, DNA laboratories, research institutes, scientific societies, etc.) involved in combating the illegal trade of endangered species or performing identification and research studies on different animal species. The DNA database of the cloud expert system provides centralized storage for DNA profiles regardless of the location where the DNA profile was generated, assuming that DNA profiles were generated using standardized and validated sets of STRs. The power of any database of DNA profiles increases with the number of entries.

      Declaration of Competing Interest

      The authors declare no conflict of interest.

      Appendix A. Supplementary data

      The following is Supplementary data to this article:

      References

        Linacre A., Tobe S.S. An overview to the investigative approach to species testing in wildlife forensic science. Investig. Genet. 2011; 2: 2
        Collins R., Cruickshank R. The seven deadly sins of DNA barcoding. Mol. Ecol. Resour. 2013; 13: 969-975
        Ratnasingham S., Hebert P.D. A DNA-based registry for all animal species: the Barcode Index Number (BIN) system. PLoS One. 2013; 8: e66213
        Birdwell J., Horn R., Icove D., Wang T., Yadav P., Niezgoda S. A hierarchical database design and search method for CODIS. In: Tenth International Symposium on Human Identification. 1999
        Niezgoda S.J. CODIS program overview. Profiles in DNA. 1998; 1: 12-13
        van der Beek C. Past, present and future of forensic DNA databases. In: Handbook of Forensic Genetics: Biodiversity and Heredity in Civil and Criminal Investigation. World Scientific, 2017: 217-229
        Ge J., Chakraborty R., Eisenberg A., Budowle B. Comparisons of familial DNA database searching strategies. J. Forensic Sci. 2011; 56: 1448-1456
        Linacre A., Gusmao L., Hecht W., Hellmann A., Mayr W., Parson W., et al. ISFG: recommendations regarding the use of non-human (animal) DNA in forensic genetic investigations. Forensic Sci. Int. Genet. 2011; 5: 501-505
        Butler J.M. New resources for the forensic genetics community available on the NIST STRBase website. Forensic Sci. Int. Genet. Suppl. Ser. 2008; 1: 97-99
        Waits L.P., Luikart G., Taberlet P. Estimating the probability of identity among genotypes in natural populations: cautions and guidelines. Mol. Ecol. 2001; 10: 249-256
        Jamieson A., Taylor C. Comparisons of three probability formulae for parentage exclusion. Anim. Genet. 1997; 28: 397-400