Genetic tools for population-level identification
Microsatellites, also called Short Tandem Repeats (STRs) in forensics, are tandem repeats of a motif one to six nucleotides long (e.g. 'cgtacgtacgtacgtacgta', five copies of 'cgta') in the genome. They are highly polymorphic because the number of repeats varies (between 5 and 100), even between individuals. Microsatellites are the standard marker for human identity testing by DNA profiling and for forensic genetic crime scene investigation (Butler, 2005). They have also been used extensively in fish population studies, and their potential value as traceability markers for origin assignment is very high. However, despite their widespread application, microsatellites have drawbacks, particularly scoring error and a lack of comparability among laboratories (Dewoody et al., 2006). Nevertheless, there are numerous examples of microsatellites being used for fish population/stock analysis, management and origin assignment (Manel et al., 2005; Hauser and Carvalho, 2008), including Atlantic salmon (Primmer et al., 2000), Pacific salmon (Fisheries and Oceans Canada, DFO) and cod (Nielsen et al., 2001).
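To make the idea of a variable repeat number concrete, the following minimal Python sketch counts how many consecutive copies of a motif occur in a sequence. The function name, the flanking sequences and the two alleles are hypothetical illustrations; routine microsatellite genotyping normally scores fragment length rather than reading the sequence directly.

```python
import re


def count_repeats(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    sequence = sequence.lower()
    motif = motif.lower()
    # One or more uninterrupted copies of the motif
    pattern = re.compile("(?:" + re.escape(motif) + ")+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Two hypothetical alleles of the same locus that differ only in repeat number
allele_a = "ttga" + "cgta" * 5 + "ccat"
allele_b = "ttga" + "cgta" * 8 + "ccat"

print(count_repeats(allele_a, "cgta"), count_repeats(allele_b, "cgta"))
```

An individual carrying both alleles would be scored as a heterozygote with repeat numbers 5 and 8 at this locus.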
Meanwhile, Single Nucleotide Polymorphisms (SNPs) have entered the realm of fisheries genetics, offering great potential for origin assignment (Hudson, 2008). SNPs are sites in the genome at which more than one nucleotide (A, C, G or T) occurs within a species. They are the most abundant polymorphism in the genome (Brumfield et al., 2003), but each locus normally has only two alleles (biallelic markers), so SNPs are less variable than microsatellites, which often have many alleles. The lower information content of each SNP marker is outweighed by their high abundance. Unlike other genetic markers, for which routine genotyping and transfer of protocols between laboratories prove difficult, SNPs yield categorical information, and data can be standardized across laboratories for forensic applications (Sobrino et al., 2005). However, a substantial research effort targeting all commercial marine fish species will be necessary before SNPs can be employed routinely for origin assignment. Despite this, available studies on marine fish using SNPs are encouraging. SNPs used to distinguish stocks of Atlantic cod (Gadus morhua) provided high resolving power for stock identification, comparable to that of microsatellite loci (Wirgin et al., 2007). Another example is the North Pacific Anadromous Fish Commission (NPAFC), which is developing SNP arrays for Pacific salmon (http://www.npafc.org).
The application of SNPs to population genetics is not without problems, including so-called "ascertainment bias", the selection of loci based on an unrepresentative sample of individuals. For example, if SNPs have been developed from a few individuals (small ascertainment depth), SNPs with high heterozygosities are preferentially found, giving a false impression of overall genomic polymorphism. Likewise, if SNPs are developed from a biased sample of individuals (e.g. not covering the full range of populations), comparative analyses of population-specific indices of variability can be biased. However, in the context of mixed stock analysis (MSA), for example, ascertainment bias is not expected to create problems. Population-biased ascertainment could result in marginally lower power for MSA in populations not included in the ascertainment sample; however, the high number of markers employed would most likely compensate for this.
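The effect of ascertainment depth can be illustrated with a simple calculation. The sketch below assumes that chromosomes in the discovery panel are sampled at random and independently, so that a biallelic SNP with minor allele frequency p is seen as polymorphic among n chromosomes with probability 1 - p^n - (1 - p)^n. The function names and the frequencies shown are illustrative assumptions, not values from any of the studies cited.

```python
def discovery_probability(maf, n_chromosomes):
    """Probability that both alleles appear among n randomly sampled chromosomes."""
    return 1.0 - maf ** n_chromosomes - (1.0 - maf) ** n_chromosomes


def expected_heterozygosity(maf):
    """Expected heterozygosity of a biallelic locus under Hardy-Weinberg proportions."""
    return 2.0 * maf * (1.0 - maf)


# Compare a shallow discovery panel (4 chromosomes) with a deeper one (40)
for n in (4, 40):
    for maf in (0.02, 0.10, 0.40):
        print(
            f"n={n:2d}  maf={maf:.2f}  "
            f"H={expected_heterozygosity(maf):.3f}  "
            f"P(discovered)={discovery_probability(maf, n):.3f}"
        )
```

With the shallow panel, common SNPs (high heterozygosity) are far more likely to be discovered than rare ones, which is the source of the inflated impression of polymorphism; the gap narrows as the panel grows.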
Among the most recent applications of SNP markers to fisheries are the outputs of an EU Seventh Framework Project, FishPopTrace (http://fishpoptrace.jrc.ec.europa.eu/). One of the most striking scientific results is the provision of several hundred novel genetic markers in hake, herring and sole. Although these fish represent a major part of the European catch, many aspects of their biology remain unknown. This also holds for the number, location and independence of biological populations. The lack of high-resolution genetic data has complicated sustainable management, which should rely on the basic, biologically independent units rather than on geographically defined "stocks". However, access to new genetic methods, the so-called next generation sequencing, has changed the picture in a matter of just a few years. From a dozen genetic markers a few years ago, we now have knowledge of thousands of small genetic differences (genetic variation) at numerous genes, allowing the design of hundreds to thousands of new genetic markers. Unique combinations of these variants make it feasible to assign fish to specific populations and, under some conditions, to identify unique individuals.
It is now possible to correctly assign fish to populations from more areas and with higher certainty than previously possible, reaching standards that can be used in a court of law. By using the genes that differ most among populations, it has been possible to develop "minimum assays with maximum power" comprising 10 to 30 SNPs. These assays have been developed to target some of the most pertinent needs for traceability tools in European fisheries management. For example, fast, efficient and forensically robust tools are now available to discriminate between cod from Canada, North Sea, Baltic Sea and Northeast Arctic populations, between North Sea and North Atlantic herring, between sole from the Irish Sea and the Thames, and between hake from the Mediterranean and Atlantic areas.
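The idea of building a small assay from the most strongly differentiated markers can be sketched as a ranking by F_ST. The example below uses a simple Nei-style estimate, (H_T - H_S) / H_T, computed from per-population allele frequencies with equal weights and no sample-size correction; this is an assumption made for brevity, not the procedure used in FishPopTrace. The SNP names and frequencies are hypothetical.

```python
def fst(freqs):
    """Nei-style F_ST for one biallelic SNP from per-population allele frequencies."""
    p_bar = sum(freqs) / len(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # heterozygosity of the pooled populations
    if h_t == 0.0:
        return 0.0  # locus is fixed everywhere, so it carries no information
    # Mean within-population heterozygosity
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


def select_panel(snp_freqs, panel_size):
    """Return the names of the panel_size SNPs with the highest F_ST."""
    ranked = sorted(snp_freqs, key=lambda name: fst(snp_freqs[name]), reverse=True)
    return ranked[:panel_size]


# Hypothetical allele frequencies in two baseline populations
candidate_snps = {
    "snp_01": [0.50, 0.52],
    "snp_02": [0.10, 0.90],
    "snp_03": [0.30, 0.60],
    "snp_04": [0.05, 0.07],
    "snp_05": [0.80, 0.20],
}

panel = select_panel(candidate_snps, 2)
print(panel)
```

In this toy data set the two SNPs with the largest frequency differences between populations are retained, and the nearly uninformative ones are dropped.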
One major advantage of using SNPs is the ability to alter the number of markers in relation to the biology of the species (levels of genetic differentiation) and the scale of geographic structuring of interest. Thus, by varying the number of SNPs used on a SNP-chip, it is possible, for example, to assign individuals back to their source population across different geographic scales with high levels of certainty and reproducibility. Such outputs are especially significant because previous types of genetic markers either detect too little population differentiation or yield data that are inherently difficult to compare between laboratories. The use of a marker system such as SNPs, which is essentially based on the presence or absence of large numbers of single genetic variants, means that data from different sources can be compiled in a much more reliable and high-throughput way. The approach thereby enables a baseline to be generated and added to over time for subsequent genetic monitoring. Moreover, it is imperative that any such tools can be used in a legal context, which necessitates forensic validation. This has been achieved for SNP markers within FishPopTrace across a range of policy-driven scenarios of illegal, unreported and unregulated (IUU) fishing (see Traceability of Fish Populations and Fish Products: http://fishpoptrace.jrc.ec.europa.eu/).
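Assignment of an individual to its source population can be illustrated with a basic likelihood calculation. The sketch below assumes Hardy-Weinberg proportions and independent loci, and assigns a fish to the baseline population under which its multilocus genotype is most probable. The population names, allele frequencies, genotype and the clamping bounds of 0.01 and 0.99 are all hypothetical; a forensically validated analysis would also report how strongly the best population is supported over the alternatives.

```python
import math


def genotype_log_likelihood(genotype, freqs):
    """Log-likelihood of a multilocus SNP genotype under Hardy-Weinberg proportions.

    genotype: counts (0, 1 or 2) of the reference allele at each SNP.
    freqs: reference-allele frequencies at the same SNPs in one baseline population.
    """
    total = 0.0
    for count, p in zip(genotype, freqs):
        # Clamp so an allele absent from the baseline sample does not give log(0)
        p = min(max(p, 0.01), 0.99)
        if count == 2:
            prob = p * p
        elif count == 1:
            prob = 2.0 * p * (1.0 - p)
        else:
            prob = (1.0 - p) * (1.0 - p)
        total += math.log(prob)
    return total


def assign(genotype, baselines):
    """Return the most likely baseline population and all log-likelihoods."""
    scores = {
        name: genotype_log_likelihood(genotype, freqs)
        for name, freqs in baselines.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical reference-allele frequencies at four SNPs in two baselines
baselines = {
    "population_A": [0.90, 0.80, 0.15, 0.70],
    "population_B": [0.20, 0.30, 0.85, 0.25],
}
fish = [2, 2, 0, 1]  # genotype of the individual to be assigned

best, scores = assign(fish, baselines)
print(best, scores)
```

Because the log-likelihoods are summed over loci, adding further informative SNPs to the panel widens the gap between the candidate populations, which is how the number of markers can be tuned to the level of differentiation.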