Chapter 4. Genomics reveals domestication history and facilitates breed development

Utilisation and conservation of farm animal genetic resources
Miguel Toro and Asko Mäki-Tanila

2. How have the events in breed history modified the genetic variation?

2.1. Marker and sequence information

Genomic research has proven to be a powerful approach in revealing the history of animal populations; the number and sites of domestication; population expansions and contractions; the impact of selection; and the origin and mixing of maternal and paternal lineages. The most widely used genetic markers in diversity studies are microsatellites and SNPs (single nucleotide polymorphisms, see Box 3.3 in chapter 3). With modern DNA chip technology it is now possible to analyse up to thousands of loci, including a vast number of putatively neutral loci across the genome. The information from several linked sites can be combined and used to follow the combinations of alleles from different loci (haplotypes) and the haplotype frequencies through the processes of recombination, selection and drift. It is now also feasible to obtain complete sequence information on chosen areas of the genome. The outcome can be used to estimate the relatedness of sequence variants and to trace them back to ancestral sequences.

2.2. Detection of selection

Genetic mutations that increase the mutual benefit to domesticated animals and mankind give the individual animals carrying them a selective advantage, helping them to have more offspring, and this is repeated in the offspring that inherit the mutations. These kinds of mutations are very likely to spread through the population over generations rather than disappear from it, and are seen in genomes today. Consequently, it should be possible to find the cases where a particular allele of a locus has been so beneficial that it has spread quickly and widely in populations and thereby reduced the variation. The consequence of this spread is called a signature of selection: the level of variability is reduced, and the level of linkage disequilibrium and the genetic differentiation between populations are increased.

Detecting signatures of selection also faces obstacles. Demographic processes can produce patterns similar to those of selection, so the effects of selection and breed history may be hard to untangle. A high frequency of an allele in one population and its complete absence in another related, older population may be an outcome of different selection pressures. But it can also be a historical accident, not a mark of selection, if the founders of the population just happened to be carriers of the allele, even if it was uncommon, or if frequencies drifted in isolated lines.

Detection of selection starts from understanding the behaviour of the genome, or parts of it, under neutral conditions when selection is not present. Loci under selection will often behave differently and therefore reveal outlier patterns in variation. Predictions for neutral loci in very large populations, which have had the same size over generations, can be obtained and also extended to finite small populations. These predictions are usually accompanied by case-specific statistical parameters and by tests applicable to single markers, large sets of markers and sequence data. User-friendly software packages are often available for the different test procedures. Instead of deducing expectations from population genetics theory, one can genotype a large number of neutral loci to provide a baseline against which to test for outliers. Demographic processes, such as migrations, population contractions and expansions, and random drift, affect the whole genome, whereas selection leaves its mark on specific important functional regions in the genome. Using the whole genome as a baseline, it is straightforward to find deviating patterns within the genome.
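
The idea of using the genome as its own baseline can be sketched in a few lines of Python. The example is only an illustration under stated assumptions: two populations of equal weight, biallelic loci, allele frequencies known without sampling error, and differentiation measured by a simple heterozygosity-based Fst (Fst itself is introduced in section 2.3.1 and chapter 5). The function names `fst_biallelic` and `empirical_outliers` and the 5% tail are hypothetical choices made for this sketch, not part of any published procedure.

```python
def fst_biallelic(p1, p2):
    """Fst at one biallelic locus from the allele frequencies in two
    equally weighted populations (assumption: no sampling correction)."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                             # monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def empirical_outliers(freqs, tail=0.05):
    """Return (locus index, Fst) for the loci in the upper `tail` fraction
    of the genome-wide Fst distribution, most differentiated first."""
    fst = [fst_biallelic(p1, p2) for p1, p2 in freqs]
    ranked = sorted(range(len(fst)), key=lambda i: fst[i], reverse=True)
    n_out = max(1, int(len(fst) * tail))       # always report at least one locus
    return [(i, fst[i]) for i in ranked[:n_out]]


if __name__ == "__main__":
    # Toy data: 19 weakly differentiated loci and one strongly differentiated locus.
    freqs = [(0.50, 0.45)] * 7 + [(0.95, 0.05)] + [(0.30, 0.35)] * 12
    for locus, value in empirical_outliers(freqs):
        print(locus, round(value, 3))
```

Because an empirical tail always contains some loci, the flagged loci are candidates for closer study rather than proof of selection.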

2.3. Basic methods for finding outliers

2.3.1. The Lewontin-Krakauer test

A recent mutation in a population is at first relatively rare. In some populations it can quickly become quite common. Such common "young" alleles can be a sign of selection, because new favourable mutations replace other alleles faster than neutral ones do. When a locus shows an extraordinary level of differentiation between populations (measured by Fst, see chapter 5) compared with other loci, this may be interpreted as evidence for selection of an allele in one of the populations. A classic test of selective neutrality by Lewontin and Krakauer (1973) exploits this fact. The test rejects the neutral model for a locus if the level of genetic differentiation between populations is larger than predicted. This test has been rediscovered: new versions have been developed for large-scale genomic data (Akey et al., 2002) and augmented with greater statistical sophistication (Beaumont et al., 2002).

2.3.2. Sequence diversity

Several descriptive measures are used to summarise polymorphisms of DNA sequences. Under a neutral model, the expected level of diversity, commonly symbolised $\theta$, can be deduced from the generation of new alleles by mutation and from the elimination of alleles by drift, which is inversely proportional to effective population size, i.e.

$$\theta = 4 N_e \mu$$

where $N_e$ is the effective population size and $\mu$ is the mutation rate. There are several measures of molecular genetic diversity: the number of alleles; the number of segregating sites, i.e. the number of nucleotide positions at which polymorphism is found ($S$); and the average number of pairwise differences in a set of sequences ($\pi$). For comparison, in a human population two randomly chosen individuals differ at ~1 in 1,000 nucleotides (1 SNP per kilobase). The genetic diversity of mankind is low compared with other, older species. In cattle and sheep the mean nucleotide diversity is 2-2.5 SNPs per kilobase (cf. Meadows et al., 2004), whilst for the chicken the estimate is 4-5.5 (Hillier et al., 2004).

Outliers, or departures from the neutral expectation, can be assessed using statistical tests. The average number of pairwise differences is estimated as

$$\pi = \frac{1}{n} \sum_{i,j} x_i x_j \delta_{ij}$$

where $n$ is the length of the sequence, $x_i$ is the frequency of sequence type $i$, $x_j$ is the frequency of sequence type $j$ and $\delta_{ij}$ is the number of nucleotide differences between the haplotypes $i$ and $j$. It is directly an estimate of $\theta$, say $\theta_\pi$. The estimate deduced from the number of segregating sites, $\theta_S$, is

$$\theta_S = \frac{S}{\sum_{i=1}^{m-1} 1/i}$$

where $m$ is the number of sequences sampled, so that the sum runs up to one less than the sample size (not the sequence length $n$). Tajima (1989) constructed a measure to compare the estimates $\theta_\pi$ and $\theta_S$, with both expressed on the same scale (per site or per sequence). With no selection the estimates should be indistinguishable and the test statistic

$$D = \theta_\pi - \theta_S$$

should be zero. A prolonged population bottleneck should reduce $S$ and result in a positive $D$. Purifying selection will reduce heterozygosity and hence give negative $D$ values; conversely, positive values will be observed under balancing selection. If a population is expanding, many sequence types are seen, but their contribution to heterozygosity will be low and $D$ will be negative.

2.3.3. Linkage disequilibrium

Linkage disequilibrium (LD) describes a situation in which some combinations of alleles of two or more loci (haplotypes) occur less or more frequently than would be expected from the allele frequencies at the loci. In a way, loci in LD are co-segregating. LD is caused by selection, bottlenecks, migration and mutation. A new mutation is by nature in LD, since it arises in one animal on a single sequence background in the population. Related to LD, the genome contains regions with reduced haplotype diversity, called haplotype blocks (Wall and Pritchard, 2003), separated by regions of higher diversity. Their identification is suggested to facilitate whole-genome screenings for interesting genes with fewer markers than when the haplotype blocks are ignored (Johnson et al., 2001).
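
The definition of LD given in this section can be made concrete with a small Python sketch. It assumes phased two-locus haplotypes are available as pairs of allele labels, and it reports the raw disequilibrium coefficient (observed haplotype frequency minus the product of the allele frequencies) together with the squared correlation r², a commonly used standardised measure that the text above does not name. The function name `ld_two_loci` and the allele labels are hypothetical.

```python
from collections import Counter


def ld_two_loci(haplotypes, allele_a="A", allele_b="B"):
    """Return (D, r2) for two biallelic loci.

    `haplotypes` is a list of (allele at locus 1, allele at locus 2) pairs.
    D is the excess of the allele_a/allele_b haplotype over the frequency
    expected from the allele frequencies alone.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_ab = counts[(allele_a, allele_b)] / n
    p_a = sum(c for (a, _), c in counts.items() if a == allele_a) / n
    p_b = sum(c for (_, b), c in counts.items() if b == allele_b) / n

    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0   # undefined if a locus is fixed
    return d, r2


if __name__ == "__main__":
    # Only two haplotypes present: the loci co-segregate completely.
    complete = [("A", "B")] * 5 + [("a", "b")] * 5
    # All four haplotypes equally frequent: no association between the loci.
    independent = [("A", "B"), ("A", "b"), ("a", "B"), ("a", "b")] * 3
    print(ld_two_loci(complete))
    print(ld_two_loci(independent))
```

In the first toy sample the alleles at the two loci always travel together, which is the co-segregation described above; in the second, every combination occurs as often as the allele frequencies predict.
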
The generation of haplotype blocks is not completely understood. Often they are associated with variation in recombination rate across the genome (e.g. Daly et al., 2001), but it has also been shown that blocks may stem from a uniform recombination rate and drift alone (Zhang et al., 2003).
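
Returning to the sequence diversity measures of section 2.3.2, the comparison of the two estimates of θ can be sketched in Python as follows. This is a minimal illustration, assuming aligned sequences of equal length without gaps or missing data. It differs from the simplified statistic in the text in one respect: the difference between the two estimates is divided by its estimated standard deviation, as in Tajima (1989), which is how D is usually reported. Both estimates are computed per sequence rather than per site, and the function name `tajimas_d` is a hypothetical choice.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (pi, theta_s, d) for a list of aligned sequences.

    pi and theta_s are per sequence (not divided by sequence length).
    d is None when the sample has no segregating sites.
    """
    m = len(seqs)                               # number of sampled sequences
    if m < 4:
        raise ValueError("need at least 4 sequences (variance is zero below that)")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    # S: number of alignment columns showing more than one nucleotide
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # pi: average number of differences over all pairs of sequences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, m))        # sum of 1/i for i = 1 .. m-1
    theta_s = s_sites / a1
    if s_sites == 0:
        return pi, theta_s, None

    # Constants from Tajima (1989) for the variance of (pi - theta_s)
    a2 = sum(1 / i**2 for i in range(1, m))
    b1 = (m + 1) / (3 * (m - 1))
    b2 = 2 * (m * m + m + 3) / (9 * m * (m - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (m + 2) / (a1 * m) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = (pi - theta_s) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))
    return pi, theta_s, d


if __name__ == "__main__":
    two_lineages = ["AAAA", "AAAA", "TTTT", "TTTT"]   # variants at intermediate frequency
    singletons = ["AAAA", "TAAA", "ATAA", "AATA"]     # every variant seen only once
    print(tajimas_d(two_lineages))
    print(tajimas_d(singletons))
```

In this toy data the sample made of two divergent lineages gives a positive D, and the sample in which every variant is seen only once gives a negative D, in line with the patterns described in section 2.3.2.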