
Bioinformatics home

 
 
 


 
 

Positive selection, negative selection, and selective sweep  

2012-10-03 04:31:41 | Category: Bioinformatics programming


Positive selection

Positive selection, negative selection, and selective sweep - xiaofeng1982 - Tiger-Leon
 

Figure 1. Signatures of positive selection. On the left, patterns of neutral polymorphism (denoted as blue circles) are shown for a sample of six haplotypes. A new advantageous mutation (indicated by the red circle) arises on a specific haplotype (middle panel, highlighted in gray). As the advantageous allele increases in frequency, it drags along linked neutral polymorphisms. On the right, an incomplete selective sweep is shown, in which the advantageous allele has not yet reached fixation. This process perturbs patterns of genetic variation relative to neutral expectations and imparts signatures such as reduced levels of genetic variation, a skew in the site frequency spectrum (also referred to as the allele frequency distribution), and increased levels of LD. Recombination between haplotypes carrying and not carrying the advantageous allele delimits the region over which the signature of selection extends. Commonly used summary statistics that have been proposed to test for these signatures are also indicated and described in more detail in Box 1. Note that the relative magnitude of these signatures of positive selection depends on many parameters, such as when the advantageous allele arose, the strength of selection, whether the sweep is ongoing or has reached fixation, the amount of time that has elapsed since fixation, and local rates of recombination and mutation.

Tests based on polymorphisms within species

 

Tajima's D: this statistic measures the difference between two estimators of the population mutation rate, θw and π [53]. Under neutrality, the means of θw and π should be approximately equal to one another. Therefore, the expected value of Tajima's D for populations conforming to a standard neutral model is zero. Significant deviations from zero indicate a skew in the allele frequency distribution relative to neutral expectations. Positive values of Tajima's D arise from an excess of intermediate frequency alleles and can result from population bottlenecks, structure and/or balancing selection. Negative values of Tajima's D indicate an excess of low frequency alleles and can result from population expansions or positive selection.
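As a rough illustration of the comparison between θw and π, the sketch below computes Tajima's D from a small set of haplotypes. The function name `tajimas_d` and the input encoding (equal-length strings of '0' for the ancestral and '1' for the derived allele) are assumptions made for this example. The normalizing constants follow the standard published formula for the statistic, which requires at least four sequences.

```python
import math


def tajimas_d(haplotypes):
    """Return Tajima's D for equal-length haplotype strings of '0'/'1'."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")

    # Count the '1' allele at every site and keep only segregating sites.
    n_sites = len(haplotypes[0])
    counts = [sum(h[j] == "1" for h in haplotypes) for j in range(n_sites)]
    seg = [k for k in counts if 0 < k < n]
    s = len(seg)
    if s == 0:
        raise ValueError("no segregating sites in the sample")

    # pi: average number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)

    # theta_w: Watterson's estimator, S divided by the harmonic number a1.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    theta_w = s / a1

    # Constants for the variance of (pi - theta_w).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
```

A sample in which every variant is a singleton gives a negative D, while a sample split evenly between two haplotypes gives a positive D. Significance would still have to be assessed against neutral simulations or an appropriate null distribution.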

 

Fu and Li's D and F: this set of tests is similar to Tajima's D in that it tests for a skew in the allele frequency spectrum, but makes the distinction between old and recent mutations as determined by where they occur on the branches of genealogies. The D and F statistics compare an estimate of the population mutation rate based on the number of derived variants seen only once in a sample (referred to as singletons) with θw or π, respectively [54]. Similar to Tajima's D, the expected value of D and F is zero, and both positive and negative deviations are informative about distinct demographic and/or selective events.

 

Fay and Wu's H test: a statistic that detects the presence of an excess of high frequency derived alleles in a sample, which is a hallmark of positive selection [55].
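One common way to write the statistic is H = π − θH, where θH weights each site by the squared count of the derived allele. The sketch below uses the unnormalized form. The function name `fay_wu_h` and the input (a list of derived-allele counts per site, polarized with an outgroup) are assumptions for illustration.

```python
def fay_wu_h(derived_counts, n):
    """Unnormalized Fay and Wu's H from derived-allele counts in n chromosomes."""
    pairs = n * (n - 1)
    seg = [k for k in derived_counts if 0 < k < n]
    # pi weights intermediate-frequency variants most heavily.
    pi = sum(2.0 * k * (n - k) / pairs for k in seg)
    # theta_H weights high-frequency derived variants most heavily.
    theta_h = sum(2.0 * k * k / pairs for k in seg)
    return pi - theta_h
```

Strongly negative values indicate an excess of high-frequency derived alleles. For example, two sites with the derived allele on 5 of 6 chromosomes give H = −8/3.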

 

Long range haplotype (LRH) test: this test examines the relationship between allele frequency and the extent of LD [23]. Positive selection is expected to accelerate the frequency of an advantageous allele faster than recombination can break down LD at the selected haplotype. Thus, a hallmark of recent positive selection is an allele that has greater long-range LD given its frequency in the population relative to neutral expectations. To capture this signature, the LRH test begins by selecting a ‘core’ haplotype (note this could also be applied to a single SNP). Next, the decay in LD is assessed for flanking markers by calculating EHH, which is defined as the probability that two randomly chosen chromosomes carrying the core SNP or haplotype are identical by descent. For each core, haplotype homozygosity is initially 1 and decays to 0 at increasing distances. Positive selection is formally tested by finding core haplotypes that have elevated EHH relative to other core haplotypes at the locus conditional on haplotype frequency. By focusing on relative EHH, the various core haplotypes control for local rates of recombination.
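The EHH calculation can be sketched as follows, under simplifying assumptions: haplotypes are phased, equal-length strings, the core is a single SNP, identity by state stands in for identity by descent, and the extension runs in one direction only. The function name `ehh_curve` and its parameters are hypothetical.

```python
from collections import Counter


def ehh_curve(haplotypes, core_index, core_allele):
    """EHH at each marker to the right of the core, for carriers of core_allele."""
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two chromosomes carrying the core allele")
    total_pairs = n * (n - 1) // 2

    curve = []
    for end in range(core_index, len(carriers[0])):
        # Group carriers by their sequence from the core out to this marker.
        groups = Counter(h[core_index:end + 1] for h in carriers)
        # Pairs of chromosomes that are identical over the whole interval.
        same_pairs = sum(c * (c - 1) // 2 for c in groups.values())
        curve.append(same_pairs / total_pairs)
    return curve
```

With carriers "1000", "1000", "1010", and "1011" of core allele '1' at index 0, EHH starts at 1 and falls to 1/3 and then 1/6 as the interval grows. The LRH test itself then compares such curves across core haplotypes of similar frequency.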

 

iHS: this statistic is applied to individual SNPs and begins by calculating the integrated EHH (iHH), which is defined as the integral of the observed decay of EHH (i.e. the area under the curve of EHH versus distance) away from a specified core allele until EHH reaches 0.05 [22]. The log ratio of iHH for the ancestral and derived alleles is then standardized such that it has a mean of 0 and variance of 1 irrespective of allele frequency at the core SNP. Large positive and negative values of iHS indicate unusually long haplotypes carrying the ancestral and derived allele, respectively.
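A minimal sketch of the integration step, assuming the EHH curve for each allele has already been computed at known marker positions. The function names are hypothetical, the area is taken with the trapezoid rule on one side of the core only, and integration simply stops at the first marker where EHH falls below the cutoff (no interpolation). The final standardization within allele-frequency bins needs genome-wide data and is not shown.

```python
import math


def integrated_ehh(positions, ehh, cutoff=0.05):
    """Trapezoid-rule area under an EHH curve until EHH drops below cutoff."""
    area = 0.0
    for i in range(1, len(positions)):
        if ehh[i] < cutoff:
            break
        width = positions[i] - positions[i - 1]
        area += 0.5 * (ehh[i - 1] + ehh[i]) * width
    return area


def unstandardized_ihs(ihh_ancestral, ihh_derived):
    """Log ratio of iHH for the ancestral and derived alleles."""
    return math.log(ihh_ancestral / ihh_derived)
```

With this sign convention, a derived allele sitting on an unusually long haplotype has the larger iHH and therefore a negative score.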

 

LD decay (LDD): The goal of this test, similar in spirit to the iHH and EHH statistics, is to detect large differences in the extent of LD between two alleles at a particular locus [21]. The test begins by identifying individuals who are homozygous for the SNP being considered, which eliminates the need to infer haplotypes. Individuals are then sorted according to whether they are homozygous for the major or minor allele. The fraction of recombinant chromosomes (FRC) is computed for all adjacent SNPs within an a priori defined window and the FRC and distance from the target SNP are then used to calculate an ALnLH statistic. SNPs with high ALnLH values imply that the decay in LD for one allele is unusual compared with that of the alternative allele, in which the pattern of LD decay is within an a priori defined bound of the genome-wide average.

 

FST: a statistic that quantifies levels of differentiation between sub-populations [56]. Many estimators of FST have been proposed, but a conceptually simple one is (HT-HS)/HT. Here HT is an estimate of total heterozygosity and HS is a measure of the average heterozygosity across subpopulations. Thus, one way to interpret FST is the reduction in heterozygosity among subpopulations relative to what is expected under random mating. Under neutrality, levels of FST are largely determined by genetic drift and migration, but local adaptation can accentuate levels of population differentiation at particular loci thus resulting in large FST values.
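The (HT-HS)/HT estimator can be illustrated for a single biallelic locus. This sketch assumes equally weighted subpopulations and Hardy-Weinberg heterozygosity 2p(1-p) within each one. The function name `fst` and its input (one allele frequency per subpopulation) are assumptions for the example.

```python
def fst(subpop_freqs):
    """FST = (HT - HS) / HT for one biallelic locus, equal subpopulation weights."""
    k = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / k
    # HT: expected heterozygosity if all subpopulations were pooled.
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        raise ValueError("FST is undefined for a monomorphic locus")
    # HS: average expected heterozygosity within subpopulations.
    h_s = sum(2.0 * p * (1.0 - p) for p in subpop_freqs) / k
    return (h_t - h_s) / h_t
```

Identical frequencies give FST = 0, fixed differences give FST = 1, and frequencies of 0.2 and 0.8 in two subpopulations give FST = 0.36.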

Tests based on polymorphisms within species and the divergence between species
Hudson–Kreitman–Aguadé (HKA) test: the neutral theory predicts a positive correlation between levels of polymorphism within species and divergence between species [57]. The HKA test is used to determine if levels of nucleotide variation within and between species at two or more loci conform to this expectation. A significant HKA test can thus be caused by increased levels of polymorphism at one locus or reduced levels of polymorphism at the other, or by excess divergence at one locus or limited divergence at the other.

 

McDonald–Kreitman (MK) test: this test also makes use of polymorphism and divergence data, but compares different types of mutations, such as synonymous versus non-synonymous sites at a specific locus [58]. In the MK test, a 2×2 contingency table is formed to compare the number of non-synonymous and synonymous sites that are polymorphic within a species (PN and PS) and fixed between species (DN and DS). Under neutrality PN/PS=DN/DS, whereas positive selection leads to an increase in non-synonymous divergence (DN/DS>PN/PS).
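The 2×2 comparison can be sketched with a two-sided Fisher's exact test written against the standard library only. The choice of Fisher's exact test is an assumption of this example (the text above does not name a specific test for the table), and the function name `mk_test` and the counts used below are illustrative, not data from a cited study.

```python
from math import comb


def mk_test(pn, ps, dn, ds):
    """Return (DN/DS, PN/PS, two-sided Fisher exact p-value) for an MK table."""
    row_fixed = dn + ds      # all fixed differences
    col_nonsyn = dn + pn     # all non-synonymous changes
    total = pn + ps + dn + ds
    denom = comb(total, row_fixed)

    def prob(x):
        # Hypergeometric probability of x non-synonymous fixed differences.
        return comb(col_nonsyn, x) * comb(total - col_nonsyn, row_fixed - x) / denom

    lo = max(0, row_fixed - (total - col_nonsyn))
    hi = min(col_nonsyn, row_fixed)
    p_obs = prob(dn)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-9))
    return dn / ds, pn / ps, min(1.0, p_value)


# Illustrative counts: PN=2, PS=42, DN=7, DS=17.
dn_ds, pn_ps, p_value = mk_test(2, 42, 7, 17)
```

With these counts DN/DS exceeds PN/PS and the table departs significantly from the neutral expectation at the 5% level. The ratios are undefined when DS or PS is zero, which this sketch does not handle.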

Tests between species

dn/ds test: in the simplest forms, the ratio of non-synonymous (dn) to synonymous (ds) substitutions is compared in protein coding loci [59], [60] and [61]. The dn/ds ratio provides information about the evolutionary forces operating on a particular gene. For example, under neutrality dn/ds=1. For genes that are subject to functional constraint such that non-synonymous amino acid substitutions are deleterious and purged from the population, dn/ds<1. For positively selected genes, dn/ds>1. Although the observation of dn/ds>1 provides strong evidence for positive selection, it is conservative if only a few sites have been targets of adaptive evolution. The basic dn/ds test has been extended to include models of codon and transition and/or transversion bias, to detect variation in dn/ds ratios among lineages and to identify specific sites that might be under selection.

Negative selection

Negative selection is also called purifying selection; its indirect effect on linked neutral variation is known as background selection. One key reason why this form of selection is so prevalent is the success of evolution in optimizing biological structures: as soon as a system has been improved, there is the danger of losing that improvement by a deleterious mutation. Purifying selection makes sure that deleterious mutations cannot take over a population and that any improved structures, once fixed in a population, are maintained as long as they are needed. A dramatic example of such maintenance can be found in so-called "living fossils": if a species' ecological niche happens not to change for millions of years, fossil forms of the species can be almost indistinguishable from their present-day descendants.

There are two ways to measure the strength of negative selection: an absolute one and a relative one. The absolute selection coefficient s quantifies the relative fitness difference between the rare variant (mutant) and the most common variant (wild type). The advantage of this measure is its independence from population size. For mutations with large effects, it might even be possible to measure selection coefficients by comparing direct counts of surviving offspring from many individuals with and without a well-specified mutation.

The relative measure is the effective selection coefficient Nes, which is the product of the absolute selection coefficient s and the effective population size Ne. A value of Nes = 1 approximately denotes a threshold: harmful mutations with Nes < 0.25 are fixed with probabilities that are comparable to those of neutral mutations. On the other hand, mutations in which Nes > 4 will very rarely become fixed in a population with reasonable levels of recombination. Moreover, it is unlikely that a harmful mutation with Nes exceeding a few dozen could have ever been fixed in the entire history of the known universe.
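The threshold behaviour around Nes = 1 can be checked numerically with Kimura's diffusion approximation for the fixation probability of a new mutation. This is a sketch under stated assumptions that are not in the text above: a diploid population with Ne equal to the census size, additive effects, and small |s|. Under those assumptions the fixation probability relative to a neutral mutation is approximately S / (1 - exp(-S)) with S = 4·Ne·s. The function name is hypothetical.

```python
import math


def relative_fixation_probability(ne_s):
    """Fixation probability of a new mutation relative to a neutral one.

    ne_s is the signed product Ne * s (negative for harmful mutations).
    """
    big_s = 4.0 * ne_s
    if big_s == 0.0:
        return 1.0
    if big_s < 0.0:
        # Algebraically equal to S / (1 - exp(-S)), written to avoid overflow.
        return big_s * math.exp(big_s) / math.expm1(big_s)
    return big_s / -math.expm1(-big_s)
```

Under this approximation a harmful mutation with Nes = 0.25 fixes a bit more than half as often as a neutral one, whereas one with Nes = 4 fixes at a relative rate on the order of 10^-6.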

 

Consequences of Negative Selection

 

The main consequence of negative selection is the extinction of less-adapted variants. If the best-adapted variant does not change because it is at a stable local optimum, then negative selection will remove all new variants for that optimal trait.

 

It is important to note that negative selection can also impact molecular diversity. Consider the simple case of a population with genes that are optimally adapted to the existing constant environment. Such a setting is probably realistic for many "housekeeping" genes that ensure the proper working of the basic molecular machinery of life. Almost every mutation that happens in these genes will be deleterious, and, because mutations are the inevitable consequence of the molecular machinery that copies DNA, we can expect a substantial number of such harmful mutations. The deleterious nature of these mutations will result in their quick removal. In any real-life setting, however, an important side effect of such a removal will be the accompanying removal of linked mutations. This has important implications for the study of molecular diversity, as all neutral mutations linked to deleterious mutations will not be observed in the population and the corresponding Ne values will thus be reduced. This reduction in Ne has consequences for adaptive evolution, because the effective strength of selection for positive mutations will be reduced as well, and more advantageous mutations with small effects will be lost by chance.

 

If negative selection is too weak to remove harmful mutations, then deleterious mutation accumulation will occur, and a gradual decay of genomic integrity will be the result. This can lead to extinction for some species if it continues long enough; however, the resulting widespread existence of deleterious mutations in such a genome will eventually also lead to the occurrence of back mutations, which (among many other factors) can significantly contribute toward maintenance of a reasonable level of integrity in the genome of other species in the long term.

 

If negative selection is too strong for the whole population, extinction will occur, unless the population is rescued in time. Extinction can occur if the negative selection considered is "hard" selection, which actually reduces the number of surviving offspring that are produced. "Soft" selection (which occurs when the reproductive capacity of an organism is high enough) can also be negative, but it will lead only to competition over who will increase in frequency within the population, effectively without a reduction of the maximal number of offspring that can be produced. Thus, no extinction risk exists with soft selection.

 

Climate change and other habitat alterations are currently placing many species under such extreme negative selection that these species' survival is threatened. Also, mutagenic substances that are released into the environment by humans lead to a general increase in the frequency of mutations, a vast majority of which are deleterious and further increase the negative selection pressure for many populations. Thus, understanding the causes, extent, and consequences of negative selection can contribute important insights toward securing biodiversity in the long term.

Selective sweep

A selective sweep is the reduction or elimination of nucleotide variation in the DNA flanking a mutation as a result of recent, strong positive natural selection on that mutation.


A selective sweep can occur when a new mutation occurs that increases the fitness of the carrier relative to other members of the population. Natural selection will favour individuals that have a higher fitness and with time the newly mutated variant (allele) will increase in frequency relative to other alleles. As its prevalence increases, neutral and nearly neutral genetic variation linked to the new mutation will also become more prevalent. This phenomenon is called genetic hitchhiking. A strong selective sweep results in a region of the genome where the positively selected haplotype (the mutated allele and its neighbours) is essentially the only one that exists in the population, resulting in a large reduction of the total genetic variation in that chromosome region.
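Hitchhiking can be illustrated with a deterministic two-locus haploid model, a deliberate simplification that ignores genetic drift. Allele A has fitness 1 + s and arises on a haplotype that carries a neutral allele B. The function name `hitchhiking`, the starting frequencies, and the parameter values are assumptions for this sketch.

```python
def hitchhiking(s, r, p0=0.001, q0=0.2, generations=300):
    """Return final frequencies of (selected allele A, linked neutral allele B).

    s: selective advantage of A; r: recombination rate between the two loci;
    p0: starting frequency of A; q0: frequency of B on non-A haplotypes.
    """
    # Haplotype frequencies; A starts entirely on B-carrying haplotypes.
    x_AB, x_Ab = p0, 0.0
    x_aB, x_ab = (1 - p0) * q0, (1 - p0) * (1 - q0)

    for _ in range(generations):
        # Selection: haplotypes carrying A have relative fitness 1 + s.
        w_bar = 1 + s * (x_AB + x_Ab)
        x_AB, x_Ab = x_AB * (1 + s) / w_bar, x_Ab * (1 + s) / w_bar
        x_aB, x_ab = x_aB / w_bar, x_ab / w_bar

        # Recombination erodes the association (LD) between A and B.
        d = x_AB * x_ab - x_Ab * x_aB
        x_AB -= r * d
        x_ab -= r * d
        x_Ab += r * d
        x_aB += r * d

    return x_AB + x_Ab, x_AB + x_aB
```

With s = 0.1 and no recombination, B is carried to near fixation along with A. With free recombination (r = 0.5), A still sweeps but B stays close to its starting frequency of about 0.2.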

Detecting selective sweeps

Whether or not a selective sweep has occurred can be investigated in various ways. One method is to measure linkage disequilibrium, i.e., whether a given haplotype is overrepresented in the population. Under neutral evolution, genetic recombination will result in the reshuffling of the different alleles within a haplotype, and no single haplotype will dominate the population. However, during a selective sweep, selection for a positively selected gene variant will also result in selection of neighbouring alleles and less opportunity for recombination. Therefore, the presence of strong linkage disequilibrium might indicate that there has been a recent selective sweep, and can be used to identify sites recently under selection.
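Pairwise linkage disequilibrium is commonly summarized by D and r². The sketch below computes both for two biallelic sites from phased haplotypes. The function name `ld_r2` and the '0'/'1' string encoding are assumptions for illustration.

```python
def ld_r2(haplotypes, i, j):
    """Return (D, r^2) between sites i and j of phased '0'/'1' haplotypes."""
    n = len(haplotypes)
    p_a = sum(h[i] == "1" for h in haplotypes) / n
    p_b = sum(h[j] == "1" for h in haplotypes) / n
    p_ab = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n

    # D: observed haplotype frequency minus the expectation under independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denom
```

Two sites whose alleles always co-occur give r² = 1, and sites whose alleles occur in all four combinations at equal frequency give r² = 0. A sweep scan would look for regions where r² stays high over unusually long distances.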

A study of genetic variation among 269 humans found evidence for selective sweeps on chromosomes 1, 2, 4, 8, 12, and 22.[1]

In maize, a recent comparison of yellow and white corn genotypes surrounding Y1, the phytoene synthase gene responsible for the yellow endosperm color, shows strong evidence for a selective sweep in yellow germplasm, with reduced diversity at this locus and linkage disequilibrium in surrounding regions. White maize lines had increased diversity and no evidence of linkage disequilibrium associated with a selective sweep.

 
