
Bioinformatics home

 
 
 

http://genome.ucsc.edu/cgi-bin/hgTables http://genome.ucsc.edu/FAQ/FAQtracks#tracks1  

2011-03-20 13:50:55 | Category: Default


http://hapmap.ncbi.nlm.nih.gov/tutorials.html.en
Schema for HapMap SNPs - HapMap SNPs (rel27, merged Phase II + Phase III genotypes)

  Database: hg18    Primary Table: hapmapSnpsYRI    Row Count: 3,782,818
Format description: HapMap genotype summary
field       | example    | SQL type                        | description
bin         | 585        | int(10) unsigned                | Indexing field to speed chromosome range queries.
chrom       | chr1       | varchar(255)                    | Chromosome
chromStart  | 45161      | int(10) unsigned                | Start position in chrom (0 based)
chromEnd    | 45162      | int(10) unsigned                | End position in chrom (1 based)
name        | rs10399749 | varchar(255)                    | Reference SNP identifier from dbSnp
score       | 0          | int(10) unsigned                | Minor allele frequency normalized (0-500)
strand      | +          | enum('+', '-', '?')             | Which genomic strand contains the observed alleles
observed    | C/T        | varchar(255)                    | Observed string from genotype file
allele1     | C          | enum('A', 'C', 'G', 'T')        | This allele has been observed
homoCount1  | 56         | int(10) unsigned                | Count of individuals who are homozygous for allele1
allele2     |            | enum('C', 'G', 'T', 'none')     | This allele may not have been observed
homoCount2  | 0          | int(10) unsigned                | Count of individuals who are homozygous for allele2
heteroCount | 0          | int(10) unsigned                | Count of individuals who are heterozygous
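
The bin field above can be reproduced with a short function. The following is a minimal sketch of the standard UCSC binning scheme as I understand it (five levels, smallest bins 128 kb wide, each level 8 times larger); the constant and function names are my own, not taken from this page. It agrees with the bin values in the sample rows of this table.

```python
# Sketch of the standard UCSC binning scheme. Names are illustrative assumptions.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]  # first bin number of each level
BIN_FIRST_SHIFT = 17  # smallest bins cover 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level covers 8 times more


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the 0-based half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # The range straddles two bins at this level, so try the next coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is out of bounds (maximum is 512 Mb)")


if __name__ == "__main__":
    print(bin_from_range(45161, 45162))
```

Because nearby features share a bin, a range query can restrict itself to a handful of bin values before comparing chromStart and chromEnd.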


  Connected Tables and Joining Fields

        hg18.hapmapAllelesChimp.name (via hapmapSnpsYRI.name)
      hg18.hapmapAllelesMacaque.name (via hapmapSnpsYRI.name)
      hg18.hapmapAllelesSummary.name (via hapmapSnpsYRI.name)
      hg18.hapmapLdPhCeu.name (via hapmapSnpsYRI.name)
      hg18.hapmapLdPhChbJpt.name (via hapmapSnpsYRI.name)
      hg18.hapmapLdPhYri.name (via hapmapSnpsYRI.name)
      hg18.hapmapSnpsCEU.name (via hapmapSnpsYRI.name)
      hg18.hapmapSnpsCHB.name (via hapmapSnpsYRI.name)
      hg18.hapmapSnpsJPT.name (via hapmapSnpsYRI.name)
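
The joins listed above all go through the name field (the rs identifier). As a rough illustration, the sketch below builds two tiny in-memory SQLite tables that mimic a few columns of hapmapSnpsYRI and hapmapSnpsCEU and joins them on name. The YRI values come from the sample rows on this page; the CEU scores (111 and 222) are invented placeholders, and SQLite stands in for the real MySQL database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE hapmapSnpsYRI (name TEXT, chrom TEXT, chromStart INTEGER, score INTEGER);
    CREATE TABLE hapmapSnpsCEU (name TEXT, chrom TEXT, chromStart INTEGER, score INTEGER);
    """
)

# YRI rows are taken from the sample rows of this page.
conn.executemany(
    "INSERT INTO hapmapSnpsYRI VALUES (?, ?, ?, ?)",
    [("rs10399749", "chr1", 45161, 0), ("rs2949421", "chr1", 45412, 34)],
)
# CEU scores are made-up placeholders, used only to show the join.
conn.executemany(
    "INSERT INTO hapmapSnpsCEU VALUES (?, ?, ?, ?)",
    [("rs10399749", "chr1", 45161, 111), ("rs2949421", "chr1", 45412, 222)],
)

# Join the two population tables on the shared SNP identifier.
rows = conn.execute(
    """
    SELECT y.name, y.score AS yriScore, c.score AS ceuScore
    FROM hapmapSnpsYRI AS y
    JOIN hapmapSnpsCEU AS c ON c.name = y.name
    ORDER BY y.chromStart
    """
).fetchall()

for row in rows:
    print(row)
```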


  Sample Rows

 
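bin | chrom | chromStart | chromEnd | name       | score | strand | observed | allele1 | homoCount1 | allele2 | homoCount2 | heteroCount
585 | chr1  | 45161      | 45162    | rs10399749 | 0     | +      | C/T      | C       | 56         |         | 0          | 0
585 | chr1  | 45412      | 45413    | rs2949421  | 34    | +      | A/T      | A       | 0          | T       | 55         | 4
585 | chr1  | 46843      | 46844    | rs2691310  | 0     | +      | A/C      | C       | 60         |         | 0          | 0
585 | chr1  | 72433      | 72434    | rs4030303  | 0     | -      | C/T      | C       | 60         |         | 0          | 0
585 | chr1  | 72514      | 72515    | rs4030300  | 0     | -      | G/T      | G       | 60         |         | 0          | 0
585 | chr1  | 77688      | 77689    | rs3855952  | 0     | -      | C/T      | T       | 54         |         | 0          | 0
585 | chr1  | 81467      | 81468    | rs13328714 | 0     | +      | C/T      | C       | 59         |         | 0          | 0
586 | chr1  | 222076     | 222077   | rs11490937 | 0     | +      | A/G      | G       | 60         |         | 0          | 0
587 | chr1  | 391725     | 391726   | rs9439424  | 0     | +      | A/G      | A       | 60         |         | 0          | 0
589 | chr1  | 524445     | 524446   | rs6683466  | 0     | +      | C/G      | C       | 59         |         | 0          | 0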
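
The score column can be checked against the genotype counts. The schema only says the score is the minor allele frequency normalized to 0-500; the sketch below assumes this means the frequency multiplied by 1000 and rounded, which is my inference from the rs2949421 sample row (counts 0, 55 and 4, score 34) rather than something the page states. The function name is hypothetical.

```python
def maf_score(homo_count1: int, homo_count2: int, hetero_count: int) -> int:
    """Minor allele frequency scaled to 0-500 (assumed scaling: frequency * 1000, rounded)."""
    # Each homozygote carries two copies of its allele, each heterozygote one of each.
    allele1_count = 2 * homo_count1 + hetero_count
    allele2_count = 2 * homo_count2 + hetero_count
    total = allele1_count + allele2_count
    if total == 0:
        return 0
    minor_frequency = min(allele1_count, allele2_count) / total
    return round(minor_frequency * 1000)


if __name__ == "__main__":
    print(maf_score(0, 55, 4))   # counts from the rs2949421 sample row
    print(maf_score(56, 0, 0))   # counts from the rs10399749 sample row
```

A monomorphic SNP has no minor allele copies, which is why most of the sample rows carry a score of 0.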

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
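
In practice the note above means that chromStart must be incremented by one before it is compared with 1-based positions, while chromEnd can be used as is. A minimal sketch, with hypothetical helper names:

```python
def to_one_based(chrom_start: int, chrom_end: int) -> tuple[int, int]:
    """Convert a 0-based half-open interval to a 1-based closed interval."""
    return chrom_start + 1, chrom_end


def to_position_string(chrom: str, chrom_start: int, chrom_end: int) -> str:
    """Format a table row's coordinates as a 1-based position string."""
    start, end = to_one_based(chrom_start, chrom_end)
    return f"{chrom}:{start}-{end}"


if __name__ == "__main__":
    # First sample row: chromStart 45161, chromEnd 45162.
    print(to_position_string("chr1", 45161, 45162))
```

For a single-base SNP the converted start and end are equal, so the first sample row corresponds to position 45162 on chr1.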



  HapMap SNPs (hapmapSnps) Track Description

 

Description

The HapMap Project identified a set of approximately four million common SNPs, and genotyped these SNPs in four populations in Phase II of the project. In Phase III, it genotyped approximately 1.4 to 1.5 million SNPs in eleven populations. This track shows the combined data from Phases II and III. The intent is that this data can be used as a reference for future studies of human disease. This track displays the genotype counts and allele frequencies of those SNPs, and (when available) shows orthologous alleles from the chimp and macaque reference genome assemblies.

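The four million HapMap Phase II SNPs were genotyped on individuals from these four human populations:

  • Yoruba in Ibadan, Nigeria (YRI)
  • Japanese in Tokyo, Japan (JPT)
  • Han Chinese in Beijing, China (CHB)
  • CEPH (Utah residents with ancestry from northern and western Europe) (CEU)

Phase III expanded to eleven populations: the four above, plus the following:

  • African ancestry in Southwest USA (ASW)
  • Chinese in Metropolitan Denver, Colorado (CHD)
  • Gujarati Indians in Houston, Texas (GIH)
  • Luhya in Webuye, Kenya (LWK)
  • Mexican ancestry in Los Angeles, California (MEX)
  • Maasai in Kinyawa, Kenya (MKK)
  • Toscani in Italia (TSI)

Each of the populations is displayed in a separate subtrack.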

The HapMap assays provide biallelic results. Over 99.8% of HapMap SNPs are described as biallelic in dbSNP build 129; approximately 6,800 are described as more complex types (in-del, mixed, etc). 70% of the HapMap SNPs are transitions: 35% are A/G, 35% are C/T.

The orthologous alleles in chimp (panTro2) and macaque (rheMac2) were derived using liftOver.

No two HapMap SNPs occupy the same position. Aside from 430 SNPs from the pseudoautosomal region of chrX and chrY, no SNP is mapped to more than one location in the reference genome. No HapMap SNPs occur on "random" chromosomes (concatenations of unordered and unoriented contigs).

Display Conventions and Configuration

Note: calculation of heterozygosity has changed since the Phase II (rel22) version of this track. Observed heterozygosity is calculated as follows: each population's heterozygosity is computed as the proportion of heterozygous individuals in the population. The population heterozygosities are averaged to determine the overall observed heterozygosity. [For Phase II genotypes, expected heterozygosity was calculated as follows: the allele counts from all populations were summed (not normalized for population size) and used to determine overall major and minor allele frequencies. Assuming Hardy-Weinberg equilibrium, overall expected heterozygosity was calculated as two times the product of major and minor allele frequencies (see Modern Genetic Analysis, section 17-2).]
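
The two calculations described above can be sketched as follows. Each population is represented as a tuple of (homoCount1, homoCount2, heteroCount), matching the table fields; the function names and the decision to skip populations with no genotyped individuals are my own assumptions, not details given by the track description.

```python
from statistics import mean

# One population's genotype counts: (homoCount1, homoCount2, heteroCount)
Counts = tuple[int, int, int]


def observed_heterozygosity(populations: list[Counts]) -> float:
    """Average, over populations, of the proportion of heterozygous individuals."""
    per_population = [
        hetero / (homo1 + homo2 + hetero)
        for homo1, homo2, hetero in populations
        if homo1 + homo2 + hetero > 0  # assumption: skip populations without genotypes
    ]
    return mean(per_population) if per_population else 0.0


def expected_heterozygosity(populations: list[Counts]) -> float:
    """Phase II style: sum allele counts over populations, then 2pq under Hardy-Weinberg."""
    allele1 = sum(2 * homo1 + hetero for homo1, _, hetero in populations)
    allele2 = sum(2 * homo2 + hetero for _, homo2, hetero in populations)
    total = allele1 + allele2
    if total == 0:
        return 0.0
    p = allele1 / total
    q = allele2 / total
    return 2 * p * q


if __name__ == "__main__":
    # First tuple: YRI counts for rs2949421 from the sample rows.
    # Second tuple: a made-up population, for illustration only.
    populations = [(0, 55, 4), (30, 10, 20)]
    print(observed_heterozygosity(populations))
    print(expected_heterozygosity(populations))
```

Note that the observed value weights every population equally, whereas the summed allele counts in the expected value let larger populations dominate.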

The human SNPs are displayed in gray using a color gradient based on minor allele frequency. The higher the minor allele frequency, the darker the display. By definition, the maximum minor allele frequency is 50%. When zoomed to base level, the major allele is displayed for each population.

The orthologous alleles from chimp and macaque are displayed in brown using a color gradient based on quality score. Quality scores range from 0 to 100 representing low to high quality. For orthologous alleles, the higher the quality, the darker the display. Quality scores are not available for chimp chromosomes chr21 and chrY; these were set to 98, consistent with the panTro2 browser quality track.

Filters are provided for the data attributes described above. Additionally, a filter is provided for observed heterozygosity (average of all populations' observed heterozygosities). Filters are applied to all subtracks, even if a subtrack is not displayed.

Notes on orthologous allele filters:

  • If a SNP's major allele is different between populations, no overall major allele for human is determined, thus the "matches major human allele" and "matches minor human allele" filters for orthologous alleles do not apply.
  • If a SNP is monomorphic in all populations, the minor allele is not verified in the HapMap dataset. In these cases, the filter to match orthologous alleles to the minor human allele will yield no results.

Credits

This track is based on International HapMap Project release 27 data, provided by the HapMap Data Coordination Center.

References

HapMap Project

The International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007 Oct 18;449(7164):851-61.

The International HapMap Consortium. A haplotype map of the human genome. Nature. 2005 Oct 27;437(7063):1299-320.

The International HapMap Consortium. The International HapMap Project. Nature. 2003 Dec 18;426(6968):789-96.

HapMap Data Coordination Center

Thorisson GA, Smith AV, Krishnan L, Stein LD. The International HapMap Project Web site. Genome Res. 2005 Nov;15(11):1592-3.

A Sampling of HapMap Literature

Gibson J, Morton NE, Collins A. Extended tracts of homozygosity in outbred human populations. Hum Mol Genet. 2006 Mar 1; 15(5):789-95.

Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH, Carson AR, Chen W et al. Global variation in copy number in the human genome. Nature. 2006 Nov 23;444(7118):444-454.

Spielman RS, Bastone LA, Burdick JT, Morley M, Ewens WJ, Cheung VG. Common genetic variants account for differences in gene expression among ethnic groups. Nature Genet. 2007 Feb;39(2):226-31.

Tenesa A, Navarro P, Hayes BJ, Duffy DL, Clarke GM, Goddard ME, Visscher PM. Recent human effective population size estimated from linkage disequilibrium. Genome Res. 2007 Apr;17(4):520-6.

Voight BF, Kudaravalli S, Wen X, Pritchard JK. A Map of Recent Positive Selection in the Human Genome. PLoS Biol. 2006 Mar;4(3):e72.

Weir BS, Cardon LR, Anderson AD, Nielsen DM, Hill WG. Measures of human population structure show heterogeneity among genomic regions. Genome Res. 2005 Nov;15(11):1468-76.

Data Source

The genotypes_chr*_*_r27_nr.b36_fwd.txt.gz files from
http://ftp.hapmap.org/genotypes/2009-02_phaseII+III/forward/ were processed to make this track.




Genetics In Process 17-2: The Hardy-Weinberg equilibrium

If the frequency of allele A is p in both the sperm and the eggs and the frequency of allele a is q = 1 − p, then the consequences of random unions of sperm and eggs are shown in the adjoining figure. The probability that both the sperm and the egg will carry A is p × p = p², so this will be the frequency of A/A homozygotes in the next generation. In like manner, the chance of heterozygotes A/a will be (p × q) + (q × p) = 2pq, and the chance of homozygotes a/a will be q × q = q². The three genotypes, after a generation of random mating, will be in the frequencies p² : 2pq : q². As the figure shows, the allelic frequency of A has not changed and is still p. Therefore, in the second generation, the frequencies of the three genotypes will again be p² : 2pq : q², and so on, forever.

[Figure: ch17fu2.jpg]

The Hardy-Weinberg equilibrium frequencies that result from random mating. The frequencies of A and a among both eggs and sperm are p and q (= 1 − p), respectively. The total frequencies of the zygote genotypes are p² for A/A, 2pq for A/a, and q² for a/a. The frequency of the allele A in the zygotes is the frequency of A/A plus half the frequency of A/a, or p² + pq = p(p + q) = p.
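
The argument above can be checked numerically. The sketch below computes the genotype frequencies after one generation of random mating and recovers the allele frequency from them; the function names are hypothetical and the starting frequency of 0.3 is an arbitrary example value.

```python
def genotype_frequencies(p: float) -> tuple[float, float, float]:
    """Frequencies of A/A, A/a and a/a after random mating, given the frequency p of allele A."""
    q = 1 - p
    return p * p, 2 * p * q, q * q


def allele_frequency(freq_AA: float, freq_Aa: float) -> float:
    """Frequency of A among zygotes: all of A/A plus half of A/a."""
    return freq_AA + freq_Aa / 2


if __name__ == "__main__":
    p = 0.3  # arbitrary example value
    for generation in range(1, 4):
        freq_AA, freq_Aa, freq_aa = genotype_frequencies(p)
        p = allele_frequency(freq_AA, freq_Aa)
        print(generation, freq_AA, freq_Aa, freq_aa, p)
```

Up to floating-point rounding, p stays the same in every generation, which is the equilibrium the text describes.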
