
Bioinformatics home

 
 
 


 
 

R and statistics  

2013-03-08 01:30:35 | Category: Bioinformatics programming

Install the RcmdrPlugin.IPSUR package
  install.packages("RcmdrPlugin.IPSUR", repos="http://cran.r-project.org", dep=TRUE)

Load the RcmdrPlugin.IPSUR package
    library(RcmdrPlugin.IPSUR)
   
Sample frequency    Background frequency
1600/3678 (43.5%)    5965/15940 (37.4%)
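x: number of white balls drawn: 1600
m: number of white balls in the urn: 5965
n: number of black balls in the urn: 15940-5965=9975
k: number of balls drawn from the urn: 3678

Pr <- dhyper(1600, m = 5965, n = 9975, k = 3678)  # probability of drawing exactly 1600 white balls
Background: m/(m+n)    Sample: x/k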
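
dhyper() gives the probability of exactly 1600 hits. For an enrichment question the upper tail P(X >= 1600) is usually what is wanted. A minimal sketch with phyper(), using the same counts as above; the variable names p_point and p_enrich are mine, not from the original notes:

```r
# Counts taken from the sample/background frequencies above
x <- 1600            # white balls drawn (hits in the sample)
m <- 5965            # white balls in the urn (hits in the background)
n <- 15940 - 5965    # black balls in the urn
k <- 3678            # balls drawn (sample size)

# Point probability of exactly x hits
p_point <- dhyper(x, m, n, k)

# Upper tail P(X >= x); x - 1 is used because lower.tail = FALSE gives P(X > q)
p_enrich <- phyper(x - 1, m, n, k, lower.tail = FALSE)

p_point
p_enrich
```

The x - 1 matters: phyper(x, ..., lower.tail = FALSE) would leave out the case X = 1600 itself.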


uu mm yy
35    48    70
45    68    80
55    68    40
25    88    40
xie <- read.table("D:/data.txt", header=TRUE, sep="", na.strings="NA", dec=".", strip.white=TRUE);
showData(xie, placement='-20+200', font=getRcmdr('logFont'), maxwidth=80,
  maxheight=30);
 
summary(xie);
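
If D:/data.txt is not at hand, the four rows listed above can be typed in directly. A small sketch, assuming the file holds nothing beyond those four rows:

```r
# Same values as the uu / mm / yy table above, entered by hand
xie <- data.frame(
  uu = c(35, 45, 55, 25),
  mm = c(48, 68, 68, 88),
  yy = c(70, 80, 40, 40)
)

str(xie)      # 4 observations of 3 numeric variables
summary(xie)  # per-column minimum, quartiles, mean, maximum
```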
correlation test
cor.test(xie$mm, xie$uu, alternative="two.sided", method="pearson")

t-test
t.test(xie$mm, alternative='two.sided', mu=0.0, conf.level=.95)


principal component analysis
.PC <- princomp(~mm+uu+yy, cor=TRUE, data=xie)
unclass(loadings(.PC))  # component loadings
.PC$sd^2  # component variances
summary(.PC) # proportions of variance
screeplot(.PC)
xie$PC1 <- .PC$scores[,1]
xie$PC2 <- .PC$scores[,2]

K-means:
Edit the data
fix(xie)

.cluster <-  KMeans(model.matrix(~-1 + mm + uu + yy, xie), centers = 2, iter.max = 10, num.seeds = 10)
.cluster$size # Cluster Sizes
.cluster$centers # Cluster Centroids
.cluster$withinss # Within Cluster Sum of Squares
.cluster$tot.withinss # Total Within Sum of Squares
.cluster$betweenss # Between Cluster Sum of Squares
biplot(princomp(model.matrix(~-1 + mm + uu + yy, xie)), xlabs = as.character(.cluster$cluster))
xie$KMeans <- assignCluster(model.matrix(~-1 + mm + uu + yy, xie), xie, .cluster$cluster)
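
KMeans() and assignCluster() come with Rcmdr. Outside Rcmdr, roughly the same thing can be done with kmeans() from base R. A sketch, not the Rcmdr implementation: nstart = 10 stands in for num.seeds = 10, the data frame is rebuilt from the four rows above so the block runs on its own, and set.seed(1) is only there to make the run repeatable:

```r
# Four-row example data from the table above
xie <- data.frame(
  uu = c(35, 45, 55, 25),
  mm = c(48, 68, 68, 88),
  yy = c(70, 80, 40, 40)
)

set.seed(1)  # arbitrary seed, only for reproducibility
km <- kmeans(xie[, c("mm", "uu", "yy")], centers = 2, iter.max = 10, nstart = 10)

km$size          # cluster sizes
km$centers       # cluster centroids
km$withinss      # within-cluster sum of squares
km$tot.withinss  # total within-cluster sum of squares
km$betweenss     # between-cluster sum of squares

# Store the cluster labels as a factor column
xie$KMeans <- factor(km$cluster)
```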


hierarchical clustering
HClust.1 <- hclust(dist(model.matrix(~-1 + mm+uu+yy, xie)) , method= "complete")
plot(HClust.1, main= "Cluster Dendrogram for Solution HClust.1", xlab= "Observation Number in Data Set xie", sub="Method=complete; Distance=euclidian")


stripchart(xie$mm, method="stack", xlab="mm")
stripchart(xie$mm, method="jitter", xlab="mm")
xyplot(yy ~ mm, type="p", pch=16, auto.key=list(border=TRUE),
  par.settings=simpleTheme(pch=16), scales=list(x=list(relation='same'),
  y=list(relation='same')), data=xie)
 
 Hist(xie$uu, scale="frequency", breaks="Sturges", col="darkgray")
 plot(xie$uu, type="h")
 
linear regression
RegModel.1 <- lm(mm~uu, data=xie)
summary(RegModel.1)


generalized linear model
# group is not among the columns shown above (uu, mm, yy); it must exist in xie before this runs
GLM.3 <- glm(group ~ mm/uu + yy, family=gaussian(identity), data=xie)

linear model
LinearModel.5 <- lm(group ~ mm + uu/yy, data = xie)
summary(LinearModel.5)

normal distribution
quantile:
qnorm(c(0.95), mean=0, sd=1, lower.tail=TRUE)
[1] 1.644854

probability:
pnorm(c(1.68), mean=0, sd=1, lower.tail=FALSE)
[1] 0.04647866
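
The q* and p* functions are inverses of each other, and lower.tail=FALSE just returns 1 minus the lower tail. A quick check with the normal distribution (the same holds for the t, chi-square, F and other continuous families below); the variable names are mine:

```r
# Quantile -> probability round trip
q <- qnorm(0.95, mean = 0, sd = 1)   # 95th percentile of N(0, 1)
p <- pnorm(q, mean = 0, sd = 1)      # back to the probability
all.equal(p, 0.95)

# Upper tail equals 1 minus the lower tail
upper <- pnorm(1.68, lower.tail = FALSE)
lower <- pnorm(1.68, lower.tail = TRUE)
all.equal(upper, 1 - lower)
```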

T distribution
qt(c(0.95), df=6, lower.tail=TRUE)
pt(c(56), df=9, lower.tail=TRUE)

chi-square
qchisq(c(0.95), df=9, lower.tail=TRUE)
[1] 16.91898

pchisq(c(89), df=5, lower.tail=TRUE)
[1] 1

F distribution
qf(c(0.95), df1=6, df2=5, lower.tail=TRUE)
pf(c(56), df1=9, df2=6, lower.tail=TRUE)

binomial distribution
qbinom(c(0.6), size=5, prob=0.9, lower.tail=TRUE)
[1] 5

gamma distribution
pgamma(c(8), shape=2, scale=1, lower.tail=TRUE)
[1] 0.9969808


poisson distribution
qpois(c(0.9), lambda=8, lower.tail=TRUE)
[1] 12

.Table <- data.frame(Pr=dpois(1:19 ,  lambda = 8))
rownames(.Table) <- 1:19
.Table

ppois(c(8), lambda=10, lower.tail=TRUE)
[1] 0.3328197