注册 登录  
 加关注
   显示下一条  |  关闭
温馨提示!由于新浪微博认证机制调整,您的新浪微博帐号绑定已过期,请重新绑定!立即重新绑定新浪微博》  |  关闭

Bioinformatics home

 
 
 

日志

 
 

Neighbor-joining in Python: doing the plot  

2010-12-13 23:53:46|  分类: 默认分类 |  标签: |举报 |字号 订阅

  下载LOFTER 我的照片书  |
Neighbor-joining in Python: doing the plot

As discussed last time, I'm using Python lists to hold the information for each node on a phylogenetic tree. This seems as if it will have significant advantages, especially when it comes to tree "surgery." I'll leave that for another post. What I want to do today is actually draw a tree rooted at each of the internal nodes. Here is our example:
example_tree = { '0':['A','B','1'],
'1':['C','0','3'],
'2':['E','D','3'],
'3':['F','1','2'] }

Code to traverse the tree was shown last time (here). I've modified it slightly to use a dictionary. Previously we had this representation for each node and its parent (in the traversal from a particular root):
A:0, C:1, B:0, E:2, D:2, F:3, 1:root, 0:1, 3:1, 2:3

Now these keys and values are in a dict. The results of the traversals from each internal node as root are:
A:0, C:1, B:0, E:2, D:2, F:3, 1:1, 0:1, 3:1, 2:3
A:0, C:1, B:0, E:2, D:2, F:3, 1:0, 0:0, 3:1, 2:3
A:0, C:1, B:0, E:2, D:2, F:3, 1:3, 0:1, 3:3, 2:3
A:0, C:1, B:0, E:2, D:2, F:3, 1:3, 0:1, 3:2, 2:2

The root node is now marked by having itself as its "parent."
The drawing code works by building a
repr
(representation) for each node from the farthest tips working up to the root. The method was inspired by PyCogent's
ancestors()
function, which returns a list of ancestors (naturally enough). The output with debug enabled looks like this:
ancestors
E ['2', '3', '1', '0']
D ['2', '3', '1', '0']
F ['3', '1', '0']
2 ['3', '1', '0']
C ['1', '0']
3 ['1', '0']
A ['0']
B ['0']
1 ['0']
0 []

I simply sort on the length (largest first) and work in that order. So E's
repr
is just 'E'. When we process the first i_node (2), it's
repr
will be
(E,D)2
. If we're using branch lengths, it will be:
(E:2.25,D:2.75)2
. The distance data is in a separate dictionary:
dist = { ('A','0'):1.0,
('B','0'):4.0,
('C','1'):2.0,
('D','2'):2.75,
('E','2'):2.25,
('F','3'):4.75,
('0','1'):1.0,
('1','3'):1.25,
('2','3'):0.75 }

Here's the resulting Newick tree:
(A:1.00,B:4.00,(C:2.00,(F:4.75,(E:2.25,D:2.75)2:0.75)3:1.25)1:1.00)0;

One of the plots is shown at the top, a second one is below. Although I had it working before, be advised there's a bug somewhere that blows up R when I try to plot the tree as 'unrooted.' So now my to_do list includes the challenge of fixing that. [UPDATE: It seems to only happen when calling R from Python using RPy---probably related to the use of a Python keyword type as an argument to the R function. ] Zipped project files are here. And a final note, I've included the upgma code from before. However, we processed the nodes with a different algorithm there, so the drawing code is different, and I haven't tried to convert it to the new method yet.
  评论这张
 
阅读(843)| 评论(0)
推荐 转载

历史上的今天

在LOFTER的更多文章

评论

<#--最新日志,群博日志--> <#--推荐日志--> <#--引用记录--> <#--博主推荐--> <#--随机阅读--> <#--首页推荐--> <#--历史上的今天--> <#--被推荐日志--> <#--上一篇,下一篇--> <#-- 热度 --> <#-- 网易新闻广告 --> <#--右边模块结构--> <#--评论模块结构--> <#--引用模块结构--> <#--博主发起的投票-->
 
 
 
 
 
 
 
 
 
 
 
 
 
 

页脚

网易公司版权所有 ©1997-2017