
Bioinformatics home

 
 
 

Multi-threading with Perl

2014-04-08 01:38:25 | Category: Bioinformatics programming


cited from http://chicken.genouest.org/category/perl/ 


Multi-threading, forking… At first, it seems complicated. Actually, it can be quite simple! First, one thing you should know: your Perl installation may not support threads (this option is set when Perl is compiled), so check it:


perl -V

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.6.30.5-dsa-amd64, archname=x86_64-linux-gnu-thread-multi
    uname='linux brahms 2.6.30.5-dsa-amd64 #1 smp mon aug 17 02:18:43 cest 2009 x86_64 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.10 -Darchlib=/usr/lib/perl/5.10 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.10.0 -Dsitearch=/usr/local/lib/perl/5.10.0 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.10.0 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef

Here, look at the following option: useithreads=define. If your Perl installation doesn’t have this option, I recommend installing the forks module from CPAN. During installation, the CPAN installer will ask you:


It appears your perl was not built with native ithreads.
Would you like to create references to forks, such that
using 'use threads' and 'use threads::shared' will quietly
load forks and forks::shared? [no]

I recommend choosing yes at this point: this lets you develop programs with the use threads directive whether or not this option was set when Perl was compiled. (Note: be aware that the forks module is not as fast as native threads.)
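If you prefer not to read through the full perl -V output, the same setting can be queried from a script. The following is a minimal sketch that relies only on the core Config module; the helper name has_ithreads is made up for this example.

```perl
use strict;
use warnings;
use Config;    # core module exposing the build-time configuration of perl

# Hypothetical helper: true when this perl was built with ithreads.
# $Config{useithreads} holds 'define' in that case, and is undefined
# (or not 'define') otherwise.
sub has_ithreads {
    return defined $Config{useithreads} && $Config{useithreads} eq 'define';
}

print has_ithreads()
    ? "This perl supports native threads\n"
    : "No native threads: consider installing the forks module\n";
```

A script could run this check at startup and stop with a clear message before any thread is created.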


Then, how do we develop with threads? You can find a lot of tutorials with Google. But there is an issue with these examples: they are all based on loops with a fixed number of iterations (do something 10 times and thread it). Okay, that’s nice for beginners. But in the real world, that’s not how things happen! Most of the time, you have a lot of basic operations to perform (let’s say about 500) and you don’t want to run them all at the same time (as your system may go down). How do you develop a system-friendly program that launches only a limited number of threads and still performs all the tasks? That’s the point! And, again, trust me, I didn’t find any tutorial for this "real-world" case.

Here I propose a program that may help you achieve this. The algorithm is based on a while loop that compares the number of tasks to perform with the number of threads already started (running or done), and starts a new thread only when fewer than the maximum are running. It can easily be adapted to any case. Let’s say that you have 100 nucleotide sequences to analyze, stored in a hash table. You can get the number of entries in your hash (your $nb_compute) and enter the loop. Now, let’s stop talking and have a look at the code:


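#!/opt/local/bin/perl -w

use threads;
use strict;
use warnings;

# Placeholder data passed to each thread (not used in this example).
my @a = ();
my @b = ();

sub sleeping_sub ( $ $ $ );

print "Starting main program\n";

my $nb_process = 10;    # maximum number of threads running at the same time
my $nb_compute = 20;    # total number of tasks to perform
my $i = 0;              # loop counter
my @running = ();       # threads currently running
my @Threads;            # every thread started so far (running or done)

# Main loop: runs until one thread has been started for each task.
while (scalar @Threads < $nb_compute) {
    @running = threads->list(threads::running);
    print "LOOP $i\n";
    print "  - BEGIN LOOP >> NB running threads = ".(scalar @running)."\n";

    # Start a new thread only if a slot is free.
    if (scalar @running < $nb_process) {
        my $thread = threads->new( sub { sleeping_sub($i, \@a, \@b) });
        push (@Threads, $thread);
        my $tid = $thread->tid;
        print "  - starting thread $tid\n";
    }
    @running = threads->list(threads::running);
    print "  - AFTER STARTING >> NB running Threads = ".(scalar @running)."\n";

    # Report the running threads and join the finished ones.
    foreach my $thr (@Threads) {
        if ($thr->is_running()) {
            my $tid = $thr->tid;
            print "  - Thread $tid running\n";
        }
        elsif ($thr->is_joinable()) {
            my $tid = $thr->tid;
            $thr->join;
            print "  - Results for thread $tid:\n";
            print "  - Thread $tid has been joined\n";
        }
    }

    @running = threads->list(threads::running);
    print "  - END LOOP >> NB Threads = ".(scalar @running)."\n";
    $i++;

    # When every slot is busy, pause instead of spinning at full CPU speed.
    sleep(1) if (scalar @running >= $nb_process);
}

print "\nJOINING pending threads\n";

# threads->list() returns every thread that is neither joined nor detached.
# join() blocks until the thread has finished, so no thread is left unjoined
# (polling the running list here could miss a thread that finishes between
# the is_joinable() check and the next update of the list).
foreach my $thr (threads->list()) {
    $thr->join;
}

print "NB started threads = ".(scalar @Threads)."\n";
print "End of main program\n";

# Dummy task: the three parameters are ignored, the thread just sleeps.
sub sleeping_sub ( $ $ $ ) {
    sleep(4);
}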


During the main loop, the program starts a new thread whenever the number of running threads is lower than the maximum number of threads. Still during this loop, it joins the threads that have finished. At the end of the main loop, some threads may still be running, so a second loop joins the remaining threads. Note that the parameters of the sub are not used (it’s just for the example), but you can pass parameters to your own sub and get its results back, too. For more details about shared data, I recommend reading the threads and threads::shared perldoc. I hope it will help.
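To make the point about parameters and results concrete, here is a minimal sketch of the same bounded-thread idea applied to a hash of sequences. It assumes a perl built with useithreads. The names %sequences, $max_threads, gc_content and run_pool, as well as the sample sequences, are invented for this illustration and do not come from the program above. Each thread receives a sequence ID and its sequence, and returns the ID with its GC fraction through join.

```perl
use strict;
use warnings;
use threads;

# Hypothetical input: sequence IDs mapped to nucleotide strings.
my %sequences = (
    seq1 => 'ATGCGC',
    seq2 => 'ATATAT',
    seq3 => 'GGGCCC',
    seq4 => 'ATGCAT',
);
my $max_threads = 2;    # assumed limit on simultaneously running threads

# Task run in each thread: returns the ID and the GC fraction of a sequence.
sub gc_content {
    my ($id, $seq) = @_;
    my $gc = ($seq =~ tr/GCgc//);    # tr returns the number of matches
    return ($id, length($seq) ? $gc / length($seq) : 0);
}

# Runs gc_content on every sequence with at most $limit threads at a time.
# Returns a hash reference mapping each ID to its GC fraction.
sub run_pool {
    my ($seqs, $limit) = @_;
    my @todo = sort keys %$seqs;
    my %gc;

    while (@todo) {
        # Start threads while tasks remain and a slot is free.
        while (@todo && scalar(threads->list(threads::running)) < $limit) {
            my $id = shift @todo;
            # List context must be requested explicitly, otherwise the
            # return values are discarded when create() is called in
            # void context.
            threads->create({ context => 'list' },
                            \&gc_content, $id, $seqs->{$id});
        }
        # Collect the results of the threads that have finished.
        for my $thr (threads->list(threads::joinable)) {
            my ($id, $value) = $thr->join;
            $gc{$id} = $value;
        }
        # Short pause (50 ms) so the loop does not spin at full speed.
        select(undef, undef, undef, 0.05);
    }

    # All tasks are started: wait for the threads not yet joined.
    for my $thr (threads->list()) {
        my ($id, $value) = $thr->join;
        $gc{$id} = $value;
    }
    return \%gc;
}

my $result = run_pool(\%sequences, $max_threads);
printf "%s\t%.2f\n", $_, $result->{$_} for sort keys %$result;
```

Returning the ID together with the value keeps each result tied to its sequence, so the main program needs no shared variables. For large inputs, sharing data through threads::shared may be a better fit than copying each sequence into its thread.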
