Installing A Minimal UCSC Genome Browser Mirror In Ubuntu 10.04 64 Bits  

2011-07-27 10:28:16 | Category: Bioinformatics programming

http://enotacoes.wordpress.com/2009/09/03/installing-a-minimal-ucsc-mirror-in-ubuntu-jaunty-64-bits/


Filed under: bioinformatics, ubuntu linux | Tags: genome browser, minimal installation, mm9, ucsc mirror | noborujs @ 2:59 pm

updated Dec 24, 2010 for Ubuntu server 10.04

Notes

  • These instructions should work on 64-bit Ubuntu Server 8.04, 10.04 and 10.10, and on 64-bit Ubuntu Desktop 9.04. They also work on 32-bit systems, but you must compile the binaries yourself, because the ones provided by UCSC are 64-bit.
  • Some people seem to have a problem with the MySQL socket. I installed the browser on 10.04 and 10.10 from scratch using this tutorial and did not get the error, so I can’t be of much help in solving it. As a suggestion, write a PHP or CGI script that connects to the MySQL database and debug from there by looking at the log files.

Goal and desired features

  1. install a minimal UCSC genome browser for a specific genome
  2. browser should not be at the root of the web dir (I wanted something like: www.example.edu/genomebrowser)
  3. restrict access only to the browser with .htaccess
  4. load custom tracks to be displayed permanently

Assuming:

  1. MySQL is up and running on the default port (3306)
  2. MySQL datadir is at /home/mysql (replace this with the default /var/lib/mysql if necessary)
  3. Apache’s DocumentRoot is /var/www

Genome browser requirements

  1. there is a /gbdb directory in /
  2. a directory with html files, ex. /var/www/genomebrowser
  3. a directory with cgi-bin files, ex. /var/www/genomebrowser/cgi-bin
  4. a directory for trash that is in the same directory as cgi-bin (ex. /var/www/genomebrowser/trash)
  5. a configuration file at /var/www/genomebrowser/cgi-bin/hg.conf owned by www-data
  6. XBitHack on in /etc/apache2/httpd.conf
  7. Options +Includes in /etc/apache2/httpd.conf for the directory where html files are located
  8. system has libssl.so.6 and libcrypto.so.6
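
The requirements above can be checked at any point during the installation with a small preflight script. This is a minimal sketch that assumes the example paths used in this post; the function name check_requirements and the variables GB_ROOT, GBDB_DIR and LIB_DIR are illustrative and are not part of the UCSC tools. It does not check the Apache settings.

```shell
#!/bin/bash
# Preflight sketch: report which genome browser requirements are missing.
# GB_ROOT, GBDB_DIR, LIB_DIR and check_requirements are illustrative names.
GB_ROOT="${GB_ROOT:-/var/www/genomebrowser}"
GBDB_DIR="${GBDB_DIR:-/gbdb}"
LIB_DIR="${LIB_DIR:-/usr/lib}"

check_requirements() {
    local missing=0 path
    # Directories the browser expects (-d follows symlinks, so a /gbdb link is fine).
    for path in "$GBDB_DIR" "$GB_ROOT" "$GB_ROOT/cgi-bin" "$GB_ROOT/trash"; do
        if [ -d "$path" ]; then
            echo "ok       $path"
        else
            echo "MISSING  $path"
            missing=$((missing + 1))
        fi
    done
    # Files: the configuration file and the two shared libraries.
    for path in "$GB_ROOT/cgi-bin/hg.conf" "$LIB_DIR/libssl.so.6" "$LIB_DIR/libcrypto.so.6"; do
        if [ -e "$path" ]; then
            echo "ok       $path"
        else
            echo "MISSING  $path"
            missing=$((missing + 1))
        fi
    done
    echo "$missing item(s) missing"
    # Succeed only when nothing is missing.
    [ "$missing" -eq 0 ]
}

check_requirements || echo "Fix the items marked MISSING before continuing."
```

Run it again after each part of the installation; the list of MISSING items should shrink as you go.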

Setting up a dedicated MySQL user

Enter the commands below in your mysql> shell, editing HOSTNAME and PASSWORD.

create user 'hguser'@'HOSTNAME' identified by 'PASSWORD';
flush privileges;

Test whether you can connect to the MySQL database as the user above. You may need to provide the fully qualified hostname, depending on how your hostname is defined. I had to comment out bind-address = 127.0.0.1 in /etc/mysql/my.cnf.

Installation: Part 1. Browser engine

Provide required libraries

apt-get install libssl0.9.8
ln -s /usr/lib/libssl.so.0.9.8 /usr/lib/libssl.so.6
ln -s /usr/lib/libcrypto.so.0.9.8 /usr/lib/libcrypto.so.6

Create a base dir for the genome browser and download html files

mkdir /var/www/genomebrowser
rsync -avzP rsync://hgdownload.cse.ucsc.edu/htdocs/ /var/www/genomebrowser/

Create a customized dir for cgi-bin (not the root cgi-bin) and download cgi-bin files

mkdir -p /var/www/genomebrowser/cgi-bin
rsync -avzP rsync://hgdownload.cse.ucsc.edu/cgi-bin/ /var/www/genomebrowser/cgi-bin/
chown -R www-data:www-data /var/www/genomebrowser/cgi-bin

Add this to /etc/apache2/httpd.conf

XBitHack on
<Directory /var/www/genomebrowser>
   AllowOverride AuthConfig
   Options +Includes
</Directory>

# the ScriptAlias directive is crucial
ScriptAlias /genomebrowser/cgi-bin /var/www/genomebrowser/cgi-bin
<Directory "/var/www/genomebrowser/cgi-bin">
   AllowOverride None
   Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
   Order allow,deny
   Allow from all
   AddHandler cgi-script cgi pl
</Directory>
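
The AllowOverride AuthConfig line is what lets a .htaccess file restrict access to the browser (goal 3). The sketch below writes a basic-authentication .htaccess file. The helper name write_htaccess, the realm text "Genome Browser" and the password file path are assumptions made for illustration, so adapt them to your site.

```shell
#!/bin/bash
# Sketch: write a basic-auth .htaccess file. write_htaccess is an illustrative helper.
write_htaccess() {
    # $1 = directory to protect, $2 = password file created with htpasswd
    cat > "$1/.htaccess" <<EOF
AuthType Basic
AuthName "Genome Browser"
AuthUserFile $2
Require valid-user
EOF
}

# Demo in a scratch directory; on the server, $1 would be /var/www/genomebrowser.
demo_dir=$(mktemp -d)
write_htaccess "$demo_dir" /etc/apache2/genomebrowser.htpasswd
cat "$demo_dir/.htaccess"
```

The password file itself can be created with htpasswd -c /etc/apache2/genomebrowser.htpasswd USERNAME (htpasswd ships with the apache2-utils package on Ubuntu). Keep that file outside the web directories.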

Enable the include module (required by XBitHack)

ln -s /etc/apache2/mods-available/include.load /etc/apache2/mods-enabled/

Restart Apache

/etc/init.d/apache2 restart

Create file /var/www/genomebrowser/cgi-bin/hg.conf

# Configuration file for the UCSC Human Genome server
#
# the format is in the form of name/value pairs, written as 'name=value'
#
# note that there is no space between the name and its value. Also, no blank lines should be in this file.
#
#--------------------------------------------------------------#
#
# db.host is the name of the MySQL host to connect to
db.host=YOURHOST
#
# db.user is the username used when connecting to the host
db.user=hguser
#
# this is the password to use with the above hostname
db.password=PASSWORD
#
db.trackDb=trackDb
#
# central.host is the name of the host of the central MySQL
# database where stuff common to all versions of the genome
# and the user database is stored.
central.db=hgcentral
central.host=YOURHOST
central.user=hguser
central.password=PASSWORD
central.domain=
#
backupcentral.db=hgcentral
backupcentral.host=YOURHOST
backupcentral.user=hguser
backupcentral.password=PASSWORD
backupcentral.domain=
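
Because the same host and password appear several times in hg.conf, the settings can also be generated from two values to avoid typos. This is a sketch only: write_hg_conf is a hypothetical helper, it omits the comment lines, and it assumes that the db, central and backupcentral entries all use the same host and the hguser account, as in the file above.

```shell
#!/bin/bash
# Sketch: generate the hg.conf settings from two values.
# write_hg_conf is an illustrative helper, not a UCSC utility.
write_hg_conf() {
    # $1 = output file, $2 = MySQL host, $3 = password of hguser
    local out="$1" host="$2" pass="$3" prefix
    {
        echo "db.host=$host"
        echo "db.user=hguser"
        echo "db.password=$pass"
        echo "db.trackDb=trackDb"
        # central and backupcentral share the same five settings.
        for prefix in central backupcentral; do
            echo "$prefix.db=hgcentral"
            echo "$prefix.host=$host"
            echo "$prefix.user=hguser"
            echo "$prefix.password=$pass"
            echo "$prefix.domain="
        done
    } > "$out"
}

# Write a draft to a scratch file and review it before copying it into cgi-bin.
draft=$(mktemp)
write_hg_conf "$draft" YOURHOST PASSWORD
cat "$draft"
```

The output contains no blank lines and no spaces around the equals signs, which matches the format rules stated in the file header.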

Give ownership to www-data

sudo chown www-data /var/www/genomebrowser/cgi-bin/hg.conf

Create other dirs required by the browser

rm /var/www/genomebrowser/trash
mkdir /var/www/genomebrowser/trash
chown www-data:www-data /var/www/genomebrowser/trash

The following links are necessary because the browser assumes it is installed in the root dir of the server. Alternatively, you can set up the equivalent mappings in /etc/apache2/httpd.conf.

ln -s /var/www/genomebrowser/images /var/www/images
ln -s /var/www/genomebrowser /var/www/html

Provide the JavaScript and style files required by the binaries. DO NOT copy js and style to /usr/local/apache/htdocs; link them instead, otherwise the browser will display tracks out of alignment!

mkdir -p /usr/local/apache/htdocs/
ln -s /var/www/genomebrowser/js/ /usr/local/apache/htdocs/js
ln -s /var/www/genomebrowser/style/ /usr/local/apache/htdocs/style

Test the installation by pointing your browser to http://localhost/genomebrowser

Setup Crontab To Clean Trash

Create a script at /etc/cron.daily (no . or _ allowed in the file name) with the following contents:

#!/bin/bash
find /var/www/genomebrowser/trash/ \! \( -regex "/var/www/genomebrowser/trash/ct/.*" \
      -or -regex "/var/www/genomebrowser/trash/hgSs/.*" \) -type f -amin +5040 -exec rm -f {} \;
find /var/www/genomebrowser/trash/    \( -regex "/var/www/genomebrowser/trash/ct/.*" \
      -or -regex "/var/www/genomebrowser/trash/hgSs/.*" \) -type f -amin +10080 -exec rm -f {} \;

The first command removes files outside trash/ct and trash/hgSs that have not been accessed for more than 5040 minutes (3.5 days); the second removes files inside those two directories after 10080 minutes (7 days). You can change the cleanup schedule by changing +5040 and +10080 to the number of minutes that you want.
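
Since find -amin counts in minutes, a tiny helper makes it easier to pick new thresholds. The function names below are made up for this example.

```shell
#!/bin/bash
# Sketch: convert a retention period into the minutes expected by find -amin.
# days_to_minutes and hours_to_minutes are illustrative helper names.
days_to_minutes()  { echo $(( $1 * 24 * 60 )); }
hours_to_minutes() { echo $(( $1 * 60 )); }

days_to_minutes 7      # prints 10080, the threshold used for trash/ct and trash/hgSs
hours_to_minutes 84    # prints 5040 (3.5 days), the threshold used for everything else
```

Shell arithmetic is integer-only, so fractional days such as 3.5 have to be given in hours.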

Installation: Part 2. MySQL tables required for functionality

For a minimal installation, the genome browser requires the following databases:

  • hgcentral
  • hgFixed

Download hgcentral to your MySQL directory

wget http://hgdownload.cse.ucsc.edu/admin/hgcentral.sql
mysql -youraccountoptions -e "create database hgcentral"
mysql -youraccountoptions hgcentral < hgcentral.sql
mysql -youraccountoptions -e "grant all privileges on hgcentral.* to 'hguser'@'HOSTNAME'"

Create a dummy hgFixed

mysql -youraccountoptions -e "create database hgFixed"
mysql -youraccountoptions -e "grant select on hgFixed.* to 'hguser'@'HOSTNAME'"

Your UCSC Genome Browser should be working (no data to display, though)

Installation: Part 3. Adding one genome

Shut down your MySQL database

kill -15 `ps aux | grep mysqld | grep 3306 | awk '{print $2}'`

Let’s use mm9 as an example

mkdir /home/mysql/mm9

The database of interest requires the following tables, at least:

  • chromInfo
  • cytoBandIdeo <- not required, but shows the chromosome at the top
  • extFile <- not strictly required for minimal functionality, but necessary for zooming in
  • grp
  • hgFindSpec
  • trackDb

Download these tables

rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/chromInfo.MYD /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/chromInfo.MYI /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/chromInfo.frm /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/cytoBandIdeo.MYD /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/cytoBandIdeo.MYI /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/cytoBandIdeo.frm /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/grp.MYD /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/grp.MYI /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/grp.frm /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/hgFindSpec.MYD /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/hgFindSpec.MYI /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/hgFindSpec.frm /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/trackDb.MYD /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/trackDb.MYI /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/trackDb.frm /home/mysql/mm9
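
Each MyISAM table consists of three files (.MYD, .MYI and .frm), so the downloads can also be written as a loop. The sketch below prints the rsync commands by default and only runs them when called with run; fetch_tables, SRC, DEST and TABLES are names chosen for this example.

```shell
#!/bin/bash
# Sketch: one rsync per table file, printed (dry run) or executed.
# fetch_tables, SRC, DEST and TABLES are illustrative names.
SRC="rsync://hgdownload.cse.ucsc.edu/mysql/mm9"
DEST="/home/mysql/mm9"
TABLES="chromInfo cytoBandIdeo grp hgFindSpec trackDb"

fetch_tables() {
    # $1 = "run" to execute the downloads; anything else only prints them.
    local mode="$1" table ext
    for table in $TABLES; do
        for ext in MYD MYI frm; do
            if [ "$mode" = "run" ]; then
                rsync -avzP "$SRC/$table.$ext" "$DEST"
            else
                echo rsync -avzP "$SRC/$table.$ext" "$DEST"
            fi
        done
    done
}

fetch_tables dry   # review the 15 commands, then call: fetch_tables run
```

To fetch more tables later, add their names to TABLES and run the function again; rsync skips files that are already up to date.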

Setup permissions

chown -R mysql:mysql /home/mysql/mm9

Restart your MySQL db

/usr/sbin/mysqld --basedir=/usr --datadir=/home/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid \
    --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock &

Grant privileges on mm9

mysql -youraccountoptions -e "grant all privileges on mm9.* to 'hguser'@'HOSTNAME'"

Download gbdb data

mkdir -p /home/genomebrowser/gbdb/mm9
rsync -avzP --delete --max-delete=20 rsync://hgdownload.cse.ucsc.edu/gbdb/mm9/ /home/genomebrowser/gbdb/mm9/

Provide directory gbdb to the browser

ln -s /home/genomebrowser/gbdb /gbdb

With this, your browser should be able to display mm9 data. The first time I accessed the browser, I added ?db=mm9 after hgGateway (http://localhost/genomebrowser/cgi-bin/hgGateway?db=mm9); otherwise, the browser tried to show data for hg19. After that first visit, the browser somehow remembered that the only database available is mm9.

Adding UCSC Tracks (MySQL Tables)

Select the data you want from ftp://hgdownload.cse.ucsc.edu/mysql/mm9/ and download to /home/mysql/mm9.

For example, I wanted my browser to display:

  • RefSeq
  • UCSC genes
  • SNPs and repeats
  • Conservation

I downloaded these MySQL tables.

Additional Functionality

Some tables can be left out but you will probably want them to obtain extra functionality. For example, when you click on a gene or feature in the genome viewer, you expect to be able to retrieve more info.

For this, you need to install the proteome database. After shutting down your MySQL server:

mkdir /home/mysql/proteins070202

That’s it, a dummy proteins070202 does the trick.

Displaying info about RefSeq genes

rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/description.frm /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/description.MYD /home/mysql/mm9
rsync -avzP  rsync://hgdownload.cse.ucsc.edu/mysql/mm9/description.MYI /home/mysql/mm9

Displaying info about UCSC Genes

Get the UniProt database

wget -nc -c ftp://hgdownload.cse.ucsc.edu/mysql/uniProt/*

Get the proteome database

wget -nc -c ftp://hgdownload.cse.ucsc.edu/mysql/proteome/*.*

Summary Of My Setup

MySQL databases:

  • hgcentral (380 kb)
  • hgFixed (empty database)
  • mm9 (selected tables: 5.8 G)
  • proteins070202 (empty database)
  • proteome (full database: 8.9 G)
  • uniProt (full database: 8 G)

Other files:

  • gbdb (62 G)
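
Given these sizes, it is worth checking free disk space before starting the large downloads. A rough sketch follows; free_gb is a made-up helper that reads the "Available" column of df -Pk, and the 62 GB figure is simply the gbdb size listed above.

```shell
#!/bin/bash
# Sketch: compare free space on a filesystem with the space gbdb needs.
# free_gb is an illustrative helper built on the POSIX output format of df.
free_gb() {
    # $1 = directory; prints whole gigabytes available on its filesystem
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

target="${1:-/}"   # pass the filesystem that will hold gbdb, e.g. /home
need_gb=62         # size of gbdb for mm9 in my setup
have_gb=$(free_gb "$target")

if [ "$have_gb" -ge "$need_gb" ]; then
    echo "$target has ${have_gb} GB free: enough for gbdb"
else
    echo "$target has only ${have_gb} GB free; gbdb needs about ${need_gb} GB"
fi
```

The same check applies to the MySQL datadir, which needs roughly another 23 GB for the databases listed above.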

Adding custom tables

How the UCSC Genome Browser displays tracks in the browser:

1. the table mm9.grp describes groups of tracks (Mapping and Sequencing Tracks, Genes and … tracks)

If you want a new group instead of adding your custom track to a pre-existing one, you must create a new group in that table:

mysql -youraccountoptions mm9 -e "insert into grp set name='$name', label='$label', priority=$priority;"
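
To review the statement before it touches the database, it can be built by a small function first. This is a sketch: grp_insert_sql is a hypothetical helper, the group name, label and priority are made-up values, and it assumes the values contain no single quotes.

```shell
#!/bin/bash
# Sketch: build the SQL that adds a new track group to the grp table.
# grp_insert_sql is an illustrative helper; values must not contain single quotes.
grp_insert_sql() {
    # $1 = group name, $2 = label shown in the browser, $3 = numeric priority
    printf "insert into grp set name='%s', label='%s', priority=%s;\n" "$1" "$2" "$3"
}

grp_insert_sql mylab "My Lab Tracks" 10
# prints: insert into grp set name='mylab', label='My Lab Tracks', priority=10;
```

Once the statement looks right, pipe it into the client: grp_insert_sql mylab "My Lab Tracks" 10 | mysql -youraccountoptions mm9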

2. the table mm9.trackDb describes how tracks are displayed in their groups

The new table must have an entry in trackDb describing the grp it belongs to, its color, name, etc.

3. use the hgLoadBed utility to upload .bed files. Notice that this script reads MySQL configuration data from ~/.hg.conf

/var/www/genomebrowser/cgi-bin/loader/hgLoadBed mm9 mynewtable myfile.bed
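
A malformed .bed file is easier to fix before loading than after. The sketch below is a sanity check of my own, not a UCSC tool: check_bed is a hypothetical helper that only verifies that each line has at least three tab-separated fields (chromosome, start, end) with integer coordinates and start not greater than end.

```shell
#!/bin/bash
# Sketch: basic sanity check of a tab-separated BED file before loading it.
# check_bed is an illustrative helper, not part of the UCSC utilities.
check_bed() {
    # $1 = BED file; returns non-zero and lists the lines that look malformed
    awk -F '\t' '
        BEGIN { bad = 0 }
        NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || ($2 + 0) > ($3 + 0) {
            printf "line %d looks malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Demo with a one-line file: chromosome, start, end, name.
demo=$(mktemp)
printf 'chr1\t3000000\t3000500\tpeak1\n' > "$demo"
check_bed "$demo" && echo "BED file looks well-formed"
```

A file that passes can then be loaded with hgLoadBed as shown above.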

You can write some Perl scripts to automate the process or use UCSC’s utilities.

Troubleshooting

Can’t connect to local MySQL server through socket ‘/var/lib/mysql/mysql.sock’ (13)

  • create a symbolic link at /var/lib/mysql/mysql.sock that points to the actual socket file, /var/run/mysqld/mysqld.sock
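
The link can be created with a small guard so that it is only made when the real socket exists. This is a sketch; link_socket is a made-up helper name and the paths in the comment are the ones from the error message and the fix above.

```shell
#!/bin/bash
# Sketch: link the path the browser expects to the real MySQL socket.
# link_socket is an illustrative helper name.
link_socket() {
    # $1 = existing socket file, $2 = path the browser is looking for
    if [ ! -e "$1" ]; then
        echo "socket not found: $1" >&2
        return 1
    fi
    # Create the parent directory if needed, then (re)create the link.
    mkdir -p "$(dirname "$2")" && ln -sfn "$1" "$2"
}

# As root, on the server:
#   link_socket /var/run/mysqld/mysqld.sock /var/lib/mysql/mysql.sock
```

If the function reports that the socket was not found, MySQL is probably not running or uses a different socket path; check the socket setting in /etc/mysql/my.cnf.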

(8)Exec format error: exec of ‘/var/www/genomebrowser/cgi-bin/hgGateway’ failed

  • Check that you have hg.conf under cgi-bin (with hgGateway)
  • Check that hg.conf is owned by www-data
  • Manually run hgGateway. If it fails with “-bash: ./hgGateway: cannot execute binary file”, chances are you have a 32-bit system and are trying to run a 64-bit binary. Install 64-bit Ubuntu or compile the browser from source. To check your system, use arch or uname -a; the output should include x86_64
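
The architecture check from the last item can be scripted. A minimal sketch follows; arch_status is a hypothetical helper that only interprets the machine name printed by uname -m.

```shell
#!/bin/bash
# Sketch: interpret the machine name reported by uname -m.
# arch_status is an illustrative helper name.
arch_status() {
    # $1 = machine name, e.g. x86_64 or i686
    case "$1" in
        x86_64) echo "64-bit system: the prebuilt 64-bit binaries match" ;;
        i?86)   echo "32-bit system: compile the browser from source" ;;
        *)      echo "unrecognized architecture: $1" ;;
    esac
}

arch_status "$(uname -m)"
# The binary itself can be inspected with:
#   file /var/www/genomebrowser/cgi-bin/hgGateway
```

If the system is 64-bit and the error persists, look at the hg.conf items in the list above next.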

Can’t find hg19, but you want to display another organism

  • In the URL of your mirror, pass the database code of your organism in the db= argument. For mouse: hgGateway?db=mm9; for Drosophila: hgGateway?db=dm3; etc.

See also

http://genome.ucsc.edu/admin/mirror.html

http://genomewiki.ucsc.edu/index.php/Minimal_Browser_Installation

http://bradbot.genomecenter.ucdavis.edu/wiki/index.php/Genome_Browser
