A wwwblast installation lets a researcher search a sequence database with BLAST through a web-based graphical user interface. It differs from command-line BLAST (described, for instance, in Wikiomics:BLAST_tutorial) in several ways:
- After setup, researchers use the program by accessing a web page, not using the command line. This can make it more suitable for computer-shy biologists.
- The output includes a diagrammatic overview of the BLAST hits' coverage of the query sequence, whereas command line BLAST does not generate this.
- It is harder to use with large numbers of sequences, and its output is not amenable to parsing, for instance with a BioPerl parser.
Setting up a wwwblast server with the Apache web server
Install the pre-requisites
wwwblast requires csh and Apache to be installed. On OSX, both are installed by default.
Download and extract
Download the wwwblast program from the NCBI FTP site.
Extract the contents of the downloaded archive into the default apache directory (the DocumentRoot, in apache parlance):
- On OSX, this directory is /Users/ben/Sites (replacing ben with your login name)
- On Ubuntu and other linux distributions, use /var/www
Afterwards, the folder /Users/<username>/Sites/blast should exist (or the equivalent on other operating systems). Technically, it could sit in any directory under the DocumentRoot, e.g. /Users/ben/Sites/myblasts/blast, but for simplicity the default location is assumed here. For the rest of the tutorial, /Users/ben/Sites/blast is assumed to be the base directory of the blast installation, and 'ben' is assumed to be the user logged into the computer.
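The extraction step can be sketched as a small shell function. This is only an illustration: `extract_wwwblast` is a hypothetical helper, not part of wwwblast, the archive name in the usage line is a placeholder (the real file name depends on the version and platform downloaded), and the archive is assumed to be a gzip-compressed tar file with a top-level blast directory.

```shell
# Hypothetical helper: unpack a wwwblast archive into the Apache DocumentRoot
# and confirm that the expected blast directory appeared.
extract_wwwblast() {
    archive="$1"   # path to the downloaded archive (assumed to be .tar.gz)
    docroot="$2"   # e.g. /Users/ben/Sites on OSX, /var/www on Ubuntu

    # -C makes tar extract into the DocumentRoot rather than the current directory
    tar -xzf "$archive" -C "$docroot" || return 1

    if [ -d "$docroot/blast" ]; then
        echo "wwwblast extracted to $docroot/blast"
    else
        echo "expected $docroot/blast, but it does not exist" >&2
        return 1
    fi
}

# Example (placeholder archive name):
# extract_wwwblast ~/Downloads/wwwblast.tar.gz /Users/ben/Sites
```

On Ubuntu, /var/www is normally writable only by root, so the function would have to be run from a root shell there.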
Turning on the webserver
Enabling the Apache web server is different on different platforms. On OSX, enable the personal web sites option (System Preferences - Network/Sharing - Personal Web Sharing). On Ubuntu, install the apache2 package.
When this step is correctly carried out, the webpage http://localhost should work.
Modifying the apache configuration file
Add the following text to the apache config file. On OSX this file is /etc/httpd/users/ben.conf; on Ubuntu, create a new file, /etc/apache2/conf.d/blast.conf, to hold it.
# Added below to get wwwblast to work
AddHandler cgi-script .cgi
<Directory "/Users/ben/Sites/blast">
    Options FollowSymLinks +ExecCGI +Indexes
</Directory>
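If the entry is to be added from a terminal rather than a text editor, something like the following sketch could be used. `write_blast_conf` is a hypothetical helper, and both arguments are whatever paths apply on your system. One deliberate difference from the entry above is labeled in the code: every option carries a `+` prefix, because Apache 2.4 rejects an Options line that mixes prefixed and unprefixed options.

```shell
# Hypothetical helper: append the wwwblast entry to an Apache config file.
# The file is created if it does not exist yet.
write_blast_conf() {
    blast_dir="$1"   # base directory of the blast installation
    conf_file="$2"   # Apache configuration file to append to

    # Assumption: all options are written with a leading +, since Apache 2.4
    # refuses an Options line mixing +/- prefixed and unprefixed options.
    cat >> "$conf_file" <<EOF
# Added below to get wwwblast to work
AddHandler cgi-script .cgi
<Directory "$blast_dir">
    Options +FollowSymLinks +ExecCGI +Indexes
</Directory>
EOF
}

# Example for Ubuntu (the target file is only writable by root):
# write_blast_conf /var/www/blast /etc/apache2/conf.d/blast.conf
```

Appending rather than overwriting keeps any existing settings in the file intact, which matters on OSX where the per-user file already has content.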
After saving the file, restart the apache webserver. The simplest way is to restart the computer.
After restarting, you should be able to run a blast against the default test databases at http://localhost/blast/blast.html. You can use the sequence TACTGTTATCGATCCGGTCGAAAAACTGCTGGCAGTGGGGCATTACCTCGAATCTACCGTCGATATTGCT to test, against test_na_db using blastn. You should get a hit, not a half-blank page.
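Whether the page is being served at all can also be checked from a terminal. The sketch below assumes curl is installed; `page_ok` is a hypothetical helper name. A status of 200 only shows that Apache delivers the page, not that a search succeeds, so the test search described above is still worth running in a browser.

```shell
# Hypothetical helper: succeed only if the URL answers with HTTP status 200.
page_ok() {
    # -s silences progress output, -o /dev/null discards the page body,
    # -m 10 gives up after 10 seconds,
    # -w '%{http_code}' prints just the numeric status code
    status=$(curl -s -m 10 -o /dev/null -w '%{http_code}' "$1" 2>/dev/null)
    [ "$status" = "200" ]
}

# Example:
# page_ok http://localhost/blast/blast.html && echo "blast.html is being served"
```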
Getting the BLAST overview image to work
The overview image of the BLAST hits may be missing, with an error message similar to the following in its place (the numbers will differ):
fail to open file TmpGifs/1804289383790.gif
This can be fixed by giving every user on the computer write access to the TmpGifs folder. In a terminal, as an administrator, enter:
$ chmod o+w /Users/ben/Sites/blast/TmpGifs/
There should be no output from this command. If the permissions have been changed correctly, the overview image should work.
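Because chmod prints nothing on success, it can be reassuring to confirm the change explicitly. A minimal sketch, with `make_world_writable` as a hypothetical helper name:

```shell
# Hypothetical helper: make a directory writable by all users and confirm it.
make_world_writable() {
    dir="$1"
    chmod o+w "$dir" || return 1

    # In `ls -ld` output, the 9th character is the write bit for "other" users
    other_write=$(ls -ld "$dir" | cut -c9)
    [ "$other_write" = "w" ]
}

# Example:
# make_world_writable /Users/ben/Sites/blast/TmpGifs/ && echo "TmpGifs is writable"
```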
Creating custom databases
By default, wwwblast comes with only a few test databases, which are generally not useful. A researcher will more likely want to BLAST against other sequences, for instance those generated in their own lab. Making a custom sequence database from a FASTA file takes three steps: converting the FASTA file into a binary BLAST database, modifying blast.rc, and changing blast.html so that the database can be selected from the drop-down menu.
Converting a fasta file into a binary BLAST database
A binary BLAST database is a collection of multiple files (.nhr, .nin and .nsq files for nucleotide databases). They must be created from a fasta file in a terminal, using the BLAST+ toolkit, available from NCBI. The legacy BLAST toolkit can be used to achieve the same goal, though the command line syntax differs.
After copying the fasta file (called for example 'my_nucleotide_sequences.fasta') to the db directory of the wwwblast installation (e.g. /Users/ben/Sites/blast/db), enter the following in a terminal:
$ cd /Users/ben/Sites/blast/db
$ makeblastdb -in my_nucleotide_sequences.fasta -dbtype nucl
Building a new DB, current time: 09/23/2010 14:12:18
New DB name: my_nucleotide_sequences.fasta
New DB title: my_nucleotide_sequences.fasta
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1073741824B
Adding sequences from FASTA; added 1620 sequences in 0.207906 seconds.
For amino acid sequence fasta files, use '-dbtype prot' instead of '-dbtype nucl'.
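After makeblastdb has run, the presence of the database files can be checked with a sketch like the one below. `blast_db_complete` is a hypothetical helper. It assumes a single-volume database (very large databases are split into numbered volumes, which this check does not handle) and uses the .phr, .pin and .psq extensions for protein databases, the counterparts of the nucleotide extensions listed above.

```shell
# Hypothetical helper: check that the three core files of a BLAST database exist.
blast_db_complete() {
    db="$1"       # database name, i.e. the file name given to makeblastdb -in
    dbtype="$2"   # nucl or prot, as given to makeblastdb -dbtype

    case "$dbtype" in
        nucl) exts="nhr nin nsq" ;;
        prot) exts="phr pin psq" ;;
        *) echo "dbtype must be nucl or prot" >&2; return 2 ;;
    esac

    for ext in $exts; do
        if [ ! -f "$db.$ext" ]; then
            echo "missing $db.$ext" >&2
            return 1
        fi
    done
}

# Example:
# cd /Users/ben/Sites/blast/db
# blast_db_complete my_nucleotide_sequences.fasta nucl && echo "database files present"
```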