BLAST is one of the most important programs in Bioinformatics. It allows one to search a protein or DNA sequence against a large database of sequences very rapidly. Similar sequences are identified and the significance of this similarity is calculated. A high degree of similarity can be used to infer homology.
When you perform a BLAST search, it is critical that you look at the e-value for each of the 'hits' that your search returns. The e-value (or 'expectation value') is the number of hits with this score or better that you would expect to see purely by chance when searching a database of this size. The lower the e-value, the more significant the hit.
An e-value of ≤0.01 is generally indicative of homology. Proteins that are homologues can have much higher e-values (e.g. 5.0), but in that case the BLAST search gives no statistical support for the relationship.
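To make the definition concrete, here is a minimal sketch of how an e-value relates to a score, assuming the standard Karlin-Altschul approximation E = m × n × 2^(−S'), where S' is the bit score, m the query length and n the total length of the database. The function names and the numbers are illustrative assumptions; BLAST itself applies length corrections, so the values it reports will differ somewhat.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate e-value: E = m * n * 2**(-S') for bit score S'."""
    return query_len * db_len * 2.0 ** (-bit_score)


def is_significant(e_value: float, threshold: float = 0.01) -> bool:
    """Apply the rule of thumb from the text: e-value <= 0.01."""
    return e_value <= threshold


# Hypothetical example: a 300-residue query against a database
# containing one billion residues in total.
strong = expected_hits(bit_score=50.0, query_len=300, db_len=10**9)
weak = expected_hits(bit_score=30.0, query_len=300, db_len=10**9)

print(f"bit score 50: E = {strong:.2e}, significant: {is_significant(strong)}")
print(f"bit score 30: E = {weak:.2e}, significant: {is_significant(weak)}")
```

Note that the same score gives a larger e-value in a larger database, which is why the definition is tied to the database being searched.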
Click the headings below for instructions on running BLAST and examining the results.
Since this is a long sequence and the 'nr/nt' sequence collection is now extremely large, this search can take a very long time using the normal 'blastn', but 'megablast' is much faster (though less sensitive).
Instead of wasting the NCBI computer resources and your time, you can also follow a link to precalculated results.
View the results here
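If you would rather run the search yourself from a command line than through the web form, the choice between 'megablast' and the normal 'blastn' corresponds to the -task option of the BLAST+ blastn program. The sketch below only builds the command; it assumes BLAST+ is installed, and the file names query.fasta and results.tsv are hypothetical placeholders.

```python
import shlex


def build_blastn_command(query_fasta, task="megablast", database="nt",
                         evalue=0.01, out_path="results.tsv", remote=True):
    """Build (but do not run) a BLAST+ blastn command as an argument list."""
    if task not in {"megablast", "dc-megablast", "blastn"}:
        raise ValueError(f"unsupported task: {task}")
    cmd = [
        "blastn",
        "-task", task,            # megablast: faster, less sensitive
        "-query", query_fasta,    # hypothetical FASTA file with the query
        "-db", database,
        "-evalue", str(evalue),   # discard hits above this e-value
        "-outfmt", "6",           # tab-separated table of hits
        "-out", out_path,
    ]
    if remote:
        cmd.append("-remote")     # search on the NCBI servers
    return cmd


cmd = build_blastn_command("query.fasta")
print(shlex.join(cmd))
# To actually run it (this uses NCBI resources and may be slow):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing task="blastn" gives the slower, more sensitive search described above.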
When the search has completed, the results page first shows a tabular list of the hits, giving the scores, query coverage, e-value, percent identity and accession for each hit.
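The same kind of table can be read programmatically. The sketch below assumes hits saved in the BLAST+ tabular format (-outfmt 6) with its default twelve columns; the sample rows and accessions are made up for illustration. The coverage computed here is per alignment, whereas the web page's query coverage combines all alignments to the same database sequence.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def parse_hits(text):
    """Parse tab-separated BLAST hits into a list of dictionaries."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLUMNS, row))
        hits.append({
            "accession": rec["sseqid"],
            "identity": float(rec["pident"]),
            "qstart": int(rec["qstart"]),
            "qend": int(rec["qend"]),
            "evalue": float(rec["evalue"]),
            "bitscore": float(rec["bitscore"]),
        })
    return hits


def query_coverage(hit, query_len):
    """Percentage of the query covered by this single alignment."""
    return 100.0 * (abs(hit["qend"] - hit["qstart"]) + 1) / query_len


# Hypothetical results for a 1200-base query.
SAMPLE = (
    "query1\tHYPO_0001\t100.000\t1200\t0\t0\t1\t1200\t501\t1700\t0.0\t2217\n"
    "query1\tHYPO_0002\t85.500\t600\t80\t7\t301\t900\t10\t605\t1e-150\t540\n"
    "query1\tHYPO_0003\t92.000\t25\t2\t0\t40\t64\t7\t31\t4.2\t32.8\n"
)

hits = parse_hits(SAMPLE)
significant = [h for h in hits if h["evalue"] <= 0.01]
for h in significant:
    print(h["accession"], h["bitscore"], h["evalue"],
          f"{query_coverage(h, 1200):.0f}%")
```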
If you have used the actual server, rather than the pre-calculated results, you can also click Graphic Summary, which gives a graphical representation of the hits. Your query is shown as a bold turquoise line with nucleotide base numbers printed along it. Below this are coloured lines representing the 'hits' in the database. The colour of each line represents the score for the hit (as explained in the key at the top). Some lines will be broken into parts with different scores for different sections.
Following the link for the hit that is a perfect match to your sequence will open a new page with information about the gene we have been using.
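The colour key used in the Graphic Summary can be sketched as a simple lookup from alignment score to colour. The bins below follow the key NCBI has traditionally shown (below 40, 40 to 50, 50 to 80, 80 to 200, and 200 or more); treating each bin as including its lower bound is an assumption, and the key on your own results page is the authority.

```python
import bisect

# Lower bounds of the score bins and the colour of each bin
# (assumed from the traditional NCBI colour key).
BOUNDS = [40, 50, 80, 200]
COLOURS = ["black", "blue", "green", "magenta", "red"]


def score_colour(score: float) -> str:
    """Return the colour used for a hit with this alignment score."""
    return COLOURS[bisect.bisect_right(BOUNDS, score)]


for s in (35, 45, 75, 150, 2217):
    print(s, score_colour(s))
```

A line broken into parts with different colours simply means that different sections of the hit fall into different score bins.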