BRIGMANUAL
95 Manual
Nabil Alikhan
Contents
1 Introduction
2 Licence
3 Installation
3.1 Installing BLAST
9 Configuration options
9.1 Saving and reopening your work
9.2 BLAST options
9.3 Setting BRIG options
9.4 Setting Image options
9.5 Loading a preset image template
1 Introduction
The BLAST Ring Image Generator (BRIG) is a cross-platform desktop application
written in Java 1.6. It uses CGView[5] for image rendering and the Basic
Local Alignment Search Tool (BLAST) for genome comparisons. Its graphical user
interface, built on the Swing framework, takes the user step-by-step through the
configuration of a circular image. Figure 1 is an example of an image BRIG can
create.
Figure 2 shows a magnified view of the same example image. It shows similarity
between a central reference genome and other query sequences as a set of
concentric rings, where colour indicates a BLAST match of a particular
percentage identity. BRIG does not represent sequences that are not present in
the reference genome. The image shows:
• GC skew,
• GC content,
The manual also has detailed instructions for how to install and configure BRIG:
• For instructions on how to configure BRIG and save BRIG settings, see
Section 9 on page 46.
Figure 4: Reference: A list of translated genes that make up the Locus of
Enterocyte Effacement (LEE), which encodes a Type III secretion system. Query:
Raw sequencing reads simulated from several complete LEE+ published genomes
(nucleotide sequence) and E. coli K12 (negative control; LEE-). Gene
presence/absence and divergence are clearly visible: the colour represents
sequence identity on a sliding scale, and the greyer the colour, the lower the
percentage identity. To make an image like this please refer to Section 6 on
page 18.
2 Licence
This program is free software: you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software Foun-
dation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but without any
warranty; without even the implied warranty of merchantability or fitness for a
particular purpose. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program. If not, see <http://www.gnu.org/licenses/>.
Please note that these restrictions do not apply to the third party libraries
bundled with this software.
3 Installation
There is no real “installation” process for BRIG itself. However, BLAST+[2] or
BLAST legacy[1] must already be installed, and BRIG needs to be able to locate
the BLAST executables (see Section 3.1).
To run BRIG users need to:
Users who wish to run BRIG from the command-line need to:
2. Run “java -Xmx1500M -jar BRIG.jar”, where -Xmx specifies the maximum amount
of memory allocated to BRIG.
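For scripted use, the launch command can be assembled programmatically. The following is a minimal Python sketch and is not part of BRIG; the function names brig_command and launch_brig, and the default jar path, are illustrative assumptions. Nothing is started until launch_brig is called.

```python
import subprocess


def brig_command(jar_path="BRIG.jar", max_heap_mb=1500):
    """Build the argument list for launching BRIG with a given maximum heap size."""
    if max_heap_mb <= 0:
        raise ValueError("max_heap_mb must be positive")
    # -Xmx<N>M caps the Java heap at N megabytes.
    return ["java", f"-Xmx{max_heap_mb}M", "-jar", jar_path]


def launch_brig(jar_path="BRIG.jar", max_heap_mb=1500):
    """Start BRIG and wait for it to exit (requires java on the PATH)."""
    return subprocess.run(brig_command(jar_path, max_heap_mb), check=True)
```

Raising max_heap_mb may help when working with large genomes, provided the machine has that much memory available.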
BLAST legacy comes as a compressed package, which will unzip the BLAST
binaries wherever the package is located. We advise users to first create a
BLAST directory (in either the home or applications directory), copy the
downloaded BLAST package to that directory and unzip the package.
BRIG supports both BLAST+ and BLAST legacy. Users can specify the location
of their BLAST installation in the BRIG options menu, which is found at:
Main window >Preferences >BRIG options.
The window is shown in Figure 6. If BRIG cannot find BLAST it will prompt
users at runtime.
PRO TIP 1: BRIG uses BLAST; do not use wwwblast or netblast with BRIG.
PRO TIP 2: If BOTH BLAST+ and legacy versions are in the same location,
BRIG will prefer BLAST+.
Figure 6: You can change where BRIG looks for BLAST in the BRIG options
window. For more information about BRIG options see Section 9.3 on page 48.
4 Warning when using BLAST
PRO TIP 3: BLAST filters may cause gaps in alignments, which will show up
as blank regions in BRIG images.
BLAST filters (the -F flag in BLAST legacy, or the -dust/-seg flags in BLAST+)
filter the query sequence for low-complexity sequences by default. This includes
sequences that are highly repetitive or contain long runs of the same
nucleotide. Low-complexity filtering is generally a good idea, but it may break
long matches into several smaller matches.
This often appears in BRIG images as truncations or gaps in alignments. It is
particularly obvious in very small reference sequences where alignments are
shown on a gene-by-gene level.
To prevent this, either turn off filtering or use soft masking.
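As a hedged illustration, the option strings below are ones that could be entered as custom BLAST options to turn off filtering or to switch to soft masking for nucleotide searches. The lookup table and the function name filter_option are illustrative and not part of BRIG; the flags themselves (-dust and -soft_masking in BLAST+, -F in BLAST legacy) should be checked against the usage message of the installed BLAST version.

```python
# Option strings for nucleotide (blastn) searches.
# Assumption: these match the usage messages of blastn (BLAST+) and blastall
# (BLAST legacy); confirm them against your installed version before use.
FILTER_OPTIONS = {
    ("blast+", "off"): "-dust no",             # disable DUST filtering
    ("blast+", "soft"): "-soft_masking true",  # apply filter as a soft mask
    ("legacy", "off"): "-F F",                 # disable filtering
    ("legacy", "soft"): '-F "m D"',            # DUST, masked at lookup only
}


def filter_option(flavour, mode):
    """Return the option string for a BLAST flavour and a filtering mode."""
    key = (flavour.lower(), mode.lower())
    if key not in FILTER_OPTIONS:
        raise ValueError(f"unknown flavour/mode combination: {key}")
    return FILTER_OPTIONS[key]


print(filter_option("blast+", "off"))
```

For protein searches the equivalent filter is SEG rather than DUST, so the option strings would differ.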
PRO TIP 4: BLAST’s bitscore filtering may cause different results in BRIG if
users swap the query and reference sequences, particularly if these are very
different sizes.
BLAST uses statistical thresholds to filter out “bad alignments”: alignment
matches that appear random to BLAST. One of these thresholds is the e-value,
which is the number of alignments with an equal or better score that would be
expected to occur by chance, given the complexity of the match, sequence
composition and the size of the database. A chance alignment is more likely in
a larger sequence, so BLAST is more critical of these matches.
This can create different expected values if BLAST is used with the same
reference sequence against databases of different sizes and may potentially filter
out significant matches or include poor scoring ones.
Because of this, users might notice different results in BRIG images if they
swap the database and reference sequences in the BLAST search, especially if
the two sequences are quite different in size. The differences are often due
to a few very low-scoring hits.
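The size effect can be made concrete with the standard approximation E = m × n × 2^(−S′), where m is the query length, n is the database length and S′ is the bit score. The sketch below is an illustration only: BLAST uses effective lengths with edge corrections, which are ignored here, and the sequence sizes are invented for the example.

```python
def expected_value(bit_score, query_length, database_length):
    """Approximate e-value: E = m * n * 2**(-S'), ignoring edge corrections."""
    return query_length * database_length * 2.0 ** (-bit_score)


# The same 40-bit hit from a 1 kb query, scored against a 5 kb plasmid
# and against a 5 Mb genome (sizes chosen for illustration).
small_db = expected_value(40, 1_000, 5_000)
large_db = expected_value(40, 1_000, 5_000_000)
print(f"plasmid database: {small_db:.2e}")
print(f"genome database:  {large_db:.2e}")
```

Under this approximation the e-value grows in direct proportion to the database size, so a hit that passes a given threshold in one direction may fail it when the query and database are swapped.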
Users should consider what an appropriate e-value threshold is for the
comparisons that they run. Remember that BLAST runs with an e-value threshold
of 10 by default; we recommend that users change this value. Users can set the
threshold (e-value) with the -e flag in BLAST legacy or the -evalue flag in
BLAST+.
PRO TIP 5: BLAST does not handle spaces in filenames. BRIG will prompt
users if they have spaces in file locations.
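Users who prepare many input files may wish to check for spaces before loading them into BRIG. A minimal sketch follows; the function name and the example paths are hypothetical.

```python
def paths_with_spaces(paths):
    """Return the paths (as strings) that contain at least one space."""
    return [str(p) for p in paths if " " in str(p)]


# Hypothetical example paths.
candidates = ["/data/E coli/reference.fna", "/data/genomes/reference.fna"]
for bad in paths_with_spaces(candidates):
    print(f"rename or move before use: {bad}")
```

The check covers the whole path, since a space in any parent directory is also part of the file location.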
5 Visualising whole genome comparisons
3. Press “add to data pool”. This should load nine files into the pool list.
6. Click “Next”.
PRO TIP 6: Users can add individual files to the data pool too.
2. Select the required sequences from the data pool and click on “add data” to
add to the ring list.
3. Choose a colour
5. Click on “add new ring” and repeat steps for each new ring required.
The values required for each ring are detailed in the table below. Notice that
sequences can be collated into a single ring, as in the example of K12 & HS. The
ring will show BLAST matches from both HS and K12.
Legend text        Required sequences        Colour
GC Content         GC Content                Ignore
GC Skew            GC Skew                   Ignore
Coverage           BRIGExample.graph         153,0,0
O157:H7            E coli O157H7Sakai.gbk    0,0,153
HS and K12         E coli HS.fna             0,153,0
                   E coli K12MG1655.fna
CFT073 and UTI89   E coli CFT073.fna         153,0,153
                   E coli UTI89.fna
PRO TIP 7: Rings can be reordered by dragging them in the Ring List pane.
PRO TIP 8: You can set default threshold values in “BRIG options”. See
Section 9.3 (page 48) for more details.
PRO TIP 9: When using a Genbank/EMBL file as a reference, users can
choose whether to use the protein or nucleotide sequence.
2. Hit submit.
3. The image will be created in the specified output directory and should look
something like Figure 7.
BRIG will format GenBank files, run BLAST, parse the results and render the
image. The final image (Figure 7) shows GC content and GC skew, the genome
coverage, contig boundaries, and the BLAST results against the other E. coli
genomes. The results for HS and K12 have been collated into a single ring,
likewise for UTI89 and CFT073.
PRO TIP 10: Image settings, like size, fonts, etc., can be configured in: Main
window >Preferences >Image options.
6 Working with a Multi-FASTA reference
1. Set the reference sequence as “Ecoli vir.fna”. Users can use the browse
button to traverse the file system.
3. Press “add to data pool”. This should load several items into the pool list.
6. Click “Next”.
PRO TIP 11: The Spacer field determines the number of base pairs to leave
between FASTA sequences.
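To illustrate the effect of the spacer, the sketch below lays the entries of a multi-FASTA file end to end and reports the 1-based start and stop of each entry in the concatenated reference. This is an illustration only and not BRIG's own code; the function name, and the assumption that the spacer is inserted only between consecutive entries, are made for the example.

```python
def layout_entries(lengths, spacer):
    """Return 1-based (start, stop) pairs, leaving `spacer` bp between entries."""
    if spacer < 0:
        raise ValueError("spacer must not be negative")
    positions = []
    start = 1
    for length in lengths:
        stop = start + length - 1
        positions.append((start, stop))
        # The next entry begins after the spacer region.
        start = stop + spacer + 1
    return positions


# Three hypothetical genes of 900, 1200 and 450 bp with a 50 bp spacer.
for start, stop in layout_entries([900, 1200, 450], 50):
    print(start, stop)
```

A larger spacer makes short genes easier to tell apart in the image, at the cost of devoting more of the circle to gaps.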
The next step is to add the gene annotations, which will be fetched from the Multi-
FASTA headers:
1. Click “Add custom features” in the second BRIG window to bring up the custom
annotation window (Figure 9).
2. Double click “Ring 5”.
3. Set “input data” as Multi-FASTA.
4. Set “colour” as alternating red-blue.
5. Click add.
This step colours the gaps between FASTA entries; the gaps are calculated
from the Multi-FASTA file (Figure 10). For each genome ring, do the following:
4. Click add.
The results should be similar to Figure 10 in the left-hand pane. Close the
window when this is done.
PRO TIP 12: A spacer value can be set when using protein sequences from a
Genbank/EMBL file.
3. Set the image title as “Various E. coli virulence genes” and press submit.
The output image should be something like Figure 11. The alternating red-blue
option has automatically alternated the red and blue colours for the gene
labels. This option is available whenever a multi-FASTA file is used as a
reference sequence. This same option could be used to show contig or genome
scaffold boundaries.
This image clearly shows some real biological information:
1. CFT073 (UPEC) and K12 MG1655 (Commensal) do not carry the Locus of
Enterocyte Effacement. These virulence factors are specific to EHEC and
EPEC.
PRO TIP 13: You can use protein sequences as a multi-FASTA reference and use
blastx to improve alignment accuracy for divergent sequences.
Figure 11: Output image from Multi-FASTA walkthrough. This was generated
using BLAST+[2]; BLAST legacy[1] will produce slightly different results.
7 Visualising graphs and genome assemblies
PRO TIP 14: Graph files cannot be shown on the same ring as sequence files
(protein or nucleotide).
2. Create the graph file from the graph files module: Main window >Modules
>Create graph files.
(a) Set drop down to coverage graph, fill in fields (Figure 7.1).
(b) Set Assembly file as “Mu50.sam” from the BRIG examples/Chapter7-
sam-examples folder.
(c) Set Output folder as the location of the Chapter7-sam-examples folder.
(d) Set Window size as “1”.
(e) Click Create Graph. This will add the graph file to the data pool when
it has finished.
Close the coverage graph window and return to the first main window.
3. Press “add to data pool”. This should load several items into the pool list.
Click next to move to the next window to configure the rings and add in annota-
tions.
2. Ring 1 Settings:
3. Ring 2 Settings:
4. Ring 3 Settings:
5. Ring 4 Settings:
This will load all the coding sequences from the Genbank file. These annotations
will be drawn as arrows, indicating orientation. Close this window and click next
on the second window.
1. Set title as “S. aureus Mu50 plasmid”.
2. Click Submit.
This will generate the final image, which should look like Figure 12.
Figure 12: S. aureus Mu50 plasmid, showing read mapping from simulated 454
reads, CDSs, and genome comparisons to other S. aureus plasmids, pSK57 &
SAP014A. Alignments were performed with BLAST+.
First, produce the coverage graph file based on the assembly (ace file):
2. Create the graph file from the graph files module: Main window >Modules
>Create graph files.
4. Click “Create Graph”. This will add the graph file to the data pool when it
has finished.
Next, map the coverage generated in the previous graph file to the modified genome
sequence.
3. Click “Create Graph”. This will add the graph file to the data pool when it
has finished.
Close the “Create custom graph” window and return to the main window.
3. Press “add to data pool”. This should load several items into the pool list.
Click next to move to the next window and configure the rings.
1. Create 4 rings.
2. Ring 1 Settings:
3. Ring 2 Settings:
4. Ring 3 Settings:
5. Ring 4 Settings:
The rings are now set up with the correct colour, data and labels. The next step
is to mark the CDS on Ring 4 as “custom features”. Click “Add Custom features”
to open a new window.
1. Double-click on Ring 4.
7. Click add.
This will load all the coding sequences from the GenBank file. These annota-
tions will be drawn as clockwise or counter-clockwise arrows, indicating ori-
entation. Close this window and click next on the main window. From the
third/confirmation window:
2. Click submit.
This will generate the final image, which should look like Figure 13.
6. Click next
8 Walkthroughs on creating custom annotations
2. Select the required sequences from the data pool and click on “add data” to
add to the ring list.
3. Choose a colour
5. Click on “add new ring” and repeat steps for each new ring required.
The values required for each ring are detailed in the table below.
Legend text   Required sequences      Colour
K12 MG1655    E coli K12MG1655.fna    0,153,255
CFT073        E coli CFT073.fna       204,0,255
UTI89         E coli UTI89.fna        153,0,102
1. Create a new ring (Ring 4) and double-click Ring 4 in the ring list.
11. Remain in the “Add Custom features” window for the next step.
PRO TIP 15: The same process applies for EMBL files.
By default, CDS on the sense strand will be drawn as clockwise arrows and CDS
on the anti-sense strand as counter-clockwise arrows. All other features will be
shown as block arcs by default. This can be overridden by changing “Draw feature
as” from “default”.
1. Create a new ring (Ring 5) and double-click Ring 5 in the ring list.
6. Press Add.
If there are a lot of entries, this may take a few seconds. All the values
should show up in the left-hand pane. Individual changes can be made by
double-clicking and editing an entry. Users can load any kind of results or
annotations into BRIG using this approach.
2. Press submit.
3. The image will be created in the specified output directory and should look
something like Figure 14.
Figure 14: Output image from tab-delimited annotations. This figure shows E.
coli O157:H7 Sakai as the central reference sequence, with genome comparisons
to K12, CFT073 and UTI89 and the SP sites annotated in black.
• Lines that contain column headers or comments must start with a “#”. See
the first line in Figure 8.2.
• The first two columns MUST be Start and Stop values, and both must have
a value or BRIG will ignore the entry.
• Acceptable Colour values are: default, red, aqua, black, blue, fuchsia, gray,
green, lime, maroon, navy, olive, orange, purple, silver, teal, white, yellow.
The easiest way to make BRIG tab-delimited files is to set up a spreadsheet with
exactly the same headers as below (order and text are important) and then fill
in the columns with data. Leave the colour and decoration fields blank to use
the defaults. Leave the label field blank if no label is required.
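Tab-delimited annotation files can also be generated by a script. The sketch below is a hedged example and not part of BRIG: the manual states only that the first two columns are Start and Stop, so the remaining column names and their order (Label, Colour, Decoration) are assumptions that should be checked against the headers shown in Figure 8.2. The colour names are those listed above.

```python
VALID_COLOURS = {
    "default", "red", "aqua", "black", "blue", "fuchsia", "gray", "green",
    "lime", "maroon", "navy", "olive", "orange", "purple", "silver", "teal",
    "white", "yellow",
}

# Assumed header line; only "Start" and "Stop" are confirmed by the text.
HEADER = ["#Start", "Stop", "Label", "Colour", "Decoration"]


def format_annotations(rows):
    """Format (start, stop, label, colour, decoration) rows as tab-delimited text.

    Blank label, colour or decoration fields are written as empty columns.
    """
    lines = ["\t".join(HEADER)]
    for start, stop, label, colour, decoration in rows:
        if colour and colour not in VALID_COLOURS:
            raise ValueError(f"unsupported colour: {colour}")
        fields = [str(int(start)), str(int(stop)), label, colour, decoration]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


# Hypothetical annotations: one labelled in red, one using the defaults.
print(format_annotations([(100, 950, "geneA", "red", ""), (2000, 2600, "", "", "")]), end="")
```

The returned text can be written to a file and loaded through the “Add Custom features” window.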
9 Configuration options
9.1 Saving and reopening your work
Users can save their work in BRIG from the File menu (Figure 15) and can
open saved files from File menu >Open.... There are three options for saving:
• “Save As...” will save all the image settings, BRIG settings, ring
configuration, and files users have imported into BRIG.
• “Save Profile...” will save just the global image settings & BRIG settings
like image dimensions and font size and colours. It will not save the data
pool or any ring settings. These files are designed to be used as templates
for other images.
• “Bundle Session...” will copy all files used in the BRIG session and save all
the current settings into a single directory. This directory can be compressed
and kept as an archive or sent to someone else who can open the .xml file
and work on the image with all of the original settings.
PRO TIP 16: BRIG handles all the file input and output into BLAST, so DO
NOT use -o, -d, -p, -i, -m in BLAST legacy or -out, -db, -query, -outfmt in
BLAST+.
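A custom options string can be checked for these reserved flags before it is entered into BRIG. The sketch below is illustrative and not part of BRIG; the function name and the flavour labels "legacy" and "blast+" are assumptions, while the flag lists are taken from the tip above.

```python
import shlex

# Flags that BRIG sets itself, as listed in PRO TIP 16.
RESERVED_FLAGS = {
    "legacy": {"-o", "-d", "-p", "-i", "-m"},
    "blast+": {"-out", "-db", "-query", "-outfmt"},
}


def reserved_flags_used(options, flavour):
    """Return the reserved flags found in a custom BLAST options string."""
    tokens = shlex.split(options)
    return sorted(set(tokens) & RESERVED_FLAGS[flavour])


print(reserved_flags_used("-evalue 1e-5 -outfmt 6", "blast+"))
```

An empty result means none of the reserved flags are present; it does not validate the remaining options.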
The list below highlights some BLAST parameters for BLAST+[2] and BLAST
legacy[1]; more common parameters are in bold. For more information please
read the blastall or BLAST+ command-line usage messages (or
http://www.ncbi.nlm.nih.gov/books/NBK1763/).
BLAST+ options:
• -dust or -seg Filter query sequence (DUST with blastn, SEG with oth-
ers) [String] default = yes
PRO TIP 17: BRIG runs BLAST+ with task as blastn, unless overridden by
the user.
• -F Filter query sequence (DUST with blastn, SEG with others) [String]
default = T
• -W Word size, default if zero (blastn 11, megablast 28, all others 3) [Integer]
default = 0
Figure 16: Set custom BLAST options on the first window (shown) or the third
window
• GenBank file extensions: Users can specify the file extensions they use for
GenBank files (with commas between extensions), e.g. .gbk
• FASTA file extensions: Users can specify the file extensions they use for
FASTA files (with commas between extensions), e.g. .fa,.fna,.fas
• EMBL file extensions: Users can specify the file extensions they use for
EMBL files (with commas between extensions), e.g. .embl
• BLAST binary folder: Users must specify the location of the BLAST
executables. Leave this blank if BLAST is on their PATH.
• Default spacer space: Users can set a default spacer value that BRIG uses
when using Multi-FASTA files.
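The comma-separated extension settings can be thought of as a lookup from file name to file type. The sketch below shows one way such settings might be interpreted; it is not BRIG's own code, and the function names and the case-insensitive matching are assumptions. The example values are those given in the list above.

```python
# Example settings, using the extensions given in the text.
EXAMPLE_SETTINGS = {
    "GenBank": ".gbk",
    "FASTA": ".fa,.fna,.fas",
    "EMBL": ".embl",
}


def parse_extensions(setting):
    """Split a comma-separated extension setting into a tuple of extensions."""
    return tuple(ext.strip().lower() for ext in setting.split(",") if ext.strip())


def classify(filename, settings=EXAMPLE_SETTINGS):
    """Return the file type whose extensions match `filename`, or None."""
    name = filename.lower()
    for file_type, setting in settings.items():
        if name.endswith(parse_extensions(setting)):
            return file_type
    return None


print(classify("plasmid.fna"))
```

A file whose extension is not listed returns None, which corresponds to a file type the settings do not recognise.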
Figure 17: First BRIG options window, accessible from Preferences >BRIG
options
• Graph label thickness: Graph rings have a second, smaller ring to show
labelled regions. Users can set how many times smaller this ring is compared
to normal rings here.
• Graph ring thickness multiplier: Graph rings need to be larger than other
rings. Users can set how many times larger graph rings are compared to
normal rings here.
• Default upper & lower threshold: BLAST matches in BRIG are shaded on a
sliding scale between 100% and the lower threshold percentage identity. These
values and their corresponding colours are shown in the legend. The upper
threshold is used to highlight a particular percentage identity in the legend
and has no bearing on the colour of BLAST matches.
• Default min. threshold: Minimum percent identity for BLAST results that
BRIG will include on an image.
Figure 18: Second BRIG options window, accessible from Preferences >BRIG
options
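The sliding-scale shading can be pictured as a blend between the ring colour at 100% identity and a faded colour at the lower threshold. The exact calculation BRIG uses is not described in this manual, so the sketch below is only an assumption: it uses a linear blend towards light grey, and the function name and the fade colour are invented for the example.

```python
def shade(colour, identity, lower_threshold, fade_to=(204, 204, 204)):
    """Blend `colour` towards `fade_to` as identity falls from 100 to the threshold.

    Assumption: a linear blend; BRIG's actual shading may differ.
    """
    if not 0 <= lower_threshold < 100:
        raise ValueError("lower_threshold must be in the range [0, 100)")
    # Identities at or below the lower threshold get the fully faded colour.
    clamped = max(lower_threshold, min(100.0, identity))
    weight = (clamped - lower_threshold) / (100.0 - lower_threshold)
    return tuple(round(f + (c - f) * weight) for c, f in zip(colour, fade_to))


# Ring colour 0,0,153 with a lower threshold of 70% identity.
for identity in (100, 85, 70):
    print(identity, shade((0, 0, 153), identity, 70))
```

Matches below the minimum threshold would not be drawn at all, so they never reach this step.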
– Height and Width: These values change the height and width of the image
by specifying the number of pixels.
– Title font and color: These values change the title font type, size and
colour.
– Show shading?: Set to false to turn off the embossing on annotations.
– Tick density: This value changes how often ticks appear on the ruler. It
requires a decimal number between 0 and 1.
– Long tick colour: This value changes the colour of long (or short)
ticks.
– Backbone radius: Sets the radius of the ring circle. Increase this
to make a larger ring and decrease it to make a smaller one. N.B. this will not
automatically scale with the image height and width.
– Legend and Warning font: Font of all legend and warning messages,
set globally.
PRO TIP 18: Loading a preset template will not overwrite the ring colour
settings. This is only to get the right proportions for image sizes, fonts and
ring sizes.
References
[1] Altschul, S., Gish, W., Miller, W., Myers, E., and Lipman, D. Basic
Local Alignment Search Tool. Journal of Molecular Biology 215, 3 (Oct 5
1990), 403–410.
[3] Darling, A., Mau, B., Blattner, F., and Perna, N. Mauve: Multiple
alignment of conserved genomic sequence with rearrangements. Genome
Research 14, 7 (Jul 2004), 1394–1403.
[4] Richter, D. C., Ott, F., Auch, A. F., Schmid, R., and Huson, D. H.
MetaSim-A Sequencing Simulator for Genomics and Metagenomics. PLoS
ONE 3, 10 (Oct 8 2008).
[5] Stothard, P., and Wishart, D. Circular genome visualization and
exploration using CGView. Bioinformatics 21, 4 (Feb 15 2005), 537–539.