Download
The SeqFreed package is available via the SourceForge.net project page.
Installation
SeqFreed does not come with an installer yet. Unpack the tarball with
gunzip seqfreed-x.y.tar.gz
tar -xvf seqfreed-x.y.tar
To use SeqFreed's database functionality you need a copy of at least one genomic GenBank [1] file in the folder seqfreed-x.y/data/.
Genomic GenBank files (*.gbk) can be downloaded from
- ftp://ftp.ncbi.nih.gov/genomes/
Store your personal auxiliary files in the folder seqfreed-x.y/aux/.
[1] http://www.ncbi.nlm.nih.gov/Genbank/
Dependencies
You need Python 2.3 and PyGTK 2.6 or newer.
Running SeqFreed
cd seqfreed-x.y
./seqfreed.x.y
Indexing GenBank files
SeqFreed needs to create an index of each GenBank file. The index is written to a separate file with the ending ".idx"; the content of the GenBank file itself is not modified. In the lower notebook, change to the "Options" tab and press "Index". If you have several GenBank files in /data, all of them will be indexed at once.
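To illustrate the idea of a separate index file, here is a minimal sketch in Python. It does not reproduce SeqFreed's actual ".idx" format, which is not described in this document. The helper name build_index and the decision to map /gene and /locus_tag qualifiers to byte offsets are assumptions made for illustration only.

```python
import io
import re

# Matches GenBank qualifiers such as /gene="clpP" or /locus_tag="b1491".
QUALIFIER = re.compile(rb'/(?:gene|locus_tag)="([^"]+)"')


def build_index(handle):
    """Map lower-cased gene names and locus tags to byte offsets.

    `handle` is a binary file object. The GenBank data is only read,
    never modified. Hypothetical helper, not SeqFreed's own code.
    """
    index = {}
    offset = 0
    for line in handle:
        match = QUALIFIER.search(line)
        if match:
            key = match.group(1).decode("ascii", "replace").lower()
            # Keep only the first occurrence of each identifier.
            index.setdefault(key, offset)
        offset += len(line)
    return index


# A tiny in-memory stand-in for a *.gbk file.
sample = io.BytesIO(
    b'     gene            190..255\n'
    b'                     /gene="thrL"\n'
    b'                     /locus_tag="b0001"\n'
)
index = build_index(sample)
```

With such an index, a lookup is a single seek to the stored offset instead of a scan of the whole GenBank file.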
Using the database
At the top of the "Database" tab you find a selection bar for the GenBank file you want to work with. All other actions within the "Database" tab refer to the selection in this bar.
To the left you find an entry window. Here you enter the names or locus tags of the genes you want to retrieve information about, e.g. "b1491", "clpP", "oxyr", "C23H3.2". Note that you can enter multiple entries at once (one per line) and that case does not matter.
To the right of the entry window you see a selection bar where you can choose the type of information you are interested in ("Protein", "DNA", "Product", "Function", "EC", etc.).
If your selection is "Protein" or "DNA", SeqFreed reads sequence data from the GenBank file. A further set of options lets you specify the sequence region you are interested in.
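As a small illustration of the "one per line, case does not matter" rule, the following Python sketch shows one way such input could be normalized. The function name parse_entries and the use of lower-casing are assumptions; SeqFreed may handle this differently internally.

```python
def parse_entries(text):
    """Split entry-window text into lower-cased identifiers, one per line.

    Blank lines and surrounding whitespace are ignored.
    Hypothetical helper, not SeqFreed's own code.
    """
    return [line.strip().lower() for line in text.splitlines() if line.strip()]


entries = parse_entries("b1491\nclpP\n\noxyr\nC23H3.2\n")
```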
Protein
- full length - the whole translated coding sequence (CDS)
- N-terminus - the N-terminal amino acid sequence (see also "Quantifiers")
- C-terminus - the C-terminal amino acid sequence (see also "Quantifiers")
DNA
- CDS -full length - the nucleotide sequence of the whole coding sequence (CDS)
- 5'- CDS - the 5'-part of the CDS (see also "Quantifiers")
- CDS -3' - the 3'-part of the CDS (see also "Quantifiers")
- upstream - the upstream region of the CDS (see also "Quantifiers")
- downstream - the downstream region of the CDS (see also "Quantifiers")
- from/to - enter absolute coordinates here. If the second coordinate is greater than the first, SeqFreed will return the reverse complement of that region (this way you can easily access sequence data from genes that are located "counter clockwise" on the genome). When this option is chosen, identifiers given in the entry window to the left are ignored.
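For readers unfamiliar with the term, the sketch below shows in Python what extracting a region by absolute coordinates and taking its reverse complement amounts to. It assumes 1-based, inclusive coordinates (the GenBank convention) and unambiguous bases only. The function names are hypothetical, and the sketch deliberately does not decide when the reverse complement is applied; that rule is the one stated for the from/to option above.

```python
# Complement table for unambiguous DNA bases, both cases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def region(genome, first, second):
    """Return the region between two 1-based, inclusive coordinates.

    The coordinates may be given in either order.
    """
    low, high = sorted((first, second))
    return genome[low - 1:high]


genome = "ATGGCCTTTAAGTAA"
forward = region(genome, 4, 9)
reverse = reverse_complement(forward)
```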
Quantifiers
- number of nt/aa - If you have chosen the option for a partial sequence or an up-/downstream region, you have to enter the number of amino acids or nucleotides you want to see. Enter a number in the field "number of nt/aa".
- offset - You can also define an offset for partial sequences or up-/downstream regions. For example, for "a 100-nucleotide upstream region at a distance of 50 nucleotides from the start of the CDS" you would enter "100" in the field "number of nt/aa" and "50" in the "offset" field. The "offset" option is probably needed only rarely.
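The interplay of "number of nt/aa" and "offset" can be sketched in Python as follows. This is an illustration under stated assumptions, not SeqFreed's implementation: the sequence is treated as linear (no wrap-around on circular genomes), the CDS lies on the forward strand, and the offset is the gap between the requested window and the CDS start. The function names are hypothetical.

```python
def upstream(genome, cds_start, length, offset=0):
    """Return `length` nt that end `offset` nt before a forward-strand CDS.

    `cds_start` is the 1-based position of the first CDS nucleotide.
    """
    end = max(0, cds_start - 1 - offset)  # 0-based, exclusive end of window
    start = max(0, end - length)          # clamp at the start of the sequence
    return genome[start:end]


def n_terminus(protein, length, offset=0):
    """Return `length` amino acids, skipping the first `offset` residues."""
    return protein[offset:offset + length]


genome = "AAAACCCCGGGGATGTTT"   # the CDS starts with ATG at position 13
directly_upstream = upstream(genome, 13, 4)
shifted_upstream = upstream(genome, 13, 4, offset=4)
first_three = n_terminus("MKTAYIAK", 3)
```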
The toolbar
All buttons in the toolbar display tooltips when you move the mouse over them. It is all quite self-explanatory, so just read the tooltips and play around. The selection bar to the right gives you quick access to auxiliary files that you stored in the directory "/auxil". Codon tables, plasmid maps, text or image files: all your permanently used helper files are accessible with a single click.
The display unit
Use the toolbar buttons to switch between single- and double-page mode and between normal and "full-screen" mode, respectively. For standard 'copy', 'paste', etc., a context menu pops up when you click the right mouse button. 'One-step-copy' and 'one-step-append' are available via the toolbar. Change the font size of text or the zoom factor of images with the respective toolbar buttons.
The command line entry bar
Enter your command line here. SeqFreed will pass the content of the currently active page to your program and write the output to this page.
Though unobtrusive, this unit is SeqFreed's core functionality. It gives you access to all the bioinformatics tools installed on your machine. Consider the "Emboss" package with its dozens of programs. Let's say you have a nucleotide sequence in the current page of the active notebook, preferably one you retrieved via SeqFreed's database unit. If you now enter "remap -filter", SeqFreed will call the remap program and pass it the sequence data. The output is displayed in the page that previously held the source data (if you want to keep the input data, first copy it to the other notebook with a single click on the respective toolbar button). If the output is an image, SeqFreed will display it, and you can even save it to a file. There is no magic behind this entry bar: it just passes your command to the shell, so you have to consider two things:
-- Make sure that you pass the right options to the program you are calling, so that it *reads from stdin* and *writes to stdout*. These options depend on the program, so you have to read its help file. "Emboss" programs, for example, take the "-filter" option to accept input data via stdin.
-- Make sure the program can be found. Either it is already in your PATH variable, or you give its absolute pathname in the command. SeqFreed does no guessing here and behaves just as if you had invoked the command from the shell.
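The mechanism described above, handing a command line to the shell with the page content on stdin and collecting stdout, can be sketched in Python like this. It is written for a current Python 3 (3.7 or newer) rather than the Python version SeqFreed itself targets, it handles text output only, and the function name run_filter is hypothetical. The "tr" command merely stands in for a real tool such as "remap -filter".

```python
import subprocess


def run_filter(command, page_text):
    """Run `command` through the shell, feeding `page_text` on stdin.

    Returns the command's stdout as text. Raises CalledProcessError
    if the command exits with a non-zero status.
    """
    result = subprocess.run(
        command,
        shell=True,           # the command line is handed to the shell as typed
        input=page_text,      # page content goes to the program's stdin
        capture_output=True,  # collect stdout (and stderr) instead of printing
        text=True,
        check=True,
    )
    return result.stdout


# "tr" is a stand-in for a sequence tool that reads stdin and writes stdout.
output = run_filter("tr acgt ACGT", "atggcc\n")
```

Because the command goes through the shell unchanged, the same PATH lookup rules apply as in a terminal, which is why the two points above matter.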
When you use the command line entry bar you will notice its entry completion functionality. You can "train" SeqFreed to "guess" your commands, so you don't have to enter them over and over again. For this, use the buttons on the right-hand side of the bar. If you want to retain a command permanently, click the button with the green check mark. The current line in the bar will be stored in SeqFreed's configuration file. If you want to remove a line from the current command line history, just click the button with the cross. If you want to remove that line permanently from the config file, click the check mark button with an empty line afterwards. So the sequence "cross button" + "check mark button" removes a command from your permanent entry completion history. It sounds a bit complicated, but just play around with it and you will see that you can customize the command line interface so that it is effortless to work with.
The menubar
Here you find the standard functionality you know from other programs you have worked with.