Using the wgVISTA web interface
- Input to the server
We ask for your email address so that we can notify you when the results are ready.
You can submit your sequences to the server in two ways:
Upload them from your computer as a plain text file in Fasta format using the "Browse" button. The first word in header lines (lines which start with the ">" character) will be used as the name of the corresponding sequence. The sequence can be in upper or lower case letters. If the "Soft-masked" box is checked, the lower case nucleotides will be interpreted as repeats, otherwise they will be converted to upper case.
Sample sequence in Fasta format (you will find more details on the format at the NCBI site):
>contig1
ATCACGCTCTTTGTACACTCCGCCATCTCTCTCT
CTCTCGAGCAGATCTCTCTCGGGAATATCGACAA
...
>contig2
ATCACGCTCTTTGTACACTCCGCCATCTCTCTCT
CTCTCGAGCAGATCTCTCTCGGGAATATCGACAA
...
>contig3
ATCACGCTCTTTGTACACTCCGCCATCTCTCTCT
CTCTCGAGCAGATCTCTCTCGGGAATATCGACAA
...
Note: at this time we accept only the letters CAGTN and X in your sequence. Please make sure to submit a sequence as plain text, not a Word or HTML file.
Specify the genome's GenBank accession number (or a list of accession numbers separated by spaces or commas). The server will use it to retrieve the sequence(s) automatically from the GenBank database and process them.
Note: In both cases, the size of the genome should not exceed 10 Megabases.
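Before uploading, it may help to check a file locally against the rules above. The following is a minimal sketch, not the server's own validation: the function names are hypothetical, and it assumes that "10 Megabases" means 10,000,000 bases in total across all sequences in the file.

```python
import re

# Assumption: "10 Megabases" is read as 10,000,000 bases in total.
MAX_TOTAL_BASES = 10_000_000

# The help text says only the letters C, A, G, T, N and X are accepted,
# in upper or lower case.
ALLOWED_LINE = re.compile(r"^[ACGTNXacgtnx]*$")


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from Fasta-formatted text.

    The name is the first word of each header line (a line starting with ">").
    """
    records = []  # list of [name, list_of_sequence_lines]
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            words = line[1:].split()
            if not words:
                raise ValueError(f"line {lineno}: header has no sequence name")
            records.append([words[0], []])
        else:
            if not records:
                raise ValueError(f"line {lineno}: sequence data before first header")
            if not ALLOWED_LINE.match(line):
                raise ValueError(f"line {lineno}: only the letters CAGTN and X are accepted")
            records[-1][1].append(line)
    return [(name, "".join(parts)) for name, parts in records]


def check_submission(text, max_bases=MAX_TOTAL_BASES):
    """Parse the text and enforce the (assumed) total size limit."""
    records = parse_fasta(text)
    total = sum(len(seq) for _, seq in records)
    if total > max_bases:
        raise ValueError(f"genome has {total} bases, limit is {max_bases}")
    return records
```

A file that passes `check_submission` is at least plain text with recognizable headers and accepted letters. Whether the server applies further checks is not covered here.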
These options allow you to customize your VISTA analysis. You can choose to use translated anchoring in Shuffle-LAGAN, which can improve the alignment of distant species, and specify the RankVISTA probability threshold. You can also use independently obtained gene annotations, select an appropriate repeat-masking option, and give specific names to the analyzed genomes. If you do not fill in these additional options, we will use their default values.
For each genome you can select:
Select names for your genomes. We suggest that you use something meaningful, such as the name of an organism, the number of your experiment, or your database identifier. When you use a GenBank identifier to input your sequence, we will use it as the name of the genome by default.
If a gene annotation of the sequence is available, you can submit it in a plain text file to be displayed on the plot.
Each line in the file should have five tab-separated columns:
1) gene name;
2) sequence name;
3) strand ("+" or "-");
4) gene start;
5) gene end.
gene1	contig1	+	10	100
gene2	contig1	-	1000	1200
gene3	contig2	+	1100	2100
gene4	contig2	-	4100	5100
If you leave the "Annotation" field empty, the program will automatically retrieve gene annotations from GenBank for sequences specified by their GenBank accession numbers.
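The five-column annotation format described above can be read and written with a few lines of code. This is an illustrative sketch only: the `Gene` type and both function names are hypothetical, and the checks cover nothing beyond what the format description states (five tab-separated columns, a "+" or "-" strand, integer start and end).

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str       # column 1: gene name
    sequence: str   # column 2: sequence name (e.g. a contig)
    strand: str     # column 3: "+" or "-"
    start: int      # column 4: gene start
    end: int        # column 5: gene end


def parse_annotation(text):
    """Parse tab-separated annotation text into a list of Gene records."""
    genes = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) != 5:
            raise ValueError(f"line {lineno}: expected 5 tab-separated columns, got {len(fields)}")
        name, sequence, strand, start, end = fields
        if strand not in ("+", "-"):
            raise ValueError(f"line {lineno}: strand must be '+' or '-'")
        # int() raises ValueError if start or end is not an integer.
        genes.append(Gene(name, sequence, strand, int(start), int(end)))
    return genes


def format_annotation(genes):
    """Write Gene records back out, one tab-separated line per gene."""
    return "".join(
        f"{g.name}\t{g.sequence}\t{g.strand}\t{g.start}\t{g.end}\n" for g in genes
    )
```

Writing the file with `format_annotation` guarantees real tab characters between columns, which is easy to get wrong when a text editor silently converts tabs to spaces.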
If the "Soft-masked" box is checked, the lower case nucleotides in the submitted sequence will be interpreted as repeats, otherwise they will be converted to upper case.
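The effect of the "Soft-masked" box can be pictured with a small sketch. It assumes nothing about how the server works internally; the function names and the 0-based, half-open interval convention are choices made for this illustration.

```python
import re


def masked_intervals(sequence):
    """Return (start, end) pairs for each run of lower-case letters.

    Coordinates are 0-based and half-open, a convention chosen for this sketch.
    """
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", sequence)]


def apply_soft_mask_option(sequence, soft_masked):
    """Mimic the two behaviours described for the "Soft-masked" box.

    Checked: keep the sequence as is and report lower-case runs as repeats.
    Unchecked: convert everything to upper case and report no repeats.
    """
    if soft_masked:
        return sequence, masked_intervals(sequence)
    return sequence.upper(), []
```

For example, `apply_soft_mask_option("ACgtnAAcc", True)` keeps the sequence unchanged and reports the runs `gtn` and `cc` as repeats, while passing `False` returns `"ACGTNAACC"` with no repeats.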
Several minutes after submitting your sequences, you will receive an email from firstname.lastname@example.org with web links to the VISTA Browser and to the location where you can download the raw alignments. The results will be stored on the server for one month. Detailed help and instructions for the VISTA Browser are available at http://pipeline.lbl.gov/vgb2help.shtml