**** UPDATE ****

More recent versions of YGOB contain further updated S. castellii annotations.
Users are now advised to use this data, which can be found here:

    http://wolfe.gen.tcd.ie/ygob/data/latest/

The two 'Scastellii_LATEST' files also link to this data.

**** UPDATE ****

-----------------------------------------------------------------------------------------
Wolfe Lab Saccharomyces castellii Re-annotation
4th April 2006
kevin.byrne@tcd.ie
http://wolfe.gen.tcd.ie/ygob
-----------------------------------------------------------------------------------------

This folder contains files detailing our re-annotation of the S. castellii
genome. S. castellii was sequenced to 4x coverage by Cliften et al. (2003), but
because of inconsistencies in the original gene coordinates we have performed a
complete re-annotation.

Where our revised gene predictions could be matched to a gene from the Cliften
et al. annotation, we have retained the original name. Genes not identified by
Cliften et al. were named by adding a 'd' suffix to the name of the gene to
their left; e.g. a new gene between Scas_600.10 and Scas_600.11 is named
Scas_600.10d. Genes whose coordinates were changed relative to the Cliften et
al. annotation are denoted with a '*' suffix in archive datasets, but this
convention is no longer supported.

We also merged contigs into super-contigs when two ends of the same gene were
on different contigs and there was clear evidence from synteny that the contigs
were linked. Runs of 100 'N's have been inserted between merged contigs.

For details of the annotation pipeline or access to the complete annotation
(with tRNAs etc.) email scannedr at tcd dot ie

-----------------------------------------------------------------------------------------
Current S. castellii datasets (04/04/06)

+ Scastellii_040406.tab

  Columns are, from left to right:

      NAME         in the format Scas_[Cliften contig number].[orf number]
      START        all coordinates are absolute and inclusive
      STOP
      LENGTH
      STRAND       +1/-1
      EXONS        number of coding fragments
      SUPERCONTIG

  If the ORF has more than one exon, the exons are detailed in subsequent lines
  starting with "=>", with three columns: START, STOP, LENGTH. All coordinates
  refer to the super-contigs provided in the next file.

+ Scastellii_040406_super-contigs.fsa

  Contigs from Cliften et al. that have not been merged are unchanged and all
  have numbers less than 1000. Merged super-contigs are numbered from 1000
  upwards, and their constituent Cliften et al. contigs are joined by runs of
  100 'N's. The contigs that make up each super-contig are listed, in order and
  orientation, in square brackets following the super-contig name.

+ Scastellii_040406_orfs.fsa

  DNA sequences for the ORFs listed in "Scastellii_040406.tab".

+ Scastellii_040406_prots.fsa

  Protein sequences for the ORFs listed in "Scastellii_040406.tab".

-----------------------------------------------------------------------------------------
Archive S. castellii datasets

+ Scastellii_Nature_prots.fsa

  This file contains the S. castellii re-annotated protein set that we used for
  the analyses in Byrne and Wolfe (2005) and Scannell et al. (2006). This is
  the data used in the Yeast Gene Order Browser (YGOB; wolfe.gen.tcd.ie/ygob).

-----------------------------------------------------------------------------------------
REFERENCES

[1] Finding functional features in Saccharomyces genomes by phylogenetic
    footprinting.
    Cliften P, Sudarsanam P, Desikan A, Fulton L, Fulton B, Majors J,
    Waterston R, Cohen BA, Johnston M.
    Science. 2003 Jul 4;301(5629):71-6.

[2] The Yeast Gene Order Browser: combining curated homology and syntenic
    context reveals gene fate in polyploid species.
    Byrne KP, Wolfe KH.
    Genome Res. 2005 Oct;15(10):1456-61.

[3] Multiple rounds of speciation associated with reciprocal gene loss in
    polyploid yeasts.
    Scannell DR*, Byrne KP*, Gordon JL, Wong S, Wolfe KH.
    Nature. 2006 Mar 16;440(7082):341-5.

-----------------------------------------------------------------------------------------
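
To illustrate the Scastellii_040406.tab layout described above, the following
Python sketch reads ORF lines together with their "=>" exon lines. It is only a
sketch under stated assumptions: that columns are separated by tabs or other
whitespace, that the file has no header line, and that SUPERCONTIG is a single
token. The names Orf, parse_tab and SAMPLE are hypothetical, and the
coordinates in SAMPLE are invented for the example, not taken from the real
annotation.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class Orf:
    """One ORF record; coordinates are absolute and inclusive."""
    name: str
    start: int
    stop: int
    length: int
    strand: int          # +1 or -1
    n_exons: int         # number of coding fragments
    supercontig: str
    exons: List[Tuple[int, int, int]] = field(default_factory=list)


def parse_tab(lines: Iterable[str]) -> List[Orf]:
    """Parse ORF lines and attach any following "=>" exon lines to them."""
    orfs: List[Orf] = []
    for raw in lines:
        fields = raw.split()          # assumption: whitespace/tab separated
        if not fields:
            continue                  # skip blank lines
        if fields[0].startswith("=>"):
            if not orfs:
                raise ValueError("exon line found before any ORF line")
            # Accept both "=> 500 599 100" and "=>500 599 100".
            values = [v for v in [fields[0][2:]] + fields[1:] if v]
            start, stop, length = (int(v) for v in values[:3])
            orfs[-1].exons.append((start, stop, length))
        else:
            name, start, stop, length, strand, n_exons, supercontig = fields[:7]
            orfs.append(Orf(name, int(start), int(stop), int(length),
                            int(strand), int(n_exons), supercontig))
    return orfs


# Invented sample data: one single-exon ORF and one two-exon ORF.
SAMPLE = (
    "Scas_600.10\t100\t399\t300\t+1\t1\t600\n"
    "Scas_600.10d\t500\t899\t330\t-1\t2\t600\n"
    "=>\t500\t599\t100\n"
    "=>\t670\t899\t230\n"
)

if __name__ == "__main__":
    for orf in parse_tab(SAMPLE.splitlines()):
        print(orf.name, orf.strand, orf.supercontig, orf.exons)
```

For a real file, the same function could be called as
parse_tab(open("Scastellii_040406.tab")). Because coordinates are inclusive,
each exon LENGTH should equal STOP - START + 1, which gives a simple sanity
check on the parsed records.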