Monday, March 15, 2010

EnsEMBL release 57 (March-10 2010)

Please read the original full page at:
Below is only a selection of features concerning human and mouse.

What's New in Release 57
Data updates

Human gene update (Human): The human gene set has been cleaned up and about 1000 genes have been removed.
Mouse Havana Merge (Mouse): The mouse gene set has been updated using the new HavanaMerge code.
Updates to otherfeatures database (Human): Two updates have been made to the human otherfeatures database. First, the human EST alignments have been rerun, so recently sequenced ESTs should be available. Second, the NCBI gene set, currently attached as a DAS track in Ensembl, now lives in the otherfeatures database. This gene set is more recent than the one currently seen in the DAS track, and viewing will be faster now that the genes are stored in the otherfeatures database.

Updated ontology database (all species): The ontology database, ensembl_ontology_57, has been updated with the latest data from GO and SO.
Human eFG data update (Human): The human eFG DB has been updated with some new histone modification data sets.
lincRNAs (Human, Mouse): Ensembl human and mouse databases now include lincRNA (large intervening non-coding RNA) data.
Clarification of MT analyses (Human, Mouse): Imported MT genomes now have a distinct analysis/logic_name to distinguish them from the Ensembl annotated gene set. Their source has also been changed to reflect this.
Fish multiple alignments (Zebrafish, Fugu, Tetraodon, Medaka, Stickleback): Ensembl Compara now includes a multiple alignment of all five species: Danio, Gasterosteus, Oryzias, Takifugu and Tetraodon.
cDNA updates (Human, Mouse): An updated version of the cDNA database is now available for human and mouse.

Human variation data updates (Human)

GWAS data from a paper in BMC Medical Genetics: "An Open Access Database of Genome-wide Association Results"
A set of old/retired human rsIDs has been added to the variation_synonym table as synonyms of the corresponding current rsIDs.
Import of EGA data from a genomewide association study of variations linked to stroke and ischemic stroke ("Genomewide Association Studies of Stroke", Ikram et al., N Engl J Med. 2009 Apr 23;360(17):1718-28).
New variation consequence predictions for the new human gene set
Import of additional data from the NHGRI GWAS catalog
First CNV data set from:
Redon 2006 "Global variation in copy number in the human genome" PMID:17122850
Wang 2009 "The diploid genome sequence of an Asian individual" PMID:18987735
Fixes in human database:
flanking sequences that should have been reversed
merged rsIDs that map to the same location
corrected typo in Watson source
reimported Affymetrix data
Watson and Venter's read_coverage redone to include the MT chromosome
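
To illustrate how the retired rsIDs stored in variation_synonym might be used, here is a minimal Python sketch that resolves an old rsID to its current name. It runs against a tiny in-memory SQLite database rather than the real Ensembl MySQL server. The two-table layout is a simplified assumption (only the columns needed for the join), and the rsID values and the helper names build_demo_db and resolve_rsid are made up for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified, assumed layout.

    Only the columns needed for the lookup are included; the real
    variation schema has more columns than shown here.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE variation (
            variation_id INTEGER PRIMARY KEY,
            name TEXT
        );
        CREATE TABLE variation_synonym (
            variation_synonym_id INTEGER PRIMARY KEY,
            variation_id INTEGER,
            name TEXT
        );
        """
    )
    # Made-up identifiers, for illustration only.
    conn.execute("INSERT INTO variation VALUES (1, 'rs_current_demo')")
    conn.execute("INSERT INTO variation_synonym VALUES (1, 1, 'rs_retired_demo')")
    return conn


def resolve_rsid(conn, rsid):
    """Return the current variation name for rsid, or None if unknown."""
    # Case 1: the identifier is already a current name.
    row = conn.execute(
        "SELECT name FROM variation WHERE name = ?", (rsid,)
    ).fetchone()
    if row:
        return row[0]
    # Case 2: the identifier is a retired name stored as a synonym.
    row = conn.execute(
        """
        SELECT v.name
        FROM variation_synonym vs
        JOIN variation v ON v.variation_id = vs.variation_id
        WHERE vs.name = ?
        """,
        (rsid,),
    ).fetchone()
    return row[0] if row else None
```

Calling resolve_rsid(conn, "rs_retired_demo") on the demo database returns the current name; an identifier found in neither table returns None.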
New variation data (multiple species)

Cleaning joined human genes and unlinked xrefs (Human): A number of "fused" genes and unlinked xrefs were removed from Ensembl Human, including the following transcripts:
ENST00000361469, which caused JAK3 and INSL3 genes to get fused and be called JAK3
ENST00000428942, which caused TMEM91 and BCKDHA to join, labelling BCKDHA as TMEM91
We would like to thank our users who reported these errors. Please contact us if you see any others.

*NEW* Homo sapiens Structural Variation dataset now available

SNP Effect Predictor (all species): By popular demand, we have added a web-based tool which can calculate the consequence type of SNPs. Upload your data in our GTF-like SNP format and it will be available to download as text or HTML.
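
The upload format is not spelled out here, so the following Python sketch only shows how a whitespace-separated SNP file could be checked before upload. It assumes five columns per line (chromosome, start, end, allele string such as A/G, strand); that column layout, the SnpRecord type and both function names are assumptions made for illustration, not the documented format of the tool.

```python
from typing import NamedTuple, Tuple


class SnpRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    alleles: Tuple[str, ...]
    strand: int


# Accepted strand spellings in this sketch (an assumption).
_STRANDS = {"+": 1, "-": -1, "1": 1, "-1": -1}


def parse_snp_line(line):
    """Parse one line of the assumed five-column SNP format."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
    chrom, start, end, allele_string, strand = fields[:5]
    if strand not in _STRANDS:
        raise ValueError(f"unrecognised strand {strand!r} in: {line!r}")
    return SnpRecord(
        chrom=chrom,
        start=int(start),
        end=int(end),
        alleles=tuple(allele_string.split("/")),
        strand=_STRANDS[strand],
    )


def parse_snp_text(text):
    """Parse a whole file's text, skipping blank lines and '#' comments."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        records.append(parse_snp_line(line))
    return records
```

Running parse_snp_text over a file before uploading it catches malformed rows early, since any line with too few columns or an unknown strand raises ValueError.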

(this Post content was reproduced from:, Via .)