| RNA Linkpages | |
| MedDB (Medical University of Vienna, Austria) |
MedDB,
created at the Medical University of Vienna, Austria, provides a
concise overview
of bioinformatics resources, organized into well-structured categories.
The "Genomics" section of MedDB also contains two sub-sections, "RNA"
and "Splicing".
Please also refer to this main section of MedDB. |
| RNA World Website (IMB, Jena, Germany) |
The RNA World Website, maintained at the IMB in Jena, Germany, is a good starting point for information on RNA structures as well as RNA-related information in general. Its sections include databases and web tools, software, online books and tutorials, meetings, and more. |
| RNA Structure | |
| Vienna RNA Package (TBI, University of Vienna) including: Vienna RNA secondary structure server RNAfold Alifold RNAinverse |
1. The Vienna RNA Package
was developed for the prediction and comparison of RNA secondary
structures at the Theoretical Biochemistry Group (TBI) of the University of
Vienna, Austria. The package is free software and can be downloaded as
C source code that should be easy to compile on almost any flavor of Unix
and Linux. Note: This package was developed for UNIX
command-line use; there are no graphical user interfaces.
2. The Vienna RNA secondary structure server nevertheless offers access to the most popular features of the Vienna RNA Package via easy-to-use web interfaces. Note that these servers generally have to limit request sizes for performance reasons.
2.1. RNAfold: This is the web interface to the RNAfold program. This server predicts secondary structures of single-stranded RNA or DNA sequences. RNAfold thus provides the most basic and most widely used function of the package. The output presents the predicted minimum free energy (mfe) structure both as a string in bracket notation and as links to the plots generated for visualization. Plots are produced in PostScript format. A suitable alternative is the new Scalable Vector Graphics (SVG) standard, for which the browser has to be equipped with an SVG plugin (typically from Adobe).
NOTE: The UCSC Genome Bioinformatics site provides pre-computed structures of 5'- and 3'-UTR regions of all RNAs, which were produced using RNAfold. The estimated folding energy is given in kcal/mol. The more negative the energy, the more secondary structure the RNA is likely to have. As there are no stable URLs for individual gene entries in UCSC, you may follow this example: Open the UCSC Gene Sorter and search for the human gene PTGS2. You will retrieve a table where you can access the specific gene entry via the link "Description" (last column). This PTGS2-specific page has several sections, one of which is "mRNA Secondary Structure of 3' and 5' UTRs". There are several display formats for the predicted structures:
- "Picture": produces a PDF file of the structure. You need a program capable of displaying PDF files, like Adobe Acrobat.
- "PostScript": produces a PostScript file of the structure. You need a program capable of displaying PS files, like GSview.
- "Text": produces the structure as a string in bracket notation.
2.2. Alifold: This server predicts the consensus secondary structure for a set of aligned single-stranded RNA or DNA sequences. Usage is almost identical to that of the RNAfold service. Instead of a single input sequence, a precomputed sequence alignment (which must be in ClustalW format) is uploaded via the input form. Results are again visualized in PostScript plots.
2.3. RNAinverse: This program searches for RNA sequences that will fold into a given secondary structure, which is the inverse of the structure prediction problem. The input consists of the target structure in bracket notation. Two different fold algorithms may be used: you can either search for sequences that have your target as their minimum free energy (mfe) structure, or for sequences that strongly prefer your target structure (via calculation of the partition function). |
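The bracket notation mentioned above (RNAfold output, RNAinverse input) encodes each base pair as a matching "(" and ")" and each unpaired base as ".". The following is a minimal sketch, not part of the Vienna RNA Package, of how such a string can be turned into a list of base pairs. The function name and the example hairpin are hypothetical, and only the three symbols "(", ")" and "." are handled.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of (i, j) base pairs (0-based) from bracket notation."""
    stack = []   # positions of '(' that are still waiting for a partner
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            # The most recently opened bracket pairs with this closing one.
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Hypothetical example: a small hairpin with a 3-bp stem and a 4-nt loop.
example_sequence = "GGGAAAACCC"
example_structure = "(((....)))"
for i, j in parse_dot_bracket(example_structure):
    print(i, j, example_sequence[i], example_sequence[j])
```

A stack is used because base pairs in this notation are properly nested, so each ")" always closes the most recent open "(".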
| Zuker Group (RPI, Troy, New York) including: Mfold Mfold web server |
The Zuker Group
is the group of Michael
Zuker, professor of mathematical sciences at Rensselaer Polytechnic
Institute (RPI) in Troy, New York.
Michael Zuker
is best known for his work on algorithms for predicting RNA and DNA
secondary structure. 1. Mfold is a widely used program for RNA folding prediction. Mfold can be downloaded for UNIX and Linux. 2. The Mfold web server is the web interface to the Mfold program for RNA folding prediction. Please note that there is also a separate Mfold web interface for DNA folding prediction. |
| RNA Processing | |
| NOTE: RNA processing is the conversion of a precursor RNA into its mature form. Because several different mechanisms are involved, such as splicing, capping, and polyadenylation, different bioinformatics resources have emerged to address these topics. Please note that splicing and polyadenylation are tightly coupled to the generation of alternative transcripts, which in turn are often expressed in a tissue-specific manner. Alternative splicing and polyadenylation databases are therefore listed in section "ESTs, Splicing, Polyadenylation" in main section "Expression". |
| UTRs | |
| NOTE:
The untranslated regions (UTRs)
of mRNAs often contain specific regulatory motifs which can influence
mRNA stability, localization, and translation. Please note that
specific motifs might be described elsewhere, like those related to polyadenylation, which are listed in section
"ESTs, Splicing, Polyadenylation" in main section "Expression". |
|
| ARED (KFSH&RC, Riyadh, Saudi Arabia) |
ARED, the AU-Rich Element
Database, stores information on adenylate-uridylate rich elements (AREs)
in human mRNAs. ARED is maintained at the King Faisal Specialist Hospital
& Research Centre (KFSH&RC)
in Riyadh, Saudi Arabia. ARED contains GenBank entries where the 3'UTR
matches the ARE motif, a 13-bp pattern WWWUAUUUAUWW (W=A/U), which was
computationally derived from a list of functionally labile
ARE-containing mRNAs.
NOTE: ARED indicates that ARE-mRNAs represent as much as 5-8% of human genes. However, because ARED contains computationally predicted ARE-mRNAs, there is no evidence of how many of them are actually regulated by this mechanism. AREs are known to be recognized by specific proteins and/or small regulatory RNAs which dramatically influence the stability of the mRNA. Most of them are negative regulators (like ZFP36) which promote mRNA decay, but positive regulators which stabilize the target mRNA (like HuR) also exist. Known examples of ARE-dependent regulation are the mRNAs of TNFalpha, PTGS2 (COX2), CSF2 (GMCSF), and IL3. Accordingly, several diseases, such as chronic inflammatory conditions, are known to be caused by stabilized ARE-mRNAs.
Query: Three versions of the ARED database exist so far. While v1 and v2 only support single queries (gene names, IDs, mRNA acc., RefSeq, UniGene, etc.), v3 also supports batch queries using, e.g., a list of gene names from a microarray experiment. NOTE: The list has to be pasted in column format (as if copied from Excel), not as space-delimited text! Note: When using the "Advanced search" option, you may also browse the (long) lists of ARE-mRNAs by selecting an ARE cluster and leaving all other fields empty.
Output: If you submitted a list of genes, ARED produces a table of all genes from that list that have predicted AREs in their 3'UTRs stored in the ARED database. Note that the actual sequences are NOT shown, only the "Class" and the "Cluster" of the respective AREs. ARE-mRNAs are clustered according to the length of the individual AREs: Cluster 1 mRNAs contain 5 continuous AREs, Cluster 2 mRNAs contain 4, and Cluster 5 mRNAs contain 1 ARE in a 13-bp ARE context. Note: The result table may be saved as a tab-delimited txt file, which can easily be opened in Excel. Note: The authors of ARED state that the database is available as a single GenBank flat file (i.e. nucleotide sequence with annotation) upon request. |
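The ARE motif described above can be searched for in a 3'UTR with an ordinary regular expression, since W simply stands for "A or U". The sketch below is only an illustration and is not the procedure used to build ARED. The pattern string is copied exactly as printed in the text above (12 characters as written), the function names are hypothetical, and the test sequence is made up.

```python
import re

# Pattern copied as printed in the ARED description; W = A or U.
ARE_PATTERN = "WWWUAUUUAUWW"


def iupac_to_regex(pattern):
    """Translate the letters used in the motif (A, U, W) into a regex string."""
    table = {"A": "A", "U": "U", "W": "[AU]"}
    return "".join(table[ch] for ch in pattern)


def find_are_sites(utr, pattern=ARE_PATTERN):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # Accept DNA-style input (T) and lower case by normalizing to RNA letters.
    rna = utr.upper().replace("T", "U")
    # Wrapping the motif in a lookahead makes the regex engine report
    # overlapping matches as well.
    regex = re.compile("(?=" + iupac_to_regex(pattern) + ")")
    return [match.start() for match in regex.finditer(rna)]


# Made-up 3'UTR fragment containing one motif instance.
print(find_are_sites("GCCAUUUAUUUAUUAGCC"))
```

Passing a different `pattern` argument lets the same function be reused if the database documentation specifies a longer or shorter motif.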
| UTResource (ITB, Bari, Italy) including: UTRdb UTRSite UTRScan PatSearch |
UTResource
is a portal to
internet resources for the sequence
analysis of 5' and 3' untranslated regions of eukaryotic mRNAs. It contains links to the following databases and programs:
UTRdb: A specialized, non-redundant collection of 5' and 3' UTR sequences from eukaryotic mRNAs.
UTRSite: A collection of functional sequence patterns located in 5' or 3' UTR sequences. UTRdb and UTRSite allow SRS-based queries of the UTR databases, which means that the user can run keyword searches like "AUUUA".
UTRScan: Looks for UTR functional elements by searching user-submitted query sequences for the patterns defined in the UTRSite collection.
PatSearch: A pattern-matching tool that searches for a well-defined pattern in one or more given sequences or in (primary or specialized) database divisions. The user can define a pattern and select a database to search against. NOTE: The databases comprise not only UTR databases of diverse species but also many others, like EST, GSS, HTGS, Swissprot, and TREMBL.
NOTE: In order to use the programs UTRScan and PatSearch, a registration form has to be completed. The programs can be accessed after e-mail confirmation. |
| Non-coding RNA | |
| NOTE: There are several types of non-coding RNA, meaning RNA that does not code for a protein sequence, like ribosomal RNA, tRNA, and snoRNA. As short RNA molecules have gained increasing attention as important regulators of gene expression, bioinformatics resources covering this topic have emerged as well. Small interfering RNA (siRNA, the mediator of RNA interference, RNAi) and microRNA (miRNA) have been identified as sequence-specific posttranscriptional regulators of expression. While siRNA is generated from dsRNA produced by viruses or activated transposons, miRNA is transcribed from miRNA genes located in the genome. | |
| miRBase (Sanger) |
miRBase is the
new home for microRNA data, incorporating the database and gene naming roles
previously provided by the miRNA Registry, and including the new
miRBase Targets database. miRBase contains 3 main sections:
1. miRBase Sequences contains all published miRNA sequences, genomic locations, and associated annotation. Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR). Both hairpin and mature sequences can be searched using BLAST and SSEARCH, and entries can also be retrieved by name, keyword, references, and annotation. All sequence and annotation data are also available for download. Note that the predicted stem-loop sequences in the database are not strictly precursor miRNAs (pre-miRNAs), but include the pre-miRNA and some flanking sequence from the presumed primary transcript.
2. miRBase Targets is a web resource developed by the Enright Lab at the Wellcome Trust Sanger Institute containing computationally predicted targets for microRNAs across many species. The miRNA sequences are obtained from the miRBase Sequence database and most genomic sequence from EnsEMBL. The resource aims to provide the most up-to-date and accurate predictions of miRNA targets and will therefore be updated regularly to incorporate new miRNAs or EnsEMBL sequences.
3. miRBase Registry provides a confidential service assigning official names to novel miRNA genes prior to publication of their discovery.
Typical miRBase accessions: refer to section miRBase IDs. |
| PicTar (New York University) |
PicTar
is an algorithm for
the identification of microRNA targets. This searchable website
provides details (3' UTR alignments with predicted sites, links to
various public databases etc) regarding microRNA target predictions in
vertebrates and microRNA
target predictions across seven Drosophila species. PicTar can
be used BOTH for predicting the targets of a certain microRNA OR for
predicting the microRNAs which may target a specific mRNA of interest.
Query: The user may choose: 1. an organism, a dataset (note that evolutionary conservation is taken into account!), and a microRNA ID for which the potential targets shall be presented, 2. OR a gene ID for which the potentially matching microRNAs shall be predicted.
Output: 1. A list of potential target genes ranked by a specific PicTar score, with links to RefSeq and to a custom view of the UCSC Genome Browser displaying the PicTar miRNA prediction sites. 2. A multiple species alignment of the cDNA of the chosen gene, highlighting the positions of the individual predicted miRNA sites. |
| Rfam (Sanger and Washington University) |
Rfam is a joint
project involving researchers
based at the Wellcome Trust Sanger
Institute, and Washington
University, St. Louis (also providing an Rfam mirror site). Rfam is a large
collection of multiple
sequence alignments and covariance models covering many common non-coding
RNA families. For each family in Rfam you can:
View and download multiple sequence alignments, read family annotation,
examine species distribution of family members, and follow links to
other databases. Rfam provides:
- a keyword search that accepts any keyword, like "miR-16".
- a sequence search facility that analyzes a DNA query sequence for Rfam family matches.
- genome annotation: in conjunction with the INFERNAL software suite, Rfam can be used to annotate sequences (including complete genomes) for homologues of known non-coding RNAs. Please read important information about using Rfam for genome annotation.
Rfam provides pre-calculated lists of putative RNAs in over 100 complete genomes. Rfam makes use of a large amount of available data, especially published multiple sequence alignments, and repackages these data in a single searchable and sustainable resource. Rfam makes every effort to credit individual sources on family pages, which are also listed here. Typical Rfam accessions: refer to section Rfam IDs. |
| siDESIGN (Dharmacon, Inc., Chicago) |
siDESIGN
is an advanced, user-friendly siRNA design tool, provided free of charge
by Dharmacon, a company
focused on commercializing siRNA technology. siDESIGN is reported to
significantly improve the likelihood of identifying functional siRNA
when compared to other publicly available design tools. The siDESIGN
Center builds on early guidelines and adds eight additional criteria
developed by Dharmacon scientists. The siDESIGN Center offers the flexibility of defining specific target regions, adjusting certain design criteria, and selecting BLAST. Ranked lists of candidate siRNA sequences are provided along with siRNA sequences for all designs. |
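The general idea behind such design tools (restrict the search to a target region, apply adjustable criteria, return a ranked list of candidates) can be illustrated with a deliberately simplified sketch. This is not Dharmacon's algorithm: the single GC-content rule, the 19-nt window length, the default thresholds, and the ranking by distance from the middle of the GC range are all assumptions chosen for illustration, and no specificity (BLAST) check is performed.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def rank_candidates(target, region=None, length=19, gc_min=0.30, gc_max=0.52):
    """Return (start, window) candidates inside `region`, best-ranked first.

    region is a hypothetical (start, end) pair of 0-based positions,
    end exclusive; None means the whole target sequence.
    """
    target = target.upper()
    start, end = region if region is not None else (0, len(target))
    midpoint = (gc_min + gc_max) / 2
    scored = []
    for pos in range(start, end - length + 1):
        window = target[pos:pos + length]
        gc = gc_fraction(window)
        # Illustrative criterion: keep only windows inside the GC range.
        if gc_min <= gc <= gc_max:
            scored.append((abs(gc - midpoint), pos, window))
    # Windows whose GC content is closest to the middle of the range come first.
    scored.sort()
    return [(pos, window) for _, pos, window in scored]


# Made-up target sequence, short window so the example stays readable.
for pos, window in rank_candidates("AAAAAAAAAAGGGGGGGGGG", length=10,
                                   gc_min=0.3, gc_max=0.5):
    print(pos, window)
```

Real tools combine many more criteria and weight them empirically; the parameters here only show how user-adjustable rules translate into a filter over sliding windows.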
| siRNA selector (Wistar Inst., Philadelphia) |
siRNA selector
scans a target gene for candidate siRNA sequences that satisfy
user-adjustable rules. Small interfering RNA (siRNA) guides
sequence-specific degradation of the homologous mRNA, thus producing
"knock-down" cells. The program evaluates: 1. siRNA functionality, using empirical rules developed by Dharmacon, Amgen, and the University of Massachusetts. 2. siRNA specificity, by running a BLAST search of each sequence against UniGene. In addition, non-specific scrambled controls can be designed using the "scrambled" option. |
| TargetScan (MIT) |
TargetScan
is a portal at MIT storing several datasets of predictions of
microRNA targets, either targeting only the 3'-UTRs or also
targeting the ORF regions. TargetScan
can be used BOTH for
predicting the targets of a certain microRNA OR for predicting the
microRNAs which may target a specific mRNA of interest.
Query: The user may either 1. choose a microRNA family (like "miR-15/16/195") in order to predict the targets of this family, or 2. enter a human EntrezGene ID (like 5743 for human PTGS2) for which the potentially matching microRNAs shall be predicted. NOTE: In practice, a search for the gene name (PTGS2) was successful here, but a search for "5743" was not!
Output: 1. A list of potential target genes ranked by an EFDR (estimated false discovery rate) score, with links to the NCBI sequence database and to the UCSC Genome Browser. Note that this list contains quite detailed summaries of the individual genes' functions. 2. A tabular list of microRNA families matching the mRNA of interest, with links to Rfam and to the UCSC Genome Browser. |