Bioinformatics World    
         
 
              
                     
                 
RNA Linkpages
MedDB
(Medical University of Vienna, Austria)
MedDB, created at the Medical University of Vienna, Austria, provides a concise overview of bioinformatics resources, listed in well-structured categories. Its "Genomics" section contains two sub-sections relevant to this page, "RNA" and "Splicing".
Please refer also to this main section of MedDB. 
RNA World Website
(IMB, Jena, Germany)
The RNA World Website, maintained at the IMB, Jena, Germany, is a good starting point for information on RNA structures as well as RNA-related information in general. Its sections include databases and web tools, software, online books and tutorials, meetings, and more.
             
           
RNA Structure
Vienna RNA Package
(TBI, University of Vienna)

including:
Vienna RNA secondary structure server
RNAfold
Alifold
RNAinverse

1. The Vienna RNA Package was developed for the prediction and comparison of RNA secondary structures at the Theoretical Biochemistry Group (TBI) of the University of Vienna, Austria. The package is free software and can be downloaded as C source code that should be easy to compile on almost any flavor of Unix and Linux. Note: This package was developed for UNIX command-line use; there are no graphical user interfaces.

2. Nevertheless, the Vienna RNA secondary structure server offers access to the most popular features of the Vienna RNA Package via easy-to-use web interfaces. Note that these servers generally have to limit request sizes for performance reasons.

2.1. RNAfold: This is the web interface to the RNAfold program. The server predicts secondary structures of single-stranded RNA or DNA sequences, which is both the most basic and the most widely used function of the package. The output presents the predicted minimum free energy (mfe) structure as a string in bracket notation, together with links to the plots generated for visualization. Plots are produced in PostScript format. Scalable Vector Graphics (SVG) output is offered as an alternative; to view it, the browser has to be equipped with an SVG plugin (typically from Adobe).
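
The "string in bracket notation" mentioned above can be read programmatically. The following is a minimal sketch, not part of the Vienna RNA Package: it assumes the common convention that "(" and ")" mark the two partners of a base pair and "." marks an unpaired position. The function name parse_dot_bracket and the example string are made up for illustration.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of (i, j) base pairs (0-based positions)."""
    open_positions = []  # stack of positions of '(' not yet closed
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            # a ')' always pairs with the most recently opened '('
            pairs.append((open_positions.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# Made-up example structure: an outer stem enclosing an inner hairpin.
example = "((..((...))..))"
print(parse_dot_bracket(example))
```

A stack is used because base pairs in this notation are nested, so each closing bracket belongs to the most recent unclosed opening bracket.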

NOTE: The UCSC Genome Bioinformatics site provides pre-computed structures of 5'- and 3'-UTR regions of all RNAs, which were produced using RNAfold. The estimated folding energy is in kcal/mol. The more negative the energy, the more secondary structure the RNA is likely to have.
As there are no stable URLs for individual gene entries in UCSC, you may follow this example: Open the UCSC Gene Sorter and search for the human gene PTGS2. You will retrieve a table where you can access the specific gene entry via the link "Description" (last column). This PTGS2-specific page contains several sections, one of which is "mRNA Secondary Structure of 3' and 5' UTRs". There are several display formats for the predicted structures:
- "Picture": produces a PDF file of the structure. You need to have a program installed that can display PDF files, such as Adobe Acrobat.
- "PostScript": produces a PS file of the structure. You need to have a program installed that can display PS files, such as GSview.
- "Text": produces the structure as a string in bracket notation.

2.2. Alifold: This server predicts the consensus secondary structure for a set of aligned single-stranded RNA or DNA sequences. Usage is almost identical to that of the RNAfold service. Instead of a single input sequence, a precomputed sequence alignment (which must be in ClustalW format) is uploaded via the input form. Results are again visualized in PostScript plots.

2.3. RNAinverse: This program searches for RNA sequences that will fold into a given target secondary structure, which is the inverse of the structure prediction problem. The input consists of the target structure in bracket notation. Two different fold algorithms may be used: you can either search for sequences that have your target as their minimum free energy (mfe) structure, or for sequences that strongly prefer your target structure (via calculation of the partition function).
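
To illustrate what "a sequence for a target structure" means, here is a small sketch that checks whether a sequence is at least compatible with a target in bracket notation, i.e. whether every paired position could form a base pair. This is only a necessary condition and does not reproduce the RNAinverse search itself, which additionally requires folding the candidates. Allowing G-U wobble pairs next to the Watson-Crick pairs is an assumption of this sketch, and all function names are hypothetical.

```python
# Pairs treated as allowed in this sketch (Watson-Crick plus G-U wobble).
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def base_pairs(structure):
    """Yield (i, j) positions of matching brackets in a bracket string."""
    stack = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            yield stack.pop(), pos


def is_compatible(sequence, structure):
    """True if all paired positions of the target can form an allowed pair."""
    if len(sequence) != len(structure):
        return False
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input too
    return all((rna[i], rna[j]) in ALLOWED_PAIRS
               for i, j in base_pairs(structure))


target = "(((...)))"                        # made-up target: a 3-bp hairpin
print(is_compatible("GGGAAACCC", target))   # G-C pairs at all paired positions
print(is_compatible("GGGAAAGGG", target))   # G-G cannot pair
```

A compatible sequence is not guaranteed to fold into the target; it may prefer a different structure, which is exactly why the two search modes described above exist.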
Zuker Group
(RPI, Troy, New York)

including:
Mfold
Mfold web server
The Zuker Group is the group of Michael Zuker, professor of mathematical sciences at Rensselaer Polytechnic Institute (RPI) in Troy, New York. Michael Zuker is best known for his work on algorithms for predicting RNA and DNA secondary structure.

1. Mfold is a widely known and used program for RNA folding prediction. Mfold can be downloaded for UNIX and Linux.

2. Mfold web server is the web interface for the mfold program for RNA folding prediction. Please note that there is also a separate web interface of mfold for DNA folding prediction.


RNA Processing
NOTE: RNA processing denotes the conversion of a precursor RNA into its mature form. As several different mechanisms are involved, such as splicing, capping, and polyadenylation, different bioinformatics resources have emerged which address these topics. Please note that splicing and polyadenylation are tightly coupled to the generation of alternative transcripts, which in turn are often expressed in a tissue-specific manner. Thus, alternative splicing and polyadenylation databases are listed in section "ESTs, Splicing, Polyadenylation" in main section "Expression".


UTRs
NOTE: The untranslated regions (UTRs) of mRNAs often contain specific regulatory motifs which can influence mRNA stability, localization, and translation. Please note that specific motifs might be described elsewhere, like those related to polyadenylation, which are listed in section "ESTs, Splicing, Polyadenylation" in main section "Expression".
ARED
(KFSH&RC, Riyadh, Saudi Arabia)
ARED, the AU-Rich Element Database, stores information on adenylate-uridylate rich elements (AREs) in human mRNAs. ARED is maintained at the King Faisal Specialist Hospital & Research Centre (KFSH&RC) in Riyadh, Saudi Arabia. ARED contains GenBank entries where the 3'UTR matches the ARE motif, a 13-bp pattern WWWUAUUUAUWW (W=A/U), which was computationally derived from a list of functionally labile ARE-containing mRNAs.
NOTE:
ARED suggests that ARE-mRNAs represent as much as 5-8% of human genes. However, ARED contains computationally predicted ARE-mRNAs, and there is no evidence of how many of them are actually regulated by this mechanism.
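
The kind of motif matching described above can be sketched with a regular expression. The example below is an illustration only and is not the ARED pipeline: it performs exact matching, expands the code W to "A or U", and uses the pattern string verbatim as printed above (note that the string as printed contains 12 symbols). The function names and the toy sequence are made up.

```python
import re

ARE_PATTERN = "WWWUAUUUAUWW"  # copied verbatim from the text above

# Translation of the symbols used in the pattern into regex fragments.
SYMBOL_TO_REGEX = {"A": "A", "C": "C", "G": "G", "U": "U", "W": "[AU]"}


def pattern_to_regex(pattern):
    """Convert a pattern such as 'WWWUAUUUAUWW' into a regex string."""
    return "".join(SYMBOL_TO_REGEX[symbol] for symbol in pattern)


def find_motif(sequence, pattern=ARE_PATTERN):
    """Return (start, matched_text) for every exact match, 0-based."""
    rna = sequence.upper().replace("T", "U")  # accept cDNA-style input
    # The lookahead lets overlapping occurrences be reported as well.
    regex = re.compile("(?=(" + pattern_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(rna)]


toy_utr = "GGCAUUUAUUUAUUAGGC"  # made-up sequence, not a real 3'UTR
print(find_motif(toy_utr))
```

Sequences copied from GenBank are usually written with T instead of U, which is why the sketch converts T to U before matching.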

AREs are known to be recognized by specific proteins and/or small regulatory RNAs which dramatically influence the stability of the mRNA. Most of them are negative regulators (like ZFP36) which promote mRNA decay, but there are also positive regulators (like HuR) which stabilize the target mRNA. Known examples of ARE-dependent regulation are the mRNAs of TNFalpha, PTGS2 (COX2), CSF2 (GMCSF), and IL3. Accordingly, several diseases, such as chronic inflammatory conditions, are known to be caused by stabilized ARE-mRNAs.

Query:
There are already 3 different versions of the ARED database. While v1 and v2 only support single queries (gene names, IDs, mRNA acc., RefSeq, UniGene, etc.), v3 also supports batch queries using e.g. a list of gene names from a microarray experiment. NOTE: The list has to be pasted in column format (one entry per line, as when copied from Excel), not as space-delimited text!
Note: When using the "Advanced search" option, you may also browse the (long) lists of ARE-mRNAs by selecting an ARE cluster and leaving all other fields empty.
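
If a gene list is only available as space- or comma-delimited text, it can be converted to the column format required for batch queries. The following is a small convenience sketch, not an ARED tool; dropping duplicate names is a design choice made here, and the function name is hypothetical.

```python
def to_column_format(gene_text):
    """Turn space/comma/newline-delimited gene names into one name per line."""
    names = gene_text.replace(",", " ").split()
    unique_names = []
    seen = set()
    for name in names:
        if name not in seen:  # keep the first occurrence, preserve order
            seen.add(name)
            unique_names.append(name)
    return "\n".join(unique_names)


# Gene symbols taken from the examples mentioned in this section.
print(to_column_format("PTGS2 CSF2, IL3 PTGS2"))
```

The resulting text can be pasted directly into a batch query field that expects one entry per line.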

Output:
If you submit a list of genes, ARED produces a table which presents all genes with predicted AREs in their 3'UTRs that are stored in the ARED database. Note that the actual sequences are NOT shown, only the "Class" and the "Cluster" of the respective AREs. ARE-mRNAs are clustered according to the length of the individual AREs: Cluster 1 mRNAs contain 5 continuous AREs, Cluster 2 mRNAs contain 4, and Cluster 5 mRNAs contain 1 ARE in a 13-bp ARE context.
Note: The result table may be saved as tab-delimited txt-file, which can easily be opened in Excel.
Note: The authors of ARED state that the database is available as a single GenBank flat file (i.e. nucleotide sequence with annotation) upon request.
UTResource
(ITB, Bari, Italy)

including:
UTRdb
UTRSite
UTRScan
PatSearch
UTResource is a portal for internet resources for sequence analysis of 5' and 3' untranslated regions of eukaryotic mRNAs.

It contains links to the following databases/programs:
UTRdb: A specialized, non-redundant collection of 5' and 3' UTR sequences from eukaryotic mRNAs.
UTRSite: A collection of functional sequence patterns located in 5' or 3' UTR sequences.
UTRdb and UTRSite allow SRS-based queries of UTR databases. This means that the user can do keyword searches like "AUUUA".

UTRScan: Looks for UTR functional elements by searching through user submitted query sequences for the patterns defined in the UTRsite collection.
PatSearch: The PatSearch program is a pattern matching tool that searches for a well-defined pattern in one or more given sequences or in database divisions (primary or specialized). The user can define a pattern and select a database to search against.
NOTE: The databases comprise not only UTR databases of diverse species but also many others like EST, GSS, HTGS, Swiss-Prot, TrEMBL, etc.
NOTE: In order to use the programs UTRScan and PatSearch, a registration form has to be completed. The programs can be accessed after e-mail confirmation.


Non-coding RNA
NOTE: There are several types of non-coding RNA, i.e. RNA which does not code for a protein sequence, such as ribosomal RNA, tRNA, and snoRNA. As short RNA molecules gain increasing attention as important regulators of gene expression, bioinformatics resources covering this topic are emerging as well. Small interfering RNA (siRNA, the mediator of RNA interference, RNAi) and microRNA (miRNA) have been identified as sequence-specific posttranscriptional regulators of expression. While siRNA is generated from dsRNA produced by viruses or activated transposons, miRNA is transcribed from miRNA genes located in the genome.
miRBase
(Sanger)
miRBase is the new home for microRNA data, incorporating the database and gene naming roles previously provided by the miRNA Registry, and including the new miRBase Target database.

miRBase contains 3 main sections:
1. miRBase Sequences contains all published miRNA sequences, genomic locations and associated annotation. Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR). Both hairpin and mature sequences are available for searching using BLAST and SSEARCH, and entries can also be retrieved by name, keyword, references and annotation. All sequence and annotation data are also available for download.
Note that the predicted stem-loop sequences in the database are not strictly precursor miRNAs (pre-miRNAs), but include the pre-miRNA and some flanking sequence from the presumed primary transcript.
2. miRBase Targets is a web resource developed by the Enright Lab at the Wellcome Trust Sanger Institute containing computationally predicted targets for microRNAs across many species. The miRNA sequences are obtained from the miRBase Sequence database and most genomic sequence from EnsEMBL. This resource aims to provide the most up-to-date and accurate predictions of miRNA targets and hence this resource will be updated regularly to incorporate new miRNAs or EnsEMBL sequences.
3. miRBase Registry provides a confidential service assigning official names for novel miRNA genes prior to publication of their discovery.

Typical miRBase accessions: refer to section miRBase IDs.
PicTar
(New York University)
PicTar is an algorithm for the identification of microRNA targets. This searchable website provides details (3' UTR alignments with predicted sites, links to various public databases, etc.) regarding microRNA target predictions in vertebrates and microRNA target predictions across seven Drosophila species. PicTar can be used EITHER for predicting the targets of a certain microRNA OR for predicting the microRNAs which may target a specific mRNA of interest.

Query:
The user may choose:
1. an organism, a dataset (note that evolutionary conservation is considered!), and a microRNA ID for which the potential targets shall be presented,
2. OR a gene ID for which the potential matching microRNAs shall be predicted.

Output:
1. A list of potential target genes ranked by a specific PicTar score, with links to RefSeq and to the custom view of UCSC Genome Browser, displaying the PicTar miRNA prediction sites.
2. A multiple species alignment of the cDNA of the chosen gene, highlighting the positions of individual predicted miRNA sites.
Rfam
(Sanger and Washington University)
Rfam is a joint project involving researchers based at the Wellcome Trust Sanger Institute and Washington University, St. Louis (which also provides an Rfam mirror site). Rfam is a large collection of multiple sequence alignments and covariance models covering many common non-coding RNA families. For each family in Rfam you can view and download multiple sequence alignments, read family annotation, examine species distribution of family members, and follow links to other databases.
Rfam provides:
- a keyword search, which accepts any keyword, like "miR-16".
- a sequence search facility to analyze a DNA query sequence to find Rfam family matches.
- In conjunction with the INFERNAL software suite, Rfam can be used to annotate sequences (including complete genomes) for homologues to known non-coding RNAs. Please read important information about using Rfam for genome annotation. Rfam provides pre-calculated lists of putative RNAs in over 100 complete genomes.

Rfam makes use of a large amount of available data, especially published multiple sequence alignments, and repackages these data in a single searchable and sustainable resource. Rfam makes every effort to credit individual sources on family pages, which are also listed here.

Typical Rfam accessions: refer to section Rfam IDs.
siDESIGN
(Dharmacon, Inc., Chicago)
siDESIGN is an siRNA design tool provided free of charge by Dharmacon, a company that commercializes siRNA technology. According to its developers, siDESIGN significantly improves the likelihood of identifying functional siRNA when compared to other publicly available design tools. The siDESIGN Center builds on early guidelines and adds eight additional criteria developed by Dharmacon scientists.

The siDESIGN Center offers the flexibility of defining specific target regions, adjusting certain design criteria, and selecting BLAST. Ranked lists of candidate siRNA sequences are provided along with siRNA sequences for all designs.
siRNA selector
(Wistar Inst., Philadelphia)
siRNA selector scans a target gene for candidate siRNA sequences that satisfy user-adjustable rules. Small interfering RNA (siRNA) guides sequence-specific degradation of the homologous mRNA, thus producing "knock-down" cells.
The program evaluates:
1. siRNA functionality by using empirical rules developed by Dharmacon, Amgen and University of Massachusetts.
2. siRNA specificity by running a BLAST search of each sequence against UniGene.
In addition, non-specific scrambled controls can be designed using the "scrambled" option.
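
The general idea of scanning a target for candidates that satisfy user-adjustable rules, and of deriving a scrambled control, can be sketched as follows. This is not the siRNA selector implementation: the only rule applied here is a GC-content window, and the 19-nt length and the GC thresholds are placeholder assumptions, not the empirical rules cited above. All function names are hypothetical.

```python
import random


def gc_fraction(seq):
    """Fraction of G and C nucleotides in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_sirnas(target, length=19, gc_min=0.30, gc_max=0.52):
    """Slide a window over the target and keep windows passing the GC rule.

    length, gc_min and gc_max are illustrative defaults (assumptions).
    Returns a list of (start, window) tuples with 0-based start positions.
    """
    rna = target.upper().replace("T", "U")
    hits = []
    for start in range(len(rna) - length + 1):
        window = rna[start:start + length]
        if gc_min <= gc_fraction(window) <= gc_max:
            hits.append((start, window))
    return hits


def scrambled_control(seq, seed=0):
    """Shuffle the nucleotides: same composition, different order."""
    rng = random.Random(seed)  # fixed seed makes the control reproducible
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


toy_target = "AUGC" * 10  # made-up target, not a real gene
for start, window in candidate_sirnas(toy_target)[:3]:
    print(start, window, scrambled_control(window))
```

A real design tool would apply further rules and, as described above, check each candidate and its scrambled control for specificity against a sequence database.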
TargetScan
(MIT)

TargetScan is a portal at MIT storing several datasets of predictions of microRNA targets, either targeting only the 3'-UTRs or also targeting the ORF regions. TargetScan can be used EITHER for predicting the targets of a certain microRNA OR for predicting the microRNAs which may target a specific mRNA of interest.

Query:
The user may choose:
1. a microRNA family (like "miR-15/16/195") in order to predict the targets of this family
2. OR enter a human EntrezGene ID (like 5743 for human PTGS2) for which the potential matching microRNAs shall be predicted. NOTE: When tested, a search using the gene name (PTGS2) was successful, whereas a search using the ID "5743" was not!

Output:
1. A list of potential target genes ranked by an EFDR (estimated false discovery rate) score, with links to the NCBI sequence database and to the UCSC Genome Browser. Note that this list contains quite detailed summaries of the individual genes' functions.
2. A tabular list of matching microRNA families to the mRNA of interest, with links to Rfam and to UCSC Genome Browser.