| Structures Linkpages |
|
| Linkpage
1 - Molecular Modeling and Visualization (SWBIC) |
The SWBIC
provides an excellent
collection of resources concerning "Molecular Modeling and
Visualization". You will find descriptions and tutorials for
programs like Chime, RasMol, and Protein Explorer, as well as
programs integrated in web pages like Cn3D or Swiss-PdbViewer. |
| Structure Databases | |
| MMDB (NCBI) |
The NCBI Structure group
maintains MMDB,
a
database of macromolecular 3D structures, as well as tools
for their visualization and comparative analysis. MMDB, the Molecular
Modeling Database, contains experimentally determined biopolymer
structures obtained from the Protein Data Bank (PDB). MMDB is a subset
of three-dimensional structures obtained from the PDB, excluding
theoretical models. The goal of Entrez's 3D-structure database is to
make structure information easily accessible to molecular biologists.
For this purpose, Entrez provides: links between protein
sequence and structure databases, precomputed sequence and structure
neighbours, and structure and sequence/structure alignment
visualization.
1. Query MMDB:
1.1. The structure database may be queried directly, using specific fields such as author names, or text terms occurring anywhere in the structure description. Entry points for queries are the Search Bar at the top of all Structure Group WWW pages or the WWW-Entrez interface to the 3D-structure database.
1.2. Alternatively, you can use a 4-character PDB code or a numerical MMDB-Id to retrieve structure summary pages directly. Example: 1j46 (PDB); 17220 (MMDB)
2. MMDB entry:
2.1. "Structure Summary" page: An MMDB entry provides basic information such as description, PubMed reference, and taxonomy. In addition, several links are available: PDB entry, Structure Neighbours (VAST), protein domain data (CDD), and links to protein and nucleotide sequences in Entrez. Note that the links to Entrez sequences are especially useful and are handled somewhat better than at the RCSB PDB website.
2.2. VAST (Structure Neighbours): (see also VAST main section) VAST is an algorithm that compares 3-dimensional protein structures and thereby generates "structural neighbours" datasets. On average, there are over 600 structure neighbours for each 3D domain in MMDB. To help identify neighbours that provide useful annotation, Entrez's "VAST Summary" provides a series of controls for selecting and sorting structure neighbours.
2.3. Structure Visualization: Two options are available. Note that although RCSB PDB offers more types of viewers, Cn3D is a very powerful and easy-to-use application!
2.3.1. Rasmol: needs local installation
2.3.2. Cn3D: needs local installation
Please refer to section "Visualization Software".
3. Examples: MMDB can also be used very efficiently to retrieve structures of drugs in complex with their targets. Note: The following examples are used in section "Small Molecules Databases" for test purposes to compare different resources!
- All 3 test genes (PTGS2, TP53, SELE) are found in Entrez Structure, displaying various protein structures. Note: The term "SELE" itself retrieves no hits, so it is better to use the alternate gene symbol "ELAM".
- All 3 test drug names (Aspirin, Diclofenac, Celebrex) are found in Entrez Structure. Note: Celebrex is only found via its synonym "Celecoxib".
- All 3 test disease names (Atherosclerosis, Alzheimer, Inflammation) are found in Entrez Structure, displaying lists of protein structures related to the diseases.
Typical MMDB accessions: refer to section MMDB IDs. |
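The two direct-retrieval identifier types mentioned under 1.2 can be told apart with a small check before a query is submitted. The following is only a sketch: the function name `classify_query` is hypothetical, and the rules (a PDB code is 4 characters, starts with a digit and contains at least one letter; an MMDB-Id is any run of digits) are simplifying assumptions, not a description of how Entrez itself parses input.

```python
import re

# Assumption: 4 characters, first one a digit, the rest letters or digits.
_PDB_CODE = re.compile(r"[0-9][A-Za-z0-9]{3}")


def classify_query(term: str) -> str:
    """Return 'mmdb', 'pdb', or 'text' for a query term (illustrative only)."""
    term = term.strip()
    # Purely numeric input is treated as an MMDB-Id, e.g. 17220.
    if term.isascii() and term.isdigit():
        return "mmdb"
    # Otherwise a 4-character code starting with a digit is treated
    # as a PDB code, e.g. 1j46.
    if _PDB_CODE.fullmatch(term):
        return "pdb"
    # Everything else is handled as a free-text search term.
    return "text"


if __name__ == "__main__":
    for term in ("1j46", "17220", "Celecoxib"):
        print(term, classify_query(term))
```

A purely numeric 4-character string is ambiguous under these rules and is treated as an MMDB-Id here; a real client would need to decide how to resolve that case.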
| MSD - Macromolecular Structure Database (EBI) |
MSD, which was started in
1996, is the EBI Macromolecular Structure Database - the European
project for the collection, management and distribution of data about
macromolecular structures, derived in part from the Protein Data Bank (PDB).
MSD is one of the three member organizations that participate in the Worldwide Protein Data Bank (wwPDB),
a collaborative effort to provide a single, consistent PDB archive,
which is publicly available and provides easy-to-use data retrieval and
analysis. For this purpose, EBI performs "cleaning procedures" of the
source database PDB, to ensure data uniformity across the whole
archive. In general, MSD is extensively linked to other EBI databases
like InterPro, GO, and Swiss-Prot, together with links to SCOP, CATH,
Pfam, and PROSITE.
1. Search MSD:
1.1. MSDlite:
- Several ID types are supported as input: PDB, PubMed, SCOP, CATH, UniProt, EC-number, Pfam, InterPro, GO.
- Experiment type: X-ray, NMR, theoretical models, and more.
- Text Search
- Keyword Search
- Sequence: By pasting a plain sequence into the text area, you can run a FASTA search against the sequences of all structures in the PDB. The results are restricted to those sequences which closely match your input sequence. The sequence must be at least 20 characters long.
NOTE: The MSD Site Index lists many more Search Options for MSD data.
2. MSD Entry:
2.1. Summary: overview information such as source, expression system, and chains
2.2. Sequence: a very clear, user-friendly view of the sequences which are part of a structure. In addition, secondary structure assignments are displayed, derived from implementations of the programs DSSP and PROMOTIF.
2.3. Similarity:
- compare the PDB entry to the ENTIRE PDB using MSDfold
- view related PDB entries
- links to SCOP and CATH entries
2.4. Visualization: the available tools for structure visualization are:
- AstexViewer: Java program
- Jena image library: links to a Jmol view of the structure (Java program)
- Rasmol: needs local installation |
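Because the sequence search only accepts plain sequences of at least 20 characters, it can help to clean and check a pasted sequence locally first. This is a minimal sketch under stated assumptions: the function name `prepare_query_sequence` is hypothetical, "plain" is interpreted as letters only, and FASTA header lines (starting with ">") are simply dropped. It does not contact MSD.

```python
MIN_LENGTH = 20  # minimum sequence length stated for the MSDlite search


def prepare_query_sequence(pasted: str) -> str:
    """Strip headers and whitespace from a pasted sequence and validate it."""
    lines = [line.strip() for line in pasted.splitlines()]
    # Drop empty lines and FASTA header lines, then join the rest.
    joined = "".join(line for line in lines if line and not line.startswith(">"))
    # Remove any remaining internal whitespace and normalise the case.
    sequence = "".join(joined.split()).upper()
    if len(sequence) < MIN_LENGTH:
        raise ValueError(
            f"sequence has {len(sequence)} characters, need at least {MIN_LENGTH}"
        )
    if not sequence.isalpha():
        raise ValueError("sequence contains non-letter characters")
    return sequence


if __name__ == "__main__":
    print(prepare_query_sequence(">example\nmkt ayiakqr\nqisfvkshfs"))
```

Raising an error rather than padding or truncating keeps the check transparent: the user sees why a sequence would be rejected before submitting it.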
| PDB - Protein
Data Bank (RCSB) |
The PDB, Protein Data
Bank, is the single worldwide archive of structural data of
biological macromolecules. Thus, the PDB is the most comprehensive
place to look for structures of proteins, DNA, RNA, and
polysaccharides. PDB was established at Brookhaven National
Laboratories (BNL) in 1971 as an archive for biological macromolecular
crystal structures. Beginning in the 1980s the number of deposited
structures increased dramatically, peaking today with the large-scale
data submissions of the many structural genomics initiatives.
In 1998, the RCSB (Research
Collaboratory for Structural Bioinformatics) became responsible for the
PDB, with the aim of maintaining a stable resource that facilitates structural
data deposition and analysis. Since 2003, RCSB PDB has been one of the three
members of the Worldwide Protein Data
Bank (wwPDB) consortium, whose mission is to ensure that the PDB
archive remains an international resource with uniform data (please
refer to the wwPDB section).
1. Structural assignment techniques represented in PDB:
- X-ray crystal structure determination
- NMR (Nuclear Magnetic Resonance)
- Cryoelectron microscopy
- Theoretical Modeling
2. PDB data submission and processing: A key component of creating the PDB is data processing, the efficient capture and curation of data. Data processing consists of data deposition, annotation, and validation. Data are submitted via the AutoDep Input Tool (ADIT). ADIT is built on top of the mmCIF dictionary, an ontology of 1700 terms that define the macromolecular structure and the crystallographic experiment, and of a data processing program called MAXIT (MAcromolecular EXchange Input Tool).
3. PDB file content: PDB files contain the primary data: atomic coordinates, structure factors, and NMR restraints. Primary data also contain general information required for all deposited structures and information specific to the method of structure determination.
4. Search the RCSB PDB (selection):
4.1. Simple Search: Enter a keyword, PDB ID, or author name.
4.2. Advanced Search: Allows searches of all types: database fields, browsable ontologies, and text searches.
4.3. Sequence Search: Search using a sequence, or by similarity to the sequence of a given PDB structure.
4.4. Search Ligands: Search based on a ligand or ligand substructure.
4.5. Search Structural Genomics Targets: Links to TargetDB and a listing of Structural Genomics Target information.
5. Browse the RCSB PDB (selection):
5.1. Browse by Gene Ontology: The GO Annotation project has mapped PDB IDs and corresponding chain IDs to the GO terms. Three GO branches are available: Biological Process, Cellular Component, and Molecular Function.
5.2. Browse by Enzyme Classification: Here you can browse an EC name, view the number of associated PDB structures, and search for the specific associated structures.
5.3. Browse by Disease: Here you can browse by disease name, view the number of associated PDB structures, and search for the specific associated structures. Diseases may be searched using their names.
6. View PDB structures: There are several tools to view PDB files:
- KiNG Viewer, Jmol Viewer, WebMol Viewer, Protein Workshop: Java applets, no plugins required.
- Rasmol Viewer, Swiss-PDB Viewer: special plugins required, thus local installation needed!
Note: please also refer to section "Visualization Software"!
7. RCSB PDB Structural Genomics Information Portal: Structural genomics is a worldwide initiative aimed at quickly determining a large number of protein structures using X-ray crystallographic and NMR technologies. These structures are being determined in a high-throughput mode, as the study of protein structure is central to molecular biology and to the understanding of disease.
7.1. Structural genomics initiatives:
7.1.1. Worldwide Structural Genomics Initiatives: Worldwide contributing centers with links to their resources, as well as reports for each center including target lists, target status progress, targets in the PDB, and sequence redundancy analyses.
7.2. Targets:
7.2.1. TargetDB: Information on the progress of the production and solution of structures.
7.2.2. PepcDB: PepcDB extends the content of TargetDB with status history, stop conditions, reusable text protocols, and contact information collected from the PSI Centers.
7.3. Structures:
7.3.1. Functional Distributions: The distribution of functions is compared among SG structures, PDB structures, genomes, and homology models.
Typical PDB accessions: refer to section PDB IDs. |
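To make "atomic coordinates" in point 3 more concrete, the sketch below reads one coordinate record from a file in the legacy fixed-column PDB text format. The column positions follow the commonly documented ATOM/HETATM layout; the names `Atom` and `parse_atom_line` are hypothetical, and only a subset of the fields is extracted. Files in mmCIF format are not covered.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    b_factor: float  # temperature factor


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record using fixed column positions."""
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
        b_factor=float(line[60:66]),   # columns 61-66
    )


if __name__ == "__main__":
    # Build a made-up example record column by column to keep the spacing exact.
    example = (
        "ATOM  " + f"{1:>5}" + " " + f"{' N':<4}" + " " + "MET" + " " + "A"
        + f"{1:>4}" + "    "
        + f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}{1.00:6.2f}{9.67:6.2f}"
    )
    print(parse_atom_line(example))
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in this format may touch without any separating space.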
| wwPDB (RCSB, EBI, PDBj) |
The Worldwide Protein Data Bank (wwPDB)
consists of three member organizations that act as deposition,
data processing and distribution centers for PDB data. The founding
members are RCSB PDB
(USA), MSD-EBI
(Europe) and PDBj
(Japan). The mission of the wwPDB is to maintain a single Protein Data Bank Archive of macromolecular structural data that is freely and publicly available to the global community. This site provides information about services provided by the individual member organizations and about projects undertaken by the wwPDB. |
| Structure Alignment | |
| VAST
- Vector Alignment Search Tool (NCBI) |
Protein
structure
neighbors in Entrez are determined by direct comparison of
3-dimensional protein structures with the VAST
algorithm. Each
of the
more than 18,000 domains in MMDB is compared to every other one. From
the MMDB structure summary pages, retrieved via Entrez,
structure neighbors are available for protein chains and individual
structural domains. VAST Search is a service that allows searching for structural neighbors starting with a set of 3D-coordinates specified by the user. This service is meant to be used with newly determined protein structures which are not yet part of MMDB. Structure neighbors for proteins already in MMDB have been pre-computed and can simply be looked up from MMDB's structure summary pages ! |
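A quick calculation shows why the neighbour lists are pre-computed rather than generated on request. The sketch below counts the unordered pairs that arise when every domain is compared to every other one; it assumes each pair is compared once, and the function name `pairwise_comparisons` is hypothetical. Since the text states "more than 18,000" domains, the result is a lower bound.

```python
import math


def pairwise_comparisons(n_domains: int) -> int:
    """Number of unordered pairs among n_domains items, i.e. n * (n - 1) / 2."""
    return math.comb(n_domains, 2)


if __name__ == "__main__":
    print(pairwise_comparisons(18_000))  # 161991000
```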
| Visualization Software | |
| Chime (MDL) |
Chime is
a
plugin, developed by MDL, which is
necessary to view and manipulate molecules directly in a web
browser (it is also required when using programs like Protein Explorer).
Supported file types include PDB. |
| Cn3D (NCBI) |
Cn3D
is a helper
application
for your web browser (comparable with Chime for Protein Explorer)
that allows you to view 3-dimensional structures from NCBI's Entrez
retrieval service. Cn3D runs on Windows, MacOS, and Unix. Cn3D
simultaneously displays structure,
sequence, and alignment. Cn3D is also a very nice tool to visualize an alignment of a protein sequence (in FASTA format) "onto" an existing PDB structure entry! This allows the mapping of conserved residues onto the 3D structure, which might reveal functional sites and might show how the protein binds to ligands or other macromolecules. Thus, Cn3D is particularly useful in comparative analysis.
NOTE: Please also refer to FAQ STRUC1 for a detailed description of Cn3D!
NOTE: Many other programs and databases also use Cn3D, such as BIND, a database of protein-protein interactions. |
| Protein
Explorer (Massachusetts University) |
Protein
Explorer (PE),
provided by the University
of Massachusetts, is free
software for
visualizing the three-dimensional
structures of protein, DNA, and RNA macromolecules, and their
interactions
and binding of ligands, inhibitors, and drugs. It is arguably the
easiest-to-use software of its kind.
Note: PE is also available (and may be faster) at the San Diego mirror site.
PE supports nearly all of the RasMol and Chime commands and adds a user-friendly, menu-based interface, so that experts and novices alike can display PDB files. PE also covers many special functions, such as the multiple-sequence alignment tool. |
| RasMol (Massachusetts University) |
RasMol, provided by the University of Massachusetts, is the "mother" of freely available, interactive programs for viewing and manipulating molecules, either through an extensive set of RasMol commands or via pull-down menus. RasMol is much harder to use effectively and considerably less powerful than Protein Explorer! |
| Swiss-Pdb Viewer
(DeepView) and Swiss-Model (ExPASy) |
Swiss-PdbViewer
(DeepView) is an application that
provides
a user-friendly interface for analyzing several proteins
at the same time. The proteins can be superimposed in order to deduce structural
alignments and compare their active sites or any other relevant
parts. Amino acid mutations, H-bonds, angles and distances between
atoms are easy to obtain thanks to the intuitive graphic and menu
interface. Note: Swiss-Pdb Viewer can be downloaded
for PC, Mac, or Linux. |
| 3D Prediction | |
| Linkpage 1 - ExPASy (SIB) |
ExPASy (Expert Protein
Analysis System) is a
proteomics server provided by the SIB
(Swiss
Institute of Bioinformatics). Among many other links, it also provides
a very good list
of
links concerning protein tertiary structure
prediction. Please also have a look at the ExPASy main section. |
| CPHmodels (CBS, Denmark) |
The CPHmodels
World Wide Web server predicts protein structure using
comparative (homology) modelling. |
| DisEMBL
- protein disorder prediction (EMBL) |
DisEMBL is a
computational tool for prediction of disordered/unstructured
regions within a protein sequence. Although no clear definition
of disorder exists, the program uses 3 categories of disorder:
- Loops/Coils: Residues assigned as alpha-helix or beta-strand are considered ordered, and all other states are considered loops (also known as coils). Loops/coils are not necessarily disordered; however, protein disorder is only found within loops.
- Hot Loops: A subset of the above, namely those loops with a high degree of mobility, i.e. coils with high temperature factors.
- Missing coordinates in X-ray structures, as defined by REMARK-465 entries in PDB. Non-assigned electron densities most often reflect intrinsic disorder and have been used early on in disorder prediction.
DisEMBL is easy to query: enter either a valid SwissProt ID or AC, or a protein sequence. Predictions are shown according to each of the three definitions above. The predicted probabilities are shown as curves along the sequence, and scores should always be compared to the corresponding random expectation value (dotted lines).
NOTE: Avoiding potentially disordered segments in protein expression constructs can increase expression, foldability, and stability of the expressed protein. DisEMBL is thus useful for target selection and the design of constructs as needed for many biochemical studies, particularly structural biology and structural genomics projects.
NOTE: Disordered regions in proteins often contain short linear peptide motifs (e.g. SH3-ligands and targeting signals) that are important for protein function. Linear peptide sites are catalogued by ELM (also refer to the ELM chapter).
NOTE: Although the GlobPlot server also predicts protein disorder, the two methods complement each other as they offer different approaches/features. |
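The first two disorder categories can be illustrated on a string of per-residue secondary-structure codes. This is only a sketch of the definitions given above, not DisEMBL's actual predictor (which works from sequence alone). The function names are hypothetical, "H" and "E" are assumed to mark alpha-helix and beta-strand as in DSSP-style output, and the temperature-factor threshold is an arbitrary illustrative value.

```python
from typing import List, Sequence, Tuple

# Assumption: 'H' = alpha-helix, 'E' = beta-strand; every other code is a loop.
ORDERED_STATES = frozenset("HE")


def loop_mask(states: str) -> List[bool]:
    """True for each residue that is NOT helix or strand (loop/coil)."""
    return [state not in ORDERED_STATES for state in states]


def hot_loop_mask(states: str, b_factors: Sequence[float],
                  threshold: float) -> List[bool]:
    """True for loop residues whose temperature factor exceeds the threshold."""
    if len(states) != len(b_factors):
        raise ValueError("states and b_factors must have the same length")
    return [is_loop and b > threshold
            for is_loop, b in zip(loop_mask(states), b_factors)]


def segments(mask: Sequence[bool]) -> List[Tuple[int, int]]:
    """Collapse a boolean mask into 1-based, inclusive (start, end) runs."""
    runs: List[Tuple[int, int]] = []
    start = None
    for position, flag in enumerate(mask, start=1):
        if flag and start is None:
            start = position
        elif not flag and start is not None:
            runs.append((start, position - 1))
            start = None
    if start is not None:
        runs.append((start, len(mask)))
    return runs


if __name__ == "__main__":
    states = "--HHHH-TT-EEEE--"
    b_factors = [60, 60, 10, 10, 10, 10, 20, 20, 20, 20, 10, 10, 10, 10, 70, 20]
    print(segments(loop_mask(states)))                          # loops
    print(segments(hot_loop_mask(states, b_factors, 50.0)))     # hot loops
```

By construction every hot-loop segment lies inside a loop segment, which mirrors the statement that hot loops are a subset of loops/coils.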
| GlobPlot - protein disorder
prediction (EMBL) |
GlobPlot
is a computational tool that allows the user to plot the tendency
within the query protein for order/globularity and disorder.
Definition: Protein disorder can be described as the lack of regular secondary structure and a high degree of flexibility in the polypeptide chain. Ordered regions are often termed globular and typically contain regular secondary structures packed into a compact globule.
GlobPlot is easy to query: enter either a valid SwissProt ID or AC, or a protein sequence.
NOTE: Disordered regions are of growing interest, as reflected by the increasing number of known IUPs (intrinsically unstructured/disordered proteins), such as Tau, Bcl-2, and prions. IUPs contain unfolded regions in the native state. |