Release

ProPhylER 1.0 is live now.

News

January 5 2010
The ProPhylER paper is now published in Genome Research

March 12 2010
Searching by name is now supported on the search page

March 12 2010
Searching with hg18 coordinates to evaluate coding SNPs is now supported

Contacts

prophyler [at] prophyler.org
arend [at] stanford.edu

Resource Links

Ensembl
Uniprot
PDB
WuBlast
Probcons
Semphy
Jmol
Java

Other Links

Sidow Lab
Stanford Pathology Dept
Stanford Genetics Dept
Stanford School of Medicine

Funded by

NIH/NHGRI

Searching ProPhylER

ProPhylER contains only eukaryotic sequences:

  1. Protein sequences from Uniprot
  2. Predicted proteins from the Ensembl genomes listed on the search page

You can search ProPhylER with an Amino Acid Sequence (by BLAST) or with a Database Identifier.

You need either a sufficiently long sequence snippet or an Ensembl, Uniprot, PDB, or ProPhylER ID for your protein. Searches by gene or protein name were originally not implemented; the news item of March 12 2010 reports that searching by name is now supported on the search page.
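As a rough illustration of telling these kinds of input apart, the sketch below classifies a query string before it is submitted. The patterns follow the publicly documented Ensembl, Uniprot, and PDB identifier formats; they are assumptions, not ProPhylER's own validation rules. The function name classify_query and the 30-residue cutoff for "sufficiently long" are hypothetical, and ProPhylER cluster IDs are left out because their format is not described here.

```python
import re

# Assumed patterns, based on the public ID formats of each database.
# They are NOT taken from ProPhylER itself.
ENSEMBL_RE = re.compile(r"^ENS[A-Z]{0,4}[GTP]\d{11}$")
UNIPROT_RE = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)
PDB_RE = re.compile(r"^[0-9][A-Za-z0-9]{3}$")

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MIN_SEQUENCE_LENGTH = 30  # hypothetical cutoff for "sufficiently long"


def classify_query(text):
    """Return 'ensembl', 'uniprot', 'pdb', 'sequence', or 'unknown'."""
    # Drop all whitespace so pasted, line-wrapped sequences still work.
    query = "".join(text.split()).upper()
    if ENSEMBL_RE.match(query):
        return "ensembl"
    if UNIPROT_RE.match(query):
        return "uniprot"
    if PDB_RE.match(query):
        return "pdb"
    if len(query) >= MIN_SEQUENCE_LENGTH and set(query) <= AMINO_ACIDS:
        return "sequence"
    return "unknown"
```

A query that comes back as "unknown" is most likely a sequence that is too short or an identifier from a database that is not supported.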

Searching by BLAST

The BLAST search is tuned to be fast and to find your exact protein or a very close match to it. It is not designed to give you matches to distantly related proteins.

If this is the first time you are using ProPhylER, or if you have used ProPhylER before but are now interested in a different protein, we recommend searches by BLAST because you don't have to dig for a database identifier.

Pros of BLAST Search:

  1. If you are used to working with a GenBank or RefSeq sequence or the like, the BLAST search is the safest and most effective way to find the ProPhylER cluster that contains your protein.
  2. If you are interested in a protein from a genome that is not among the Ensembl genomes we have used, but that is likely to have a very close homolog in ProPhylER, the BLAST search may find that homolog and save you the effort of identifying it by other means. An example would be a dog sequence, which may well match its human or mouse ortholog in ProPhylER.

Cons of BLAST Search:

  1. It is slower than an Identifier Search.
  2. In contrast to an Identifier Search, BLAST Searches are not guaranteed to yield a unique cluster. While unlikely, it is possible that your query sequence matches sequences in two or more clusters (for example, when you search with a Hox homeodomain). The Search Results page will give you all matching clusters, ranked by match score, and you have to decide which one is correct.
  3. Your query is likely to match more than one sequence in a cluster. The Results page will show a single cluster, but ProPhylER may pick a different reference sequence from the one you expected for displaying the data in the Interface. In that case, you need to choose a different reference sequence in the Selector pane of the Interface (described in the ProPhylER documentation).
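The last two points can be pictured with a small sketch: several hits may fall into the same cluster, and several clusters may be hit at once. The code below groups hits by cluster and ranks the clusters by their best match score. The input format, a list of (cluster_id, sequence_id, score) tuples, and the function name rank_clusters are hypothetical; this is not how ProPhylER's Results page is actually built.

```python
from collections import defaultdict


def rank_clusters(hits):
    """Group BLAST hits by cluster and rank clusters by best score.

    hits: iterable of (cluster_id, sequence_id, score) tuples
          (a hypothetical format chosen for this sketch).
    Returns a list of (cluster_id, best_score, best_sequence, n_hits),
    highest-scoring cluster first.
    """
    by_cluster = defaultdict(list)
    for cluster_id, sequence_id, score in hits:
        by_cluster[cluster_id].append((score, sequence_id))

    ranked = []
    for cluster_id, members in by_cluster.items():
        # The top hit within a cluster is the sequence closest to the query.
        best_score, best_sequence = max(members)
        ranked.append((cluster_id, best_score, best_sequence, len(members)))

    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked
```

Comparing the best-matching sequence of the top cluster with the reference sequence shown in the Interface is a quick way to see whether a different reference needs to be selected.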

Searching by Uniprot Accession or Ensembl Identifier

If you know the Uniprot accession code or the Ensembl gene or peptide ID, you can enter that in the appropriate field on the search page. The search page has examples of the identifiers ProPhylER supports.

Pros of ID Search:

  1. It is unambiguous (almost).
  2. It will specify the 'right' reference sequence for display in the Interface.
  3. It is fast.

Cons of ID Search:

  1. It is not completely unambiguous because of splice isoforms. When you enter an Ensembl ID of a gene or protein that has several isoforms due to alternative splicing, the search will succeed, but the sequence used in the alignment may differ from the one you expected. (There is unfortunately no way around this.) In fact, this is less a Con of searching by ID than a fundamental limitation of ProPhylER.
  2. Your search may fail even though a BLAST search with the same sequence would succeed. Why? Because databases sometimes update their identifiers, and ProPhylER inevitably lags behind in picking up those changes. Such updates are an as-yet unsolved challenge in ProPhylER that we hope to address in the near future.
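Because an ID search can fail on a stale identifier while a BLAST search with the same sequence still succeeds, a script that drives searches could try the ID first and fall back to the sequence. The sketch below shows only that control flow. The two callables, lookup_by_id and search_by_blast, are hypothetical stand-ins supplied by the caller; ProPhylER does not document a programmatic interface here.

```python
def find_cluster(identifier, sequence, lookup_by_id, search_by_blast):
    """Try the fast ID lookup first, then fall back to BLAST.

    lookup_by_id(identifier)  -> cluster or None      (hypothetical)
    search_by_blast(sequence) -> list of clusters,
                                 best match first     (hypothetical)
    Returns (cluster, method), where method is 'id', 'blast', or 'none'.
    """
    cluster = lookup_by_id(identifier)
    if cluster is not None:
        return cluster, "id"

    # The identifier may have been retired or renamed upstream,
    # so retry with the sequence itself if one is available.
    if sequence:
        matches = search_by_blast(sequence)
        if matches:
            return matches[0], "blast"

    return None, "none"
```

Returning the method alongside the cluster makes it clear when a result came from the fallback, in which case the reference sequence should be checked as described for BLAST searches.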

Searching by PDB ID

If you have a favorite crystal structure, you can enter the PDB accession code and ProPhylER will do some digging for you: it goes to PDB, gets the file, parses out the sequences, BLASTs them against the ProPhylER sequences, and finds the best matches. These matches are presented as links to the Crystal Painter as well as the Interface. Note that a structure may contain more than one distinct protein, so the Results page may contain one link to the Crystal Painter but several links to different Interface sessions.
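The "parses out the sequences" step can be sketched as follows, assuming the structure is in the classic fixed-column PDB text format, where SEQRES records list the residues of each chain. This is an illustration of why one structure can lead to several Interface links, not ProPhylER's actual parser; the function names are hypothetical, and non-standard residues (and nucleotides) are simply written as X.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_text):
    """Return {chain_id: one-letter sequence} from SEQRES records."""
    chains = {}
    for line in pdb_text.splitlines():
        if not line.startswith("SEQRES") or len(line) < 20:
            continue
        chain_id = line[11]           # column 12 holds the chain ID
        residues = line[19:].split()  # residue names start at column 20
        letters = "".join(THREE_TO_ONE.get(name, "X") for name in residues)
        chains[chain_id] = chains.get(chain_id, "") + letters
    return chains


def distinct_proteins(pdb_text):
    """Map each distinct sequence to the list of chains that share it."""
    groups = {}
    for chain_id, sequence in chain_sequences(pdb_text).items():
        groups.setdefault(sequence, []).append(chain_id)
    return groups
```

Each key returned by distinct_proteins would be one BLAST query, which is why a single Crystal Painter link can come with several Interface links.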

Searching by ProPhylER Cluster ID

A successful ProPhylER ID search displays only a link to an Interface session. ProPhylER will not look for PDB matches in this search mode (because a cluster contains many sequences and it would take too long to BLAST PDB with all of them).

This search mode is the most efficient, but you need to know the ProPhylER cluster ID from a previous session. Also note that ProPhylER will choose a default reference sequence (usually a human sequence).



Last updated 8/25/08