Functional & Structural Space (FSS)
About FSS

Functional & Structural Space (FSS) displays the gap between protein function and structure. The functional space of protein sequences is represented by families and domains. The structural space of protein sequences is represented by PDB folds. The gap between function and structure is calculated from the Pfam family and domain database, the PDB biological macromolecular structure database, and the JCSG tracking and annotation database.

Methodology

The hmmpfam program from the HMMER package is used to search for Pfam families and domains within PDB protein sequences. Pfam HMM scores are used to decide whether a protein sequence has a significant hit to a Pfam model. A simple color system represents the Pfam HMM score system (green: trusted cutoff, yellow: gathering threshold, red: noise cutoff, no: no hit). A hit is considered very significant if its bit score is better than the gathering threshold.

Structural Coverage

For each Pfam family and domain, all PDB sequences are scanned to check whether the family is covered by PDB structures. A family is considered covered if at least one PDB sequence has a bit score better than the gathering threshold of that family. However, it is possible that only a fragment of a PDB sequence has been solved in the structure, and residues may also be missing from the PDB structure file. The percentage of structural coverage (%covp) reflects the difference between the real sequence submitted to PDB and the atom sequence extracted from the PDB coordinate data. When %covp is low, the sequence range that matches a Pfam model may have no corresponding structure in the PDB structure file.

covp (%) = (number of aligned atom residues - gaps) / (length of real sequence) * 100%

Target Coverage

In target analysis, a protein sequence that hits a Pfam model not covered by any PDB structure is a good target for filling the gap between structure and function.
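The color system, the coverage rule and the %covp formula described above can be sketched in a few lines of Python. This is only an illustration, not the FSS implementation: the names `PfamCutoffs`, `hit_color`, `is_covered` and `covp` are hypothetical, the interval boundaries between colors are an assumed reading of the legend, and "better than the gathering threshold" is taken to mean "greater than or equal to".

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PfamCutoffs:
    """Per-model bit-score thresholds (hypothetical container)."""
    trusted: float    # trusted cutoff
    gathering: float  # gathering threshold
    noise: float      # noise cutoff


def hit_color(bitscore: Optional[float], cutoffs: PfamCutoffs) -> str:
    """Map a bit score to the color legend.

    Assumption: each color covers the interval from its own threshold
    up to the next higher one; scores below the noise cutoff, or a
    missing score, are reported as "no".
    """
    if bitscore is None:
        return "no"
    if bitscore >= cutoffs.trusted:
        return "green"
    if bitscore >= cutoffs.gathering:
        return "yellow"
    if bitscore >= cutoffs.noise:
        return "red"
    return "no"


def is_covered(pdb_bitscores: Iterable[float], cutoffs: PfamCutoffs) -> bool:
    """A family is covered if at least one PDB sequence reaches the
    gathering threshold (">=" is an assumed reading of "better than")."""
    return any(score >= cutoffs.gathering for score in pdb_bitscores)


def covp(aligned_atom_residues: int, gaps: int, real_sequence_length: int) -> float:
    """covp (%) = (aligned atom residues - gaps) / (real sequence length) * 100."""
    if real_sequence_length <= 0:
        raise ValueError("real_sequence_length must be positive")
    return (aligned_atom_residues - gaps) / real_sequence_length * 100.0
```

For example, with made-up thresholds `PfamCutoffs(trusted=25.0, gathering=20.0, noise=19.5)`, a bit score of 22.0 maps to "yellow", and a PDB entry with 180 aligned atom residues, 5 gaps and a real sequence of 200 residues gives `covp(180, 5, 200) == 87.5`.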
How To Use FSS

An FSS user interface is provided to retrieve Pfam families and domains. You can retrieve Pfam families by filters such as PDB or JCSG target coverage, or search for a model by its accession or name. The operators AND (two words separated by white space) and OR (two words separated by ',') are supported in the Keywords and Model name/acc fields. From the model list, you can click the colored balls to view sequence information such as seed information, structural coverage and target coverage for each model.

Acknowledgement

The PDB sequences and structure data are downloaded from the PDB FTP server. Domain and family searches are performed using the HMMER program and the Pfam database with an expectation value (E-value) cutoff of 0.01.

The Pfam protein families database. A. Bateman, E. Birney, R. Durbin, S.R. Eddy, K.L. Howe, and E.L.L. Sonnhammer. Nucleic Acids Research, 28:263-266, 2000.

Protein sequence information contains annotation from both JCSG and SWISS-PROT. The SWISS-PROT annotation page is accessed from the SWALL database.

If you have any comments or questions, please contact Jie Ouyang at the Joint Center for Structural Genomics (JCSG) Bioinformatics Core.