Protein Sequence Comparative Analysis (PSCA)
Protein Sequence Information
Protein sequence information combines annotation from both JCSG and SWISS-PROT. Annotation pages from the SWISS-PROT and TrEMBL databases are accessed through SWALL (SPTR) on the EBI SRS server. Enzymatic and metabolic information for enzyme targets is available from KEGG.
A group of JCSG tools provides broad support for target annotation. The Data Acquisition Prioritization System (DASP) prioritizes crystallized proteins for data acquisition at the Joint Center for Structural Genomics Structure Determination Core. Functional & Structural Space (FSS) and Target PDB Monitor (TPM) track the functional and structural coverage of protein targets.
Homology searches of protein sequences are run with BLAST and PSI-BLAST from NCBI BLAST, using an E-value cut-off of 0.001. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997 Sep 1;25(17):3389-402.
Sequences are clustered with the CD-HI program developed by the Godzik Laboratory. Li W, Jaroszewski L, Godzik A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 2001;17:282-283.
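The E-value cut-off described above can be illustrated with a short sketch. It assumes the hits are available in the 12-column BLAST tabular format, which the text does not specify, and it treats the cut-off as inclusive, which is also an assumption. The sample rows and the names `filter_hits` and `SAMPLE` are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 0.001  # E-value threshold stated in the text

# Column names of the 12-column BLAST tabular format (assumed input format).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def filter_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return (query, subject, evalue) for hits with E-value <= cutoff."""
    reader = csv.DictReader(io.StringIO(tabular_text),
                            fieldnames=COLUMNS, delimiter="\t")
    hits = []
    for row in reader:
        evalue = float(row["evalue"])
        if evalue <= cutoff:  # inclusive comparison is an assumption
            hits.append((row["qseqid"], row["sseqid"], evalue))
    return hits

# Hypothetical sample data, not real search results.
SAMPLE = (
    "target1\thitA\t45.0\t200\t110\t0\t1\t200\t5\t204\t1e-30\t120\n"
    "target1\thitB\t28.0\t90\t65\t2\t10\t99\t3\t92\t0.5\t30.1\n"
    "target1\thitC\t33.0\t150\t100\t1\t1\t150\t1\t150\t0.001\t50.2\n"
)

kept = filter_hits(SAMPLE)
for query, subject, evalue in kept:
    print(query, subject, evalue)
```

Filtering after the search, rather than only inside it, keeps the threshold visible in one place if hits from several runs are merged.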
Domains and Families
Domain and family searches are run with the HMMER program against the Pfam database, using an E-value cut-off of 0.01. Bateman A, Birney E, Durbin R, Eddy SR, Howe KL, Sonnhammer ELL. The Pfam protein families database. Nucleic Acids Research 2000;28:263-266. Domain and family information is also collected from the InterPro database. Mulder NJ, et al. The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Research 2003;31:315-318.
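As a rough illustration of the Pfam cut-off, the sketch below keeps only domain hits at or below 0.01. It assumes the per-sequence table written by HMMER3 `hmmscan --tblout`, which may differ from the HMMER version and output format actually used here. The family names, accessions, and scores in the sample are hypothetical.

```python
PFAM_EVALUE_CUTOFF = 0.01  # E-value threshold stated in the text

def parse_tblout(text, cutoff=PFAM_EVALUE_CUTOFF):
    """Return (query, family, accession, evalue) for hits with E-value <= cutoff.

    Assumes whitespace-separated HMMER3 --tblout rows, where column 1 is the
    target (family) name, column 2 its accession, column 3 the query name,
    and column 5 the full-sequence E-value. Lines starting with '#' are comments.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        # The last column is a free-text description, so cap the split.
        fields = line.split(None, 18)
        family, accession, query = fields[0], fields[1], fields[2]
        evalue = float(fields[4])
        if evalue <= cutoff:
            hits.append((query, family, accession, evalue))
    return hits

# Hypothetical sample rows, not real Pfam results.
SAMPLE_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias ...
ExampleFam1  PF99991.1  target1  -  2.1e-45  153.2  0.0  2.9e-45  152.8  0.0  1.0  1  0  0  1  1  1  1  Example family one
ExampleFam2  PF99992.1  target1  -  0.3  10.1  0.1  0.5  9.5  0.1  1.2  1  0  0  1  1  1  1  Example family two
"""

domain_hits = parse_tblout(SAMPLE_TBLOUT)
for hit in domain_hits:
    print(hit)
```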
Secondary Structure Feature
Secondary structure is predicted with JNET, a neural network protein secondary structure prediction method. Cuff J. A and Barton G.J (1999) Application of enhanced multiple sequence alignment profiles to improve protein secondary structure prediction, Proteins 40:502-511.
Transmembrane helices are predicted with TMHMM, a method based on a hidden Markov model and developed by Anders Krogh and Erik Sonnhammer.
PDB Fold Similarity
PDB sequences and structure data are downloaded from the PDB FTP server. The BLAST search is run against pdbnr, a non-redundant PDB sequence database that contains one representative chain for each unique PDB sequence. Among all chains that share the same sequence, the representative is the chain with the best structural coverage.
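The selection of representative chains might look like the following sketch. The text does not define "structural coverage", so the sketch assumes it means the number of residues with observed coordinates; because all chains in a group share one sequence, comparing counts is equivalent to comparing fractions. The chain identifiers, field names, and `build_nonredundant` are hypothetical.

```python
from collections import defaultdict

def build_nonredundant(chains):
    """Map each unique sequence to the chain ID with the best coverage.

    Each chain is a dict with 'chain_id', 'sequence', and
    'observed_residues' (count of residues with coordinates; an assumed
    definition of structural coverage).
    """
    groups = defaultdict(list)
    for chain in chains:
        groups[chain["sequence"]].append(chain)

    representatives = {}
    for sequence, members in groups.items():
        # max() returns the first member on ties, so input order breaks ties.
        best = max(members, key=lambda c: c["observed_residues"])
        representatives[sequence] = best["chain_id"]
    return representatives

# Hypothetical chains, not real PDB entries.
CHAINS = [
    {"chain_id": "xxx1_A", "sequence": "MKTAYIAKQR", "observed_residues": 8},
    {"chain_id": "xxx2_B", "sequence": "MKTAYIAKQR", "observed_residues": 10},
    {"chain_id": "xxx3_A", "sequence": "GSHMLEDPVE", "observed_residues": 9},
]

pdbnr = build_nonredundant(CHAINS)
for sequence, chain_id in pdbnr.items():
    print(chain_id, sequence)
```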
Fold & Function Assignment System
The Fold & Function Assignment System (FFAS) is developed and maintained by the Godzik Laboratory. Jaroszewski L, Rychlewski L, Godzik A. Improving the quality of twilight-zone alignments. Protein Sci 2000 Aug;9(8):1487-96.
Entrez PubMed provides access to MEDLINE citations. References are collected for the sequence itself, for its function and structure annotation, and for its PSI-BLAST homologs in NR. Protein sequence information is collected from NCBI GenBank and from SWALL (SPTR) on the EBI SRS server. Functional and structural information is collected from EBI InterPro, Pfam, and PDB. A user interface supports the classification and selection of references.