## :: pfat.py ::
## Calvin Chen (cvchen@ucsd.edu)
## Joint Center for Structural Genomics
## @ San Diego Supercomputer Center, UC San Diego

____About_____

P-FAT is a tool that helps in the refinement of crystallographic protein structures by detecting discrepancies between the protein sequence and its partly refined structure. Using BLAST (bl2seq), P-FAT aligns a sequence provided in fasta format with a polypeptide chain from a .pdb file. Usually the fasta-formatted sequence is the complete sequence of a chain in the .pdb file, while the .pdb chain is only partially built.

The primary alignment is displayed in a convenient "clickable" format, allowing easy correction of misplaced residues and detection of gaps in the partially refined protein structure. The alignment is formatted so that the entire fasta sequence is shown, and so that gaps in the .pdb chain and mismatches are clearly denoted.

Optionally, the user can automatically re-number the .pdb chain to be consistent with the full sequence provided in fasta format. P-FAT also has an option to place correctly numbered alanines in the gaps between already built fragments of the protein structure; these can then be manually placed according to electron density. This option is convenient for users of the "O" refinement program. Each of these options produces a separate .pdb file.

--------------------
In Short, this tool:
--------------------
1. Aligns a peptide chain in the .pdb file with a fasta sequence (with bl2seq).
    * Displays only the most significant alignment (the one with the lowest E-value).
    * Re-numbers the .pdb (Sbjct) sequence in the alignment based on the residue numbering in the .pdb file.
    * Makes the alignment "clickable".
    * Denotes mismatches with red residues in the alignment.
    * Denotes gaps in the numbering of the .pdb chain with bold underlined residues in the alignment.
    * Displays unaligned N- and C-terminal fragments of the fasta sequence, denoted with bold italic blue Query/Sbjct labels.
2. Has two optional functions to create modified versions of the .pdb file:
    1. Re-number residues of the selected .pdb chain so that each has the position number of the FASTA residue it aligned with.
    2. Bridge gaps in the .pdb chain by filling each gap with alanine residues (sterically, the alanines are in a straight line between the residues at each end of the gap).
    * If both options are selected, two separate .pdb files are generated.

______Setup_____

- The shebang line needs to contain the correct path to your local python interpreter.
- The path to your local BLAST (the folder with bl2seq in it) needs to be set for the variable 'blastPath'.

_______Use______

Read the _____About_____ and ______Setup_____ sections first.

Run like this:

    python pfat.py

A .pdb file and a fasta file need to be provided. If either file is not in the same directory as pfat.py, then the path (preferably absolute) to the file[s] should be part of the filename specification.

Select a chain from the .pdb file to align with the fasta sequence.

The output of pfat.py is in HTML format, so specify the name of the output .html file, with a path if you want the file somewhere other than the directory containing pfat.py.

You can select either or both of the two optional functions for pfat.py to perform. Each of them produces a newly modified version of the .pdb file you originally gave to pfat.py. If optional functions are selected, provide an output filename for each, and specify the path to these files if you want the output .pdb files to be in a different directory than pfat.py.
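
For readers who want to reproduce the alignment step by hand, the following is a minimal sketch of how a bl2seq command line might be assembled from a BLAST folder such as the one stored in 'blastPath'. It is not taken from pfat.py. The function name `bl2seq_command` and all file names are hypothetical, and it assumes the chain sequence has already been written to its own fasta file. The flags (-p program, -i first sequence, -j second sequence, -o output file) are those of the legacy NCBI bl2seq program; check them against your installed version.

```python
import os


def bl2seq_command(blast_path, fasta_file, chain_fasta_file, out_file):
    """Build (but do not run) a bl2seq protein-vs-protein command.

    blast_path       -- folder containing the bl2seq executable
    fasta_file       -- full sequence, used as the Query (-i)
    chain_fasta_file -- sequence of the .pdb chain, used as the Sbjct (-j);
                        assumed to have been written out beforehand
    out_file         -- where bl2seq should write its report (-o)
    """
    return [
        os.path.join(blast_path, "bl2seq"),
        "-p", "blastp",
        "-i", fasta_file,
        "-j", chain_fasta_file,
        "-o", out_file,
    ]


# Hypothetical paths, for illustration only.
cmd = bl2seq_command("/opt/blast/bin", "full.fasta", "chainA.fasta", "aln.txt")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems when a path contains spaces.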
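
The re-numbering option described in the About section can be pictured with a short sketch. It is an illustration only, not the code used by pfat.py. The function name `renumber_chain` and the `mapping` dictionary (old residue number to FASTA position, as would be derived from the alignment) are hypothetical. It assumes standard fixed-column PDB ATOM/HETATM records, with the chain ID in column 22 and the residue number in columns 23-26, and it ignores insertion codes.

```python
def renumber_chain(pdb_lines, chain_id, mapping):
    """Return a copy of pdb_lines with one chain re-numbered.

    mapping -- dict: old residue number -> new (FASTA-based) number.
               Residues missing from the mapping keep their number.
    """
    out = []
    for line in pdb_lines:
        is_coord = line.startswith(("ATOM", "HETATM"))
        # Column 22 (index 21) holds the chain ID.
        if is_coord and len(line) >= 26 and line[21] == chain_id:
            old = int(line[22:26])        # columns 23-26: residue number
            new = mapping.get(old, old)
            line = line[:22] + "%4d" % new + line[26:]
        out.append(line)
    return out


# Hypothetical one-atom example: residue 5 of chain A aligned to position 12.
example = "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
    1, " CA", "GLY", "A", 5, 1.0, 2.0, 3.0)
print(renumber_chain([example], "A", {5: 12})[0])
```

Only the four residue-number columns are rewritten, so coordinates and all other fields are left exactly as they were.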
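
The gap-bridging option places alanines in a straight line between the residues at each end of a gap. A minimal sketch of that geometry is given below, under stated assumptions: it is not the pfat.py implementation, the function names `bridge_gap` and `ala_ca_record` are hypothetical, the placed points are evenly spaced (the README does not specify the spacing), and only a CA atom is written per alanine, whereas a full residue has more atoms.

```python
def bridge_gap(start_num, start_xyz, end_num, end_xyz):
    """Return [(residue_number, (x, y, z)), ...] for the residues missing
    between start_num and end_num, evenly spaced on the straight line
    joining the two flanking positions."""
    n_missing = end_num - start_num - 1
    placed = []
    for k in range(1, n_missing + 1):
        t = k / (n_missing + 1)   # fraction of the way along the line
        xyz = tuple(a + t * (b - a) for a, b in zip(start_xyz, end_xyz))
        placed.append((start_num + k, xyz))
    return placed


def ala_ca_record(serial, chain_id, res_num, xyz):
    """Format one alanine CA atom as a fixed-column PDB ATOM record."""
    return "ATOM  %5d  CA  ALA %s%4d    %8.3f%8.3f%8.3f" % (
        (serial, chain_id, res_num) + tuple(xyz))


# Hypothetical gap: residues 11-13 are missing between 10 and 14.
for i, (num, xyz) in enumerate(
        bridge_gap(10, (0.0, 0.0, 0.0), 14, (4.0, 8.0, 12.0)), start=1):
    print(ala_ca_record(i, "A", num, xyz))
```

The resulting residues carry the correct numbers for the gap, so they can be moved into electron density by hand without further re-numbering.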