From Jacobson Lab Wiki


load is a top-level PLOP command used to load various input files. The syntax is

 load <pdb|native|replacement|sequence> inputfilename [options]

The pdb, native, sequence, and replacement keywords read different file formats or store the input file differently in memory.


load pdb

This is used to load a structure in RCSB Protein Data Bank format.

load native

load replacement

load sequence

PLOP can also build a peptide from its sequence alone, without a PDB file, using "load seq[uence] inputfilename". The input file lists the residues of the peptide, one per line, in all caps. Any valid residue name recognized by PLOP can be used. For example, to load "alanine dipeptide", the command might be

 load seq ala_dipep.res

where the file "ala_dipep.res" has the following lines:

 ACE
 ALA
 NMA

Note that ACE and NMA are "capping groups". The "load sequence" command just puts the peptide in an extended structure initially; it's up to you to optimize it to a more reasonable conformation.
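
The sequence file can also be generated from a script. The following is a minimal Python sketch, not part of PLOP: the helper name write_sequence_file is hypothetical, and the residue list ACE, ALA, NMA for alanine dipeptide is an assumption based on the capping groups named above.

```python
from pathlib import Path


def write_sequence_file(residues, path):
    """Write one upper-case residue name per line, the layout "load sequence" reads."""
    names = [name.strip().upper() for name in residues]
    if not names or any(not name for name in names):
        raise ValueError("every entry must be a non-empty residue name")
    Path(path).write_text("\n".join(names) + "\n")
    return names


# Example call (assumed residue list for alanine dipeptide):
#   write_sequence_file(["ACE", "ALA", "NMA"], "ala_dipep.res")
```

The resulting file would then be read with "load seq ala_dipep.res". The sketch does not check that the names are residues PLOP actually recognizes.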

Options

These options can be used with any of pdb, native, sequence, and replacement; however, not every option makes sense with every one of them.

There are several options that can be specified (default values are bolded when known/applicable):

   breaks                     - (yes|no) [range start end]
   build_tail                 -
   build                      - (yes/no) build missing atoms
   cell_buffer                - cell buffer distance, 20 is the default
   chain character            - load only specified chain
   cyclic [chain identifier]  - specify that a chain is cyclic
   exp[eriment]               - 
   het[atm] yes/no            - load HETATM groups?
   highest_occ                - yes => use the highest occupancy coordinates, no => use the first
   ions yes/no                - load ions?
   link atom1 atom2           - form a link between atom1 and atom2 (broken)
   model integer              - load specified model number
   mse2met yes/no             - convert selenomethionine to methionine
   mutat[ion]                 -
   oldtemp[late]              -
   opls versionnumber         -
   opt yes/no                 - 
   pka frompdb/null           - 
   planarity                  -
   same_opt[ions]             -
   same_stru[cture]           -
   sample_omega               - more extensive omega sampling
   seqres yes/no              - 
   sym xtal/bio[logical]/none - 
   temp[late]                 -
   uni                        -
   wat[er] yes/no             - load waters?

The structure file in PDB format must contain an END statement at the end.
The options "ions" and "wat[er]" specify whether or not monoatomic ions or waters in the PDB file are loaded.
Option "het[atm]" determines whether other HETATM groups (i.e., ligands) are loaded; for this to be successful, there must be template files available for the ligand groups.
If the PDB file contains more than one protein chain, the default is to load all of them. However, the "chain" option can be used to specify the loading of only a single chain. For PDB files with multiple MODELs (i.e., NMR or theoretical structures), the "model" option can specify which one to load (default is #1).
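
Before choosing values for "chain" and "model", it can help to see which chains and models a file contains. The following is a small Python sketch that is independent of PLOP; the function name chains_and_models is hypothetical, and it assumes the standard fixed-column PDB layout with the chain identifier in column 22.

```python
def chains_and_models(pdb_text):
    """Return (chain IDs, MODEL serial numbers) in order of first appearance."""
    chains, models = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            # The serial number follows the record name.
            models.append(int(line[6:].split()[0]))
        elif record in ("ATOM", "HETATM") and len(line) > 21:
            chain = line[21]  # column 22 holds the chain identifier
            if chain not in chains:
                chains.append(chain)
    return chains, models
```

A file without MODEL records returns an empty model list, in which case the "model" option is presumably not needed.
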
Note: Loading from SEQRES lines is currently (July 2012) extremely buggy and may result in highly erroneous atom placement.
PDB files contain the sequence of the protein(s) in two locations: in the SEQRES records and in the ATOM records. The ATOM records need not contain every residue in the protein if certain residues (often tails, but sometimes floppy loops) cannot be located in the electron density. Using "seqres yes" directs the program to load the protein sequence from the SEQRES records, and if a corresponding residue is not located in the ATOM lines, it is built in an arbitrary conformation (this can be fixed by the user later). Otherwise, residues not listed in the ATOM records are simply omitted.
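
To judge how many residues "seqres yes" would have to build, one can compare the two sequence sources directly. The following Python sketch is not part of PLOP; the name seqres_vs_atom_counts is hypothetical. It only counts residues per chain and does not align the sequences, and it ignores residues stored as HETATM (for example selenomethionine).

```python
def seqres_vs_atom_counts(pdb_text):
    """Per chain, return (residues listed in SEQRES, residues present in ATOM records)."""
    listed = {}
    observed = {}
    for line in pdb_text.splitlines():
        if line.startswith("SEQRES"):
            chain = line[11]  # column 12
            # Residue names occupy columns 20-70, separated by spaces.
            listed[chain] = listed.get(chain, 0) + len(line[19:70].split())
        elif line.startswith("ATOM"):
            chain = line[21]  # column 22
            residue_id = line[22:27]  # residue number plus insertion code
            observed.setdefault(chain, set()).add(residue_id)
        elif line.startswith("ENDMDL"):
            break  # only look at the first model
    return {chain: (n, len(observed.get(chain, ()))) for chain, n in listed.items()}
```

A chain whose two counts differ has residues that are listed in SEQRES but were not resolved in the structure.
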
Even if all residues appear in the ATOM records, not every atom of the molecule generally does. In x-ray crystal structures, hydrogen atoms are usually omitted since they cannot be located in the electron density except at very high resolution. Disordered side chains may also be omitted. The option "opt yes" specifies that all atoms of the system that are not found in the ATOM records be subjected to some level of initial optimization. This includes an attempt to rebuild any missing side chains in reasonable positions, optimization of the positions of polar, rotatable hydrogens (e.g., OH groups), and energy minimization of all atoms not found in the PDB file. If you are going to use a PDB file many times, it may be helpful to use "opt yes" once, save the results, and then read in the pre-prepared file with "opt no", to save time.
(note: minimization using "opt yes" in the presence of severe steric clashes can lead to bad structures)
Finally, PDB files always contain information about the crystal unit cell, and sometimes information about biological symmetry (dimers, tetramers, etc.). PLOP is designed to permit symmetric replication of the protein to mimic these types of symmetry. You specify that you want to apply symmetry when you load the protein, and then all subsequent commands will be carried out on the symmetrically replicated system. To reproduce the crystal packing environment, use the option "sym xtal". This should essentially always work. To replicate biological symmetry, use "sym biol". Unfortunately, this frequently will not work without modifying the PDB file. There is a mandated format for PDB files to report biological symmetry (in the REMARK section), but it is frequently ignored.
(Q: is sym broken in the current version?)
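
A quick way to see which symmetry information a file carries is to look for the relevant records. The following Python sketch is independent of PLOP; the name symmetry_records is hypothetical. It assumes that the unit cell is given in a CRYST1 record and that biological symmetry, when reported in the standard way, appears as BIOMT lines under REMARK 350. Finding the records does not guarantee that PLOP can use them.

```python
def symmetry_records(pdb_text):
    """Report whether unit-cell (CRYST1) and biological (REMARK 350 BIOMT) records exist."""
    has_cryst1 = False
    has_biomt = False
    for line in pdb_text.splitlines():
        if line.startswith("CRYST1"):
            has_cryst1 = True
        elif line.startswith("REMARK 350") and "BIOMT" in line:
            has_biomt = True
    return {"xtal": has_cryst1, "bio": has_biomt}
```

If "bio" comes back False, the file would probably need to be edited before "sym biol" has any chance of working.
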
Cyclic peptides are now supported. If the cyclic peptide is chain A, then use the option “cyclic A” to indicate to PLOP that it should link the first and last residues together.
Disulfide bonds are normally identified automatically by PLOP, but you can manually force linkages by using the “link” option, e.g., “link A:23:SG A:54:SG” to force a linkage between two Cys residues. This has also been used to, e.g., link a His residue to the iron on a heme group.
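
When deciding which linkages to force, it may help to list cysteine sulfur pairs that are close enough to be bonded. The following Python sketch is not PLOP's own detection: the name candidate_disulfide_links is hypothetical and the 2.5 angstrom cutoff is an assumption. It formats each pair in the "link" syntax shown above.

```python
import math


def candidate_disulfide_links(pdb_text, cutoff=2.5):
    """Return "link" option strings for CYS SG pairs within cutoff (angstroms)."""
    sulfurs = []
    for line in pdb_text.splitlines():
        is_sg = line[12:16].strip() == "SG" and line[17:20] == "CYS"
        if line.startswith("ATOM") and is_sg:
            label = "%s:%d:SG" % (line[21], int(line[22:26]))
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            sulfurs.append((label, xyz))
    links = []
    for i, (label_a, xyz_a) in enumerate(sulfurs):
        for label_b, xyz_b in sulfurs[i + 1:]:
            if math.dist(xyz_a, xyz_b) <= cutoff:  # math.dist needs Python 3.8+
                links.append("link %s %s" % (label_a, label_b))
    return links
```

The output is only a list of candidates to review by hand; other linkages, such as a His bound to a heme iron, are not covered.
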
Seleno-methionine is recognized as a valid residue type, but if you want to automatically change all the Se atoms to S, use “mse2met”.
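
For reference, the same kind of conversion can be done on the PDB file itself before loading. The following Python sketch is an assumption about what such a conversion involves, not a description of how PLOP implements "mse2met": it renames MSE residues to MET, turns their HETATM records into ATOM records, and renames the SE atom to SD. The name mse_to_met is hypothetical.

```python
def mse_to_met(pdb_text):
    """Rename MSE residues to MET and their selenium atom (SE) to sulfur (SD)."""
    out = []
    for line in pdb_text.splitlines():
        if line[:6] in ("ATOM  ", "HETATM") and line[17:20] == "MSE":
            # Standard residues are stored as ATOM records named MET.
            line = "ATOM  " + line[6:17] + "MET" + line[20:]
            if line[12:16].strip() == "SE":
                line = line[:12] + " SD " + line[16:]
                if len(line) >= 78:
                    # Element symbol is right-justified in columns 77-78.
                    line = line[:76] + " S" + line[78:]
        out.append(line)
    return "\n".join(out)
```

The coordinates are left untouched, so the Se-C bond lengths are kept for the new S atom until the structure is minimized.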