Inspect a pdb object for alternative coordinates, chain breaks, bad residue numbering, non-standard or unknown amino acids, and similar problems. Return a 'clean' pdb object with fixed residue numbering and, optionally, relabeled chain IDs, corrected amino acid names, and water, ligand, or hydrogen atoms removed. All changes are recorded in a log stored in the returned object.

clean.pdb(pdb, consecutive = TRUE, force.renumber = FALSE,
  fix.chain = FALSE, fix.aa = FALSE, rm.wat = FALSE, rm.lig = FALSE,
  rm.h = FALSE, verbose = FALSE)

Arguments

pdb

an object of class pdb as obtained from function read.pdb.

consecutive

logical, if TRUE renumbering will result in consecutive residue numbers spanning all chains. Otherwise new residue numbers will begin at 1 for each chain.

force.renumber

logical, if TRUE atom and residue records are renumbered even if no 'insert' code is found in the pdb object.

fix.chain

logical, if TRUE chains are relabeled based on chain breaks detected.

fix.aa

logical, if TRUE non-standard amino acid names are converted into equivalent standard names.

rm.wat

logical, if TRUE water atoms are removed.

rm.lig

logical, if TRUE ligand atoms are removed.

rm.h

logical, if TRUE hydrogen atoms are removed.

verbose

logical, if TRUE details of the conversion process are printed.
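The removal and renumbering flags can be combined in a single call. The following is a minimal sketch rather than an official example: it assumes the bio3d package is installed and that the example structure `examples/1hel.pdb` ships with it (an assumption; substitute any local PDB file). The names `pdb_file`, `raw`, and `cleaned` are illustrative only.

```r
# Sketch: strip water, ligand, and hydrogen atoms and force renumbering.
# Assumes bio3d is installed and bundles examples/1hel.pdb (assumption).
if (requireNamespace("bio3d", quietly = TRUE)) {
  library(bio3d)
  pdb_file <- system.file("examples/1hel.pdb", package = "bio3d")

  if (nzchar(pdb_file)) {
    raw <- read.pdb(pdb_file)

    cleaned <- clean.pdb(raw,
                         consecutive = FALSE,    # restart numbering at 1 for each chain
                         force.renumber = TRUE,  # renumber even without insert codes
                         rm.wat = TRUE,          # drop water atoms
                         rm.lig = TRUE,          # drop ligand atoms
                         rm.h = TRUE)            # drop hydrogen atoms

    # Compare atom counts before and after cleaning.
    cat("atoms before:", nrow(raw$atom), " after:", nrow(cleaned$atom), "\n")
  }
}
```

Leaving every flag at its default (`FALSE`) keeps water, ligand, and hydrogen atoms in place, so the removal steps are strictly opt-in.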

Value

a 'pdb' object with an additional $log component storing all the processing messages.
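As an illustration of the returned value, the sketch below retrieves the `$log` component after a cleaning run. It assumes bio3d is installed and that `examples/1hel.pdb` is bundled with it (an assumption); `logged` is a hypothetical variable name. Because the internal layout of `$log` is not described here, the sketch inspects it generically with `str()` instead of relying on particular column names.

```r
# Sketch: inspect the processing log attached by clean.pdb().
# Assumes bio3d is installed and bundles examples/1hel.pdb (assumption).
if (requireNamespace("bio3d", quietly = TRUE)) {
  library(bio3d)
  pdb_file <- system.file("examples/1hel.pdb", package = "bio3d")

  if (nzchar(pdb_file)) {
    logged <- clean.pdb(read.pdb(pdb_file), rm.wat = TRUE)

    # The layout of $log is not specified here, so inspect it generically.
    str(logged$log)
  }
}
```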

Details

This function is called for its effects.

Author

Xin-Qiu Yao & Barry Grant

See also

read.pdb

Examples

# \donttest{
# PDB server connection required - testing excluded
pdb <- read.pdb("1a7l")
#> Note: Accessing on-line PDB file
#> Warning: /var/folders/xf/qznxnpf91vb1wm4xwgnbt0xr0000gn/T//Rtmp4WslmZ/1a7l.pdb exists. Skipping download
clean.pdb(pdb)
#>
#>  Call:  clean.pdb(pdb = pdb)
#>
#>    Total Models#: 1
#>      Total Atoms#: 8750,  XYZs#: 26250  Chains#: 6  (values: A B C D E F)
#>
#>      Protein Atoms#: 8634  (residues/Calpha atoms#: 1114)
#>      Nucleic acid Atoms#: 0  (residues/phosphate atoms#: 0)
#>
#>      Non-protein/nucleic Atoms#: 116  (residues: 54)
#>      Non-protein/nucleic resid values: [ GLC (6), HOH (48) ]
#>
#>    Protein sequence:
#>       EGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHPDKLEEKFPQVAATGDGPDIIFWA
#>       HDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDAVRYNGKLIAYPIAVEALSLIYNKDLLP
#>       NPPKTWEEIPALDKELKAKGKSALMFNLQEPYFTWPLIAADGGYAFKYENGKYDIKDVGV
#>       DNAGAKAGLTFLVDLIKNKHMNADTDYSIAEAAFNKGETAMTING...<cut>...KDGS
#>
#> + attr: atom, xyz, seqres, helix, sheet,
#>         calpha, remark, call, log
# }