read.fasta.pdb.Rd
Read aligned PDB structures and store their C-alpha atom data, including xyz coordinates, residue numbers, residue type and B-factors.
read.fasta.pdb(aln, prefix = "", pdbext = "", fix.ali = FALSE, pdblist=NULL, ncore = 1, nseg.scale = 1, progress = NULL, ...)
aln | an alignment data structure obtained with
|
---|---|
prefix | prefix to aln$id to locate PDB files. |
pdbext | the file name extention of the PDB files. |
fix.ali | logical, if TRUE check consistence between |
pdblist | an optional list of |
ncore | number of CPU cores used to do the calculation.
|
nseg.scale | split input data into specified number of segments
prior to running multiple core calculation. See |
progress | progress bar for use with shiny web app. |
... | other parameters for |
The input aln
, produced with read.fasta
, must
have identifers (i.e. sequence names) that match the PDB file
names. For example the sequence corresponding to the structure
“1bg2.pdb” should have the identifer ‘1bg2’. See
examples below.
Sequence miss-matches will generate errors. Thus, care should be taken to ensure that the sequences in the alignment match the sequences in their associated PDB files.
Returns a list of class "pdbs"
with the following five
components:
numeric matrix of aligned C-alpha coordinates.
character matrix of aligned residue numbers.
numeric matrix of aligned B-factor values.
character matrix of aligned chain identifiers.
character vector of PDB sequence/structure names.
character matrix of aligned sequences.
character matrix of aligned 3-letter residue names.
character matrix of aligned helix and strand secondary structure elements as defined in each PDB file.
the matched call.
Grant, B.J. et al. (2006) Bioinformatics 22, 2695--2696.
Barry Grant
The sequence character ‘X’ is useful for masking unusual or unknown residues, as it can match any other residue type.
read.fasta
, read.pdb
,
core.find
, fit.xyz
,
read.all
, pymol.pdbs
# \donttest{ # Redundant testing excluded # Read sequence alignment file <- system.file("examples/kif1a.fa",package="bio3d") aln <- read.fasta(file) # Read aligned PDBs pdbs <- read.fasta.pdb(aln)#> pdb/seq: 1 name: http://www.rcsb.org/pdb/files/1bg2.pdb #> pdb/seq: 2 name: http://www.rcsb.org/pdb/files/1i6i.pdb #> PDB has ALT records, taking A only, rm.alt=TRUE #> pdb/seq: 3 name: http://www.rcsb.org/pdb/files/1i5s.pdb #> PDB has ALT records, taking A only, rm.alt=TRUE #> pdb/seq: 4 name: http://www.rcsb.org/pdb/files/2ncd.pdb#> [1] "1bg2.pdb" "1i6i.pdb" "1i5s.pdb" "2ncd.pdb"# Alignment positions 335 to 339 pdbs$ali[,335:339]#> [,1] [,2] [,3] [,4] [,5] #> http://www.rcsb.org/pdb/files/1bg2.pdb "L" "L" "F" "G" "Q" #> http://www.rcsb.org/pdb/files/1i6i.pdb "L" "R" "Y" "A" "D" #> http://www.rcsb.org/pdb/files/1i5s.pdb "L" "R" "Y" "A" "D" #> http://www.rcsb.org/pdb/files/2ncd.pdb "L" "R" "F" "A" "A"pdbs$resid[,335:339]#> [,1] [,2] [,3] [,4] [,5] #> http://www.rcsb.org/pdb/files/1bg2.pdb "LEU" "LEU" "PHE" "GLY" "GLN" #> http://www.rcsb.org/pdb/files/1i6i.pdb "LEU" "ARG" "TYR" "ALA" "ASP" #> http://www.rcsb.org/pdb/files/1i5s.pdb "LEU" "ARG" "TYR" "ALA" "ASP" #> http://www.rcsb.org/pdb/files/2ncd.pdb "LEU" "ARG" "PHE" "ALA" "ALA"pdbs$resno[,335:339]#> [,1] [,2] [,3] [,4] [,5] #> http://www.rcsb.org/pdb/files/1bg2.pdb 316 317 318 319 320 #> http://www.rcsb.org/pdb/files/1i6i.pdb 345 346 347 348 349 #> http://www.rcsb.org/pdb/files/1i5s.pdb 345 346 347 348 349 #> http://www.rcsb.org/pdb/files/2ncd.pdb 661 662 663 664 665pdbs$b[,335:339]#> [,1] [,2] [,3] [,4] [,5] #> http://www.rcsb.org/pdb/files/1bg2.pdb 14.99 15.46 9.61 12.16 26.17 #> http://www.rcsb.org/pdb/files/1i6i.pdb 19.27 22.65 23.66 20.76 26.02 #> http://www.rcsb.org/pdb/files/1i5s.pdb 16.08 19.46 16.91 20.48 26.71 #> http://www.rcsb.org/pdb/files/2ncd.pdb 14.01 22.77 31.48 44.88 54.12# Alignment C-alpha coordinates for these positions pdbs$xyz[, atom2xyz(335:339)]#> [,1] [,2] [,3] [,4] [,5] #> http://www.rcsb.org/pdb/files/1bg2.pdb 18.880 -4.358 46.825 19.801 -7.246 #> http://www.rcsb.org/pdb/files/1i6i.pdb 4.206 34.360 4.539 3.439 38.005 #> http://www.rcsb.org/pdb/files/1i5s.pdb 16.417 -33.136 4.337 17.452 -36.750 #> http://www.rcsb.org/pdb/files/2ncd.pdb 0.185 37.134 46.308 -0.375 34.180 #> [,6] [,7] [,8] [,9] [,10] #> http://www.rcsb.org/pdb/files/1bg2.pdb 44.546 21.449 -4.913 42.020 23.509 #> http://www.rcsb.org/pdb/files/1i6i.pdb 5.403 1.775 37.124 8.670 4.660 #> http://www.rcsb.org/pdb/files/1i5s.pdb 5.131 19.143 -35.570 8.292 15.967 #> http://www.rcsb.org/pdb/files/2ncd.pdb 48.642 -2.386 36.408 50.962 -4.710 #> [,11] [,12] [,13] [,14] [,15] #> http://www.rcsb.org/pdb/files/1bg2.pdb -3.354 44.785 24.545 -6.830 45.885 #> http://www.rcsb.org/pdb/files/1i6i.pdb 34.826 9.548 7.057 37.666 8.763 #> http://www.rcsb.org/pdb/files/1i5s.pdb -33.755 9.354 13.829 -36.723 8.388 #> http://www.rcsb.org/pdb/files/2ncd.pdb 36.941 48.048 -4.498 33.173 47.625# See 'fit.xyz()' function for actual coordinate superposition # e.g. fit to first structure # xyz <- fit.xyz(pdbs$xyz[1,], pdbs) # xyz[, atom2xyz(335:339)] # }