Coordinate superposition with the Kabsch algorithm.

fit.xyz(fixed, mobile,
        fixed.inds  = NULL,
        mobile.inds = NULL,
        verbose=FALSE,
        prefix= "", pdbext = "",
        outpath = "fitlsq", full.pdbs=FALSE, 
        ncore = 1, nseg.scale = 1, ...)

rot.lsq(xx, yy,
        xfit = rep(TRUE, length(xx)), yfit = xfit,
        verbose = FALSE)

Arguments

fixed

numeric vector of xyz coordinates.

mobile

numeric vector, numeric matrix, or an object with an xyz component containing one or more coordinate sets.

fixed.inds

a vector of indices that selects the elements of fixed upon which fitting should be based.

mobile.inds

a vector of indices that selects the elements of mobile upon which fitting should be based.

full.pdbs

logical, if TRUE “full” coordinate files (i.e. all atoms) are written to the location specified by outpath.

prefix

prefix to mobile$id to locate “full” input PDB files. Only required if full.pdbs is TRUE.

pdbext

the file name extension of the input PDB files.

outpath

character string specifing the output directory when full.pdbs is TRUE.

xx

numeric vector corresponding to the moving ‘subject’ coordinate set.

yy

numeric vector corresponding to the fixed ‘target’ coordinate set.

xfit

logical vector with the same length as xx, with TRUE elements corresponding to the subset of positions upon which fitting is to be performed.

yfit

logical vector with the same length as yy, with TRUE elements corresponding to the subset of positions upon which fitting is to be performed.

verbose

logical, if TRUE more details are printed.

...

other parameters for read.pdb.

ncore

number of CPU cores used to do the calculation. ncore>1 requires package ‘parallel’ installed.

nseg.scale

split input data into specified number of segments prior to running multiple core calculation.

Details

The function fit.xyz is a wrapper for the function rot.lsq, which performs the actual coordinate superposition. The function rot.lsq is an implementation of the Kabsch algorithm (Kabsch, 1978) and evaluates the optimal rotation matrix to minimize the RMSD between two structures.

Since the Kabsch algorithm assumes that the number of points are the same in the two input structures, care should be taken to ensure that consistent atom sets are selected with fixed.inds and mobile.inds.

Optionally, “full” PDB file superposition and output can be accomplished by setting
full.pdbs=TRUE. In that case, the input (mobile) passed to fit.xyz should be a list object obtained with the function read.fasta.pdb, since the components id, resno and xyz are required to establish correspondences. See the examples below.

In dealing with large vector and matrix, running on multiple cores, especially when ncore>>1, may ask for a large portion of system memory. To avoid the overuse of memory, input data is first split into segments (for xyz matrix, the splitting is along the row). The number of data segments is equal to nseg.scale*nseg.base, where nseg.base is an integer determined by the dimension of the data.

Value

Returns moved coordinates.

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695--2696.

Kabsch Acta Cryst (1978) A34, 827--828.

Author

Barry Grant with rot.lsq contributions from Leo Caves

See also

rmsd, read.pdb, read.fasta.pdb, read.dcd

Examples

# \donttest{ # PDB server connection required - testing excluded ##--- Read an alignment & Fit aligned structures aln <- read.fasta(system.file("examples/kif1a.fa",package="bio3d")) pdbs <- read.fasta.pdb(aln)
#> pdb/seq: 1 name: http://www.rcsb.org/pdb/files/1bg2.pdb #> pdb/seq: 2 name: http://www.rcsb.org/pdb/files/1i6i.pdb #> PDB has ALT records, taking A only, rm.alt=TRUE #> pdb/seq: 3 name: http://www.rcsb.org/pdb/files/1i5s.pdb #> PDB has ALT records, taking A only, rm.alt=TRUE #> pdb/seq: 4 name: http://www.rcsb.org/pdb/files/2ncd.pdb
gaps <- gap.inspect(pdbs$xyz) xyz <- fit.xyz( fixed = pdbs$xyz[1,], mobile = pdbs$xyz, fixed.inds = gaps$f.inds, mobile.inds = gaps$f.inds ) #rmsd( xyz[, gaps$f.inds] ) #rmsd( pdbs$xyz[, gaps$f.inds] ) ##-- Superpose again this time outputing PDBs xyz <- fit.xyz( fixed = pdbs$xyz[1,], mobile = pdbs, fixed.inds = gaps$f.inds, mobile.inds = gaps$f.inds, outpath = "rough_fit", full.pdbs = TRUE)
#> PDB has ALT records, taking A only, rm.alt=TRUE #> PDB has ALT records, taking A only, rm.alt=TRUE
##--- Fit two PDBs A <- read.pdb("1bg2")
#> Note: Accessing on-line PDB file
A.ind <- atom.select(A, resno=c(256:269), elety='CA') B <- read.pdb("2kin")
#> Note: Accessing on-line PDB file #> PDB has ALT records, taking A only, rm.alt=TRUE
B.ind <- atom.select(B, resno=c(257:270), elety='CA') xyz <- fit.xyz(fixed=A$xyz, mobile=B$xyz, fixed.inds=A.ind$xyz, mobile.inds=B.ind$xyz) # Write out moved PDB C <- B; C$xyz = xyz write.pdb(pdb=C, file = "moved.pdb") # }