The program: Sieve searches the given directory for files ending in ".pdb". For each such file, it reads through its output file (which is never overwritten, only appended to) to check whether an entry for that protein already exists. If so, it moves on to the next file. If not, it attempts to compute the measures for the new protein and, if it succeeds, appends a line to the output file.
The output file is opened only for reading and writing, never during a computation. Once a line has been appended to the output file, the output stream is flushed (any buffered but unwritten data is written). The program can therefore be aborted and restarted without losing more than the computation in progress (i.e. a single protein). It also means that one can first run the program on a set of proteins without any perturbation of the atomic coordinates (i.e. with no sixth argument). It will compute the measures for those it can, and produce no output line for those that cause numerical problems. One can then run it again with a small perturbation to treat the remainder.
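
The skip, compute, append and flush cycle described above can be sketched as follows. This is a minimal illustration in Python, not Sieve's own source code: the names `already_done`, `process_directory` and `compute_measures` are hypothetical, and the sketch assumes that an existing entry is recognised by the file name in the first column of an output line.

```python
import os

def already_done(output_path, pdb_name):
    """Return True if some output line already starts with pdb_name.

    Assumption: an entry is identified by the file name in its first field.
    """
    if not os.path.exists(output_path):
        return False
    with open(output_path, "r") as out:
        for line in out:
            fields = line.split()
            if fields and fields[0] == pdb_name:
                return True
    return False

def process_directory(directory, output_path, compute_measures):
    """Process every ".pdb" file that has no entry yet; return the names appended.

    compute_measures(path) is a hypothetical callback that returns a list of
    output fields, or None if the computation hit numerical problems.
    """
    appended = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".pdb"):
            continue
        if already_done(output_path, name):
            continue  # entry exists, move on to the next file
        # The output file is closed while the computation runs.
        result = compute_measures(os.path.join(directory, name))
        if result is None:
            continue  # no output line for proteins that failed
        with open(output_path, "a") as out:  # append, never overwrite
            out.write(" ".join([name] + [str(x) for x in result]) + "\n")
            out.flush()  # push buffered data out right after the line is written
        appended.append(name)
    return appended
```

Calling `process_directory` a second time with a different `compute_measures` (for instance one that perturbs the coordinates slightly) treats only the files that produced no line in the first pass, which mirrors the two-pass use described above.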

Each line of the output file contains pdb.file chainID #C-alphas_missing #C-alphas, followed by 29 structural measures ordered as in Table 3 of our paper (cited below), for example:
1cd1C2.pdb C 0 95 -2.2006067934 23.21.....

Note: We have not considered backbones in which more than 3 C-alpha atoms are missing. This is because Sieve connects the C-alpha atoms it finds, so large gaps in the backbone may give a "backbone" that is very different from the true one. To compute the number #C-alphas_missing, Sieve simply counts the C-alpha atoms and compares this count with the starting and ending residue numbers. For pdb-files with non-consecutive numbering, this may give strange results.
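
The counting rule in the note can be illustrated with a short Python sketch. It is an assumption about the arithmetic, not Sieve's actual code: the hypothetical function `count_missing_calphas` takes the expected number of residues to be (last residue number - first residue number + 1) and subtracts the number of C-alpha atoms found. It relies on the fixed-column layout of PDB ATOM records (atom name in columns 13-16, chain ID in column 22, residue number in columns 23-26) and ignores alternate locations and insertion codes.

```python
def count_missing_calphas(pdb_lines, chain_id):
    """Return (missing, found) for one chain, or None if no C-alpha is found.

    Assumption: missing = (last residue number - first residue number + 1) - found.
    """
    residues = []
    for line in pdb_lines:
        if len(line) < 26 or not line.startswith("ATOM"):
            continue
        # Columns 13-16 hold the atom name, column 22 the chain identifier.
        if line[12:16].strip() == "CA" and line[21] == chain_id:
            residues.append(int(line[22:26]))  # columns 23-26: residue number
    if not residues:
        return None
    found = len(residues)
    expected = residues[-1] - residues[0] + 1
    return expected - found, found
```

With residues numbered 1, 2, 3, 5, 6 the sketch reports one missing C-alpha. With numbering that jumps from 2 to 100 it reports 97 missing, although no atom need be absent; this is the kind of strange result that non-consecutive numbering produces.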

Citing the use of this resource: P. Røgen & R. Sinclair, Computing a new Family of Shape Descriptors for Protein Structures, J. Chem. Inf. Comput. Sci. 43, 1740-1747, 2003.

Contacting the authors:
Peter Røgen (Peter.Roegen@mat.dtu.dk) and
Robert Sinclair (R.Sinclair@ms.unimelb.edu.au).

Acknowledgment: Peter Røgen was supported by Carlsbergfondet.

Bibliography
P. Røgen & R. Sinclair, Computing a new Family of Shape Descriptors for Protein Structures, J. Chem. Inf. Comput. Sci. 43, 1740-1747, 2003.

Sieve - a tool for 3D protein structure description, comparison and classification