GI: Gauss Integrals - a tool for 3D protein structure description, comparison and classification

Description: GI reads through a directory and calculates generalized Gauss integrals of orders one, two, and three for every chain of each protein pdb-file it finds.
A newer version is available here.
Download: As of March 11, 2011, the GI.c software may be downloaded and used under the GNU General Public License version 3.
Compile: GI is compiled by entering >cc GI.c -lm -O3 -o GI
Run: To run GI, enter >GI /path/to/pdb_file_directory/ output.file error.file
Output: The columns of output.file are

pdb.file   chainID   #C-alphas   #C-alphas_missing   followed by 30 structural measures, for example:
1AA0.pdb 0 113 0 1.201e+00 1.940e+00 2.703e+00 1.592e+00 ...
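
The column layout above can be read back with a few lines of Python. This is only a sketch: it assumes the columns are separated by whitespace and that every column, including chainID, is a non-empty token, as in the example line. The names `ChainRecord`, `parse_output_line`, and `read_output_file` are hypothetical and not part of GI.

```python
from typing import List, NamedTuple

N_MEASURES = 30  # number of structural measures per line, as stated above


class ChainRecord(NamedTuple):
    """One line of output.file (hypothetical container, not part of GI)."""
    pdb_file: str
    chain_id: str
    n_calpha: int
    n_calpha_missing: int
    measures: List[float]


def parse_output_line(line: str) -> ChainRecord:
    """Split one whitespace-separated line into its 4 + 30 columns."""
    fields = line.split()
    if len(fields) != 4 + N_MEASURES:
        raise ValueError(
            f"expected {4 + N_MEASURES} columns, got {len(fields)}"
        )
    return ChainRecord(
        pdb_file=fields[0],
        chain_id=fields[1],
        n_calpha=int(fields[2]),
        n_calpha_missing=int(fields[3]),
        measures=[float(x) for x in fields[4:]],
    )


def read_output_file(path: str) -> List[ChainRecord]:
    """Read every non-blank line of an output file written by GI."""
    with open(path) as handle:
        return [parse_output_line(line) for line in handle if line.strip()]
```

A call such as `read_output_file("output.file")` would then give one record per chain, with the 30 measures available as `record.measures`.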

SGM: The Scaled Gauss (Pseudo) Metric is given by the usual (Euclidean) metric on the last 30 columns.
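
As an illustration, the distance between two chains can be computed from their 30 structural measures. The sketch below assumes that "the usual metric" means the Euclidean distance in 30 dimensions. The function name `sgm_distance` is hypothetical.

```python
import math
from typing import Sequence


def sgm_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of structural measures.

    Assumption: a and b are the last 30 columns of two lines of output.file.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Two chains with identical measures are at distance zero, and the distance grows as their measures differ.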


Note: We have not considered backbones in which more than 3 C-alpha atoms are missing. This is because GI connects the C-alpha atoms it finds, so big gaps in the backbone may give a "backbone" that is very different from the true backbone. To compute #C-alphas_missing, GI simply counts the C-alpha atoms and compares this count with the starting and ending residue numbers. For pdb-files with non-consecutive numbering, this may give strange results.
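
The counting rule in the note can be sketched in Python as follows. This is not GI's actual code, only an illustration of the description. It assumes the standard fixed-column PDB layout for ATOM records (atom name in columns 13-16, chain ID in column 22, residue number in columns 23-26), and it ignores alternate locations and insertion codes. The names `count_missing_calphas`, `MAX_MISSING`, and `is_considered` are hypothetical.

```python
from typing import Iterable, Tuple

MAX_MISSING = 3  # backbones with more missing C-alphas are not considered


def count_missing_calphas(pdb_lines: Iterable[str], chain_id: str) -> Tuple[int, int]:
    """Return (#C-alphas, #C-alphas_missing) for one chain.

    The missing count is the residue span implied by the first and last
    residue numbers minus the number of C-alpha atoms actually found.
    """
    residue_numbers = []
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        if line[12:16].strip() != "CA" or line[21] != chain_id:
            continue
        residue_numbers.append(int(line[22:26]))
    if not residue_numbers:
        return 0, 0
    expected = residue_numbers[-1] - residue_numbers[0] + 1
    return len(residue_numbers), expected - len(residue_numbers)


def is_considered(n_missing: int) -> bool:
    """Apply the cutoff from the note: at most MAX_MISSING missing C-alphas."""
    return n_missing <= MAX_MISSING
```

With this rule, a chain numbered 1, 2, 100, 101 is reported as having 97 missing C-alpha atoms even if no atom is actually absent, which is the kind of strange result that non-consecutive numbering produces.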

Citing the use of this resource: P. Røgen & B. Fain, Automatic classification of protein structure by using Gauss integrals, PNAS, 100(1), 119-124, 2003.

Data: The data used for the PNAS paper may be downloaded here.

Contacting the authors: Peter Røgen (Peter.Roegen@mat.dtu.dk) and Boris Fain (bfain@stanford.edu).

Acknowledgment: Peter Røgen was supported by Carlsbergfondet.
Bibliography: Gauss integrals applied to protein structure description.