schemarecomb.pdb_structure
Parsing and manipulation of Protein Data Bank structure files.
This module provides the definition of schemarecomb.PDBStructure and
the accessory classes AminoAcid and Atom that represent the
eponymous entities within a PDB structure.
PDBStructures contain a list of AminoAcid objects, which act as containers for Atom objects that are read from “ATOM” lines in a PDB structure file.
Parsing and modifying protein structure data is necessary for recombinant library design because SCHEMA energy calculations require the distances between amino acids within the protein.
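As a rough illustration of the kind of distance such calculations draw on, the sketch below computes the smallest atom-to-atom distance between two residues. The helper `min_residue_distance` is hypothetical and is not part of schemarecomb. The coordinates are copied from the ALA 15 and LEU 16 ATOM lines of the 1GNX excerpt shown later in this section. How SCHEMA itself turns distances into contacts is not covered here.

```python
import math


def min_residue_distance(coords_a, coords_b):
    """Smallest Euclidean distance between any atom pair of two residues.

    Hypothetical helper for illustration only; each argument is a list of
    (x, y, z) tuples in angstroms.
    """
    return min(math.dist(a, b) for a in coords_a for b in coords_b)


# Backbone atoms taken from the 1GNX excerpt in this section.
ala_15 = [
    (-1.871, 18.610, 11.107),  # CA
    (-2.021, 18.795, 12.611),  # C
]
leu_16 = [
    (-1.064, 19.496, 13.206),  # N
    (-1.061, 19.738, 14.642),  # CA
]

closest = min_residue_distance(ala_15, leu_16)
ca_to_ca = math.dist(ala_15[0], leu_16[1])
```

For these two adjacent residues the closest pair is the carbonyl C of alanine and the backbone N of leucine, which are joined by the peptide bond.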
These structures are obtained from the Protein Data Bank at https://www.rcsb.org and loaded as Python objects. Nearly all information is discarded except for ATOM lines, which specify data about atoms within the structure.
The most confusing part of PDB structure manipulation is the renumbering of atoms to match the SCHEMA-RASPP parent alignment. Relieving user confusion about this process is a primary goal of this module.
Note that atom/residue indexing in PDB files begins at 1, while Python’s indexing starts at 0. For consistency with the rest of the package, these classes index residue numbers starting from 0. As a result, reading and writing PDB files must convert between the two indexing schemes. For example, the “ALA” lines in the PDB file below indicate that the 15th amino acid in the structure is alanine. When read with this module, this alanine is labeled with an index of 14. This is consistent with sequence indexing: if pdb_seq is the Python string representing the amino acid sequence of the structure, this residue is pdb_seq[14].
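The conversion between the two indexing schemes can be sketched as follows. The helper names `pdb_to_python_index` and `python_to_pdb_index` are assumptions made for illustration and are not taken from the schemarecomb API. The sequence string is a placeholder: only positions 14 to 16 (alanine, leucine, threonine) come from the 1GNX excerpt, and the leading `X` characters stand in for residues not shown here.

```python
def pdb_to_python_index(res_seq: int) -> int:
    """Convert a 1-based PDB residue number to a 0-based Python index."""
    return res_seq - 1


def python_to_pdb_index(index: int) -> int:
    """Convert a 0-based Python index back to a 1-based PDB residue number."""
    return index + 1


# Placeholder sequence: 14 unknown residues, then A, L, T from the excerpt.
pdb_seq = "X" * 14 + "ALT"

ala_index = pdb_to_python_index(15)   # residue 15 in the PDB file
ala_letter = pdb_seq[ala_index]       # one-letter code at index 14
```

Reading applies the first function and writing applies the second, so a residue that is read and then written back keeps its original PDB number.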
PDB files generally have this structure (example structure 1GNX):
...<other structure data>...
ATOM 1 N ALA A 15 -1.611 17.176 10.792 1.00 36.46 N
ATOM 2 CA ALA A 15 -1.871 18.610 11.107 1.00 36.85 C
ATOM 3 C ALA A 15 -2.021 18.795 12.611 1.00 36.41 C
ATOM 4 O ALA A 15 -2.983 18.321 13.215 1.00 38.36 O
ATOM 5 CB ALA A 15 -3.131 19.081 10.392 1.00 35.10 C
ATOM 6 N LEU A 16 -1.064 19.496 13.206 1.00 34.22 N
ATOM 7 CA LEU A 16 -1.061 19.738 14.642 1.00 29.97 C
ATOM 8 C LEU A 16 -1.711 21.073 14.992 1.00 30.29 C
ATOM 9 O LEU A 16 -1.462 22.089 14.341 1.00 30.34 O
ATOM 10 CB LEU A 16 0.380 19.716 15.152 1.00 26.33 C
ATOM 11 CG LEU A 16 1.228 18.548 14.639 1.00 24.12 C
ATOM 12 CD1 LEU A 16 2.681 18.761 15.026 1.00 22.75 C
ATOM 13 CD2 LEU A 16 0.698 17.230 15.195 1.00 23.37 C
ATOM 14 N THR A 17 -2.541 21.066 16.028 1.00 30.68 N
ATOM 15 CA THR A 17 -3.217 22.278 16.472 1.00 29.15 C
ATOM 16 C THR A 17 -2.576 22.784 17.756 1.00 28.03 C
ATOM 17 O THR A 17 -2.337 22.012 18.683 1.00 28.04 O
ATOM 18 CB THR A 17 -4.716 22.014 16.733 1.00 30.86 C
ATOM 19 OG1 THR A 17 -5.357 21.666 15.501 1.00 32.64 O
ATOM 20 CG2 THR A 17 -5.385 23.246 17.319 1.00 30.50 C
...<other atoms>...
...<other structure data>...
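A minimal sketch of reading and writing such lines is given below. It assumes the standard fixed-column layout of ATOM records in the PDB format, where slicing by column is more robust than splitting on whitespace. `AtomRecord`, `format_atom_line`, and `parse_atom_line` are hypothetical names for illustration; they are not the package's `Atom` class or its parsing functions. The sketch keeps only ATOM lines and stores the residue number as a 0-based index.

```python
from typing import NamedTuple, Optional


class AtomRecord(NamedTuple):
    """Hypothetical container; not the package's Atom class."""
    serial: int
    name: str
    res_name: str
    chain: str
    res_index: int  # 0-based, i.e. PDB residue number minus 1
    x: float
    y: float
    z: float


def format_atom_line(serial, name, res_name, chain, res_seq, x, y, z,
                     occupancy, temp_factor, element):
    """Lay out one ATOM record in fixed PDB columns (res_seq is 1-based)."""
    # Names of one-letter elements conventionally start in column 14.
    name_field = name if len(name) == 4 else f" {name:<3}"
    return (
        f"ATOM  {serial:>5} {name_field} {res_name:>3} {chain}{res_seq:>4}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{temp_factor:6.2f}"
        f"          {element:>2}"
    )


def parse_atom_line(line: str) -> Optional[AtomRecord]:
    """Parse one line by column position; return None for non-ATOM lines."""
    if not line.startswith("ATOM"):
        return None
    return AtomRecord(
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21],
        res_index=int(line[22:26]) - 1,  # convert to 0-based indexing
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
    )


# Two atoms from the excerpt above, plus a non-ATOM line that is discarded.
lines = [
    "REMARK placeholder line standing in for other structure data",
    format_atom_line(1, "N", "ALA", "A", 15, -1.611, 17.176, 10.792,
                     1.00, 36.46, "N"),
    format_atom_line(12, "CD1", "LEU", "A", 16, 2.681, 18.761, 15.026,
                     1.00, 22.75, "C"),
]
atoms = [rec for rec in map(parse_atom_line, lines) if rec is not None]
```

Writing a structure back out would reverse the index conversion by passing `res_index + 1` as `res_seq` to the formatter.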
Classes
AminoAcid | Amino acid within a PDB structure.
Atom | Atom within a PDB structure.