schemarecomb.PDBStructure

class schemarecomb.PDBStructure(amino_acids, unrenumbered_amino_acids=None, renumbering_seq=None)

Structure from the Protein Data Bank. Used in downstream calculations.

This class is a Python representation of a PDB file, e.g. 1GNX.pdb. PDB files hold protein structure data and can be downloaded from https://www.rcsb.org. The download may be done automatically with the schemarecomb.ParentSequences.get_PDB() method, or manually. In the latter case, the PDBStructure object must also be created manually, e.g. using the from_pdb_file() class method.

See pdb_structure for lower-level information about this class, including the PDB file format and the helper classes AminoAcid and Atom.

Parameters
  • amino_acids (list[AminoAcid]) – Amino acids in the PDB structure.

  • unrenumbered_amino_acids (Optional[list[AminoAcid]]) – The amino acids not included in renumbering. None if PDBStructure is not renumbered. Must be None if renumbering_seq is None.

  • renumbering_seq (Optional[str]) – Sequence used to renumber structure. None if PDBStructure is not renumbered. Must be None if unrenumbered_amino_acids is None.

Attributes
  • amino_acids (list[AminoAcid]) – Residues in the PDB structure. There is no guarantee that the AminoAcid objects are ordered by index.

  • unrenumbered_amino_acids (list[AminoAcid]) – Original amino_acids not included in the renumbering. Present if and only if structure is renumbered.

  • renumbering_seq (str) – Sequence that was used to renumber structure. Present if and only if structure is renumbered.

  • seq (str) – Amino acid sequence of structure, including gaps based on the indices of the amino_acids. Note that this attribute will change if renumbering occurs.

  • contacts (list[tuple[int, int]]) – Index pairs of contacting residues. Two residues are in contact if the shortest distance between them is less than 4.5 angstroms.
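
The contact definition can be illustrated with a small standalone sketch. It does not use schemarecomb's AminoAcid or Atom classes: residues are modeled as a hypothetical dict mapping a residue index to a list of (x, y, z) atom coordinates, and "shortest distance" is assumed to mean the minimum distance over all atom pairs of the two residues. The sketch applies no other filtering (for example of sequence neighbors), which the library may or may not do.

```python
from itertools import combinations
from math import dist

CONTACT_CUTOFF = 4.5  # angstroms, as stated in the attribute description


def find_contacts(residues, cutoff=CONTACT_CUTOFF):
    """Return index pairs whose closest atoms are less than `cutoff` apart.

    `residues` is a hypothetical stand-in: {residue index: [(x, y, z), ...]}.
    """
    contacts = []
    # Visit every unordered pair of residues once, in index order.
    for (i, atoms_i), (j, atoms_j) in combinations(sorted(residues.items()), 2):
        # Assumed meaning of "shortest distance": minimum over all atom pairs.
        shortest = min(dist(a, b) for a in atoms_i for b in atoms_j)
        if shortest < cutoff:
            contacts.append((i, j))
    return contacts


# Toy coordinates, not taken from any real structure.
toy = {
    0: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    1: [(5.0, 0.0, 0.0)],
    2: [(20.0, 0.0, 0.0)],
}
toy_contacts = find_contacts(toy)
```

Here only residues 0 and 1 fall within the cutoff, because the closest atoms of that pair are 3.5 angstroms apart.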

Raises
  • ValueError – During initialization, if the input amino_acids have non-unique indices, unrenumbered_amino_acids is an empty list, renumbering_seq does not match amino_acids, or exactly one of unrenumbered_amino_acids and renumbering_seq is None.

  • AttributeError – If PDBStructure is not renumbered and unrenumbered_amino_acids or renumbering_seq is accessed.
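
The initialization checks listed under ValueError can be sketched as a standalone function. This is an illustration of the documented rules, not the library's code: check_init_args and is_valid are hypothetical names, residue indices are passed as a plain list, and the "renumbering_seq does not match amino_acids" check is left out because its exact matching rule is not described here.

```python
def check_init_args(indices, unrenumbered_amino_acids, renumbering_seq):
    """Hypothetical stand-in for the documented argument checks."""
    # Every residue index must appear only once.
    if len(set(indices)) != len(indices):
        raise ValueError('amino_acids have non-unique indices')
    # An empty list is rejected; use None for "not renumbered".
    if unrenumbered_amino_acids is not None and len(unrenumbered_amino_acids) == 0:
        raise ValueError('unrenumbered_amino_acids is an empty list')
    # The two renumbering arguments must both be None or both be set.
    if (unrenumbered_amino_acids is None) != (renumbering_seq is None):
        raise ValueError(
            'unrenumbered_amino_acids and renumbering_seq must both be None '
            'or both be provided'
        )


def is_valid(indices, unrenumbered_amino_acids, renumbering_seq):
    """Return True if the arguments pass the sketched checks."""
    try:
        check_init_args(indices, unrenumbered_amino_acids, renumbering_seq)
    except ValueError:
        return False
    return True
```

Passing None for both renumbering arguments describes a structure that is not renumbered. Passing only one of them fails the last check.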

derenumber()

Revert any changes made by the renumber method, restoring the original numbering.

Raises

AttributeError – If PDBStructure is not renumbered.

Return type

None
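
Because derenumber() raises AttributeError on a structure that is not renumbered, calling code may want to guard the call. A minimal sketch follows. It assumes only the documented behavior, and derenumber_if_needed is a hypothetical helper name, not part of schemarecomb.

```python
def derenumber_if_needed(structure):
    """Call structure.derenumber(); return False if it was not renumbered.

    `structure` is expected to be a PDBStructure. The helper relies only on
    the documented AttributeError raised for non-renumbered structures.
    """
    try:
        structure.derenumber()
    except AttributeError:
        # Documented signal that there is nothing to revert.
        return False
    return True
```

One caveat of this design: the except clause also hides an AttributeError caused by passing an object with no derenumber method, so it fits best where the argument type is already known.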

classmethod from_json(in_json)

Construct from JSON string.

Parameters

in_json (str) – JSON string representing a PDBStructure instance. Usually generated by the to_json method.

Return type

_PDBStructure

Returns

PDBStructure constructed from JSON string.

Example:

>>> from schemarecomb import PDBStructure
>>> from schemarecomb.pdb_structure import AminoAcid
>>>
>>> # Get a PDBStructure from a PDB file.
>>> pdb_fn = 'tests/fixtures/bgl3_full/1GNX.pdb'
>>> pdb1 = PDBStructure.from_pdb_file(pdb_fn)
>>>
>>> # Makes a temporary directory that can be cleaned up later. You
>>> # can ignore this and just use a string for your filename.
>>> tmpdir = getfixture('tmpdir')
>>> json_filename = tmpdir / 'pdb_structure.json'
>>>
>>> # Convert pdb into a JSON string and save it.
>>> json_str = pdb1.to_json()
>>> with open(json_filename, 'w') as f:
...     f.write(json_str)
...
305616
>>> # Load the JSON string and make a new PDBStructure.
>>> with open(json_filename) as f:
...     in_str = f.read()
...
>>> pdb2 = PDBStructure.from_json(in_str)
>>>
>>> # pdb1 and pdb2 are the same.
>>> aas1 = pdb1.amino_acids
>>> aas2 = pdb2.amino_acids
>>> aa_zip = list(zip(aas1, aas2))
>>> all(aa1.name == aa2.name for aa1, aa2 in aa_zip)
True
>>> all(aa1.index == aa2.index for aa1, aa2 in aa_zip)
True
>>> all(len(aa1.atoms) == len(aa2.atoms) for aa1, aa2 in aa_zip)
True

classmethod from_pdb_file(f, chain='A')

Construct from PDB file without renumbering.

See the pdb_structure module for a description of the PDB file format.

Parameters
  • f (Union[str, Path, TextIO]) – Filename of PDB structure file or file-like PDB structure.

  • chain (str) – Chain to include in constructed object.

Return type

_PDBStructure

Returns

PDBStructure initialized from PDB file.
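
To show what the chain argument selects, here is a standalone sketch that reads the fixed-width ATOM records of PDB-formatted text and keeps one chain. It is not the parser used by from_pdb_file: residues_in_chain is a hypothetical function, the sample records are made up, and skipping HETATM records is an assumption of the sketch. The column positions (residue name in columns 18-20, chain identifier in column 22, residue number in columns 23-26) follow the PDB format.

```python
import io


def residues_in_chain(lines, chain='A'):
    """Collect (residue number, residue name) pairs for one chain.

    `lines` may be any iterable of PDB-formatted lines, e.g. an open file.
    """
    seen = []
    for line in lines:
        # Assumption of this sketch: only ATOM records count (no HETATM).
        if not line.startswith('ATOM') or len(line) < 26:
            continue
        if line[21] != chain:  # column 22: chain identifier
            continue
        key = (int(line[22:26]), line[17:20])  # columns 23-26 and 18-20
        if key not in seen:
            seen.append(key)
    return seen


# Made-up records for illustration only.
SAMPLE = '\n'.join([
    'ATOM      1  N   MET A   1      27.340  24.430   2.614',
    'ATOM      2  CA  MET A   1      26.266  25.413   2.842',
    'ATOM      3  N   GLY B   1      10.000  12.000  14.000',
    'HETATM    4  O   HOH A 101       1.000   2.000   3.000',
])

chain_a = residues_in_chain(io.StringIO(SAMPLE), chain='A')
chain_b = residues_in_chain(io.StringIO(SAMPLE), chain='B')
```

The io.StringIO wrapper mirrors the file-like form of the f parameter. A path opened with open() can be iterated in the same way.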

renumber(p0_aligned)

Renumber the PDB structure to match a ParentSequences object.

Parameters

p0_aligned (str) – Sequence to align to. Usually the first parent from a ParentSequences.

Return type

None

to_json()

Convert structure to JSON string.

See the from_json method for an example.

Return type

str

Returns

JSON string representing the PDBStructure.