pdb_cpp package

Submodules

pdb_cpp.TMalign module

Python wrappers around the TM-align / secondary-structure core.

pdb_cpp.TMalign.compute_secondary_structure(coor, **kwargs)

Compute secondary structure using the TM-align core.

If you use this function, cite the original TM-align paper: Zhang, Y., & Skolnick, J. (2005). TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Research, 33(7), 2302-2309.

Parameters:
coor : Coor

The Coor object containing the protein structure.

**kwargs : dict

Additional keyword arguments passed to the compute_SS function.

Returns:
list

List of per-chain secondary-structure dictionaries per model.

pdb_cpp.alignment module

Sequence-alignment convenience wrappers.

The compiled core implements the actual alignment algorithms; this module keeps the Python surface small and provides packaged defaults such as BLOSUM62.

pdb_cpp.alignment.align_chain_permutation(coor_1, coor_2, back_names=None, matrix_file=None, frame_ref=0)

Align structures by permuting chain order and selecting the best RMSD.

Parameters:
coor_1 : Coor

First coordinate object.

coor_2 : Coor

Second coordinate object.

back_names : list[str], optional

Backbone atom names to use.

matrix_file : str, optional

Path to the scoring matrix file. If None, uses packaged BLOSUM62.

frame_ref : int, optional

Reference frame index in coor_2.

Returns:
tuple

RMSD list and index mappings from the best permutation.
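The permutation search described above can be pictured with a small pure-Python sketch. It is not the compiled implementation: `best_chain_permutation` and the `score` callback are hypothetical names, and the toy score stands in for the RMSD that would be obtained after superposing the paired chains.

```python
from itertools import permutations


def best_chain_permutation(chains_1, chains_2, score):
    """Try every ordering of chains_2 against chains_1 and keep the lowest score.

    `score` is a hypothetical callback: it receives a list of
    (chain_1, chain_2) pairs and returns a float (lower is better).
    """
    if len(chains_1) > len(chains_2):
        raise ValueError("chains_2 must offer at least as many chains as chains_1")
    best_value, best_order = None, None
    for order in permutations(chains_2, len(chains_1)):
        value = score(list(zip(chains_1, order)))
        if best_value is None or value < best_value:
            best_value, best_order = value, list(order)
    return best_value, best_order


# Toy score (assumption): count the pairs whose chain lengths disagree.
lengths = {"A": 120, "B": 85, "X": 85, "Y": 120}


def toy_score(pairs):
    return float(sum(lengths[a] != lengths[b] for a, b in pairs))


value, order = best_chain_permutation(["A", "B"], ["X", "Y"], toy_score)
print(value, order)
```

The number of orderings grows factorially with the chain count, so an exhaustive search like this one is only practical for small assemblies.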

pdb_cpp.alignment.align_seq(seq1, seq2, gap_cost=-11, gap_ext=-1, matrix_file=None)

Align two sequences using a substitution matrix and gap penalties.

Parameters:
seq1 : str

First sequence.

seq2 : str

Second sequence.

gap_cost : int, optional

Score added when a gap is opened (negative values penalize; default -11).

gap_ext : int, optional

Score added when a gap is extended (negative values penalize; default -1).

matrix_file : str, optional

Path to the scoring matrix file. If None, uses packaged BLOSUM62.

Returns:
tuple

Tuple containing the aligned sequences (seq1, seq2) and score.
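To illustrate the kind of result this function returns (two gapped strings plus a score), here is a minimal Needleman-Wunsch global alignment in pure Python. It is a simplified sketch, not the compiled algorithm: it assumes a flat match/mismatch score instead of BLOSUM62 and a single linear gap penalty instead of separate opening and extension penalties, and `global_align` is a hypothetical name.

```python
def global_align(seq1, seq2, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(seq1), len(seq2)
    # score[i][j] = best score for aligning seq1[:i] with seq2[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq1[i - 1] == seq2[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in seq2
                score[i][j - 1] + gap,      # gap in seq1
            )
    # Trace back from the bottom-right corner to rebuild the alignment.
    out1, out2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq1[i - 1] == seq2[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return "".join(reversed(out1)), "".join(reversed(out2)), score[n][m]


aligned_1, aligned_2, total = global_align("GATTACA", "GATACA")
print(aligned_1)
print(aligned_2)
print(total)
```

When several alignments share the best score, the traceback order decides which one is reported, so only the score and the ungapped sequences are guaranteed to be stable.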

pdb_cpp.alignment.print_align_seq(seq_1, seq_2, line_len=80)

Print the aligned sequences with a fixed line length.

Parameters:
seq_1 : str

First sequence.

seq_2 : str

Second sequence.

line_len : int, optional

Length of each output line.

Returns:
None

pdb_cpp.cli_dockq module

Command-line interface for DockQ scoring.

This module turns the analysis helpers into a small CLI that can print either human-readable reports or JSON output.

pdb_cpp.cli_dockq.main(argv: list[str] | None = None) -> int

Run the DockQ command-line interface.

pdb_cpp.core module

class pdb_cpp.core.Alignment_cpp

Bases: pybind11_object

Attributes:
score

Alignment score

seq1

Aligned sequence 1

seq2

Aligned sequence 2

property score

Alignment score

property seq1

Aligned sequence 1

property seq2

Aligned sequence 2

class pdb_cpp.core.Coor(coor_in=None, pdb_id=None, format='', rcsb_structure='asymmetric_unit', assembly_id=1, cache_dir=None, force_download=False)

Bases: pybind11_object

Attributes:
active_model
alterloc_str
beta
chain
chain_str
conect
elem_str
insertres_str
len

Return the number of atoms in the selection.

model_num

Return the number of models in the selection.

models

Return all models of the selection.

name
name_str
num
occ
resid
resname
resname_str
uniq_resid
x
xyz
y
z

Methods

add_Model(self, arg0)

clear(self)

get_Models(self, arg0)

get_aa_DL_seq(self[, gap_in_seq, frame])

Get amino-acid sequences with D-residues encoded as lowercase

get_aa_na_seq(self[, gap_in_seq, frame])

Get amino-acid and nucleic-acid sequences per chain

get_aa_seq(self[, gap_in_seq, frame])

Get amino-acid sequences per chain

get_aa_sequences(self[, gap_in_seq, frame])

Get the amino acid sequence, optionally including gaps and specifying a frame index

get_aa_sequences_dl(self[, gap_in_seq, frame])

Get the amino acid sequence with D-residues encoded as lowercase

get_all_Models(self)

get_alterloc(self[, frame])

get_beta(self[, frame])

get_chain(self[, frame])

get_elem(self[, frame])

get_index_select(self, selection[, frame])

Get indices of atoms based on a selection string and an optional frame index

get_insertres(self[, frame])

get_name(self[, frame])

get_num(self[, frame])

get_occ(self[, frame])

get_resid(self[, frame])

get_resname(self[, frame])

get_uniq_chain(self)

get_uniq_chain_str(self)

get_uniqresid(self[, frame])

get_x(self[, frame])

get_y(self[, frame])

get_z(self[, frame])

model_size(self)

read(self, filename[, format])

Read a structure file; format can be 'pdb', 'cif', 'pqr', or 'gro' (default: infer from extension)

remove_incomplete_backbone_residues(self[, ...])

Remove residues with incomplete backbone atoms

select_atoms(self, selection[, frame])

Select atoms based on a selection string and an optional frame index

select_bool_index(self, arg0)

set_alterloc(self, arg0, arg1)

set_beta(self, arg0, arg1)

set_chain(self, arg0, arg1)

set_elem(self, arg0, arg1)

set_insertres(self, arg0, arg1)

set_name(self, arg0, arg1)

set_num(self, arg0, arg1)

set_occ(self, arg0, arg1)

set_resid(self, arg0, arg1)

set_resname(self, arg0, arg1)

set_uniqresid(self, arg0, arg1)

set_x(self, arg0, arg1)

set_y(self, arg0, arg1)

set_z(self, arg0, arg1)

size(self)

write(self, arg0)

property active_model
add_Model(self: pdb_cpp.core.Coor, arg0: pdb_cpp.core.Model) -> None
property alterloc_str
property beta
property chain
property chain_str
clear(self: pdb_cpp.core.Coor) -> None
property conect
property elem_str
get_Models(self: pdb_cpp.core.Coor, arg0: int) -> pdb_cpp.core.Model
get_aa_DL_seq(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> dict[str, str]

Get amino-acid sequences with D-residues encoded as lowercase

get_aa_na_seq(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> dict[str, str]

Get amino-acid and nucleic-acid sequences per chain

get_aa_seq(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> dict[str, str]

Get amino-acid sequences per chain

get_aa_sequences(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> list[str]

Get the amino acid sequence, optionally including gaps and specifying a frame index

get_aa_sequences_dl(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> list[str]

Get the amino acid sequence with D-residues encoded as lowercase

get_all_Models(self: pdb_cpp.core.Coor) -> list[pdb_cpp.core.Model]
get_alterloc(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(2)]]
get_beta(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
get_chain(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(2)]]
get_elem(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(5)]]
get_index_select(self: pdb_cpp.core.Coor, selection: str, frame: int = 0) -> list[int]

Get indices of atoms based on a selection string and an optional frame index

get_insertres(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(2)]]
get_name(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(5)]]
get_num(self: pdb_cpp.core.Coor, frame: int = 0) -> list[int]
get_occ(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
get_resid(self: pdb_cpp.core.Coor, frame: int = 0) -> list[int]
get_resname(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(5)]]
get_uniq_chain(self: pdb_cpp.core.Coor) -> list[Annotated[list[str], FixedSize(2)]]
get_uniq_chain_str(self: pdb_cpp.core.Coor) -> list[str]
get_uniqresid(self: pdb_cpp.core.Coor, frame: int = 0) -> list[int]
get_x(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
get_y(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
get_z(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
property insertres_str
property len

Return the number of atoms in the selection.

Returns:
int

Number of atoms.

property model_num

Return the number of models in the selection.

Returns:
int

Number of models.

model_size(self: pdb_cpp.core.Coor) -> int
property models

Return all models of the selection.

Returns:
list[Model]

List of models.

property name
property name_str
property num
property occ
read(self: pdb_cpp.core.Coor, filename: str, format: str = '') -> bool

Read a structure file; format can be 'pdb', 'cif', 'pqr', or 'gro' (default: infer from extension)

remove_incomplete_backbone_residues(self: pdb_cpp.core.Coor, back_atom: list[str] = ['CA', 'C', 'N', 'O']) -> pdb_cpp.core.Coor

Remove residues with incomplete backbone atoms

property resid
property resname
property resname_str
select_atoms(self: pdb_cpp.core.Coor, selection: str, frame: int = 0) -> pdb_cpp.core.Coor

Select atoms based on a selection string and an optional frame index

select_bool_index(self: pdb_cpp.core.Coor, arg0: list[bool]) -> pdb_cpp.core.Coor
set_alterloc(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_beta(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
set_chain(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_elem(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_insertres(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_name(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_num(self: pdb_cpp.core.Coor, arg0: int, arg1: int) -> None
set_occ(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
set_resid(self: pdb_cpp.core.Coor, arg0: int, arg1: int) -> None
set_resname(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_uniqresid(self: pdb_cpp.core.Coor, arg0: int, arg1: int) -> None
set_x(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
set_y(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
set_z(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
size(self: pdb_cpp.core.Coor) -> int
property uniq_resid
write(self: pdb_cpp.core.Coor, arg0: str) -> bool
property x
property xyz
property y
property z
class pdb_cpp.core.HBond

Bases: pybind11_object

Attributes:
acceptor_chain
acceptor_name
acceptor_resid
acceptor_resname
acceptor_xyz
angle_DHA
dist_DA
dist_HA
donor_chain
donor_h_name
donor_h_xyz
donor_heavy_name
donor_heavy_xyz
donor_resid
donor_resname
property acceptor_chain
property acceptor_name
property acceptor_resid
property acceptor_resname
property acceptor_xyz
property angle_DHA
property dist_DA
property dist_HA
property donor_chain
property donor_h_name
property donor_h_xyz
property donor_heavy_name
property donor_heavy_xyz
property donor_resid
property donor_resname
class pdb_cpp.core.Model

Bases: pybind11_object

Attributes:
alterloc_str
beta
chain
chain_str
elem_str
insertres_str
len

Return the number of atoms in the selection.

name
name_str
num
occ
resid
resname
resname_str
uniq_resid
x
xyz
y
z

Methods

addAtom(self, arg0, arg1, arg2, arg3, arg4, ...)

clear(self)

get_alterloc(self)

get_beta(self)

get_centroid(*args, **kwargs)

Overloaded function.

get_chain(self)

get_elem(self)

get_field(self)

get_insertres(self)

get_name(self)

get_num(self)

get_occ(self)

get_resid(self)

get_resname(self)

get_uniqresid(self)

get_x(self)

get_y(self)

get_z(self)

select_atoms(self, arg0)

set_alterloc(self, arg0, arg1)

set_beta(self, arg0, arg1)

set_chain(self, arg0, arg1)

set_elem(self, arg0, arg1)

set_insertres(self, arg0, arg1)

set_name(self, arg0, arg1)

set_num(self, arg0, arg1)

set_occ(self, arg0, arg1)

set_resid(self, arg0, arg1)

set_resname(self, arg0, arg1)

set_uniqresid(self, arg0, arg1)

set_x(self, arg0, arg1)

set_y(self, arg0, arg1)

set_z(self, arg0, arg1)

size(self)

addAtom(self: pdb_cpp.core.Model, arg0: int, arg1: Annotated[list[str], FixedSize(5)], arg2: Annotated[list[str], FixedSize(5)], arg3: int, arg4: Annotated[list[str], FixedSize(2)], arg5: float, arg6: float, arg7: float, arg8: float, arg9: float, arg10: Annotated[list[str], FixedSize(2)], arg11: Annotated[list[str], FixedSize(5)], arg12: Annotated[list[str], FixedSize(2)], arg13: bool, arg14: int) -> bool
property alterloc_str
property beta
property chain
property chain_str
clear(self: pdb_cpp.core.Model) -> None
property elem_str
get_alterloc(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(2)]]
get_beta(self: pdb_cpp.core.Model) -> list[float]
get_centroid(*args, **kwargs)

Overloaded function.

  1. get_centroid(self: pdb_cpp.core.Model) -> Annotated[list[float], FixedSize(3)]

Calculate centroid of all atoms in the model

  2. get_centroid(self: pdb_cpp.core.Model, indices: list[int]) -> Annotated[list[float], FixedSize(3)]

Calculate centroid of atoms at specified indices

get_chain(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(2)]]
get_elem(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(5)]]
get_field(self: pdb_cpp.core.Model) -> list[bool]
get_insertres(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(2)]]
get_name(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(5)]]
get_num(self: pdb_cpp.core.Model) -> list[int]
get_occ(self: pdb_cpp.core.Model) -> list[float]
get_resid(self: pdb_cpp.core.Model) -> list[int]
get_resname(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(5)]]
get_uniqresid(self: pdb_cpp.core.Model) -> list[int]
get_x(self: pdb_cpp.core.Model) -> list[float]
get_y(self: pdb_cpp.core.Model) -> list[float]
get_z(self: pdb_cpp.core.Model) -> list[float]
property insertres_str
property len

Return the number of atoms in the selection.

Returns:
int

Number of atoms.

property name
property name_str
property num
property occ
property resid
property resname
property resname_str
select_atoms(self: pdb_cpp.core.Model, arg0: str) -> list[bool]
set_alterloc(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_beta(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
set_chain(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_elem(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_insertres(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_name(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_num(self: pdb_cpp.core.Model, arg0: int, arg1: int) -> None
set_occ(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
set_resid(self: pdb_cpp.core.Model, arg0: int, arg1: int) -> None
set_resname(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_uniqresid(self: pdb_cpp.core.Model, arg0: int, arg1: int) -> None
set_x(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
set_y(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
set_z(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
size(self: pdb_cpp.core.Model) -> int
property uniq_resid
property x
property xyz
property y
property z
class pdb_cpp.core.TMalignResult

Bases: pybind11_object

Attributes:
L_ali
Liden
TM1
TM2
TM_ali
rmsd
rotation
seqM
seqxA
seqyA
translation
property L_ali
property Liden
property TM1
property TM2
property TM_ali
property rmsd
property rotation
property seqM
property seqxA
property seqyA
property translation
pdb_cpp.core.align_chain_permutation(coor_1: pdb_cpp.core.Coor, coor_2: pdb_cpp.core.Coor, back_names: list[str] = ['C', 'N', 'O', 'CA'], matrix_file: str = '', frame_ref: int = 0) -> tuple[list[float], tuple[list[int], list[int]]]

Align structures by permuting chain order and selecting the best RMSD.

pdb_cpp.core.align_index_based(coor_1: pdb_cpp.core.Coor, coor_2: pdb_cpp.core.Coor, index_1: list[int], index_2: list[int], frame_ref: int = 0) -> tuple[list[float], list[int], list[int]]

Align two coordinate structures using pre-computed atom index pairs, skipping the sequence-alignment step. Accepts non-sequential / non-contiguous index lists. Returns (rmsds, index_1, index_2).

pdb_cpp.core.align_seq_based(coor_1: pdb_cpp.core.Coor, coor_2: pdb_cpp.core.Coor, chain_1: list[str] = ['A'], chain_2: list[str] = ['A'], back_names: list[str] = ['C', 'N', 'O', 'CA'], matrix_file: str = '', frame_ref: int = 0) -> tuple[list[float], list[int], list[int]]

Align two coordinate structures using sequence-based alignment

pdb_cpp.core.compute_SS(coor: pdb_cpp.core.Coor, gap_in_seq: bool = False) -> list[list[str]]

Compute secondary structure for all models in a Coor object

pdb_cpp.core.compute_dihedrals(pts: numpy.ndarray[numpy.float32]) -> numpy.ndarray[numpy.float32]

Compute all consecutive dihedral angles from an ordered (N, 3) float array.

Parameters:
pts : ndarray, shape (N, 3)

Ordered 3-D coordinates (e.g. consecutive CA positions).

Returns:
ndarray, shape (N-3,)

Dihedral angles in degrees. Returns an empty array when N < 4.
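As a sketch of the underlying geometry, the following pure-Python version computes one signed dihedral per window of four consecutive points with the usual atan2 formula, so N points give N-3 angles and fewer than four points give an empty list. The helper names are hypothetical, and the sign convention of the compiled function is not stated in this reference, so treat the sign here as an assumption.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)  # normals of the two planes
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def consecutive_dihedrals(points):
    """One angle per sliding window of four points (empty when N < 4)."""
    return [dihedral(*points[i:i + 4]) for i in range(len(points) - 3)]


cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
chain = consecutive_dihedrals([(1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1), (1, 1, 1)])
print(cis, trans, chain)
```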

pdb_cpp.core.compute_hbonds(donor_model: pdb_cpp.core.Model, acceptor_model: pdb_cpp.core.Model, full_model: pdb_cpp.core.Model, dist_DA_cutoff: float = 3.5, dist_HA_cutoff: float = 2.5, angle_cutoff: float = 90.0) -> list[pdb_cpp.core.HBond]

Compute hydrogen bonds between two selections using Baker & Hubbard geometric criteria.

Parameters:
donor_model : Model

Model containing potential donor atoms (a subselection of a frame).

acceptor_model : Model

Model containing potential acceptor atoms (a subselection of a frame).

full_model : Model

The complete frame used to reconstruct backbone N-H positions.

dist_DA_cutoff : float, optional

Maximum donor-heavy to acceptor distance in Å (default 3.5).

dist_HA_cutoff : float, optional

Maximum hydrogen to acceptor distance in Å (default 2.5).

angle_cutoff : float, optional

Minimum D-H···A angle in degrees (default 90).

Returns:
list[HBond]

List of detected hydrogen bonds with full geometry information.
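The three cutoffs can be read as a simple geometric test on one donor, its hydrogen, and one acceptor. The sketch below is an illustration only: `is_hbond` is a hypothetical helper, and it assumes that all three criteria must hold at once and that the limits are inclusive, neither of which is spelled out in this reference.

```python
import math


def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    cos = sum(x * y for x, y in zip(v1, v2)) / norm
    cos = max(-1.0, min(1.0, cos))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos))


def is_hbond(donor, hydrogen, acceptor,
             dist_DA_cutoff=3.5, dist_HA_cutoff=2.5, angle_cutoff=90.0):
    """Apply the distance and angle cutoffs to one donor/hydrogen/acceptor triple."""
    return (
        math.dist(donor, acceptor) <= dist_DA_cutoff
        and math.dist(hydrogen, acceptor) <= dist_HA_cutoff
        and angle_deg(donor, hydrogen, acceptor) >= angle_cutoff
    )


donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
linear = is_hbond(donor, hydrogen, (2.9, 0.0, 0.0))   # straight D-H...A
bent = is_hbond(donor, hydrogen, (0.5, 2.0, 0.0))     # close, but angle below 90
far = is_hbond(donor, hydrogen, (5.0, 0.0, 0.0))      # donor-acceptor too distant
print(linear, bent, far)
```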

pdb_cpp.core.compute_sasa(model: pdb_cpp.core.Model, probe_radius: float = 1.399999976158142, n_points: int = 960, include_hydrogen: bool = False, by_atom: bool = False) -> dict

Compute solvent-accessible surface area for one Model with a Shrake-Rupley sampler
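For readers unfamiliar with the Shrake-Rupley method, here is a compact pure-Python sketch: each atom gets a sphere of test points at radius (atomic radius + probe radius), and the accessible area is the fraction of points that fall outside every other atom's expanded sphere. The function names, the golden-spiral point layout, and the 1.7 Å radius used in the example are assumptions for illustration; the compiled routine may differ in all of them.

```python
import math


def sphere_points(n):
    """Spread n points roughly evenly on the unit sphere (golden-spiral layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(k * golden), r * math.sin(k * golden), z))
    return points


def shrake_rupley(centers, radii, probe_radius=1.4, n_points=960):
    """Return one accessible surface area (in squared length units) per atom."""
    unit = sphere_points(n_points)
    areas = []
    for i, (ci, ri) in enumerate(zip(centers, radii)):
        big_r = ri + probe_radius
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = ci[0] + big_r * ux, ci[1] + big_r * uy, ci[2] + big_r * uz
            buried = False
            for j, (cj, rj) in enumerate(zip(centers, radii)):
                if j == i:
                    continue
                limit = rj + probe_radius
                d2 = (px - cj[0]) ** 2 + (py - cj[1]) ** 2 + (pz - cj[2]) ** 2
                if d2 < limit * limit:
                    buried = True  # test point lies inside a neighbour
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas


isolated = shrake_rupley([(0.0, 0.0, 0.0)], [1.7])[0]
pair = shrake_rupley([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.7, 1.7])
print(isolated, pair)
```

This brute-force loop checks every atom against every other; production implementations usually restrict the inner loop to spatial neighbours.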

pdb_cpp.core.coor_align(coor_1: pdb_cpp.core.Coor, coor_2: pdb_cpp.core.Coor, index_1: list[int], index_2: list[int], frame_ref: int = 0) -> None

Align two coordinate structures using quaternion-based rotation

pdb_cpp.core.distance_matrix(xyz_a: numpy.ndarray[numpy.float32], xyz_b: numpy.ndarray[numpy.float32]) -> numpy.ndarray[numpy.float32]

Compute a pairwise distance matrix between two coordinate sets
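A pairwise distance matrix is simply every point of the first set measured against every point of the second. The pure-Python sketch below uses a hypothetical `pairwise_distances` helper and assumes rows follow the first set and columns the second; the orientation of the compiled function's output is not stated in this reference.

```python
import math


def pairwise_distances(xyz_a, xyz_b):
    """Return a len(xyz_a) x len(xyz_b) nested list of Euclidean distances."""
    return [[math.dist(a, b) for b in xyz_b] for a in xyz_a]


set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
matrix = pairwise_distances(set_a, set_b)
for row in matrix:
    print(row)
```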

pdb_cpp.core.get_common_atoms(coor_1: pdb_cpp.core.Coor, coor_2: pdb_cpp.core.Coor, chain_1: list[str] = ['A'], chain_2: list[str] = ['A'], back_names: list[str] = ['C', 'N', 'O', 'CA'], matrix_file: str = '') -> tuple[list[int], list[int]]

Get common atoms between two Coor objects based on sequence alignment

pdb_cpp.core.hy36decode(width: int, value: str) -> int

Decode a hybrid-36 string with fixed width.

pdb_cpp.core.hy36encode(width: int, value: int) -> str

Encode a number using hybrid-36 with fixed width.
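Hybrid-36 extends fixed-width decimal fields (such as PDB atom serial numbers) by switching to upper-case and then lower-case base-36 once the decimal range is exhausted. The sketch below follows the published scheme as a pure-Python illustration; `hy36_encode` and `hy36_decode` are hypothetical stand-ins, and edge-case handling in the compiled functions may differ.

```python
UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
LOWER = UPPER.lower()


def _to_base36(digits, value):
    """Write a non-negative integer with the given 36-character alphabet."""
    if value == 0:
        return digits[0]
    out = []
    while value:
        value, rest = divmod(value, 36)
        out.append(digits[rest])
    return "".join(reversed(out))


def hy36_encode(width, value):
    block = 26 * 36 ** (width - 1)   # size of each base-36 range
    offset = 10 * 36 ** (width - 1)  # makes the first character a letter
    if value >= 1 - 10 ** (width - 1):
        if value < 10 ** width:
            return str(value).rjust(width)  # plain decimal, right-justified
        value -= 10 ** width
        if value < block:
            return _to_base36(UPPER, value + offset)
        value -= block
        if value < block:
            return _to_base36(LOWER, value + offset)
    raise ValueError("value out of range for this width")


def hy36_decode(width, text):
    if len(text) != width:
        raise ValueError("text must have exactly `width` characters")
    if not text.strip():
        return 0  # an all-blank field decodes to zero
    first = text[0]
    if first in "- " or first.isdigit():
        return int(text)
    if first.isupper() and text == text.upper():
        return int(text, 36) - 10 * 36 ** (width - 1) + 10 ** width
    if first.islower() and text == text.lower():
        return int(text, 36) + 16 * 36 ** (width - 1) + 10 ** width
    raise ValueError("invalid hybrid-36 string")


for number in (12, 9999, 10000, 1223055, 1223056, 2436111):
    print(number, repr(hy36_encode(4, number)))
```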

pdb_cpp.core.rmsd(coor_1: pdb_cpp.core.Coor, coor_2: pdb_cpp.core.Coor, index_1: list[int], index_2: list[int], frame_ref: int = 0) -> list[float]

Compute RMSD values for all models in coor_1 against one reference model in coor_2.
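The "one value per model against a single reference" behaviour can be sketched in pure Python as follows. The helper `rmsd_to_reference` is hypothetical and works on plain coordinate lists rather than Coor objects; it assumes the coordinates are compared as given, without any superposition step.

```python
import math


def rmsd_pair(xyz_1, xyz_2, index_1, index_2):
    """RMSD between matched atoms xyz_1[index_1[k]] and xyz_2[index_2[k]]."""
    if len(index_1) != len(index_2) or not index_1:
        raise ValueError("index lists must be non-empty and of equal length")
    total = 0.0
    for i, j in zip(index_1, index_2):
        total += sum((a - b) ** 2 for a, b in zip(xyz_1[i], xyz_2[j]))
    return math.sqrt(total / len(index_1))


def rmsd_to_reference(models, reference, index_1, index_2):
    """One RMSD per model, each measured against the same reference frame."""
    return [rmsd_pair(model, reference, index_1, index_2) for model in models]


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
models = [
    reference,                                    # identical frame
    [(x + 1.0, y, z) for x, y, z in reference],   # shifted by 1 along x
]
values = rmsd_to_reference(models, reference, [0, 1, 2], [0, 1, 2])
print(values)
```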

pdb_cpp.core.sequence_align(seq1: str, seq2: str, matrix_file: str = '', GAP_COST: int = -11, GAP_EXT: int = -1) -> pdb_cpp.core.Alignment_cpp

Align two sequences using a substitution matrix and gap penalties

pdb_cpp.core.tmalign_ca(coor_1: pdb_cpp.core.Coor, coor_2: pdb_cpp.core.Coor, chain_1: list[str] = ['A'], chain_2: list[str] = ['A'], mm: int = 0, include_transform: bool = False) -> pdb_cpp.core.TMalignResult

Align CA atoms of selected chains using the TM-align core from USalign

pdb_cpp.geom module

pdb_cpp.geom.apply_transform(coords, rotation, translation)

Apply a rigid transform to 3D coordinates.

The transform follows $x' = R x + t$ with row-wise coordinate arrays, implemented as coords @ R.T + t.

Parameters:
coords : array-like, shape (N, 3)

Input coordinates.

rotation : array-like, shape (3, 3)

Rotation matrix.

translation : array-like, shape (3,)

Translation vector.

Returns:
numpy.ndarray

Transformed coordinates with shape (N, 3).
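The relation $x' = R x + t$ can be checked by hand with a dependency-free sketch. `apply_rigid_transform` below is a hypothetical pure-Python equivalent that works on nested lists; it applies the rotation row by row, which gives the same result as coords @ R.T + t.

```python
def apply_rigid_transform(coords, rotation, translation):
    """Return R x + t for every row x of coords."""
    out = []
    for x in coords:
        out.append(tuple(
            sum(rotation[r][c] * x[c] for c in range(3)) + translation[r]
            for r in range(3)
        ))
    return out


# 90-degree rotation about the z axis, followed by a shift of (1, 2, 3).
rotation = [
    [0.0, -1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
]
translation = (1.0, 2.0, 3.0)
moved = apply_rigid_transform([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], rotation, translation)
print(moved)
```

The first point (1, 0, 0) rotates onto the y axis before the shift, and the second point (0, 1, 0) rotates onto the negative x axis.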

pdb_cpp.geom.compute_dihedrals(pts: numpy.ndarray[numpy.float32]) -> numpy.ndarray[numpy.float32]

Compute all consecutive dihedral angles from an ordered (N, 3) float array.

Parameters:
pts : ndarray, shape (N, 3)

Ordered 3-D coordinates (e.g. consecutive CA positions).

Returns:
ndarray, shape (N-3,)

Dihedral angles in degrees. Returns an empty array when N < 4.

pdb_cpp.geom.distance_matrix(xyz_a: numpy.ndarray[numpy.float32], xyz_b: numpy.ndarray[numpy.float32]) -> numpy.ndarray[numpy.float32]

Compute a pairwise distance matrix between two coordinate sets

pdb_cpp.rcsb module

Helpers for downloading and loading structures from the RCSB PDB.

pdb_cpp.rcsb.build_download_url(pdb_id, structure='asymmetric_unit', file_format='cif', assembly_id=1)

Build an RCSB download URL for a structure file.

Parameters:
pdb_id : str

PDB identifier.

structure : str, default="asymmetric_unit"

Either "asymmetric_unit" (the deposited asymmetric unit) or "biological_assembly".

file_format : str, default="cif"

Download format. Supported values are "cif" and "pdb".

assembly_id : int, default=1

Biological assembly identifier, used when structure is "biological_assembly".

Returns:
str

Download URL.
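As a rough illustration of how the four parameters could combine into a URL, the sketch below uses the file naming commonly seen on the RCSB file server. Everything in it is an assumption rather than the library's actual logic: `sketch_download_url` is a hypothetical name, and the base address and file-name patterns may not match what pdb_cpp builds.

```python
def sketch_download_url(pdb_id, structure="asymmetric_unit",
                        file_format="cif", assembly_id=1):
    """Assemble a download URL from the four documented parameters."""
    base = "https://files.rcsb.org/download"  # assumed base address
    if file_format not in ("cif", "pdb"):
        raise ValueError("file_format must be 'cif' or 'pdb'")
    if structure == "asymmetric_unit":
        return f"{base}/{pdb_id}.{file_format}"
    if structure == "biological_assembly":
        if file_format == "cif":
            return f"{base}/{pdb_id}-assembly{assembly_id}.cif"
        return f"{base}/{pdb_id}.pdb{assembly_id}"
    raise ValueError("structure must be 'asymmetric_unit' or 'biological_assembly'")


print(sketch_download_url("4HHB"))
print(sketch_download_url("4HHB", structure="biological_assembly", assembly_id=2))
```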

pdb_cpp.rcsb.download(*args, **kwargs)

Alias for download_structure().

pdb_cpp.rcsb.download_structure(pdb_id, structure='asymmetric_unit', file_format='cif', assembly_id=1, cache_dir=None, force_download=False)

Download and cache an RCSB structure file.

Parameters:
pdb_id : str

PDB identifier.

structure : str, default="asymmetric_unit"

Either "asymmetric_unit" or "biological_assembly".

file_format : str, default="cif"

Download format. Supported values are "cif" and "pdb".

assembly_id : int, default=1

Assembly identifier for biological assemblies.

cache_dir : str, optional

Cache directory. Defaults to a temporary directory managed by pdb_cpp.

force_download : bool, default=False

Re-download the file even when it is already cached.

Returns:
str

Local path to the cached structure file.

pdb_cpp.rcsb.load(*args, **kwargs)

Alias for load_structure().

pdb_cpp.rcsb.load_structure(pdb_id, structure='asymmetric_unit', file_format='cif', assembly_id=1, cache_dir=None, force_download=False)

Download a structure from RCSB and return it as a Coor object.

pdb_cpp.select module

Selection helpers that wrap the coordinate-object methods.

pdb_cpp.select.remove_incomplete_backbone_residues(coor, back_atom=None)

Remove residues with incomplete backbone atoms.

Parameters:
coor : Coor

Coordinate object to clean.

back_atom : list[str], optional

Backbone atom names to require per residue.

Returns:
Coor

A new Coor object with incomplete residues removed.
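The filtering rule can be sketched on plain tuples: group atoms by residue, then keep only residues that contain every required backbone atom name. The helper `drop_incomplete_residues` and the (chain, resid, name) tuple layout are assumptions made for this illustration; the real function operates on Coor objects.

```python
def drop_incomplete_residues(atoms, back_atom=("CA", "C", "N", "O")):
    """Keep atoms whose residue contains every name listed in back_atom.

    `atoms` is a list of (chain, resid, atom_name) tuples.
    """
    required = set(back_atom)
    names_by_residue = {}
    for chain, resid, name in atoms:
        names_by_residue.setdefault((chain, resid), set()).add(name)
    complete = {
        key for key, names in names_by_residue.items() if required <= names
    }
    return [atom for atom in atoms if (atom[0], atom[1]) in complete]


atoms = [
    ("A", 1, "N"), ("A", 1, "CA"), ("A", 1, "C"), ("A", 1, "O"), ("A", 1, "CB"),
    ("A", 2, "N"), ("A", 2, "CA"), ("A", 2, "C"),   # residue 2 lacks O
]
kept = drop_incomplete_residues(atoms)
print(kept)
```

Side-chain atoms such as CB are kept along with the backbone when their residue is complete, and a new list is returned so the input stays untouched.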

pdb_cpp.sequence module

Thin wrappers around core sequence helpers.

class pdb_cpp.sequence.Coor(coor_in=None, pdb_id=None, format='', rcsb_structure='asymmetric_unit', assembly_id=1, cache_dir=None, force_download=False)

Bases: pybind11_object

Same class as pdb_cpp.core.Coor (its method signatures take self: pdb_cpp.core.Coor); the attributes and methods are identical to those listed under pdb_cpp.core.Coor.

pdb_cpp.sequence.get_aa_DL_seq(coor, gap_in_seq=True, frame=0)

Return the amino acid sequence with D-residues in lowercase.

Parameters:
coor : Coor

Coordinate object.

gap_in_seq : bool, optional

Whether to insert gaps for missing residues.

frame : int, optional

Frame index to use for the selection.

Returns:
dict

Mapping of chain ID to sequence.

pdb_cpp.sequence.get_aa_seq(coor, gap_in_seq=True, frame=0)

Return the amino acid sequence of the selection.

Parameters:
coor : Coor

Coordinate object.

gap_in_seq : bool, optional

Whether to insert gaps for missing residues.

frame : int, optional

Frame index to use for the selection.

Returns:
dict

Mapping of chain ID to sequence.
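The chain-to-sequence mapping returned by both helpers can be pictured with the sketch below. It is an illustration under stated assumptions: the residue tables are deliberately partial, `sequences_by_chain` is a hypothetical name, a "-" per missing residue number is assumed for gaps, and D-residues are simply skipped when lowercase encoding is off, which may not match the compiled behaviour.

```python
# Partial lookup tables, for illustration only.
THREE_TO_ONE = {"ALA": "A", "GLY": "G", "SER": "S", "LEU": "L", "LYS": "K"}
D_TO_ONE = {"DAL": "a", "DSN": "s", "DLE": "l", "DLY": "k"}


def sequences_by_chain(residues, gap_in_seq=True, d_lowercase=False):
    """Build {chain: sequence} from (chain, resid, resname) tuples in file order."""
    seqs, last_resid = {}, {}
    for chain, resid, resname in residues:
        letter = THREE_TO_ONE.get(resname)
        if letter is None and d_lowercase:
            letter = D_TO_ONE.get(resname)
        if letter is None:
            continue  # residue unknown to these tables
        if gap_in_seq and chain in last_resid:
            missing = resid - last_resid[chain] - 1
            seqs[chain] += "-" * max(missing, 0)  # one dash per skipped number
        seqs[chain] = seqs.get(chain, "") + letter
        last_resid[chain] = resid
    return seqs


residues = [
    ("A", 1, "ALA"), ("A", 2, "GLY"), ("A", 5, "SER"),
    ("B", 1, "DAL"), ("B", 2, "LYS"),
]
plain = sequences_by_chain(residues)
with_d = sequences_by_chain(residues, d_lowercase=True)
no_gap = sequences_by_chain(residues, gap_in_seq=False)
print(plain, with_d, no_gap)
```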

Module contents

Public package entry point for pdb_cpp.

Importing this module exposes the primary coordinate classes and loads the runtime patches that add Python-friendly helpers onto the C++ bindings.

class pdb_cpp.Coor(coor_in=None, pdb_id=None, format='', rcsb_structure='asymmetric_unit', assembly_id=1, cache_dir=None, force_download=False)

Bases: pybind11_object

Attributes:
active_model
alterloc_str
beta
chain
chain_str
conect
elem_str
insertres_str
len

Return the number of atoms in the selection.

model_num

Return the number of models in the selection.

models

Return all models of the selection.

name
name_str
num
occ
resid
resname
resname_str
uniq_resid
x
xyz
y
z

Methods

add_Model(self, arg0)

clear(self)

get_Models(self, arg0)

get_aa_DL_seq(self[, gap_in_seq, frame])

Get amino-acid sequences with D-residues encoded as lowercase

get_aa_na_seq(self[, gap_in_seq, frame])

Get amino-acid and nucleic-acid sequences per chain

get_aa_seq(self[, gap_in_seq, frame])

Get amino-acid sequences per chain

get_aa_sequences(self[, gap_in_seq, frame])

Get the amino acid sequence, optionally including gaps and specifying a frame index

get_aa_sequences_dl(self[, gap_in_seq, frame])

Get the amino acid sequence with D-residues encoded as lowercase

get_all_Models(self)

get_alterloc(self[, frame])

get_beta(self[, frame])

get_chain(self[, frame])

get_elem(self[, frame])

get_index_select(self, selection[, frame])

Get indices of atoms based on a selection string and an optional frame index

get_insertres(self[, frame])

get_name(self[, frame])

get_num(self[, frame])

get_occ(self[, frame])

get_resid(self[, frame])

get_resname(self[, frame])

get_uniq_chain(self)

get_uniq_chain_str(self)

get_uniqresid(self[, frame])

get_x(self[, frame])

get_y(self[, frame])

get_z(self[, frame])

model_size(self)

read(self, filename[, format])

Read a structure file; format can be 'pdb', 'cif', 'pqr', or 'gro' (default: infer from extension)

remove_incomplete_backbone_residues(self[, ...])

Remove residues with incomplete backbone atoms

select_atoms(self, selection[, frame])

Select atoms based on a selection string and an optional frame index

select_bool_index(self, arg0)

set_alterloc(self, arg0, arg1)

set_beta(self, arg0, arg1)

set_chain(self, arg0, arg1)

set_elem(self, arg0, arg1)

set_insertres(self, arg0, arg1)

set_name(self, arg0, arg1)

set_num(self, arg0, arg1)

set_occ(self, arg0, arg1)

set_resid(self, arg0, arg1)

set_resname(self, arg0, arg1)

set_uniqresid(self, arg0, arg1)

set_x(self, arg0, arg1)

set_y(self, arg0, arg1)

set_z(self, arg0, arg1)

size(self)

write(self, arg0)

property active_model
add_Model(self: pdb_cpp.core.Coor, arg0: pdb_cpp.core.Model) -> None
property alterloc_str
property beta
property chain
property chain_str
clear(self: pdb_cpp.core.Coor) -> None
property conect
property elem_str
get_Models(self: pdb_cpp.core.Coor, arg0: int) -> pdb_cpp.core.Model
get_aa_DL_seq(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> dict[str, str]

Get amino-acid sequences with D-residues encoded as lowercase

get_aa_na_seq(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> dict[str, str]

Get amino-acid and nucleic-acid sequences per chain

get_aa_seq(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> dict[str, str]

Get amino-acid sequences per chain
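
As a minimal sketch of how the returned dictionary might be consumed, the helper below counts residues per chain. `chain_lengths` is a hypothetical name, not part of pdb_cpp. It relies only on the documented signature and assumes that the dictionary keys are chain identifiers.

```python
def chain_lengths(coor, frame=0):
    """Return {chain_id: residue_count} for one frame of a Coor-like object.

    Hypothetical helper: it only needs an object exposing
    get_aa_seq(gap_in_seq=..., frame=...) -> dict[str, str].
    """
    # gap_in_seq=False requests sequences without gap placeholders,
    # so len(seq) counts only the residues actually present.
    seqs = coor.get_aa_seq(gap_in_seq=False, frame=frame)
    return {chain: len(seq) for chain, seq in seqs.items()}
```

Because the helper is duck-typed, it can be exercised with a small stand-in object before being pointed at a real `Coor`.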

get_aa_sequences(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> list[str]

Get the amino acid sequence, optionally including gaps and specifying a frame index

get_aa_sequences_dl(self: pdb_cpp.core.Coor, gap_in_seq: bool = True, frame: int = 0) -> list[str]

Get the amino acid sequence with D-residues encoded as lowercase

get_all_Models(self: pdb_cpp.core.Coor) -> list[pdb_cpp.core.Model]
get_alterloc(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(2)]]
get_beta(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
get_chain(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(2)]]
get_elem(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(5)]]
get_index_select(self: pdb_cpp.core.Coor, selection: str, frame: int = 0) -> list[int]

Get indices of atoms based on a selection string and an optional frame index

get_insertres(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(2)]]
get_name(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(5)]]
get_num(self: pdb_cpp.core.Coor, frame: int = 0) -> list[int]
get_occ(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
get_resid(self: pdb_cpp.core.Coor, frame: int = 0) -> list[int]
get_resname(self: pdb_cpp.core.Coor, frame: int = 0) -> list[Annotated[list[str], FixedSize(5)]]
get_uniq_chain(self: pdb_cpp.core.Coor) -> list[Annotated[list[str], FixedSize(2)]]
get_uniq_chain_str(self: pdb_cpp.core.Coor) -> list[str]
get_uniqresid(self: pdb_cpp.core.Coor, frame: int = 0) -> list[int]
get_x(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
get_y(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
get_z(self: pdb_cpp.core.Coor, frame: int = 0) -> list[float]
property insertres_str
property len

Return the number of atoms in the selection.

Returns:
int

Number of atoms.

property model_num

Return the number of models in the selection.

Returns:
int

Number of models.

model_size(self: pdb_cpp.core.Coor) -> int
property models

Return all models of the selection.

Returns:
list[Model]

List of models.

property name
property name_str
property num
property occ
read(self: pdb_cpp.core.Coor, filename: str, format: str = '') -> bool

Read a structure file; format can be 'pdb', 'cif', 'pqr', or 'gro' (default: infer from extension)
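
Since `read` returns a `bool` rather than raising, callers may want to check the result explicitly. The sketch below wraps that check. `load_or_raise` is a hypothetical helper, and it assumes that `True` means the file was read successfully, which this reference does not state.

```python
def load_or_raise(coor, filename, fmt=""):
    """Call coor.read() and raise OSError if it reports failure.

    Hypothetical helper; assumes read() returns True on success
    and False on failure.
    """
    # An empty format string leaves format detection to the library
    # (per the docstring: inferred from the file extension).
    if not coor.read(filename, fmt):
        raise OSError(f"could not read structure file: {filename!r}")
    return coor
```

With the real package this might be called as `load_or_raise(pdb_cpp.Coor(), "structure.pdb")`, where `structure.pdb` is a placeholder path.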

remove_incomplete_backbone_residues(self: pdb_cpp.core.Coor, back_atom: list[str] = ['CA', 'C', 'N', 'O']) -> pdb_cpp.core.Coor

Remove residues with incomplete backbone atoms

property resid
property resname
property resname_str
select_atoms(self: pdb_cpp.core.Coor, selection: str, frame: int = 0) -> pdb_cpp.core.Coor

Select atoms based on a selection string and an optional frame index

select_bool_index(self: pdb_cpp.core.Coor, arg0: list[bool]) -> pdb_cpp.core.Coor
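
The boolean-mask form pairs naturally with `Model.select_atoms`, which returns one `bool` per atom. The sketch below combines two masks before applying them. `select_both` is a hypothetical helper, and it assumes that the integer passed to `get_Models` is a model index and that the mask must have one entry per atom; neither is confirmed by this reference.

```python
def select_both(coor, selection_a, selection_b, frame=0):
    """Return the atoms matching both selections, via boolean masks.

    Hypothetical helper built only from documented signatures:
    Coor.get_Models(int) -> Model, Model.select_atoms(str) -> list[bool],
    Coor.select_bool_index(list[bool]) -> Coor.
    """
    model = coor.get_Models(frame)  # assumption: argument is a model index
    mask_a = model.select_atoms(selection_a)
    mask_b = model.select_atoms(selection_b)
    # Element-wise AND keeps an atom only if both selections matched it.
    mask = [a and b for a, b in zip(mask_a, mask_b)]
    return coor.select_bool_index(mask)
```

Working on masks keeps the set logic in plain Python, which can be convenient when a criterion is easier to compute outside the selection language.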
set_alterloc(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_beta(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
set_chain(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_elem(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_insertres(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_name(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_num(self: pdb_cpp.core.Coor, arg0: int, arg1: int) -> None
set_occ(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
set_resid(self: pdb_cpp.core.Coor, arg0: int, arg1: int) -> None
set_resname(self: pdb_cpp.core.Coor, arg0: int, arg1: str) -> None
set_uniqresid(self: pdb_cpp.core.Coor, arg0: int, arg1: int) -> None
set_x(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
set_y(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
set_z(self: pdb_cpp.core.Coor, arg0: int, arg1: float) -> None
size(self: pdb_cpp.core.Coor) -> int
property uniq_resid
write(self: pdb_cpp.core.Coor, arg0: str) -> bool
property x
property xyz
property y
property z
class pdb_cpp.Model

Bases: pybind11_object

Attributes:
alterloc_str
beta
chain
chain_str
elem_str
insertres_str
len

Return the number of atoms in the selection.

name
name_str
num
occ
resid
resname
resname_str
uniq_resid
x
xyz
y
z

Methods

addAtom(self, arg0, arg1, arg2, arg3, arg4, ...)

clear(self)

get_alterloc(self)

get_beta(self)

get_centroid(*args, **kwargs)

Overloaded function.

get_chain(self)

get_elem(self)

get_field(self)

get_insertres(self)

get_name(self)

get_num(self)

get_occ(self)

get_resid(self)

get_resname(self)

get_uniqresid(self)

get_x(self)

get_y(self)

get_z(self)

select_atoms(self, arg0)

set_alterloc(self, arg0, arg1)

set_beta(self, arg0, arg1)

set_chain(self, arg0, arg1)

set_elem(self, arg0, arg1)

set_insertres(self, arg0, arg1)

set_name(self, arg0, arg1)

set_num(self, arg0, arg1)

set_occ(self, arg0, arg1)

set_resid(self, arg0, arg1)

set_resname(self, arg0, arg1)

set_uniqresid(self, arg0, arg1)

set_x(self, arg0, arg1)

set_y(self, arg0, arg1)

set_z(self, arg0, arg1)

size(self)

addAtom(self: pdb_cpp.core.Model, arg0: int, arg1: Annotated[list[str], FixedSize(5)], arg2: Annotated[list[str], FixedSize(5)], arg3: int, arg4: Annotated[list[str], FixedSize(2)], arg5: float, arg6: float, arg7: float, arg8: float, arg9: float, arg10: Annotated[list[str], FixedSize(2)], arg11: Annotated[list[str], FixedSize(5)], arg12: Annotated[list[str], FixedSize(2)], arg13: bool, arg14: int) -> bool
property alterloc_str
property beta
property chain
property chain_str
clear(self: pdb_cpp.core.Model) -> None
property elem_str
get_alterloc(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(2)]]
get_beta(self: pdb_cpp.core.Model) -> list[float]
get_centroid(*args, **kwargs)

Overloaded function.

  1. get_centroid(self: pdb_cpp.core.Model) -> Annotated[list[float], FixedSize(3)]

Calculate centroid of all atoms in the model

  2. get_centroid(self: pdb_cpp.core.Model, indices: list[int]) -> Annotated[list[float], FixedSize(3)]

Calculate centroid of atoms at specified indices
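
To make the two overloads concrete, here is a pure-Python reference sketch of the same calculation on coordinate lists such as those returned by `get_x`, `get_y`, and `get_z`. It assumes the centroid is the unweighted arithmetic mean of the coordinates (no mass weighting), which this reference does not state. `centroid` is a hypothetical name, not the library's implementation.

```python
def centroid(xs, ys, zs, indices=None):
    """Unweighted centroid of atoms, optionally restricted to `indices`.

    Illustrative only; assumes a plain arithmetic mean of coordinates.
    """
    if indices is None:
        # Mirrors the no-argument overload: use every atom.
        indices = range(len(xs))
    indices = list(indices)
    if not indices:
        raise ValueError("centroid of an empty selection is undefined")
    n = len(indices)
    return [
        sum(xs[i] for i in indices) / n,
        sum(ys[i] for i in indices) / n,
        sum(zs[i] for i in indices) / n,
    ]
```

The `indices` argument lines up with the `list[int]` returned by `Coor.get_index_select`, so a selection string can be turned into indices and then into a centroid.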

get_chain(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(2)]]
get_elem(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(5)]]
get_field(self: pdb_cpp.core.Model) -> list[bool]
get_insertres(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(2)]]
get_name(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(5)]]
get_num(self: pdb_cpp.core.Model) -> list[int]
get_occ(self: pdb_cpp.core.Model) -> list[float]
get_resid(self: pdb_cpp.core.Model) -> list[int]
get_resname(self: pdb_cpp.core.Model) -> list[Annotated[list[str], FixedSize(5)]]
get_uniqresid(self: pdb_cpp.core.Model) -> list[int]
get_x(self: pdb_cpp.core.Model) -> list[float]
get_y(self: pdb_cpp.core.Model) -> list[float]
get_z(self: pdb_cpp.core.Model) -> list[float]
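
Several getters above return fixed-size character arrays (`Annotated[list[str], FixedSize(N)]`) rather than plain strings. The sketch below shows one way such an array might be turned into a Python string. It assumes unused slots hold NUL characters, which is typical of C-style character buffers but is not confirmed by this reference. `chars_to_str` is a hypothetical helper; the `*_str` properties (for example `name_str`) likely serve this purpose already.

```python
def chars_to_str(chars):
    """Join a fixed-size list of single characters into a trimmed string.

    Assumption: unused trailing slots contain NUL ('\\x00') characters.
    """
    text = "".join(chars)
    # Keep everything before the first NUL, then drop space padding.
    return text.split("\x00", 1)[0].strip()
```

Applied to a whole column, this could look like `[chars_to_str(c) for c in model.get_name()]`, where `model` is any `Model` instance.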
property insertres_str
property len

Return the number of atoms in the selection.

Returns:
int

Number of atoms.

property name
property name_str
property num
property occ
property resid
property resname
property resname_str
select_atoms(self: pdb_cpp.core.Model, arg0: str) -> list[bool]
set_alterloc(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_beta(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
set_chain(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_elem(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_insertres(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_name(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_num(self: pdb_cpp.core.Model, arg0: int, arg1: int) -> None
set_occ(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
set_resid(self: pdb_cpp.core.Model, arg0: int, arg1: int) -> None
set_resname(self: pdb_cpp.core.Model, arg0: int, arg1: str) -> None
set_uniqresid(self: pdb_cpp.core.Model, arg0: int, arg1: int) -> None
set_x(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
set_y(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
set_z(self: pdb_cpp.core.Model, arg0: int, arg1: float) -> None
size(self: pdb_cpp.core.Model) -> int
property uniq_resid
property x
property xyz
property y
property z