GPU-accelerated molecular contact analysis. Computes hydrogen bonds, salt bridges, pi-stacking, T-stacking, pi-cation, van der Waals, and hydrophobic contacts across MD trajectories, outputting compressed Parquet files for downstream analysis.
~150 frames/second on a modern GPU (tested on a 6354-frame, ~20,000-atom protein trajectory completing in ~42 seconds).
- NVIDIA GPU with CUDA support
cupymatching your CUDA version:pip install cupy-cuda12x # CUDA 12 pip install cupy-cuda13x # CUDA 13
MDAnalysis,pyarrow,polars,tqdm,numpy
pip install -e .ultracontacts contacts \
--topology protein.pdb \
--trajectory sim.dcd \
--output contacts.parquetBy default this also writes contacts_frequencies.parquet (per-residue-pair contact frequencies). Use --no-frequencies to skip this.
Strongly recommended for H-bond detection — PDB CONECT records are often missing:
ultracontacts contacts \
--topology system.pdb \
--trajectory sim.dcd \
--openmm-system system.xml \
--output contacts.parquetultracontacts contacts \
--topology system.psf \
--trajectory production.dcd \
--itypes hb sb ps ts pc \
--beg 100 --end 4999 --stride 2 \
--sele "protein" \
--sele2 "resname LIG" \
--output prot_lig_contacts.parquet# Frequencies are written automatically — skip with:
ultracontacts contacts ... --no-frequencies
# Custom frequency file path (.tsv extension → TSV format):
ultracontacts contacts ... --frequencies my_freqs.tsv
# Also write condensed wide-format (one row, columns = "res1-res2" pair names):
ultracontacts contacts ... --condensed# Long format (itype, res1, res2, frequency, count):
ultracontacts frequencies --input contacts.parquet --output freqs.parquet
# Condensed wide-format (one row, residue pair column names, any-itype probability):
ultracontacts frequencies --input contacts.parquet --output freqs_condensed.parquet --condensed
# TSV output:
ultracontacts frequencies --input contacts.parquet --output freqs.tsv| Flag | Name | Default Criteria |
|---|---|---|
hb |
Hydrogen bond | D···A < 3.5 Å, D-H···A > 110° (matches VMD/getcontacts) |
sb |
Salt bridge | Anion–cation distance < 4.0 Å |
ps |
Pi-stacking | Centroid dist < 7.0 Å, plane angle < 30°, psi < 45° |
ts |
T-stacking | Centroid dist < 5.0 Å, plane angle ≈ 90° ± 30°, psi ≈ 90° ± 45° |
pc |
Pi-cation | Centroid–cation dist < 6.0 Å, normal–cation angle < 60° |
vdw |
Van der Waals | Distance < r₁ + r₂ + 0.5 Å |
hp |
Hydrophobic | C/S atoms < 4.0 Å (ALA, PHE, GLY, ILE, LEU, PRO, VAL, TRP) |
Use --itypes all to compute all types. Default: hb sb ps ts pc.
All geometric criteria can be overridden:
ultracontacts contacts ... \
--hbond-cutoff-dist 3.2 \
--hbond-cutoff-ang 120 \
--vdw-epsilon 0.3| Column | Type | Description |
|---|---|---|
frame |
int32 | Trajectory frame index |
itype |
string | Interaction type (hb, sb, ps, ts, pc, vdw, hp) |
atom1 |
string | chain:resname:resid:name |
atom2 |
string | chain:resname:resid:name |
| Column | Type | Description |
|---|---|---|
itype |
string | Interaction type |
res1 |
string | chain:resname:resid (lexicographically ≤ res2) |
res2 |
string | chain:resname:resid |
frequency |
float64 | Fraction of frames where contact occurs |
count |
int64 | Number of frames |
Canonical residue ordering matches getcontacts: res1 ≤ res2 lexicographically.
One row. Columns are residue pair names "res1-res2". Values are the fraction of frames where any contact of any type exists between that pair. Suitable for direct use in ML/statistics pipelines:
import polars as pl
df = pl.read_parquet("contacts_condensed.parquet")
# shape: (1, N_pairs)
# columns: ["A:ALA:1-A:ARG:5", "A:ALA:1-A:GLY:8", ...]ultracontacts uses the same geometric criteria and residue-pair ordering as getcontacts and produces compatible output, with these differences:
- VdW/HP sidechain contacts between adjacent residues: ultracontacts reports these; getcontacts blanket-excludes all adjacent-residue VdW/HP pairs
- Speed: ~150 fps on GPU vs ~1–5 fps for getcontacts on CPU
- No VMD dependency: uses MDAnalysis + CuPy CUDA kernels
Static topology data (atom indices, masks, radii) is uploaded to GPU once at startup. Per-frame, only coordinate data (~70 KB) is transferred. Five fused CUDA raw kernels evaluate candidate pairs via atomicAdd without materialising the full N² distance matrix:
dist_contacts— distance threshold (sb, hp)vdw_contacts— per-pair VdW cutoff from atomic radii (vdw)hbond_contacts— D···A distance + D-H···A angle at H vertex (hb)ring_stacking— centroid distance + plane angle + psi (ps, ts)pi_cation— centroid distance + normal-cation angle (pc)
Frequency calculations use Polars lazy streaming — bounded memory, Rust/SIMD string operations.