Skip to content

adding dtype='pdb' to to_file #369

@shehan807

Description

@shehan807

Describe the solution you'd like
The to_file method currently does not support writing to PDB files. For various molecular mechanics software that interface with QC packages, this could be useful. Ideally, this can be done by implementing a PDB writer to allow dtype='pdb'.

Describe alternatives you've considered
I've written a preliminary helper function to get this started and am happy to make a PR with more general PDB file handling if this feature is of serious interest:

def molecule_to_pdb_file(molecule, filename: str, res_name: str) -> None:
    """Writes a QCElemental Molecule to a PDB file, handling multiple fragments and adding CONECT records if connectivity information is provided."""
    coords = molecule.geometry * constants.bohr2angstroms
    symbols = molecule.symbols
    fragments = molecule.fragments

    pdb_lines = []
    atom_idx = 1

    for res_seq, fragment in enumerate(fragments, start=1):
        residue_name = res_name
        chain_id = ' '

        for i, atom in enumerate(fragment):
            symbol = symbols[atom]
            xyz = coords[atom]
            atom_name = f"{symbol.upper()}{i}"

            pdb_lines.append(
                f"ATOM  {atom_idx:5d} {atom_name:<4} {residue_name:<3} {chain_id}{res_seq:4d}    "
                f"{xyz[0]:8.3f}{xyz[1]:8.3f}{xyz[2]:8.3f}"
            )
            atom_idx += 1

    # pdb_lines.append("END")

    if molecule.connectivity:
        unique_bonds = set()
        for bond in molecule.connectivity:
            idx1, idx2 = sorted(bond[:2])  
            unique_bonds.add((idx1 + 1, idx2 + 1))

        for idx1, idx2 in unique_bonds:
            pdb_lines.append(f"CONECT{idx1:5d}{idx2:5d}")

    with open(filename, "w") as pdb:
        pdb.write('\n'.join(pdb_lines))

Presumably, the molecule.connectivity information is missing if it isn't required. In lieu of adding additional dependencies, one can handle CONECT records outside of QCElemental, e.g., I'm using MDAnalysis below:

from MDAnalysis import Universe 
def _add_CONECT(pdb_filename: str) -> None:
    """Adds CONECT records to an existing PDB file using MDAnalysis's default bond guesser."""
    from MDAnalysis import Universe
    u = Universe(pdb_filename, format='PDB', guess_bonds=True)
    conect_lines = set()

    for bond in u.bonds:
        idx1, idx2 = sorted(bond.indices + 1)  # PDB atom numbering starts at 1
        conect_lines.add(f"CONECT{idx1:5d}{idx2:5d}")

    with open(pdb_filename, "a") as pdb:
        pdb.write('\n' + '\n'.join(sorted(conect_lines)) + '\n')

So even a PDB file that's functionally equivalent to the current xyz file outputs would be useful.

Additional context
One concrete example of PDB file usage would be wherever one would like to run a molecular mechanics calculation on a given molecule, e.g., see CrystaLatte PR#14

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions