Qcarchive update #187

Merged
mikemhenry merged 32 commits into choderalab:main from chrisiacovella:qcarchive_update on Jun 5, 2025

Conversation

@chrisiacovella chrisiacovella commented Sep 21, 2023

Member

This updates qcarchive_utils.py to be compatible with v0.5 of qcportal. Relates to issue #185

This code reproduces the same behavior as the prior implementation.

@mikemhenry

Contributor

Awesome! This is good timing with #186

Once we get both in, we should cut a new release.

@chrisiacovella

Member Author

This PR implements the logic in effectively the same way as the old code, which is on a per-record basis (i.e., a function operates on a single record name). The new version of qcportal has iterators on records, which are substantially faster (like orders of magnitude, due to prefetching and caching). The next commit will include functions that operate on the entire record sets to avoid slow performance.
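
To illustrate why iterating over a whole record set can beat per-record lookups, here is a minimal sketch. It does not use qcportal itself: `FakeDataset`, `fetch_per_record`, and `fetch_all` are hypothetical stand-ins, and the batch size of 100 is an assumption chosen only for illustration. The point is the access pattern, where one request serves many records instead of one.

```python
from typing import Dict, Iterator, List, Tuple


class FakeDataset:
    """Hypothetical stand-in for a remote dataset; counts simulated round trips."""

    def __init__(self, records: Dict[str, float], batch_size: int = 100):
        self._records = records
        self._batch_size = batch_size  # assumed batch size, illustration only
        self.round_trips = 0

    def get_record(self, entry_name: str) -> float:
        # Per-record access: every lookup costs one simulated request.
        self.round_trips += 1
        return self._records[entry_name]

    def iterate_records(self) -> Iterator[Tuple[str, float]]:
        # Batched access: one simulated request prefetches a whole batch.
        names = list(self._records)
        for start in range(0, len(names), self._batch_size):
            self.round_trips += 1
            for name in names[start:start + self._batch_size]:
                yield name, self._records[name]


def fetch_per_record(dataset: FakeDataset, names: List[str]) -> Dict[str, float]:
    """Fetch records one name at a time (the per-record pattern)."""
    return {name: dataset.get_record(name) for name in names}


def fetch_all(dataset: FakeDataset) -> Dict[str, float]:
    """Fetch every record through the batched iterator."""
    return dict(dataset.iterate_records())
```

With 250 records and the assumed batch size, the per-record path makes 250 simulated requests while the iterator path makes 3, and both return the same data.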

@codecov-commenter

codecov-commenter commented Sep 21, 2023

Codecov Report

❗ No coverage uploaded for pull request base (main@2e61215).
The diff coverage is n/a.

@mikemhenry

Contributor

We can probably remove that line since https://github.com/choderalab/espaloma/pull/187/files#diff-ba5d22563299549a389183418fe5786b83275382be592bf1ed06fae673b7d086R33 will pull in what we need (I think, I am not sure what the "main" qcarchive package is)

@mikemhenry mikemhenry left a comment

Contributor

LGTM, had two non-blocking notes

Comment thread espaloma/data/qcarchive_utils.py Outdated
Comment thread espaloma/data/qcarchive_utils.py Outdated

@ijpulidos ijpulidos left a comment

Contributor

This is great. I'm glad that we are now testing the behavior and have some documentation for these utils. I agree with the comments that have been made. Looks good to be merged, just a single non-blocking comment.

Comment thread espaloma/data/tests/test_qcarchive.py
Comment thread espaloma/data/qcarchive_utils.py
@mikemhenry mikemhenry self-requested a review September 22, 2023 16:31

@mikemhenry mikemhenry left a comment

Contributor

From @jchodera

There are apparently some additional issues with the object model such that datasets beyond OptimizationDataset are not supported

@chrisiacovella

Member Author

From @jchodera

There are apparently some additional issues with the object model such that datasets beyond OptimizationDataset are not supported

Yes, the get_graph function in the initial code was only set up to work with the OptimizationDataset. I think it would be straightforward to support SinglepointDataset objects and to add checks in get_graph and get_graphs that give a descriptive failure message if a different dataset type is tried.
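
As a rough sketch of the kind of check described here, assuming the dataset type is available as a string: the names `SUPPORTED_DATASET_TYPES`, `UnsupportedDatasetError`, and `check_dataset_type` are hypothetical and are not taken from qcarchive_utils.py.

```python
from typing import Tuple

# Hypothetical list of supported types; only optimization datasets are
# described as working in this thread.
SUPPORTED_DATASET_TYPES: Tuple[str, ...] = ("optimization",)


class UnsupportedDatasetError(ValueError):
    """Raised when a dataset type is not handled by the graph helpers."""


def check_dataset_type(dataset_type: str) -> str:
    """Return the normalized dataset type, or raise a descriptive error."""
    normalized = dataset_type.strip().lower()
    if normalized not in SUPPORTED_DATASET_TYPES:
        supported = ", ".join(SUPPORTED_DATASET_TYPES)
        raise UnsupportedDatasetError(
            f"Dataset type '{dataset_type}' is not supported; "
            f"supported types: {supported}."
        )
    return normalized
```

Calling such a helper at the top of get_graph and get_graphs would let an unsupported dataset fail early with a clear message instead of an unrelated attribute error further down.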

@kntkb

kntkb commented Sep 22, 2023

Contributor

@chrisiacovella I remember that when fetching results from a SinglepointDataset that uses b3lyp-d3bj (the openff default level of theory), you needed to combine the results from the DFT and the dispersion correction terms. This is not the case for OptimizationDataset and TorsionDriveDataset. I wonder if this behavior is the same for the latest QCArchive server and qcportal.
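
For reference, the combination described above amounts to a per-entry sum. The sketch below assumes the DFT and dispersion energies have already been fetched into two plain dictionaries keyed by entry name; `combine_energies` and the dictionary layout are hypothetical, and whether this step is still needed with the new qcportal is exactly the open question here.

```python
from typing import Dict


def combine_energies(
    dft: Dict[str, float], dispersion: Dict[str, float]
) -> Dict[str, float]:
    """Sum DFT and dispersion-correction energies entry by entry.

    Both inputs map a (hypothetical) entry name to an energy in the same units.
    """
    # Refuse to silently drop entries that appear in only one of the inputs.
    mismatched = sorted(set(dft) ^ set(dispersion))
    if mismatched:
        raise KeyError(f"Entries missing from one of the inputs: {mismatched}")
    return {name: dft[name] + dispersion[name] for name in dft}
```

For example, a DFT energy of -10.0 and a dispersion correction of -0.25 for the same entry combine to -10.25.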

@chrisiacovella

chrisiacovella commented Sep 22, 2023

Member Author

@chrisiacovella I remember that when fetching results from a SinglepointDataset that uses b3lyp-d3bj (the openff default level of theory), you needed to combine the results from the DFT and the dispersion correction terms. This is not the case for OptimizationDataset and TorsionDriveDataset. I wonder if this behavior is the same for the latest QCArchive server and qcportal.

@kntkb This is something I started looking at when switching from the old to the new version, but I can't seem to find my notes; for some reason I think one of the specifications does include the sum, but don't quote me on that. I'm currently trying to figure that out right now actually.

chrisiacovella and others added 6 commits September 22, 2023 14:10
… dataset has the smiles encoded for converting to openff.molecule
… dataset has the smiles encoded for converting to openff.molecule
…d so that it will raise the desired exception rather than failing.
…rse the singlepoint records properly at this point. Other issues need to be resolved with singlepoint energy beyond this (i.e., summation of dispersion corrections).
…rse the singlepoint records properly at this point. Other issues need to be resolved with singlepoint energy beyond this (i.e., summation of dispersion corrections). This PR should sufficiently reproduce the prior behavior, but with new qcportal.
@mikemhenry

Contributor

@chrisiacovella Is this PR good to go? I know it's a year old now, but is it good to go?

@chrisiacovella chrisiacovella requested a review from mikemhenry June 4, 2025 21:48
@mikemhenry

Contributor

Thanks! Will make a new release of espaloma next!

@mikemhenry mikemhenry merged commit 14f14bf into choderalab:main Jun 5, 2025
7 checks passed
6 participants