Abstract class for unified workflows #466

Open
YuanbinLiu wants to merge 2 commits into autoatml:main from YuanbinLiu:abstract

YuanbinLiu commented Jan 5, 2026

This PR is a first step toward developing a UnifiedMaker, which aims to make training data generation more flexible and composable.

The main idea is to support combined data generation workflows, such as chaining RSS and MD into a single process, instead of treating each flow independently.

This implementation is based on the DataGenerationMaker framework (see: #451) and extends it into a more general abstraction.

Key features

  • A unified workflow for data generation and MLIP fitting with optional multi-stage iteration
  • Support for arbitrary sequences of RSS, MD, and rattling
  • Flexible ordering and repetition of stages
    • RSS → MD → RSS
    • MD → RSS → rattling

For now, this class is intentionally designed as an abstract-style template and serves as a foundation for future concrete workflow implementations.
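
To make the stage-sequence idea concrete, here is a minimal sketch of how an ordered list of stage names could be dispatched, with an optional fit step after selected stages. It is not the code in this PR: `StageSequenceSketch`, `STAGE_HANDLERS`, and the `run_*` functions are hypothetical placeholders, and the real RSS, MD, rattling, and fitting jobs are replaced by functions that only return a label.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


# Hypothetical placeholders: each one stands in for a real data generation flow.
def run_rss() -> str:
    return "rss"


def run_md() -> str:
    return "md"


def run_rattling() -> str:
    return "rattling"


# Maps a stage name to the callable that would build that stage.
STAGE_HANDLERS: Dict[str, Callable[[], str]] = {
    "rss": run_rss,
    "md": run_md,
    "rattling": run_rattling,
}


@dataclass
class StageSequenceSketch:
    """Illustrative only: runs stages in the given order, repeats allowed."""

    multi_stages: Sequence[str] = field(
        default_factory=lambda: ["rss", "md", "rattling"]
    )
    # Stages after which a (placeholder) fit step is appended.
    stages_requiring_fit: Sequence[str] = field(default_factory=list)

    def run(self) -> List[str]:
        executed: List[str] = []
        for stage in self.multi_stages:
            if stage not in STAGE_HANDLERS:
                raise ValueError(f"Unknown stage: {stage!r}")
            executed.append(STAGE_HANDLERS[stage]())
            if stage in self.stages_requiring_fit:
                # Placeholder for an MLIP fitting job.
                executed.append("fit")
        return executed
```

With this layout, `StageSequenceSketch(multi_stages=["rss", "md", "rss"])` would cover the RSS → MD → RSS case from the list above, because the order and repetition of stages come entirely from the sequence.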

# Example: ["rss", "md", "rattling"] means the workflow runs RSS first,
# then MD, and finally rattling.
multi_stages: Sequence[str] = field(
    default_factory=lambda: ["rss", "md", "rattling"]

I think "RSS", "MD" and "RATTLING" could all be part of one Literal definition. In this way, typos can be avoided at run time.
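
A possible shape for this suggestion, as a sketch rather than the final API: the names `Stage`, `VALID_STAGES`, and `StageConfigSketch` are assumptions made for illustration. Note that a `Literal` annotation on its own is only enforced by static type checkers, so the sketch adds an explicit check in `__post_init__` to reject typos when the object is created.

```python
from dataclasses import dataclass, field
from typing import Literal, Sequence, get_args

# Hypothetical alias: the single Literal definition for all stage names.
Stage = Literal["rss", "md", "rattling"]

# Tuple of the allowed names, derived from the Literal so the two cannot drift apart.
VALID_STAGES = get_args(Stage)


@dataclass
class StageConfigSketch:
    """Illustrative only: validates stage names when the object is created."""

    multi_stages: Sequence[Stage] = field(
        default_factory=lambda: ["rss", "md", "rattling"]
    )

    def __post_init__(self) -> None:
        # Static checkers flag bad literals; this catches them at construction time too.
        unknown = [s for s in self.multi_stages if s not in VALID_STAGES]
        if unknown:
            raise ValueError(
                f"Unknown stage(s) {unknown}; expected one of {VALID_STAGES}"
            )
```

A misspelled entry such as `StageConfigSketch(multi_stages=["rss", "mdd"])` would then raise a `ValueError` immediately instead of failing somewhere inside the workflow.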

Comment on lines +154 to +160
if stage in self.stages_requiring_fit:
    do_mlip_fit = MLIPFitMaker(...).make(
        database_dir=do_data_gen.output["pre_database_dir"],
        isolated_atom_energies=do_data_gen.output["isolated_atom_energies"],
        **fit_kwargs,
    )

I assume this code does not work at the moment, right?

@naik-aakash

Hi @YuanbinLiu, it would be nice to have #451 merged first if this is supposed to be based on the DataGenerationMaker framework.
