After the recent work in #49, where I tried to chain together all the individual functions from poseinterface.io, I was inspired to come up with a higher-level API for the project-conversion workflow, i.e. Step 1 of the below schematic.
The workflow currently involves running four per-asset functions (the primitives):
video_to_poseinterface
annotations_to_poseinterface
frames_to_poseinterface
predictions_to_poseinterface
It's on the caller to glue them together: discover the source paths, build the per-session output directories, and call the four primitives in the right order with matching arguments.
The gallery example in #49 makes this concrete: only about a third of its per-session loop body is conversion calls; the rest is path wrangling and output-layout construction.
The same pattern would apply to anyone writing this kind of pipeline against their own data.
The proposal is to introduce a generic Session dataclass (or maybe attrs class for easier built-in validation?) and two orchestration functions in a new poseinterface/core.py. The existing primitives in poseinterface/io.py can stay mostly unchanged; the new functions would just compose them and write to the spec layout.
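On the dataclass-versus-attrs question, one stdlib-only option is to validate in `__post_init__`. Below is a minimal sketch of that idea, not a settled design. The class name `ValidatedSession`, the cut-down field set, and the rules themselves (non-empty identifiers, `split` restricted to "Train"/"Test") are assumptions for illustration. Note that a `Literal` annotation alone is not enforced at runtime, so the check has to be explicit either way.

```python
from dataclasses import dataclass
from pathlib import Path

# Allowed split names, taken from the proposal's Literal["Train", "Test"].
VALID_SPLITS = ("Train", "Test")


@dataclass
class ValidatedSession:
    """Hypothetical cut-down Session showing stdlib-only validation."""

    sub_id: str
    ses_id: str
    cam_id: str
    video_path: Path
    split: str = "Train"

    def __post_init__(self) -> None:
        # Assumed rule: identifiers must be non-empty strings.
        for name in ("sub_id", "ses_id", "cam_id"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value:
                raise ValueError(f"{name} must be a non-empty string, got {value!r}")
        # Type hints are not checked at runtime, so validate the split here.
        if self.split not in VALID_SPLITS:
            raise ValueError(f"split must be one of {VALID_SPLITS}, got {self.split!r}")
        # Accept str paths too, normalising them to Path.
        self.video_path = Path(self.video_path)
```

With attrs the same checks would move into per-field validators, which is tidier if the number of rules grows; the trade-off is an extra dependency.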
Proposed public API
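from collections.abc import Iterable
from dataclasses import dataclass
from pathlib import Path
from typing import Literal


@dataclass
class Session:
    # Identifiers for one session
    sub_id: str
    ses_id: str
    cam_id: str
    # The video is the only required source asset
    video_path: Path
    # Which split the session is routed to
    split: Literal["Train", "Test"] = "Train"
    # Optional source assets
    annotations_path: Path | None = None
    frames_source_dir: Path | None = None
    predictions_path: Path | None = None


def convert_session(
    session: Session,
    benchmark_root: Path,
    *,
    project: str,
) -> Path: ...


def convert_project(
    sessions: Iterable[Session],
    benchmark_root: Path,
    *,
    project: str,
) -> None: ...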
Session carries everything we need to know about one session: its identifiers, which split it should go in, and the source paths. The above Session signature is mostly informed by our recent work on DLC projects; we may need to tweak it to generalise well across pose estimation libraries.
convert_session runs the conversion primitives and writes to <benchmark_root>/<split>/<project>/sub-X_ses-Y/, following the benchmark dataset spec.
convert_project is a thin loop for converting multiple sessions belonging to the same project (sessions self-route to Train/Test via their split).
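To make the layout and the "thin loop" concrete, here is a rough sketch under stated assumptions. `session_output_dir` is a hypothetical helper, the `Session` below is cut down to the fields the layout needs, and the identifiers are assumed to be bare labels without the `sub-`/`ses-` prefix. The calls to the four `poseinterface.io` primitives are deliberately left as a comment, since I don't want to guess their signatures here.

```python
from collections.abc import Iterable
from dataclasses import dataclass
from pathlib import Path
from typing import Literal


@dataclass
class Session:
    """Cut-down stand-in for the proposed Session (source paths omitted)."""

    sub_id: str
    ses_id: str
    cam_id: str
    video_path: Path
    split: Literal["Train", "Test"] = "Train"


def session_output_dir(session: Session, benchmark_root: Path, project: str) -> Path:
    """Build <benchmark_root>/<split>/<project>/sub-X_ses-Y (hypothetical helper)."""
    # Assumption: sub_id and ses_id are bare labels, so the prefixes are added here.
    name = f"sub-{session.sub_id}_ses-{session.ses_id}"
    return Path(benchmark_root) / session.split / project / name


def convert_session(session: Session, benchmark_root: Path, *, project: str) -> Path:
    """Create the session's output directory and return it."""
    out_dir = session_output_dir(session, benchmark_root, project)
    out_dir.mkdir(parents=True, exist_ok=True)
    # The four io primitives would be called here, writing into out_dir.
    # They are omitted because their signatures are not assumed in this sketch.
    return out_dir


def convert_project(
    sessions: Iterable[Session], benchmark_root: Path, *, project: str
) -> None:
    """Thin loop: each session routes itself to Train/Test via its split."""
    for session in sessions:
        convert_session(session, benchmark_root, project=project)
```

Keeping the path construction in one small function means the spec layout lives in a single place, so a change to the spec would not touch the primitives or the loop.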
These 3 elements could be exposed at the top API level: from poseinterface import Session, convert_session, convert_project.
Concretely
The user constructs Session objects directly with explicit source paths and calls convert_project:
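from pathlib import Path

from poseinterface import Session, convert_project  # proposed top-level imports

sessions = [
    Session(
        sub_id="...", ses_id="...", cam_id="...",
        video_path=Path(".../video.mp4"),
        annotations_path=Path(".../annotations.csv"),
        frames_source_dir=Path(".../labeled-data/<stem>"),
        predictions_path=Path(".../predictions.csv"),
        split="Train",
    ),
    # ... one per session
]
# benchmark_root is the Path to the benchmark dataset root, defined by the caller
convert_project(sessions, benchmark_root, project="MyProject")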
Once this API is in place, we can layer source-specific adapters/constructors on top — e.g. a helper that walks a DLC (or SLEAP, etc.) project and yields a list of Session objects in one go, so the user doesn't have to enumerate sessions by hand. That's a separate proposal once we've lived with this one for a bit; easier to design the batch conversion layer once we've settled on the per-session API layer.
Clip extraction (Step 2 of the workflow) stays separate, because clip params (duration, start_frame) are typically iterated independently of the one-shot conversion.
Happy to draft a PR after other PRs #45 and #49 are merged, assuming a consensus emerges around this proposal. Let me know your thoughts!