hyperpipe_demo_sim

A self-contained toy demo showing the RIFT hyperpipeline and simulation manager working together for adaptive population inference. The problem is intentionally simple so the full loop can be validated without real physics.

What it demonstrates

Hyperpipeline (create_eos_posterior_pipeline): adaptive iterative inference over generic parameters. Each iteration evaluates a batch of candidate points, fits a surrogate posterior, and concentrates the next batch near high-probability regions — no hand-tuned grid required.
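The evaluate, fit, and resample loop described above can be illustrated with a one-dimensional toy. This is only a sketch of the general idea, not RIFT's algorithm: `toy_lnL`, `adaptive_fit`, the Gaussian surrogate, and the factor of 3 on the proposal width are all hypothetical choices made for illustration, and the weights here ignore the proposal-density correction a real sampler would apply.

```python
import math
import random

def toy_lnL(x):
    """Hypothetical 1-D log-likelihood peaked at x = 2 with width 0.5."""
    return -0.5 * ((x - 2.0) / 0.5) ** 2

def adaptive_fit(n_iterations=5, n_per_iteration=200, seed=0):
    rng = random.Random(seed)
    # Iteration 0: a broad random batch, analogous to a random initial grid.
    batch = [rng.uniform(-10.0, 10.0) for _ in range(n_per_iteration)]
    evaluated = []  # every (x, lnL) pair seen so far
    mean, std = 0.0, 10.0
    for _ in range(n_iterations):
        evaluated.extend((x, toy_lnL(x)) for x in batch)
        # Weight all evaluated points by likelihood, shifted by the peak
        # so that exp() never underflows to zero for the best point.
        peak = max(lnl for _, lnl in evaluated)
        weights = [math.exp(lnl - peak) for _, lnl in evaluated]
        total = sum(weights)
        mean = sum(w * x for w, (x, _) in zip(weights, evaluated)) / total
        var = sum(w * (x - mean) ** 2 for w, (x, _) in zip(weights, evaluated)) / total
        # Floor the width so the proposal never collapses to a single point.
        std = max(math.sqrt(var), 1e-3)
        # The next batch concentrates near the current Gaussian surrogate.
        batch = [rng.gauss(mean, 3.0 * std) for _ in range(n_per_iteration)]
    return mean, std
```

Calling `adaptive_fit()` returns the surrogate mean and width after the last iteration; the mean should settle near the likelihood peak without any hand-placed grid.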

Simulation manager (GMMArchive): a persistent, content-addressed cache that sits between the hyperpipe and the likelihood computation. Every worker call checks the cache first; results survive across iterations and across separate runs. In a real application this cache would hold expensive stellar-evolution or radiation-transport outputs; here it holds cheap GMM likelihood evaluations, but the interface is identical.
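A minimal sketch of what a content-addressed cache of this kind might look like. `ToyArchive` is a hypothetical stand-in, not the actual `GMMArchive` class; the SHA-256 hash truncated to 16 hex characters and the `result.json` layout are assumptions chosen to resemble the directory structure mentioned in the Notes.

```python
import hashlib
import json
import os
import tempfile

class ToyArchive:
    """Hypothetical stand-in for a content-addressed lnL cache."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    @staticmethod
    def key(params):
        # Canonical JSON (sorted keys) so equal inputs always hash identically.
        blob = json.dumps(params, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()[:16]

    def get_or_compute(self, params, compute):
        entry = os.path.join(self.root, self.key(params))
        result_path = os.path.join(entry, "result.json")
        if os.path.exists(result_path):  # cache hit: skip the computation
            with open(result_path) as fh:
                return json.load(fh)["lnL"]
        value = compute(params)  # cache miss: do the expensive work
        os.makedirs(entry, exist_ok=True)
        # Write to a temp file, then rename, so readers never see a partial file.
        fd, tmp_path = tempfile.mkstemp(dir=entry, suffix=".tmp")
        with os.fdopen(fd, "w") as fh:
            json.dump({"params": params, "lnL": value}, fh)
        os.replace(tmp_path, result_path)
        return value

calls = []

def expensive(params):
    """Placeholder for a costly likelihood evaluation."""
    calls.append(params)
    return -0.5 * (params["x"] - 1.0) ** 2

with tempfile.TemporaryDirectory() as scratch:
    archive = ToyArchive(os.path.join(scratch, "toy_archive"))
    first = archive.get_or_compute({"x": 3.0}, expensive)
    second = archive.get_or_compute({"x": 3.0}, expensive)  # served from disk
```

The second call finds `result.json` on disk and never invokes `expensive`, which is the property that makes repeated evaluations free across iterations and runs.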

The toy problem

50 two-dimensional "mass pair" observations are drawn from a fixed two-component GMM (seed 42, true params below). The pipeline recovers the seven population parameters of the model, including the event rate.

True model:

| param | value |
|---|---|
| `w1` (weight of component 1) | 0.6 |
| `mu1_x`, `mu1_y` (log₁₀ mean 1) | 1.398, 1.477 (25, 30 M☉) |
| `mu2_x`, `mu2_y` (log₁₀ mean 2) | 1.845, 1.778 (70, 60 M☉) |
| `log_sigma` (log₁₀ σ in dex) | −1.0 (σ = 0.1 dex scatter in log₁₀ mass) |
| `log_R` | depends on VT model (see below) |

The GMM operates in log₁₀-mass space (columns of observations.dat are log₁₀(m/M☉)). This ensures strictly positive masses and gives a lognormal distribution in physical mass space.
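As a rough illustration of sampling in log₁₀-mass space, the sketch below draws 50 points from the true model in the table. It is not `create_data.py`: it assumes an isotropic σ shared by both components, applies no selection bias (the `uniform` case), and uses Python's `random` module, so it will not reproduce the actual `observations.dat` even with seed 42. `TRUE` and `draw_observations` are hypothetical names.

```python
import random

# True parameters copied from the table above (means in log10 solar masses).
TRUE = {"w1": 0.6, "mu1": (1.398, 1.477), "mu2": (1.845, 1.778), "log_sigma": -1.0}

def draw_observations(n=50, seed=42, params=TRUE):
    """Draw n (log10 m1, log10 m2) pairs from the two-component mixture."""
    rng = random.Random(seed)
    sigma = 10.0 ** params["log_sigma"]  # 0.1 dex
    rows = []
    for _ in range(n):
        # Pick a component with probability w1, then scatter around its mean.
        mu = params["mu1"] if rng.random() < params["w1"] else params["mu2"]
        rows.append((rng.gauss(mu[0], sigma), rng.gauss(mu[1], sigma)))
    return rows

obs = draw_observations()
# Exponentiating always yields positive physical masses.
masses = [(10.0 ** x, 10.0 ** y) for x, y in obs]
```

Because the Gaussian lives in log space, no draw can produce a negative mass, however far into the tail it falls.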

Event rate and selection function. The full Poisson likelihood is

lnL = − R·VT_eff + N·ln(R) + Σ_k ln p(m_k | θ)

where VT_eff = ∫ VT(m) p(m|θ) dm is the effective surveyed volume. Two selection models are available (set VT_MODEL in the Makefile):

| `VT_MODEL` | `VT(m1,m2)` | `R_true` | `log10(R_true)` |
|---|---|---|---|
| `uniform` | 1 | N = 50 | ≈ 1.7 |
| `chirp_mass` | Mchirp^(15/6) | N / VT_eff ≈ 10⁻³ | ≈ −3 |

where Mchirp = (m1·m2)^(3/5) / (m1+m2)^(1/5). The chirp-mass scaling arises because GW detector range ∝ Mchirp^(5/6), so surveyed volume ∝ Mchirp^(15/6). create_data.py generates observations drawn from the VT-biased population and prints the true R.
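The Poisson likelihood given above can be sketched directly. The functions below are an illustration under stated assumptions, not the code in `gmm_worker.py`: the GMM is taken to be isotropic with a single σ, `log_R` is taken to be log₁₀ R, and `VT_eff` is passed in as a number rather than integrated. Setting d lnL/dR = 0 gives R = N / VT_eff, which the small grid scan at the end checks for the `uniform` case (VT_eff = 1).

```python
import math

def log_gmm_density(obs, w1, mu1, mu2, log_sigma):
    """ln p(m | theta) for one 2-D point in log10-mass space."""
    sigma2 = (10.0 ** log_sigma) ** 2

    def log_component(weight, mu):
        r2 = (obs[0] - mu[0]) ** 2 + (obs[1] - mu[1]) ** 2
        return math.log(weight) - math.log(2.0 * math.pi * sigma2) - 0.5 * r2 / sigma2

    a = log_component(w1, mu1)
    b = log_component(1.0 - w1, mu2)
    hi = max(a, b)
    # log-sum-exp keeps the sum stable when one component is negligible.
    return hi + math.log(math.exp(a - hi) + math.exp(b - hi))

def poisson_lnL(observations, log_R, vt_eff, **gmm):
    """lnL = -R * VT_eff + N * ln(R) + sum_k ln p(m_k | theta)."""
    rate = 10.0 ** log_R
    n = len(observations)
    shape = sum(log_gmm_density(o, **gmm) for o in observations)
    return -rate * vt_eff + n * math.log(rate) + shape

theta = dict(w1=0.6, mu1=(1.398, 1.477), mu2=(1.845, 1.778), log_sigma=-1.0)
data = [(1.40, 1.48), (1.85, 1.78), (1.38, 1.50)]  # three made-up points

# Scan log_R on a grid; the maximum should sit near log10(N / VT_eff).
grid = [k / 100.0 for k in range(-100, 200)]
best_log_R = max(grid, key=lambda lr: poisson_lnL(data, lr, 1.0, **theta))
```

With three observations and VT_eff = 1, the scan peaks at the grid point closest to log₁₀(3), matching the N / VT_eff rule.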

Under the chirp_mass model, the detected observations are weighted toward the high-mass component, so the pipeline should favour that component.
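To make the selection effect concrete, the sketch below evaluates the two VT models at the component means and estimates what fraction of detected events would come from the high-mass component. The function names are hypothetical, VT is left unnormalised (the demo's actual normalisation is not stated here), and the 0.1 dex scatter is neglected, so treat the result as indicative only.

```python
def chirp_mass(m1, m2):
    """Chirp mass, in the same units as m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

def vt(m1, m2, model="uniform"):
    """Unnormalised selection function VT(m1, m2) for the two models."""
    if model == "uniform":
        return 1.0
    if model == "chirp_mass":
        return chirp_mass(m1, m2) ** (15.0 / 6.0)
    raise ValueError(f"unknown VT model: {model!r}")

w1 = 0.6
low = vt(25.0, 30.0, "chirp_mass")   # evaluated at the component 1 mean
high = vt(70.0, 60.0, "chirp_mass")  # evaluated at the component 2 mean

# Share of detected events from the high-mass component, ignoring scatter.
detected_high_fraction = (1.0 - w1) * high / (w1 * low + (1.0 - w1) * high)
```

Although the high-mass component has an intrinsic weight of only 1 − w1 = 0.4, its much larger surveyed volume means it supplies the majority of detected events in this approximation.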

Known degeneracy: swapping components 1 ↔ 2 (together with w1 → 1 − w1) leaves the likelihood unchanged, so the posterior is bimodal. The pipeline converges to one mode; this is correct behaviour, not a failure.
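The label-swap symmetry is easy to confirm numerically. The sketch below uses a hypothetical `mixture_lnL` helper (isotropic σ = 0.1 dex assumed, rate term omitted because it does not depend on the labels) and three made-up data points.

```python
import math

def mixture_lnL(data, w1, mu1, mu2, sigma=0.1):
    """Sum of ln p(m_k | theta) for a two-component isotropic 2-D GMM."""
    norm = 2.0 * math.pi * sigma ** 2
    total = 0.0
    for x, y in data:
        def component(weight, mu):
            r2 = (x - mu[0]) ** 2 + (y - mu[1]) ** 2
            return weight * math.exp(-0.5 * r2 / sigma ** 2) / norm
        total += math.log(component(w1, mu1) + component(1.0 - w1, mu2))
    return total

data = [(1.40, 1.48), (1.85, 1.78), (1.38, 1.50)]
original = mixture_lnL(data, 0.6, (1.398, 1.477), (1.845, 1.778))
# Swap the two means and replace w1 by 1 - w1.
swapped = mixture_lnL(data, 0.4, (1.845, 1.778), (1.398, 1.477))
```

`original` and `swapped` agree to floating-point precision, so a sampler has no basis for preferring one labelling over the other.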

Files

| File | Purpose |
|---|---|
| `create_data.py` | Generate `observations.dat` and `true_gmm.json` |
| `gmm_archive.py` | `GMMArchive`: persistent lnL cache (simulation manager) |
| `gmm_worker.py` | Hyperpipe-compatible worker executable |
| `Makefile` | Orchestrates all build steps |
| `hyperpipe_conf.yaml` | YAML config for the `util_RIFT_hyperpipe.py` driver |

How to run

Requires a Python environment with RIFT, NumPy, and SciPy (e.g. conda activate my_rift) and a condor submit node. See INSTALL.md for pixi quickstart instructions.

Path A — inline Makefile (default)

All pipeline configuration is written directly by shell commands in the Makefile; the initial grid is a separate Make target.

# 1. Generate synthetic observations (deterministic, seed=42)
make observations.dat

# 2. Generate initial random parameter grid
make initial_grid.dat

# 3. Smoke-test the worker locally — no condor needed
make test_worker

# 4. Build the condor DAG
make rundir

# 5. Submit
make submit

Path B — util_RIFT_hyperpipe.py + hyperpipe_conf.yaml

Set USE_HYPERPIPE=1 to delegate pipeline construction to the Hydra-based driver. It reads hyperpipe_conf.yaml, generates the initial grid internally, and assembles all args files. Dynamic paths (observations file, archive directory, VT model) are injected via environment variables resolved at runtime by OmegaConf.

USE_HYPERPIPE=1 make observations.dat test_worker   # smoke-test
USE_HYPERPIPE=1 make observations.dat rundir        # build DAG
make submit                                          # submit (same as Path A)

Or using pixi tasks:

pixi run smoke-test-hyperpipe   # smoke-test
pixi run build-dag              # observations + rundir
make submit

Shared knobs

make clean removes all generated files and directories (both paths). Between cleans, the gmm_archive/ cache persists across runs, so repeated evaluations of the same points are free.

Key Makefile variables (top of file):

| variable | default | meaning |
|---|---|---|
| `N_SAMPLES_PER_JOB` | 1000 | new grid points evaluated per iteration |
| `N_ITERATIONS` | 5 | number of adaptive iterations |
| `NCHUNK` | 50 | points per condor MARG job (Path A only) |
| `EXPLODE_JOBS` | 3 | parallel POST jobs per iteration |
| `VT_MODEL` | `uniform` | selection function (`uniform` or `chirp_mass`) |
| `USE_HYPERPIPE` | `0` | set to `1` to use Path B |
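For a sense of scale, the defaults above imply roughly the following job counts on Path A. This is back-of-the-envelope arithmetic that assumes each iteration's new points are split evenly into chunks of `NCHUNK`; the DAG that the Makefile actually builds may differ.

```python
import math

N_SAMPLES_PER_JOB = 1000  # new grid points per iteration (default)
N_ITERATIONS = 5          # adaptive iterations (default)
NCHUNK = 50               # points per condor MARG job (default, Path A)

# Assumed: one MARG job per chunk of NCHUNK points.
marg_jobs_per_iteration = math.ceil(N_SAMPLES_PER_JOB / NCHUNK)
total_marg_jobs = marg_jobs_per_iteration * N_ITERATIONS
total_evaluations = N_SAMPLES_PER_JOB * N_ITERATIONS
```

Raising `NCHUNK` reduces the number of condor jobs at the cost of longer individual jobs.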

Switching VT models: run make clean before changing VT_MODEL so that observations.dat and the gmm_archive/ cache are regenerated consistently with the new selection function.

Visualizing results

From rundir/ after the DAG completes:

cd rundir
plot_posterior_corner.py \
    --posterior-file grid-5.dat \
    --parameter w1 --parameter mu1_x --parameter mu1_y --parameter log_R \
    --composite-file all.marg_net \
    --composite-file-has-labels \
    --lnL-cut 15 \
    --use-all-composite-but-grayscale

grid-5.dat is the final posterior sample set (replace 5 with the actual last iteration). all.marg_net accumulates all likelihood evaluations. --lnL-cut 15 discards points more than 15 log-units below the peak; raise it if the posterior looks clipped.
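The effect of the cut can be pictured with a few lines of Python. This is only a sketch of the filtering rule described above, not the code inside `plot_posterior_corner.py`; `apply_lnL_cut` and the sample values are hypothetical.

```python
def apply_lnL_cut(points, lnL_cut=15.0):
    """Keep (label, lnL) pairs within lnL_cut of the best lnL."""
    peak = max(lnL for _, lnL in points)
    return [(label, lnL) for label, lnL in points if lnL >= peak - lnL_cut]

points = [("a", -380.0), ("b", -390.0), ("c", -400.0)]
kept_default = apply_lnL_cut(points, 15.0)  # drops "c", 20 units below the peak
kept_wider = apply_lnL_cut(points, 25.0)    # a larger cut keeps all three
```

A larger cut retains more of the low-likelihood tail, which is why raising it helps when the plotted posterior appears truncated.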

Notes

Archive concurrency. Each condor job writes to its own content-addressed subdirectory (gmm_archive/<hash16>/), so concurrent writes never collide. index.jsonl may gain duplicate entries when jobs race, but the per-hash result.json is authoritative.
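If duplicate index entries ever become a nuisance, they can be collapsed when reading the index. The sketch below assumes each line of `index.jsonl` is a JSON object carrying a `"hash"` field; that field name and the `unique_records` helper are hypothetical, since the actual index schema is not described here.

```python
import json

def unique_records(index_lines):
    """Collapse duplicate index entries, keeping the first record per hash."""
    seen = {}
    for line in index_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        seen.setdefault(record["hash"], record)  # later duplicates are ignored
    return list(seen.values())

lines = [
    '{"hash": "aaaa", "lnL": -380.0}',
    '{"hash": "bbbb", "lnL": -391.5}',
    '{"hash": "aaaa", "lnL": -380.0}',  # duplicate appended by a racing job
]
records = unique_records(lines)
```

Since the per-hash `result.json` is the source of truth, the index only needs to identify which hashes exist, and dropping repeats loses nothing.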

OSG / shared filesystem. The current setup assumes all condor jobs share the submit-node filesystem. For OSG (no shared filesystem), the archive would need to be transferred per-job or backed by OSDF.

lnL scale. With 50 observations the optimum lnL is O(−380). The mcsamplerAdaptiveVolume sampler (--sampler-method AV) handles this scale correctly; the default mcsampler underflows.
