Add learn-weights CLI step with PU learning and Optuna tuning by iankchristie · Pull Request #55 · NatLabRockies/reVeal

iankchristie · 2026-05-20T19:49:14Z

Introduces a new reVeal learn-weights pipeline step that trains a PUExtraTrees model on a normalized grid using point labels as positive samples. Derives feature importance weights for use with score-weighted.
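As a rough illustration of the last sentence, the sketch below turns raw feature importances into weights that sum to 1. The function name `importances_to_weights`, the feature names, and the choice of sum-to-one scaling are all assumptions made for the example; the PR text does not say how reVeal/learn_weights.py scales its weights.

```python
def importances_to_weights(importances: dict[str, float]) -> dict[str, float]:
    """Scale non-negative feature importances so the weights sum to 1.

    Hypothetical helper for illustration only; not taken from the PR.
    """
    if any(value < 0 for value in importances.values()):
        raise ValueError("importances must be non-negative")
    total = sum(importances.values())
    if total == 0:
        raise ValueError("at least one importance must be positive")
    return {name: value / total for name, value in importances.items()}


# Hypothetical feature names and importances.
example_weights = importances_to_weights(
    {"feature_a": 2.0, "feature_b": 1.0, "feature_c": 1.0}
)
```

Scaling to a unit sum keeps the relative ranking of features intact while making weights comparable across runs.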

Add reVeal/pu/ — PUExtraTree/PUExtraTrees adapted from https://github.com/jonathanwilton/PUExtraTrees, with modifications from /projects/largeload/.../models/PUExtraTrees/ (deterministic seeding, joblib parallelism)
Add reVeal/learn_weights.py — data preparation logic adapted from /projects/largeload/.../models/prepare_data.py (DataHandler), training and metrics from /projects/largeload/.../models/models.py (ModelTrainer)
Add Optuna-based hyperparameter tuning for class_prior — adapted from ModelTrainer.tune_hyperparameters() and ModelTrainer.objective() in /projects/largeload/.../models/models.py
Add CLI command (reVeal/cli/learn_weights.py), config (reVeal/config/learn_weights.py), and tests — new code following existing reVeal CLI patterns (score_weighted, normalize)
Add joblib, scipy, optuna to pyproject.toml and environment.yml

Tested by running the new CLI tool like so:

python -m reVeal.cli.cli learn-weights -c /projects/largeload/geospatial/runs/test_scenarios/learned_weights_2026-05-19_agg64/config_learn_weights.json

You can use the learned_weights_2026-05-19_agg64/ directory to test this code.

Next Steps:
This works well for generating the score weights. In practice, however, many of these features are highly correlated, or we want to exclude some manually for other reasons. We want a few feature engineering tools that let users iterate on the score outputs. These will include:

An exclude list
Visualization features (dendrogram and correlation matrix) to help users engineer features before they go into the score_weighted step.
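One possible shape for these tools, sketched with the standard library only: compute pairwise Pearson correlations, skip anything on an exclude list, and report pairs above a threshold. The function names, the 0.9 threshold, and the feature names are hypothetical; none of this is in the PR.

```python
import math
from itertools import combinations


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length columns (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0
    return cov / (spread_x * spread_y)


def correlated_pairs(features, threshold=0.9, exclude=()):
    """List (name_a, name_b, r) for kept features with |r| >= threshold."""
    kept = {name: col for name, col in features.items() if name not in exclude}
    pairs = []
    for name_a, name_b in combinations(sorted(kept), 2):
        r = pearson(kept[name_a], kept[name_b])
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Hypothetical feature columns.
grid = {
    "slope": [1.0, 2.0, 3.0, 4.0],
    "elevation": [2.0, 4.0, 6.0, 8.0],
    "distance": [4.0, 1.0, 3.0, 2.0],
}
flagged = correlated_pairs(grid)
flagged_after_exclude = correlated_pairs(grid, exclude={"elevation"})
```

A user could review `flagged`, add one feature of each pair to the exclude list, and rerun before passing weights to the score_weighted step.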

Copilot

Pull request overview

Adds a new reVeal learn-weights pipeline step that uses PU (positive–unlabeled) ExtraTrees training to derive feature-importance-based weights and emit a score-weighted-compatible configuration.

Changes:

Introduces a PU ExtraTrees/ExtraTree implementation (reVeal/pu/*) with optional joblib parallelism and deterministic seeding.
Adds core learning + Optuna tuning workflow (reVeal/learn_weights.py) and exposes it via a new CLI command/config (reVeal/cli/learn_weights.py, reVeal/config/learn_weights.py).
Adds tests for data prep, training, tuning, and config generation (tests/test_learn_weights.py) and updates dependencies (pyproject.toml, environment.yml).
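To illustrate why deterministic seeding matters alongside parallelism, here is a small sketch in which per-tree seeds are derived up front from one master seed, so results do not depend on how many workers run. It uses `concurrent.futures.ThreadPoolExecutor` from the standard library as a stand-in for joblib, and a toy "tree" that only draws one random number; the names are hypothetical and this is not the code in reVeal/pu/.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def derive_tree_seeds(master_seed: int, n_trees: int) -> list[int]:
    """Draw one seed per tree from a single master seed, before any parallel work."""
    rng = random.Random(master_seed)
    return [rng.randrange(2**32) for _ in range(n_trees)]


def fit_toy_tree(seed: int) -> float:
    """Toy stand-in for fitting a tree: returns one seeded random cut point."""
    return random.Random(seed).random()


def fit_forest(master_seed: int, n_trees: int, n_workers: int) -> list[float]:
    """Fit all toy trees in parallel; output order follows the seed order."""
    seeds = derive_tree_seeds(master_seed, n_trees)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(fit_toy_tree, seeds))


serial_result = fit_forest(master_seed=42, n_trees=8, n_workers=1)
parallel_result = fit_forest(master_seed=42, n_trees=8, n_workers=4)
```

Because each tree owns its own generator, the serial and parallel runs produce the same list.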

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 20 comments.

Show a summary per file

File	Description
tests/test_learn_weights.py	New test coverage for the learn-weights pipeline helpers and Optuna tuning.
reVeal/pu/trees.py	Adds PUExtraTrees forest implementation plus grid/cutpoint utilities.
reVeal/pu/tree.py	Adds PUExtraTree decision tree implementation used by the forest.
reVeal/pu/__init__.py	Exposes PUExtraTrees from the new `reVeal.pu` package.
reVeal/learn_weights.py	Implements PU data preparation, training, Optuna tuning, and weight/config generation.
reVeal/config/learn_weights.py	Adds pydantic config model for learn-weights inputs.
reVeal/cli/learn_weights.py	New CLI command wiring to run learn-weights and write outputs.
reVeal/cli/cli.py	Registers the new `learn-weights` command in the main CLI.
pyproject.toml	Adds new runtime dependencies needed by learn-weights/PU trees (but missing scikit-learn).
environment.yml	Adds conda dependencies for the new functionality (but missing scikit-learn).

codecov-commenter · 2026-05-20T22:18:48Z

Codecov Report

❌ Patch coverage is 64.18605% with 77 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.62%. Comparing base (dc210e6) to head (2374e7b).

Files with missing lines	Patch %	Lines
reVeal/learn_weights.py	72.72%	35 Missing and 4 partials ⚠️
reVeal/cli/learn_weights.py	26.92%	38 Missing ⚠️

❌ Your patch status has failed because the patch coverage (64.18%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #55      +/-   ##
==========================================
- Coverage   84.50%   81.62%   -2.89%     
==========================================
  Files          17       20       +3     
  Lines        1304     1518     +214     
  Branches      180      200      +20     
==========================================
+ Hits         1102     1239     +137     
- Misses        163      236      +73     
- Partials       39       43       +4

Flag	Coverage Δ
unittests	`81.62% <64.18%> (-2.89%)`	⬇️
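The figures in this report can be cross-checked from the hit and line counts in the diff table. The sketch below is only a consistency check; the assumption that the displayed percentages are truncated to two decimals while the delta is rounded is inferred from the numbers and is not documented behaviour.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of total lines."""
    return 100.0 * hits / lines


def truncate(value: float, places: int = 2) -> float:
    """Drop digits beyond `places` decimals without rounding up."""
    factor = 10**places
    return math.floor(value * factor) / factor


base = coverage_percent(hits=1102, lines=1304)  # main
head = coverage_percent(hits=1239, lines=1518)  # this PR
delta = round(head - base, 2)

# Missing patch lines per file: missing plus partials for reVeal/learn_weights.py,
# plus missing for reVeal/cli/learn_weights.py.
missing_total = (35 + 4) + 38
```

Under those assumptions, `truncate(base)` and `truncate(head)` match the 84.50% and 81.62% shown, `delta` matches -2.89, and `missing_total` matches the 77 lines reported.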

ppinchuk

Thanks for adding tests and keeping the environment lockfile updated!

iankchristie requested a review from ppinchuk as a code owner May 20, 2026 19:49

Copilot AI review requested due to automatic review settings May 20, 2026 19:49

Copilot started reviewing on behalf of iankchristie May 20, 2026 19:49

Copilot AI reviewed May 20, 2026

iankchristie force-pushed the ichristi/learn_weights branch from 66ba519 to 687f961 Compare May 20, 2026 20:04

iankchristie changed the title ~~Add learn-weights CLI step with PU learning and Optuna tuning~~ DNR: Add learn-weights CLI step with PU learning and Optuna tuning May 20, 2026

iankchristie force-pushed the ichristi/learn_weights branch 2 times, most recently from 6d2e29f to a602ddc Compare May 20, 2026 22:08

WIP

2374e7b

iankchristie force-pushed the ichristi/learn_weights branch from a602ddc to 2374e7b Compare May 20, 2026 22:11

iankchristie changed the title ~~DNR: Add learn-weights CLI step with PU learning and Optuna tuning~~ Add learn-weights CLI step with PU learning and Optuna tuning May 20, 2026

ppinchuk approved these changes May 21, 2026

iankchristie merged commit 8cd3c1e into main May 21, 2026
10 checks passed

iankchristie deleted the ichristi/learn_weights branch May 21, 2026 19:33

iankchristie merged 2 commits into main from ichristi/learn_weights
