Interactive web interface for exploring and visualizing segmentation challenge evaluation results — score distributions, submission drill-down, side-by-side 3D Neuroglancer views, and PNG image exports.
- Access to the cellmap challenge data (evaluation results and ground-truth zarr volumes)
- Python 3.9+ (handled automatically by pixi)
pixi creates an isolated environment and installs all dependencies on first run.
1. Install pixi (skip if already installed):
curl -fsSL https://pixi.sh/install.sh | sh
source ~/.bashrc # or ~/.zshrc
2. Start the app:
cd cellmap-challenge-analysis
pixi run start
Without pixi, install the dependencies with pip and use the wrapper script:
pip install -r requirements.txt
./start.sh
Or run directly:
python app.py --deduplicate
On the very first run, if .config.json is missing, the app will prompt you interactively:
Configuration Setup
============================================================
Directory containing eval_*.results files (RESULTS_DIR)
Enter value (or press Enter for default):
Path to evaluations.csv file (EVALUATIONS_CSV)
Enter value (or press Enter for default):
Your answers are saved to .config.json and reused on subsequent runs. You can also create or edit this file directly — see Configuration.
After the 10–30 second startup (loading ~197 evaluations):
| Service | URL |
|---|---|
| Web dashboard | http://localhost:5000 |
| Neuroglancer viewer | http://localhost:8080 |
If port 5000 or 8080 is already in use, the app selects the next free port automatically.
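The port fallback described above can be pictured with a short sketch. This is not the app's actual code: `find_free_port` and its `max_tries` limit are hypothetical names chosen for illustration, and the app may probe ports differently.

```python
import socket


def find_free_port(start: int, host: str = "127.0.0.1", max_tries: int = 50) -> int:
    """Return the first port >= start that can be bound on host (hypothetical helper)."""
    last = min(start + max_tries, 65536)  # never probe past the valid port range
    for port in range(start, last):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port is taken, try the next one
            return port
    raise RuntimeError(f"no free port found starting at {start}")
```

Calling `find_free_port(5000)` would return 5000 when it is free and the next bindable port otherwise. The startup log remains the authoritative source for the ports actually used.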
The app reads .config.json in the project root. This file is not tracked by git.
{
"RESULTS_DIR": "/path/to/results",
"EVALUATIONS_CSV": "/path/to/results/evaluations.csv",
"GT_DATA_PATH": "/path/to/ground_truth.zarr",
"EM_S3_URL_TEMPLATE": "s3://janelia-cosem-datasets/{dataset_name}/{dataset_name}.zarr/recon-1/em/fibsem-uint8",
"SEMI-PUBLIC_URL": "https://your-server-url"
}

| Key | Required | Description |
|---|---|---|
| RESULTS_DIR | Yes | Directory containing eval_*.results JSON files |
| EVALUATIONS_CSV | Yes | Path to evaluations.csv metadata file |
| GT_DATA_PATH | No | Path to ground-truth zarr store; needed for image export and Neuroglancer |
| EM_S3_URL_TEMPLATE | No | S3 URL template for raw EM data; {dataset_name} is substituted at runtime |
| SEMI-PUBLIC_URL | No | Base URL used when constructing shareable Neuroglancer links |
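As a rough illustration of how the keys in this table could be consumed, the sketch below loads the file, rejects a config that lacks a required key, and fills the EM template. The helper names `load_config` and `em_url` are hypothetical, and using `str.format` for the `{dataset_name}` substitution is an assumption; config.py may do this differently.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("RESULTS_DIR", "EVALUATIONS_CSV")
OPTIONAL_KEYS = ("GT_DATA_PATH", "EM_S3_URL_TEMPLATE", "SEMI-PUBLIC_URL")


def load_config(path=".config.json"):
    """Read the JSON config and check that the required keys are set."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise KeyError(f"missing required config keys: {', '.join(missing)}")
    for key in OPTIONAL_KEYS:
        config.setdefault(key, None)  # optional keys default to None
    return config


def em_url(config, dataset_name):
    """Fill {dataset_name} in the EM template, or return None if no template is set."""
    template = config.get("EM_S3_URL_TEMPLATE")
    return template.format(dataset_name=dataset_name) if template else None
```

With the template shown in the example config, `em_url(config, "some_dataset")` would place the dataset name in both positions of the S3 path.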
The CSV must contain these columns (column order does not matter; a UTF-8 BOM is stripped automatically):
| Column | Required | Type | Description |
|---|---|---|---|
| evaluation_id | Yes | integer | Primary key; must match the numeric ID in the corresponding eval_<id>.results filename |
| username | Yes | string | Submitter identifier; used in the UI and for deduplication |
| submission_name | Yes | string | Human-readable submission label displayed in the table |
| data_path | Yes | string | Path to the prediction zarr store; used by Neuroglancer and image export |
| status | Yes | string | Evaluation status (e.g. completed, failed) |
| created_at | Yes | datetime string | Submission timestamp; used for date-range filtering and deduplication ordering |
The created_at column is parsed flexibly and accepts any of these formats:
2024-01-15 14:30:00 (ISO, 24h)
01/15/24 02:30 PM (short year, 12h)
01/15/2024 02:30 PM (full year, 12h)
Any other columns present in the file are ignored.
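One way to accept the three timestamp formats listed above is to try them in order with the standard library. This is only a sketch: `parse_created_at` is a hypothetical name, and the app's own parser may rely on pandas or another strategy.

```python
from datetime import datetime

# The three formats documented for the created_at column.
CREATED_AT_FORMATS = (
    "%Y-%m-%d %H:%M:%S",  # 2024-01-15 14:30:00
    "%m/%d/%y %I:%M %p",  # 01/15/24 02:30 PM
    "%m/%d/%Y %I:%M %p",  # 01/15/2024 02:30 PM
)


def parse_created_at(value):
    """Return a datetime for the first documented format that matches."""
    text = value.strip()
    for fmt in CREATED_AT_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
    raise ValueError(f"unrecognized created_at value: {value!r}")
```

All three example strings from the list describe the same moment, so they should parse to equal `datetime` objects.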
Minimal example:
evaluation_id,username,submission_name,data_path,status,created_at
42,alice,my-first-run,/data/predictions/alice_run1.zarr,completed,2024-03-01 10:00:00
43,bob,baseline,/data/predictions/bob_baseline.zarr,completed,2024-03-02 09:15:00
The filter bar at the top of the page controls all views:
| Filter | Description |
|---|---|
| Metric | Score to plot (e.g. dice_score, iou, overall_score) |
| Crop | Restrict to a specific test crop, or "All Crops" for aggregate |
| Organelle | Restrict to a specific organelle, or "All Organelles" for aggregate |
| Date Range | Show only submissions created within a date window |
| Exclude Missing | Toggle whether NaN/missing scores appear in the histogram |
The histogram updates automatically whenever a filter changes. Active filters are shown as badges below the controls.
- Select a metric, crop, and/or organelle using the filters.
- The histogram shows the score distribution across all matching submissions.
- Click a bar to open a drill-down table of submissions in that score range, sorted by score (highest first).
- Click anywhere outside the table (or the X button) to close it.
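To make the binning behind the histogram concrete, here is a minimal pure-Python sketch of equal-width bins over a list of scores. It is an illustration under assumptions: the function name, the returned `missing` count, and the way the Exclude Missing toggle is modeled are all hypothetical, and the app itself computes histograms with pandas.

```python
import math


def histogram(scores, bins=20, exclude_missing=True):
    """Equal-width histogram; NaN and None scores are never placed in a bin."""
    values = [s for s in scores if s is not None and not math.isnan(s)]
    missing = len(scores) - len(values)
    reported_missing = 0 if exclude_missing else missing  # assumed meaning of the toggle
    if not values:
        return {"bin_edges": [], "counts": [], "bin_width": 0.0, "missing": reported_missing}
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all scores are equal
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for value in values:
        index = min(int((value - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[index] += 1
    return {"bin_edges": edges, "counts": counts, "bin_width": width, "missing": reported_missing}
```

Clicking a bar in the UI corresponds to picking one pair of adjacent edges and listing the submissions whose score lies between them.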
- Set Crop and Organelle to specific values (not "All").
- Click a histogram bar to open the submissions table.
- Click "View in Neuroglancer" for any submission.
- A new tab opens with a side-by-side layout:
- Left panel — ground-truth segmentation
- Right panel — submitted prediction
- Raw EM grayscale background (if EM data is available)
Use standard Neuroglancer mouse controls to navigate (scroll to zoom, click+drag to pan, right-click to cross-section).
- Set filters to identify a submission of interest.
- In the submissions table, click "Export Images" for a submission.
- PNG files are generated (one per crop) and downloaded automatically.
- Single crop → single .png
- Multiple crops → .zip archive
Images show GT and prediction side-by-side with all organelles color-coded using canonical colors.
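The single-file versus archive behavior of the export can be sketched with the standard library. The function name `package_images` and the file names it produces are hypothetical; the real export code may name and bundle files differently.

```python
import io
import zipfile


def package_images(images):
    """images maps crop name -> PNG bytes; returns (filename, payload bytes)."""
    if not images:
        raise ValueError("no images to package")
    if len(images) == 1:
        # One crop: hand back the PNG itself.
        ((crop, data),) = images.items()
        return f"{crop}.png", data
    # Several crops: bundle one PNG per crop into an in-memory ZIP archive.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for crop, data in sorted(images.items()):
            archive.writestr(f"{crop}.png", data)
    return "images.zip", buffer.getvalue()
```

Building the archive in memory avoids temporary files, which suits a web handler that streams the result straight back to the browser.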
python app.py [--deduplicate] [--port PORT]
| Flag | Default | Description |
|---|---|---|
| --deduplicate | off | Remove duplicate evaluations on startup (keeps latest per username+scores) |
| --port PORT | 5000 | Flask listen port (auto-increments if busy) |
start.sh and pixi run start both pass --deduplicate automatically.
python make_eval_images.py EVAL_ID [--output-dir DIR] [--crops CROP ...] [--organelles ORG ...]
| Argument | Description |
|---|---|
| EVAL_ID | Evaluation ID (integer) |
| --output-dir DIR | Where to save PNGs (default: <RESULTS_DIR>/eval_images/<eval_id>/) |
| --crops CROP ... | Limit to specific crop names (default: all) |
| --organelles ORG ... | Limit to specific organelle names (default: all) |
Example:
python make_eval_images.py 42 --crops test_crop1 test_crop3 --organelles mito er
Or via pixi:
pixi run eval-images 42 --output-dir /tmp/eval42
deduplicate_evaluations.py identifies and removes duplicate evaluations (same username, identical overall scores), keeping the most recent submission per group.
python deduplicate_evaluations.py
This runs automatically when app.py is started with --deduplicate.
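The deduplication rule can be sketched as grouping rows by username and scores and keeping the newest row in each group. This is an illustration, not the script's code: it assumes that "identical overall scores" means all three overall metrics match, and that `created_at` has already been parsed into comparable values.

```python
from datetime import datetime

# Assumption: these three metrics together define "identical overall scores".
SCORE_KEYS = ("overall_score", "overall_instance_score", "overall_semantic_score")


def deduplicate(rows):
    """Keep the most recent row per (username, overall scores) group."""
    latest = {}
    for row in rows:
        key = (row["username"],) + tuple(row[name] for name in SCORE_KEYS)
        kept = latest.get(key)
        if kept is None or row["created_at"] > kept["created_at"]:
            latest[key] = row
    # Return survivors in a stable order for easier inspection.
    return sorted(latest.values(), key=lambda row: row["evaluation_id"])
```

A resubmission of the same prediction by the same user would collapse to the latest entry, while a submission with different scores is kept as a separate row.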
make_review_csv.py creates review_sheet.csv with pre-built Neuroglancer links for each evaluation+crop combination. State JSON files are saved to ng_states/.
python make_review_csv.py
fix_dtype.py corrects data-type mismatches in zarr arrays (e.g. bool → uint8).
python fix_dtype.py --input /path/to/array.zarr
python fix_dtype.py --csv /path/to/list.csv
python fix_dtype.py --input /path/to/array.zarr --metadata-only
| Metric | Description |
|---|---|
| overall_score | Combined instance + semantic score |
| overall_instance_score | Average instance segmentation score across organelles |
| overall_semantic_score | Average semantic segmentation score across organelles |
| Metric | Description |
|---|---|
| accuracy | Voxel-level classification accuracy |
| hausdorff_distance | Hausdorff distance between predicted and GT instances |
| normalized_hausdorff_distance | Hausdorff distance normalized by volume |
| combined_score | Combined instance metric |
| Metric | Description |
|---|---|
| iou | Intersection over Union |
| dice_score | Dice coefficient |
You can also enter metric expressions in the filter bar (evaluated via pandas.eval), for example:
(iou + dice_score) / 2
All endpoints are under http://localhost:5000/api/.
Returns available filter options loaded at startup.
Response:
{
"metrics": ["dice_score", "iou", "overall_score"],
"crops": ["test_crop1", "test_crop2"],
"organelles": ["mito", "er", "nucleus"],
"date_range": {"min": "2024-01-01", "max": "2024-12-31"}
}
GET /api/histogram_data computes the score distribution for a metric, with optional filters.
| Parameter | Required | Description |
|---|---|---|
| metric | Yes | Metric name or expression |
| crop | No | Crop filter |
| organelle | No | Organelle filter |
| bins | No | Number of histogram bins (default: 20) |
| exclude_missing | No | Exclude NaN values (default: true) |
| date_from | No | ISO date string lower bound |
| date_to | No | ISO date string upper bound |
Response:
{"bins": [], "counts": [], "bin_edges": [], "bin_width": 0.05}
GET /api/bin_submissions returns the submissions whose score falls in a range.
| Parameter | Required | Description |
|---|---|---|
| metric | Yes | Metric name |
| bin_min | Yes | Minimum score (inclusive) |
| bin_max | Yes | Maximum score (inclusive) |
| crop | No | Crop filter |
| organelle | No | Organelle filter |
| date_from | No | ISO date lower bound |
| date_to | No | ISO date upper bound |
Response: Array of submission objects sorted by score descending.
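For scripted access, the two endpoints above can be called with nothing but the standard library. The sketch below only uses the endpoint paths and parameter names documented in this README; the helper names `api_url` and `fetch_json` are hypothetical, and the default port 5000 is assumed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "http://localhost:5000/api"  # adjust if the app picked another port


def api_url(endpoint, **params):
    """Build a GET URL, dropping parameters that are left as None."""
    query = {name: value for name, value in params.items() if value is not None}
    return f"{API_BASE}/{endpoint}?{urlencode(query)}"


def fetch_json(url, timeout=30):
    """Perform the GET request and decode the JSON body (needs a running app)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Histogram of Dice scores for one organelle, then the submissions in one bin.
histogram_url = api_url("histogram_data", metric="dice_score", organelle="mito", bins=20)
bin_url = api_url("bin_submissions", metric="dice_score", organelle="mito",
                  bin_min=0.8, bin_max=0.85)
```

With the app running, `fetch_json(histogram_url)` should return an object with the fields shown in the histogram response, and `fetch_json(bin_url)` the array of submissions for that score range.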
GET /api/neuroglancer_url generates a Neuroglancer viewer URL for a specific submission.
| Parameter | Required | Description |
|---|---|---|
| eval_id | Yes | Evaluation ID |
| crop | Yes | Crop name |
| organelle | Yes | Organelle name |
Response: {"url": "http://localhost:8080/#!{state_token}"}
GET /api/export_images generates PNG comparison images for a submission.
| Parameter | Required | Description |
|---|---|---|
| eval_id | Yes | Evaluation ID |
| crop | No | Restrict to one crop |
| organelle | No | Restrict to one organelle |
Response: Single PNG file or ZIP archive (multiple crops).
Returns metadata for a specific evaluation (username, date, data path, scores, etc.).
Returns the list of crops available for a given evaluation.
| Parameter | Required | Description |
|---|---|---|
| eval_id | Yes | Evaluation ID |
Returns the organelles available for a given evaluation+crop combination.
| Parameter | Required | Description |
|---|---|---|
| eval_id | Yes | Evaluation ID |
| crop | Yes | Crop name |
cellmap-challenge-analysis/
├── app.py # Flask server and API endpoints
├── config.py # Configuration loading (.config.json)
├── data_loader.py # Data ingestion and query logic
├── neuroglancer_utils.py # Neuroglancer viewer creation
├── make_eval_images.py # PNG generation (CLI + Flask API)
├── deduplicate_evaluations.py
├── make_review_csv.py
├── fix_dtype.py
├── start.sh # Convenience wrapper: python app.py --deduplicate
├── pixi.toml # Pixi environment definition
├── requirements.txt
├── .config.json # Local config (not tracked)
├── templates/index.html # Single-page web UI
├── static/js/main.js # Plotly histograms, filters, drill-down
├── static/css/style.css
└── ng_states/ # Cached Neuroglancer state JSON files
Startup
└─ load .config.json
└─ (optionally) deduplicate evaluations.csv
└─ initialize Neuroglancer server (port 8080+)
└─ load eval_*.results → pandas DataFrames
└─ merge with evaluations.csv metadata
└─ fetch crop→dataset manifest from GitHub
└─ cache everything in memory
Browser request
└─ select filters → GET /api/histogram_data → Plotly renders histogram
└─ click bar → GET /api/bin_submissions → table appears
└─ click button → GET /api/neuroglancer_url OR /api/export_images
| Package | Purpose |
|---|---|
| Flask + flask-cors | Web server and REST API |
| pandas | Data loading, filtering, histogram computation |
| neuroglancer | 3D viewer server |
| zarr (<3.0) | Reading zarr volume arrays |
| matplotlib | PNG image rendering |
| s3fs (optional) | Loading raw EM data from S3 |
- Verify all paths in .config.json exist and are readable.
- Re-run without --deduplicate to skip that step: python app.py.
- The app auto-selects the next free port if 5000 or 8080 are busy. Check the startup log for the actual ports used.
- To force a specific port: python app.py --port 5001.
- Confirm GT_DATA_PATH in .config.json points to a valid zarr store.
- Confirm the prediction zarr path in the evaluation results is accessible.
- Check the browser console for CORS or network errors.
- Same zarr path requirements as Neuroglancer above.
- Install s3fs if EM background is needed: pip install s3fs.
- The eval ID must exist in EVALUATIONS_CSV.
- Results file eval_<ID>.results must be present in RESULTS_DIR.
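The last two checks can be automated with a small diagnostic. This is a sketch rather than part of the project: `check_eval` is a hypothetical helper, and it assumes the CSV layout and the eval_<ID>.results naming described in this README.

```python
import csv
from pathlib import Path


def check_eval(eval_id, evaluations_csv, results_dir):
    """Return a list of problems found for one evaluation ID (empty if none)."""
    problems = []
    # utf-8-sig drops a leading BOM, matching the CSV handling described earlier.
    with open(evaluations_csv, newline="", encoding="utf-8-sig") as handle:
        known_ids = {row["evaluation_id"].strip() for row in csv.DictReader(handle)}
    if str(eval_id) not in known_ids:
        problems.append(f"evaluation_id {eval_id} not found in {evaluations_csv}")
    results_file = Path(results_dir) / f"eval_{eval_id}.results"
    if not results_file.is_file():
        problems.append(f"missing results file {results_file}")
    return problems
```

Passing the `EVALUATIONS_CSV` and `RESULTS_DIR` values from .config.json to this function gives a quick answer about which of the two requirements is not met.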
For issues or questions, contact the CellMap team or open an issue in the project repository.