[CVPR 2026] SeeGroup: Multi-Layer Depth Estimation of Transparent Surfaces via Self-Determined Grouping
In this work, we propose SeeGroup, a multi-layer depth estimation method that lets the model itself adaptively assign surfaces to depth maps. We formulate per-pixel multi-layer depth as a point process, treating depth layers as unordered events along each camera ray. This induces a permutation-invariant likelihood over the observed depth layers, yielding a loss that naturally supports arbitrary layer groupings. Experiments demonstrate that our method significantly advances the state of the art in multi-layer depth estimation, improving quadruplet relative depth accuracy on the LayeredDepth benchmark from 61.34% to 70.09%.
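To make the idea of a permutation-invariant likelihood over unordered depth events concrete, here is a minimal sketch using an inhomogeneous Poisson point process with a piecewise-constant intensity along one ray. This is an illustrative assumption, not the SeeGroup loss itself: the function name `poisson_process_nll`, the depth binning, and the choice of a Poisson process are all hypothetical and are not taken from the paper or this repository.

```python
import numpy as np


def poisson_process_nll(intensity, bin_edges, depths):
    """Negative log-likelihood of an unordered set of depths along one ray.

    Hypothetical illustration: the ray is split into depth bins, and
    `intensity[k]` is a non-negative event rate inside bin k.
    """
    intensity = np.asarray(intensity, dtype=np.float64)
    bin_edges = np.asarray(bin_edges, dtype=np.float64)
    depths = np.asarray(depths, dtype=np.float64)

    # Integral of the intensity over the ray (expected number of events).
    widths = np.diff(bin_edges)
    expected_count = np.sum(intensity * widths)

    # Locate the bin that contains each observed depth.
    idx = np.searchsorted(bin_edges, depths, side="right") - 1
    idx = np.clip(idx, 0, len(intensity) - 1)

    # Sum of log-intensities at the events; a sum does not depend on order.
    log_rates = np.log(intensity[idx] + 1e-12)
    return expected_count - np.sum(log_rates)


if __name__ == "__main__":
    edges = [0.0, 1.0, 2.0, 3.0, 4.0]
    rates = [0.1, 2.0, 0.1, 1.5]
    # The same two surfaces listed in either order give the same loss.
    print(poisson_process_nll(rates, edges, [1.4, 3.6]))
    print(poisson_process_nll(rates, edges, [3.6, 1.4]))
```

Because the events enter only through a sum, reordering the depth layers leaves the value unchanged, and the same function accepts any number of layers per ray, including none.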
If you find SeeGroup useful for your work, please consider citing our academic paper:
@misc{wen2026seegroupmultilayerdepthestimation,
title={SeeGroup: Multi-Layer Depth Estimation of Transparent Surfaces via Self-Determined Grouping},
author={Hongyu Wen and Jia Deng},
year={2026},
eprint={2605.28735},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2605.28735},
}
Install dependencies:
pip install -r requirements.txt

Download the released SeeGroup checkpoint for validation and test prediction:
bash scripts/download_seegroup_checkpoint.sh

Evaluate SeeGroup on the LayeredDepth validation split with the released checkpoint:
python val.py --checkpoint-path checkpoints/seegroup.pth

Run SeeGroup on the LayeredDepth test split with the released checkpoint and save the predictions:
python test.py \
--checkpoint-path checkpoints/seegroup.pth \
--output-dir predictions/layereddepth_test \
--format npy

Before training, download the Depth-Anything-V2 (DAV2) backbone once:
bash scripts/download_dav2_checkpoint.sh

Run single-GPU training:
python train.py

Run multi-GPU training on one machine, with $gpus set to the number of GPUs to use:
torchrun --nproc_per_node=$gpus train.py

This project relies on code from Depth-Anything-V2. We thank the original authors for their excellent work.
