All notable changes to RF-DETR are documented here.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- `augmentation_backend` field on `TrainConfig` (`"cpu"`/`"auto"`/`"gpu"`): opt-in GPU-side augmentation via Kornia, applied in `RFDETRDataModule.on_after_batch_transfer` after the batch is resident on the GPU. The CPU path is unchanged and remains the default. Install with `pip install 'rfdetr[kornia]'`. Phase 1 supports detection only; segmentation mask support is planned for Phase 2.
- `BuilderArgs` — a `@runtime_checkable` `typing.Protocol` documenting the minimum attribute set consumed by `build_model()`, `build_backbone()`, `build_transformer()`, and `build_criterion_and_postprocessors()`. Enables static type-checker support for custom builder integrations. Exported from `rfdetr.models`. (#841)
- `build_model_from_config(model_config, train_config=None, defaults=MODEL_DEFAULTS)` — config-native alternative to `build_model(build_namespace(mc, tc))`; accepts Pydantic config objects directly and constructs the internal namespace automatically. Exported from `rfdetr.models`. (#845)
- `build_criterion_from_config(model_config, train_config, defaults=MODEL_DEFAULTS)` — config-native alternative to `build_criterion_and_postprocessors(build_namespace(mc, tc))`; returns a `(SetCriterion, PostProcess)` tuple. Exported from `rfdetr.models`. (#845)
- `ModelDefaults` dataclass — exposes the 35 hardcoded architectural constants previously buried inside `build_namespace()`. Pass a `dataclasses.replace(MODEL_DEFAULTS, ...)` override to the new config-native builders to customise individual constants. Note: fields may be promoted to `ModelConfig`/`TrainConfig` in future phases. Exported from `rfdetr.models`. (#845)
- `MODEL_DEFAULTS` — the canonical `ModelDefaults` singleton with production defaults. Exported from `rfdetr.models`. (#845)
- `RFDETR.predict(include_source_image=...)` — opt-out flag (default `True`) for storing the source image in `detections.data["source_image"]`; set it to `False` to reduce memory use when the image is not needed for annotation. (#912)
- `model_name` is now stored in checkpoint files during training so that `RFDETR.from_checkpoint()` can resolve the correct model class directly from the checkpoint, without requiring the caller to know or pass a class hint. `strip_checkpoint()` preserves this key. Backward-compatible: checkpoints without `model_name` continue to resolve via `pretrain_weights` filename matching. (#895)
- `rfdetr_version` is now stored in checkpoint files during training for provenance tracking and compatibility hints. `strip_checkpoint()` preserves this key. The key is omitted gracefully when the package version cannot be resolved (e.g. an editable install without metadata). Backward-compatible: checkpoints without `rfdetr_version` continue to load normally. (#918)
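The `model_name`-with-fallback resolution described above can be sketched as follows. This is a minimal stdlib-only illustration, not the actual rfdetr internals: `resolve_model_name` and the legacy filename table are hypothetical names.

```python
from typing import Optional

# Hypothetical legacy filename -> model-name table used for the fallback path.
LEGACY_WEIGHTS = {"rf-detr-nano.pth": "RFDETRNano"}

def resolve_model_name(checkpoint: dict) -> Optional[str]:
    # New checkpoints carry the model name directly.
    if "model_name" in checkpoint:
        return checkpoint["model_name"]
    # Older checkpoints fall back to pretrain_weights filename matching.
    weights = checkpoint.get("pretrain_weights")
    if weights is not None:
        return LEGACY_WEIGHTS.get(weights.rsplit("/", 1)[-1])
    return None

print(resolve_model_name({"model_name": "RFDETRSegNano"}))           # direct key
print(resolve_model_name({"pretrain_weights": "rf-detr-nano.pth"}))  # fallback
```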
- Deprecated: `build_namespace(model_config, train_config)` — no longer used internally; use `build_model_from_config`, `build_criterion_from_config`, or `_namespace_from_configs` directly. It will be removed in v1.9 and currently emits a `DeprecationWarning` on use. (#845)
- Deprecated: the `train_config` positional argument of `load_pretrain_weights(nn_model, model_config, train_config)` will be removed in v1.9; it is no longer used internally. Omit it: `load_pretrain_weights(nn_model, model_config)`. Passing a non-`None` value emits a `DeprecationWarning`. (#845)
- Deprecated misplaced config fields: `TrainConfig.group_detr` (architecture decision → `ModelConfig`), `TrainConfig.ia_bce_loss` (loss type tied to the architecture family → `ModelConfig`), `TrainConfig.segmentation_head` (architecture flag → `ModelConfig`), `TrainConfig.num_select` (postprocessor count → `ModelConfig`; `SegmentationTrainConfig` users should remove the `num_select` override — the model config value is always used), and `ModelConfig.cls_loss_coef` (training hyperparameter → `TrainConfig`). Each now emits a `DeprecationWarning` when set on the wrong config object and will be removed in v1.9. (#841)
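The deprecated-positional-argument pattern above can be sketched with the stdlib `warnings` module. The function name here is a hypothetical stand-in, not rfdetr's actual implementation:

```python
import warnings

def load_weights(model, model_config, train_config=None):
    """Sketch of deprecating a now-unused positional argument."""
    if train_config is not None:
        warnings.warn(
            "the train_config argument is deprecated and will be removed; "
            "call load_weights(model, model_config) instead",
            DeprecationWarning,
            stacklevel=2,
        )
    # ... weight loading driven by model_config alone would go here ...
    return model
```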
- `WindowedDinov2WithRegistersEmbeddings.forward()` now raises `ValueError` (instead of silently failing under `-O`) when input spatial dimensions are not divisible by `patch_size * num_windows`, with a clear message identifying the divisor and the actual shape. (#167)
- Fixed `_namespace.py`: `num_select` in the builder namespace now always reads from `ModelConfig`, eliminating a regression where `TrainConfig.num_select` (default 300) silently overrode model-specific values of 100–200 for the segmentation variants (`RFDETRSegNano`, `RFDETRSegSmall`, `RFDETRSegMedium`, `RFDETRSegLarge`, `RFDETRSegPreview`). Post-processing now uses the correct top-k count for each model. (#841)
- Fixed `models/weights.py`: `load_pretrain_weights` now correctly auto-aligns the model head when the checkpoint has fewer classes than the configured default, preventing a silent mismatch when `num_classes` was not explicitly set by the caller. (#845)
- Fixed YOLO segmentation training on large datasets hitting OS out-of-memory: `supervision.DetectionDataset.from_yolo(force_masks=True)` eagerly rasterised H×W boolean masks for every image at dataset construction time (≈1 GB per 1,000 images at 1024 px). A new `_LazyYoloDetectionDataset` stores polygon coordinates only and defers dense mask rasterisation to `__getitem__`, keeping RAM proportional to the annotation count rather than N × H × W. (#851)
- Fixed ONNX/TRT dynamic batch inference: `gen_encoder_output_proposals` and `Transformer.forward` extracted the batch size as a Python int and passed it to `torch.full`, `.view(N_, ...)`, `.expand(N_, ...)`, and `.repeat(bs, ...)`, causing the ONNX tracer to bake the training batch size (e.g. 8) in as a compile-time constant. TRT engines built with `--minShapes` smaller than the trace batch would then fail at inference with `Reshape: reshaping failed`. All six call sites now use ONNX-symbolic equivalents (`zeros_like`, `-1` reshapes, `expand(memory.shape[0], ...)`), keeping the batch dimension fully dynamic. (#950, closes #949)
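The lazy-rasterisation idea behind the out-of-memory fix above can be sketched in plain Python, with a toy stand-in for real polygon rasterisation. The class name and "rasterisation" here are illustrative assumptions, not the rfdetr/supervision code:

```python
class LazyMaskDataset:
    """Sketch of deferred mask rasterisation: store only polygon
    coordinates at construction time and build the dense mask in
    __getitem__. Hypothetical stand-in, not _LazyYoloDetectionDataset."""

    def __init__(self, polygons_per_image, image_hw):
        self.polygons = polygons_per_image  # cheap: coordinates only
        self.h, self.w = image_hw

    def __len__(self):
        return len(self.polygons)

    def __getitem__(self, idx):
        # The dense H x W mask is built on demand, so resident RAM scales
        # with the annotation count, not with N x H x W.
        mask = [[False] * self.w for _ in range(self.h)]
        for x, y in self.polygons[idx]:
            mask[y][x] = True  # toy "rasterisation": mark vertices only
        return mask
```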
- `predict()` now includes `class_name` in `detections.data`, mapping each detection's 0-indexed class ID to its human-readable name. (#914)
- Fixed segmentation multi-GPU DDP training crash: `build_trainer()` now wraps `strategy="ddp"` with `DDPStrategy(find_unused_parameters=True)` when `segmentation_head=True`. The segmentation head's `sparse_forward()` leaves parameters unused on some forward steps; plain `"ddp"` raised `RuntimeError: It looks like your LightningModule has parameters that were not used in producing the loss`. Non-segmentation DDP and other strategies are unchanged. (#942, #947)
- Fixed fused AdamW crash under FP32 multi-GPU training: `configure_optimizers()` and `clip_gradients()` now gate fused AdamW on the trainer's actual precision (requiring a BF16 variant) rather than on GPU capability alone. On Ampere+ hardware `torch.cuda.is_bf16_supported()` is always `True`, so the old code enabled fused AdamW even with `precision="32-true"`, causing `RuntimeError: params, grads, exp_avgs, and exp_avg_sqs must have same dtype, device, and layout` from DDP gradient bucket view stride mismatches. (#942, #947)
- Fixed multi-GPU DDP training crashing in Jupyter notebooks and Kaggle: replaced the fork-based `ddp_notebook` strategy with a spawn-based DDP strategy that avoids OpenMP thread pool corruption after `fork()`. (#928)
- Fixed `RFDETR.train(resolution=...)` being silently ignored — the kwarg is now applied to `model_config` before training begins, with validation that the value is divisible by `patch_size * num_windows`. (#933)
- Fixed `save_dataset_grids` being silently a no-op — `DatasetGridSaver` is now wired into the training loop, saving sample grids to `{output_dir}/dataset_grids/` when enabled. Grid save failures are caught without interrupting training. (#946)
- Fixed partial gradient-accumulation windows at the tail of training epochs: the training dataset is now padded to an exact multiple of `effective_batch_size * world_size`, ensuring every optimizer step uses a full gradient window. Workaround for pytorch-lightning#19987. (#937)
- Fixed `torch.export.export` failing on the transformer decoder by threading `spatial_shapes_hw` through all decoder layers. (#936)
- `download_pretrain_weights()` no longer overwrites fine-tuned checkpoints that share a filename with a registry model (e.g. `rf-detr-nano.pth`). Previously, an MD5 mismatch would fall through to `_download_file()` and silently replace the user's weights with the original COCO checkpoint. The function now returns early whenever the file exists and `redownload=False`, regardless of MD5 status — a warning is emitted when the hash differs. Pass `redownload=True` to force a fresh download. (#935)
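The download guard described above can be sketched with the stdlib only. The helper name, signature, and `download_fn` hook are hypothetical illustrations of the early-return-plus-warning behaviour, not rfdetr's actual function:

```python
import hashlib
import os
import warnings

def maybe_download(path, expected_md5, download_fn, redownload=False):
    """Sketch: never clobber an existing file unless redownload=True;
    warn when its MD5 differs from the registry hash."""
    if os.path.exists(path) and not redownload:
        with open(path, "rb") as f:
            actual = hashlib.md5(f.read()).hexdigest()
        if actual != expected_md5:
            warnings.warn(f"{path} exists but its MD5 differs; keeping it. "
                          "Pass redownload=True to fetch a fresh copy.")
        return path  # early return: the user's weights are preserved
    download_fn(path)
    return path
```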
- `predict()` now stores the original image and its shape on returned `sv.Detections` objects — `detections.data["source_image"]` (NumPy array) and `detections.data["source_shape"]` (height, width) let you annotate results without loading the image separately. (#892)
- `RFDETR.train()` auto-detects `num_classes` from the dataset directory when it is not explicitly set, reinitializing the detection head to the correct class count automatically. A warning is emitted when the configured value differs from the dataset count. (#893)
- `optimize_for_inference()` now accepts dtype as a string name (e.g. `"float16"`) in addition to a `torch.dtype` object; invalid dtype inputs uniformly raise `TypeError`. (#899)
- Fixed `models/lwdetr.py`: `reinitialize_detection_head` now replaces `nn.Linear` modules instead of mutating `.data` tensors in place, ensuring `out_features` metadata stays consistent with the actual weight shape. This prevents ONNX export and `torch.jit.trace` from emitting stale (pre-fine-tuning) class counts for fine-tuned models. (#904)
- Fixed `RFDETR.optimize_for_inference()` leaking a CUDA context on multi-GPU setups: the deep-copy, export, and JIT-trace steps now run inside `torch.cuda.device(device)` to pin the context to the correct device. (#899)
- Fixed `optimize_for_inference()` leaving inconsistent state on failure: prior optimized state is now reset, and flags are committed only after a successful build/trace; temp download files use unique per-process paths to avoid parallel worker collisions.
- Fixed `deploy_to_roboflow` failing with `FileNotFoundError` after the PyTorch Lightning migration: `class_names.txt` is now written to the upload directory and `args.class_names` is populated before saving the checkpoint. (#890)
- `RFDETR.predict(shape=...)` — optional `(height, width)` tuple that overrides the default square inference resolution; useful when matching a non-square ONNX export. Both dimensions must be positive integers divisible by `patch_size × num_windows` as determined by the model configuration. (#866)
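The divisibility validation described above can be sketched as below. The function name and the default `patch_size`/`num_windows` values are illustrative assumptions, not rfdetr's actual configuration:

```python
def validate_shape(shape, patch_size=16, num_windows=2):
    """Sketch of validating a (height, width) inference shape against
    patch_size * num_windows. Defaults here are illustrative only."""
    divisor = patch_size * num_windows
    h, w = shape
    for name, dim in (("height", h), ("width", w)):
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(dim, int) or isinstance(dim, bool) or dim <= 0:
            raise TypeError(f"{name} must be a positive integer, got {dim!r}")
        if dim % divisor:
            raise ValueError(f"{name}={dim} is not divisible by "
                             f"patch_size * num_windows = {divisor}")
    return shape

print(validate_shape((640, 512)))  # both dimensions divisible by 32
```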
- `ModelConfig.device` and `RFDETR.train(device=...)` now accept `torch.device` objects and indexed device strings such as `"cuda:0"`. Values are normalized to canonical torch-style strings. `RFDETR.train()` warns when an unmapped device type is passed to PyTorch Lightning auto-detection. (#872)
- Fixed ONNX export ignoring an explicit `patch_size` argument: `export()` and `predict()` now resolve `patch_size` from `model_config` by default, validate it strictly (positive integer, not bool), and enforce that `(H, W)` dimensions are divisible by `patch_size × num_windows`. (#876)
- Fixed ONNX export for models with dynamic batch dimensions — replaced `H_.expand(N_)` with `torch.full` for Python-int spatial dims to eliminate tracer failures. (#871)
- `RFDETR.export(..., simplify=..., force=...)` — both arguments are now no-ops and emit a `DeprecationWarning`. RF-DETR no longer runs ONNX simplification automatically; remove these arguments from your calls. They will be removed in v1.8. (#861)
- Fixed `RFDETR.train()`: a missing `rfdetr[train]` install (e.g. plain `pip install rfdetr` in Colab) now raises an `ImportError` with an actionable message — `pip install "rfdetr[train,loggers]"` — instead of a raw `ModuleNotFoundError` with no install hint. (#858)
- Fixed the `AUG_AGGRESSIVE` preset: `translate_percent` was `(0.1, 0.1)` — a degenerate range that forced Albumentations `Affine` to always translate right/down by exactly 10%. Corrected to `(-0.1, 0.1)` for symmetric bidirectional translation. (#863)
- Fixed the PTL training path: `latest.ckpt` and per-interval checkpoints (`checkpoint_interval_N.ckpt`) are now properly written and restored on resume. (#847)
- Fixed `BestModelCallback` and the checkpoint monitor raising `MisconfigurationException` on non-eval epochs when `eval_interval > 1` — monitor key absence is now handled gracefully. (#848)
- Fixed the `protobuf` version constraint in the `loggers` extra to guard against the TensorBoard descriptor crash (`TypeError: Descriptors cannot be created directly`) with protobuf ≥ 4. (#846)
- Fixed duplicate `ModelCheckpoint` state keys when `checkpoint_interval=1`; `last.ckpt` is omitted in that configuration to avoid the collision. (#859)
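The `translate_percent` fix above comes down to how a range is sampled. A stdlib sketch of Affine-style uniform range sampling (the helper name is hypothetical) shows why `(0.1, 0.1)` is degenerate:

```python
import random

random.seed(0)

def sample_translate(rng):
    """Sample a translation fraction uniformly from [rng[0], rng[1]],
    the way Affine-style transforms interpret a range. With (0.1, 0.1)
    every draw is exactly 0.1; (-0.1, 0.1) shifts in both directions."""
    lo, hi = rng
    return random.uniform(lo, hi)

degenerate = {sample_translate((0.1, 0.1)) for _ in range(100)}
symmetric = [sample_translate((-0.1, 0.1)) for _ in range(100)]
print(degenerate)                      # always {0.1}
print(min(symmetric), max(symmetric))  # spans negative and positive shifts
```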
- PyTorch Lightning training building blocks: `RFDETRModelModule`, `RFDETRDataModule`, `build_trainer()`, and individual callbacks (`RFDETREMACallback`, `COCOEvalCallback`, `BestModelCallback`, `DropPathCallback`, `MetricsPlotCallback`) — all standard PTL components; swap, subclass, or extend any piece. Level 3: the `rfdetr fit --config` CLI with zero Python required. (#757, #794)
- Multi-GPU DDP via `model.train()`: `strategy`, `devices`, and `num_nodes` added to `TrainConfig`; single-GPU behaviour is unchanged when they are omitted. (#808)
- `batch_size='auto'`: a CUDA memory probe finds the largest safe micro-batch size, then recommends `grad_accum_steps` to reach a configurable effective batch target (default 16 via `auto_batch_target_effective`). (#814)
- `ModelContext` promoted from `_ModelContext` to a public, exported API — inspect `class_names`, `num_classes`, and related metadata via `model.context` after training. (#835)
- `backbone_lora` and `freeze_encoder` added as first-class fields in `ModelConfig`. (#829)
- `generate_coco_dataset(with_segmentation=True)` produces COCO polygon annotations alongside bounding boxes for segmentation fine-tuning with synthetic data. (#781)
- `set_attn_implementation("eager" | "sdpa")` on the DINOv2 backbone — switch the attention implementation at runtime. (#760)
- `eval_max_dets`, `eval_interval`, and `log_per_class_metrics` added to `TrainConfig`.
- `python -m rfdetr` entry point alongside the `rfdetr` console script.
- `py.typed` marker — RF-DETR is now PEP 561–compliant.
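The second step of `batch_size='auto'` above is simple arithmetic. This sketch assumes a ceiling-division rule; the function name and formula are illustrative, only the default effective target of 16 comes from the entry above:

```python
import math

def recommend_grad_accum(micro_batch, target_effective=16, world_size=1):
    """Sketch: given the largest safe micro-batch found by the memory
    probe, pick grad_accum_steps so that
    micro_batch * world_size * steps >= target_effective."""
    per_step = micro_batch * world_size
    return max(1, math.ceil(target_effective / per_step))

print(recommend_grad_accum(6))                # 3 steps (effective 18)
print(recommend_grad_accum(4, world_size=2))  # 8 per step, so 2 steps
print(recommend_grad_accum(32))               # already over target: 1 step
```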
- Breaking: minimum `transformers` version bumped to `>=5.1.0,<6.0.0`. The DINOv2 windowed-attention backbone now uses the transformers v5 API (`BackboneMixin._init_transformers_backbone()`, removed `head_mask` plumbing). Projects still on transformers v4 must pin `rfdetr<1.6.0`. (#760)
- Breaking: PyPI install extras renamed — `rfdetr[metrics]` → `rfdetr[loggers]`, `rfdetr[onnxexport]` → `rfdetr[onnx]`.
- `draw_synthetic_shape` now returns `Tuple[np.ndarray, List[float]]` instead of `np.ndarray`. The second element is a flat COCO-style polygon list `[x1, y1, x2, y2, …]`. Any caller that previously did `img = draw_synthetic_shape(...)` must be updated to `img, polygon = draw_synthetic_shape(...)`. (#781)
- Albumentations version constraint broadened to `>=1.4.24,<3.0.0`; `RandomSizedCrop` configs using `height`/`width` kwargs are automatically adapted to the 2.x `size=(height, width)` API. (#786)
- The current learning rate is now shown in the training progress bar alongside the loss. (#809)
- `supervision`, `pytorch_lightning`, and other heavy dependencies are now imported lazily (on first use) rather than at module load, reducing cold-import time in inference-only environments. (#801)
- Deprecated: `rfdetr.deploy.*` redirects to `rfdetr.export.*` with a `DeprecationWarning`. Migrate before v1.7.
- Deprecated: `rfdetr.util.*` redirects to `rfdetr.utilities.*` with a `DeprecationWarning`. Migrate before v1.7.
- Raised a descriptive `ValueError` instead of a cryptic `RuntimeError`/tensor-size mismatch when a checkpoint is incompatible with the current model architecture — covers `segmentation_head` mismatch and `patch_size` mismatch. (#810)
- Fixed `class_names` not reflecting dataset labels on `model.predict()` after training — class names are now synced from the dataset so inference always uses the correct label list. (#816)
- Fixed detection head reinitialization overwriting fine-tuned weights when loading a checkpoint with fewer classes than the model default. The second `reinitialize_detection_head` call now fires only in the backbone-pretrain scenario. (#815, #509)
- Fixed `grid_sample` and bicubic interpolation silently falling back to CPU on MPS (Apple Silicon) — both now run natively on the MPS device. (#821)
- Fixed `early_stopping=False` in `TrainConfig` being silently ignored — the setting now propagates correctly. (#835)
- Fixed an `AttributeError` crash in `update_drop_path` when the DINOv2 backbone layer structure does not match any known pattern.
- Added a warning when `drop_path_rate > 0.0` is configured with a non-windowed DINOv2 backbone, where drop-path is silently ignored.
- Fixed `ValueError: matrix entries are not finite` in `HungarianMatcher` when the cost matrix contains NaN or Inf — non-finite entries are replaced with a finite sentinel before `linear_sum_assignment`, and a warning is emitted at most once per matcher instance. (#787)
- Fixed YOLO dataset validation rejecting `data.yml` — both `.yaml` and `.yml` are now accepted. (#777)
- Degenerate bounding boxes (zero width or height) are now silently dropped before Albumentations validation instead of raising a `ValueError`. (#825)
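The degenerate-box filtering described above amounts to a simple predicate over `xyxy` boxes. A minimal sketch, with a hypothetical helper name:

```python
def drop_degenerate_boxes(boxes):
    """Sketch of the pre-Albumentations filter: silently drop xyxy boxes
    with zero (or negative) width or height instead of letting the
    downstream validator raise."""
    return [
        (x1, y1, x2, y2)
        for x1, y1, x2, y2 in boxes
        if x2 > x1 and y2 > y1
    ]

boxes = [(0, 0, 10, 10), (5, 5, 5, 9), (2, 3, 8, 3)]  # last two are degenerate
print(drop_degenerate_boxes(boxes))  # only the first box survives
```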
- Added peak GPU memory (`max_mem` in MB) to training and evaluation progress bars on CUDA; omitted on CPU and MPS. (#773)
- Fixed `aug_config` being silently ignored when training on YOLO-format datasets — `build_roboflow_from_yolo` never forwarded the value, so transforms always fell back to the default. (#774)
- Fixed segmentation evaluation metrics not being written to `results_mask.json` during validation and test runs. (#772)
- Fixed an `AttributeError` crash in `update_drop_path` when the DINOv2 backbone layer structure does not match any known pattern — `_get_backbone_encoder_layers` now returns `None` for unrecognised architectures. (#762)
- Fixed `drop_path_rate` not being forwarded to the DINOv2 model configuration; stochastic depth was never applied even when explicitly set. Added a warning when `drop_path_rate > 0.0` is used with a non-windowed backbone. (#762)
- Fixed incorrect COCO hierarchy filtering that excluded parent categories from the class list. (#759)
- Fixed evaluation metric corruption on 1-indexed Roboflow datasets caused by a flawed contiguity check in `_should_use_raw_category_ids`. (#755)
- Added support for nested Albumentations containers (`OneOf`, `Sequential`) inside `aug_config`. (#752)
- Migrated the dataset transform pipeline to torchvision-native `Compose`, `ToImage`, and `ToDtype`; `Normalize` now defaults to ImageNet mean/std. (#745)
- Fixed `RFDETRMedium` missing from the public API — `__all__` contained a duplicate `RFDETRSmall` entry. (#748)
- Fixed `AR50_90` reporting an incorrect value in `MetricsMLFlowSink` due to a wrong COCO evaluation index. (#735)
- Fixed supercategory filtering in `_load_classes` for COCO datasets with flat or mixed supercategory structures. (#744)
- Fixed a crash in geometric transforms when a sample contained zero-area or empty masks. (#727)
- Fixed segmentation training on Colab — `DepthwiseConvBlock` now disables cuDNN for depthwise separable convolutions. (#728)
- Pinned `onnxsim<0.6.0` to prevent `pip install` from hanging indefinitely. (#749)
- Added custom training augmentations via `aug_config` in `model.train()` — accepts a dict of Albumentations transforms, a built-in preset (`AUG_CONSERVATIVE`, `AUG_AGGRESSIVE`, `AUG_AERIAL`, `AUG_INDUSTRIAL`), or `{}` to disable. Bounding boxes and segmentation masks are transformed automatically. (#263, #702)
- Added `save_dataset_grids=True` in `TrainConfig` to write 3×3 JPEG grids of augmented samples to `output_dir` before training begins. (#153)
- Added a ClearML logger: set `clearml=True` in `TrainConfig` to stream per-epoch metrics to ClearML. (#520)
- Added an MLflow logger: set `mlflow=True` in `TrainConfig` to log runs and metrics to MLflow with custom tracking URI support. (#109)
- Added a live progress bar for training and validation with structured per-epoch logs. (#204)
- Added a `device` field to `TrainConfig` for explicit device selection. (#687)
- `ModelConfig` now raises an error on unknown parameters, preventing silent misconfiguration. (#196)
- Deprecated the `OPEN_SOURCE_MODELS` constant in favour of the `ModelWeights` enum. (#696)
- Added MD5 checksum validation for pretrained weight downloads. (#679)
- Fixed Albumentations bool-mask crash during segmentation training. (#706)
- Fixed an `UnboundLocalError` when resuming training from a completed checkpoint. (#707)
- Prevented corruption of `checkpoint_best_total.pth` via atomic checkpoint stripping. (#708)
- Fixed a PyTorch 2.9+ compatibility issue with CUDA capability detection. (#686)
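The atomic-stripping pattern above is the standard write-to-temp-then-`os.replace` idiom. A minimal sketch, with a hypothetical `strip_fn` hook standing in for the real checkpoint-stripping logic:

```python
import os
import tempfile

def strip_checkpoint_atomic(path, strip_fn):
    """Sketch: write the stripped checkpoint to a temp file in the same
    directory, then os.replace() it over the original, so readers never
    observe a half-written file even if the process dies mid-write."""
    with open(path, "rb") as f:
        stripped = strip_fn(f.read())
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(stripped)
        os.replace(tmp, path)  # atomic rename on POSIX and Windows
    except BaseException:
        os.remove(tmp)  # leave no temp debris on failure
        raise
```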
- Fixed a dtype mismatch error when `use_position_supervised_loss=True`. (#447)
- Fixed inconsistent return values from `build_model`. (#519)
- Fixed the `positional_encoding_size` type annotation (`bool` → `int`). (#524)
- Fixed ONNX export `output_names` to include masks when exporting segmentation models. (#402)
- Fixed `num_select` not being updated correctly during segmentation model fine-tuning. (#399)
- Fixed `np.argwhere` → `np.argmax` misuse. (#536)
- Fixed COCO sparse category ID remapping for non-contiguous or offset category IDs. (#712)
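The sparse category ID remapping fixed above follows a common pattern: map whatever IDs the COCO file uses onto a contiguous zero-based range. A minimal sketch, with a hypothetical function name:

```python
def remap_category_ids(category_ids):
    """Sketch of contiguous remapping for sparse or offset COCO category
    IDs, e.g. categories [1, 3, 7] become model indices {1: 0, 3: 1, 7: 2}."""
    return {cat: idx for idx, cat in enumerate(sorted(set(category_ids)))}

mapping = remap_category_ids([7, 1, 3, 7])
print(mapping)  # {1: 0, 3: 1, 7: 2}
```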
- Fixed segmentation mask filtering when using aggressive augmentations. (#717)
- Pretrained weight downloads now validate against an MD5 checksum to detect corrupted files. (#679)
- Fixed `deploy_to_roboflow` failing for segmentation model exports. (#578)
- Fixed a missing `info` key in the COCO export format. (#681)
- Added a `generate_coco_dataset()` utility for generating synthetic COCO-format datasets with configurable class counts, split ratios, and bounding box annotations. (#617)
- Added `run_test=False` to `TrainConfig` — skip test-split evaluation when your dataset has no test set. (#628)
- `model.predict()` now accepts image URLs directly — no need to download images before inference. (#629)
- Plus models (`RFDETRXLarge`, `RFDETR2XLarge`) are now distributed as a separate `rfdetr_plus` package under the Roboflow Model License. (#645)
- Fixed segmentation ONNX export failure. (#626)
- Added native YOLO dataset format support alongside COCO. (#74)
- Added a `--print-freq` CLI argument to control training log frequency. (#603)
- Pinned `transformers` to `<5.0.0` to prevent incompatibility with the transformers v5 API. (#599)
- Fixed a class count mismatch in `train_from_config` for Roboflow-uploaded datasets. (#588)
- Improved `num_classes` mismatch warning messages to be actionable rather than misleading. (#261)
- Fixed a CLI crash when specifying the `device` argument. (#246)
Headline release introducing new pre-trained model sizes — L, XL, and 2XL for object detection, and the full N/S/M/L/XL/2XL range for instance segmentation. Also added YOLO format training support, simplified the dependency footprint by removing several heavy packages (cython, fairscale, timm, einops, and others), and fixed per-class precision/recall/F1 computation. Drops Python 3.9 support.