# Affomernet

A transformer-based deep learning network for affordance detection.

## Purpose

Affomernet trains a transformer-based detector that produces both object bounding boxes and per-object affordance segmentation masks. For example, it detects a hammer with a bounding box and segments the handle region as a grasping affordance. The framework supports the COCO and IIT datasets, reports mAP for boxes and affordance masks, and can export models to ONNX for deployment.
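To make the "boxes plus affordance masks" output concrete, here is an illustrative sketch of the kind of per-detection result such a model produces. This is not the repo's actual API; the class and field names (`AffordanceDetection`, `label`, `box`, `affordances`) are assumptions for illustration only.

```python
# Illustrative sketch only -- field names are assumptions, not Affomernet's API.
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class AffordanceDetection:
    label: str                  # object class, e.g. "hammer"
    box: List[float]            # [x_min, y_min, x_max, y_max] in pixels
    score: float                # detection confidence in [0, 1]
    # affordance name -> binary mask (nested lists stand in for an H x W array)
    affordances: Dict[str, list] = field(default_factory=dict)

det = AffordanceDetection(
    label="hammer",
    box=[120.0, 80.0, 340.0, 210.0],
    score=0.92,
    affordances={"grasp": [[0, 1], [1, 1]]},
)
print(det.label, det.score)  # hammer 0.92
```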
## Table of Contents

- Purpose
- Prerequisites
- Setup
- Install
- Dataset Preparation
- Training
- Validation
- Export to ONNX
- Inference
- Custom Data Training
- Additional Training/Testing Scripts
- Deactivating the Virtual Environment
- Troubleshooting
## Prerequisites

Before setting up the project, ensure you have the following installed on your system:
- Ubuntu (latest version recommended)
- Git
- Anaconda or Miniconda (Download from Anaconda's official website)
- Download the Anaconda installer:

  ```bash
  wget https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh
  ```

- Install Anaconda:

  ```bash
  bash Anaconda3-2023.09-0-Linux-x86_64.sh
  ```

- Follow the prompts and accept the license terms. After installation, initialize Anaconda by either restarting your terminal or running:

  ```bash
  source ~/.bashrc
  ```

## Setup

Follow these steps to set up the project environment using Anaconda:
Create a new environment with Python 3.11:
conda create --prefix affomernet_env python=3.11conda activate affomernet_envCheck that you're using the correct Python version:
python --versionThe output should show Python 3.11.x.
## Install

With the conda environment activated, install the project dependencies:

```bash
pip install --upgrade pip
pip install -r requirements.txt
```

## Dataset Preparation

Download and extract COCO 2017 train and val images:
```bash
cd dataset/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
unzip train2017.zip
unzip val2017.zip
rm train2017.zip val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip annotations_trainval2017.zip
rm annotations_trainval2017.zip
```

Then set `img_folder` and `ann_file` in `coco_detection.yml` to point at the extracted images and annotations.
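The exact layout depends on the repo's configs, but in RT-DETR-style projects `coco_detection.yml` typically points the train and val dataloaders at the data roughly like this. The key names and paths below are assumptions matching the download steps above; match them against the actual file.

```yaml
# Sketch of the relevant coco_detection.yml fields (RT-DETR-style layout;
# key names are assumptions -- verify against the actual config file).
train_dataloader:
  dataset:
    img_folder: ./dataset/coco/train2017/
    ann_file: ./dataset/coco/annotations/instances_train2017.json

val_dataloader:
  dataset:
    img_folder: ./dataset/coco/val2017/
    ann_file: ./dataset/coco/annotations/instances_val2017.json
```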
## Training

Train on a single GPU:

```bash
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_iit.yml
```

Fine-tune from a previously trained model:

```bash
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_iit.yml -t output/rtdetr_r50vd_6x_coco/checkpoint0069.pth &> train.log 2>&1
```

Train on multiple GPUs:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
torchrun --nproc_per_node=4 tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_iit.yml
```

## Validation

Run evaluation only, from a checkpoint:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
torchrun --nproc_per_node=4 tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_iit.yml -r path/to/checkpoint --test-only
```

## Export to ONNX

```bash
python tools/export_onnx.py -c configs/rtdetr/rtdetr_r50vd_6x_iit.yml -r output/rtdetr_r50vd_6x_iit/checkpoint0071.pth --check
```

Example:
```bash
python tools/export_onnx.py -c configs/rtdetr/rtdetr_r101vd_6x_coco.yml -r output/rtdetr_r101vd_2x_coco_objects365_from_paddle.pth --check -f output/rtdetr_r101vd_coco_objects365.onnx
python tools/export_onnx.py -c configs/rtdetr/rtdetr_r50vd_6x_iit.yml -r output/rtdetr_r50vd_6x_iit/checkpoint0071.pth --check -f output/rtdetr_r50vd_6x_iit_1.onnx
```

## Inference

Run inference with an exported ONNX model:

```bash
python tools/export_onnx.py --inference --file-name output/model.onnx --image path/to/your/image.jpg
```

Example:

```bash
python tools/export_onnx.py --inference --file-name output/rtdetr_r101vd_coco_objects365.onnx --image dataset/coco/val2017/000000000139.jpg
python tools/export_onnx.py --inference --dataset iit --file-name output/iit_model01.onnx --image dataset/iit/data/VOCdevkit2012/VOC2012/JPEGImages/0.jpg
```

## Custom Data Training

- Set `remap_mscoco_category: False` in the config file
- Modify `mscoco_category2name` based on your dataset if needed
- Add `-t path/to/checkpoint` (optional) to fine-tune from a pretrained checkpoint
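For context, RT-DETR-style code keeps these category mappings as plain Python dictionaries. A replacement for a small custom label set might look like the sketch below; the category IDs and affordance names are made up for illustration and must match the `category_id` values in your own COCO-format annotation file.

```python
# Hypothetical custom replacement for mscoco_category2name -- the IDs must
# match the category ids in your COCO-format annotation file.
custom_category2name = {
    1: "grasp",
    2: "cut",
    3: "contain",
    4: "pound",
}

# Derived lookups training code typically needs: contiguous labels for the
# classification head, plus the reverse mapping for visualization.
category2label = {cat_id: i for i, cat_id in enumerate(custom_category2name)}
label2category = {label: cat_id for cat_id, label in category2label.items()}

print(category2label)  # {1: 0, 2: 1, 3: 2, 4: 3}
```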
## Additional Training/Testing Scripts

```bash
# Train on multiple GPUs
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nproc_per_node=4 --master-port=8989 tools/train.py -c path/to/config &> train.log 2>&1 &

# Additional options
-r path/to/checkpoint   # Resume from checkpoint
--amp                   # Use Automatic Mixed Precision
--test-only             # Run evaluation only

# Fine-tuning example
torchrun --master_port=8844 --nproc_per_node=4 tools/train.py -c configs/rtdetr/rtdetr_r18vd_6x_coco.yml -t https://github.com/lyuwenyu/storage/releases/download/v0.1/rtdetr_r18vd_5x_coco_objects365_from_paddle.pth
```

## Deactivating the Virtual Environment

Once you are done working in the virtual environment, you can deactivate it to return to your system's default Python environment.
If you are using the conda environment created above, run `conda deactivate`. If you activated a venv-style virtual environment manually, simply run:

```bash
deactivate
```

Example:

```
(affomernet_env) user@machine:~/affomernet$ deactivate
user@machine:~/affomernet$
```

A virtual environment activated via pyenv is tied to your shell session. It will automatically deactivate when you:
- Close the terminal
- Navigate away from the project directory (if using pyenv local)
If you want to stop using the virtual environment without deactivating it manually, you can remove the `.python-version` file from your project directory:

```bash
cd /path/to/your/affomernet
rm .python-version
```

This will revert to the global Python version set by pyenv.
## Troubleshooting

If you encounter issues, here are some common troubleshooting steps:

- **Lingering training processes**: Terminate any lingering training processes:

  ```bash
  ps aux | grep "tools/train.py" | awk '{print $2}' | xargs kill -9
  ```

- **Capturing training logs**: Append `&> train.log 2>&1 &` (run in the background) or `&> train.log 2>&1` (run in the foreground) to your command. Example:

  ```bash
  python tools/train.py -c configs/rtdetr/rtdetr_r50vd_6x_iit.yml &> train.log 2>&1 &
  ```

- **`deactivate` Command Not Found**: Ensure that the virtual environment is activated properly using `pyenv activate affomernet_env`
- **Persistent Activation**: Check your shell configuration files (e.g., `~/.bashrc`, `~/.zshrc`) for any lines that automatically activate a virtual environment and remove or comment them out if necessary
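If the `ps`/`grep` pipeline gets fiddly, the same lookup can be done with Python's standard library. A minimal sketch (Linux-only, since it scans `/proc`; the function name is ours, not part of the repo):

```python
# Pure-Python alternative to the ps | grep | awk pipeline: list PIDs of
# lingering tools/train.py processes by scanning /proc (Linux-only).
import os

def find_train_pids(pattern="tools/train.py"):
    pids = []
    if not os.path.isdir("/proc"):
        return pids  # non-Linux system: nothing to scan
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\x00", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited while we were scanning
        if pattern in cmdline:
            pids.append(int(entry))
    return pids

if __name__ == "__main__":
    for pid in find_train_pids():
        print(f"lingering trainer: {pid}")  # kill with os.kill(pid, signal.SIGKILL)
```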