Dataset Preparation

Scripts in datasets/ follow the conventions of Detectron2 and prior open-vocabulary segmentation work. A dataset can be used by accessing DatasetCatalog for its data, or MetadataCatalog for its metadata (class names, etc.). This document explains how to set up the builtin datasets so they can be used by these APIs. The "Use Custom Datasets" tutorial gives a deeper dive into how to use DatasetCatalog and MetadataCatalog, and how to add new datasets to them.

Environment Variables

Create a new directory, data, to store all the datasets, and set it as your dataset root via DETECTRON2_DATASETS before training or evaluation:

export DETECTRON2_DATASETS=/path/to/datasets
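For reference, Detectron2's builtin dataset registration reads this variable and falls back to a local datasets directory when it is unset. The sketch below mirrors that lookup so you can confirm which root will be used; resolve_dataset_root is an illustrative helper name, not part of any library.

```python
import os


def resolve_dataset_root(default: str = "datasets") -> str:
    """Return the dataset root the way Detectron2-style loaders look it up.

    resolve_dataset_root is a hypothetical helper for illustration only.
    """
    root = os.getenv("DETECTRON2_DATASETS", default)
    # Expand a leading "~" so paths such as "~/data" resolve correctly.
    return os.path.expanduser(root)


print(resolve_dataset_root())
```

Run it in the same shell session that launches training to make sure the export took effect.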

Training Dataset

COCO (Panoptic + RefCOCOg)

Our setup follows the instructions from X-Decoder and Mask2Former.

Prepare panoptic_train2017 and panoptic_semseg_train2017 exactly as in Mask2Former.

coco/
  annotations/
    instances_{train,val}2017.json
    panoptic_{train,val}2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
  panoptic_{train,val}2017/  # png annotations
  panoptic_semseg_{train,val}2017/  # generated by the script mentioned below
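A quick way to confirm the layout before training is to check each expected entry. The sketch below is a minimal example, assuming the coco/ directory sits under your dataset root; missing_coco_paths and EXPECTED_COCO_PATHS are illustrative names, and the list simply expands the {train,val} patterns from the tree above.

```python
from pathlib import Path

# Entries expanded from the tree above, relative to the coco/ directory.
EXPECTED_COCO_PATHS = [
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "annotations/panoptic_train2017.json",
    "annotations/panoptic_val2017.json",
    "train2017",
    "val2017",
    "panoptic_train2017",
    "panoptic_val2017",
    "panoptic_semseg_train2017",
    "panoptic_semseg_val2017",
]


def missing_coco_paths(coco_root):
    """Return the expected entries that do not exist under coco_root."""
    root = Path(coco_root)
    return [p for p in EXPECTED_COCO_PATHS if not (root / p).exists()]
```

An empty return value means every listed file and directory is present; it does not verify the contents of those files.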

Install panopticapi:

pip install git+https://github.com/cocodataset/panopticapi.git

Then run python datasets/prepare_coco_semantic_annos_from_panoptic_annos.py from the MaskFormer repo to extract semantic annotations from the panoptic annotations (only used for evaluation).
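To illustrate what this extraction does conceptually: COCO panoptic PNGs encode a segment id in the RGB channels (id = R + 256 * G + 256^2 * B), and the panoptic JSON lists a category_id for each segment id. The sketch below is a simplified pure-Python illustration of that mapping, not the script itself; the actual script works on whole images and may also remap category ids, which this sketch leaves unchanged. The function names and the ignore_label default are assumptions made for the example.

```python
def rgb_to_segment_id(rgb):
    """Decode a COCO panoptic PNG pixel: id = R + 256 * G + 256**2 * B."""
    r, g, b = rgb
    return r + 256 * g + 256 * 256 * b


def panoptic_to_semantic(pixels, segments_info, ignore_label=255):
    """Map each RGB pixel to the category_id of the segment it belongs to.

    pixels: rows of (R, G, B) tuples read from a panoptic PNG.
    segments_info: the per-image list from the panoptic JSON, whose entries
        carry "id" and "category_id".
    Pixels whose segment id is not listed receive ignore_label (an assumed default).
    """
    id_to_category = {seg["id"]: seg["category_id"] for seg in segments_info}
    return [
        [id_to_category.get(rgb_to_segment_id(px), ignore_label) for px in row]
        for row in pixels
    ]


segments = [{"id": 259, "category_id": 17}]
pixels = [[(3, 1, 0), (0, 0, 0)]]
print(panoptic_to_semantic(pixels, segments))  # [[17, 255]]
```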

Download additional annotations and put them inside coco/annotations/:

# coco panoptic
wget https://huggingface.co/xdecoder/X-Decoder/resolve/main/panoptic_train2017_filtrefgumdval_filtvlp.json
# refcocog valid
wget https://huggingface.co/xdecoder/X-Decoder/resolve/main/refcocog_umd_val.json
# refcocog train: download manually from Google Drive
# https://drive.google.com/file/d/1DQhgTo7B4E-8IIh5fGOlrQVmh4x14mJL/view?usp=sharing
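If you prefer to script the two direct downloads, the following sketch fetches them into coco/annotations/ and skips files that already exist. It uses only the standard library; download_missing and its fetch parameter are illustrative names, and the Google Drive file is left out because it has to be downloaded manually.

```python
import urllib.request
from pathlib import Path

# URLs copied from the commands above.
ANNOTATION_URLS = [
    "https://huggingface.co/xdecoder/X-Decoder/resolve/main/panoptic_train2017_filtrefgumdval_filtvlp.json",
    "https://huggingface.co/xdecoder/X-Decoder/resolve/main/refcocog_umd_val.json",
]


def download_missing(annotation_dir, fetch=urllib.request.urlretrieve):
    """Fetch each annotation file into annotation_dir unless it already exists.

    Returns the names of the files fetched during this call.
    """
    target_dir = Path(annotation_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for url in ANNOTATION_URLS:
        dest = target_dir / url.rsplit("/", 1)[-1]
        if not dest.exists():
            fetch(url, str(dest))
            fetched.append(dest.name)
    return fetched
```

Call it as download_missing("/path/to/datasets/coco/annotations"), adjusting the path to your own dataset root.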

Evaluation Datasets

ADE20K-150

Our setup follows the instructions from Mask2Former. The scripts mentioned below can be found in the Mask2Former repo.

Expected dataset structure for ADE20K:

ADEChallengeData2016/
  images/
  annotations/
  objectInfo150.txt
  # download instance annotation
  annotations_instance/
  # generated by prepare_ade20k_sem_seg.py
  annotations_detectron2/
  # below are generated by prepare_ade20k_pan_seg.py
  ade20k_panoptic_{train,val}.json
  ade20k_panoptic_{train,val}/
  # below are generated by prepare_ade20k_ins_seg.py
  ade20k_instance_{train,val}.json

The directory annotations_detectron2 is generated by running python datasets/prepare_ade20k_sem_seg.py.

Download the instance annotation from http://sceneparsing.csail.mit.edu/:

wget http://sceneparsing.csail.mit.edu/data/ChallengeData2017/annotations_instance.tar

Then run python datasets/prepare_ade20k_pan_seg.py to combine the semantic and instance annotations into panoptic annotations.

Finally, run python datasets/prepare_ade20k_ins_seg.py to extract instance annotations in COCO format.
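Since the three scripts each produce specific outputs, a small check can tell you which steps are still pending. This is a sketch under the assumption that the outputs land exactly where the tree above shows them; ADE20K_STEPS and pending_ade20k_scripts are illustrative names.

```python
from pathlib import Path

# Each script paired with the outputs attributed to it in the tree above.
ADE20K_STEPS = [
    ("datasets/prepare_ade20k_sem_seg.py", ["annotations_detectron2"]),
    ("datasets/prepare_ade20k_pan_seg.py", [
        "ade20k_panoptic_train.json",
        "ade20k_panoptic_val.json",
        "ade20k_panoptic_train",
        "ade20k_panoptic_val",
    ]),
    ("datasets/prepare_ade20k_ins_seg.py", [
        "ade20k_instance_train.json",
        "ade20k_instance_val.json",
    ]),
]


def pending_ade20k_scripts(ade_root):
    """Return, in run order, the scripts whose outputs are not all present."""
    root = Path(ade_root)
    return [
        script
        for script, outputs in ADE20K_STEPS
        if not all((root / out).exists() for out in outputs)
    ]
```

Pass the path to ADEChallengeData2016/; an empty list means all generated files and directories exist.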

ADE20K full

Our setup follows the instructions from OV-Seg.

Download the dataset from Kaggle: https://www.kaggle.com/datasets/sssunyy/ade20k/data

Expected dataset structure for ADE20K-Full (ADE20K-847):

ADE20K_2021_17_01/
  images/
  index_ade20k.pkl
  objects.txt
  # below are generated
  images_detectron2/
  annotations_detectron2/

The directories images_detectron2 and annotations_detectron2 are generated by running python datasets/prepare_ade20k_full_sem_seg.py (script can be found in OV-Seg).

PASCAL-Context (PC459 / PC59) and VOC 2012

Our setup follows the instructions from APE.

Obtain VOC2010 and VOC2012 from the Pascal VOC website.

Expected dataset structure for PC459 and PC59:

$DETECTRON2_DATASETS/
  VOCdevkit/
    VOC2010/
      Annotations/
      ImageSets/
      JPEGImages/
      SegmentationClass/
      SegmentationObject/
      # below are from https://www.cs.stanford.edu/~roozbeh/pascal-context/trainval.tar.gz
      trainval/
      labels.txt
      59_labels.txt # https://www.cs.stanford.edu/~roozbeh/pascal-context/59_labels.txt
      pascalcontext_val.txt # https://drive.google.com/file/d/1BCbiOKtLvozjVnlTJX51koIveUZHCcUh/view?usp=sharing
      # below are generated
      annotations_detectron2/
        pc459_val/
        pc59_val/

Start from the tar file VOCtrainval_03-May-2010.tar and extract it with tar xf VOCtrainval_03-May-2010.tar. You may also want to download the 5K validation set.
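The extraction step can also be done from Python, which is convenient when preparing several archives in one script. A minimal sketch using the standard tarfile module; extract_archive is an illustrative helper name.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a tar archive into dest_dir, like running `tar xf` inside dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path) as archive:
        archive.extractall(dest)
    return dest
```

For example, extract_archive("VOCtrainval_03-May-2010.tar", "/path/to/datasets") should produce the VOCdevkit/ tree shown above, assuming the archive's top-level directory is VOCdevkit.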

The directory annotations_detectron2 is generated by running the following script from APE:

python datasets/prepare_pascal_context.py

Expected dataset structure for VOC:

$DETECTRON2_DATASETS/
  VOCdevkit/
    VOC2012/
      Annotations/
      ImageSets/
      JPEGImages/
      SegmentationClass/
      SegmentationObject/
      SegmentationClassAug/ # https://github.com/kazuto1011/deeplab-pytorch/blob/master/data/datasets/voc12/README.md
      # below are generated
      images_detectron2/
      annotations_detectron2/
        val/

Start from the tar file VOCtrainval_11-May-2012.tar.

The directories images_detectron2 and annotations_detectron2 are generated by running the following script from APE:

python datasets/prepare_voc_sem_seg.py

SUN RGB-D (SUN-37)

Follow https://github.com/chrischoy/SUN_RGBD to download and extract the dataset:

wget http://cvgl.stanford.edu/data2/sun_rgbd.tgz
tar -xzf sun_rgbd.tgz

ScanNet

Use the official download script: https://kaldir.vc.in.tum.de/scannet/download-scannet.py

Convert to EfficientPS format following https://github.com/TUTvision/ScanNet-EfficientPS and run:

python tools/scannet_train_val_to_efficientps.py \
    -s /path/to/scannet_frames_25k \
    -t /path/to/ScanNet/Tasks/Benchmark/scannetv2_train.txt \
    -v /path/to/ScanNet/Tasks/Benchmark/scannetv2_val.txt \
    -o /path/to/output \
    -sc /path/to/ScanNet \
    -pn /path/to/panopticapi
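Because the command takes six paths, it can help to assemble it programmatically and print it before running. The sketch below only builds the argument list; the flags are copied from the command above, while the function and parameter names are illustrative guesses at what each path refers to.

```python
import shlex


def build_scannet_convert_command(frames, train_list, val_list, output,
                                  scannet_root, panopticapi):
    """Assemble the converter invocation shown above as an argument list.

    Parameter names are hypothetical; only the flags come from the command above.
    """
    return [
        "python", "tools/scannet_train_val_to_efficientps.py",
        "-s", str(frames),
        "-t", str(train_list),
        "-v", str(val_list),
        "-o", str(output),
        "-sc", str(scannet_root),
        "-pn", str(panopticapi),
    ]


cmd = build_scannet_convert_command(
    "/path/to/scannet_frames_25k",
    "/path/to/ScanNet/Tasks/Benchmark/scannetv2_train.txt",
    "/path/to/ScanNet/Tasks/Benchmark/scannetv2_val.txt",
    "/path/to/output",
    "/path/to/ScanNet",
    "/path/to/panopticapi",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from inside the ScanNet-EfficientPS checkout.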