Supplementary Materials for "Adapting Vision-Language Models for Evaluating World Models"

This repository contains all supplementary assets submitted with the paper, including code, human annotation study materials, and links to data used in support of the experiments and findings reported in the main manuscript.

Directory Structure

├── code/                       # Source code
├── dataset/README-dataset.md   # Dataset description (dataset itself hosted externally)
├── human-annotation-study/     # Human annotation study materials
├── LICENSE.txt                 # Licensing terms for included assets
└── README.md                   # This file
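
After extracting the supplementary archive, the layout above can be sanity-checked with a short script. The following is a minimal sketch and not part of the released codebase: the function name `missing_entries` and the constant `EXPECTED_ENTRIES` are hypothetical, and the entry list is copied from the tree shown above.

```python
from pathlib import Path

# Entries copied from the directory tree above; a trailing "/" marks a directory.
# (Illustrative list only; this script is not shipped with the repository.)
EXPECTED_ENTRIES = [
    "code/",
    "dataset/README-dataset.md",
    "human-annotation-study/",
    "LICENSE.txt",
    "README.md",
]


def missing_entries(root):
    """Return the expected entries that are absent under `root`."""
    root = Path(root)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry
        # Directories and files are checked with the matching predicate.
        found = path.is_dir() if entry.endswith("/") else path.is_file()
        if not found:
            missing.append(entry)
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the root of the extracted archive.
    for entry in missing_entries("."):
        print(f"missing: {entry}")
```

An empty result means every top-level entry listed in the tree is present; the externally hosted dataset and rollouts are not covered by this check.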

Code

We release the full codebase used to run all experiments in the paper, including training and evaluation of UNIVERSE, as well as the code used to obtain baseline results—both zero-shot and fine-tuned—for PaliGemma, VideoLLaMA3, and CLIP. The codebase includes configuration files, data loaders, training and evaluation scripts, and supporting utilities.

For full usage instructions, refer to code/README-code.md.

Dataset

We release a subset of our evaluation dataset, curated from realistic human gameplay in a complex, multi-agent game environment.

Details on file structure and data formats are provided in dataset/README-dataset.md.

Note: Due to file size limitations, the dataset is hosted externally on Google Drive using a burner email account: link.

Human Annotation Study

This directory contains rollouts used in our human annotation study, designed to assess the fine-grained evaluation accuracy of UNIVERSE on rollouts generated by world models. We provide a total of 656 rollouts, generated by two world models across seven diverse environments.

The annotation scores are currently under internal review and will be released upon approval. For generation protocol and data breakdown, refer to human-annotation-study/README-human-annotation-study.md.

Note: Due to file size limitations, the rollouts are hosted externally on Google Drive using a burner email account: link.

License and Availability

All materials are included in the supplementary ZIP file submitted with the paper and will be publicly released upon publication.
