[Research Project] MOEB: Massive Omni Embedding Benchmark #4842

Description

@AdnanElAssadi56

Goals

  • Expand Scope: Unify and expand embedding evaluation across text, image, audio, and video, including cross-modal settings and new applications.
  • Improve Quality: Measure capabilities better through improved datasets, robustness evaluations, and harder benchmarks.
  • Increase Efficiency: Develop faster and more informative evaluation methodologies and infrastructure.
  • Strengthen Governance: Improve maintainability, reproducibility, trust, and open benchmark practices.

Tracks

1. Modality Expansion (MOEB)

  • Unify MMTEB, MIEB, MAEB, and MVEB under an omni-modal benchmark
  • Add important missing models and datasets to MTEB, MIEB, MAEB, and MVEB
  • Expand cross-modal evaluations for missing combinations

2. Quality

  • Refresh saturated benchmarks.
  • Improve dataset quality and filtering.
  • Build harder and more robust evaluations.
  • Contamination: autodetect dataset contamination #1636
  • Better measurement: BrowseComp, MTEB-gym
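
To make the contamination item concrete, here is a minimal sketch of one common heuristic: flag a benchmark sample when it shares a word n-gram with a reference corpus. This is an illustration only and not necessarily the approach discussed in #1636. The function names `ngrams` and `contamination_rate`, the whitespace tokenization, and the default `n=8` are all assumptions made for this example.

```python
from typing import Iterable, Set, Tuple


def ngrams(text: str, n: int = 8) -> Set[Tuple[str, ...]]:
    """Return the set of lowercased whitespace-token n-grams in `text`."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_rate(
    benchmark_texts: Iterable[str],
    corpus_texts: Iterable[str],
    n: int = 8,
) -> float:
    """Fraction of benchmark samples sharing at least one n-gram with the corpus.

    Hypothetical helper: a real detector would likely need better
    tokenization, deduplication, and near-duplicate matching.
    """
    corpus_ngrams: Set[Tuple[str, ...]] = set()
    for doc in corpus_texts:
        corpus_ngrams |= ngrams(doc, n)

    samples = list(benchmark_texts)
    if not samples:
        return 0.0
    flagged = sum(1 for sample in samples if ngrams(sample, n) & corpus_ngrams)
    return flagged / len(samples)


# Toy data, invented for illustration.
benchmark = [
    "the quick brown fox jumps over the lazy dog",
    "completely unrelated sentence about embeddings here",
]
corpus = ["he said the quick brown fox jumps over the lazy dog yesterday"]
rate = contamination_rate(benchmark, corpus, n=4)
```

Exact n-gram overlap is cheap and easy to audit, but it misses paraphrased or translated leaks, so it would at best be a first-pass filter.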

3. Efficiency & Methodology

  • Faster evaluation pipelines and infrastructure.
  • Informative task selection.
  • Benchmark compression.
  • IRT and optimal experiment design.
  • Partial-score estimation.
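
As a rough illustration of how IRT could drive informative task selection, the sketch below uses the standard two-parameter logistic (2PL) model: an item with discrimination `a` and difficulty `b` is solved by a model of ability `theta` with probability `sigmoid(a * (theta - b))`, and its Fisher information at `theta` is `a^2 * p * (1 - p)`. Picking the items with the highest information is one simple selection rule. The item names and parameter values are made up for this example, and the helper names are hypothetical, not part of MTEB.

```python
import math
from typing import Dict, List, Tuple


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL IRT: probability that a model with ability `theta` solves the item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_information(theta: float, a: float, b: float) -> float:
    """Fisher information of a 2PL item at ability `theta`."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def most_informative(
    items: Dict[str, Tuple[float, float]], theta: float, k: int
) -> List[str]:
    """Return the names of the `k` items with the highest information at `theta`."""
    ranked = sorted(
        items,
        key=lambda name: fisher_information(theta, *items[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical items: name -> (discrimination a, difficulty b).
items = {
    "easy": (1.0, -2.0),
    "matched": (1.0, 0.0),
    "hard": (1.0, 3.0),
    "sharp": (2.0, 0.5),
}
selected = most_informative(items, theta=0.0, k=2)
```

Items far from a model's ability level (too easy or too hard) carry little information, which is the intuition behind compressing a benchmark to a smaller, better-targeted subset. In practice the item parameters would have to be fitted from existing model results first.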

4. Governance

5. Human Annotation Baselines (Optional)

Ideas to be Built on Top of MOEB

  • AutoResearch with Sentence Transformers.
  • Explore new domains (e.g. RAG, agents).
