Add omni embedding results (new tasks): BidirLM, LCO-Omni 3B/7B #586

Merged
Samoed merged 1 commit into main from add-omni-results-bidirlm-lco on Jul 2, 2026

Conversation

@AdnanElAssadi56

Contributor

Checklist

  • My model has a model sheet, report, or similar
  • My model has a reference implementation in mteb/models/model_implementations/; this can be an API. Instructions on how to add a model can be found here
    • No, but there is an existing PR ___
  • The results submitted are obtained using the reference implementation
  • My model is available, either as a publicly accessible API or publicly on, e.g., Hugging Face
  • I solemnly swear that for all results submitted I have not trained on the evaluation dataset including training splits. If I have, I have disclosed it clearly.

@github-actions

github-actions Bot commented Jul 2, 2026

Model Results Comparison

Reference models: intfloat/multilingual-e5-large, google/gemini-embedding-001
New models evaluated: BidirLM/BidirLM-Omni-2.5B-Embedding, LCO-Embedding/LCO-Embedding-Omni-3B, LCO-Embedding/LCO-Embedding-Omni-7B

Results for BidirLM/BidirLM-Omni-2.5B-Embedding

| Task | BidirLM/BidirLM-Omni-2.5B-Embedding | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|---|
| AskUbuntuDupQuestions | .633 | .642 | .592 | .753 | IEITYuan/Yuan-embedding-2.0-en | False |
| BIOSSES | .872 | .890 | .846 | .969 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| Banking77Classification | .850 | .943 | .749 | .943 | google/gemini-embedding-001 | False |
| CQADupstackGamingRetrieval | .635 | .707 | .587 | .816 | IEITYuan/Yuan-embedding-2.0-en | False |
| CQADupstackUnixRetrieval | .503 | .537 | .399 | .720 | voyageai/voyage-3-m-exp | False |
| ClimateFEVERHardNegatives | .235 | .311 | .260 | .591 | IEITYuan/Yuan-embedding-2.0-en | False |
| FEVERHardNegatives | .762 | .890 | .838 | .945 | ByteDance-Seed/Seed1.5-Embedding | False |
| FiQA2018 | .493 | .618 | .438 | .821 | ai-sage/Giga-Embeddings-instruct | False |
| HotpotQAHardNegatives | .654 | .870 | .706 | .870 | google/gemini-embedding-001 | False |
| ImdbClassification | .931 | .950 | .887 | .974 | Qwen/Qwen3-Embedding-8B | False |
| MTOPDomainClassification | .931 | .980 | .902 | 1.000 | voyageai/voyage-3-m-exp | False |
| MassiveScenarioClassification | .725 | .873 | .651 | .993 | voyageai/voyage-3-m-exp | False |
| MedrxivClusteringS2S.v2 | .366 | .450 | .315 | .702 | codefuse-ai/F2LLM-4B | False |
| MindSmallReranking | .323 | .329 | .302 | .344 | Kingsoft-LLM/QZhou-Embedding | False |
| StackExchangeClusteringP2P.v2 | .422 | .509 | .385 | .551 | Kingsoft-LLM/QZhou-Embedding | False |
| SummEvalSummarization.v2 | .301 | .383 | .314 | .389 | annamodels/LGAI-Embedding-Preview | False |
| Touche2020Retrieval.v3 | .613 | .524 | .496 | .762 | jcorners/ingot-8b-r3 | False |
| TweetSentimentExtractionClassification | .666 | .699 | .628 | .882 | voyageai/voyage-3-m-exp | False |
| TwentyNewsgroupsClustering.v2 | .487 | .574 | .392 | .876 | GeoGPT-Research-Project/GeoEmbedding | False |
| TwitterSemEval2015 | .742 | .792 | .753 | .901 | jcorners/ingot-8b-r3 | False |
| WorldQAVideoAudioCentricQA | .380 | nan | nan | .539 | Haon-Chen/e5-omni-7B | False |
| Average | .596 | .673 | .572 | .778 | nan | - |
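
The Average row mixes columns with and without `nan` entries (the reference models have no score for WorldQAVideoAudioCentricQA). A minimal sketch of a per-column average that skips `nan`, assuming that is how missing scores are handled; the `nan_mean` helper is hypothetical and is not taken from the bot's code:

```python
import math


def nan_mean(scores):
    """Mean of the entries that are real numbers; NaN entries are skipped."""
    valid = [s for s in scores if not math.isnan(s)]
    return sum(valid) / len(valid) if valid else math.nan


# Three rows copied from the table above: (task, new model, gemini reference).
rows = [
    ("TwitterSemEval2015", 0.742, 0.792),
    ("Touche2020Retrieval.v3", 0.613, 0.524),
    ("WorldQAVideoAudioCentricQA", 0.380, math.nan),
]

new_avg = nan_mean([r[1] for r in rows])  # averaged over all three tasks
ref_avg = nan_mean([r[2] for r in rows])  # averaged over the two scored tasks
```

With this convention the new model and the reference models are averaged over different task sets whenever a reference score is missing, so the averages are not strictly comparable.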

Results for LCO-Embedding/LCO-Embedding-Omni-3B

| Task | LCO-Embedding/LCO-Embedding-Omni-3B | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|---|
| AmazonCounterfactualClassification | .784 | .882 | .697 | .970 | GeoGPT-Research-Project/GeoEmbedding | False |
| ArXivHierarchicalClusteringP2P | .580 | .649 | .557 | .687 | NovaSearch/jasper_en_vision_language_v1 | False |
| ArXivHierarchicalClusteringS2S | .552 | .638 | .537 | .655 | Qwen/Qwen3-Embedding-8B | False |
| ArguAna | .419 | .864 | .544 | .898 | voyageai/voyage-3-m-exp | False |
| AskUbuntuDupQuestions | .601 | .642 | .592 | .753 | IEITYuan/Yuan-embedding-2.0-en | False |
| BIOSSES | .818 | .890 | .846 | .969 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| Banking77Classification | .825 | .943 | .749 | .943 | google/gemini-embedding-001 | False |
| BiorxivClusteringP2P.v2 | .395 | .539 | .372 | .842 | codefuse-ai/F2LLM-4B | False |
| CQADupstackGamingRetrieval | .545 | .707 | .587 | .816 | IEITYuan/Yuan-embedding-2.0-en | False |
| CQADupstackUnixRetrieval | .395 | .537 | .399 | .720 | voyageai/voyage-3-m-exp | False |
| ClimateFEVERHardNegatives | .231 | .311 | .260 | .591 | IEITYuan/Yuan-embedding-2.0-en | False |
| FEVERHardNegatives | .567 | .890 | .838 | .945 | ByteDance-Seed/Seed1.5-Embedding | False |
| FiQA2018 | .345 | .618 | .438 | .821 | ai-sage/Giga-Embeddings-instruct | False |
| HotpotQAHardNegatives | .549 | .870 | .706 | .870 | google/gemini-embedding-001 | False |
| ImdbClassification | .881 | .950 | .887 | .974 | Qwen/Qwen3-Embedding-8B | False |
| MTOPDomainClassification | .897 | .980 | .902 | 1.000 | voyageai/voyage-3-m-exp | False |
| MassiveIntentClassification | .556 | .819 | .602 | .919 | voyageai/voyage-3-m-exp | False |
| MassiveScenarioClassification | .623 | .873 | .651 | .993 | voyageai/voyage-3-m-exp | False |
| MedrxivClusteringP2P.v2 | .342 | .472 | .343 | .720 | codefuse-ai/F2LLM-4B | False |
| MedrxivClusteringS2S.v2 | .326 | .450 | .315 | .702 | codefuse-ai/F2LLM-4B | False |
| MindSmallReranking | .302 | .329 | .302 | .344 | Kingsoft-LLM/QZhou-Embedding | False |
| SCIDOCS | .183 | .252 | .174 | .599 | IEITYuan/Yuan-embedding-2.0-en | False |
| SICK-R | .828 | .827 | .802 | .947 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS12 | .789 | .815 | .800 | .955 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS13 | .861 | .899 | .816 | .978 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS14 | .822 | .854 | .777 | .975 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS15 | .881 | .904 | .893 | .981 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS17 | .841 | .886 | .821 | .957 | jcorners/ingot-8b-r3 | False |
| STS22.v2 | .659 | .717 | .643 | .772 | Kingsoft-LLM/QZhou-Embedding | False |
| STSBenchmark | .878 | .891 | .873 | .950 | Kingsoft-LLM/QZhou-Embedding | False |
| SprintDuplicateQuestions | .912 | .969 | .931 | .984 | Kingsoft-LLM/QZhou-Embedding | False |
| StackExchangeClustering.v2 | .574 | .921 | .464 | .921 | google/gemini-embedding-001 | False |
| StackExchangeClusteringP2P.v2 | .391 | .509 | .385 | .551 | Kingsoft-LLM/QZhou-Embedding | False |
| SummEvalSummarization.v2 | .370 | .383 | .314 | .389 | annamodels/LGAI-Embedding-Preview | False |
| TRECCOVID | .663 | .863 | .712 | .983 | IEITYuan/Yuan-embedding-2.0-en | False |
| Touche2020Retrieval.v3 | .482 | .524 | .496 | .762 | jcorners/ingot-8b-r3 | False |
| ToxicConversationsClassification | .695 | .887 | .660 | .976 | voyageai/voyage-3-m-exp | False |
| TweetSentimentExtractionClassification | .646 | .699 | .628 | .882 | voyageai/voyage-3-m-exp | False |
| TwentyNewsgroupsClustering.v2 | .445 | .574 | .392 | .876 | GeoGPT-Research-Project/GeoEmbedding | False |
| TwitterSemEval2015 | .749 | .792 | .753 | .901 | jcorners/ingot-8b-r3 | False |
| TwitterURLCorpus | .868 | .870 | .858 | .957 | TencentBAC/Conan-embedding-v2 | False |
| Average | .611 | .729 | .618 | .840 | nan | - |

Results for LCO-Embedding/LCO-Embedding-Omni-7B

| Task | LCO-Embedding/LCO-Embedding-Omni-7B | google/gemini-embedding-001 | intfloat/multilingual-e5-large | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|---|
| AmazonCounterfactualClassification | .789 | .882 | .697 | .970 | GeoGPT-Research-Project/GeoEmbedding | False |
| ArXivHierarchicalClusteringP2P | .580 | .649 | .557 | .687 | NovaSearch/jasper_en_vision_language_v1 | False |
| ArXivHierarchicalClusteringS2S | .555 | .638 | .537 | .655 | Qwen/Qwen3-Embedding-8B | False |
| ArguAna | .396 | .864 | .544 | .898 | voyageai/voyage-3-m-exp | False |
| AskUbuntuDupQuestions | .629 | .642 | .592 | .753 | IEITYuan/Yuan-embedding-2.0-en | False |
| BIOSSES | .827 | .890 | .846 | .969 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| Banking77Classification | .835 | .943 | .749 | .943 | google/gemini-embedding-001 | False |
| BiorxivClusteringP2P.v2 | .407 | .539 | .372 | .842 | codefuse-ai/F2LLM-4B | False |
| CQADupstackGamingRetrieval | .575 | .707 | .587 | .816 | IEITYuan/Yuan-embedding-2.0-en | False |
| CQADupstackUnixRetrieval | .456 | .537 | .399 | .720 | voyageai/voyage-3-m-exp | False |
| ClimateFEVERHardNegatives | .207 | .311 | .260 | .591 | IEITYuan/Yuan-embedding-2.0-en | False |
| FEVERHardNegatives | .586 | .890 | .838 | .945 | ByteDance-Seed/Seed1.5-Embedding | False |
| FiQA2018 | .386 | .618 | .438 | .821 | ai-sage/Giga-Embeddings-instruct | False |
| HotpotQAHardNegatives | .592 | .870 | .706 | .870 | google/gemini-embedding-001 | False |
| ImdbClassification | .911 | .950 | .887 | .974 | Qwen/Qwen3-Embedding-8B | False |
| MTOPDomainClassification | .926 | .980 | .902 | 1.000 | voyageai/voyage-3-m-exp | False |
| MassiveIntentClassification | .615 | .819 | .602 | .919 | voyageai/voyage-3-m-exp | False |
| MassiveScenarioClassification | .676 | .873 | .651 | .993 | voyageai/voyage-3-m-exp | False |
| MedrxivClusteringP2P.v2 | .361 | .472 | .343 | .720 | codefuse-ai/F2LLM-4B | False |
| MedrxivClusteringS2S.v2 | .334 | .450 | .315 | .702 | codefuse-ai/F2LLM-4B | False |
| MindSmallReranking | .306 | .329 | .302 | .344 | Kingsoft-LLM/QZhou-Embedding | False |
| SCIDOCS | .200 | .252 | .174 | .599 | IEITYuan/Yuan-embedding-2.0-en | False |
| SICK-R | .829 | .827 | .802 | .947 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS12 | .796 | .815 | .800 | .955 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS13 | .878 | .899 | .816 | .978 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS14 | .837 | .854 | .777 | .975 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS15 | .880 | .904 | .893 | .981 | Gameselo/STS-multilingual-mpnet-base-v2 | False |
| STS17 | .859 | .886 | .821 | .957 | jcorners/ingot-8b-r3 | False |
| STS22.v2 | .680 | .717 | .643 | .772 | Kingsoft-LLM/QZhou-Embedding | False |
| STSBenchmark | .877 | .891 | .873 | .950 | Kingsoft-LLM/QZhou-Embedding | False |
| SprintDuplicateQuestions | .908 | .969 | .931 | .984 | Kingsoft-LLM/QZhou-Embedding | False |
| StackExchangeClustering.v2 | .571 | .921 | .464 | .921 | google/gemini-embedding-001 | False |
| StackExchangeClusteringP2P.v2 | .395 | .509 | .385 | .551 | Kingsoft-LLM/QZhou-Embedding | False |
| SummEvalSummarization.v2 | .397 | .383 | .314 | .389 | annamodels/LGAI-Embedding-Preview | False |
| TRECCOVID | .666 | .863 | .712 | .983 | IEITYuan/Yuan-embedding-2.0-en | False |
| Touche2020Retrieval.v3 | .462 | .524 | .496 | .762 | jcorners/ingot-8b-r3 | False |
| ToxicConversationsClassification | .679 | .887 | .660 | .976 | voyageai/voyage-3-m-exp | False |
| TweetSentimentExtractionClassification | .646 | .699 | .628 | .882 | voyageai/voyage-3-m-exp | False |
| TwentyNewsgroupsClustering.v2 | .456 | .574 | .392 | .876 | GeoGPT-Research-Project/GeoEmbedding | False |
| TwitterSemEval2015 | .754 | .792 | .753 | .901 | jcorners/ingot-8b-r3 | False |
| TwitterURLCorpus | .867 | .870 | .858 | .957 | TencentBAC/Conan-embedding-v2 | False |
| Average | .624 | .729 | .618 | .840 | nan | - |

The model has unusually high performance on this task: SummEvalSummarization.v2 (.397, above the previous max result of .389)
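
The note above singles out SummEvalSummarization.v2, where the 7B model's score (.397) is higher than the listed max result (.389). One plausible way to produce such a flag is sketched below, assuming a task is reported when the new score strictly exceeds the previous max; the `tasks_above_max` helper and that threshold rule are assumptions, not the bot's confirmed logic:

```python
def tasks_above_max(rows):
    """Return task names where the new model's score exceeds the previous max."""
    return [task for task, new_score, prev_max in rows if new_score > prev_max]


# Rows copied from the LCO-Embedding-Omni-7B table: (task, new score, max result).
rows = [
    ("SummEvalSummarization.v2", 0.397, 0.389),
    ("SICK-R", 0.829, 0.947),
    ("STSBenchmark", 0.877, 0.950),
]

flagged = tasks_above_max(rows)
print(flagged)
```

A flagged task is worth a manual look, since a score above every previously submitted model can indicate either a real improvement or an evaluation mismatch.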


@Samoed merged commit ad0bc24 into main on Jul 2, 2026
3 checks passed