Cannot reproduce results on text classification benchmark. #1490
You should load the model like this:
import mteb
model = mteb.load_model("jinaai/jina-embeddings-v3")
...
mteb has no attribute "load_model"? I am using mteb==1.20.0.
Sorry, this should be:
import mteb
model = mteb.get_model("jinaai/jina-embeddings-v3")
...
File "D:\code\mteb-main\mteb\models\overview.py", line 126, in get_model |
Can you provide code? I tried to run the tasks with the following code and everything worked:
import mteb
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModel
model = mteb.get_model("jinaai/jina-embeddings-v3")
tasks = mteb.get_tasks(
tasks=['AmazonCounterfactualClassification',
'AmazonReviewsClassification',
"Banking77Classification",
'EmotionClassification',
'ImdbClassification',
'MTOPIntentClassification',
'ToxicConversationsClassification',
'TweetSentimentExtractionClassification'
]
)
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(
model,
eval_splits=["test"],
output_folder="results"
)
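To compare against the leaderboard, it may help to print the main scores from the result files that evaluation.run writes to output_folder. This is only a minimal sketch; the exact directory layout and JSON schema vary across mteb versions, so it simply walks every JSON file under "results" and prints any "main_score" values it finds:

import json
from pathlib import Path

# Recursively collect every "main_score" value from a loaded results JSON.
def find_main_scores(obj):
    if isinstance(obj, dict):
        if "main_score" in obj:
            yield obj["main_score"]
        for value in obj.values():
            yield from find_main_scores(value)
    elif isinstance(obj, list):
        for value in obj:
            yield from find_main_scores(value)

# "results" is the output_folder used in the run above.
for path in sorted(Path("results").rglob("*.json")):
    scores = list(find_main_scores(json.loads(path.read_text())))
    if scores:
        print(path.name, scores)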
I am using exactly your code, except that I replaced model = mteb.get_model("jinaai/jina-embeddings-v3") with model = mteb.get_model("jina_v3"), which is the local path of the jina-embeddings-v3 model downloaded from https://huggingface.co/jinaai/jina-embeddings-v3. Could this be the problem?
Yes, I think this is a problem.
I ran the code on several text classification datasets. None of the results match the performance reported on the leaderboard, neither significantly higher nor lower. Did you run into the same problem?
@bwanglzu Do you have any ideas? Results:
To update: I randomly selected some models to try to reproduce their reported performance. NV-embed-v2 failed; learning2_model succeeded.
Hmm this seems odd.
Just want to state that this is indeed an issue: it will call sentence-transformers to load the model instead of our implementation, which also includes prompt handling (see the implementation at mteb/mteb/models/jina_models.py, line 199, commit 3ff38ec).
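As a concrete illustration of that point, here is a minimal sketch, assuming mteb resolves its model registry by the Hugging Face model name, so a local path does not match the registry entry and falls back to the plain sentence-transformers loading path described above:

import mteb

# Registered Hugging Face name: resolves to mteb's own jina-embeddings-v3
# wrapper, which adds the task-specific prompt handling referenced above.
model = mteb.get_model("jinaai/jina-embeddings-v3")
print(type(model))  # inspect which wrapper was actually loaded

# A local path such as "jina_v3" does not match any registry entry, so mteb
# would fall back to loading it as a plain SentenceTransformer (no prompts).
# model = mteb.get_model("jina_v3")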
A few points to check to ensure everything works:
I am using mteb==1.20.0, and the revision ID of the "jinaai/jina-embeddings-v3" model is 215a6e121fa0183376388ac6b1ae230326bfeaed.
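If version or revision drift is a suspect, the revision can be pinned explicitly when loading the model. A sketch, assuming get_model forwards a revision argument (check the signature in your mteb version):

import mteb

# Pin the exact model revision used for the run being compared against.
model = mteb.get_model(
    "jinaai/jina-embeddings-v3",
    revision="215a6e121fa0183376388ac6b1ae230326bfeaed",
)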
I'll take a look this morning.
I am trying to reproduce the performance of the "jina_v3" model (https://huggingface.co/jinaai/jina-embeddings-v3) on the text classification benchmark. I am using the code below:
import mteb
from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer, AutoModel
model_name = "jinaai/jina-embeddings-v3"
model = SentenceTransformer(model_name, trust_remote_code=True)
tasks = mteb.get_tasks(tasks=['AmazonCounterfactualClassification',
'AmazonReviewsClassification',
"Banking77Classification",
'EmotionClassification',
'ImdbClassification',
'MTOPIntentClassification',
'ToxicConversationsClassification',
'TweetSentimentExtractionClassification'])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, eval_splits=["test"],output_folder=f"results/{model_name}")
The results differ significantly from those reported on https://huggingface.co/spaces/mteb/leaderboard.
Any suggestions?
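For reference, a corrected version of the script above, following the suggestion in this thread: load the model through mteb.get_model so mteb's own jina-v3 wrapper (with prompt handling) is used instead of a bare SentenceTransformer. The model name and task list are taken from the original post:

import mteb

model_name = "jinaai/jina-embeddings-v3"
# Loading through mteb (not SentenceTransformer directly) picks up the
# library's jina-embeddings-v3 implementation, including prompt handling.
model = mteb.get_model(model_name)

tasks = mteb.get_tasks(
    tasks=[
        "AmazonCounterfactualClassification",
        "AmazonReviewsClassification",
        "Banking77Classification",
        "EmotionClassification",
        "ImdbClassification",
        "MTOPIntentClassification",
        "ToxicConversationsClassification",
        "TweetSentimentExtractionClassification",
    ]
)

evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(
    model,
    eval_splits=["test"],
    output_folder=f"results/{model_name}",
)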