Is it possible to set the revision of `xlm-roberta-flash-implementation`?

#123
by aciddust20 - opened

I'm not very familiar with transformers yet, so I'm trying to be careful with my question.

my code

from sentence_transformers import SentenceTransformer

sentence_transformer = SentenceTransformer(
    model_name_or_path="jinaai/jina-embeddings-v3",
    device="cpu",
    trust_remote_code=True,
    revision="f1944de8402dcd5f2b03f822a4bc22a7f2de2eb9",
)


log

service-api-1          | INFO:service.dependency.llm:Loading SentenceTransformer model...
service-api-1          | INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: jinaai/jina-embeddings-v3
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - configuration_xlm_roberta.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - embedding.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - stochastic_depth.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - mlp.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - rotary.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - mha.py
service-api-1          | - rotary.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - block.py
service-api-1          | - stochastic_depth.py
service-api-1          | - mlp.py
service-api-1          | - mha.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - xlm_padding.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - modeling_xlm_roberta.py
service-api-1          | - embedding.py
service-api-1          | - block.py
service-api-1          | - xlm_padding.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
service-api-1          | A new version of the following files was downloaded from https://huggingface.co./jinaai/xlm-roberta-flash-implementation:
service-api-1          | - modeling_lora.py
service-api-1          | - modeling_xlm_roberta.py
service-api-1          | . Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.

I'm not at my PC right now to double-check, but I think passing `code_revision` in `model_kwargs` might work:

from sentence_transformers import SentenceTransformer

sentence_transformer = SentenceTransformer(
    model_name_or_path="jinaai/jina-embeddings-v3",
    device="cpu",
    trust_remote_code=True,
    revision="f1944de8402dcd5f2b03f822a4bc22a7f2de2eb9",
    model_kwargs={
        "code_revision": "refs/pr/1" # For example
    },
)
  • Tom Aarsen

Thanks, but unfortunately it's still not working.

code

sentence_transformer = SentenceTransformer(
    model_name_or_path="jinaai/jina-embeddings-v3",
    device="cpu",
    trust_remote_code=True,
    revision="f1944de8402dcd5f2b03f822a4bc22a7f2de2eb9",
    model_kwargs={
        "code_revision": "2b6bc3f30750b3a9648fe9b63448c09920efe9be",
    },
)

.cache

# pwd
/root/.cache
# tree .
.
└── huggingface
    └── hub
        └── models--jinaai--jina-embeddings-v3
            ├── blobs
            │   ├── 0499948a6c6b702a2ec2188f96b4ad51707feb62
            │   ├── 5fbc8cddd923280c8c6a7cea4870ce448ad1fced
            │   └── a1a7ba66b5eb9989572d080f000f19f1bf50e663
            └── snapshots
                └── f1944de8402dcd5f2b03f822a4bc22a7f2de2eb9
                    ├── README.md -> ../../blobs/5fbc8cddd923280c8c6a7cea4870ce448ad1fced
                    ├── config_sentence_transformers.json -> ../../blobs/a1a7ba66b5eb9989572d080f000f19f1bf50e663
                    └── modules.json -> ../../blobs/0499948a6c6b702a2ec2188f96b4ad51707feb62

6 directories, 6 files

log

service-api-1          | INFO:service.dependency.llm:Loading SentenceTransformer model...
service-api-1          | INFO:sentence_transformers.SentenceTransformer:Load pretrained SentenceTransformer: jinaai/jina-embeddings-v3
service-api-1          | Could not locate the custom_st.py inside jinaai/jina-embeddings-v3.
service-api-1          | Traceback (most recent call last):
service-api-1          |   File "/huray/service/.venv/lib/python3.12/site-packages/sentence_transformers/util.py", line 1221, in import_from_string
service-api-1          |     module = importlib.import_module(dotted_path)
service-api-1          |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
service-api-1          |   File "/usr/local/lib/python3.12/importlib/__init__.py", line 90, in import_module
service-api-1          |     return _bootstrap._gcd_import(name[level:], package, level)
service-api-1          |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
service-api-1          |   File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
service-api-1          |   File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
service-api-1          |   File "<frozen importlib._bootstrap>", line 1310, in _find_and_load_unlocked
service-api-1          |   File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
service-api-1          |   File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
service-api-1          |   File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
service-api-1          |   File "<frozen importlib._bootstrap>", line 1324, in _find_and_load_unlocked
service-api-1          | ModuleNotFoundError: No module named 'custom_st'

packages

# python -m pip list | grep sentence
sentence-transformers 4.0.1

I'll try again using transformers directly instead of sentence-transformers.
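For anyone following along, here is a minimal sketch of what loading through plain transformers with both revisions pinned might look like. It assumes that `revision` pins the `jinaai/jina-embeddings-v3` repository and that `code_revision` pins the separate repository holding the remote code (`jinaai/xlm-roberta-flash-implementation`), which is how the transformers documentation describes `code_revision`. The helper names `build_pinned_kwargs` and `load_pinned_model` are hypothetical, and the two hashes are simply the ones used earlier in this thread. Whether this silences the "new version of the following files was downloaded" message has not been verified here.

```python
from typing import Any, Dict

MODEL_ID = "jinaai/jina-embeddings-v3"
# Commit of the model repository (same value as `revision` above).
MODEL_REVISION = "f1944de8402dcd5f2b03f822a4bc22a7f2de2eb9"
# Commit passed as `code_revision` above; assumed to belong to
# jinaai/xlm-roberta-flash-implementation.
CODE_REVISION = "2b6bc3f30750b3a9648fe9b63448c09920efe9be"


def build_pinned_kwargs(model_revision: str, code_revision: str) -> Dict[str, Any]:
    """Collect the keyword arguments that pin both repositories (hypothetical helper)."""
    return {
        "trust_remote_code": True,
        "revision": model_revision,      # pins weights/config in the model repo
        "code_revision": code_revision,  # pins the remote code in its own repo
    }


def load_pinned_model(model_id: str = MODEL_ID):
    """Load the model with both revisions pinned (needs network access)."""
    # Imported lazily so building the kwargs does not require transformers.
    from transformers import AutoModel

    kwargs = build_pinned_kwargs(MODEL_REVISION, CODE_REVISION)
    return AutoModel.from_pretrained(model_id, **kwargs)
```

Calling `load_pinned_model()` performs the actual download, so it is kept separate from the kwargs helper; the helper alone can be checked offline.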
