RuntimeError: value cannot be converted to type uint8_t without overflow

#19
by TyJaJa - opened

I have an RX 7800 XT and I was trying to run it with DirectML on Windows, but I get this error:

venv\Lib\site-packages\transformers\models\gemma3\modeling_gemma3.py", line 873, in _prepare_4d_causal_attention_mask_with_cache_position
causal_mask[:, :, :, :mask_length] = causal_mask[:, :, :, :mask_length].masked_fill(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: value cannot be converted to type uint8_t without overflow

Google org

Hi @TyJaJa ,

This issue is caused by a type mismatch: the causal_mask tensor is of type uint8, which cannot hold the value that masked_fill() tries to write while the attention mask is being built. To resolve it, convert the tensor with causal_mask = causal_mask.to(torch.bool) (or torch.float32, if applicable) before calling masked_fill(). If the issue persists, feel free to reach out for further assistance. We also recommend testing the code in Google Colab as an alternative environment.
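As a rough illustration of that conversion, here is a minimal sketch that takes the torch.float32 option. It is not the actual code in modeling_gemma3.py: the helper name fill_masked_positions, the tensor shapes, and the choice of fill value are assumptions made for the example.

```python
import torch


def fill_masked_positions(causal_mask, padding_mask, fill_value):
    """Hypothetical helper, not part of transformers.

    Writes fill_value into causal_mask wherever padding_mask is set,
    after making sure both tensors have dtypes that masked_fill accepts.
    """
    # Assumption: the backend handed us an integer (e.g. uint8) mask.
    # A large negative float does not fit in uint8, so move to float32 first.
    if not causal_mask.is_floating_point():
        causal_mask = causal_mask.to(torch.float32)
    # masked_fill expects a boolean mask to select positions.
    padding_mask = padding_mask.to(torch.bool)
    return causal_mask.masked_fill(padding_mask, fill_value)


# Illustrative shapes only: (batch, heads, query_len, key_len).
min_value = torch.finfo(torch.float32).min
causal_mask = torch.zeros(1, 1, 2, 3, dtype=torch.uint8)
padding_mask = torch.tensor([[[[1, 0, 1]]]], dtype=torch.uint8)

fixed = fill_masked_positions(causal_mask, padding_mask, min_value)
print(fixed.dtype)
```

If you patch the library file locally, the same two conversions would go right before the masked_fill() call shown in the traceback; whether that is enough on DirectML is something to verify on your setup.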

Thank you.
