RuntimeError: value cannot be converted to type uint8_t without overflow
I have an RX 7800 XT and I was trying to run it with DirectML on Windows, but I get this error:
venv\Lib\site-packages\transformers\models\gemma3\modeling_gemma3.py", line 873, in _prepare_4d_causal_attention_mask_with_cache_position
causal_mask[:, :, :, :mask_length] = causal_mask[:, :, :, :mask_length].masked_fill(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: value cannot be converted to type uint8_t without overflow
Hi @TyJaJa ,
This issue is caused by a dtype mismatch: the causal_mask tensor is of type uint8, so the fill value used during the masked-attention calculation cannot be represented in it. To resolve it, convert the tensor with causal_mask = causal_mask.to(torch.bool) (or torch.float32, if applicable) before applying masked_fill(), as in the sketch below.
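A minimal, self-contained sketch of that workaround (illustrative only, not the actual transformers code; the tensor shapes and fill value here are assumptions):

```python
import torch

# Sketch: masked_fill on a uint8 tensor raises an overflow error when the fill
# value does not fit in uint8, e.g. the large negative value used for attention
# masking. Casting the mask to float32 (or bool) first avoids it.
mask = torch.tensor([[False, True], [True, False]])
causal_mask = torch.zeros(2, 2, dtype=torch.uint8)   # stand-in for the real mask
min_value = torch.finfo(torch.float32).min           # stand-in for the fill value

try:
    causal_mask.masked_fill(mask, min_value)
except RuntimeError as err:
    print(err)  # value cannot be converted to type uint8_t without overflow

# Workaround: convert the tensor to a dtype that can hold the fill value
causal_mask = causal_mask.to(torch.float32)
patched = causal_mask.masked_fill(mask, min_value)
print(patched)
```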
If the issue persists, feel free to reach out for further assistance. Additionally, we recommend testing the code in Google Colab as an alternative environment.
Thank you.