Allows changing the attention implementation used (see the sketch after this list).
- **Auto** automatically chooses the implementation based on system availability.
- **Eager** relies on the vanilla attention implementation in Python.
- **SDPA** uses PyTorch's scaled dot product attention.
- **Flash Attention 2** explicitly uses FA2, which requires the `flash_attn` package.
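
A minimal sketch of how these options typically map onto the `attn_implementation` argument of `from_pretrained` in the `transformers` library; the model name, option keys, and `load_model` helper here are illustrative assumptions, not part of this project's code:

```python
from transformers import AutoModelForCausalLM

# Hypothetical checkpoint for illustration; substitute your own.
MODEL_NAME = "meta-llama/Llama-2-7b-hf"

# Assumed mapping from the UI options to the values accepted by
# transformers' `attn_implementation` argument. Passing no value
# lets transformers pick automatically ("Auto").
ATTN_CHOICES = {
    "auto": None,
    "eager": "eager",                           # vanilla attention in Python
    "sdpa": "sdpa",                             # PyTorch scaled dot product attention
    "flash_attention_2": "flash_attention_2",   # requires the flash_attn package
}

def load_model(attention: str = "auto"):
    """Load a causal LM with the chosen attention implementation."""
    kwargs = {}
    impl = ATTN_CHOICES[attention]
    if impl is not None:
        kwargs["attn_implementation"] = impl
    return AutoModelForCausalLM.from_pretrained(MODEL_NAME, **kwargs)

# Example: explicitly request PyTorch's SDPA kernel.
model = load_model("sdpa")
```

Note that requesting `"flash_attention_2"` raises an error if the `flash_attn` package is not installed, whereas the auto option silently falls back to whatever the system supports.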