Allows changing the attention implementation used by the model (see the usage sketch after the list below).
- **Auto** automatically selects an implementation based on what is available on the system.
- **Eager** uses the vanilla attention implementation written in Python.
- **SDPA** uses PyTorch's scaled dot-product attention (`torch.nn.functional.scaled_dot_product_attention`).
- **Flash Attention 2** explicitly uses FlashAttention-2, which requires the `flash_attn` package.
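
For reference, these options correspond to the `attn_implementation` argument in Hugging Face transformers. A minimal sketch, assuming the app loads models through transformers (the model id below is a placeholder for illustration):

```python
import torch
from transformers import AutoModelForCausalLM

# Select the attention implementation at load time.
# Valid values: "eager", "sdpa", "flash_attention_2".
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # hypothetical model id, used only as an example
    torch_dtype=torch.float16,
    attn_implementation="sdpa",  # swap for "eager" or "flash_attention_2"
)
```

Note that `"flash_attention_2"` raises an error if the `flash_attn` package is not installed, whereas the Auto behavior falls back to an available implementation.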