Lets you change which attention implementation is used (see the sketch after the list below).

- **Auto** automatically selects an implementation based on what is available on the system.
- **Eager** uses the vanilla attention implementation written in Python.
- **SDPA** uses PyTorch's scaled dot-product attention (`torch.nn.functional.scaled_dot_product_attention`).
- **Flash Attention 2** explicitly uses FA2, which requires the `flash_attn` package.
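
For reference, a minimal sketch of how these options typically map onto the `attn_implementation` argument of Hugging Face Transformers' `from_pretrained`; the model id and dtype below are illustrative assumptions, not part of this setting:

```python
# Minimal sketch, assuming the setting corresponds to the
# `attn_implementation` argument in Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",                 # example model id (assumption)
    torch_dtype=torch.float16,                 # Flash Attention 2 requires fp16/bf16
    attn_implementation="flash_attention_2",   # or "eager", "sdpa"
)
```

If `flash_attn` is not installed or the hardware does not support it, falling back to `"sdpa"` or `"eager"` (or letting the auto selection decide) avoids a load-time error.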