Add support for AQLM
#1
opened by BlackSamorez
AQLM is a SOTA 2-bit LLM quantization algorithm that shows remarkable accuracy for its compression ratio. It's fully integrated with Transformers, and quite a few prequantized models are already available.
Adding it to the leaderboard would shed light on what 2-bit quantization is really capable of.
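For reference, here is a minimal sketch of how a prequantized AQLM checkpoint can be loaded through the standard Transformers API (assuming `aqlm`, `transformers`, and `accelerate` are installed; the model id below is just one example checkpoint and may not be the one you want to evaluate):

```python
# Minimal sketch: loading and running a prequantized AQLM model via transformers.
# Requires: pip install aqlm[gpu] transformers accelerate
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example 2-bit AQLM checkpoint (swap in whichever prequantized model you need).
model_id = "ISTA-DASLab/Llama-2-7b-AQLM-2Bit-1x16-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place layers on available GPU(s)
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)

inputs = tokenizer("AQLM is a 2-bit quantization method that", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because loading goes through the usual `from_pretrained` path, the leaderboard harness should be able to evaluate these checkpoints without AQLM-specific changes beyond installing the `aqlm` package.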
@BlackSamorez please kindly consider comparing your method with AutoRound, which has already shown remarkable results at W2G128 and W2G32, as presented in https://github.com/intel/auto-round/blob/main/docs/acc.md, without introducing any extra overhead at inference.
BlackSamorez changed discussion status to closed