70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper • 2504.11651 • Published 13 days ago • 28
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Paper • 2201.11903 • Published Jan 28, 2022 • 13
Universal Language Model Fine-tuning for Text Classification Paper • 1801.06146 • Published Jan 18, 2018 • 7
Sparse Autoencoders Find Highly Interpretable Features in Language Models Paper • 2309.08600 • Published Sep 15, 2023 • 15
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 175
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 156
SYNTHETIC-1 Collection A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Feb 20 • 51
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on-device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated Feb 6 • 52
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 229