M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models Paper • 2504.10449 • Published 14 days ago • 10
Language Models Prefer What They Know: Relative Confidence Estimation via Confidence Preferences Paper • 2502.01126 • Published Feb 3 • 4
METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring Paper • 2501.02045 • Published Jan 3 • 21
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 56
Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining Paper • 2410.08102 • Published Oct 10, 2024 • 20
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 42
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27, 2024 • 42
NeuralArTS: Structuring Neural Architecture Search with Type Theory Paper • 2110.08710 • Published Oct 17, 2021
Towards One Shot Search Space Poisoning in Neural Architecture Search Paper • 2111.07138 • Published Nov 13, 2021
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees Paper • 2110.03313 • Published Oct 7, 2021 • 1
OpenVLA: An Open-Source Vision-Language-Action Model Paper • 2406.09246 • Published Jun 13, 2024 • 40