michaelbenayoun/llama-2-tiny-4kv-heads-16layers-random Text Generation • Updated 5 days ago • 4.28k
Running • 2.53k • The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLM on large GPU Clusters
michaelbenayoun/llama-2-tiny-4kv-heads-4layers-random Text Generation • Updated Oct 14, 2024 • 4.66k
Distributed Training Collection Papers and resources related to distributed training. • 5 items • Updated Jun 3, 2024