Show detailed model outputs for specific benchmarks
A leaderboard to rank large reasoning models
Explore advanced functionalities in a clonable space
Generate a custom benchmark from any document
A vibe-coded horror game where you see with sound
Ranking of LLMs for agentic tasks
A demo for exploring and analyzing large-scale model repos
A leaderboard for LLMs powering smolagents
Evaluating LLMs on Greek financial tasks
Explore and discover all leaderboards from the HF community
Large Language Diffusion Models
Generate comic book adventures
Schedule tasks efficiently using AI-generated deadlines