Evaluation datasets

community

Recent Activity

thomwolf and lewtun authored a paper 20 days ago

SmolVLM: Redefining small and efficient multimodal models

thomwolf authored a paper 21 days ago

YourBench: Easy Custom Evaluation Sets for Everyone

models 0

None public yet

datasets 75

lighteval/okapi_mmlu

Updated Mar 24 • 443k rows • 491 downloads • 1 like

lighteval/okapi_arc_challenge

Updated Mar 24 • 79.6k rows • 63 downloads • 1 like

lighteval/small_natural_questions

Updated Jan 29 • 1.71k rows • 25 downloads

lighteval/SimpleQA

Updated Jan 28 • 4.33k rows • 48 downloads • 2 likes

lighteval/MWP-TR

Updated Jan 10 • 4.16k rows • 32 downloads

lighteval/MathQA-TR

Updated Jan 10 • 19.6k rows • 31 downloads

lighteval/QazUNTv2

Updated Nov 26, 2024 • 1.7k rows • 35 downloads

lighteval/HAWP

Updated Nov 19, 2024 • 2.34k rows • 32 downloads • 1 like

lighteval/elkarhizketak

Updated Oct 8, 2024 • 1.63k rows • 22 downloads

lighteval/hellaswag_thai

Updated Sep 25, 2024 • 25.6k rows • 41 downloads