Introducing Video-MMLU, a new benchmark for evaluating large multimodal models on classroom-style lectures in math, physics, and chemistry!
Video-MMLU requires stronger reasoning and broader world knowledge than previous benchmarks for video LMMs.
Each video comes with two tasks:
- Take Notes: detailed captioning of multi-discipline lectures
- Do Quiz: open-ended QA to test reasoning over visuals and proofs
We evaluated 90+ models, including vision-blind baselines, open-source models and proprietary ones.
We find that existing models generally perform poorly, with accuracy ranging from only 10% to 50%.
We also explore how the number of visual tokens and the base LLMs influence performance, offering insights into the interplay between multimodal perception and reasoning in lecture comprehension.
For more details, please check below!
Paper: https://arxiv.org/abs/2504.14693
Code: https://github.com/Espere-1119-Song/Video-MMLU
Data: Enxin/Video-MMLU (loading sketch below)
Website: https://enxinsong.com/Video-MMLU-web/
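If you want to poke at the data directly, here is a minimal Python sketch. It assumes "Enxin/Video-MMLU" is a dataset ID on the Hugging Face Hub (taken from the Data link above) and that it loads with the standard datasets library; the splits and column names are whatever the release actually ships, so inspect the returned object rather than relying on the names printed here.

from datasets import load_dataset

# Load the Video-MMLU annotations from the Hugging Face Hub.
# The dataset ID comes from the Data link above; split and column
# names are assumptions, so look at what load_dataset returns first.
data = load_dataset("Enxin/Video-MMLU")
print(data)  # shows which splits and columns the release provides

first_split = next(iter(data.values()))
print(first_split[0])  # peek at one lecture's caption/QA annotations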