Submitted by xuchensong 42 Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning · 13 authors 1
Submitted by hongyuw 23 BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs · 3 authors 1
Submitted by YunxinLi 19 VideoVista-CulturalLingo: 360^circ Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension · 7 authors 1
Submitted by HanleiZhang 11 Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark · 8 authors 1
Submitted by pnawrot 8 The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs · 6 authors 2
Submitted by carpedkm 8 Subject-driven Video Generation via Disentangled Identity and Motion · 7 authors 1
Submitted by amazingj 6 DianJin-R1: Evaluating and Enhancing Financial Reasoning in Large Language Models · 7 authors 1
Submitted by zaplm 6 DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency · 7 authors 1
Submitted by alemiaschi 3 Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation · 9 authors
Submitted by Pclanglais 2 Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family · 9 authors 1