70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float Paper • 2504.11651 • Published 13 days ago • 28
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models Paper • 2201.11903 • Published Jan 28, 2022 • 13
Universal Language Model Fine-tuning for Text Classification Paper • 1801.06146 • Published Jan 18, 2018 • 7
Sparse Autoencoders Find Highly Interpretable Features in Language Models Paper • 2309.08600 • Published Sep 15, 2023 • 15
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 175
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16 • 156
SYNTHETIC-1 Collection A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Feb 20 • 51
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on-device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated Feb 6 • 52
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 229