nuojohnchen committed
Commit dac23c7 · verified · 1 Parent(s): 3fe8312

Update app.py

Files changed (1):
  1. app.py +1 -1
app.py CHANGED
@@ -11,7 +11,7 @@ HF_TOKEN = os.environ.get("HF_TOKEN", None)
 
 DESCRIPTION = '''
 <div>
-<h1 style="text-align: center;">JudgeLRM: Large Reasoning Models as a Judge</h1>
+<h1 style="text-align: center;">JudgeLRM</h1>
 <p>This Space demonstrates the <a href="https://huggingface.co/nuojohnchen/JudgeLRM-7B"><b>JudgeLRM</b></a> model, designed to evaluate the quality of two AI assistant responses. JudgeLRM is a family of judgment-oriented LLMs trained using reinforcement learning (RL) with judge-wise, outcome-driven rewards. JudgeLRM models consistently outperform both SFT-tuned and state-of-the-art reasoning models. Notably, JudgeLRM-3B surpasses GPT-4, and JudgeLRM-7B outperforms DeepSeek-R1 by 2.79\% in F1 score, particularly excelling in judge tasks requiring deep reasoning.</p>
 <p>Enter an instruction and two responses, and the model will think, reason and score them on a scale of 1-10 (higher is better).</p>
 <p>You can also select Hugging Face models to automatically generate responses for evaluation.</p>
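For context, here is a minimal sketch of querying the JudgeLRM-7B checkpoint linked above directly with the transformers library, outside the Space. The judging prompt below is a hypothetical illustration; the actual template and generation settings used by app.py are not part of this diff.

# Minimal sketch, assuming the standard transformers causal-LM API.
# The prompt format is a hypothetical illustration, not app.py's template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nuojohnchen/JudgeLRM-7B"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Hypothetical judging prompt: one instruction plus two candidate responses,
# asking the model to reason and score each on a 1-10 scale.
prompt = (
    "Instruction: Explain what a hash table is.\n"
    "Response 1: A hash table maps keys to values using a hash function, "
    "giving average O(1) lookups.\n"
    "Response 2: It is a kind of list.\n"
    "Compare the two responses, reason step by step, then score each from 1 to 10."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens (the model's reasoning and scores).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))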