<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>DeepSeek Papers</title>
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0-beta3/css/all.min.css">
  <style>
    body {
      font-family: 'Arial', sans-serif;
      margin: 0;
      padding: 0;
      line-height: 1.6;
      color: #333;
      background-color: #f9f9f9;
    }
    header {
      background: #4CAF50;
      color: white;
      padding: 20px 0;
      text-align: center;
    }
    h1 {
      margin: 0;
      font-size: 2.5em;
    }
    .container {
      max-width: 800px;
      margin: 20px auto;
      padding: 20px;
      background: white;
      border-radius: 8px;
      box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
    }
    .paper {
      margin-bottom: 20px;
    }
    .paper a {
      text-decoration: none;
      color: #4CAF50;
      font-weight: bold;
    }
    .paper a:hover {
      text-decoration: underline;
    }
    .coming-soon {
      color: #e74c3c;
      font-size: 0.9em;
      margin-left: 10px;
    }
    footer {
      text-align: center;
      padding: 10px 0;
      background: #4CAF50;
      color: white;
      margin-top: 20px;
    }
  </style>
</head>
<body>
  <header>
    <h1>DeepSeek Papers</h1>
  </header>
  <div class="container">
    <h2>DeepSeek Research Contributions</h2>
    <p>Below is a list of significant papers by DeepSeek detailing advancements in large language models (LLMs). Each entry includes a brief description; in-depth deep dives are coming soon.</p>

    <!-- Paper List -->
    <div class="paper">
      <a href="#">DeepSeek LLM: Scaling Open-Source Language Models with Longtermism</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> November 29, 2023<br>
      This foundational paper explores scaling laws and the trade-offs between data and model size, establishing the groundwork for subsequent models.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> May 2024<br>
      This paper introduces a Mixture-of-Experts (MoE) architecture, enhancing performance while reducing training costs by 42.5% compared to DeepSeek 67B.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-V3 Technical Report</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> December 2024<br>
      This report discusses the scaling of sparse MoE networks to 671 billion parameters, utilizing mixed precision training and HPC co-design strategies.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> January 20, 2025<br>
      The R1 model enhances reasoning capabilities through large-scale reinforcement learning, competing directly with leading models like OpenAI's o1.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> February 2024<br>
      This paper presents methods to improve mathematical reasoning in LLMs, introducing the Group Relative Policy Optimization (GRPO) algorithm.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p>Focuses on enhancing theorem proving capabilities in language models using synthetic data for training.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p>This paper details advancements in code-related tasks with an emphasis on open-source methodologies, improving upon earlier coding models.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p>Discusses the integration and benefits of the Mixture-of-Experts approach within the DeepSeek framework.</p>
    </div>
  </div>
  <footer>
    &copy; 2025 DeepSeek Research. All rights reserved.
  </footer>
</body>
</html>