<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>DeepSeek Papers</title>
  <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0-beta3/css/all.min.css">
  <style>
    body {
      font-family: 'Arial', sans-serif;
      margin: 0;
      padding: 0;
      line-height: 1.6;
      color: #333;
      background-color: #f9f9f9;
    }
    header {
      background: #4CAF50;
      color: white;
      padding: 20px 0;
      text-align: center;
    }
    h1 {
      margin: 0;
      font-size: 2.5em;
    }
    .container {
      max-width: 800px;
      margin: 20px auto;
      padding: 20px;
      background: white;
      border-radius: 8px;
      box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
    }
    .paper {
      margin-bottom: 20px;
    }
    .paper a {
      text-decoration: none;
      color: #4CAF50;
      font-weight: bold;
    }
    .paper a:hover {
      text-decoration: underline;
    }
    .coming-soon {
      color: #e74c3c;
      font-size: 0.9em;
      margin-left: 10px;
    }
    footer {
      text-align: center;
      padding: 10px 0;
      background: #4CAF50;
      color: white;
      margin-top: 20px;
    }
  </style>
</head>
<body>
  <header>
    <h1>DeepSeek Papers</h1>
  </header>
  <div class="container">
    <h2>DeepSeek Research Contributions</h2>
    <p>Below is a list of significant papers by DeepSeek detailing advances in large language models (LLMs). Each entry includes a brief description and notes where a deep dive is coming soon.</p>
    <!-- Paper List -->
    <div class="paper">
      <a href="#">DeepSeek LLM: Scaling Open-Source Language Models with Longtermism</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> November 29, 2023<br>
        This foundational paper explores scaling laws and the trade-offs between data and model size, establishing the groundwork for subsequent models.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> May 2024<br>
        This paper introduces a Mixture-of-Experts (MoE) architecture, enhancing performance while reducing training costs by 42.5%.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-V3 Technical Report</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> December 2024<br>
        This report discusses the scaling of sparse MoE networks to 671 billion parameters, utilizing mixed precision training and HPC co-design strategies.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> January 20, 2025<br>
        The R1 model enhances reasoning capabilities through large-scale reinforcement learning, competing directly with leading models like OpenAI's o1.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p><strong>Release Date:</strong> February 2024<br>
        This paper presents methods to improve mathematical reasoning in LLMs, introducing the Group Relative Policy Optimization (GRPO) algorithm.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p>Focuses on enhancing theorem proving capabilities in language models using synthetic data for training.</p>
    </div>
    <div class="paper">
      <a href="#">DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence</a>
      <span class="coming-soon">[Deep Dive Coming Soon]</span>
      <p>This paper details advancements in code-related tasks with an emphasis on open-source methodologies, improving upon earlier coding models.</p>
    </div>
    <div class="paper">
<a href="#">DeepSeekMoE</a> | |
<span class="coming-soon">[Deep Dive Coming Soon]</span> | |
<p>Discusses the integration and benefits of the Mixture-of-Experts approach within the DeepSeek framework.</p> | |
</div> | |
</div> | |
<footer> | |
© 2025 DeepSeek Research. All rights reserved. | |
</footer> | |
</body> | |
</html> |