<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="description" content="DeepSeek: Advancing Open-Source Language Models">
<meta name="keywords" content="DeepSeek, LLM, AI">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>DeepSeek: Advancing Open-Source Language Models</title>
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet">
<link rel="stylesheet" href="./static/css/bulma.min.css">
<link rel="stylesheet" href="./static/css/bulma-carousel.min.css">
<link rel="stylesheet" href="./static/css/bulma-slider.min.css">
<link rel="stylesheet" href="./static/css/fontawesome.all.min.css">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
<link rel="stylesheet" href="./static/css/index.css">
<link rel="icon" href="./static/images/favicon.svg">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
<script defer src="./static/js/fontawesome.all.min.js"></script>
<script src="./static/js/bulma-carousel.min.js"></script>
<script src="./static/js/bulma-slider.min.js"></script>
<script src="./static/js/index.js"></script>
</head>
<body>
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h1 class="title is-1 publication-title">DeepSeek: Advancing Open-Source Language Models</h1>
<div class="is-size-5 publication-authors">
A collection of groundbreaking research papers in AI and language models
</div>
</div>
</div>
</div>
</div>
</section>
<section class="section">
<div class="container is-max-desktop">
<!-- Abstract. -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Overview</h2>
<div class="content has-text-justified">
<p>
DeepSeek has released a series of significant papers detailing advancements in large language models (LLMs).
Each paper represents a step forward in making AI more capable, efficient, and accessible.
</p>
</div>
</div>
</div>
<!--/ Abstract. -->
<!-- Paper Collection -->
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Research Papers</h2>
<!-- Paper 1 -->
<div class="publication-block">
<div class="publication-header">
<h3 class="title is-4">DeepSeek LLM: Scaling Open-Source Language Models with Longtermism</h3>
<span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
<div class="is-size-5 publication-authors">
Released: November 29, 2023
</div>
</div>
<div class="content has-text-justified">
<p>This foundational paper explores scaling laws and the trade-offs between data and model size,
establishing the groundwork for subsequent models.</p>
</div>
</div>
<!-- Paper 2 -->
<div class="publication-block">
<div class="publication-header">
<h3 class="title is-4">DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model</h3>
<span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
<div class="is-size-5 publication-authors">
Released: May 2024
</div>
</div>
<div class="content has-text-justified">
<p>Introduces a sparse Mixture-of-Experts (MoE) architecture together with Multi-head Latent Attention (MLA),
delivering stronger performance while cutting training costs by 42.5% relative to DeepSeek 67B.</p>
</div>
</div>
<!-- Additional papers following same structure -->
<div class="publication-block">
<div class="publication-header">
<h3 class="title is-4">DeepSeek-V3 Technical Report</h3>
<span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
<div class="is-size-5 publication-authors">
Released: December 2024
</div>
</div>
<div class="content has-text-justified">
<p>Scales a sparse MoE network to 671 billion total parameters, of which 37 billion are activated per token.</p>
</div>
</div>
<div class="publication-block">
<div class="publication-header">
<h3 class="title is-4">DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning</h3>
<span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
<div class="is-size-5 publication-authors">
Released: January 20, 2025
</div>
</div>
<div class="content has-text-justified">
<p>Enhances reasoning capabilities through large-scale reinforcement learning.</p>
</div>
</div>
<div class="publication-block">
<div class="publication-header">
<h3 class="title is-4">DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models</h3>
<span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
<div class="is-size-5 publication-authors">
Released: February 2024
</div>
</div>
<div class="content has-text-justified">
<p>Presents continued pre-training on a large math-focused web corpus and introduces Group Relative
Policy Optimization (GRPO), a memory-efficient variant of PPO, to improve mathematical reasoning in LLMs.</p>
</div>
</div>
<div class="publication-block">
<div class="publication-header">
<h3 class="title is-4">DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data</h3>
<span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
</div>
<div class="content has-text-justified">
<p>Enhances formal theorem proving in Lean 4 by generating large-scale synthetic proof data for training.</p>
</div>
</div>
<div class="publication-block">
<div class="publication-header">
<h3 class="title is-4">DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence</h3>
<span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
</div>
<div class="content has-text-justified">
<p>Details advancements in code intelligence, reaching performance comparable to closed-source models
on code and math benchmarks while keeping the models and methodology open-source.</p>
</div>
</div>
<div class="publication-block">
<div class="publication-header">
<h3 class="title is-4">DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models</h3>
<span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
</div>
<div class="content has-text-justified">
<p>Introduces fine-grained expert segmentation and shared expert isolation to improve expert
specialization in Mixture-of-Experts models.</p>
</div>
</div>
</div>
</div>
</div>
</section>
<footer class="footer">
<div class="container">
<div class="content has-text-centered">
<a class="icon-link external-link" href="https://github.com/deepseek-ai" target="_blank" rel="noopener">
<i class="fab fa-github"></i>
</a>
</div>
<div class="columns is-centered">
<div class="column is-8">
<div class="content">
<p>
This website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative
Commons Attribution-ShareAlike 4.0 International License</a>.
</p>
</div>
</div>
</div>
</div>
</footer>
</body>
</html>