metacritical committed on
Commit 89f340d · verified · 1 Parent(s): 3e8c50c
Files changed (1)
  1. index.html +172 -107
index.html CHANGED
@@ -1,119 +1,184 @@
  <!DOCTYPE html>
- <html lang="en">
  <head>
- <meta charset="UTF-8">
- <meta name="viewport" content="width=device-width, initial-scale=1.0">
- <title>DeepSeek Papers</title>
- <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0-beta3/css/all.min.css">
- <style>
- body {
- font-family: 'Arial', sans-serif;
- margin: 0;
- padding: 0;
- line-height: 1.6;
- color: #333;
- background-color: #f9f9f9;
- }
- header {
- background: #4CAF50;
- color: white;
- padding: 20px 0;
- text-align: center;
- }
- h1 {
- margin: 0;
- font-size: 2.5em;
- }
- .container {
- max-width: 800px;
- margin: 20px auto;
- padding: 20px;
- background: white;
- border-radius: 8px;
- box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
- }
- .paper {
- margin-bottom: 20px;
- }
- .paper a {
- text-decoration: none;
- color: #4CAF50;
- font-weight: bold;
- }
- .paper a:hover {
- text-decoration: underline;
- }
- .coming-soon {
- color: #e74c3c;
- font-size: 0.9em;
- margin-left: 10px;
- }
- footer {
- text-align: center;
- padding: 10px 0;
- background: #4CAF50;
- color: white;
- margin-top: 20px;
- }
- </style>
  </head>
  <body>
- <header>
- <h1>DeepSeek Papers</h1>
- </header>
- <div class="container">
- <h2>DeepSeek Research Contributions</h2>
- <p>Below is a list of significant papers by DeepSeek detailing advancements in large language models (LLMs). Each paper includes a brief description and highlights upcoming deep dives.</p>

- <!-- Paper List -->
- <div class="paper">
- <a href="#">DeepSeekLLM: Scaling Open-Source Language Models with Longer-termism</a>
- <span class="coming-soon">[Deep Dive Coming Soon]</span>
- <p><strong>Release Date:</strong> November 29, 2023<br>
- This foundational paper explores scaling laws and the trade-offs between data and model size, establishing the groundwork for subsequent models.</p>
- </div>
- <div class="paper">
- <a href="#">DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model</a>
- <span class="coming-soon">[Deep Dive Coming Soon]</span>
- <p><strong>Release Date:</strong> May 2024<br>
- This paper introduces a Mixture-of-Experts (MoE) architecture, enhancing performance while reducing training costs by 42%.</p>
- </div>
- <div class="paper">
- <a href="#">DeepSeek-V3 Technical Report</a>
- <span class="coming-soon">[Deep Dive Coming Soon]</span>
- <p><strong>Release Date:</strong> December 2024<br>
- This report discusses the scaling of sparse MoE networks to 671 billion parameters, utilizing mixed precision training and HPC co-design strategies.</p>
- </div>
- <div class="paper">
- <a href="#">DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning</a>
- <span class="coming-soon">[Deep Dive Coming Soon]</span>
- <p><strong>Release Date:</strong> January 20, 2025<br>
- The R1 model enhances reasoning capabilities through large-scale reinforcement learning, competing directly with leading models like OpenAI's o1.</p>
  </div>
- <div class="paper">
- <a href="#">DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models</a>
- <span class="coming-soon">[Deep Dive Coming Soon]</span>
- <p><strong>Release Date:</strong> April 2024<br>
- This paper presents methods to improve mathematical reasoning in LLMs, introducing the Group Relative Policy Optimization (GRPO) algorithm.</p>
- </div>
- <div class="paper">
- <a href="#">DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data</a>
- <span class="coming-soon">[Deep Dive Coming Soon]</span>
- <p>Focuses on enhancing theorem proving capabilities in language models using synthetic data for training.</p>
  </div>
- <div class="paper">
- <a href="#">DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence</a>
- <span class="coming-soon">[Deep Dive Coming Soon]</span>
- <p>This paper details advancements in code-related tasks with an emphasis on open-source methodologies, improving upon earlier coding models.</p>
  </div>
- <div class="paper">
- <a href="#">DeepSeekMoE</a>
- <span class="coming-soon">[Deep Dive Coming Soon]</span>
- <p>Discusses the integration and benefits of the Mixture-of-Experts approach within the DeepSeek framework.</p>
  </div>
  </div>
- <footer>
- &copy; 2025 DeepSeek Research. All rights reserved.
- </footer>
  </body>
  </html>

  <!DOCTYPE html>
+ <html>
  <head>
+ <meta charset="utf-8">
+ <meta name="description" content="DeepSeek: Advancing Open-Source Language Models">
+ <meta name="keywords" content="DeepSeek, LLM, AI">
+ <meta name="viewport" content="width=device-width, initial-scale=1">
+ <title>DeepSeek: Advancing Open-Source Language Models</title>
+
+ <link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet">
+ <link rel="stylesheet" href="./static/css/bulma.min.css">
+ <link rel="stylesheet" href="./static/css/bulma-carousel.min.css">
+ <link rel="stylesheet" href="./static/css/bulma-slider.min.css">
+ <link rel="stylesheet" href="./static/css/fontawesome.all.min.css">
+ <link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
+ <link rel="stylesheet" href="./static/css/index.css">
+ <link rel="icon" href="./static/images/favicon.svg">
+
+ <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
+ <script defer src="./static/js/fontawesome.all.min.js"></script>
+ <script src="./static/js/bulma-carousel.min.js"></script>
+ <script src="./static/js/bulma-slider.min.js"></script>
+ <script src="./static/js/index.js"></script>
  </head>
  <body>

+ <section class="hero">
+ <div class="hero-body">
+ <div class="container is-max-desktop">
+ <div class="columns is-centered">
+ <div class="column has-text-centered">
+ <h1 class="title is-1 publication-title">DeepSeek Papers</h1>
+ <div class="is-size-5 publication-authors">
+ Advancing Open-Source Language Models
+ </div>
+ </div>
+ </div>
  </div>
+ </div>
+ </section>
+
+ <section class="section">
+ <div class="container is-max-desktop">
+ <!-- Abstract. -->
+ <div class="columns is-centered has-text-centered">
+ <div class="column is-four-fifths">
+ <h2 class="title is-3">DeepSeek Research Contributions</h2>
+ <div class="content has-text-justified">
+ <p>
+ Below is a list of significant papers by DeepSeek detailing advancements in large language models (LLMs),
+ ordered by release date from most recent to oldest. Each paper includes a brief description and highlights
+ upcoming deep dives.
+ </p>
+ </div>
+ </div>
  </div>
+ <!--/ Abstract. -->
+
+ <!-- Paper Collection -->
+ <div class="columns is-centered">
+ <div class="column is-four-fifths">
+ <div class="content">
+ <div class="publication-list">
+ <!-- Papers in chronological order -->
+ <div class="publication-item">
+ <div class="publication-title">
+ <a href="#">DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning</a>
+ <span class="tag is-info is-light">[Deep Dive Coming Soon]</span>
+ </div>
+ <div class="publication-info">
+ <strong>Release Date:</strong> January 20, 2025
+ </div>
+ <div class="publication-description">
+ The R1 model enhances reasoning capabilities through large-scale reinforcement learning, competing
+ directly with leading models like OpenAI's o1.
+ </div>
+ </div>
+
+ <div class="publication-item">
+ <div class="publication-title">
+ <a href="#">DeepSeek-V3 Technical Report</a>
+ <span class="tag is-info is-light">[Deep Dive Coming Soon]</span>
+ </div>
+ <div class="publication-info">
+ <strong>Release Date:</strong> December 2024
+ </div>
+ <div class="publication-description">
+ This report discusses the scaling of sparse MoE networks to 671 billion parameters, utilizing mixed
+ precision training and HPC co-design strategies.
+ </div>
+ </div>
+
+ <div class="publication-item">
+ <div class="publication-title">
+ <a href="#">DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model</a>
+ <span class="tag is-info is-light">[Deep Dive Coming Soon]</span>
+ </div>
+ <div class="publication-info">
+ <strong>Release Date:</strong> May 2024
+ </div>
+ <div class="publication-description">
+ This paper introduces a Mixture-of-Experts (MoE) architecture, enhancing performance while reducing
+ training costs by 42%.
+ </div>
+ </div>
+
+ <div class="publication-item">
+ <div class="publication-title">
+ <a href="#">DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models</a>
+ <span class="tag is-info is-light">[Deep Dive Coming Soon]</span>
+ </div>
+ <div class="publication-info">
+ <strong>Release Date:</strong> April 2024
+ </div>
+ <div class="publication-description">
+ This paper presents methods to improve mathematical reasoning in LLMs, introducing the Group
+ Relative Policy Optimization (GRPO) algorithm.
+ </div>
+ </div>
+
+ <div class="publication-item">
+ <div class="publication-title">
+ <a href="#">DeepSeekLLM: Scaling Open-Source Language Models with Longer-termism</a>
+ <span class="tag is-info is-light">[Deep Dive Coming Soon]</span>
+ </div>
+ <div class="publication-info">
+ <strong>Release Date:</strong> November 29, 2023
+ </div>
+ <div class="publication-description">
+ This foundational paper explores scaling laws and the trade-offs between data and model size,
+ establishing the groundwork for subsequent models.
+ </div>
+ </div>
+
+ <div class="publication-item">
+ <div class="publication-title">
+ <a href="#">DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data</a>
+ <span class="tag is-info is-light">[Deep Dive Coming Soon]</span>
+ </div>
+ <div class="publication-description">
+ Focuses on enhancing theorem proving capabilities in language models using synthetic data for training.
+ </div>
+ </div>
+
+ <div class="publication-item">
+ <div class="publication-title">
+ <a href="#">DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence</a>
+ <span class="tag is-info is-light">[Deep Dive Coming Soon]</span>
+ </div>
+ <div class="publication-description">
+ This paper details advancements in code-related tasks with an emphasis on open-source methodologies,
+ improving upon earlier coding models.
+ </div>
+ </div>
+
+ <div class="publication-item">
+ <div class="publication-title">
+ <a href="#">DeepSeekMoE</a>
+ <span class="tag is-info is-light">[Deep Dive Coming Soon]</span>
+ </div>
+ <div class="publication-description">
+ Discusses the integration and benefits of the Mixture-of-Experts approach within the DeepSeek framework.
+ </div>
+ </div>
+ </div>
+ </div>
+ </div>
  </div>
+ </div>
+ </section>
+
+ <footer class="footer">
+ <div class="container">
+ <div class="content has-text-centered">
+ <p>
+ This website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative
+ Commons Attribution-ShareAlike 4.0 International License</a>.
+ </p>
  </div>
  </div>
+ </footer>
+
  </body>
  </html>