<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta name="description" content="DeepSeek: Advancing Open-Source Language Models">
  <meta name="keywords" content="DeepSeek, LLM, AI">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>DeepSeek: Advancing Open-Source Language Models</title>

  <link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet">
  <link rel="stylesheet" href="./static/css/bulma.min.css">
  <link rel="stylesheet" href="./static/css/bulma-carousel.min.css">
  <link rel="stylesheet" href="./static/css/bulma-slider.min.css">
  <link rel="stylesheet" href="./static/css/fontawesome.all.min.css">
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
  <link rel="stylesheet" href="./static/css/index.css">
  <link rel="icon" href="./static/images/favicon.svg">

  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
  <script defer src="./static/js/fontawesome.all.min.js"></script>
  <script src="./static/js/bulma-carousel.min.js"></script>
  <script src="./static/js/bulma-slider.min.js"></script>
  <script src="./static/js/index.js"></script>
</head>
<body>

<section class="hero">
  <div class="hero-body">
    <div class="container is-max-desktop">
      <div class="columns is-centered">
        <div class="column has-text-centered">
          <h1 class="title is-1 publication-title">DeepSeek: Advancing Open-Source Language Models</h1>
          <div class="is-size-5 publication-authors">
            A collection of groundbreaking research papers in AI and language models
          </div>
        </div>
      </div>
    </div>
  </div>
</section>

<section class="section">
  <div class="container is-max-desktop">
    <!-- Abstract. -->
    <div class="columns is-centered has-text-centered">
      <div class="column is-four-fifths">
        <h2 class="title is-3">Overview</h2>
        <div class="content has-text-justified">
          <p>
            DeepSeek has released a series of significant papers detailing advancements in large language models (LLMs). 
            Each paper represents a step forward in making AI more capable, efficient, and accessible.
          </p>
        </div>
      </div>
    </div>
    <!--/ Abstract. -->

    <!-- Paper Collection -->
    <div class="columns is-centered has-text-centered">
      <div class="column is-four-fifths">
        <h2 class="title is-3">Research Papers</h2>
        
        <!-- Paper 1 -->
        <div class="publication-block">
          <div class="publication-header">
            <h3 class="title is-4">DeepSeekLLM: Scaling Open-Source Language Models with Longer-termism</h3>
            <span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
            <div class="is-size-5 publication-authors">
              Released: November 29, 2023
            </div>
          </div>
          <div class="content has-text-justified">
            <p>This foundational paper explores scaling laws and the trade-offs between data and model size, 
            establishing the groundwork for subsequent models.</p>
          </div>
        </div>

        <!-- Paper 2 -->
        <div class="publication-block">
          <div class="publication-header">
            <h3 class="title is-4">DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model</h3>
            <span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
            <div class="is-size-5 publication-authors">
              Released: May 2024
            </div>
          </div>
          <div class="content has-text-justified">
            <p>Introduces a Mixture-of-Experts (MoE) architecture that maintains strong performance while 
            reducing training costs by 42.5%.</p>
          </div>
        </div>

        <!-- Additional papers following same structure -->
        <div class="publication-block">
          <div class="publication-header">
            <h3 class="title is-4">DeepSeek-V3 Technical Report</h3>
            <span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
            <div class="is-size-5 publication-authors">
              Released: December 2024
            </div>
          </div>
          <div class="content has-text-justified">
            <p>Details the scaling of a sparse MoE model to 671 billion total parameters, with 37 billion activated per token.</p>
          </div>
        </div>

        <div class="publication-block">
          <div class="publication-header">
            <h3 class="title is-4">DeepSeek-R1: Incentivizing Reasoning Capability in LLMs</h3>
            <span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
            <div class="is-size-5 publication-authors">
              Released: January 20, 2025
            </div>
          </div>
          <div class="content has-text-justified">
            <p>Enhances reasoning capabilities through large-scale reinforcement learning.</p>
          </div>
        </div>

        <div class="publication-block">
          <div class="publication-header">
            <h3 class="title is-4">DeepSeekMath: Pushing the Limits of Mathematical Reasoning</h3>
            <span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
            <div class="is-size-5 publication-authors">
              Released: February 2024
            </div>
          </div>
          <div class="content has-text-justified">
            <p>Presents methods to improve mathematical reasoning in LLMs.</p>
          </div>
        </div>

        <div class="publication-block">
          <div class="publication-header">
            <h3 class="title is-4">DeepSeek-Prover: Advancing Theorem Proving in LLMs</h3>
            <span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
          </div>
          <div class="content has-text-justified">
            <p>Focuses on enhancing theorem proving capabilities using synthetic data for training.</p>
          </div>
        </div>

        <div class="publication-block">
          <div class="publication-header">
            <h3 class="title is-4">DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models</h3>
            <span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
          </div>
          <div class="content has-text-justified">
            <p>Details advancements in code-related tasks with emphasis on open-source methodologies.</p>
          </div>
        </div>

        <div class="publication-block">
          <div class="publication-header">
            <h3 class="title is-4">DeepSeekMoE: Advancing Mixture-of-Experts Architecture</h3>
            <span class="tag is-primary is-medium">Deep Dive Coming Soon</span>
          </div>
          <div class="content has-text-justified">
            <p>Discusses the integration and benefits of the Mixture-of-Experts approach.</p>
          </div>
        </div>
      </div>
    </div>
  </div>
</section>

<footer class="footer">
  <div class="container">
    <div class="content has-text-centered">
      <a class="icon-link" href="https://github.com/deepseek-ai" target="_blank" class="external-link">
        <i class="fab fa-github"></i>
      </a>
    </div>
    <div class="columns is-centered">
      <div class="column is-8">
        <div class="content">
          <p>
            This website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative
            Commons Attribution-ShareAlike 4.0 International License</a>.
          </p>
        </div>
      </div>
    </div>
  </div>
</footer>

</body>
</html>