DeepSeek V3
DeepSeek V3 was released on December 26, 2024. It is a 671B-parameter mixture-of-experts model with 37B parameters active per token. It achieved GPT-4o-level performance at a fraction of the training cost: it was reportedly trained for just $5.6M using FP8 mixed precision on 2,048 H800 GPUs. It scored 88.5% on MMLU and 90.2% on MATH, and was released under the MIT license. API pricing is $0.27 per million input tokens and $1.10 per million output tokens, making it one of the cheapest frontier-class models.
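To put the headline figures in proportion, the following is a rough back-of-envelope sketch in Python. The parameter counts, training cost, and GPU count come from the summary above; the flat rate of $2 per GPU-hour is an assumption introduced here for illustration, and the function names are hypothetical.

```python
# Back-of-envelope figures derived from the numbers quoted in the summary.
TOTAL_PARAMS_B = 671        # total parameters, in billions (from the text)
ACTIVE_PARAMS_B = 37        # parameters active per token, in billions (from the text)
TRAINING_COST_USD = 5.6e6   # reported training cost (from the text)
NUM_GPUS = 2048             # H800 GPUs (from the text)

# ASSUMPTION: a flat rental rate per GPU-hour; this page does not state one.
ASSUMED_USD_PER_GPU_HOUR = 2.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for any single token."""
    return active_b / total_b


def implied_gpu_hours(cost_usd: float, usd_per_gpu_hour: float) -> float:
    """GPU-hours that the reported cost would buy at the assumed rate."""
    return cost_usd / usd_per_gpu_hour


def implied_wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Wall-clock days if all GPUs ran in parallel for the whole job."""
    return gpu_hours / num_gpus / 24


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    hours = implied_gpu_hours(TRAINING_COST_USD, ASSUMED_USD_PER_GPU_HOUR)
    days = implied_wall_clock_days(hours, NUM_GPUS)
    print(f"Active parameters per token: {frac:.1%}")
    print(f"Implied GPU-hours at the assumed rate: {hours:,.0f}")
    print(f"Implied wall-clock time on {NUM_GPUS} GPUs: {days:.0f} days")
```

Under these assumptions, roughly one parameter in eighteen is active for a given token, and the reported budget corresponds to a training run on the order of two months. A different assumed hourly rate shifts the last two figures proportionally.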
Pricing
| Type | Price per MTok |
|---|---|
| Input | $0.27 |
| Output | $1.10 |
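As a worked example of how these per-million-token rates translate into the cost of a request, here is a minimal Python sketch. The rates are the ones listed in the table; the function name `estimate_cost_usd` and the sample token counts are hypothetical, and the sketch ignores any discounts or pricing tiers a provider may apply.

```python
from decimal import Decimal

# Rates taken from the pricing table (USD per million tokens).
INPUT_USD_PER_MTOK = Decimal("0.27")
OUTPUT_USD_PER_MTOK = Decimal("1.10")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts.

    Hypothetical helper: uses Decimal to avoid floating-point rounding
    noise in small currency amounts.
    """
    input_cost = Decimal(input_tokens) / TOKENS_PER_MTOK * INPUT_USD_PER_MTOK
    output_cost = Decimal(output_tokens) / TOKENS_PER_MTOK * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 2,000-token prompt with a 500-token reply.
    print(estimate_cost_usd(2_000, 500))
```

Because output tokens cost about four times as much as input tokens at these rates, long generations tend to dominate the bill even when prompts are much larger than replies.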
Benchmarks
| Benchmark | Score |
|---|---|
| MMLU | 88.5% |
| HumanEval | 82.6% |
| MATH | 90.2% |
DeepSeek Family
| Model | Tier | Released | Input $/MTok |
|---|---|---|---|
| DeepSeek R1 | | 2025-01-20 | $0.55 |
Details
| Field | Value |
|---|---|
| Model Family | DeepSeek |
| Generation | 3 |
| Release Date | 2024-12-26 |
| Context Window | 128K tokens |
Capabilities
tool-use
Tags
deepseek, mixture-of-experts, open-weight, cost-efficient