DeepSeek — Description: DeepSeek-V3 training cost approximately $5.58M using 2,788K H800 GPU hours over ~2 months on 2,048 H800 GPUs
The source text directly confirms all key elements of the claim: (1) Total training cost of 2.788M H800 GPU hours matches exactly; (2) The cost calculation of approximately $5.58M is confirmed (source states $5.576M, which rounds to $5.58M); (3) Pre-training on 14.8T tokens is explicitly stated; (4) Training duration of ~2 months ('less than two months') is confirmed; (5) 2,048 H800 GPUs is explicitly mentioned. The claim's description accurately reflects Table 1 and the surrounding text in the technical report. All numerical values align within rounding conventions.
Our claim
- Subject: DeepSeek
- Property: Description
- Value: DeepSeek-V3 training cost approximately $5.58M using 2,788K H800 GPU hours over ~2 months on 2,048 H800 GPUs
- As Of: December 2024
- Notes: Dramatically lower than comparable frontier models; pre-training on 14.8T tokens
Source evidence
1 src · 1 check
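The figures in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the source. It assumes a flat rental rate of $2 per H800 GPU hour, which this record does not state, and it assumes all 2,048 GPUs run continuously. The function names are hypothetical.

```python
# Figures taken from the claim record.
GPU_HOURS = 2_788_000   # 2,788K H800 GPU hours
NUM_GPUS = 2_048        # H800 GPUs in the cluster

# Assumption: flat rental rate in USD per H800 GPU hour (not stated in this record).
ASSUMED_RATE_USD = 2.0


def training_cost_usd(gpu_hours: float, rate_usd: float) -> float:
    """Total cost is GPU hours multiplied by the hourly rate."""
    return gpu_hours * rate_usd


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Days elapsed if every GPU runs continuously for the whole job."""
    return gpu_hours / num_gpus / 24


cost = training_cost_usd(GPU_HOURS, ASSUMED_RATE_USD)
days = wall_clock_days(GPU_HOURS, NUM_GPUS)

print(f"Estimated cost: ${cost / 1e6:.3f}M")   # Estimated cost: $5.576M
print(f"Wall-clock time: {days:.1f} days")     # Wall-clock time: 56.7 days
```

Under these assumptions the estimate reproduces the $5.576M figure, which rounds to $5.58M, and about 57 days of wall-clock time, which is consistent with the "~2 months" in the claim.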