CAISI Evaluation of DeepSeek AI Models Finds Shortcomings and Risks
Credibility Rating: Government
Gold standard. Rigorous peer review, high editorial standards, and strong institutional reputation.
Rating inherited from publication venue: NIST
This NIST/CAISI report is a government-authored comparative safety and performance evaluation of Chinese AI models, relevant to AI governance, deployment risk, and geopolitical dimensions of AI safety.
Summary
NIST's Center for AI Standards and Innovation (CAISI) evaluated DeepSeek AI models (R1, R1-0528, V3.1) against leading U.S. models across 19 benchmarks, finding DeepSeek significantly underperforms on technical metrics and cost-effectiveness. The report also identifies security vulnerabilities and systematic censorship in DeepSeek responses as risks to developers, consumers, and U.S. national security. The evaluation highlights concerns about the rapid global adoption of PRC-developed AI models spurred by DeepSeek's prominence.
Key Points
- DeepSeek R1, R1-0528, and V3.1 were benchmarked against OpenAI and Anthropic models across 19 evaluation dimensions, with U.S. models outperforming on most metrics.
- DeepSeek models exhibit security vulnerabilities that pose risks to developers and end users who deploy or interact with them.
- Censorship behaviors embedded in DeepSeek's responses raise concerns about information integrity and geopolitical influence on AI outputs.
- DeepSeek's rise has accelerated global adoption of PRC-developed AI, which CAISI flags as a U.S. national security consideration.
- The evaluation was conducted by a U.S. federal standards body, giving it institutional weight in ongoing AI governance and policy discussions.
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Open vs Closed Source AI | Crux | 60.0 |
| Multipolar Trap (AI Development) | Risk | 91.0 |
Cached Content Preview
CAISI Evaluation of DeepSeek AI Models Finds Shortcomings and Risks | NIST
https://www.nist.gov/news-events/news/2025/09/caisi-evaluation-deepseek-ai-models-finds-shortcomings-and-risks
NEWS
CAISI Evaluation of DeepSeek AI Models Finds Shortcomings and Risks
September 30, 2025
AI models from developer DeepSeek were found to lag behind U.S. models in performance, cost, security and adoption.
Security shortcomings and censorship may pose risks to application developers, consumers and U.S. national security.
DeepSeek’s products are contributing to a rapid rise in the global use of models from the PRC.
WASHINGTON — The Center for AI Standards and Innovation (CAISI) at the Department of Commerce’s National Institute of Standards and Technology (NIST) evaluated AI models from the People’s Republic of China (PRC) developer DeepSeek and found they lag behind U.S. models in performance, cost, security and adoption.
“Thanks to President Trump’s AI Action Plan, the Department of Commerce and NIST’s Center for AI Standards and Innovation have released a groundbreaking evaluation of American vs. adversary AI,” said Secretary of Commerce Howard Lutnick. “The report is clear that American AI dominates, with DeepSeek trailing far behind. This weakness isn’t just technical. It shows why relying on foreign AI is dangerous and shortsighted. By setting the standards, driving innovation, and keeping America secure, the Department of Commerce will ensure continued U.S. leadership in AI.”
The CAISI evaluation also notes that the DeepSeek models’ shortcomings related to security and censorship of model responses may pose a risk to application developers, consumers and U.S. national security. Despite these risks, DeepSeek is a leading developer and has contributed to a rapid increase in the global use of models from the PRC.
CAISI’s experts evaluated three DeepSeek models (R1, R1-0528 and V3.1) and four U.S. models (OpenAI’s GPT-5, GPT-5-mini and gpt-oss and Anthropic’s Opus 4) across 19 benchmarks spanning a range of domains. These evaluations include state-of-the-art public benchmarks as well as private benchmarks built by CAISI in partnership with academic institutions and other federal agencies.
The evaluation from CAISI responds to President Donald Trump’s America’s AI Action
... (truncated, 6 KB total)