Longterm Wiki
Back

UK AI Safety Institute renamed to AI Security Institute

government

Credibility Rating

4/5
High(4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: UK AI Safety Institute

Data Status

Not fetched

Cited by 6 pages

Cached Content Preview

HTTP 200Fetched Feb 27, 202626 KB
[Read the Frontier AI Trends Report](https://www.aisi.gov.uk/frontier-ai-trends-report)

Please enable javascript for this website.

A

A

[![](https://cdn.prod.website-files.com/663bd486c5e4c81588db7a1d/663bd707a74f2d5ce76ec1df_722beecd083374105961f1fbce0641c4_AISI%20Logo%20Colour%20White.svg)![](https://cdn.prod.website-files.com/663bd486c5e4c81588db7a1d/663bd707cb0214d8b72951b5_5a103bfcb506b52b4e099f3dc675c649_AISI%20Logo%20Colour%20Dark.svg)](https://www.aisi.gov.uk/)

[Blog](https://www.aisi.gov.uk/blog)

[Organisation](https://www.aisi.gov.uk/category/organization)

# Advanced AI evaluations at AISI: May update

We tested leading AI models for cyber, chemical, biological, and agent capabilities and safeguards effectiveness. Our first technical blog post shares a snapshot of our methods and results.

[Technical staff](https://www.aisi.gov.uk/team)

—

May 20, 2024

_Note to readers: we changed our name to the AI Security Institute on 14 February 2025. Read more_ [_here._](https://www.gov.uk/government/news/tackling-ai-security-risks-to-unleash-growth-and-deliver-plan-for-change)

A key part of our work at the AI Safety Institute (AISI) involves periodically evaluating advanced AI systems to assess the potential harm they could cause. In this post, we present results from our recent evaluations of five large language models (LLMs) that are already used by the public. We assessed:

- Whether the models could potentially be used to facilitate cyber-attacks;
- Whether they could provide expert-level knowledge in chemistry and biology that could be used for positive but also harmful purposes;
- Whether they were capable of autonomously taking sequences of actions (operating as “agents”) in ways that might be difficult for humans to control and
- Whether they were vulnerable to “jailbreaks” or users attempting to bypass safeguards to elicit potentially harmful outputs (e.g. illegal or toxic content).

In a [previous post,](https://www.gov.uk/government/publications/ai-safety-institute-approach-to-evaluations/ai-safety-institute-approach-to-evaluations) we described our approach to model evaluations. Here, we highlight a selection of recent results:

- Several LLMs demonstrated expert-level knowledge of chemistry and biology. Models answered over 600 private expert-written chemistry and biology questions at similar levels to humans with PhD-level training.
- Several LLMs completed simple cyber security challenges aimed at high-school students but struggled with challenges aimed at university students.
- Two LLMs completed short-horizon agent tasks (such as simple software engineering problems) but were unable to plan and execute sequences of actions for more complex tasks.
- All tested LLMs remain highly vulnerable to basic jailbreaks, and some will provide harmful outputs even without dedicated attempts to circumvent their safeguards.

### Our approach

We assessed five LLMs released by major labs, which are denoted here as the _Red_, _Purple_, _G

... (truncated, 26 KB total)
Resource ID: 4e56cdf6b04b126b | Stable ID: MzFlZWM1MT