Longterm Wiki

Robust Unlearning

Alignment Training (emerging)

Removing dangerous knowledge from model weights in a way that resists relearning.

Cluster: Alignment Training

Tags

function:specification, stage:training, scope:technique