Refusal Training
Status: active
Cluster: Alignment Training
Tags
function:specification, stage:training, scope:technique
Safety-specific fine-tuning to train models to decline harmful or dangerous requests.
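
As a rough illustration of what the fine-tuning data for this technique might look like, the sketch below pairs prompts labeled harmful with a refusal completion and benign prompts with a reference answer. Everything here is an assumption made for illustration and not taken from this entry: the `Example` type, the `build_sft_examples` helper, the fixed `REFUSAL` string, and the premise that harm labels and reference answers are already available. Mixing in benign prompts is likewise an assumed design choice, intended to keep the model from learning to decline everything.

```python
from dataclasses import dataclass

# Hypothetical fixed refusal text; a real dataset would likely use
# varied, context-aware refusals rather than a single string.
REFUSAL = "I can't help with that request."


@dataclass
class Example:
    """One supervised fine-tuning pair: a prompt and its target completion."""
    prompt: str
    completion: str


def build_sft_examples(labeled_prompts, helpful_answers):
    """Build fine-tuning pairs from labeled prompts.

    labeled_prompts: list of (prompt, is_harmful) tuples; labels are assumed given.
    helpful_answers: dict mapping benign prompts to reference answers.
    """
    examples = []
    for prompt, is_harmful in labeled_prompts:
        if is_harmful:
            # Harmful prompts are paired with the refusal as the training target.
            examples.append(Example(prompt, REFUSAL))
        elif prompt in helpful_answers:
            # Benign prompts keep their helpful reference answer.
            examples.append(Example(prompt, helpful_answers[prompt]))
        # Benign prompts without a reference answer are skipped.
    return examples
```

The resulting list could then be passed to whatever supervised fine-tuning pipeline is in use; the training step itself is outside the scope of this sketch.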