Anthropic: Claude can now end conversations to prevent harmful uses
What’s new: Anthropic has introduced a feature in its Claude Opus 4 and 4.1 models that allows them to end a conversation when they detect persistently harmful or abusive use. Anthropic describes it as a “model welfare” measure, intended as a last resort after attempts to redirect the user to helpful resources have failed. Claude Sonnet 4 will not receive the feature.
Who’s affected
Users of Claude Opus 4 and 4.1 may see a conversation ended in rare, extreme edge cases where harmful use is detected. Most users will never encounter the feature in normal interactions.
What to do
- Familiarize yourself with the new behavior in Claude Opus 4 and 4.1 so you understand when and why a conversation may be ended.
- Monitor user feedback regarding conversation endings to assess any impact on user experience.
- Consider updating documentation or training materials to reflect this new capability.