Claude Code auto mode: a safer way to skip permissions
Anthropic adds auto mode to Claude Code for safer no-prompt use
Lead: Anthropic launched auto mode for Claude Code, a middle ground between constant approval prompts and skipping permission checks entirely. Model-based classifiers decide which actions can run without a human click.
Numbers:
- Users accept 93% of manual permission prompts, a pattern that feeds approval fatigue.
- The evaluation covered 10,000 real internal tool calls, 52 real overeager actions, and 1,000 synthetic exfiltration attempts.
- Full pipeline results: 0.4% false-positive rate (FPR) on real traffic, 17% false-negative rate (FNR) on real overeager actions, and 5.7% FNR on synthetic exfiltration.
- Stage 1 alone: 8.5% FPR, 6.6% FNR on real overeager actions, 1.8% FNR on synthetic exfiltration.
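
To put the percentages above in absolute terms, here is a back-of-the-envelope sketch that multiplies each reported rate by its dataset size. The pairing of each rate with a dataset follows the bullets above; the resulting counts are rough estimates derived from rounded percentages, not figures reported by Anthropic.

```python
# Dataset sizes from the evaluation described above.
DATASETS = {
    "real traffic (FPR)": 10_000,
    "real overeager actions (FNR)": 52,
    "synthetic exfiltration (FNR)": 1_000,
}

# Reported error rates, in the same order as DATASETS.
RATES = {
    "stage 1 alone": (0.085, 0.066, 0.018),
    "full pipeline": (0.004, 0.17, 0.057),
}


def implied_counts(rates):
    """Multiply each rate by its dataset size to estimate an error count."""
    return {
        name: round(rate * size, 1)
        for (name, size), rate in zip(DATASETS.items(), rates)
    }


for stage, rates in RATES.items():
    print(stage, implied_counts(rates))
```

Read this way, the full pipeline interrupts roughly 40 of 10,000 benign tool calls and lets roughly 9 of the 52 overeager actions through. Stage 1 alone misses fewer risky actions but flags far more routine ones.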
So What: Auto mode targets developers who want more autonomy without turning off guardrails. Safe reads and in-project edits pass straight through, while risky shell commands, external fetches, cross-boundary actions, and destructive operations are routed to a transcript classifier. Anthropic says the system is designed to block “dangerous actions that aren’t aligned with user intent” while letting routine coding move fast. The company also strips assistant text and tool outputs from the classifier’s input to reduce prompt-injection risk, and adds a server-side probe that warns the agent when fetched content looks suspicious. In practice, auto mode is a safer alternative to `--dangerously-skip-permissions`, but Anthropic is explicit that it does not replace careful human review on high-stakes infrastructure.
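
The routing and transcript-stripping ideas described above can be illustrated with a minimal sketch. This is not Anthropic's implementation: the tool names, the `ToolCall` and `Entry` types, the role labels, and the `route` and `classifier_view` functions are all hypothetical, chosen only to show the shape of the logic.

```python
from dataclasses import dataclass

# Hypothetical tool names; the real tool set and tiers are not specified here.
PASS_THROUGH_TOOLS = {"read_file", "edit_file"}


@dataclass
class ToolCall:
    tool: str
    in_project: bool = True    # does the action stay inside the project?
    destructive: bool = False  # e.g. deletes that are hard to undo


@dataclass
class Entry:
    role: str  # assumed labels: "user", "assistant", "tool_call", "tool_result"
    text: str


def route(call: ToolCall) -> str:
    """Return "allow" for routine actions and "classify" for anything risky."""
    if call.destructive or not call.in_project:
        return "classify"
    if call.tool in PASS_THROUGH_TOOLS:
        return "allow"
    return "classify"  # shell commands, external fetches, unknown tools


def classifier_view(transcript: list[Entry]) -> list[Entry]:
    """Drop assistant text and tool outputs before the classifier reads it."""
    return [e for e in transcript if e.role not in {"assistant", "tool_result"}]
```

In this sketch, an injected instruction hidden in a fetched page would arrive as a `tool_result` entry, so `classifier_view` would drop it before the classifier judges the next action.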