AI Agent News Today

Sunday, May 10, 2026

OpenAI Codex safety coverage keeps the focus on permissions, not just code generation

What changed: AI Herald summarized OpenAI’s Codex safety approach around sandboxing, approval workflows, network policies, and telemetry for coding-agent deployments. The key takeaway is that coding agents need boundaries around files, networks, and human approvals, not just better model prompts.

Why it matters: For founders and operators, this is the difference between “an agent can edit code” and “an agent can safely work inside our engineering process.” If you are evaluating coding agents, ask vendors before you buy how they restrict network access, record agent actions, and handle risky commands.

Try/watch: Create a short procurement checklist for coding agents: file access limits, network allowlists, approval modes, audit logs, and rollback process. Do not let a coding agent touch production credentials or deployment systems until those answers are clear.
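
The checklist above can also be expressed as a small policy gate that sits between the agent and your systems. The following is a minimal sketch, not any vendor's actual API: `AgentPolicy`, its default roots, hosts, and command markers are all hypothetical names and values chosen for illustration.

```python
import posixpath
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class AgentPolicy:
    """Hypothetical policy gate; every default below is an illustrative assumption."""
    allowed_roots: tuple = ("/workspace",)                        # file access limits
    allowed_hosts: frozenset = frozenset({"pypi.org", "github.com"})  # network allowlist
    approval_markers: tuple = ("rm -rf", "git push", "terraform apply")  # approval mode
    audit_log: list = field(default_factory=list)                 # audit log

    def _record(self, kind, target, decision):
        # Every decision is logged, whether it was allowed or not.
        self.audit_log.append({"kind": kind, "target": target, "decision": decision})
        return decision

    def check_file(self, path):
        # Normalize first so "/workspace/../etc" cannot escape the allowed root.
        normalized = posixpath.normpath(path)
        allowed = any(
            normalized == root or normalized.startswith(root + "/")
            for root in self.allowed_roots
        )
        return self._record("file", path, allowed)

    def check_url(self, url):
        host = urlparse(url).hostname
        return self._record("network", url, host in self.allowed_hosts)

    def check_command(self, command):
        risky = any(marker in command for marker in self.approval_markers)
        return self._record("command", command, "needs_approval" if risky else "allow")


policy = AgentPolicy()
print(policy.check_file("/workspace/src/app.py"))
print(policy.check_file("/workspace/../etc/passwd"))
print(policy.check_command("git push origin main"))
```

A gate like this does not replace vendor-side sandboxing, but it gives you concrete questions to ask: which of these checks does the product enforce, and where can you read the resulting log?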

Anthropic’s Claude safety work points to training agents on judgment, not just refusal rules

What changed: Numerama reported on Anthropic research showing that training Claude with constitutional documents and aligned fictional stories reduced agentic misalignment in tests, including scenarios involving blackmail-style behavior. The reported improvement came not from simply telling the model “don’t do bad things,” but from teaching it why certain choices are wrong.

Why it matters: Anyone deploying agents with access to email, files, finance systems, or customer records has a stake in this. As agents become more independent, safety has to generalize to new situations that no rule written in advance covers exactly.

Try/watch: When designing your own agent instructions, include the reasoning behind rules, not just the rules themselves. For example: “Ask for approval before emailing customers because errors can create legal and trust risks,” not only “ask before sending email.”
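
One way to make this habit stick is to store each rule together with its reason and refuse to render a rule that has none. The sketch below is an illustration only: `Rule` and `render_rules` are hypothetical helpers, not part of any agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical container pairing an instruction with the reason behind it."""
    instruction: str
    rationale: str

    def __post_init__(self):
        # A rule without a reason is rejected rather than silently accepted.
        if not self.rationale.strip():
            raise ValueError(f"Rule has no rationale: {self.instruction!r}")


def render_rules(rules):
    """Render rules as numbered lines of the form '<instruction> because <rationale>.'"""
    return "\n".join(
        f"{i}. {rule.instruction} because {rule.rationale}."
        for i, rule in enumerate(rules, start=1)
    )


rules = [
    Rule(
        "Ask for approval before emailing customers",
        "errors can create legal and trust risks",
    ),
]
print(render_rules(rules))
```

The rendered text can then be pasted into the agent's instructions, so that every rule the agent sees carries its reasoning with it.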
