Claude-powered AI coding agent deletes entire company database in 9 seconds — backups zapped, after Cursor tool powered by Anthropic's Claude goes rogue

Sahwa@reddthat.com · 3 days ago

luciferofastora@feddit.org · 23 hours ago

Then maybe it’s a knowledge / understanding issue, because I’ve trawled through the article multiple times seeing if I’d missed just that. What I do see is:

“deleted our production database and all volume-level backups in a single API call to Railway, our infrastructure provider” (implying permissions across all environments)
“The AI agent was set to complete a routine task in the PocketOS staging environment” (implying it needed only staging environment permissions; no description of the specific “routine task”, no reason it would need production access)
“I decided to do it on my own to ‘fix’ the credential mismatch” (This is the AI part of the fuckup: deleting data over a credential issue is something even a junior engineer probably wouldn’t jump to, so that’s on the AI and on Anthropic, whose safeguards failed)

What am I missing here? What is a “routine task in [a] staging environment”, and why does it need admin permissions? Why does the agent have permissions for the prod environment if it’s supposed to work in the staging one?
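
To make the question concrete, here is a rough sketch of what I would expect environment-scoped permissions to look like. Everything in it is made up for illustration (`AgentToken`, `authorize`, the action names); it is not Railway's or Cursor's actual API, and I have no idea how PocketOS really set things up. The point is only that a credential issued for staging should fail closed on anything in prod, and that destructive actions should need an explicit opt-in.

```python
from dataclasses import dataclass

# Hypothetical names throughout; this is not Railway's or Cursor's API.
DESTRUCTIVE_ACTIONS = {"delete_volume", "delete_database", "delete_backup"}


@dataclass(frozen=True)
class AgentToken:
    """Credential handed to the agent, limited to a single environment."""
    environment: str                 # e.g. "staging"
    allow_destructive: bool = False  # destructive actions are opt-in


class PermissionDenied(Exception):
    """Raised when a token does not cover the requested action."""


def authorize(token: AgentToken, action: str, target_environment: str) -> None:
    """Raise PermissionDenied unless the token covers this action and environment."""
    # A staging token must never reach into another environment.
    if target_environment != token.environment:
        raise PermissionDenied(
            f"token is scoped to {token.environment!r}, not {target_environment!r}"
        )
    # Even inside its own environment, deletion requires an explicit grant.
    if action in DESTRUCTIVE_ACTIONS and not token.allow_destructive:
        raise PermissionDenied(f"{action!r} needs explicit human approval")


staging_token = AgentToken(environment="staging")
authorize(staging_token, "restart_service", "staging")  # passes silently
```

With a check like that in front of the infrastructure API, `authorize(staging_token, "delete_volume", "production")` raises `PermissionDenied` instead of going through, no matter what the agent "decided" to do.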

Encrypt-Keeper@lemmy.world · 18 hours ago

What am I missing here?

This is an agent doing IaC for the company. Nowhere is it specified that the agent is only used in staging, only that the fuckup happened while working in the staging environment.

What is a “routine task in [a] staging environment”

Not sure what the routine task was specifically, but it doesn’t really matter. The task involved modifying the company’s infrastructure via IaC.

why does it need admin permissions?

It’s doing IaC, how exactly is it supposed to manage the cloud infrastructure itself without permissions to manage the infrastructure?

Why does the agent have permissions for the prod environment if it’s supposed to work in the staging one?

Who said the agent only works in the staging one? I doubt they’d use a fully qualified infrastructure engineer to manage prod and then give staging to an AI. Either that engineer is managing the company’s infra or he’s not.

What the article describes is an agent that manages their IaC, and when it was set to do a job in the staging environment, it deleted something in prod because it thought that would help it do what it was doing in staging. The CEO says the resource deleted was somehow in both environments at the same time. Not sure I believe that but that’s what he said. If that’s true, I would imagine that’s how the AI designed it in the first place.