Then maybe it’s a knowledge / understanding issue, because I’ve trawled through the article multiple times seeing if I’d missed just that. What I do see is:
- “deleted our production database and all volume-level backups in a single API call to Railway, our infrastructure provider” (implying permissions across all environments)
- “The AI agent was set to complete a routine task in the PocketOS staging environment” (implying it needed only staging environment permissions; the “routine task” is never described, and no reason is given why it would need production access)
- “I decided to do it on my own to ‘fix’ the credential mismatch” (This is the AI part of the fuckup: deciding to delete data over a credential issue is something even a junior engineer probably wouldn’t jump to, so that’s on the AI and on Anthropic, whose safeguards failed)
What am I missing here? What is a “routine task in [a] staging environment”, and why does it need admin permissions? Why does the agent have permissions for the prod environment if it’s supposed to work in the staging one?
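To make concrete what I’d expect instead, here is a rough sketch of an environment-scoped credential check. Everything in it is made up for illustration: `ScopedToken`, `authorize`, and the action names are hypothetical, and none of it is Railway’s actual API or the setup described in the article.

```python
from dataclasses import dataclass


class PermissionDenied(Exception):
    """Raised when a token is used outside its scope."""


@dataclass(frozen=True)
class ScopedToken:
    # Hypothetical token model, not any real provider's API.
    environment: str            # the single environment this token may touch
    allowed_actions: frozenset  # the actions granted inside that environment


def authorize(token: ScopedToken, environment: str, action: str) -> None:
    """Raise PermissionDenied unless the token covers both environment and action."""
    if environment != token.environment:
        raise PermissionDenied(
            f"token is scoped to {token.environment!r}, not {environment!r}"
        )
    if action not in token.allowed_actions:
        raise PermissionDenied(f"action {action!r} is not granted to this token")


# An agent doing staging work gets a staging-only token without destructive actions.
agent_token = ScopedToken("staging", frozenset({"read", "deploy"}))
authorize(agent_token, "staging", "deploy")  # allowed, returns None
```

With a token like this, a call such as `authorize(agent_token, "production", "delete_volume")` raises `PermissionDenied` no matter what the agent “decides”, which is the whole point of scoping credentials per environment.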
The first question I ask about any analytics request is what you’re trying to do with the results, i.e. what business question you want to answer. The second is how the analytics question relates to that business question.
It’s very easy to ask a question, look for a way to measure the answer, find something you can measure, and start looking for the best way to ask for that measure. That chain implicitly assumes “I need to measure an answer -> I can measure this -> this is the answer”, but as you point out, this isn’t a valid implication.
My worst enemy is the sentiment “If you can’t measure it, you can’t manage it.” Curse every MBA that repeats it like a mantra, in the name of the Stocks and the Shareholder Value and the Holy KPI.
Yes, some measures can be valuable indicators if contextualised correctly, but not everything that has to be managed can be measured effectively. Grasping for measures anyway pulls your attention away from the actual, unmeasured facts.