@luciferofastora

luciferofastora@feddit.org · 7 hours ago

Depressingly, I suspect an executive would consider me far less productive because I only did 5 lines of change and the junior dev would have done thousands…

The first question I ask about any analytics request is what you’re trying to do with the results, what business question you want to answer. The second is how the analytics question relates to that business question.

It’s very easy to ask a question, look for a way to measure the answer, find something you can measure and start looking for the best way to ask for that measure. It implicitly assumes that “I need to measure an answer -> I can measure this -> this is the answer”, but as you point out, this isn’t a valid implication.

My worst enemy is the sentiment “If you can’t measure it, you can’t manage it.” Curse every MBA that repeats it like a mantra, in the name of the Stocks and the Shareholder Value and the Holy KPI.

Yes, some measures can be valuable indicators, if contextualised correctly, but not everything that has to be managed can be measured effectively. To grasp for measures anyway twists your sight away from the actual, non-measured facts.

luciferofastora@feddit.org · 8 hours ago

Then maybe it’s a knowledge / understanding issue, because I’ve trawled through the article multiple times seeing if I’d missed just that. What I do see is:

“deleted our production database and all volume-level backups in a single API call to Railway, our infrastructure provider” (implying permissions across all environments)
“The AI agent was set to complete a routine task in the PocketOS staging environment” (implying it needed (only) staging environment permissions, no description of specific “routine task”, no reason it would need production access)
“I decided to do it on my own to ‘fix’ the credential mismatch” (This is the AI part of the fuckup: The decision to delete data over a credential issue is something even a Junior engineer probably wouldn’t jump to, so that’s on the AI and on Anthropic, whose safeguards failed)

What am I missing here? What is a “routine task in [a] staging environment”, why does it need admin permissions? Why does the agent have permissions for the prod environment if it’s supposed to work in the staging one?

luciferofastora@feddit.org · 14 hours ago

I started studying “technical IT” for two years before changing to a less technical version that I ended up enjoying more (physics is fun to learn, but I don’t wanna calculate that shit).

One of the most valuable things to come out of it was one class where we worked our way up all the way from logic gates to the functions of an ALU and a rough look at CPUs and memory architecture. It probably would have gone deeper in a follow-up class I never ended up taking.
The point of the class was that one of the focus options in that course of study featured micro-controllers and embedded systems, including low-level optimisation (the typical memory constraints might be getting more lax, but learning it isn’t a bad idea).

I don’t remember most of the details, I’m afraid, but it was an interesting insight into the things I take for granted when working in higher level languages.

luciferofastora@feddit.org · 17 hours ago

Not being on Microslop’s ecosystem.

More seriously, it depends on what exact part you want. Matrix + Jitsi work for chats and calls, though they are a little shaky (but so is Teams sometimes, I don’t much see the downsides). Calendar apps are ~~a dime~~ free a dozen.

What other tools often lack is direct integration between multiple services. You’d have to manually link a NextCloud directory in the Matrix chat, because it doesn’t have a near-seamless SharePoint integration.

The upside is independence: You can migrate between hosts, or host your own instances of NextCloud and Matrix, entirely within your (virtual) private network. And those are just services I can name off the top of my head, odds are there are plenty of other good solutions.

luciferofastora@feddit.org · 17 hours ago

Teams on Desktop works significantly better than Browser in my experience, at least on my company-issued Win11 (no Linux for me at work 😢), but I really wish we didn’t have Microslop’s hands so far up our figurative ass.

luciferofastora@feddit.org · 17 hours ago

My employer uses them too for certain documents, and I gotta say, I’m not a fan of their tech side either. The contact I’m working with is a dear, always on top of getting my questions to the relevant engineers and getting a timely response back, but he can’t make gold from straw either.

luciferofastora@feddit.org · 18 hours ago

> They gave the AI the job of managing IaC for their environment. Then were shocked when the AI managed the environment incorrectly. This is absolutely not something you let a junior engineer anywhere near.

See, this is the piece of information I was missing. When the article says “routine tasks”, I didn’t think it meant “manage environment”.

In that case, I agree that it is an issue of trusting AI with something that it shouldn’t have been.

> You seem to be suggesting that the AI should be able to do the job they gave it without being given the permission required for it to do.

No, I was simply mistaken about the job it was given. Like I said, all I had to work with was the tomshardware article, which doesn’t go into much detail. I didn’t know that the “routine task in staging” required permission to delete entire cloud volumes across all environments instead of just specific environment-scoped project tokens.

Obviously, if it’s tasked with managing all project environments and given the access to do so, that’s a timebomb. In this case, it was, until it blew up.

> The thing about doing things in IT, is you need to have permissions to do the things you’re asked to do.

The thing about conversations on the Internet is you need to actually read the whole comment and realise that there may be some misunderstanding if the other party says things like “I can’t read the twitter link” and assumes it’s a junior dev job when you know it’s not. Then you could just point out the part they didn’t know without being condescending and assuming a fundamental lack of understanding of how IT works.

I’ve had more than enough instances of troubleshooting just which scopes my access token needs to be intimately familiar with the way permissions work. I personally tend to request the least amount required for a given task and only expand when needed and reasonable. It is my understanding that this is the best practice. It was my assumption that they had assigned permissions their agent didn’t need, because you generally don’t hand out “fuck up my prod system” rights.
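To make the least-privilege point concrete, here is a rough sketch in Python. Everything in it is hypothetical and made up for illustration: the `Token` class, the `(environment, action)` scope pairs and the names `is_allowed` and `request_scopes` are assumptions, not Railway's API or anything taken from the incident report. It only shows the idea of granting exactly what a task declares it needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical access token: a name plus the (environment, action) pairs it may use."""
    name: str
    scopes: frozenset


def request_scopes(task_needs):
    """Grant exactly the scopes the task declares, nothing more (assumed policy)."""
    return frozenset(task_needs)


def is_allowed(token, environment, action):
    """Allow an operation only if that exact (environment, action) pair was granted."""
    return (environment, action) in token.scopes


# A routine staging task declares only what it needs (illustrative scope names).
staging_task = [("staging", "read"), ("staging", "deploy")]
agent_token = Token("staging-agent", request_scopes(staging_task))

print(is_allowed(agent_token, "staging", "deploy"))            # True
print(is_allowed(agent_token, "production", "delete_volume"))  # False
```

With a token scoped like this, a bad decision by whoever holds it (junior dev, script or agent) gets refused at the permission check instead of reaching the production volumes. Expanding the scope then becomes a deliberate step rather than the default.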

luciferofastora@feddit.org · 1 day ago

> Yes and in this case using it for this job at all was clearly not within safe limits.

Do you have any detail on what “this job” was? Like I said, I don’t have access to the original statement because twatter wants me to log in to see it.

What I do see is “routine task in the […] staging environment”, and that doesn’t sound like a big blast zone job. Again, it’s comparable to a job you’d give a junior engineer. There shouldn’t be much a junior engineer can fuck up, no matter how “creative” their solutions.

Whether it’s a human junior engineer, an automatic script or an agentic AI, they should never have more privileges than they need for their job. Granting someone or something that isn’t the senior admin permission to delete a volume is irresponsible.

The AI generating that fucking awful idea is on the AI (or its developers). Both are partial causes for the incident. It’s not just human error, but it’s also human error that would have been dangerous regardless of AI involvement.

luciferofastora@feddit.org · 2 days ago

I can’t read the original twitter link, but I’m not sure they handed it the job of a senior infrastructure engineer. The article says “routine”, which to me is something you can hand off to a junior just fine. When they hit a snag, they obviously should stop and ask what to do, but even then, a human might want to avoid admitting ignorance and try to fix it themselves instead. They shouldn’t have privileges to fuck up that badly.

So while it’s on the AI for taking destructive steps, I do think there’s a human error in the form of grossly irresponsible rights allotment. If this was a first-of-its-kind incident that shows otherwise stellar AI fucking up badly, I’d classify it as a pure AI problem, but their limits are hardly novel at this point. There have been previous incidents circulating the media. We’ve had memes about it. If you can’t stay up to date on your tools and their shortcomings, you shouldn’t be using them, because discovering a footgun becomes a question of “when”, not “if”.

That’s why I consider this partially a human failing: If you’re gonna use a tool, make sure that it operates within safe limits. The chainsaw doesn’t know the difference between tree and bone, so it’s on you to make sure it stays away from anyone’s legs. So while “Chainsaw can saw legs if wielded improperly” is a problem that was accepted as a tradeoff for its utility, you can’t really blame the chainsaw if you zip-tied the safety.

(Again, not to say Anthropic is blameless for letting its random generator generate randomly destructive shit. I just don’t think that’s the only point of failure here.)

luciferofastora@feddit.org · 2 days ago

A human with the same permissions would have been capable of fucking up too. Giving the equivalent of a junior dev with a learning disability the keys to the whole place is just dumb.

(Relying on AI is dumb anyway, but that’s not the biggest issue in this specific case)