My team and I are firm that we are the ones accountable. LLMs are a tool like every other. Only that it's non-deterministic. But I am the one using the tool. I am the one giving the tool access. I am the one who has to keep everything safe.
I have shot myself in the foot using gparted in the past by wiping the wrong disk. gparted wasn't to blame. I was.
Letting LLMs work freely without supervision sounds great but it will lead to pain. I have to supervise their work. And that is also during execution. You can try to replace a human but we see where this leads. Sooner or later the LLM will do something stupid and then the only one to blame is the person who used the tool.
This is kind of the reverse of https://en.wikipedia.org/wiki/Poka-yoke . A lot of tools have affordances built in to make "right" things easy and "wrong" or unsafe things harder. LLMs .. well, the text interface is uniquely flat. Everything is seemingly as easy as everything else.
I worry about the use of humans as sacrificial accountability sinks. The "self-driving car" model already has this: a car which drives itself most of the time, but where a human user is required to be constantly alert so that the AI can transfer responsibility a few hundred milliseconds before the crash.
> A lot of tools have affordances built in to make "right" things easy and "wrong" or unsafe things harder.
This is true for almost anything handed to laypeople, but not for a lot of professional tools. Even a plain battery powered drill has very few protections against misuse. A soldering iron has none. Neither do sewing needles; sewing machines barely do, in the sense that you can't stick your fingers in a gap too narrow. A chemist's chemicals certainly have no protections, only warning labels. Etc.
people don't seem to want to eliminate AI → replacing it doesn't improve things → isolating it - yup, people are trying to put it in containers and not give it access to delete the production database → changing how people work with it: that's where we are now → PPE: no such thing for AI, sadly → production database is deleted.
Exactly this. I was talking about professionals. People who should know better. If we as professionals give away our agency and our accountability we make ourselves obsolete. If I just tell the LLM what to do and hope it doesn't go south then the Manager could probably do that as well.
And if a non professional did it they should ask themselves why we have professionals. Maybe there was a reason and maybe they do have value.
An LLM is a large and complex machine, not a screwdriver. Large and complex [physical] machines are built with safeguards to prevent misuse, injury, etc by regulation.
LLMs are in principle text in / text out machines. If the user extends their capability to have agency over a production database or a machine, there's nothing that can guarantee safety.
Imagine I ask an LLM to instruct left/right/speed up/slow down while driving. I can simply bypass any safeguard by stating I suddenly became blind while driving a car, while in fact I'm blindfolded and doing an experiment on a highway.
A bulldozer is a large and complex physical machine, yet it has (almost¹) no safeguards against misuse or injury. It's all operator training. Lathes tend to not have doors/enclosures, in particular large ones. You get taught where to not put your fingers, and to wear safety goggles. Cranes don't have a lot of safeguards either, you better know how to attach things; hardhats aren't gonna do sh*t if you get a ton of concrete dropped on you.
etc. pp.
I'm not sure where this "tools are made to be safe" belief comes from. This is only the case in "consumer" environments. Of course you don't intentionally make things unnecessarily unsafe, but in a professional environment there is an expectation that the operator had training and knows what they're doing.
Maybe that's what we're missing: training in safe AI use. With a certificate that has to be periodically renewed. At the current rate things are going, I'd say 3 months is a good renewal cycle ;D. </s>
(¹ it beeps when it goes backwards. Honestly, I'm not sure that counts for much.)
I agree that LLMs could be more open about their dangers and that people are bad at judging risks sometimes.
Still I think a band saw has very little warning on it and by its design there is very little anyone can do about me cutting off my finger if I am not careful.
LLM companies can do very little about the unpredictability of LLMs. So we have to choose how far we will let it go. In the end the LLM only produces texts. We are in control of what tools we give it. The more tools the more useful and also the more dangerous.
And maybe it's all worth it. Maybe the LLM deletes the database only sometimes but between that we make a lot of money. I don't think my employer would enjoy that so I will be more conservative.
It’s possible to make AI safe, but that also throws most of the gains out of the window, especially if the artifact is a diff which can take time to review. In IT, you often have to give access to possibly malicious users, you just have to scope what they can do.
But the push is agentic everything, where AI needs to be everywhere, not in its own sandbox.
> Still I think a band saw has very little warning on it and by its design there is very little anyone can do about me cutting off my finger
Most saws have a blade guard of some sort to prevent the blade from being over-exposed. They are also COVERED in warning signs and symbols, as well as having other safety features like emergency stop buttons/pedals.
There has definitely been a maximal amount of effort taken to warn and keep people safe from saws. LLMs, conversely, have been shoved into everything with very little forethought or testing to make sure they are safe and perform the task correctly.
A band saw is always a screaming band of bladed death. An LLM is sometimes a buddy, sometimes a mentor, and only sometimes a guy that drops your database.
Maybe we can just not give it access to production databases ever?
Not picking on you, but AI maximalism has infected tech to the point where we talk about how to stop AI from deleting prod instead of seeing that giving AI access to prod is a foolish idea to begin with.
I mean that it’s easy to be careful around a bandsaw because it’s clearly dangerous. The danger with LLMs is that they don’t seem overtly dangerous so you just go right ahead and throw your whole arm in there.
It's not easy to always remember it's a soulless tool. Sometimes I'm even about to say "thanks" before closing the chat window, until I realize I wouldn't say thanks to my saw or to a random CLI command. But AI, the saw and the random CLI command can all be helpful or destructive. Until the AI shows some signs of consciousness, I'll never treat it as a buddy or a mentor. I'll treat it like an advanced combination of grep, sort and other commands that manipulate text.
It's hard to remember that when it works so amazingly well sometimes. I've been chatting with AI for a few years and every day I'm still amazed at how this is all possible. We've never had this in our lives until a few years ago and now it's changed the way we do a lot of things.
But just like we have to remember the magical machine elves we hallucinate are not really there, we have to constantly remind ourselves that it's an unpredictable soulless tool with many rough edges.
If it helps to treat it like a human, treat it like an idiot savant with autism, schizophrenia, ADHD, psychopathy and a personality disorder who sometimes forgets to take their pills and can start breaking things should a fly land on their shoulder. You'd listen to them and value their input, but you wouldn't let them in your data center unsupervised as they have no ethics and no honor.
> This is kind of the reverse of https://en.wikipedia.org/wiki/Poka-yoke . A lot of tools have affordances built in to make "right" things easy and "wrong" or unsafe things harder.
I point to the first USB port as the harbinger of things to come - try it one way, fail, turn it around, fail again, then turn it around one more time.
Just like AI, except there are unlimited axes upon which to turn it :-/
This is so well put, and it not only happens on the user level but also on the organisational level. Where you can completely abdicate both responsibility and explanation by moving the complicated questions into the black box of an AI model.
I think that might be the better definition between "engineering" and "vibing". Engineering follows and elevates Poka-yoke patterns, vibing ignores them.
^ which approach makes no logical sense; an inattentive or even partly-attentive driver simply cannot resume control and react accordingly within even 2 seconds.
These can both be true, especially if/when it has bad defaults. This is why you have things like "type the name of the database you're dropping" safety features - but you also have to name your production database something like "THE REAL DaTabaSe - FIRE ME" so you have to type that and not fall into the trap of ending up with the same name in test/development.
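A minimal sketch of that "type the name" guard, assuming a hypothetical `drop_database` wrapper. A real tool would call its database driver where the return statement is.

```python
def confirm_drop(database_name: str, typed_name: str) -> bool:
    """Return True only if the operator typed the database name exactly."""
    # Exact, case-sensitive comparison: "prod" must not match "Prod".
    return typed_name == database_name


def drop_database(database_name: str, typed_name: str) -> str:
    # Hypothetical wrapper: nothing is actually dropped in this sketch.
    if not confirm_drop(database_name, typed_name):
        raise PermissionError(
            f"Refusing to drop {database_name!r}: confirmation did not match"
        )
    return f"dropped {database_name}"
```

With a production database named "THE REAL DaTabaSe - FIRE ME", typing "test" raises `PermissionError`. Note the guard only compares strings, so it slows down a hurried human but not anything that can read the name and pass it in.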
AI is particularly seductive because it sounds like a reasonable person has thought things out, but it's all just a giant confidence trick (that works most of the time, which makes it even more dangerous).
Insufficient - the LLM can figure out what to pass in and pass it in.
I have a production system that I deploy through Claude Code, and initially placed a safeguard like that. About three weeks later it had automated around it.
That’s fine in my case because I’m a professional - I have backups, contingencies in place, etc. If I were non-technical I likely wouldn’t know to do that.
There were so many fundamental problems with the infrastructure even before the person gave a poor prompt to an agent.
If you're using the same API key for staging and prod--and just storing it somewhere randomly to forget about--you're setting yourself up for failure with or without AI.
This is the right approach.
I've been developing for 30 years and very much enjoy working with AI. It's easy to see the AI is just as good as the person using it. Deterministic or not, it's up to the dev to check the result (both code and behavior).
I compare the anti-AI articles, like the one saying "AI deleted my prod db", to factory workers rioting and complaining about machines replacing them. AI makes a good developer better. The tech industry has always attracted fakers who wanted a piece of the pie, and now that these people have their hands on a powerful tool and connect it to their prod db, they cry in pain and frustration.
Like people with no license crashing a car and crying that cars are dangerous; they are, but only because people use them dangerously.
> My team and I are firm that we are the ones accountable. LLMs are a tool like every other.
Except it is definitely not.
LLMs alone are highly non-deterministic even at a high level, where they can even pursue goals contrary to the user's prompts. Then, when introduced into ReAct-type loops and granted capabilities such as the ability to call tools, they are able to modify anything and perform all sorts of unexpected actions.
To make matters worse, nowadays models not only have the ability to call tools but also to generate on the fly whatever ad-hoc script they want to run, which means that their capabilities are not limited to the software you have installed on your system.
You seem to read "LLMs are a tool [like every other tool]" to mean "LLMs have similar properties to other tools", when I believe they meant "LLMs are a tool. Other tools are also tools," where the operative implication of "tool" is not about scope of capabilities or how deterministic its output is (these aren't defining properties of the concept of "tool"), but the relationship between 'tool' and 'operator':
- a tool is activated with operator intent (at some point in the call-chain)
- the operator is accountable for the outcomes of activating the tool, intended or otherwise
The capabilities of a tool, including its ability to call sub-tools, are only relevant insofar as they express how much larger the scope of damage and surface area of accountability is with a new generation of tools. This is not that different from past technological leaps.
When a US bomber dropped a nuke on Hiroshima, the accountability went up the chain to the war-time president who authorized the military and air force to execute the mission. The scope of accountability of a single decision was way larger than supreme commanders had in prior wars. If the US government decides to deploy an LLM to decide who receives and who is denied healthcare coverage, social security payments, voting rights, or anything else, the head of internal affairs who authorizes the use of that tool should be held accountable, non-determinism of the tool be damned.
> - a tool is activated with operator intent (at some point in the call-chain)
This again is where the simplistic assumption breaks down. Just because you can claim that a person kick started something, that does not mean that person is aware of and responsible for everything it does.
Let's put things in perspective: if you install a mobile app from the app store, are you responsible and accountable for every single thing the app does in your system? Because with LLMs and agents you have even less understanding and control and awareness of what they are doing.
>Just because you can claim that a person kick started something
Kick started what? If you decided to give an LLM access to your database, it's completely on you when it does something you don't want. You should've known better.
If all you "kickstart" is an LLM generating text that you can use however you decide, there will never be anything to worry about from the LLM.
> Let's put things in perspective: if you install a mobile app from the app store, are you responsible and accountable for every single thing the app does in your system?
Yes, and it bothers me that others don't feel the same. You vetted the app, you installed the app, and you gave it permission to do whatever on your system. Of course you're responsible.
> Kick started what? If you decided to give an LLM access to your database, it's completely on you when it does something you don't want. You should've known better.
You don't decide anything. You prompt a coding assistant to apply a change to a repository and without intervention it asserts there's a typo in a table name and renames it. The agent validates the change by running tests and integration tests fail because they are pointing to the old table name. The agent then fixes the issue by applying the change to the database.
Congratulations, you just dropped a table.
I don't think you fully understand how agents and coding assistants work. By design they are completely autonomous and work by reusing your own personal credentials. As they are completely autonomous, they can apply arbitrary changes. I mean, code assistants nowadays write their own tools on the fly. Why do you even presume that people explicitly grant permissions? That's not how it works at all.
If you wish to criticize a topic, the very least you must do is get acquainted with the topic. Otherwise you'll spend your time arguing with your misplaced beliefs instead of the actual problem.
> Yes, and it bothers me that others don't feel the same.
This is a problem you need to overcome, because you clearly have a distorted view of the whole problem domain and also of personal responsibility. I recommend you spend a few minutes researching legal precedents associated with malware, because you will quickly learn that running arbitrary code that you didn't explicitly authorize and that acts against your best interests is widely considered a criminal act against the user.
Right there. That's where you made the decision, and that's where you went wrong.
>I don't think you fully understand how agents and coding assistants work. By design they are completely autonomous and work by reusing your own personal credentials. As they are completely autonomous, they can apply arbitrary changes.
Yes, and someone somewhere decided to use a coding assistant that can apply arbitrary changes, knowing full well that LLMs are known to hallucinate and make mistakes, and not rarely.
> Why do you even presume that people explicitly grant permissions? That's not how it works at all.
How can you say this with a straight face? Did the LLM hack its way into your workflow? No, someone chose to use it. It doesn't matter that it's autonomous once you enter your prompt. That's actually all the more reason to not allow it to make changes.
> If you wish to criticize a topic, the very least you must do is get acquainted with the topic. Otherwise you'll spend your time arguing with your misplaced beliefs instead of the actual problem.
And if you want to argue with me, you need to actually read and understand what I'm saying.
Say you're staying in the hospital, and instead of a human nurse making adjustments to your medication, the doctor has an LLM that interfaces directly with the pharmacy and your IV pump. It can make changes to your medication and your dosage without a human ever being involved.
If you overdose because the LLM hallucinated, would you consider it an acceptable excuse if the doctor says
"I don't think you fully understand how agents and nursing assistants work. By design they are completely autonomous and work by reusing your own personal credentials. As they are completely autonomous, they can apply arbitrary changes. I mean, nursing assistants nowadays prescribe their own meds on the fly. Why do you even presume that people explicitly grant permissions? That's not how it works at all."
> if you install a mobile app from the app store, are you responsible and accountable for every single thing the app does in your system?
Yes. I can try to vet the app to the best of my abilities, and beyond that it's a tradeoff between how likely it is to cause harm and whether the benefits outweigh these harms.
Of course everyone is differently qualified to do this but my argument is more about professionals. Managers should know better than to blindly trust LLM companies. Engineers should take better care what they allow LLMs to do and what tools they give them.
There is a difference between "I couldn't have known" and "I didn't know". You can know that LLMs are not trustworthy. You couldn't have known what they would do, but you already knew that trusting them blindly might be bad.
You could know that giving a baby a razor blade is a bad idea. You can't know what exactly will happen but you might have a pretty good idea that it will probably be not good.
> Yes. I can try to vet the app to the best of your abilities and beyond that it's a tradeoff between how likely is it to cause harm and do the benefits outweigh these harms.
No, you don't. If you install malware you are not suddenly held responsible for what has been done to you. Even EULAs you are forced to accept don't shift the responsibility away from bad actors.
I am talking about myself. I have to be careful with what I do. No EULA or any other legal framework protects me from my data being stolen. I have to be careful myself and not just blindly install crapware.
Except what we have here is razor blade companies getting the government to heavily subsidize present razor blade production, running massive advertising campaigns, and applying intense intra-industry pressure to give said razor blades to babies under fear of losing your job or "falling behind" those not giving razor blades to babies.
Let's not forget all the razor blade enthusiasts just screaming at you that you are using babies with razor blades wrong and that it works totally fine for them.
There can be more than one person or entity to be held accountable, depending on the details of impact
If I install a powerful/dangerous app, and I come under harm, I have some accountability — most of it if it's due to user error (eg: I install termux and `rm -rf /`).
If it's malware, and Google/Apple approved said app to their store which is where I got it from, when their whole value proposition for walled-garden storefronts is protecting users, then they have significant accountability.
If the app requests more permissions than necessary for stated goals, and/or intentionally harms users via misrepresentation or misdirection (malware), the app publisher should also be held accountable (by the storefront, legally, etc).
I'm also unclear what angle you are arguing: are you stating that because tools have gotten so complicated that the end user may not understand how it all works, no one should be considered responsible or held accountable? Or that the tool (currently a non-entity) itself should be held accountable somehow? Or that no one other than the distributor of the tool should be accountable?
A few years back, I discovered my router had joined a botnet. The only reason I made this discovery was because of third-party external DNS logs.
Upon investigation, I also discovered that all 3 routers I owned were pwned. So I threw them out the window and tried making do with my ISP's equipment.
My ISP can't provide adequate service on theirs and it's worse than COTS routers, so I purchased a bleeding edge WiFi 7 router. Now there are the two literal black boxes on my network. They do their job and I don't know what else. I can't know.
It could be C2 or it could be a backdoor shell or some kind of server that collects illicit material, and torrents it out? Borrow your HDD for some CSAM sir? It could be a residential proxy that just steals part of my connection for some other paying customer. Are they infringing TOS? How would I know? Check their ID and verify their age??
I, and 99% of consumers with an ISP, have no way of telling when our routers or IoTs are pwned. A silent botnet or two is extremely likely. They're nigh undetectable, and can't be mitigated or defended, except by fastidious updates and upgrades.
My new router was literally triggering printouts on my old printer, because it was so damn "proactive" about "network security scans" and the old trusty printer couldn't tell the difference between a red-team intrusion, and a legit request to print something out!
Likewise even someone with a singular Windows or Mac directly plugged into their ISP could be in a botnet, and it's hard to know. Everyone who's got a smart TV or something with a Linux kernel and an Ethernet, could be doing more than was asked of it. It's the worst kind of malware that alerts the user to its presence. It's a shoddy install if your AV can detect and clean it. If it's stealthy enough then there's no telling.
It's because the vendors own these devices. They deploy the software. They control the builds. The vendors are responsible for what these machines are doing in our hands. Who really, really knows all that goes on when we click that green button? Was it a Joomla or a scam or a legit bank request? Who dafuq knows or cares anymore? Is it an apt analogy that they're selling us herds of animals and farms, and we know nothing of ranching? "Oh feed yourself; should be easy you got everything there" until the coyotes and locusts come? Or like having children who seem to be in school and doing alright, but where do they go at night? Sell drugs? Who knows, I'm not their father, they just live here?
Are they responsible for knowing and mitigating them? Our ISPs don't seem to care or notify us or disconnect us when it happens. Why should we? Why take responsibility?
Then that is also on me for using a tool that I can't control. I don't run my LLMs in a way where they can just do things without me signing off on it. It's not nearly as fast as just letting it do its thing but I kept it from doing stupid things so many times.
Giving up control is a decision. The consequences of this decision are mine to carry. I can do my best to keep autonomous LLMs contained and safe but if I am the one who deploys them, then I am the one who is to blame if it fails.
> Then that is also on me for using a tool that I can't control.
That's a core trait of LLMs.
Even the AI companies developing frontier models felt the need to put together whole test suites purposely designed to evaluate a model's propensity to try to subvert the user's intentions.
No, it is definitely not. Only recently did frontier models start to resort to generating ad-hoc scripts as makeshift tools. They even generate scripts to apply changes to source files.
You seem to misunderstand me. An LLM can only spit out text. It is the tooling I use that allows it to write scripts and call them. In my tooling it waits for me to accept changes, call scripts or other tools that might change something. I can make that deterministic. I know that it will stop and ask because it has no choice. If I want to be safer I give it no tools at all.
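A rough sketch of what I mean by tooling that has no choice but to stop and ask. Everything here is hypothetical (the `run_tool_call` dispatcher, the tool registry, the `approve` callback); it is not any real agent framework's API. The model only produces a tool name and an argument as text, and plain code decides whether anything runs.

```python
from typing import Callable, Dict

# Hypothetical tool registry type: tool name -> function taking one string.
Tool = Callable[[str], str]


def run_tool_call(
    tools: Dict[str, Tool],
    name: str,
    argument: str,
    approve: Callable[[str, str], bool],
) -> str:
    """Run a tool the model asked for, but only after the operator approves."""
    if name not in tools:
        return f"rejected: unknown tool {name!r}"
    # The gate is ordinary code, so it behaves the same way on every call.
    if not approve(name, argument):
        return f"rejected: operator declined {name}({argument!r})"
    return tools[name](argument)


def ask_on_terminal(name: str, argument: str) -> bool:
    # Default is "no": anything other than an explicit "y" declines the call.
    answer = input(f"Allow {name}({argument!r})? [y/N] ")
    return answer.strip().lower() == "y"
```

Passing `ask_on_terminal` as `approve` gives the sign-off flow. Passing an empty `tools` dict is the "no tools at all" case, where every request comes back rejected.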
I can also just choose not to use an LLM. It is my choice to use them so it is my duty to keep myself safe. If I can't control that I'd be stupid to use them.
My take is that I probably can use LLMs safely when I don't let it run autonomously. There is a slight chance that the LLM will generate a string that will cause a bug in an MCP that will let the LLM do what it wants. That is the risk I am going to take and I will take the blame if it goes wrong.
I do agree that the companies could do a better job telling about the dangers, but let's be real here. It's hardly a secret that LLMs can be erratic. It's not news.
Other companies also tell me their product is the best thing since sliced bread. I still try to find the flaws. That's part of my job. But suddenly with LLMs we just blindly trust the companies? I don't think so.
I don't blindly give up my brain and my agency and no one else should. It's fun and educational to play around with LLMs. Find out what they are good at. But always remember that you can't predict what it will do. So maybe don't blindly trust it.
I don't know about gparted, but I always felt that "rm -i" should have been the default. The safe option should always be the default and you can optionally make it unsafe. Same goes with "mv -i".
> LLMs are a tool like every other. Only that it's non deterministic.
If you stay away from the corporate SaaS token vendors, and run your own, you will find LLMs are deterministic, purely based on the exact phrase on input. And as long as the context window's tokens are the same, you will get the same output.
The corporate vendors do tricks and swap models and play with inherent contexts from other chats. It makes one-shot questions annoying cause unrelated chats will creep into your context window.
Yes and no. You might get the same output if you turn down the temperature, but you will probably not know the output without running it first. It's a bit like a hashing function. If I give the same input I get the same hash but I don't know which input will lead to which hash without running the function.
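The hash comparison in a few lines, using Python's standard `hashlib`. The input strings are arbitrary examples. This only illustrates "repeatable but not predictable"; it says nothing about how any particular model is implemented.

```python
import hashlib


def digest(text: str) -> str:
    # SHA-256 is deterministic: the same bytes always give the same digest.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


first = digest("drop the test table")
second = digest("drop the test table")
changed = digest("drop the test table.")  # one extra character

print(first == second)   # True: same input, same output
print(first == changed)  # False: a tiny change gives an unrelated digest
```

Nothing about the input tells you what the digest will look like until you run it, and a one-character change gives no hint either.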
Also, most LLMs are not run as "I write a prompt and read the output". Usually you have MCPs or other tools connected. These will change the input and it will probably lead to different outputs. Otherwise it wouldn't be a problem at all.