"Review" them how? Read every single line of code before installing something? If it's a binary package, how do you do that? Make reproducible builds for everything you install? Move to from source distro? Putting this on users is not a tenable solution. There's room for common sense, but blaming the users for this is ridiculous
This is like saying a user who clone a random git repo is not to blame and git-scm should do more to prevent cloning of malicious repos.
If it is not official, it is your job to review, if you dont like it, use iOS instead of Arch Linux.
If you crash your car, you are liable for the accident. If you aren't ready for that, take the bus.
They *are* doing the basic housekeeping. What do you think this announcement is, if not exactly that? AUR is very clearly documented as user-submitted, and automatic installs from it are heavily discouraged by the maintainers for this reason. Malware aside, there is very little quality control, and a poorly made AUR has the potential to break the system pretty badly. (Though, in my experience, most of the useful AUR packages are trivial to remove if something goes wrong.)
The officially maintained repositories (which are part of a default installation) were not affected. Users need to go somewhat out of their way to use an AUR.
The definition files are all plain text and not especially complicated. It's not too difficult to glance at the file before doing an install to get a basic idea of what it's about to do, just like you should do when running a random shell script or cloning a random git repo. Indeed, most AURs are implemented by cloning an upstream git repo and configuring it so it can be built. The same basic threat model applies: Do you trust the install script? Do you trust the upstream URL whose code it is about to compile?
i read all the pkgbuild diffs, still doesn't give me a good sense. sure, i can verify that it's coming from the official repo but even then there's no guarantee that there isn't junk in there or that the git ref is actually pointing at the right thing.
it would be better if there were stronger community moderation and review that has stamps i can trust rather than this idea that eyeballing build scripts is a reasonable security posture.
> it would be better if there were stronger community moderation and review that has stamps i can trust rather than this idea that eyeballing build scripts is a reasonable security posture.
Ok, so instead of having a reasonable security posture yourself, you'd rather rely on a number of random strangers who've eyeballed the PKGBUILD instead?
Generally, I think Arch tries to prevent users from relying on bad signals, and this principle might be applied here too.
> i read all the pkgbuild diffs, still doesn't give me a good sense. sure,
Do you have an example of a diff that doesn't give a good sense? I review all my diffs too, but I feel like all of them give me a good sense if it's safe to install or not. I mean, why would I otherwise, what's the point in reviewing if you don't use it to make a decision if to install it or not?
Well ArchLinux has a product for you if you want packages that were vetted: the official repositories.
AUR is just a centralized place to put user created packages, like npm is a place to put user created node packages.
Nothing is "disguised" here. Arch Linux makes an enormous effort to warn that due dilligence is required before installing things, and to dissuade users from using the User Repository at all, to the point of not offering package manager support for it. The wiki even cites previous instances where malware was discovered in the AUR packages.
The only way you could possibly not be aware of the AUR's nature as an "uncontrolled free-for-all" is if you didn't read the Arch Wiki, and anyone who doesn't read the Arch Wiki should not be using Arch Linux to begin with.
"Uncontrolled free-for-all" is exactly the status quo of programming language package managers such as npm and pip. It's just as easy for total randoms to sign up for an account and push packages on those services as it is to push a package to the AUR. Only the AUR made the lack of trust explicit and part of the culture.
PKGBUILDs are not packages. They’re (user-contributed) instructions on how to build packages.
> available through the OS's repos.
No. The AUR is a platform, similarly to NPM or PyPI, that allows users to upload PKGBUILDs. It is not part of “the OS’s repos,” and it says that loud and clear, multiple times, including on the front page.
You seem to have a wild misconception of what the AUR actually is.
It'd be more like a public toilet anyone could urinate in, and you lick the floor right next to the toilet and then is surprised that it tastes like pee. Of course there is pee on the floor, anyone can pee there!
Better analogy would blaming a supermarket that hosts an outdoor farmers market because you contracted food poisoning from a stand owned by someone else - NOT for buying food from within the supermarket itself.
Meanwhile one of the other customers has norovirus and is deliberately touching everything so others contract it.
As an arch user, I would always skim the PKGBUILD file of AUR packages to see if they install the software they claim to install from official sources and if there's something obviously fishy.
Yeah, I've prevented this locally too by never building such a platform in the first place, always the best solution!
Jokes aside and just in case, you do realize ports and AUR have two very different models? Ports is more similar to the official Arch repositories, which obviously doesn't suffer from the same problem, and AFAIK, there is no BSD-equivalent of AUR.
BSD is cool and useful for lots of reasons, but comparisons based on misunderstandings helps no one :)
An archlinux package build file is just a shell script. It's pretty easy to take a look and see if all the manifest info is right and it doesn't do more than ./configure; make; make install DESTDIR=$PKG or whatever. If you're building random software using random instructions from the internet and don't make sure they're not malicious, you only have yourself to blame when you catch something. Actually reading through the source files for vulns is something best left for automatic detection, checking the build script is basic.
You find one that builds from source, or you still review PKGBUILD and friends and lean more on evaluating the reputation of upstream and its maintainers, or you simply decide never to install binary packages. Your policy is yours to decide.
> Putting this on users is not a tenable solution.
The alternative would be to not have an AUR. Archlinux has official package repos where packages are vetted. The AUR (Arch User Repository) is not that. The AUR is there to provide greater variety of software than the official repos can, and it does that by not incurring the cost of being individually maintained by volunteer Arch staff and developers. It needs to not incur that cost for it to exist, otherwise it'd just be the official repos. It's like github, but limited to repos with PKGBUILDs.
And in this alternative past/future, everyone is using GitHub to host their PKGBUILDs instead, then someone gets tired/lazy and builds one repository that indexes those, and we have ArchPacBrewRepository or something, and very same issue appears again, unless people change their approach to installing random 3rd party software.
Ask an LLM to assess the package and do a web search for you. Nobody is installing tens of packages a day, you can take a few minutes to consider what you are installing. This isn't blaming the user, it's basic digital hygiene.
Lets take two real and random examples, and I'll share what I'd look for:
First, very easy one, we want to install Brave, so we find https://aur.archlinux.org/packages/brave-bin. All the dependencies are in the official repos already, so those we trust already, you open the downloaded PKGBUILD and you find it's downloading a binary from github.com/brave, you check to see it's the official GitHub profile/organization that you expect. Quickly scan prepare/package for anything out of place, like downloading more files not defined in "source" or whatever. In this case, "suid sandbox" stuff should make you investigate closer so you understand what that stuff does, many things related to Chrome has things like that. That AUR package also has a brave-bin.sh, so a look through that would make sense. AFAIK, everything checks out, this is literally just downloading the official release from GitHub, and extracts it into the right place, so if you trust the GitHub org/user, you can trust the PKGBUILD. The PKGBUILD also seems to be officially maintained by Brave themselves, so probably already there you can verify the AUR user and be done if you feel lax.
Second example is unofficial package, https://aur.archlinux.org/packages/lmstudio-bin, maintained by noureddinex and created by MadGoat, neither which seem official at a glance. Read through the comments to see if anyone else flagged anything, seems fine so again go read the source of the package and the PKGBUILD. PKGBUILD seems standard, downloads something from "installers.lmstudio.ai" so first thing to check is if that's actually the official website, so use search engine to find official website, copy the URL of the download, verify it's the same. In this case, lmstudio.ai is the real website, but download URL on website ends up being "https://lmstudio.ai/download/latest/linux/x64" in the HTML/DOM, so use "curl -v -L $URL" to see redirects, and then we've confirmed installers.lmstudio.ai is actually what they use for official releases. Read through "prepare" and "package", both seem standard and fine, then look through the rest of the files, all of them seem fine, mostly maintenance scripts for the AUR package itself. Package seems fine as a whole, and we could install it, if we're willing to review it again on upgrades in the future.
This is basically all you have to do. Writing what I did while doing it, made each "review" take maybe 5-10 minutes, and it isn't harder than that, regardless who the user is. You just need to know what to look for, and think how you'd "officially" install it anyways. And if what the PKGBUILD differs from what you'd imagine an "official install" would do, investigate if it makes sense and if not, don't install the package, maybe leave a comment for others in AUR to dive deeper.
Question is if this would be thorough enough for this attack? A package with a slightly more involved build process, maybe some patches because it was made to build on a different distro. Maybe you've already installed (and thoroughly inspected) it before, so you're only updating to a newer version, so you're not as thorough with your review. Or an xz-style backdoor.
Yes, it'd be enough. If a package you're using suddenly adds new 3rd party dependencies, you confirm this is actually needed, and if not, you know something is up. When you install software from random strangers, you have to be vigilant and consider the implications of what you do.
I recall the same situation recently with yt-dlp, as they started to depend on a JS engine for some captcha stuff or related. So when you see that, you need to adjust the mindset of "ah whatever it's probably fine" to "Ok, why are these changes actually here?", and if it's not worth reviewing, you might want to reconsider the approach of installing random binaries from the internet that are flagged as unreviewed.
People probably think you’re being ridiculous but Shai Hulud had its very first attempt at manipulating AI lead analysis and I know of at least one company where that resulted in them getting pwned.
This is only going to become more of a problem in the future and people need to educate themselves on the technical barriers to use because guardrails only sometimes work.
The fallback doesn't seem to be working for me, I haven't scanned a project in it immediately booted me when it found a security bug even though I didn't ask for it
It hasn't changed, and I don't know why people are saying that most books don't have DRM. It is only a small minority.
Tor books is the largest publisher without it (owned by Macmillan). Otherwise everything is truly hard DRM either ACSM with epub or Kindle's. They are both more or less easily defeated though.
I am not sure anyone knows what a harness is at this point. I've heard 17 different definitions of it at this point. It's almost like a buzzword in search of a problem.
Huh. My definition - or rather, explanation - has always been, "The model is just a big bag of floats you multiply with some numbers to get some numbers out, plus a regular program that runs a loop which, at minimum, turns inputs (text, images) into a stream of numbers, pushes it through those multiplications against the bag of floats, and turns results back into text/images/whatnot. That regular program is called a harness[0]. Now, the trick to make LLMs into agents, is to add another loop in the harness that reads the output and decides whether to send it out to user, or do something else, like executing more code (that's what tools are), or feeding it back to input with some commentary (that's how you get "thinking"), or both (that's how you get the "agentic loop")".
Because there isn't really much more to it. And ever since we, i.e. those of us who played with ChatGPT API early on, bolted tools to it, some half a year before OpenAI woke up and officially named it "function calling" - ever since then, we knew that harness was the key. What kept changing was which logic (and how much of it) to put in explicitly, vs. pushing it back to the model on the "main thread", vs. pushing it to a model on a separate conversation track. But the basic insight remains the same.
--
[0] - Well, today - until recently you'd call it a "runner" or "runtime".
Like as in what its made out of, or what it makes? Neither really makes sense here? Lots of things are made out of code and not necessarily agents, but also (from my decidedly outside observer perspective) "agents" are not limited to being code producers either.
If you use cloud models.. the harness is what runs in your computer
AI companies would love if everything ran in their cloud, but arguably there are latency reasons or other reasons to run at least some stuff in your own computer
There is an LLM API. You send it a system prompt and the conversation history. If the last message is a user message the agent will send back a response. It can also send back a “thinking” message before it sends a response and it can also send back a structured message with one or more function calls for functions you defined in your API request (things like “ls(): list files”).
The harness is the part that makes the API calls, interacts with the user, makes the function calls, and keeps track of the conversation memory.
You can also use the LLM to summarize the conversation into a single shorter message so you get compaction. And instead of statically defining which functions are available to the LLM you can create an MCP server which allows the LLM to auto-discover functions it can call and what they do.
That’s the whole magic of something like Claude Code. The rest is details.
I'd say the core is that the harness/runtime/${whatever you call it} doesn't just unconditionally sends model output to the user, and user input to the model, +/- some post-processing, but instead runs a loop that feeds the output back to the model if some conditions are met. That gives you basic "thinking" and single "function calling" a-la early ChatGPT. However, if you allow it to loop arbitrary number of times and allow the output to decide whether to loop or to stop, you get a basic agent.
Agent is currently defined as "what I want it to mean given whatever I am talking about".
Personally, for me it embodies a level of autonomy. I define that as, an AI model with potential to interact with something external to itself based on its output, where that includes its own future behavior.
Fight enshittification. For whatever reason, many travel sites no longer send full details in the e-mail confirmation, they want you to click through to the site...which means I can't forward it to plans@tripit.com for automatic import.
Immediately after booking something,I tell Gemini to add it to my TripIt. Works great. I have a little prompt explaining how I like it formatted that I cut and paste, so I can just make this a one-click prompt. I could also have it add flights to my.flightradar24.com.
I also use Gemini in Chrome to add appointment confirmations to my calendar. Or remember things in Google Keep.
reply