
At the time of DOS, x86 didn't have multiple privilege levels. The system call instruction was typically INT, the software interrupt instruction.

Later, on the 386, Intel added virtual 8086 mode, which raised the privileged-instruction exception (trapping to the kernel) also for certain instructions that had to be virtualized, among them INT.


Yes, exactly.

We used a set of INT instructions at well-known low memory addresses that all jumped to the same place. We had an ASM file that you linked with, which had sixteen different address combinations, one per entry point.

The common entry point would look back on the stack and calculate from the return address which entry point had been called, and run the appropriate kernel call. We called it the CS:IP hack.
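A minimal C sketch of the dispatch arithmetic (my own illustration; ENTRY_BASE and ENTRY_STRIDE are invented values, not the real CTOS layout):

    /* All stubs trap to one handler; the handler recovers which stub
     * ran from the return address that the INT instruction pushed. */
    #define ENTRY_BASE   0x0400u   /* assumed address of the first INT stub */
    #define ENTRY_STRIDE 0x0010u   /* assumed spacing between the stubs     */

    /* return_ip is the address just past whichever stub was invoked. */
    static unsigned syscall_number(unsigned return_ip)
    {
        return (return_ip - ENTRY_BASE) / ENTRY_STRIDE;
    }

With sixteen stubs, return addresses 0x0402, 0x0412, ... map to call numbers 0, 1, and so on.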

In the context of this post, the DOS INT 10 and INTx (I forget) required the caller to load registers with the desired system call number, then perform the trap instruction in their own code. Fortunately CTOS didn't need those particular software interrupts, so I could implement them for my purposes.


Windows 95 used a related hack. Whenever a v8086 program asked to create a call to protected mode code ("please give me a real mode address to call to, in order to start executing the protected mode routine at address 0x123456"), Windows would store the entry point in a table and hand out real mode addresses like FFD0:0, FFCF:10, FFCE:20, FFCD:30, FFCC:40 that all point to the same instruction (because the segment part is shifted left by 4 in real or v8086 modes).

The routine at 0xFFD00 could then enter protected mode and use the code segment to build the index into a table of entry points: FFD0 goes to index 0, FFCF goes to index 1, and so on. But for extra kicks, the address isn't actually pointing to valid code. It points to a random "c" character in the BIOS, which decodes as an ARPL instruction; ARPL is invalid in v8086 mode and therefore invokes the undefined opcode exception handler. The exception handler, which handily enough is already running in protected mode, then takes care of doing the 32-bit call.
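To make the aliasing concrete, here is a quick C illustration (my own sketch of the arithmetic, not Windows code):

    /* In real/v8086 mode, linear = segment * 16 + offset, so every
     * pair below resolves to linear address 0xFFD00, yet the segment
     * alone tells the fault handler which table entry was meant. */
    #include <stdio.h>

    int main(void)
    {
        static const unsigned pairs[][2] = {
            { 0xFFD0, 0x00 }, { 0xFFCF, 0x10 }, { 0xFFCE, 0x20 },
            { 0xFFCD, 0x30 }, { 0xFFCC, 0x40 },
        };
        for (int i = 0; i < 5; i++) {
            unsigned seg = pairs[i][0], off = pairs[i][1];
            printf("%04X:%02X -> linear %05X, table index %u\n",
                   seg, off, seg * 16 + off, 0xFFD0 - seg);
        }
        return 0;
    }

All five lines print the same linear address, and the index 0xFFD0 - CS is exactly the lookup Windows needed.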

Related: https://devblogs.microsoft.com/oldnewthing/20041215-00/?p=37...

Also described here: https://news.ycombinator.com/item?id=45283085


Thanks! This is so interesting.

> It so happens that on the 80386 chip of that era, the fastest way to get from V86-mode into kernel mode was to execute an invalid instruction! Consequently, Windows/386 used an invalid instruction as its syscall trap.

I also read this part, but I wonder how they benchmarked back then?

> Schulman’s Unauthorized Windows 95 describes a particularly unhinged one: in the hypervisor of Windows/386 (and subsequently 386 Enhanced Mode in Windows 3.0 and 3.1, as well as the only available mode in 3.11, 95, 98, and Me), a driver could dynamically register upcalls for real-mode guests (within reason), all without either exerting control over the guest’s memory map or forcing the guest to do anything except a simple CALL to access it. The secret was that all the far addresses returned by the registration API referred to the exact same byte in memory, a protected-mode-only instruction whose attempted execution would trap into the hypervisor, and the trap handler would determine which upcall was meant by which of the redundant encodings was used.

And if that’s not unhinged enough for you: the boot code tried to locate the chosen instruction inside the firmware ROM, because that will have to be mapped into the guest memory map anyway. It did have a fallback if that did not work out, but it usually succeeded. This time, the secret (the knowledge of which will not make you happier, this is your final warning) is that the instruction chosen was ARPL, and the encoding of ARPL r/m16, AX starts with 63 hex, also known as the ASCII code of the lowercase letter C. The absolute madmen put the upcall entry point inside the BIOS copyright string.

(Incidentally, the ARPL instruction, “adjust requested privilege level”, is very specific to the 286’s weird don’t-call-it-capability-based segmented architecture... But it has a certain cunning to it, like CPU-enforced __user tagging of unprivileged addresses at runtime.)


int 21h !

Need some params!

    mov ah, 2    ; DOS function 02h: write the character in DL
    mov dl, 7    ; 7 is ASCII BEL, hence the beeping
    int 21h      ; call DOS

Ahhh.. probably my first program. Don't forget the int 20 at the end! It was beeping great. Still never unlocked the mysteries of those TSR programs though.


It's been a long time since I've touched any of this, so the details have slipped my mind. However, the general idea was that there were two different exit calls in DOS: terminate and terminate and stay resident. The difference between the two is that the stay resident option wouldn't release the memory used by your application. Further, the interrupt table, which told the processor how to handle each interrupt, was in RAM and therefore writable.

So, what TSRs would do is overwrite one or more interrupt vectors to point to a routine that would check whether the call in question was one it wanted to handle (e.g., to add a hotkey it would grab the keyboard handler and check for a special set of keys before passing control back to the normal handler). Once the hook was installed, it would invoke the terminate-and-stay-resident system call, and control would pass back to the OS with the hook still in place.
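A rough Turbo C-style sketch of that pattern, from memory and untested (getvect/setvect/keep and the interrupt keyword are Borland DOS extensions; INT 9 is the keyboard hardware interrupt, and the resident size is a made-up number):

    #include <dos.h>

    static void interrupt (*old_kb)(void);   /* saved INT 9 vector */

    static void interrupt kb_hook(void)
    {
        /* ...check for the hotkey here, act on it if it matches... */
        (*old_kb)();          /* then chain to the normal handler */
    }

    int main(void)
    {
        old_kb = getvect(9);  /* remember the original keyboard vector */
        setvect(9, kb_hook);  /* point the vector at our routine       */
        keep(0, 256);         /* terminate and stay resident, keeping
                                 256 paragraphs (4 KiB) in memory      */
        return 0;
    }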


> Still never unlocked the mysteries of those TSR programs though.

I made a bunch of those, in TurboPascal. Just needed to save registers (including stack and heap segments) and hook some key combination. One of them was used commercially for installations by a very big company.

Testing was a little prone to spectacular failures. But once the general procedure was debugged, it was easy as pie.


Yes, and it becomes unbearable after a while.

I don’t get it. Unlike a lot of the technical article slop that is posted here, this obviously had a lot of human thought and effort put into the prompt.

The LLM pass (unsurprisingly) made it worse.

For example:

> The results were conclusive: 100% pass rate, showing Reno recovered cleanly after the loss phase, and revealing that this is a CUBIC-related bug.

Look, I’m reading a description of a Linux kernel network congestion bug. I don’t need the hand-holding.


Yeah, you aren't selling anything. "Reno has a 100% pass rate for recovering cleanly after the loss phase, so the bug is almost certainly related to CUBIC" is a perfectly fine technical text.

Also, the same event both “showing” and “revealing” two different things is just bad writing.

Another LLM tell is that they're penalized for repetition, so they'll use as many synonyms as possible. You may end up recognizing the same concept being rehashed with synonyms constantly. You can look up examples of this as "elegant variation".

In one of the languages I read, journalists do this when quoting someone and it pisses me off. Instead of "said", they'll cycle through the same 6-7 synonyms. Instead of just quoting everything together, they break it up.

So instead of:

> President Jackson said "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.".

They'll do something like:

> President Jackson noted that "Lorem ipsum dolor sit amet". The head of state also remarked that "consectetur adipiscing elit" while emphasizing that "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua".

> "Ut enim ad minim veniam, quis nostrud exercitation", categorically proclaimed the former business tycoon. He concluded that "ullamco laboris nisi ut aliquip ex ea commodo consequat".

I've seen this since way before LLMs, and how much it's used varies a bit from language to language. But it's so formulaic I can't help but imagine some brain-dead moron sitting in front of the keyboard, trying to make 5 paragraphs from 2 sentences someone said without adding anything else.


It does not, it's actually arbitrary code running during the dynamic loading process, i.e. before _start.

But what if I have a C++ dynamic library? Does it call constructors for global variables before the _start function in the main program runs?

_start takes care of calling the global initializers and registering the atexit callback for the finalizers.

(In practice _start calls __libc_start_main, a libc function that handles all of that).
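A quick way to see the ordering (my own demo; the constructor attribute is a GCC/Clang extension, and the build commands are assumptions):

    /* demo.c: build as a shared library, e.g.
     *   cc -shared -fPIC -o libdemo.so demo.c
     * then link a program against it:
     *   cc -o prog prog.c -L. -ldemo
     * The dynamic loader runs this constructor when it maps the
     * library, before the executable's main() is entered. */
    #include <stdio.h>

    __attribute__((constructor))
    static void demo_init(void)
    {
        puts("libdemo: initializer runs before main");
    }

Run the program and the library's line prints first; C++ global constructors in a shared library ride the same init-function mechanism, while those in the executable itself run on the __libc_start_main path.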


That, or in general for system calls. In particular EX AF,AF' + EXX allowed you to exchange all the registers, or EXX alone to swap everything other than AF so you could return an error code or status there.

Elbereth always seemed like a cheat... Never understood the point of it. I got to the quest without it, but not deeper.

It's not a cheat. It's explained in the game's manual (the Official Guidebook [1], just search for Engrave).

It's been nerfed since 3.6.X as well. Now it can no longer be used for fighting, only escape, and attempting to fight while standing on it will make you "feel like a hypocrite" and deduct 5 from your alignment score.

[1] https://www.nethack.org/v500/Guidebook.html#toc_4


I know it's not, but I don't understand why it has to exist.

There are a million other wacky things in the game. Nothing about Elbereth seems out of place or unbalanced (since it’s been nerfed anyway).

They added the ability to apply your money to flip a coin in this version! Why does that need to exist? Because they thought of it!


If it's 2% now after 2000-3000 generations, it must have stabilized, because any per-generation retention factor below 0.995 is basically zero when raised to the 2000th power (0.995^2000 ≈ e^-10 ≈ 4.5 × 10^-5). The Neanderthal genes would have to be within 10^-5 of the fitness of the sapiens genes, which is basically noise.

For one, literacy right now is ~100%, and it was never anywhere close to that until 50-60 years ago.

Literacy.

Percentage of children to survive to adulthood.

Global food surplus.

There was a big phase shift over the course of the 20th century...


Vimeo was bought by Bending Spoons.

https://news.ycombinator.com/item?id=45197302


That was after IAC:

"Additional acquisitions in 2006 included ShoeBuy.com,[46] which the company later sold to Jet,[47] and Connected Ventures including CollegeHumor and Vimeo".


Partly they already have enough on their plate. It's up to the reporter to pick how to handle the disclosure, and unless a specific maintainer chooses to handle it, the Linux security team clearly says they won't.

Partly they have a strong belief that all kernel bugs are vulnerabilities and all vulnerabilities are just bugs; sometimes taken to the extreme in both ways (on one hand this case where the vulnerability is almost ignored; on the other hand, I saw cases where a VM panic that could be triggered only by a misbehaving host—which could just choose to stop executing the VM—was given a CVE).


This couldn't be more backwards. This has literally nothing to do with bandwidth. The kernel is a CNA, they are explicitly the ones to do this.

The reason they don't is because Linus and Greg have repeatedly, publicly stated that they don't want to because they don't believe that vulnerabilities conceptually make sense for the linux kernel and they refuse to engage in the process.


> they don't believe that vulnerabilities conceptually make sense

That's exactly what I wrote: "they have a strong belief that all kernel bugs are vulnerabilities and all vulnerabilities are just bugs; sometimes taken to the extreme in both ways".

But there is also a question of bandwidth. If a maintainer asks to bring a specific vulnerability to distros-list, the kernel security people will be reasonable. I did it last March.


How does that square with this comment from greg from today?

https://www.openwall.com/lists/oss-security/2026/05/01/3

(About heads up to distros)

> Nope, sorry, we are NOT allowed to notify anyone about anything "ahead of time" otherwise we will have to tell everyone about everything. That's the only policy by which all the legal/governmental agencies have agreed to allow us to operate in, so we are stuck with it.


I don't know, this is the one that I mentioned:

https://www.openwall.com/lists/oss-security/2026/03/30/5

You can see my name under "Timeline", I asked kindly for both distros-list and a longer embargo than usual and got them.

I guess Greg is not allowed to notify distros-list, but someone else is?


He's full of shit lol


Seems a little crazy. Somebody should evaluate blast radius and do appropriate distro notifications in a case like this (I presume the impact was part of the disclosure, so not much extra work).


You know the linux kernel is a free software project right? If you think “somebody should” do a thing but you aren’t prepared to do it yourself then you should maybe ask for a full refund.


Thank you very much, seanhunter. You hit the nail on the head there.


Not really, because they made Linux a CNA specifically to own the process and distort it the way they want it to be.


A single PR for a 3000-line addition would, in all likelihood, be rejected anyway.


Really depends on the author and context. Large PRs are often justified for compiler work, where you have a lot of pieces to touch at the same time.



When somebody comments on a PR with “Incredible work, Jacob. It is an honor to call you my colleague.” then it's safe to assume it's an out-of-the-ordinary contribution, pretty much falling outside of the “in all likelihood”.

3000 line LLM commit is not that.


Also 95% of those 30k lines changed are fully self-contained inside of the aarch64 directory and of the remaining changes it looks like the majority is just adding "aarch64" as another item into an existing list. There are a few core changes that to me look like they could be done in their own PRs, but also core maintainers get to decide if they want to apply bureaucracy to their own work.


No description provided. I love this PR. But yeah, try being anyone besides Jacob and submitting that!


> In successful open source projects you eventually reach a point where you start getting more PRs than what you’re capable of processing. Given what I mentioned so far, it would make sense to stop accepting imperfect PRs in order to maximize ROI from your work, but that’s not what we do in the Zig project. Instead, we try our best to help new contributors to get their work in, even if they need some help getting there. We don’t do this just because it’s the “right” thing to do, but also because it’s the smart thing to do.

I feel like if their goal is to prioritize contributors over contributions, it would also logically follow that they should try to have descriptions where possible, just to make exploring any set of changes and learning from them easier? I looked it over briefly; no Markdown or similar doc changes there either.

I mean, the changes can be amazing; it's just that adding some description of what they are in more detail, alongside the considerations made during development, would also be due diligence for new folks or anyone wanting to learn from good code.


How would you differentiate a 3000 line LLM commit made by the best models and good AI processes from a 3000 line commit made by the best human developer?

edit: Okay, I set the bar too high here with "best human developer" and vague "good AI processes". My bad. Yes, LLM is not quite there yet.


A personal relationship and trust, as seems to be the case here?


By using my brain.


Don't be ridiculous! We don't do that anymore.


Read it?


It's still fairly obvious just by skimming the code. The best AI models are still quite far from the best human developers in ability and especially in code quality.


When the best AI models are the same or better than the best[1] human developers, what then?

We're already at the point talking about best vs. best.


If that happens and we have a way of reliably knowing if some code is produced to that high quality, then I think we probably can accept that AI coding is the only sensible option.

We definitely are not close to that point though and it's unclear if/when we will get there.


It seems to me that people might be arguing from conflicting hidden premises here. "AI Coding" is a spectrum that could mean something as simple as letting the LLM proofread your changes and then act on those with your own human brain, or it could mean just telling the agent what you want and let it rip and tear until it is done.

If I do the latter and submit a PR to something like Zig, I'll certainly be caught doing it and rightfully chastised. If I do the former, my PR will be better without anybody besides myself having any way of knowing how it got better. Probably I do something in between when I contribute to open source these days.

Blanket banning all of these seems like a bad idea to me. It actively gates people like myself from contributing, because I respect these people and projects that much. It feels like I would be doing something they find disgusting if my work has touched an LLM and I obviously don't want to do that to people I respect. But it's fine, there are plenty of things to do in the world even when some doors are closed.

I do not presume to have any say on the Zig project's well-argued decisions[0]; I'm not really even their user, let alone someone important like a contributor. Their point of preferring human contact is superb, frankly. Probably a different kind of problem in an open-source project staffed with a lot of remote-working people, where human contact is scarce.

[0] https://kristoff.it/blog/contributor-poker-and-ai/


> Blanket banning all of these seems like a bad idea to me. It actively gates people like myself from contributing

in my projects i will reject any contribution that i do not understand. even if the contribution is handwritten by an expert developer. that developer will have to earn my trust like anyone else, like you would have to.

LLM contributions are non-deterministic, which means they can never be trusted.

therefore, if you use LLM to contribute, you can not earn my trust. if you believe that you can not create a meaningful contribution without the use of LLM then you are realizing that you are not skilled enough to understand the code that you contribute. because if you could understand it, then you could write it yourself. i want your personal contributions, not those of your LLM. i want contributions that the submitter actually understands. i want you to earn my trust by showing me that you understand what you are doing. i want you to grow your understanding of my project. none of this happens when you use LLMs.

if you are unable to make a contribution without the help of an LLM then you are not ready to contribute. try looking for smaller issues that you can work on instead until you learned enough to make larger contributions.


> i will reject any contribution that i do not understand

Fair.

> that developer will have to earn my trust like anyone else

What does it take to "earn your trust"?

> LLM contributions are non-deterministic, which means they can never be trusted.

Provably incorrect. LLM contributions can be reviewed, tested, and understood like any other contribution. There's nothing "special" about LLM contributions.

Contributions authored by human brains are also non-deterministic, perhaps if the author was feeling in a slightly different way they'd have formatted the code a bit differently.

> therefore, if you use LLM to contribute, you can not earn my trust.

The premise is wrong.

> if you believe that you can not create a meaningful contribution without the use of LLM then you are realizing that you are not skilled enough to understand the code that you contribute

What if I believe I can do so without an LLM, but that it could be even better with an LLM?

What if I'm great at understanding code, but terrible at writing it?

Again, this is a premise that you just decided to take as truth, without proof.

> because if you could understand it, then you could write it yourself.

False. I can understand a novel algorithm by reading and studying it, but perhaps I could have not come up with it myself.

> i want you to earn my trust by showing me that you understand what you are doing

I can easily do that even if my contribution involves LLM assistance.

> i want you to grow your understanding of my project

Ditto.

> none of this happens when you use LLMs

False. Why do you think so?

> if you are unable to make a contribution without the help of an LLM then you are not ready to contribute.

Again, this is your opinion and you have no way of proving it. I can prove the opposite.


> What does it take to "earn your trust"?

multiple successful contributions of increasing complexity, among other things.

>> LLM contributions are non-deterministic, which means they can never be trusted.

> Provably incorrect. LLM contributions can be reviewed, tested, and understood like any other contribution. There's nothing "special" about LLM contributions.

read this comment to see what i mean: https://news.ycombinator.com/item?id=47968180

> Contributions authored by human brains are also non-deterministic, perhaps if the author was feeling in a slightly different way they'd have formatted the code a bit differently.

i can tell a human to focus on a certain issue. they will either listen and follow my instructions, or i will reject their contribution. the LLM is almost guaranteed to not follow all my instructions and make changes i didn't ask for. see my comment above.

>> therefore, if you use LLM to contribute, you can not earn my trust.

> The premise is wrong.

how so?

>> if you believe that you can not create a meaningful contribution without the use of LLM then you are realizing that you are not skilled enough to understand the code that you contribute

> What if I believe I can do so without an LLM, but that it could be even better with an LLM?

what you believe is not relevant. only what you can convince me of. you'll have to first show that you actually can work without an LLM before i will consider your contribution.

> What if I'm great at understanding code, but terrible at writing it?

your problem not mine. if you are terrible at writing code but good at understanding it then it's your choice to only do code reviews. you can still make a meaningful contribution that way. i'd even let you write code so you can practice that, but i am not interested in your LLM generated code.

> Again, this is a premise that you just decided to take as truth, without proof.

i don't need proof. i need trust. you need to convince me that your code can be trusted.

>> because if you could understand it, then you could write it yourself.

> False. I can understand a novel algorithm by reading and studying it, but perhaps I could have not come up with it myself.

that's called learning. once you learned it, you can write it. but in order to effectively learn you also have to practice. if you let LLM write all your code then you are not practicing, so you won't improve.

>> i want you to earn my trust by showing me that you understand what you are doing

> I can easily do that even if my contribution involves LLM assistance.

it depends on the level of assistance. i am not ruling out use of AI to do research and learn, just don't let it write the code for you.

>> i want you to grow your understanding of my project

>> none of this happens when you use LLMs

> False. Why do you think so?

as i said above, if you don't practice writing the code yourself you are not learning. not enough at least to satisfy my expectations.

>> if you are unable to make a contribution without the help of an LLM then you are not ready to contribute.

> Again, this is your opinion and you have no way of proving it. I can prove the opposite.

whether you are ready to contribute to my project or not is not something i need to prove. it is a choice based on my preference which depends on the amount of trust you have earned. you can not prove to me that you are ready to contribute. this is not a standardized test that if you pass you automatically qualify. you can only convince me by earning my trust. this is a human decision, based on feelings.


>because if you could understand it, then you could write it yourself.

I accept most things you said there as valid opinions, but this is where the logic goes wrong.

I use LLMs to give me more from the only resource (now that my basic and mid-level needs are largely met) that ultimately matters: time. That means that I need to waste far less time in front of the computer, typing code, and use far more time doing more useful things, like hobbies, art, being with my children.

But as I said before, every project is obviously allowed to make their own rules, and contributors should obey those rules. There are plenty of projects that suit AI deniers and plenty of projects that prefer AI aficionados.

At least for now. My belief is that one of those groups will fade away like horseback riding did, but we'll see. Perhaps you have heard the famous stages quoted by many different people in different forms: first an idea is ridiculed, then it's attacked, then it's accepted. Some open-source communities have clearly entered the attacking phase in the last year or so.


you are saying that even if you understand the code, using an LLM saves you time writing it. fair enough[*]. the problem on my side still is that if you didn't write the code yourself, i have no evidence that you actually understood it. the only way to prove that you understand the code is to write it yourself. that's where the trust building comes in. you may actually understand the code, but i can't trust that you do.

[*] in my opinion it takes more time to verify that the LLM code is correct than it takes to write it yourself. based on that, if you save time using an LLM then you didn't spend enough time to verify that the code is correct.

> Some open-source communities have clearly entered the attacking phase in the last year or so

i feel it's more like defense, but yes.


How can AI possibly be better than “the best” when the corpus of training data now includes its own slop in addition to all the code by new devs/lazy devs/bad devs scattered all over the internet? Law of averages applies here.


Because LLM models are obviously much more than the sum of their parts.


Oh, which parts are those? Do tell!


Don't use "the corpus", but use thinking, source code of the libraries and existing software, documentation, tools, best practices.

Billions of times faster than a human; no tiring, no miscalculation, no brain-farts, no cheating.


The post that inspired this post [0] says:

> So while one could in theory be a valid contributor that makes use of LLMs, from the perspective of contributor poker it’s simply irrational for us to bet on LLM users while there’s a huge pool of other contributors that don’t present this risk factor.

> The people who remarked on how it’s impossible to know if a contribution comes from an LLM or not have completely missed the point of this policy and are clearly unaware of contributor poker.

The point isn't about the 3000 line PR, it's about whether we think the submitter is going to stick around.

[0] https://kristoff.it/blog/contributor-poker-and-ai/


It seems to be trivially easy for everyone but people heavily invested in LLMs to spot LLM slop.


Jacob is part of the core team, not a random outside contributor.


Very different context: that PR is from a maintainer and trusted member of Zig, who surely discussed the implementation/design internally as well.

