More

rwallace · 2026-05-31T14:08:41 1780236521

The one way I know of for denormals to arise in real calculations is when you have a process that converges on zero. In which case, values will pass through the denormal range on the way from the normal range down to zero. And converting denormals to zero is obviously correct.

What other cases have you seen of denormals arising in real calculations?

adrian_b · 2026-05-31T14:29:02 1780237742

If you compute a limit numerically, the decision that you have reached the limit must normally be taken long before there is any possibility of underflow, i.e. of generating denormals.

Typically you compute the difference between 2 adjacent terms of the convergent sequence and you decide that you have reached the limit when adding the difference to the current term to get the next does not change it (or when the relative error is smaller than some threshold). At this time the difference will still be many orders of magnitude greater than a denormal.

If the limit happens to be zero, then what you describe can happen. The programs where this can happen normally combine two different criteria to decide that the limit has been reached, i.e. that either the relative error is small enough, i.e. like I said that adding the difference to the previous term does not change it, or that the absolute error is small enough, i.e. that the difference is smaller than some threshold. The absolute error used as threshold is normally chosen based on what is physically meaningful in that problem, i.e. depending on what kind of physical quantity corresponds to the values of the terms of the convergent series.

An example of applications where the users must configure both relative errors and absolute errors, to be used as criteria of convergence, are the SPICE-like programs used to simulate electronic circuits, where the user must configure both a generally-applicable relative error, and absolute errors for different kinds of physical quantities, e.g. an absolute error for voltages and an absolute error for electric currents, which will be used, respectively, when a sequence of voltages converges towards zero or a sequence of currents converges towards zero, so that the absolute error criterion will be satisfied before the relative error criterion is satisfied.

In any case, in a correct program the attempt to find the limit should always stop before underflows become possible, so denormals should never be generated if the underflow exceptions are masked.

Denormals can appear in a lot of cases when almost equal values are subtracted, e.g. in the solution of many kinds of badly conditioned equations, but in most such cases there may exist alternative formulae that avoid underflows, i.e. the generation of denormals.

In general, when denormals appear, this signals bugs in the program, which must be investigated and fixed. The purpose of denormals is to allow the programmer to not fix the bugs, without causing catastrophic errors, like those that can happen when underflows generate null results, i.e. when "denormals are flushed to zero".

Careful programmers should nonetheless fix any bugs that generate denormals.

rwallace · 2026-05-31T15:19:11 1780240751

Ah! So let me paraphrase to make sure I understand. In your experience, the real use case of denormals is not that you want to compute with them. It's that they are a useful error signal, because they typically indicate that the equations are badly conditioned, which means the results cannot be trusted. So in that scenario, silently flushing them to zero is bad, but so is silently computing with them. What you really want is for denormals to raise an exception?

adrian_b · 2026-05-31T15:55:05 1780242905

They are indeed a useful error signal, like also infinities or NaNs.

Moreover, they are much more benign than infinities or NaNs, which signal bugs that you most likely have to fix.

Denormals cause only very small errors, which may be acceptable in most applications. Therefore you may choose to ignore such bugs and not fix them, because a modified algorithm that avoids underflows may be significantly more complex than the original algorithm.

Configuring the non-standard option to flush denormals to zero is bad for two reasons. Not only it removes evidence that something bad has happened, but unlike with denormals, this can cause huge errors, even infinite errors.

When the operations are done according to the IEEE standard, every operation has a limit for the relative error and it is possible to do a numeric analysis of some computational algorithm and estimate the maximum errors that can affect its results.

When underflows generate neither exceptions nor denormals, all guarantees are removed and you can no longer predict anything about the final errors of an algorithm.

rwallace · 2026-05-24T21:57:35 1779659855

Because they believed only a small fraction of customers would care a lot about a better Basic interpreter, whereas the vast majority would care about price. As for giving things away for free, Commodore was already engaged in a cutthroat price war with Texas Instruments that almost sank the company. They could not afford to give anything else away for free.

rwallace · 2026-05-23T20:15:48 1779567348

I didn't know about that event. What did he say to get booed that time?

rwallace · 2026-02-27T13:45:59 1772199959

I looked into that, concluded the spoiler is Specter.

Basically, you have to have out of order/speculative execution if you ultimately want the best performance on general/integer workloads. And once you have that, timing information is going to leak from one process into another, and that timing information can be used to infer the contents of memory. As far as I can see, there is no way to block this in software. No substitute for the CPU knowing 'that page should not be accessible to this process, activate timing leak mitigation'.

zozbot234 · 2026-02-27T14:27:38 1772202458

OTOH, out of order/speculative execution only amounts to information disclosure. And general purpose OS's (without mandatory access control or multilevel security, which are of mere academic interest) were never designed to protect against that.

A far greater problem is that until very recently, practical memory safety required the use of inefficient GC. Even a largely memory-safe language like Rust actually requires runtime memory protection unless stack depth requirements can be fully determined at compile time (which they generally can't, especially if separately-provided program modules are involved).

rwallace · 2025-12-31T14:37:53 1767191873

Happy new year from Ireland!

rwallace · 2025-11-27T05:58:01 1764223081

Sure. If you feel the need to write "this is shitty code", fair enough, I'm fine with making allowances for that kind of language. But please leave it at that, instead of also insulting the people who wrote it. There are, unfortunately, plenty of ways for bad incentives to result in competent people creating bad products.

rwallace · 2025-10-10T02:01:58 1760061718

In general, if a program is not a Web browser, I do not want that program getting clever with displaying a URL as an active link. Just show it as the URL, https prefix and all, so I can see where it will be taking me if I copy it into the browser. (This is one of my very few gripes with Google docs.)

rwallace · 2025-10-08T03:46:10 1759895170

I disagree with your second sentence above. I think it is shallow dismissals that are inappropriate here. Conversely, I have upvoted a number of long explanations of interesting topics.

If you don't have time to write a detailed refutation, that's entirely understandable! But if you do, I would love to read it.

I did appreciate your paragraph length comment on the matter elsewhere in this thread.

FrankWilhoit · 2025-10-08T11:25:14 1759922714

My original comment was one of those "if you know, you know" things.

Topics age out very quickly on this site.

I have two lengths: aphorism or six pages. The graf that you so kindly praise raises more questions than it answers and I am not comfortable with that.

I am also not comfortable with spoon-feeding my own views, which are evidently idiosyncratic. At medium length, that is the effect. To marshal citations and evidence in order to re-attain objectivity really does explode the level of effort. It is worthwhile, but this is a quick-hit forum, things roll off the front page usually in about an hour or two.

The flip side of that is that there will be other opportunities.

rwallace · 2025-10-08T16:25:57 1759940757

Fair enough! If you ever have the time and inclination to write it up as a blog post or such, I would love to read it.

FrankWilhoit · 2025-10-08T21:07:33 1759957653

The topic of business-process analysis crops up often enough. I'll throw a few bricks the next time it does.

The topics of the roles of accountants and auditors crop up much less often, which is puzzling all by itself. But you absolutely cannot talk about ERP adoption without getting into those things up to the elbows.

rwallace · 2025-10-04T14:12:14 1759587134

To clarify a couple of things:

I'm aware of the reasons Linux uses doubly linked lists. I'm still of the opinion this design decision made a lot of sense in the 1990s, and would make less sense in a new project today. You may disagree, and that's fine! Engineering is constrained by hard truths, but within those constraints, is full of judgment calls.

I'm not a member of the evangelical Rust community. I'm not proclaiming 'thou shalt rewrite it all in Rust'. Maybe you have good reasons to use something else. That's fine.

But if you are considering Rust, and are concerned about its ability to handle cyclic data structures, there are ways to do it, that don't involve throwing out all the benefits the language offers.

vacuity · 2025-10-04T17:11:27 1759597887

Linux's use of pointer-heavy datastructures is largely justified even today. The low-level kernel bookkeeping requirements rule out many typical suggestions. Modern hardware suggests certain patterns in broad strokes, which are sound for mostly any software. But the kernel has the unique role of multiplexing the hardware, instead of just running on the hardware. It is less concerned with, e.g., how to crunch numbers quickly than it is with how to facilitate other software crunching numbers quickly. As someone else noted elsewhere in the top-level thread, unsafe Rust is the appropriate compromise for things that must be done but aren't suitable for safe Rust. Unsafe Rust is one of the best realizations of engineering tradeoffs that I've seen, and neither C nor safe Rust justify its absence. Rust is only a "systems" language with unsafe, and that doesn't drag Rust down but rather strengthens it. "Here is the ugliness of the world, and I will not shy from it."

autoreleasepool · 2025-10-04T20:29:23 1759609763

Beautifully put, poetic even. The only problem is that, in (*actual).reality, unsafe Rust is difficult to read, difficult to write, difficult to maintain, difficult to learn, has zero features for ergonomics, and has poorly defined rules.

C and Zig have a big advantage for writing maintainable, easy to read unsafe code over unsafe Rust.

rstuart4133 · 2025-10-06T05:39:49 1759729189

> The only problem is that, in (*actual).reality, unsafe Rust is difficult to read, difficult to write, difficult to maintain, difficult to learn, has zero features for ergonomics

All true.

> and has poorly defined rules.

The rules aren't so poorly documented. The options for working with/around them, like say using UnsafeCell, do seem to be scattered across the internet.

But you omitted a key point: unlike C and Zig, unsafe code in Rust is contained. In the Rust std library for example, there are 35k functions, 7.5k are unsafe. In C or Zig, all 35k would be unsafe. If you are claiming those 35k unsafe lines in C or Zig would be easier to maintain safely than those 7.5k lines of unsafe Rust, I'd disagree.

vacuity · 2025-10-04T20:44:37 1759610677

I agree that unsafe Rust is not comfortable or simple, but I think it is usable when appropriately applied. It should very scarcely be used as a performance optimization. The Rust devs are quite invested in minimizing unsoundness in practice, particularly with Miri. In coming years, I expect unsafe to be further improved. Over the entire ecosystem, Rust with judicious use of unsafe is, in my opinion, vastly superior to C, unless the C is developed stringently, which is rare.

jstimpfle · 2025-10-04T15:10:05 1759590605

So, as an example, you'd be happily spending an extra allocation and extra pointers of space for each item in a list, even when that item type itself is only a couple of bytes, and you potentially need many millions of that type? Just so your design is not "from the nineties"?

An extrusive list needs at least 1 more pointer (to the item, separate from the link node), and possibly an additional backpointer to the link node when you want to unlink that node. It also adds allocation overhead and cache misses.

Intrusive lists are one of the few essential tools to achieve performance and low latency.

Or were you thinking of dynamically reallocating vectors? They are not an alternative, they are almost completely unusable in hardcore systems programming. Reallocating destroys pointer stability and adds latency, both very bad for concurrency.

prngl · 2025-10-04T16:00:12 1759593612

I’m sorry, I did not intend to accuse you of being part of the evangelical community. Your article only prompted the thought I shared.

On the technical point, I think I do disagree, but open to changing my mind. What would be better? I’m working on an async runtime currently, written in C, and I’m using several intrusive doubly linked lists because of their properties I mentioned.

rwallace · 2025-10-04T18:23:40 1759602220

No problem!

As to what would be better - this is also a reply to your sibling comments above - I don't have a single across-the-board solution; the equivalent of std::vector everywhere is fine for some kinds of application code, but not necessarily for system code. Instead, I would start by asking questions.

What kinds of entities are you dealing with, what kinds of collections, and, critically, how many entities along each dimension, to an order of magnitude, p50 and p99? What are your typical access patterns? What are your use cases, so that you can figure out what figures of merit to optimize for? How unpredictable will be the adding of more use cases in the future?

In most kinds of application code, it's okay to just go for big-O, but for performance critical system code, you also need to care about constant factors. As an intuition primer, how many bytes can you memcpy in the time it takes for one cache miss? If your intuition for performance was trained in the eighties and nineties, as mine initially was, the answer may be larger than you expect.

jstimpfle · 2025-10-05T10:23:50 1759659830

Even if you just go for big-O, don't forget that a resizable array won't give you even amortized O(1) delete in many cases. This alone is likely prohibitive unless you can bound the elements in the container to a small number.

And if you're trying to trade away good big-O for better cache locality, don't forget that in many cases, you're dealing with stateful objects that need to be put into the list. That means you likely need to have a list or queue of pointers to these objects. And no matter how flat or cache-friendly the queue is, adding this indirection is similarly cache-unfriendly whenever you have to actually access the state inside the container.

rwallace · 2025-10-05T15:09:38 1759676978

Or unless delete is a rare operation. So yeah, to make the best decisions here, you need to know expected numbers as well as expected access patterns.

As far as I can see, you are indeed going to incur one extra memory access apart from the object itself, for any design other than just 'Temporarily flag the object deleted, sweep deleted objects in bulk later' (which would only be good if deleted objects are uncommon). It still matters how many extra memory accesses; deleting an object from a doubly linked list accesses two other objects.

It also matters somewhat how many cache lines each object takes up. I say 'somewhat' because even if an object is bulky, you might be able to arrange it so that the most commonly accessed fields fit in one or two cache lines at the beginning.

rwallace · 2025-09-13T22:51:12 1757803872

Yes, that was exactly the reason. CJK unification happened during the few years when we were all trying to convince ourselves that 16 bits would be enough. By the time we acknowledged otherwise, it was too late.