Know thy .NET object memory layout (mattwarren.org)
77 points by matthewwarren on Sept 9, 2014 | 9 comments


If I'm understanding this correctly, the JITter automatically does what we used to do manually in C: place larger variables first inside our structs, so that unnecessary padding isn't added to achieve word alignment.

If order is otherwise unimportant to you (no coding standard demanding alphabetical order, no offsets that must match a layout provided by someone else, etc.), you could do it yourself when creating your class/struct.
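As a rough illustration of the manual C technique described above, here is a minimal sketch. The struct and function names are made up for the example, and the exact sizes depend on the compiler and ABI, so the numbers in the comments are typical values rather than guarantees.

```c
#include <assert.h>
#include <stddef.h>  /* size_t */

/* Fields in "as written" order: the small field ahead of the double
   typically forces padding so the double lands on its own alignment. */
struct naive_order {
    char   flag;   /* 1 byte, usually followed by padding */
    double value;  /* typically 8-byte aligned */
    char   tag;    /* 1 byte, usually followed by tail padding */
};

/* Same fields, largest first: the two chars can share one padded slot. */
struct largest_first {
    double value;
    char   flag;
    char   tag;
};

/* Bytes saved by reordering. ABI-dependent; commonly 24 - 16 = 8 on x86-64. */
static size_t bytes_saved(void) {
    /* Guard against unsigned underflow on an unusual ABI. */
    assert(sizeof(struct naive_order) >= sizeof(struct largest_first));
    return sizeof(struct naive_order) - sizeof(struct largest_first);
}
```

Printing `sizeof` for both structs on your own toolchain is the quickest way to see what padding the compiler actually inserted.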


Yeah that sounds right, basically the JITter is free to re-order as it wants. The order you write the fields in your class (in a .cs file) doesn't enforce anything.

If you really need a specific memory layout, you need to use a trick like the one in the article, or resort to LayoutKind.Sequential[0].

[0] http://support.microsoft.com/kb/922785


I was trying to figure out why I could write a ring buffer in Go and it would perform so much faster than the equivalent C# implementation.

This is part of the answer. Despite an attempt to align a struct to leverage the cache, it was getting rearranged.

Of course, there is also less inlining, and the scheduler behind the async command really doesn't like it when you get more active pseudo-threads than cores.

In the end, the TPL Dataflow version was the only implementation that was roughly in the same ballpark.


Yeah the .NET JITter is free to do what it wants with the field layout, mostly to pack/align things for better perf.

BTW have you looked at Disruptor-net[0], the .NET port of Disruptor? It's a high-speed ring buffer written in .NET that might contain some tricks to get your perf closer to the Go version.

[0] https://github.com/disruptor-net/Disruptor-net


Forgive me. What do you mean by "pack/align things for better perf"? Specifically, in what use cases would this yield better performance? And by what measurements?


It's related to the CPU; there's a nice wiki article[0] that goes into it a bit more. [0] also mentions that some SSE instructions only work with 128-bit (16-byte) aligned data, so sometimes alignment is a requirement. Alignment can make memory access more efficient, but means that data can take up more space. Packing is the opposite: rearranging fields so that they take up the smallest space possible.

So the 2 things are a trade-off.

[0] http://en.wikipedia.org/wiki/Data_structure_alignment


Data needs to be aligned to a multiple of its size on most CPU architectures to be accessed without performance degradation (or at all, even). Reordering members for smaller object size mostly makes sense so that more of them can fit into cache.
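To make the "aligned to a multiple of its size" rule concrete, here is a small sketch that computes how large a struct ends up for a given field order. It assumes each field's alignment equals its size and that sizes are powers of two, which is a simplification of real ABIs; `align_up` and `layout_size` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Total size of a struct whose fields have the given sizes, laid out in the
   given order, assuming each field is aligned to its own size. */
static size_t layout_size(const size_t *field_sizes, size_t count) {
    size_t offset = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < count; i++) {
        offset = align_up(offset, field_sizes[i]);  /* padding before the field */
        offset += field_sizes[i];                   /* the field itself */
        if (field_sizes[i] > max_align) {
            max_align = field_sizes[i];
        }
    }
    /* Tail padding so that arrays of the struct keep every element aligned. */
    return align_up(offset, max_align);
}
```

Under these assumptions, field sizes {1, 8, 1} give 24 bytes while {8, 1, 1} give 16, so a 64-byte cache line holds four of the reordered objects instead of two.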


Why does the object layout matter for this use case? Essentially, recording a histogram as described boils down to a single line to record a new observation: this.counts[(int)Math.Floor(offset + scale * Math.Log(value))]++; You might use Interlocked to make it thread-safe, and make the index calculation integers-only if that saves some cycles. Besides getting offset and scale into the same cache line, where does the object layout matter?
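For what it's worth, an integers-only version of that recording line might look like the sketch below. It uses floor(log2(value)) as the bucket index, which corresponds to one particular choice of offset and scale; it is not HdrHistogram's actual bucketing scheme, it is not thread-safe, and all names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_COUNT 64  /* one bucket per possible bit position of a uint64_t */

struct log_histogram {
    uint64_t counts[BUCKET_COUNT];
};

/* floor(log2(value)) for value > 0, using only shifts. */
static unsigned bucket_index(uint64_t value) {
    assert(value > 0);
    unsigned index = 0;
    while (value >>= 1) {
        index++;
    }
    return index;
}

/* The whole hot path: compute the bucket and bump its counter. */
static void record(struct log_histogram *h, uint64_t value) {
    h->counts[bucket_index(value)]++;
}
```

Recording 1000 and 1023 both land in bucket 9, since both lie between 2^9 and 2^10.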


The HdrHistogram class is part of a hierarchy, to make the code more sane, to make implementing specialised versions easier (Short, Long, etc) and to enforce the memory layout:

- AbstractHistogramBase [0]
- AbstractHistogram [1]
- Histogram (or IntHistogram, ShortHistogram)

AbstractHistogram contains all the "Hot" fields, i.e. the ones that you want to get into memory at the same time, with the least possible work. These fields are used repeatedly when storing values, which is the "Hot Path" for HdrHistogram.

AbstractHistogramBase contains a lot of fields that are only used when you are iterating or displaying results, so the idea is that they are kept out of the way when recording values.

[0] https://github.com/HdrHistogram/HdrHistogram/blob/master/src...

[1] https://github.com/HdrHistogram/HdrHistogram/blob/master/src...



