If the source language is C++, another option might be to use AI agents to port to a memory-safe subset of C++ [1]. For the most part, this involves surgical changes and glorified find-and-replace operations. And I'm guessing way fewer tokens :)
If the source language is legacy C, then another option might be (deterministic) transpilation to a memory-safe subset of C++ [2]. The resulting code wouldn't necessarily be performance-optimal, but it can be used for the majority of code that isn't really performance-sensitive.
You might be interested in the scpptool feature to help convert C code to a subset of C that will also compile as C++ (under clang++ at least) [1]. While many of the necessary modifications are fairly trivial, some of them aren't completely so. For example, C++ does not allow `goto`s that would skip over the declaration/initialization of a variable that would be accessible after the jump. So getting the C code to work as C++ can involve some (automatic) code restructuring.
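To make the `goto` issue concrete, here is a minimal sketch of the kind of restructuring involved. It is a hand-written illustration, not actual scpptool output, and the function and variable names are made up for the example.

```cpp
#include <cassert>  // used by the checks mentioned in the note below

// Hypothetical example. In C, this shape is accepted:
//
//     if (n < 0) goto done;
//     int doubled = n * 2;    /* initialized declaration */
//     result = doubled;
//   done:
//     return result;          /* `doubled` is still in scope here */
//
// C++ rejects it, because the jump bypasses the initialization of `doubled`
// while `doubled` is still in scope at the label.

// One restructuring that is accepted by both languages: confine the variable
// to an inner block, so it is no longer in scope at the jump target.
int double_or_minus_one(int n) {
    int result = -1;
    if (n < 0) {
        goto done;
    }
    {
        int doubled = n * 2;
        result = doubled;
    }
done:
    return result;
}
```

With this version, `assert(double_or_minus_one(3) == 6);` holds, and a negative input takes the `goto` path and returns -1. Another possible fix is to declare the variable without an initializer and assign to it afterwards.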
Another annoying detail is that C++ doesn't seem to like forward references of `enum`s. That is, while
struct A* a_ptr;
is fine in both C and C++ even before `struct A` has been defined, apparently
enum A* a_ptr;
is not cool in C++ until after `enum A` has been defined.
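A small sketch of two ways around this, using made-up names. (As far as I know, ISO C also technically disallows the forward reference to an `enum`, and gcc and clang just accept it as an extension in C mode, so treat the "fine in C" part as compiler behavior rather than a guarantee.)

```cpp
#include <cassert>  // used by the checks mentioned in the note below

// Hypothetical names, for illustration only.

// Option 1: move the enum definition ahead of the first pointer to it.
// This form is accepted as both C and C++.
enum Color { COLOR_RED, COLOR_GREEN };
enum Color* color_ptr;

// Option 2: a C++11 opaque enum declaration. The underlying type has to be
// spelled out. C standards through C17 do not accept this syntax, so it only
// helps if the code no longer needs to build as C.
enum Shape : int;
enum Shape* shape_ptr;  // OK: Shape is declared, though not yet defined
enum Shape : int { SHAPE_CIRCLE, SHAPE_SQUARE };
```

Both pointers are ordinary globals, so `assert(color_ptr == nullptr);` and `assert(shape_ptr == nullptr);` hold at startup.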
One arguable benefit of keeping your C code compatible with (or at least convertible to) C++ is that you can theoretically use scpptool's auto-translation feature as a build step to produce memory-safe executables from C code via transpilation to a memory-safe subset of C++.
> It's in many cases as simple as renaming a file from .c to .cpp.
That is rather optimistic, but, for example, scpptool has a feature [1] that auto-converts from C to a subset of C that can (hopefully) be compiled with clang++. If the original C source uses C11 extensions, clang++ seems to generally produce warnings rather than compile errors.
> But for writing something from scratch it's better to use Rust.
scpptool attempts to make C++ a more viable option by enforcing a memory and data race safe subset using a similar safety strategy.
I think the concern is that the writing may be on the wall for (the current memory-unsafe version of) Coreutils. Despite the bugs and incompatibilities, Canonical seems to have decided that the memory safety of uutils is worth it. And those two downsides, the bugs and incompatibilities, will likely attenuate quickly, compelling the other distros to follow suit in adopting uutils before long.
So the continued popularity of Coreutils might, I think, depend on Coreutils' near-term publicly announced and actual memory safety strategy. As I suggested in my other comment, there are (somewhat nascent) options for memory safety that do not require a rewrite of the code base. (For Linux x86_64 platforms, depending on your requirements, that might include the "fanatically compatible" Fil-C.) And given the high profile of Coreutils, there are likely people willing to work with the Coreutils team to help in the deployment of those memory safety options.
I don't know if you're aware, but there is a demonstration of wget (a fellow "gnu utility", right?) being auto-translated to a memory-safe subset of C++ [1]. Because the translation essentially does a one-for-one substitution of potentially unsafe C elements with safe C++ counterparts that mirror the behavior, the translation should be much less susceptible to the introduction of new bugs and behaviors in the way a rewrite would be.
With a little cleaning-up of the original code, the code translation ends up being fully automatic and so can be used as a build step to produce (slightly slower) memory-safe executables from the original C source.
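To give a rough idea of what "one-for-one substitution" means, here is a hand-written sketch. `CheckedCursor` is a hypothetical stand-in invented for this example. It is not the type the auto-translator actually emits, and it only illustrates bounds checking, not lifetime safety (it still holds a raw pointer to the buffer internally).

```cpp
#include <cassert>  // used by the checks mentioned in the note below
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical replacement for a raw `char*` that indexes into a buffer.
// It mirrors the pointer's indexing and dereference syntax, but every
// access is bounds-checked.
class CheckedCursor {
public:
    explicit CheckedCursor(std::vector<char>& buf, std::size_t pos = 0)
        : buf_(&buf), pos_(pos) {}
    // vector::at() throws std::out_of_range instead of reading out of bounds.
    char& operator*() const { return buf_->at(pos_); }
    char& operator[](std::size_t i) const { return buf_->at(pos_ + i); }
private:
    std::vector<char>* buf_;
    std::size_t pos_;
};

// Original C shape (for comparison):
//     size_t count_char(const char* p, size_t n, char c) { ... p[i] ... }
// The body is unchanged; only the parameter type was substituted.
std::size_t count_char(CheckedCursor p, std::size_t n, char c) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] == c) ++hits;
    }
    return hits;
}

// Passing a length that is too large becomes an exception, not an overread.
bool overrun_is_caught() {
    std::vector<char> buf = {'a', 'b', 'a'};
    try {
        count_char(CheckedCursor(buf), 4, 'a');
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

The point is that the function body reads the same as the C original, so there is less room for the translation to change behavior. With a three-element buffer `{'a', 'b', 'a'}`, `count_char(CheckedCursor(buf), 3, 'a')` returns 2 and `assert(overrun_is_caught());` holds.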
Filesystem access is mostly treated by users as serialized ACID transactions on "files in directories."
"Managing this resource centrally" is where unix syscalls came from. An OS kernel can be used like a specialized library for ACID transactions on hardware singletons.
People then got fancy with virtual memory, interrupts, signals, time-slicing, re-entrancy, thread-safety, and injectivity.
It doesn’t matter whether you call the "kernel library" from C, C++, Fortran, BASIC, Golang, bash, Rust, etc.
Interestingly, I recently auto-translated wget from C to a memory-safe subset of C++ [1], which involves the intermediate step of auto-converting from C to the subset of C that will also compile under clang++. You end up with a bunch of clang++ warnings about various things being C11 extensions and not ISO C++ compliant, but it does compile.
Plug: In theory you could auto-convert to a memory-safe subset of C++ as a build step. Auto-converted code would have some run-time overhead, but you can mark any performance-sensitive parts of the code to be exempt from conversion. And you get lifetime and type safety too. For full coverage, performance-sensitive parts of the code can be manually converted to the safe subset to minimize overhead. (Interfaces in extern C blocks remain unconverted by default to maintain ABI compatibility.)
As long as we're plugging our projects, I'll mention the scpptool-enforced memory-safe subset of C++. Fil-C would be generally more practical, more compatible and more expedient, but the scpptool-enforced subset of C++ is more directly comparable to Rust.
scpptool demonstrates enforcement (in C++) of a subset of Rust's static restrictions required to achieve complete memory and data race safety [1]. Probably most notably, the restriction against the aliasing of mutable references is not imposed "universally" the way it is in (Safe) Rust, but instead is only imposed in cases where such aliasing might endanger memory safety.
This is a surprisingly small set of cases that essentially consists of accesses to methods that can arbitrarily destroy objects owned by dynamic owning pointers or containers (like vectors) while references to the owned contents exist. Because the set is so small, the restriction does not conflict with the vast majority of (lines of) existing C++ code, making migration to the enforced safe subset much easier.
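A sketch of the distinction in plain standard C++ (no scpptool or SaferCPlusPlus types involved, and the function names are made up). The first function shows mutable aliasing that can't hurt memory safety. The second shows why a container method like `push_back()` is in the dangerous set: it can relocate the elements that an outstanding reference would point to.

```cpp
#include <cassert>  // used by the checks mentioned in the note below
#include <cstddef>
#include <cstdint>
#include <vector>

// Two mutable references to the same int. Safe Rust would reject the
// equivalent code, but nothing can be destroyed through either alias,
// so there is no memory-safety hazard here.
int harmless_aliasing() {
    int counter = 0;
    int& a = counter;
    int& b = counter;
    a += 1;
    b += 1;
    return counter;
}

// Returns true if growing the vector moved its elements to new storage.
// Any reference or pointer to an element taken before the growth would be
// dangling afterwards. Addresses are compared as integers so that no stale
// pointer is ever used.
bool growth_relocates_storage() {
    std::vector<int> v = {1, 2, 3};
    const auto before = reinterpret_cast<std::uintptr_t>(v.data());
    const std::size_t old_capacity = v.capacity();
    while (v.capacity() == old_capacity) {
        v.push_back(0);  // eventually forces a reallocation
    }
    const auto after = reinterpret_cast<std::uintptr_t>(v.data());
    return before != after;
}
```

Here `assert(harmless_aliasing() == 2);` and `assert(growth_relocates_storage());` both hold. Only the second pattern is the kind of thing the restriction needs to catch.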
The scpptool-enforced subset also has better support for cyclic and non-hierarchical pointers/references and (unlike Safe Rust) doesn't impose any requirements on how the referenced objects are allocated. This means that, in contrast to Rust, there is a "reasonable" (if not performance optimal) one-to-one mapping from "reasonable" code in the "unsafe subset" of C++ (i.e. traditional C++ code), to the enforced safe subset.
So, relevant to the subject of the post, this permits scpptool to have a (not yet complete) feature that automatically converts traditional C/C++ code to the safe subset of C++ [2]. (One that is deterministic and doesn't try to just punt the problem to LLMs.)
The problem isn't dedicating public resources to trying to get LLMs to convert C to Safe Rust after investments in the more traditional approach failed to deliver. The problem is the lack of simultaneous investment in at least the consideration and evaluation of (under-resourced) alternative approaches that have already demonstrated results that the (comparatively well-funded) translate-to-Rust approach thus far hasn't been able to.
> Profiles cannot achieve the same level of safety as Rust
So the claim is that the scpptool approach [1] can, while remaining closer to traditional C++, and without requiring the introduction of new language elements. Since the scpptool-enforced safe subset of C++ is an actual subset of C++, conforming code continues to build with your existing compiler. It just uses an additional static analyzer to check conformance.
For the 90% or whatever of C++ code that is not actually performance sensitive, the associated SaferCPlusPlus library provides drop-in and "one-to-one" safe replacements for unsafe C++ elements (like standard library containers and raw pointers). (For example, if you're worried about potentially invalid vector iterators, you can just replace your std::vector<>s with mse::mstd::vector<>s.) With these elements, most of the safety is enforced in the type system and not reliant on the static analyzer.
Conforming implementations of performance-sensitive code would be more restricted and more reliant on the static analyzer for safety enforcement. They sometimes also require the use of library elements, like "borrowing objects", which may not have analogues in traditional C++. But overall, even high-performance conforming code remains very recognizable C++.
The claim is that the scpptool approach is a straightforward path to full memory (and data race) safety for C++, and the one that requires the least code migration effort. (And again, as an actual subset of existing C++, not technically dependent on standards committees or compiler vendors for its implementation or deployment.)
[1] https://github.com/duneroadrunner/scpp_code_migration
[2] https://github.com/duneroadrunner/SaferCPlusPlus-AutoTransla...