From Newsgroup: comp.programming
In article <106lbus$155i0$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 29/07/2025 20:27, Dan Cross wrote:
In article <106apsa$2nju3$1@dont-email.me>,
David Brown <david.brown@hesbynett.no> wrote:
On 29/07/2025 14:16, Dan Cross wrote:
[snip]
I personally don't know enough Rust to make any reasonable comparison
with other languages. I also think there is scope for all sorts of
languages, and it seems perfectly reasonable to me for Rust to be
"better" than C while also having C be "better" than Rust - different
languages have their strengths and weaknesses.
I agree with this: modern C certainly has its place, and it
would be foolish to think that the many billions of lines of C
(or C++, for that matter) in existence today are simply going to
vanish and be replaced with well-written, idiomatic Rust
tomorrow.
But no one serious is suggesting that. What I think a number of
folks _are_ suggesting is that experience is proving that we get
better, less buggy results out of Rust than equivalent C.
Actually, I think many people /are/ suggesting that Rust be used
instead of C, as though it were a clear and complete "upgrade",
and that those who write C today should switch to Rust and
magically create "safer" code (for some value of "safer").
Oh, many people are saying it, and they are even saying it
seriously, but they are not serious people. By which I don't
mean people who are serious about replacing C with Rust, but
rather, people who are serious in the sense of being both
qualified and having the judgement to understand the tradeoffs.
Given those definitions, I maintain what I said: no one who is
serious is actually suggesting that we just dump all C code in
the world and replace it with Rust. Ok, maybe a few are, but I
think those people mean on an extremely long timeframe, measured
in decades, at a minimum.
Now, I /do/ think that many people who write C code today could write
code that is substantially better (fewer bugs, more efficient
development, or other metrics) if they used different languages. (I
also think the software world could benefit if some people stopped
programming altogether.) I don't see many areas for which C is the
ideal choice of language for new code.
But I don't think Rust is the magic bullet that a number of its
advocates appear to believe. I think many of those C programmers would
be better off switching to C++, Python, or various other languages
(with Rust being included as one of those).
I agree with all of the above points.
Well, I don't know about encouraging folks to quit; some folks
may realize it's not for them and opt out, but I kind of look at
someone who struggles to become a competent programmer as a
failure of that person's mentors, teachers, managers, and so on.
That's a topic for another time, though.
(To be clear, I am not saying that /you/ are claiming Rust is anything
like that.)
No problem; I understood that. :-)
But one thing that bothers me is that Rust advocates almost invariably
compare modern Rust, programmed by top-rank programmers interested in
writing top-quality code, with ancient C written by people who may have
very different abilities and motivations.
Rust is the new, cool language - the programmers who use it are
enthusiasts who are actively interested in programming, and talented
enough to learn the language themselves and are keen to make the best
of it. C, on the other hand, has been the staple language for
workhorse tasks. The great majority of people programming in C over
the decades do so because that's what they learned at university, and
that's what their employers pay them to write. They write C code to
earn a living, and while I am sure most take pride in their jobs,
their task is not to write top-quality bug-free C code, but to balance
the cost of writing code that is good enough with the costs and
benefits to customers.
So it is an artificial and unfair comparison to suggest, as many Rust
enthusiasts do, that existing C code has lots of bugs that could be
prevented by writing the code in Rust - the bugs could be prevented
equally well by one of those Rust programmers re-writing the code in
good, modern C using modern C development tools.
You have a point that transcends any sort of Rust<->C debate.
Indeed, it is difficult to compare C'23 to C'89, let alone pre-ANSI
"typesetter" C, let alone the C that, say, 6th Edition Unix was
written in. Those are all very different languages.
Yes, I agree with that. Languages have changed over the last few
decades, even within the confines of a single nominal language (like
"C", "C++", "Python", or any other living language). The way we use
languages, and what we do with them, has also changed - again, even if
you stick to a single language variant (such as C90). And the tools
have changed hugely too.
Agreed.
That said, there are some things that are known to be problematic, and
that you cannot escape in C, but that are simply impossible to
represent in (correct, safe) Rust. The canonical example in the
memory-safety domain is Rust's non-nullable reference types vs C
pointers; the latter can be nil, the former cannot. And while Rust
_does_ have "raw" pointers (that can be null), you have to use
`unsafe` to dereference them. The upshot is that in safe Rust, you
cannot dereference a NULL pointer; perhaps Andy Hoare's "billion
dollar mistake" can be fixed.
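To make that concrete, here is a minimal sketch of my own (not from the post): absence is spelled `Option`, the compiler forces the check, and dereferencing a raw (possibly-null) pointer is gated behind `unsafe`. The function name `first_char` is purely illustrative.

```rust
// In safe Rust, "maybe absent" lives in the type system, not in a
// nullable pointer that you can forget to check.
fn first_char(s: Option<&str>) -> char {
    match s {
        Some(text) => text.chars().next().unwrap_or('?'),
        None => '?', // the "null" case cannot be silently skipped
    }
}

fn main() {
    assert_eq!(first_char(Some("hello")), 'h');
    assert_eq!(first_char(None), '?');

    // Raw pointers exist and can be null, but dereferencing one
    // requires an `unsafe` block:
    let p: *const i32 = std::ptr::null();
    assert!(p.is_null());
    // let v = *p; // error[E0133]: dereference of raw pointer is unsafe
}
```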
That is all true (other than Tony Hoare's name),
Oops! Thanks.
and it is definitely
one of Rust's many positive features. However, it is easy to exaggerate
the importance of this for many reasons:
1. As I understand it, "unsafe" Rust code is common, especially in
low-level code.
I write a lot of low-level Rust code, and I don't think that's
quite accurate. Using unsafe at all? Sure, but canonically one
still tries to minimize it, and wrap it up in a safe interface.
Of course, in real world low-level code, we have to do things
like manipulate machine and device registers, manipulate address
spaces, write memory allocators, and so forth. Some amount of
`unsafe` code is generally going to be required. And this is
qualitatively different than those writing (say) userspace code,
but even there, we need to do things like invoke system calls,
write memory allocators, and deal with `mmap`. But it doesn't
follow that every other line is `unsafe`, or that `unsafe` in
low-level code need be significantly more common than in
high-level code.
Here's a somewhat trivial example; this is the code for writing
text to the CGA device from rxv64 (that is, printing to the
text-mode graphics interface):
https://github.com/dancrossnyc/rxv64/blob/main/kernel/src/cga.rs
This isn't a huge module; about 100 lines. But it does do some
slightly fiddly stuff to maintain the state of the screen,
position the cursor, scroll text, and so on, all while actually
writing to the (memory-mapped) display; some other code has
already mapped that into the kernel's virtual address space.
There are precisely 4 uses of `unsafe`: one wraps an intrinsic
(`volatile_copy_memory`), and intrinsics are always unsafe.
Another is creating a `NonNull` object, which encapsulates a
pointer into a "new type", the existence of which asserts that
the pointer is non-null; the pointer in this case is being taken
from a constant that names the (fixed) virtual address of the
start of the CGA MMIO region. A third instance wraps unpacking
that pointer, and converting it into a reference to a mutable
slice of bytes. This is `unsafe` simply to ensure that the
programmer acknowledges that the rules for converting a pointer
to a (mutable) reference are being upheld, as the language has
no insight to do that at compile time itself. The fourth and
final wraps a series of `outb` calls that set the cursor
location. `outb` is considered an "unsafe" function because it
is possible to (ab)use it to do things that could, in theory,
violate memory safety (i.e., reprogram DMA addresses on IO
devices and things like that).
All in all, 4 `unsafe` blocks covering a total of 8 statements
out of ~100 lines of code to drive a memory-mapped device does
not seem excessive, or especially common, to me. In the kernel
as a whole, out of about 7K lines of code, the `unsafe` keyword
appears exactly 369 times, probably covering 10% or less of
total code. I could probably make that substantially fewer, but
I choose not to, in part because it would diminish the
pedagogical value of the system.
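The pattern is roughly this; a hypothetical sketch of mine, not the actual cga.rs code. The "device" here is an ordinary array so the example runs anywhere, whereas the real driver would get its pointer from a fixed MMIO address:

```rust
// A small amount of `unsafe` at the boundary, wrapped in a safe
// interface that callers never have to think about.
struct Screen {
    buf: core::ptr::NonNull<u8>,
    len: usize,
}

impl Screen {
    fn new(storage: &mut [u8]) -> Screen {
        Screen {
            // NonNull asserts, once, that the pointer is not null.
            buf: core::ptr::NonNull::new(storage.as_mut_ptr()).expect("non-null"),
            len: storage.len(),
        }
    }

    // Safe interface: bounds-checked volatile write. The single
    // `unsafe` block acknowledges the raw-pointer rules are upheld.
    fn write_byte(&mut self, off: usize, b: u8) {
        assert!(off < self.len, "out of bounds");
        unsafe { self.buf.as_ptr().add(off).write_volatile(b) }
    }
}

fn main() {
    let mut fake_vram = [0u8; 16]; // stand-in for the MMIO region
    let mut s = Screen::new(&mut fake_vram);
    s.write_byte(0, b'A');
    assert_eq!(fake_vram[0], b'A');
}
```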
I think it is a good thing to have the separation of "safe" and
"unsafe" code, and isolating riskier coding techniques is beneficial.
But perhaps rather than dividing a Rust program into "safe" and
"unsafe" parts, it would be better still to divide the program into
low-level efficiency-critical code written in C, and write the
majority of the code in a higher-level managed language where you
don't really have pointers, or at least where any bugs in
pointer-style concepts would be caught immediately. Again, I am not
saying Rust's philosophy is bad here, merely that there are
alternatives that could be better and that Rust's benefits are often
over-sold.
Over the years, this has been done many times. A relatively
recent example may be Biscuit
(https://pdos.csail.mit.edu/projects/biscuit.html).
But again, I don't see why it follows that I'd write the
"low-level" parts in C. Indeed, one could structure a system
largely along the lines you described, writing the "low-level
efficiency-critical" parts in Rust. Tock did something like this
(https://tockos.org), limiting by policy the places where
`unsafe` can be used. More recently, ASTERINAS
(https://www.usenix.org/conference/atc25/presentation/peng-yuke)
enforces a similar division. Both are in Rust.
Managed languages are wonderful, but they do have downsides.
Rust tries to thread the needle in giving you precise control
over things like memory allocation and deallocation, without
the overhead, but with a measure of type- and memory-safety you
usually only find in managed languages.
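As a small illustration of that control (my sketch, nothing beyond the standard library): destruction is deterministic and visible in the source, with no collector involved.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A value that records, in a shared log, the moment it is destroyed.
struct Tracked(Rc<RefCell<Vec<&'static str>>>, &'static str);

impl Drop for Tracked {
    fn drop(&mut self) {
        self.0.borrow_mut().push(self.1);
    }
}

fn main() {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _t = Tracked(log.clone(), "scope-end");
        log.borrow_mut().push("in-scope");
    } // `_t` is freed exactly here, at end of scope: no GC pause,
      // no finalizer running at some unpredictable later time.
    log.borrow_mut().push("after-scope");
    assert_eq!(*log.borrow(), vec!["in-scope", "scope-end", "after-scope"]);
}
```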
2. It is perfectly possible to write C code without having errors due
to null pointer dereferences. I do not recall ever having had a null
pointer bug in any C code I have written over the last 30+ years. I
have occasionally had other pointer-related errors, typically as a
result of typos, and usually found very quickly.
While impressive, this is exceedingly rare.
Indeed, I was just debugging a NULL pointer dereference bug in
the Linux kernel earlier today:
https://lore.kernel.org/linux-hams/CAEoi9W4FGoEv+2FUKs7zc=XoLuwhhLY8f8t_xQ6MgTJyzQPxXA@mail.gmail.com/
These kinds of mistakes are avoided with due care and attention to the
coding, appropriate testing, sensible coding standards, and other good
development practices. Good tools can help catch problems at compile
time (such as compiler warnings about uninitialised variables, or
extensions to mark function parameters as "non-null"). They can also
catch bugs quickly at runtime, such as using sanitisers. I am a big
fan of any language feature that can make it more difficult to put
bugs in your code (the power of a programming language mainly comes
not from what it allows you to write, but from what it stops you from
writing). But things like null pointer bugs are not a symptom of C
programming, but of poor software development - and that is
language-independent.
I strongly disagree with this: it simply doesn't scale, and as
projects get bigger, with more engineers working on them
concurrently, we inevitably run into bugs.
Consider the null pointer deref bug in Linux mentioned above:
what's going on there? It's not that these people are dumb, or
that they are not using capable tools; indeed, the Linux kernel
lock annotation stuff has been a real advance. But this _is_ an
example of someone making a change that violated someone else's
assumptions, across a very large system, and manifested as a nil
pointer in a bit of code that is both obscure and also difficult
to test (to really test AX.25 you need an RF path, and for that,
you need a license and special equipment).
Could better programming practices have caught this? Maybe.
But a language that doesn't let you express the manifestation of
the bug in the first place obviates the need for that.
3. Alternative existing and established languages already provide
features that handle this - the prime example being C++.
People keep saying that, and yet available data suggests that
C++ is not nearly as effective as people claim in this area.
Microsoft may be much maligned, but they employ some damned good
engineers, and they're betting on Rust, not C++, to solve a lot
of these sorts of problems.
I also see little in the way of comparisons between Rust and modern C++.
Really? I see quite a lot of comparison between Rust and C++.
Most of the data out of Google and Microsoft, for instance, is
not comparing Rust and C, it's actually comparing Rust and C++.
Where Rust is compared to C most often is in the embedded space.
I work with embedded development, so maybe that colours my reading!
Fair point. You may find something like this interesting:
https://security.googleblog.com/2024/09/deploying-rust-in-existing-firmware.html
Many of the "typical C" bugs - dynamic memory leaks and bugs, buffer
overflows in arrays and string handling, etc. - disappear entirely
when you use C++ with smart pointers, std::vector<>, std::string, and
the C++ Core Guidelines. (Again - I am not saying that C++ is
"better" than Rust, or vice versa. Each language has its pros and
cons.)
And yet, experience has shown that, even in very good C++ code
bases, we find that programs routinely hit those sorts of
issues. Indeed, consider trying to embed a reference into a
`std::vector`, perhaps so that one can do dynamic dispatch thru
a vtable. How do you do it? This is an area where C++
basically forces you to use a pointer; even if you try to put a
`unique_ptr` into the vector, those can still own a `nullptr`.
Sure, there are still occasions when you need raw pointers in C++. It
is far from a "perfect" language, and suffers greatly from many of its
good ideas being add-ons rather than in the original base language,
meaning the older more "dangerous" styles being still available (and
indeed often the default). But just as you should not need raw pointers
in most of your Rust programming, you should not need raw pointers in
most of your C++ programming.
Except that you do, if you want to do (say) dynamic method
dispatch across a collection of objects. If you don't use
inheritance you don't have to care, but that's a pretty
fundamental part of the language.
I'd say this is qualitatively different than the situation in
Rust, where most of the time you don't need the dangerous thing,
while it's baked into core language features in C++.
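For contrast, here is a small sketch of my own (assuming nothing beyond the standard library) of dynamic dispatch over a collection in Rust: the boxed trait objects own their data and can never be null, so there is no equivalent of a vector of possibly-null base-class pointers.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}
impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.0 * self.0 }
}

fn main() {
    // Dynamic dispatch through a vtable, with no nullable pointer:
    // Box<dyn Shape> always owns a live object.
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Circle(1.0))];
    let total: f64 = shapes.iter().map(|s| s.area()).sum();
    assert!((total - (4.0 + std::f64::consts::PI)).abs() < 1e-9);
}
```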
So while I appreciate that comparing these two projects might be more
useful than many vague "C vs. Rust" comparisons, it is still a
comparison between a 10-20 year old C project and a modern Rust design.
The most immediate first-impression difference between the projects is
that the Rust version is sensibly organised in directories, while the C
project jumbles OS code and user-land utilities together. That has,
obviously, absolutely nothing to do with the languages involved. Like
so often when a Rust re-implementation of existing C code gives nicer,
safer, and more efficient results, the prime reason is that you have a
re-design of the project in a modern style using modern tools with the
experience of knowing the existing C code and its specifications (which
have usually changed greatly during the lifetime of the C code). You'd
get at least 90% of the benefits by doing the same re-write in modern C.
Not really. rxv64 has a very specific history: we were working
on a new (type-1) hypervisor in Rust, and bringing new engineers
who were (usually) very good C and C++ programmers onto the
project. While generally experienced, these folks had very
little experience with kernel-level programming, and almost none
in Rust. For the OS bits, we were pointing them at MIT's course
materials for the 6.828 course, but those were in C and for
32-bit x86, so I rewrote it in Rust for x86_64.
In doing so, I took care to stay as close as I reasonably could
to the original. Obviously, some things are different (most
system calls are implemented as methods on the `Proc` type, for
example, and error handling is generally more robust; there are
some instances where the C code will panic in response to user
action because it's awkward to return an error to the calling
process, but I can bubble those back up through the kernel and
into user space using the `Result` type), but the structure is
largely the same; my only real concession to structural change in
an effort to embrace modernity was the pseudo-slab allocator for
pipe objects. Indeed, there are some things where I think the
rewrite is _less_ elegant than the original (the doubly-linked
list for the double-ended queue of free buffers in the block
caching layer, for instance: this was a beautiful little idea in
early Unix, but its expression in Rust -- simulated using
indices into the fixed-size buffer cache -- is awkward).
I can certainly see that some things are more inconvenient to write in
C, even in modern C standards and styles - Rust's Result types are
much nicer to use than a manual struct return in C. (C++ has
std::expected<>, which is very similar to Result, though Result has
some convenient shortcuts and is much more pervasive in Rust.)
Yeah. I want to like `std::expected`, but its use is awkward
without the syntactic sugar of the `?` operator. There's the old
saw about syntactic sugar and cancer of the semicolon, but in
this case, I think they got it mostly right.
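A tiny sketch of what that sugar buys (hypothetical function names, mine): each `?` either unwraps the `Ok` value or returns the `Err` early, which is exactly the boilerplate one otherwise writes by hand around `std::expected` at every call site.

```rust
// `?` propagates the error from each fallible call; the happy path
// reads straight through.
fn parse_pair(a: &str, b: &str) -> Result<(i32, i32), std::num::ParseIntError> {
    let x = a.parse::<i32>()?; // early-returns the parse error, if any
    let y = b.parse::<i32>()?;
    Ok((x, y))
}

fn main() {
    assert_eq!(parse_pair("3", "4"), Ok((3, 4)));
    assert!(parse_pair("3", "oops").is_err());
}
```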
The Rust port did expose a few bugs in the original, which I
fixed and contributed back to MIT. And while it's true that the
xv6 code was initially written in the mid 00's, it is still very
much used and maintained (though MIT has moved on to a variant
that targets RISC-V and sunsetted the x86 code). Also, xv6 has
formed the basis for several research projects, and provided a
research platform that has resulted in more than one
dissertation. To say that it is not representative of modern C
does not seem accurate; it was explicitly written as a modern
replacement for 6th Edition Unix, after all. And if it is not
considered modern, then what is?
Modern C would, IMHO, imply C99 at the very least. It would be using
C99 types, and it would be declaring local variables only when they
can be properly initialised - and these would often be "const". It
would be using enumerated types liberally, not a bunch of #define
constants. It would not be declaring "extern" symbols in C files and
other scattered repeated declarations - rather, it would be declaring
exported identifiers once in an appropriately named header and
importing these when needed, so that you are sure you have a single
consistent definition, checked by the compiler. It would not be using
"int" as a boolean type, returning 0 or -1, or when an error
enumeration type would be appropriate. It would be organised into
directories, rather than mixing the OS code, build utilities,
user-land programs, and other files in one directory. Header files -
at a minimum, those that might be used by other code - would have
guards and include any other headers needed. Files would have at
least a small amount of documentation or information. All functions
would be either static to their files, or declared "extern" in the
appropriate header.
Much of this is, as you say, stylistic. Enumerations for
constants might be an improvement, I suppose, but in C they
confer few benefits beyond `#define`; they're basically `int`s
up through C18, and still default to `int` unless defined with a
fixed underlying type in C23.
That said, I mostly agree with you; certainly, using
appropriately sized types everywhere would already be a
considerable improvement.
And for code like this that is already compiler-specific, I would also
want to see use of compiler extensions for additional static error
checking - at least, if it is reasonable to suppose a compiler that is
modern enough for that.
This I must push back on. This code is not (or, rather, should
not be) _compiler specific_ so much as _ABI specific_.
There seem to be a handful of specific compiler attributes used
here: mostly in marking bits of code `__attribute__((noreturn))`
(though we have a standard way to write that now) and aligned
(we also have a standard way to do that now, too). There's a
small amount of inline assembly, which _could_ just be in a `.S`
file, and a couple of calls to `__sync_synchronize()` intrinsics,
which could be replaced with some sort of atomic barrier, I'd
imagine.
But herein lies another problem with this approach: once you
start using a lot of compiler-specific extensions, you're not
exactly programming in C anymore, but some dialect defined by
your compiler and enabled extensions. You limit portability,
and you create a future maintenance burden if you ever need to
switch to a different compiler.
As a counter-example, we did the Harvey OS a few years ago.
This was a "port" of the Plan 9 OS to mostly-ISO standard C11,
replacing the Plan 9 C dialect and Ken Thompson's compilers. It
worked, and we made it compile with several different versions
of several different compilers. This discipline around
portability actually revealed several bugs, which we could find
and fix.
Now, I realise some of these are stylistic choices and not universal.
And some (like excessive use of "int") may be limited by compatibility
with Unix standards from the days of "everything is an int".
None of this means the code is bad in any particular way, and I don't
think it would make a big difference to the "code safety". But these
things do add up in making code easier to follow, and that in turn makes
it harder to make mistakes without noticing.
That's fair, but if we're talking about how the language affects
safety of the resulting program, mostly stylistic changes aren't
really all that helpful. Even using the sized types from e.g.
`stdint.h` only gets you so far: you've still got to deal with
C's abstruse implicit integer promotion rules, UB around signed
integer overflow, and so on. For example, consider:

  uint16_t mul(uint16_t a, uint16_t b) { return a * b; }

Is that free of UB in all cases? (Hint: both operands promote to
`int`, so the multiplication can overflow as a _signed_ operation.)
The equivalent, fully defined Rust is:

  pub fn mul(a: u16, b: u16) -> u16 { a.wrapping_mul(b) }
There's plenty of crazy C code out there from the days of "all
the world's a VAX" that we can pick on; safety wise, xv6 is
actually pretty decent, though.
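To spell out the contrast (a sketch of my own): Rust forces the overflow policy to be chosen explicitly at the call site, rather than having it fall out of C's implicit promotion rules.

```rust
fn main() {
    let a: u16 = 60_000;
    let b: u16 = 3;
    // 60_000 * 3 = 180_000, which exceeds u16::MAX (65_535):
    assert_eq!(a.wrapping_mul(b), 48_928);     // 180_000 mod 2^16
    assert_eq!(a.checked_mul(b), None);        // overflow is detectable
    assert_eq!(a.saturating_mul(b), u16::MAX); // or clamped
}
```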
I hear this argument a lot, but it quickly turns into a "no true
Scotsman" fallacy.
Agreed - that is definitely a risk. And it also risks becoming an
argument about preferred styles regardless of the language.
+1e6
This is less frivolous than many of the other arguments that are
thrown out to just dismiss Rust (or any other technology,
honestly), which often boil down to emotion. But if the
comparison doesn't feel like it's head to head, then propose a
_good_ C code base to compare to Rust.
I don't know of any appropriate C code for such a comparison. I think
you'd be looking for something that can showcase the benefits of Rust
- something that uses a lot of dynamic memory or other allocated
resources, and where you have ugly C-style error handling where
functions return an error code directly and the real return value via
a pointer, and then you have goto's to handle freeing up resources
that may or may not have been allocated successfully.
Ideally (in my mind), you'd also compare that to C++ code that used
smart pointers and/or containers to handle this, along with exceptions
and/or std::expected<> or std::optional<>.
I feel like this has been done now, a few times over, with data
published from some of the usual FAANG suspects, as well as in a
growing number of academic conferences. Then you've got things
like https://ferrocene.dev/en/ as well.
I am totally unconvinced by arguments about Rust being "safer" than C,
or that it is a cure-all for buffer overflows, memory leaks, mixups
about pointer ownership, race conditions, and the like - because I
already write C code with minimal risk of these problems, at the cost
of sometimes ugly and expansive code, or the use of compiler-specific
code such as gcc's "cleanup" attribute. And I know I can write C++
code with even lower risk and significantly greater automation and
convenience.
If Rust lets me write code that is significantly neater than C++ here,
then maybe it is worth considering using it for my work.
First, I think you are totally justified in being skeptical. By
all means, don't take my (or anyone else's) word about the
language and what we express as benefits. Indeed, I was
extremely skeptical when I started looking at Rust seriously,
and frankly, I'm really glad that I was. I started from a
default negative position about the claims, but I did try to
give the thing an honest shot, and I can say with confidence
that it surprised me with just how much it really _can_ deliver.
That said, I am still not an unapologetic fanboy. There are
parts of the language that I think are awkward or ill-designed;
it does not fix _all_ bugs, or seek to. I just think that, for
the application domain that I work in most often (kernel-level
code on bare metal), Rust is the best available language at this
time.
It is not my child, though: if another language that I felt fit
the domain _better_ suddenly showed up, I'd advocate switching.
But the flip side of skepticism is that one has to be open
to evidence that disconfirms one's preconceptions about a thing.
So it's fine to be skeptical, but my suggestion (if you are so
inclined) is to pick a problem of some size, and go through the
exercise of learning enough Rust to write an idiomatic solution
to that problem, then maybe seek out some folks with more
experience in the language for critique. That is, really dig
into a problem with it and see how it feels _then_. I'd
honestly be a bit surprised if you came away from that exercise
with the same level of skepticism.
First, however, the language and tools need to reach some level of
maturity. C++ has a new version of the language and library every 3
years, and that's arguably too fast for a lot of serious development
groups to keep up. Rust, as far as I can see, comes out with new
language features every 6 weeks. That may seem reasonable to people
used to patching their Windows systems every week, but not for people
who expect their code to run for years without pause.
That is not accurate.
A new "edition" of the language is published about once every
three years (we're currently on Rust 2024, for instance; the one
before that was 2021, then 2018 and 2015). A new stable version
of the compiler comes out every six weeks, with a beta branched
at the same cadence (I don't really follow the beta series, so I
may be wrong about the timeline there), and a nightly compiler
approximately every day.
Each beta/stable compiler usually promotes an interface that
_was_ experimental to being "stable", but typically that's
something in a library; you can only use the experimental
("unstable") interfaces from a nightly compiler.
Editions give you a means for long-term support. We have code
that uses the 2021 and 2018 editions, and the language hasn't
changed so much that it breaks.
The tooling is very good, and I would argue superior to C++'s in
many ways (error messages, especially). There is robust and
wide support in debuggers, profilers, various inspection and
instrumentation tools, etc; part of that is due to the wise
decision to (mostly) use C++-compatible name mangling for
symbols, and build on the LLVM infrastructure.
There are a number of very good references for the language; the
official book is available online, gratis:
https://doc.rust-lang.org/book/
(Disclaimer: Steve Klabnik is one of my colleagues.)
I'd say, give it a whirl. You may feel the same, but you may
also be pleasantly (or perhaps unpleasantly!) surprised. In any
event, I'd like to know how it turns out.
- Dan C
--- Synchronet 3.21a-Linux NewsLink 1.2