Now I have four services running in production, all written in Rust. If it compiles, it usually works. Of course you have these late-night sessions where you write that one unwrap() because, hey, this will never return an error, right? And bam...
I'm seriously waiting for that tokio train to become stable, and for a unified way of writing async services without needing tricks with channels or lots of ugly callback code. Also, native TLS support is coming, and the dependency hell with openssl would be gone forever.
If you need an HTTP server/client, I'd wait a moment for Hyper to get their tokio branch stable, and maybe to gain HTTP/2 support by migrating the Solicit library.
Asynchronous nonblocking interfaces are more general-purpose than synchronous blocking interfaces. I can't speak for this library or Rust specifically, but in my experience well-designed asynchronous libraries allow you to interact with them in a synchronous style as well, if you wish.
Netty is an asynchronous, event-driven network framework for Java, and it's perfectly possible to expose synchronous blocking abstractions on top of it. The mechanism is pretty simple: the asynchronous framework exposes a future representing the result of an operation, and to provide a synchronous interface you simply block on the completion of that future before returning. Client libraries can handle this for you, providing the interface of e.g. a regular blocking HTTP client on top of Netty async IO.
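To make the "block on the future" mechanism concrete, here is a minimal Rust sketch using only the standard library. It is not Netty's or hyper's API: `fetch_async` and `fetch_blocking` are hypothetical names, the channel receiver is just a stand-in for a future, and a spawned thread pretends to be nonblocking IO completing later.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical async-style API: starts the work and returns immediately
/// with a handle (a stand-in for a future) that will later hold the result.
fn fetch_async(url: &str) -> Receiver<String> {
    let (tx, rx) = mpsc::channel();
    let url = url.to_string();
    thread::spawn(move || {
        // Pretend this is network IO that completes some time later.
        let body = format!("response from {}", url);
        // The caller may have dropped the handle; ignore that case.
        let _ = tx.send(body);
    });
    rx
}

/// Synchronous wrapper: block on the handle until the result arrives.
fn fetch_blocking(url: &str) -> String {
    fetch_async(url)
        .recv()
        .expect("worker dropped the sender without sending a result")
}

fn main() {
    let body = fetch_blocking("http://example.com/");
    println!("{}", body);
}
```

The blocking wrapper is a few lines on top of the async primitive, which is the asymmetry being argued for here.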
This approach can be convenient, since it's possible for both synchronous and asynchronous style code to coexist easily in the same application. The application designer can incrementally change parts of the application into asynchronous style as performance needs dictate. For example, you might choose to serve typical small RPC requests using blocking workers in a thread pool, but when you need to stream the content of a large file across the network you could use a separate nonblocking worker pool that interacts with both the file system and network asynchronously.
The ability to interact in a blocking way via futures means that asynchronous facilities can serve both synchronous and asynchronous needs, making them the better choice for most frameworks today. While it used to be the case that async IO frameworks took a performance penalty compared to well-implemented sync IO ones, from what I understand that gap has been closed, and the highest performance frameworks are now all async IO. For example, check out the TechEmpower Web Framework Benchmarks. Most or all of the top performers use asynchronous approaches: https://www.techempower.com/benchmarks/#section=data-r13&hw=...
One is not more powerful than the other: they are considered "duals" and this has been proven in the literature (back in 1978, no less); most of the supposed downsides of threads are due to people assuming a specific implementation of threads (many if not most of which suck).
Here are some papers that would normally be assigned reading in a graduate level Computer Science course in Operating Systems as background reference.
But like, it should be obvious: with a lightweight co-routine library you can convert anything that is synchronous into something that is asynchronous, with no more overhead (if not less) than the context switches you are forced to incur by returning and calling a new function to implement event processing. This is no more onerous than using that same co-routine library to implement blocking on a future (to convert an asynchronous API into a synchronous one).
The fact that two styles or concepts are formally dual does not make them equally practical or useful in all circumstances.
Consider: in calling conventions, the continuation passing style is dual to the "direct" calling convention (i.e., call stack with return values); the call-by-name style is dual to call-by-value style; Lambda Calculus and Turing Machines are dual in their ability to compute all effectively calculable functions.
These dualities do not mean it's equally practical to build systems in both ways. Sometimes one approach ends up being more practically useful.
Most programmers prefer to use the direct calling convention, and find complex continuation passing style to be difficult to read and maintain. JavaScript programmers may be familiar with the pain of CPS due to excessive use of callbacks (not strictly CPS but has similar drawbacks). Similarly, writing code purely in call-by-name style can be confusing and have difficult to predict performance impacts (e.g., Haskell lazy evaluation semantics).
In their article "On the Duality of Operating System Structures", Lauer and Needham present a similar conclusion [3]:
> "The principal conclusion we will draw from these observations is that the considerations for choosing which model to adopt in a given system [...] [are] a function of which set of primitive operations and mechanisms are easier to build or better suited to the constraints imposed by the machine architecture and hardware."
In that passage they are describing message passing vs. procedure call systems, and I interpret this to be their acknowledgment that, though the systems are dual, one architecture or another is more appropriate in certain circumstances.
Getting back to our original topic: this thread was about the decision of a Rust library to offer async or sync IO as its choice of primary primitive. I think async is the better general-purpose choice, because it's clean, simple, and straightforward to expose a synchronous interface on top of an async interface with futures; and the other way around is messy and difficult.
Can you elaborate on the lightweight co-routine library that can be used to convert anything synchronous into async? I'm curious about that, because Rust previously had support for coroutines (green threads), and decided to remove them due to a number of problems [1]. Meanwhile, Rust developers were able to devise a zero-cost futures abstraction on top of asynchronous IO [2]. Unlike the problematic green threads strategy, this approach doesn't impose any complicated constraints on the systems that use it (FFI requirements), and doesn't add runtime overhead.
What co-routine library would you recommend that avoids the downsides in [1]?
Yes, I know, async I/O is the new cool thing. Here's an async I/O program from 1972.[1] John Walker wrote this. EXEC 8 had the IO$ system call, which, unlike IOW$, returned immediately. A "completion routine" was called when the I/O operation finished. Note how similar those libraries are to what's used today, now that people are reading Dijkstra again. The problem, of course, is that a callback system dominates the architecture of the entire program.
(When I moved from UNIVAC mainframes to UNIX, things seemed so sequential. No threads. No async I/O.)
Huh, so are you implying that hyper should stay synchronous so that it doesn't appear to just be copying things from 40 years ago?! This comment sounds like you think that it was good back then, but now you don't know what to think of a library that is aiming to switch to asynchronous IO and/or don't know why it's a good thing?
(It's also not like the comment you're replying to said that async IO is a recent invention, your low-effort sarcasm as a response is unfortunate.)
Where pretty much anything related to concurrency is concerned we've been busy reliving the 70's for most of the last decade. Locking, asynchronous I/O, you name it.
Hell even Microsoft had I/O Completion ports back in what, 2003 or so? Or am I wrong and it was a lot earlier? The coolest things in Javascript land were all done by Microsoft first and everyone (me especially) can't bring themselves to acknowledge that.
Isn't the NT async IO API just a front for a kernel side thread pool though, and may block depending on worker thread availability? They say[1] "if you issue an asynchronous cached read, and the pages are not in memory, the file system driver assumes that you do not want your thread blocked and the request will be handled by a limited pool of worker threads"
Things may be different for socket IO, but there Unix had select() much earlier, around 4.2BSD (1983)
Web servers and GUI apps are both long-running, event driven programs that need to do IO or slow computations while staying responsive to new events. It's not surprising that they are both well supported by the same programming model. The Win32 API is ugly but the methods of app/OS interaction it supports are fundamentally sound for high performance interactive programs.
I assume you're talking about why reqwest is necessary (i.e. why hyper is moving to async), rather than the single paragraph mentioning asynchronicity as a possible future direction for reqwest?
Async I/O is important for more than just multiplexing a million requests. Hyper is moving to it for that purpose, and also because it is a more natural way for working with in-flight I/O, e.g. cancelling requests or selecting over them.
Simply put, IO as accessed by the OS is asynchronous by nature. The synchronicity you are used to is a nice artifact, a lie the OS tells you to make simple programming easier. Underneath, the OS is doing everything asynchronously, and just waiting until completion to return; otherwise we would be throwing away massive quantities of compute cycles waiting for IO to complete.
Why let the OS and other programs running at the same time reap all that benefit? You too can program such that, while you would normally be waiting for IO and some other program was utilizing the CPU, your own cycles can be used and you can be accomplishing so much more.
I don't think everything has to be async, but if there's ever a place to use it then it's HTTP. There are too many systems that rely on an external web service and collapse in a ball of threads if that external web service ever gets a bit slow.
> I'm seriously waiting that tokio train to be stable and a unified way of writing async services without needing to use some tricks with the channels or writing lots of ugly callback code
Isn't if let similar to Swift's if let where it either unwraps safely or does something else? I really wish I could disable forced unwrapping as it mostly leads to mistakes by less experienced or overconfident programmers and the amount of extra code by using guard instead if you program smart is negligible.
Though Swift's `if let` is AFAIK hardcoded to their built-in optional type, whereas when Rust lifted the idea they made it work with any enum. I believe Swift recently gained `if case let` as an equivalent to how `if let` works in Rust.
It was new in Swift 2 I think. Didn't use it yet. But it's basically a single case statement lifted from switch, nothing more. But enums and switch statements can be really complex beasts by themselves in Swift.
If Rust only allows it on enums that would be extremely weird.
In a switch statement I sometimes want to switch on the object, then see if I'm allowed to unwrap it to a certain class and immediately use it afterwards. Useful for parsing an array of mixed object types that should be processed differently.
But I honestly think Swift allows people to write elaborate illegible codegolf-y bullshit sometimes. Sometimes the Swift compiler still chokes on too complex expressions and you need to add some explicit types or separate out in several statements what you were trying to do in one statement. Usually a good warning that your code is hard to read even by humans.
If-let can be used with refutable patterns (the refutation/counterexample to the pattern is when the else case runs). Irrefutable patterns can be used with regular let.
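A small sketch of the distinction, using a made-up `Message` enum (not from any library) to show that `if let` works on user-defined enums and not only on `Option`:

```rust
/// Hypothetical enum, for illustration only.
enum Message {
    Text(String),
    Quit,
}

fn describe(msg: &Message) -> String {
    // Refutable pattern: `msg` might not be `Text`, so `if let` is required
    // and the `else` branch runs when the pattern is refuted.
    if let Message::Text(body) = msg {
        format!("text: {}", body)
    } else {
        String::from("something else")
    }
}

fn main() {
    // Irrefutable pattern: a tuple always matches, so a plain `let` is enough.
    let (first, second) = (Message::Text(String::from("hi")), Message::Quit);
    println!("{}", describe(&first));
    println!("{}", describe(&second));
}
```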
"Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?"
Unwrap implies "either this operation should succeed or we should panic". Doing anything other than panic seems like it's terribly difficult to determine what that should be. And I might've put unwrap() there because that's really what I want to happen. For `mv`: don't let's dare go ahead and do the unlink() if the link() failed!
Tangent: errors and exceptions aren't bad, they're how computers work. I've encountered folks who, when faced with runtime errors pepper their code with "if (!NULL)" or truly evil things like "except:pass"/"catch (...) { }" which rarely make sense anywhere but the base of the stack and even then don't usually. If you've ever asked yourself "but how did we even get here‽" it may be because someone dropped something totally incongruous like that in a related module.
> "Pray, Mr. Babbage ..." (for those who do not know the quote)
“On two occasions I have been asked, ‘Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?’ . . . I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.”
I think the idea would be the program would fail to compile, and you would need to go back and replace the unwrap with proper error handling. The goal would be to allow the use of unwrap during development, but require the final polish before the code goes into production.
unwrap is proper error handling. It says "Try to do this. If it fails, panic". It's like an assert on an invariant that the compiler requires.
If my script depends on a database connection, I might connect to a database and unwrap() it so the script errors out if the database isn't available. If I wrote that logic myself I would just be awkwardly rewriting unwrap.
> If my script depends on a database connection, I might connect to a database and unwrap() it so the script errors out if the database isn't available.
I think that's a bad example, as it is one of those things that can really fail at runtime and which should be properly handled. Even if handling means printing an error message and stopping the process with an exit code - but not crashing.
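Roughly what I mean, as a sketch. `Connection` and `connect` are hypothetical stand-ins here, not a real database client; the point is only the shape of the error path compared with `connect(..).unwrap()`.

```rust
use std::process;

/// Hypothetical stand-in for a real database client.
struct Connection;

/// Hypothetical connect function: succeeds only for one URL scheme.
fn connect(url: &str) -> Result<Connection, String> {
    if url.starts_with("postgres://") {
        Ok(Connection)
    } else {
        Err(format!("unsupported database URL: {}", url))
    }
}

fn main() {
    // Instead of `connect(..).unwrap()`: report the failure in plain words
    // and stop with a non-zero exit code, without a panic.
    let _conn = match connect("postgres://localhost/app") {
        Ok(conn) => conn,
        Err(e) => {
            eprintln!("error: could not connect to database: {}", e);
            process::exit(1);
        }
    };
    println!("connected");
}
```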
I think unwrap is for things that really should not happen if everything is implemented correctly.
That's fair. I said script for that reason. A better example would be assert equivalents - if an API could return null (Option) under normal circumstances but you know it can never be null based on how you're using it, unwrap() makes sense. That contract could only be violated if there's a bug in the implementation. If that's the case all guarantees are out the window and usually the best / only thing you can do is to crash and allow the process to restart.
Also in those cases (in my experience) having a human-readable error message is rarely useful. When assertions are violated I almost always have to consult the code anyway. And 80% of my asserts are never hit. I usually don't bother preemptively writing decent error messages. File name and line number is the right information, and panic provides that anyway.
> If that's the case all guarantees are out the window and usually the best / only thing you can do is to crash and allow the process to restart.
I've found that there are a lot of minor bugs in the implementation that, in something client-facing (e.g. not on a server somewhere that can simply be taken out of the load balancing rotation until it restarts or whatever) probably shouldn't crash.
Report and log errors remotely - to be fixed - and skip some logic that relied on those guarantees, but not crash.
> Also in those cases (in my experience) having a human-readable error message is rarely useful.
I'll settle for developer-readable, then ;). Panic can format error messages, and itself provides context information (the file and line you mention) as a decent means of reporting fatal errors. Some assertions are obvious enough as to their reason and cause from context - as you say, they don't need a message.
But I've also found taking the 10 seconds or so to think of a decentish error message pays off quite frequently. Even if I'm pretty sure it's unnecessary. Sometimes it may save me only a minute of context switching by telling me exactly what the problem was (instead of roughly describing some assumption made for unknown reasons), sometimes the only way I can make progress is by adding more logging and messaging and reproducing the problem because I couldn't suss out exactly what was happening - and deciding this could take a lot longer than a minute if I know it's hard to reproduce.
To be more accurate, unwrap can be a legitimate means of error handling when used in an application, as opposed to a library. But if you're writing a library, then unwrapping rather than using Result is a surefire way to make your users hate you. :)
Ah! I stand corrected. "library" is a pretty sane use case for barring unwrap(), one that cargo knows is the current goal. I still don't think it's general enough but maybe it's worth a warning.
Aside: CPython's gdbm support is provided by libgdbm, which calls exit() for you if it finds something it's not happy with (corrupted database, e.g.). O.o
Sometimes you've proven some invariant in some other way, so you know that unwrapping is guaranteed to not panic. Although in those cases I prefer to use .expect("this will not fail because of blah") instead in the spirit of self-documenting code.
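For example, something like this sketch, where the invariant is that a hard-coded literal always parses (`loopback` is just an illustrative helper name):

```rust
use std::net::IpAddr;

/// Illustrative helper: the input is a compile-time constant, so parsing
/// cannot fail unless someone edits the literal incorrectly.
fn loopback() -> IpAddr {
    "127.0.0.1"
        .parse()
        .expect("hard-coded IPv4 literal is always valid")
}

fn main() {
    println!("{}", loopback());
}
```

If the message ever does show up in a panic, it tells you which assumption was wrong, and until then it documents the assumption for readers.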
Early exploration and tests amount to a lot of code, and the language needs to make those parts pleasant to write as well. I think .unwrap() calls are especially common there.
I imagine `println!()` is another thing whose design is influenced by meeting the needs of early exploration implementations (it's also another library function that handles errors by panicking, though that's not the only thing that makes me think so).
Yes, I would envision an allow attribute of some sort.
This is simply a contract with the compiler that you didn't copy and paste some example code somewhere in your codebase that fails with an unwrap. It's about extending the "if it compiles it runs" near-guarantee that we so love about Rust.
Ideally a program would fail fast and be restarted if it reached an unrecoverable state, with supervision trees like Erlang. Also ideally, unwrap would be used for exceptional states, not only ones that are unlikely to fail until something goes wrong, like a port being closed or a file unreadable or not present.
It doesn't work transitively (so if a crate you depend on unwraps you can't protect yourself), but https://github.com/llogiq/metacollect plans to fix that
To put this into perspective, this would also necessarily disable expressions of the form `xs[i]` where `xs` is a slice. Why? Because `xs[i]` is equivalent to `*xs.get(i).unwrap()`.
In other words, banning unwrap isn't really that productive because an unwrap, when properly used, is an expression of a runtime invariant.
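Here is a small sketch of the indexing equivalence described above. The two helper functions are illustrative names only; both return the same value in bounds and both panic out of bounds (with different messages).

```rust
/// Indexing syntax: panics if `i` is out of bounds.
fn by_index(xs: &[i32], i: usize) -> i32 {
    xs[i]
}

/// The spelled-out form: `get` returns an Option, `unwrap` panics on None.
fn by_get_unwrap(xs: &[i32], i: usize) -> i32 {
    *xs.get(i).unwrap()
}

fn main() {
    let xs = [10, 20, 30];
    assert_eq!(by_index(&xs, 1), by_get_unwrap(&xs, 1));
    // Out of bounds, `get` makes the failure case visible as None.
    assert!(xs.get(3).is_none());
    println!("xs[1] = {}", by_index(&xs, 1));
}
```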
The problem is that unwrap can be very easily misused as an error handling strategy in a library, and in that case, it's pretty much always wrong. But that doesn't mean using unwrap in a library is wrong all on its own, for example.
Sometimes I do wish I could disable the indexing syntax, though. :P At least in my own code, I find that I naturally reach for iterators rather than doing any manual indexing.
unwrap has legitimate use-cases, and it's not clear what it would mean to "disable" it, since it changes the type of the thing it returns. You could write a lint to fail the build, if you want, I guess...
.unwrap() is only the right choice if you need to optimize for binary size and can't afford the cost of the precise error message you would pass to .expect(). There are situations where you can't possibly continue running the application if an error occurs, but you shouldn't rely on a backtrace (which you might not manage to capture, e.g. if RUST_BACKTRACE is unset or you don't have symbols) as your only method of communication with your future self.
This is not true. For example, consider this code:
    if foo.is_some() {
        let foo = foo.unwrap();
    } else {
        // other code
    }
Here, I _know_ that foo is some. The extra error message from expect will _never_ be seen.
Now, this is a contrived example, and would better be written with `if let` in today's Rust, but this is the _kind_ of situation in which unwrap is totally, 100% cool, but the compiler can't know.
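For comparison, a sketch of the `if let` version. The function wrapper, the `u32` payload and the returned strings are made up so the snippet compiles on its own:

```rust
fn describe(foo: Option<u32>) -> String {
    // `if let` checks for Some and binds the inner value in one step,
    // so there is no separate `is_some()` + `unwrap()` pair.
    if let Some(value) = foo {
        format!("got {}", value)
    } else {
        // other code
        String::from("nothing")
    }
}

fn main() {
    println!("{}", describe(Some(3)));
    println!("{}", describe(None));
}
```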
I write Swift daily and I just don't force unwrap anymore, ever. I don't think a hard crash is very usable in a production application. A lot of people disagree and want a hard crash while testing, but for the bugs that slip through, the user experience of, for example, "loading the first screen but my avatar isn't set" is so much better than "loading the first screen and the app kills itself" just because you force unwrapped the URL of the avatar from a JSON response that had a slight problem in production.
Well perhaps we should have something that logs the error in production but keeps on trucking and crashes the application when the debug or test flag is set?
panics are explicitly for unrecoverable errors, so recovering from them and keeping on going means that you're not using the right kind of error handling. If that's the behavior you'd want, then you wouldn't want to use unwrap.
The error message in this case might be something like "foo became None after verifying it to be Some". This could happen, for example, if incorrect unsafe code in another thread concurrently mutates foo through a raw pointer. My point is that of course while writing them you don't think your unwraps will fail, but if they do, it's good to have a reminder of what's going on. Even if the expect never fails, the message provides additional documentation for those reading the code.
You'll like Rust, because concurrent mutation of a value is impossible if you hold a `&` or `&mut` to the value. So this can in fact be ruled out by the programmer.
It's only impossible in safe code. Unsafe code can violate those rules all day long. You can't guarantee that there's no unsafe code running concurrently.
Any use of `unsafe` that breaks unrelated safe code is broken and buggy; if that scenario would happen like you describe it, the code is breaking Rust's aliasing rules: that's possible using `unsafe` but invalid and leads to UB.
I'm not talking about 'uses of unsafe', I'm talking about code that is unsafe. Much of that code is not even written in Rust, so there's no 'unsafe' to use.
Ok, so code that is memory unsafe (broken!). One must still say "unsafe" to bring it into Rust (to use ffi, or make a safe wrapper); so there is still a clear location in the Rust code that is to blame.
> Now, this is a contrived example, and would better be written with `if let` in today's Rust, but this is the _kind_ of situation in which unwrap is totally, 100% cool, but the compiler can't know.
If the author knows it's safe, they should be able to express how they know in a way that the compiler can understand. Certainly I think there's a large space of use cases where the extra guarantee provided by forbidding unwrap would be well worth the cost of outlawing some "legitimate" cases, especially if we're just talking about doing so on a project level. (Though maybe they're not the rust target audience).
What if I want to have a reference to the last element in a vector that I just pushed? Without a push method that immediately returns such a reference, this will always involve an unwrap. And this is not just some weirdly constructed example, I've needed to do this in real code a couple of times already.
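Concretely, the pattern looks something like this sketch (`push_and_get` is a hypothetical helper, not a method on `Vec`):

```rust
/// Hypothetical helper: push an item and hand back a reference to it.
fn push_and_get<T>(v: &mut Vec<T>, item: T) -> &mut T {
    v.push(item);
    // The vector is non-empty because of the push above, but the return
    // type of `last_mut` cannot express that, so it is still an Option.
    v.last_mut().expect("vector is non-empty right after push")
}

fn main() {
    let mut names = vec![String::from("a")];
    let last = push_and_get(&mut names, String::from("b"));
    last.push('!');
    println!("{:?}", names);
}
```

The Option can never be None here, yet there is no way to tell the compiler so without some form of unwrap.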
Pushing an element to a vector could return a guaranteed-non-empty vector. Admittedly it's unpleasant to write non-empty collections in a language that lacks HKT, since you have to reimplement a lot of stuff, but I'd consider that an argument for HKT rather than an argument for unwrap.
If you try to reply on HN too quickly, it hides the reply button so as to discourage quick back-and-forths.
I mention in the post that this specific code would be best written with if let, but that it's not about the specifics, it's about the general pattern.
This is pedantic. It seems clear from context that this conversation is about panicking on the None/Err cases, and unwrap is a shorthand for "unwrap or expect or match with a panic branch."
Right, it would likely use clippy but it would essentially be a --production target or profile, intended for builds where the binary will be run in production.
And by 'and friends' I mean calls that panic for the same reason as unwrap, such as expect or ok.
The goal is to stop code from reaching production inadvertently, not to prevent all sources of panics.
But if we accept the premise that "it's acceptable in some cases to have unwrap() in code that targets 'production'" then it wouldn't make sense to have a production profile that bars its use. The word "production" is in the global namespace and I think you want something more specific to your use case.
Rather, one could define a rust coding guide for themselves that deems unwrap() inappropriate for production use. (and in that case use the lint Steve suggests).
Seems like you'd want a lint rule where if unwrap is used, it must have a comment preceding it (of some formal syntax) describing why it's necessary or appropriate. Thus the build can have all uses of unwrap known as explicitly allowed, allowing all the possible sites of panics to be enumerated and known, a useful property to have