Early Impressions of Rust

If you have any interest in programming languages, you’ve probably heard of Rust by now. I had been meaning to try it out for quite some time, since the claims that the borrow checker can make whole classes of memory errors impossible with zero runtime cost are very intriguing. Unfortunately, each time I got far enough to examine some sample code, I would immediately recoil in horror at the syntax and put it away again.

A few months ago, I finally decided to push through my negative syntax impressions and give it a real try. Now that I’ve spent a couple of months with Rust and written a few thousand lines of code, there’s a lot to like. The borrow checker and ownership tracking are indeed powerful. There are many high-quality libraries available, and the pervasive use of things like Option and Result makes error handling very clean. In many ways, Rust feels like a more practical version of Haskell: You get (most of) the most useful monads and (some of) the most popular algebraic data types, but without the pain of rewriting your algorithms into fully functional versions.

But oh, that syntax! It has so many strange warts that make the language harder to learn and harder to read. Let’s look at a few examples.

First, many of the keywords and common data structures are abbreviations, such as fn, mut, pub, dyn, impl, mod, Vec, &str. Not only does your brain have to fill back in the full word from each abbreviation, but at least some of them are expected to be pronounced as though all the missing letters were present. For example, the Vec documentation says “written as Vec<T> and pronounced ‘vector’.” What’s the story here? Were the designers being charged by the keystroke? Just spell them out and let people’s editors do completion if it’s so horrifying to type a few extra characters (or should I say chars?). Even worse, some of the important keywords and operators mix these abbreviations with symbols, such as using &mut to mean a mutable reference.

Next up: Poorly-named common data types. For example, if I write let a = [1, 2, 3], I would normally expect to get a vector initially containing 1, 2, and 3. Not in Rust! That syntax makes a into a fixed-size array containing exactly those three elements. If I want a to be able to grow after initialization, I have to write something like let mut a = vec![1, 2, 3] (which presumably generates some code to create a correctly-sized Vec and then copy the static array elements into it). Why would you reserve the shortest syntax for the less-useful fixed array?
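To make the difference concrete, here is a minimal sketch. The function names are mine and purely illustrative. Whether a binding can be modified at all is controlled by mut; what the array literal fixes is the length, which is part of the type.

```rust
// Modifies one element of a fixed-size array in place and returns it.
fn modify_array() -> [i32; 3] {
    let mut a = [1, 2, 3]; // type is [i32; 3]; the length is part of the type
    a[0] = 10;             // allowed because of `mut`; the length still cannot change
    a
}

// Builds a Vec from the same literal and grows it past three elements.
fn grow_vec() -> Vec<i32> {
    let mut v = vec![1, 2, 3]; // heap-allocated and growable
    v.push(4);                 // arrays have no `push`
    v
}

fn main() {
    println!("{:?}", modify_array());
    println!("{:?}", grow_vec());
}
```

So the array literal is not inherently read-only, but it can never hold a fourth element.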

Similarly, String represents an owned string, &String represents a reference to a string, and &str represents a read-only string slice. A slice of a vector (oops, Vec<T>) or an array is spelled &[T], so why is a slice of a String spelled &str? Like the previous point, this smacks of some kind of keystroke parsimony.
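Here is a small sketch of how these types line up in practice. The helper functions byte_len and sum are hypothetical names I made up for the comparison; the point is only that &str plays the same role for String that &[T] plays for Vec<T> and arrays.

```rust
// Accepts any string slice; a &String is coerced to &str automatically.
fn byte_len(s: &str) -> usize {
    s.len()
}

// Accepts any slice of i32, whether it comes from a Vec or an array.
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    let owned: String = String::from("hello world"); // owned, growable string
    let by_ref: &String = &owned;                    // reference to the whole String
    let slice: &str = &owned[..5];                   // borrowed view of "hello"

    println!("{}", byte_len(by_ref));
    println!("{}", byte_len(slice));

    let v = vec![1, 2, 3, 4];
    let arr = [5, 6];
    println!("{}", sum(&v[1..3])); // slice of a Vec
    println!("{}", sum(&arr));     // whole array borrowed as a slice
}
```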

Why does calling a macro require !? One explanation I heard is that it’s a marker to indicate that extra code is going to be generated, but that doesn’t make sense. Calling a function runs an arbitrary amount of code, so why would I need extra scrutiny for the code generated by a macro? Don’t get me wrong: Rust macros are way better than C preprocessor macros. But from an ergonomics point of view, why would I want to write println!() instead of println()? Why would I want to have to remember that it’s a macro instead of a regular function call?

Next, Rust is expression oriented, but it still has statements. The only difference between an expression and a statement is the presence of a trailing semicolon. When you combine this with using the last expression in a block as the result of the block, this makes a mess of moving code around. Like the := operator in Go, it’s very easy to break a working sequence of code by inadvertently introducing or removing a statement where an expression is needed, or vice versa. Also like :=, the compiler can nearly always tell you exactly what you did wrong and where to add (or remove) the semicolon. So why have this distinction at all? Why not make everything an expression?
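A short sketch of the semicolon rule, with made-up function names. The only thing that separates the value-producing blocks from the unit-producing one is the final semicolon.

```rust
// The last expression has no trailing semicolon, so it is the function's value.
fn increment(x: i32) -> i32 {
    x + 1
}

// `if` is an expression too, so each branch ends without a semicolon.
fn sign(n: i32) -> &'static str {
    if n < 0 { "negative" } else { "non-negative" }
}

fn main() {
    // A block used as an expression: its value is `t * 3`.
    let six = {
        let t = 2;
        t * 3
    };

    // Every line ends in a semicolon, so the block evaluates to the unit type `()`.
    let unit: () = {
        let t = 2;
        let _discarded = t * 3;
    };

    println!("{} {} {} {:?}", increment(1), sign(-4), six, unit);
}
```

If I add a semicolon after x + 1 in increment, the function body evaluates to () and the compiler rejects it with a mismatched-types error, along with a hint to remove the semicolon.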

There are a bunch of shorthands that save a few keystrokes at the expense of making the behavior less obvious for the reader. A couple of examples:

Using [3; 5] to mean [3, 3, 3, 3, 3]. I definitely see the use of reusing an initializer expression without repeating it, but who would ever guess what this means from reading it? Since when does a semicolon imply repetition? Python uses [3] * 5 for approximately the same thing and Perl uses (3) x 5. Perhaps one of those could have been reused, or even something like [5 of 3].
Allowing struct { field } to mean struct { field: field }. This gets extra confusing if you mix local variables with explicit assignments; it’s really easy to misread it as an ordered list of field assignments instead of the implicit mapping. It also makes refactoring harder because renaming a variable or a struct member might unexpectedly require you to introduce the explicit field label.
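Both shorthands in one small sketch. The Point struct and the function names are invented for illustration. The mixed function shows the case I find easy to misread: the fields are matched by name, not by position.

```rust
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// `[value; count]` repeats the value: this is [3, 3, 3, 3, 3].
fn repeated() -> [i32; 5] {
    [3; 5]
}

// Field init shorthand: `Point { x, y }` means `Point { x: x, y: y }`.
fn shorthand(x: i32, y: i32) -> Point {
    Point { x, y }
}

// Mixing an explicit field with the shorthand. Order in the literal is
// irrelevant; `x` is still matched to the field named `x`.
fn mixed(x: i32, height: i32) -> Point {
    Point { y: height, x }
}

fn main() {
    println!("{:?}", repeated());
    println!("{:?}", shorthand(1, 2));
    println!("{:?}", mixed(1, 2));
}
```

If the parameter x in mixed were renamed to width, the literal would have to become Point { y: height, x: width }, which is the refactoring friction mentioned above.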

Rust also reuses the same syntax to mean multiple things. Each of the features where this happens is separately useful, but why should they be spelled the same when they’re completely unrelated? For example, consider ..:

buf[..len] is a slice of the first len elements of buf.
struct { ..rhs } creates a struct with all the otherwise unset fields set from rhs.
( x, .. ) inside a pattern match deconstructs the matched object to extract x and ignore the rest of the fields.
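The three meanings side by side, as a minimal sketch. Config and the function names are hypothetical; I also included the struct-pattern form of the third case, since it uses the same spelling.

```rust
#[derive(Debug, PartialEq)]
struct Config {
    width: u32,
    height: u32,
    depth: u32,
}

// 1. Range syntax: a slice of the first `len` elements of `buf`.
fn first_n(buf: &[u8], len: usize) -> &[u8] {
    &buf[..len]
}

// 2. Struct update syntax: every field not listed is taken from `base`.
fn with_width(base: Config, width: u32) -> Config {
    Config { width, ..base }
}

// 3a. Rest pattern in a tuple: bind the first element, ignore the others.
fn first_of(t: (i32, i32, i32)) -> i32 {
    let (x, ..) = t;
    x
}

// 3b. Rest pattern in a struct: bind `width`, ignore the remaining fields.
fn width_of(c: &Config) -> u32 {
    let Config { width, .. } = c;
    *width
}

fn main() {
    let base = Config { width: 1, height: 2, depth: 3 };
    println!("{:?}", first_n(&[10, 20, 30, 40], 2));
    println!("{}", width_of(&base));
    println!("{:?}", with_width(base, 9));
    println!("{}", first_of((7, 8, 9)));
}
```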

Conversely, it also provides multiple syntaxes for the same thing. As far as I can tell, all of these produce an identical trait bound:

fn f(a: impl T)
fn f<U: T>(a: U)
fn f<R>(a: R) where R: T
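Here are the three spellings as a compilable sketch, using std::fmt::Display as a stand-in for the trait T. The function names are mine. All three accept the same arguments and behave the same when called normally.

```rust
use std::fmt::Display;

// Spelling 1: `impl Trait` in argument position.
fn show_impl(a: impl Display) -> String {
    format!("<{}>", a)
}

// Spelling 2: a named type parameter with an inline bound.
fn show_generic<U: Display>(a: U) -> String {
    format!("<{}>", a)
}

// Spelling 3: a named type parameter with the bound in a `where` clause.
fn show_where<R>(a: R) -> String
where
    R: Display,
{
    format!("<{}>", a)
}

fn main() {
    println!("{}", show_impl(42));
    println!("{}", show_generic("text"));
    println!("{}", show_where(1.5));
    // The named forms also accept an explicit type argument.
    println!("{}", show_generic::<i32>(7));
}
```

The one difference I am aware of is at the call site: the named forms allow an explicit type argument such as show_generic::<i32>(7), while the type behind impl Display cannot be specified that way.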

As it turns out, all of those complaints are really syntactic complaints. Yes, they increase Rust’s already-steep learning curve. Yes, they put more burden on the reader to mentally simulate Rust’s rules instead of just reading what’s presented. Even so, I don’t think any of them actively thwart useful programming patterns. Even though I don’t love the syntax, I’m going to continue on and see if I can learn to like it. The borrow checker is very cool, and it’s nice to have both automatic memory management and deterministic variable lifetimes.