
Hi everyone! Welcome.
Today, I'm here to try to convince you that Rust is something you should consider.
I'm not here to tell you that you must use Rust.
I'm not here to tell you that Rust is the right answer.
But I am trying to tell you about the ways in which you should consider Rust
like, why Rust is an option that is worth thinking about
and also some of the things that might make you hesitate
and things that you should think carefully about before adopting Rust.
Before we start I just want to say a little bit about myself
I am a PhD student at MIT. I work in the parallel and distributed operating systems group.
I am building a highly distributed database called Noria
that is about 80 thousand lines of Rust code
and I've been working on that for about five years, since the beginning of 2015.
I started building this in Rust shortly after its 1.0 release.
I've also done a lot of live streaming of Rust. I've done about 150 hours of building real Rust code online.
This includes things like porting the concurrent hash map from the Java standard library to Rust.
I also contribute to the Rust standard library, and I contribute to Tokio,
one of the major Rust asynchronous runtimes.
You can find me a bunch of places on the web and you'll see that I have a bunch of experience with other
similar languages as well, like C++ and C. But also languages like Python and Java
which are often considered in the same breath as Rust.
So why are we here today?
Well, we're here because you want to know what Rust is, in some sense,
and whether it's a viable candidate for whatever it is you might want to use it for.
And so in this talk I am going to give you some comparisons against these other languages,
at a relatively high level of what the strengths and weaknesses of Rust are
compared to those languages. I'll also tell you about some of Rust's features
and their advantages, and also some of Rust's drawbacks, and how you should think
about those when considering whether to adopt Rust.
I'm also going to look at some of the long-term viability of using Rust. Rust is a young language,
and it's worthwhile to think about its future trajectory when deciding whether to use it for anything major.
I am NOT going to teach you Rust.
That is not going to be the goal of today's talk.
I will show you some code on certain slides and walk you through some of it to show particular features
but even though it's our natural instinct to try to understand all of the code,
I urge you to instead listen to the words that I say, rather than try to exactly understand how the thing works.
I'm happy to deal with that in the Q&A after.
So, first I want you to meet Rust. Rust the language.
I want you to know what it is. Rust is a systems programming language that was made by
Mozilla, and it had its 1.0 release about five years ago.
Systems programming is not a very well-defined term, but you can think of
systems programming as being for programs that care about their runtime.
and when I say runtime, I mean either the environment in which they're executing, or their performance, or both.
Rust has this slogan of "fast, reliable, productive: pick three" as sort of its marketing pitch.
Or this notion of "fearless concurrency". And the basic premise here is that they want to argue
that other languages force you to make trade-offs where you
give up one of these three, whereas in Rust you do not have to.
Rust is primarily community driven as a language.
It is developed out in the open. And even though it's sort of shepherded by Mozilla,
it is essentially the work of a large group of volunteers both associated with Mozilla and not.
And many people working for other companies.
For the technical bits, Rust is a compiled language.
There's no bytecode involved; it is actually compiled to machine code.
And it's a language that has strong static typing, so no type checking happens at runtime;
it's all statically checked by the compiler.
And it is imperative, but it does have some functional aspects to it.
So, unlike purely functional languages you may know of, like Haskell for example,
Rust is an imperative language like Java, and C, and C++, but it does have some of
those functional aspects you can mix in if you so wish.
There's no garbage collection and no runtime in Rust,
and we'll get back to some of the implications of that later on.
Rust also has a very elaborate type system that they use to try to give a lot of guarantees at compile time
that other languages can only try to provide at run time. You'll see that Rust is a
relatively type rich language, where types are used for enforcing a lot of these guarantees statically.
So let's go through a quick comparison against the languages you may have heard of
and how Rust differs from those languages.
The first among these is Python. If you compare Rust to Python, Rust is much faster.
And the primary reason for this of course is that Rust is a compiled language, not an interpreted language.
But in addition, the fact that there is no runtime means that there's much less overhead.
You see much lower memory use in Rust than you do in Python, and you also get things like multi-threading
much more easily than what you get in Python, where the global interpreter
lock (GIL) and these types of things get in the way of multi-threaded performance.
In Rust you also have these things called algebraic data types and something
called pattern matching, which I will get back to a little bit later. But these are
mechanisms that allow you to write really nice code even though you are
using types for everything. So the types do not so much get in your way as enable
you to do things you couldn't otherwise do.
Also because you have this comprehensive static typing in Rust,
you don't end up with as many runtime crashes as you often get in Python.
Rather than your program running for a while and then crashing
because you forgot to check whether something was None, or because something is the wrong type:
this just cannot happen in Rust.
The compiler will check at compile time that your code handles these cases.
If we then take a step forward and compare against Java.
So, against Java, again, Rust has no runtime, and so there's no overhead from the JVM.
You see much lower memory use.
You also see generally higher performance, because you don't have to pay any of the costs that are
associated with the JVM and the interpretation of the bytecode.
Rust also has this notion of zero cost abstractions.
In Java, if you want to add an interface or you wanted to add additional classes,
often this comes at a runtime performance cost. Whereas in Rust many of these abstractions
are actually zero cost — they are compiled away. They go away at compile time,
and at runtime it's as if the abstractions weren't there, but you can still use them in your code.
Also in Java you may have seen things like the concurrent modification exception,
or in general just the notion of data races, that comes up a lot.
In Rust, these go away. Rust guarantees at compile time
that there are no data races, and we'll get back a little bit to exactly how that works.
So those kind of exceptions just cannot happen anymore.
Also this notion of pattern matching, which we'll get back to later, which enables you to use this
rich type system in a very intuitive and natural way.
Rust also has a unified build system that comes with the compiler tool chain.
So rather than have to set up something like Maven or Ant,
the Rust build system — Cargo — just comes with all of this built-in, and all the libraries, all the tools,
use the one build system.
This also makes dependency management a lot easier, and I'll get back to some of the details
of that later, but in general in Rust it is very easy to add external
dependencies whether those are in other libraries you have developed or someone
in your team has developed, or those that exist on the public Internet in some kind of repository.
And finally, the comparison against C and C++ is actually very straightforward.
There are no segfaults in Rust.
There are no buffer overflows in Rust.
There are no null pointers in Rust.
And there are no data races in Rust.
So a lot of the pain points that people run into with writing that kind of low-level native code,
just go away at compile time in Rust.
The powerful type system is also really nice.
You no longer end up with sort of void pointers everywhere like you often do in
C, or all of the nasty typecasting you often write in C++. In Rust these are all
things that the language provides nicer abstractions for.
And again the unified build system and the dependency management mean that it's really easy
to write modular code bases, and to not have to worry about writing Makefiles
and CMake files and Ninja files and Bazel files and all these other
build things you often need in these large complicated C and C++ code bases.
So I want to include one more language in this list, and that is Go.
Go often comes up as a contender in these kinds of settings because it in some sense feels similar to Rust.
It tries to take the existing sort of large languages and provide an alternative.
And the fact that it's used so heavily by Google makes a lot of people want to use it and
feel like it's a language that should be well suited.
And Go is an excellent language for many reasons.
Compared to Rust though, again, we have this observation that in Rust there is no GC,
there is no runtime, and so Rust can naturally get higher performance than what you can get out of the Go code.
You also do not have null pointers in Rust, whereas you do in Go. In Go you have to
sort of manually remember to check that things aren't nil, and that there aren't errors.
In Rust, as I'll show later, this just goes away as a problem.
You have much nicer error handling in Rust as well,
whereas in Go you have to constantly repeat this pattern of checking for errors as you go through your code.
You have completely safe concurrency in Rust.
When you have multi-threaded code you will know that it is right. You will know that it has no data races.
Whereas in Go, even though concurrency is very easy,
that concurrency is also really easy to shoot yourself in the foot with.
And again, the stronger type system, the zero cost abstractions, the dependency management,
many of the same things we've seen for these previous languages apply to Go just as much.
So let's dive in a little bit more and look at what some of these features are.
What are these features that make Rust a good language when compared to these other languages.
What are the features that Rust provides?
And the first thing is that Rust is a modern language.
And it's hard to emphasize exactly how important this is. Rust is a nice language to use.
It does not feel like a low-level systems language.
The three things I want to highlight here are the fact that there are nice and efficient generics.
You can think of these sort of as generics in Java or templates in C++.
You have algebraic data types and pattern matching, which I'll show you
some examples of, and you also have a really modern tool chain that just ships
with the language. And so a lot of these pain points that exist with many of the
older languages are just gone in Rust.
Let's look at generics first.
So what I'm gonna show you is a bit of Rust code and it's a relatively straightforward method.
It takes a predicate which is going to be a function and it's gonna try to find
an item that matches that predicate in some implementation of a vector.
You'll notice first of all that this vector is generic over T, so the vector
can hold elements of type T where we don't specify in advance what that type
T is. The user gets to choose what that type T is. And similarly the find method
is generic over the type of the predicate.
And in particular you see that
the predicate is of type function from a reference to T to a boolean.
This is generic code, but what you should know about this in Rust is that this
code gets compiled as if the generics weren't there.
If you write this code and then someone calls find on a vector of u8s or a vector of some struct,
those will be compiled as entirely separate functions,
and they'll be optimized separately, just like C++ templates, but unlike Java generics.
You'll also see that in the body of the function, when we call the predicate,
that call will actually get compiled with the function that was passed in as the predicate.
All of the cost of that function call is something that the compiler can optimize, it can inline that function,
because it's compiled for each instance of T and P.
This means that you can have generics that effectively go away in terms of runtime performance.
They get optimized as if you had written the code without generics, leading to really fast runtime performance.
There are some other nice things about generics in this code so for example you see that I wrote for v in self.
This is because the MyVec type implements Iterator, and so you can just use the for keyword
to iterate over them. And you'll see this in a lot of Rust code as you start using it that there
are these behaviors you can assign to types, and this is similar to like C++
operator overloading or certain special interface types in Java, that the
language takes advantage of types that implement those traits.
The other thing you will see is that the return type of this function is Option reference to T.
It is not a pointer to T. It is an Option reference to T.
And you'll see that if the function finds something it returns Some and that thing. Otherwise it returns None.
I'll get back to why that is really exciting, but for now think of this as:
you can't get back a null pointer from this function; it is not possible.
You must get back either Some with a pointer or None.
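As a rough sketch of the kind of code being described (the MyVec name and the method body here are my own illustration, not necessarily the slide's exact code):

```rust
struct MyVec<T> {
    items: Vec<T>,
}

impl<T> MyVec<T> {
    // Generic over the element type T and the predicate type P.
    // Each concrete (T, P) pair gets its own monomorphized, inlinable copy.
    fn find<P>(&self, predicate: P) -> Option<&T>
    where
        P: Fn(&T) -> bool,
    {
        for v in &self.items {
            if predicate(v) {
                return Some(v);
            }
        }
        None
    }
}

fn main() {
    let numbers = MyVec { items: vec![1, 2, 3, 4] };
    // The return type is Option<&i32>: either Some(&item) or None, never a null pointer.
    if let Some(first_even) = numbers.find(|n| n % 2 == 0) {
        println!("found {}", first_even);
    }
}
```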
The second thing I want to talk about is this notion of algebraic data types, and pattern matching more generally,
and you may be familiar with this if you've worked with functional languages, which is where people most often see it.
Scala has a little bit of it, Ruby has a little bit of it, and Python is getting a little bit of it.
The basic idea here is that you can define enumerations, or types, that contain other types.
So if you're used to enumerations from other languages, like an enum in Java or an enum in C, those enums can
just have a bunch of constants. They cannot contain other things, whereas in Rust, they can.
The Option type is one example of this.
Option is an enum that is generic over some T, and it either contains the Some variant and that T, or
it contains None. If you're from sort of the C and C++ background, this is a tagged
union, but the compiler deals with all of the code for it for you. So in the case
of our find method, when we call it, the return type of that function is an
Option reference to T. This means you cannot just blindly use it as a reference to T.
You must use it as an Option reference to T.
You do not have a choice, and you can't skip the null check so to speak.
The language then gives you pattern matching so that you can match over types that are these enums.
So in this case we say "if let Some(f) =" and then a call to find, and the code inside that block
will only get executed if find indeed found something, with f assigned to that value.
Similarly, we can define our own enums.
So here's an example of an enum for some decompression result.
And you'll see that one variant is Finished, and it contains
the number of bytes read or decompressed or something like that.
There is one error variant for an input error and one for an output error.
And then if we call some function decompress that returns one of these,
we can use the match keyword to match over the result,
giving patterns that we want to match the return value against.
And these can nest arbitrarily if you have enums inside enums inside enums.
It lets you tease out only the specific values and variants that you care about.
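Here is a small, hedged sketch of an enum and match of the kind being described (the exact variants on the slide may differ):

```rust
enum DecompressResult {
    // Finished carries data: the number of bytes that were read.
    Finished { bytes_read: usize },
    InputError,
    OutputError,
}

// Stand-in for the real decompression routine being discussed.
fn decompress() -> DecompressResult {
    DecompressResult::Finished { bytes_read: 1024 }
}

fn main() {
    // The match must cover every variant, or the code will not compile.
    match decompress() {
        DecompressResult::Finished { bytes_read } => {
            println!("done, read {} bytes", bytes_read)
        }
        DecompressResult::InputError => eprintln!("bad input"),
        DecompressResult::OutputError => eprintln!("could not write output"),
    }
}
```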
More importantly, the compiler will check that you have exhaustively matched.
You cannot write one of these match statements and not include every variant.
This means that if someone adds a variant to an enum later on, your code will not silently break.
The compiler will tell you "you also need to update this match over here".
This is invaluable if you do things like large refactoring of your code, because in those cases, if someone
changes code or renames a type, the compiler will tell you all the places you need to conform to that change.
And this makes rewrites actually quite pleasant in Rust.
You make a change and then you follow the compiler errors until it compiles again.
I also want to talk about this notion of modern tooling.
I mentioned before that Rust comes with a build system that handles
dependencies for you, but it does a lot more. By default, in Rust, the compiler
knows about things like tests and documentation. For example, you can
take any function in your code and you can annotate it with this #[test] attribute
and then the compiler and the build system will know that this is a unit test.
When you run cargo test, that function will be run as a test, and if it fails then it will be counted as a test failure
for that run.
It will also automatically get access to any local and private fields so that you can write it as a unit test.
You can also place these tests in separate files or outside your source directory,
and then they will be compiled as integration tests, which only have access to your public API.
And the compiler and build system know about all of this, and know how to run all of them.
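A minimal sketch of what this looks like in a library's lib.rs (the function and the test here are just illustrative):

```rust
fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    // `cargo test` discovers and runs this automatically, and because it lives
    // in the same crate it can use private items like `add` above.
    #[test]
    fn adds_small_numbers() {
        assert_eq!(add(2, 2), 4);
    }
}
```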
Similarly the compiler knows about documentation. If you write anything
above a function or type or module that has three slashes instead of two,
it will be automatically compiled as the documentation for that item.
If you have code blocks inside of that documentation, they are automatically compiled as integration tests.
This means that you cannot have examples in documentation that are wrong,
because then the tests would not pass.
This is really helpful in ensuring that your documentation stays up to date.
This is often why you will see Rust code pretty well documented:
the documentation also acts as tests.
There's also a single tool that builds an HTML version of your entire documentation,
and it will interlink that with any dependencies you might have, all of the types will be interlinked
with one another, and it will show the result of all this kind of example code in documentation.
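For example, a hedged sketch of a documented function (the crate name mycrate is a placeholder):

````rust
/// Returns the larger of two numbers.
///
/// The code block below is compiled and run by `cargo test` as a doc-test,
/// so the example cannot silently go stale:
///
/// ```
/// assert_eq!(mycrate::max_of(3, 7), 7);
/// ```
pub fn max_of(a: i32, b: i32) -> i32 {
    if a > b { a } else { b }
}
````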
So we've talked a little bit about the ways in which Rust is a modern language and why it's nice to use,
but some of you might wonder well beyond it being nice to use, what does it give me?
What is the sort of selling feature of the language?
And one of the primary ones is that Rust gives you safety by construction.
The way to think about this is that it's harder to misuse, or mis-program, something written in Rust.
And there are three primary reasons for this.
The first is that pointers in Rust are checked at compile time for a number of different invariants.
The second is that Rust gives you thread safety at compile time.
This is the notion of no data races that I talked about.
And finally, in Rust, you cannot have hidden states.
I'll come back to exactly what that means, but things like null pointers I consider hidden states.
I'll show you later.
So let's first look at this notion of checking pointers at compile time.
Here's a bunch of code — I'm gonna walk you through it.
The Rust compiler ensures that every value in your program has a single owner
and that owner is responsible for freeing that resource, whatever it might be.
If it's heap allocated memory for example, then the owner is responsible for freeing that memory.
You can think of this in sort of C++ terms as RAII: Resource Acquisition Is Initialization.
And the compiler, at compile time, checks the following two properties for every single value that you have.
First: there is only ever one owner.
There cannot ever be two owners of one value, and this means that if you ensure that this property holds,
there cannot ever be double frees. It is just not possible.
For example, with code like this, where you have let x be a new vector,
and then you move x into y,
and then you try to drop x,
the compiler would say this is not allowed; y is the owner of this value.
You are trying to drop a value that you do not own.
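In code, a minimal sketch of that first rule (the rejected line is left commented out so the rest compiles):

```rust
fn main() {
    let x = vec![1, 2, 3];
    let y = x;   // ownership of the vector moves from x to y
    // drop(x); // error: use of moved value `x`; x is no longer the owner
    drop(y);     // fine: y is the single owner, and dropping it frees the vector
}
```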
The second property that it checks is that no pointers outlive the owner.
That is, if the owner is moved or dropped, there cannot be references to it.
And if you guarantee this invariant, then you cannot ever have a use after free.
It is just not possible.
Here we have some code where I create a vector,
and then I create a reference to the first element in the vector
and then I move the vector.
So I move x into y, and then I try to use the pointer that I initially took into x.
The compiler will detect that this is a reference that lived past when its owner went away
and will reject the code as being invalid, because the pointer might at this point be dangling.
and the compiler will say this is not okay.
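A small sketch of that second rule (the offending use is commented out; putting it back makes the compiler reject the move):

```rust
fn main() {
    let x = vec![1, 2, 3];
    let first = &x[0];        // a reference (borrow) into x
    let y = x;                // move ownership of the vector into y
    // println!("{}", first); // error: re-enabling this line makes the move above
    //                        // invalid, because `first` would outlive its owner
    println!("{:?}", y);
}
```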
This is what is known as the borrow checker.
And some of you may have heard of this when hearing about Rust code in the past.
And it can be a little bit of a pain to get used to these rules
and to write your code in such a way that you guarantee this.
But by giving these guarantees, so many of the runtime problems and runtime data corruption
that you might see in compiled languages just go away.
There's more to this though. Pointers and variables in Rust by default are immutable.
Here I construct a vector, and then I print the length of the vector
that's all fine, and then I try to push to the vector.
That push will not be allowed by the compiler, because the vector is not marked as mutable.
Everything is immutable by default
And this also applies to methods.
So here I have a function called accidentally_modify, and you'll see that it takes
a reference to a vector of i32s. It is allowed to print the length, but it is not allowed to call some other method
that requires mutable access to the vector, because it itself was not given mutable access
so it cannot give mutable access to anyone else.
Even the code that owns the value is not allowed to do this,
because the variable at the top was declared as not mutable.
You can think of this sort of like the const keyword in many other languages,
but a little bit on steroids. In this setup the const sort of applies transitively:
you cannot modify anything that's reachable through the value either.
This gives you much better guarantees about what people can and cannot do with values you give them.
You can safely give an immutable reference to some other code you don't know
and you know they can't modify it.
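A hedged sketch of the two examples just described (accidentally_modify follows the description above; the exact slide code may differ):

```rust
fn accidentally_modify(v: &Vec<i32>) {
    println!("len = {}", v.len()); // reading through a shared reference is fine
    // v.push(4);                  // error: cannot borrow `*v` as mutable
}

fn main() {
    let v = vec![1, 2, 3];         // not declared `mut`, so immutable by default
    println!("len = {}", v.len());
    // v.push(4);                  // error: cannot borrow `v` as mutable
    accidentally_modify(&v);
}
```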
The other thing is this notion of thread safety, and this is often touted as one of Rust's primary
features: the fact that you cannot have data races at runtime — it's just not possible.
and the reason for this is that Rust types know whether it's safe for them to cross thread boundaries.
They know whether it's safe for a value to be given away to another thread
or whether it's safe for two threads to access the value at the same time.
For example, Rc and Arc are two reference counted wrapper types
sort of like smart pointers in C++ or every pointer in Java
Hmm, not quite true, but sort of true...
And the difference between these two types is that Rc does not use atomic CPU instructions
to manage its reference count. Whereas Arc does.
Rc is therefore not thread safe, whereas Arc is.
If you shared an Rc between threads, then they might race and overwrite each other's reference counts.
With Arc this cannot happen. And the compiler knows this — if you try to create an Rc of some value
and you try to spawn a thread and give it the Rc, the compiler will say no this value is not thread safe.
You're not allowed to do this.
If you try to do the same with an Arc, an A-Rc, then the compiler will let you do that because it is thread safe.
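A minimal sketch of that difference (the rejected spawn is commented out so the block compiles):

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

fn main() {
    let rc = Rc::new(5);
    // thread::spawn(move || println!("{}", rc));
    // ^ error: `Rc<i32>` cannot be sent between threads safely (it is not Send)
    let _ = rc;

    let arc = Arc::new(5);
    let handle = thread::spawn(move || println!("{}", arc)); // fine: Arc is thread safe
    handle.join().unwrap();
}
```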
And this also applies to things like references. So, there's an additional
pointer rule that I did not mention to you earlier, which is that there can only ever
be either one mutable reference or any number of immutable references to any
given value at any given point in time. This ensures there can't be data races.
You can either have one writer, or you can have many readers, but you can never overlap them.
And the compiler checks this, even across thread boundaries. Here, for example, I have a mutable vector.
And then I spawn a thread, and I try to modify the vector in that thread
and I also try to modify it in the original thread.
The compiler will not let this code compile: even though the vector itself is fine to send to another thread,
a mutable reference to it is not.
Or it might be safe to send, but then I can't also try to use it in this thread.
It would detect that there's a potential for two mutable pointers to the same value and disallow this code.
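Sketched out (again with the rejected lines commented so the snippet itself compiles):

```rust
use std::thread;

fn main() {
    let mut v = vec![1, 2, 3];

    // thread::spawn(|| v.push(4));
    // ^ error: the closure borrows `v` mutably and may outlive this function;
    //   combined with the push below, it would also mean two mutable borrows at once.

    v.push(5); // fine: exactly one mutable access at a time
    println!("{:?}", v);
}
```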
In fact the guarantees are so strong that there's a library called rayon,
which is just a library you can choose to use, and it gives you parallel iterator operators.
This is something like you have a hash map or a vector or something like that and instead of iterating over it
in one thread, you want to iterate over it and do a map or something on all the values in parallel.
And rayon can, as a library, safely implement this for any type on which it's safe to do.
Because it knows about this inherent notion of whether a value is thread safe or not.
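A hedged sketch of what using rayon looks like, assuming rayon is added as a dependency in Cargo.toml:

```rust
use rayon::prelude::*;

fn main() {
    let v: Vec<u64> = (0..1_000_000).collect();
    // par_iter() spreads the work over a thread pool; the library can offer this
    // safely because it knows the element type is safe to share across threads.
    let sum: u64 = v.par_iter().map(|x| x * 2).sum();
    println!("{}", sum);
}
```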
And the final property here is this notion of no hidden states.
And this will be particularly familiar if you come from Java.
If you come from C, C++, or Python, you will also recognize some of these patterns,
but in Java it'll be particularly keenly felt.
And that is this notion of: Rust uses the type system to ensure that you check every case.
You do not have null pointers in Rust.
Anytime where you see a reference, it is guaranteed not to be null.
To the point where the compiler will make optimizations based on the fact that it is not null.
For example, we talked about the Option type. There's also a Result type, which is generic over two
types: the Ok type and the Err type. And basically any method that can fail returns one of these.
And when you get one, you can't just blindly use it as one type or the other. You have to explicitly match
on what type it was, and the compiler checks that you actually handled the result correctly.
This means you don't have to remember to put in null checks, and you can't forget them either.
The compiler will force you to handle them.
You cannot accidentally ignore an error; the compiler will force you to deal with that error.
You can choose to deal with that error by throwing it away, but you can't just accidentally forget to.
And so here we have some examples of this in code. For example, if I call find I can't just use the result
as a reference to T, because the compiler will say its type is Option, not reference.
Similarly, if I try to parse a string,
the result is a Result type whose error case you need to deal with.
There's also this nice operator that's called a question mark, or the try operator. So, this operator is equivalent
to a match where if it's Err, then return that error, otherwise continue with the value inside of the Ok.
Sort of a short path for bubbling errors up to the caller.
This makes it really pleasant to write code that deals with errors in Rust.
And you'll find this if you try to run through some of the examples in the book,
this is actually really easy to handle even in large code bases.
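A small sketch of a Result plus the ? operator (the function here is just an illustration):

```rust
use std::num::ParseIntError;

// `?` returns early with the error if parsing fails; otherwise it unwraps
// the Ok value and execution continues.
fn double(input: &str) -> Result<i32, ParseIntError> {
    let n: i32 = input.trim().parse()?;
    Ok(n * 2)
}

fn main() {
    match double("21") {
        Ok(n) => println!("{}", n),
        Err(e) => eprintln!("not a number: {}", e),
    }
}
```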
That was the safety-by-construction argument, and there are other languages that also provide this kind of safety.
Even in Java you might feel like you have some of these safety guarantees already,
not all of them, but some. But the nice thing about Rust, at least in my eyes, is that it combines the safety
with the ability to work with really low-level details of how your program executes.
It sort of gets out of your way in many ways.
First of all, there's no GC or runtime — we'll talk a little about what that means.
You have really strong control over how the memory of your program works
and what its runtime performance profile is like.
And, you can write very low-level code in Rust
without having to escape to some other language or use FFI or something like that.
So first of all, the lack of a runtime, the lack of a garbage collector, buys you something really nice.
It buys you no garbage collection pauses, of course. It buys you lower memory overhead in general.
GC tends to add a decent amount of memory overhead, and some additional overhead from the bookkeeping
that the garbage collector relies on.
You can issue system calls directly — you can do things like fork and exec — which you often can't do in
managed languages because the runtime requires that it, not you, controls execution.
You can even run on a system that doesn't have an OS,
because there is no runtime, there's no code apart from your own that you need to deal with.
And so you can run on an embedded device if you so choose.
You also have the advantage that you get essentially free FFI calls. These are calls to other languages through
some kind of API — you can link against C code — and it costs you nothing, because there is no runtime that you
need to inform of the fact that you're about to call out to some other language, which is often the case if you use
something like Java's JNI, or Python's C bindings, or even Go's cgo for example.
The other thing here — I mentioned control — and you get really low level control over both allocation and
dispatch in Rust. In Rust, you can write code like "let x = 42" and "let y = some statically sized buffer",
and they'll all be allocated on the stack, just like in C.
Rust will never automatically heap allocate anything for you unlike sort of Java or Go.
But you can opt into heap allocation if you wish.
There's a type called Box.
The Box type is something that will call malloc for you, and when that type gets dropped it'll call free for you.
So this is one way to get heap allocated things, or if you declare a vector;
vectors, just like in C++, are also heap allocated, and can grow and shrink on the heap.
But crucially you can even swap out the entire allocator, globally, in your program if you wish.
You can implement a particular interface for some type that you control, and then you can make that be used by
anything that allocates in your program, whether those are boxes or vectors or hash maps or anything else.
So people have used this to use jemalloc, for example, or use Google's new tcmalloc or Microsoft's mimalloc,
and just try them out in Rust. It just works.
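A hedged sketch of what that control looks like; here the global allocator is just swapped to the standard library's System allocator, but crates wrapping jemalloc or mimalloc plug in the same way:

```rust
use std::alloc::System;

// Swap the global allocator for the whole program.
#[global_allocator]
static GLOBAL: System = System;

fn main() {
    let x = 42;                // lives on the stack
    let buf = [0u8; 64];       // statically sized buffer, also on the stack
    let boxed = Box::new(42);  // opt-in heap allocation; freed when `boxed` is dropped
    let mut v = vec![1, 2, 3]; // heap allocated, can grow and shrink
    v.push(4);
    println!("{} {} {} {:?}", x, buf.len(), boxed, v);
}
```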
You can also opt into dynamic dispatch.
So I mentioned how, by default, we get this monomorphization as it's called:
when you have generic code, you get multiple copies of that code compiled for each type that's used.
You can tell the compiler: I would rather use a vtable, and use dynamic dispatch the same way that Java does,
and pay that runtime performance cost in order to have a smaller code footprint.
And Rust leaves you in control of which of these you want to do.
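A sketch of the two dispatch styles side by side (the Shape trait here is illustrative):

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

// Static dispatch: monomorphized per concrete type, calls can be inlined.
fn print_area_static<S: Shape>(s: &S) {
    println!("{}", s.area());
}

// Dynamic dispatch: one compiled copy, calls go through a vtable (`dyn`).
fn print_area_dyn(s: &dyn Shape) {
    println!("{}", s.area());
}

fn main() {
    let sq = Square(2.0);
    print_area_static(&sq);
    print_area_dyn(&sq);
}
```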
And you can get really low-level with Rust code.
Not just that you have pointers, but you can do things like cook up a pointer from a number.
The kind of really low-level code that people often drop down to C or even assembly to write.
In Rust, you can do this. Rust has a keyword called unsafe,
and what unsafe is for, is for invariants that the compiler cannot check for you.
So in this case, by writing this unsafe block, I am telling the compiler that I know
that, at this point in time, I have exclusive access to the 80 by 24 bytes following that pointer.
That's what that unsafe block is asserting.
And the Rust compiler will go: "okay, you promised, so here is a mutable slice", and now you can do whatever
you want with that slice. Just as if you were in normal safe Rust code. And all of the normal guarantees hold
So once you leave the unsafe block,
you can use that slice to modify things, but the compiler will still prevent you from doing things like dereferencing
a random number as a pointer. It will prevent you from writing out of bounds of an array; that
will still be bounds-checked. But you have this sort of escape hatch for when you need that low-level control.
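A runnable sketch of the idea; unlike the slide's example, which conjures a pointer from a fixed address (an 80 by 24 memory-mapped buffer), this one points into memory we actually own so it can execute safely:

```rust
fn main() {
    let mut buffer = vec![0u8; 80 * 24];
    let ptr = buffer.as_mut_ptr();

    let slice: &mut [u8] = unsafe {
        // The `unsafe` block asserts an invariant the compiler cannot check:
        // that we have exclusive access to the 80 * 24 bytes starting at `ptr`.
        std::slice::from_raw_parts_mut(ptr, 80 * 24)
    };

    // Back in safe code: the normal guarantees apply, e.g. indexing is bounds-checked.
    slice[0] = b'H';
    println!("{}", slice[0]);
}
```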
The other advantage of having this unsafe keyword as an escape hatch, is that if you ever run into
memory errors during your run time, you know exactly what code to audit.
Similarly, just if you're given some random code base from someone else
you know that if you worry about memory corruption — which you generally always should — then the code
to look at is anything that contains unsafe. Any code that contains the unsafe keyword
that is where the danger zone is. But at least you know that ahead of time.
All the safe code, you know does not have that problem.
The safe code will always be memory safe.
And for those of you who are wondering, yes, you can also write assembly code directly,
because it's a compiled language, it's given to LLVM, and you have all of that control.
There's no runtime to say you can't.
So, what does this low-level control buy you?
Well, it gives you the ability to write code that you would otherwise have to drop down to C for,
but it also gives you this really nice advantage of being compatible with other languages.
And there are many ways in which Rust is compatible, but it's sort of …
… it's built to play nicely with others in many different ways.
First and foremost, you get zero overhead FFI — like I talked a little bit about before.
You also have great web assembly support for those of you who know about that space.
And it also works with many of the traditional tools you'll be used to from the C and C++ world,
and to some extent Python and Java as well.
For the zero overhead FFI, take a look at this code.
This is Rust code that calls into a C function. So in this case, I say extern C, just like you would in C++.
I declare the method signature — the equivalent Rust method signature for the C method signature.
And then I can call it a few lines below.
I have to wrap that call in unsafe, because the Rust compiler doesn't know what the C code will do
— it might arbitrarily corrupt memory —
but as long as you assert that, yes, the C call is safe, then the Rust compiler will let you call it.
And when you compile this code, it will be as if there was no language boundary here.
This is going to be a straight up like …
ld is gonna link these two together, and they'll be co-optimized with like LTO.
There will be no overhead to that call, beyond what you would get calling the same function from C.
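A minimal sketch of calling into C from Rust; `abs` comes from the C standard library, which is already linked on most platforms:

```rust
// Declare the C function's signature in Rust terms.
extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    // The call is wrapped in unsafe because the compiler cannot check what the C side does.
    let x = unsafe { abs(-3) };
    println!("{}", x); // prints 3, with no extra FFI overhead
}
```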
Similarly, you can export Rust functions through some kind of C ABI.
I can declare a Rust function — it's marked with no_mangle here so the compiler makes the name of the
function actually be the same in the binary — and then I can do whatever I want inside of that function; I'm just
writing normal Rust code. But it's callable from C.
I have to make sure that I use C compatible types in the arguments for example, but apart from that
this is just regular Rust code. And it doesn't even have to be unsafe,
because if I get called from C, at that point I'm in Rust world, and the Rust rules apply.
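And a sketch of the other direction, exporting a Rust function over the C ABI (typically from a crate built as a cdylib or staticlib):

```rust
// #[no_mangle] keeps the symbol name intact in the binary; extern "C" fixes the
// calling convention; the argument and return types are C-compatible.
#[no_mangle]
pub extern "C" fn add_numbers(a: i32, b: i32) -> i32 {
    // Inside, this is ordinary safe Rust.
    a + b
}
```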
What is really nice about this is that this works with any language that can interact with a C ABI.
That includes C, and C++, and Java's JNI, and Go's cgo, and Python's C bindings, and Ruby has something similar.
All of these languages have this common, low-level native language to talk to one another, which is the C ABI
and Rust plugs into that, and with no overhead.
The other thing that's nice is that there are tools … the tools are called bindgen and cbindgen … that generate
these for you. So, in particular, with bindgen, you can take a C header file, you can call bindgen on it,
and it will give you Rust code that contains Rust equivalents of all of the C types, and Rust equivalents of
all of the C methods, and just generates sort of an API for you that is directly callable from Rust. Similarly, with
cbindgen, you can take a bunch of exported Rust functions like these,
and then it just generates a C header file for you, which you can then use from C or from any other language.
This also gives you great interoperability with something like Python… You use cbindgen to generate
C headers, and then you point Python at the C headers, and now you can call Rust code directly from Python.
The other thing that's nice here is that the FFI makes incremental rewriting
really easy. What you get is the ability to take some module of your existing
project, implement it in Rust, and then just sort of swap it out.
If you have a C or C++ code base, then you just rewrite that module in Rust and
export it through the C ABI, or if you're writing Java code or Python code and you
want some module to be faster, you write it in Rust, and then you call it through
the JNI or the Python C interface or anything like that, and now you have
that fast module that you could just incrementally replace in your code base.
The other thing that's nice about compatibility is, for those of you who are
watching the web space, Rust has really good interoperability with something
called web assembly. Web assembly is an interesting new effort that's… you can
think of it a little bit like the JVM, but built into browsers. The idea is to
provide a sort of assembly language for the engine the browser already has, so it can run
near-native code. And that includes things like
sandboxing and cross-platform compatibility layers — it's a very interesting new space
that's now developing, and we're seeing a lot of cross-platform applications being
developed. And Rust has one of the best integrations with web assembly there
is. Rust has a working group that works directly on this, and also contributes to
the specification, and so if you're in this space, Rust is great. You can even
take a Rust program, and you can export it directly as an npm module,
so that it's callable by nodejs code, and just importable normally from the npm dependency tree.
The final one here is that it works with traditional tools.
As developers, we've worked for years in this field of like: I have a program, I
have a problem, and I know the tools that I use to solve the problem. And so
switching language often has a pretty high cost, because the same tools might not work.
And that is true to some extent for Rust too — it is a different programming language after all —
but many of the tools you're used to will continue to work.
For example, perf still works.
Rust is just compiled using LLVM to machine code, and so there's no reason perf shouldn't work.
It generates DWARF symbols, that is all fine. gdb works. lldb works.
Which means that in most of your IDEs, your debugging is just gonna work just fine.
Valgrind works — you can do all the memory checking you're used to being able to do, directly on a Rust binary as well.
LLVM sanitizers work — if you write really low-level concurrency code this can be nice to know.
Much of your security checking infrastructure, if you have some of that, all of that auditing can continue to work
because Rust just produces binaries that are indistinguishable from a C++ binary.
And this is really handy, especially in a large setting where you have large pieces of software and you want
automated tools to work on it, it's nice to be able to just slip in Rust code there, and have it work.
There's another aspect of this, which is, it's not just compatible with other languages,
but it also comes with really nice tooling that makes the move from whatever language you were used to before
to this new language — and also just working in that space — very comfortable. Rust comes with…
I mentioned some of the good modern tooling…
but it also comes with many of the things that we're used to from our existing languages.
These are things like dependency management, standard tools like formatters, and also…
I'll get into macros a little bit later because that's a particularly interesting tool that the Rust toolchain comes with.
So first of all, I mentioned that Rust supports this notion of dependency management built into the toolchain.
So here's an example of the kind of file you can write in a Rust project — it's called a Cargo.toml file — and it
specifies all of the dependencies your project has.
Here, you'll notice that I have a dependency on the regex library version 1.3.3
the rayon library version 1.2, and a git dependency on some library called csv.
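A hedged reconstruction of what such a manifest might look like (the package name and the git URL here are placeholders, not the slide's exact file):

```toml
[package]
name = "myproject"
version = "0.1.0"
edition = "2018"

[dependencies]
regex = "1.3.3"
rayon = "1.2"
# A dependency pulled straight from a git repository rather than crates.io.
csv = { git = "https://example.com/some-org/csv.git" }
```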
And what's really nice about this is the Rust compiler, when you compile your program, will automatically fetch
these dependencies, and build them for you. It knows about the dependency tree and how they integrate, and
ensures that it builds any given dependency only once, even if it's transitively included by different paths.
It also knows about versioning. So for example, if you specify that you have a dependency on regex 1.3.3,
it will happily download version 1.3.7, because by semantic versioning, that should be equivalent to the
one you have, but with bug fixes and such. It will also accept version 1.4, but not version 1.2. It will not accept
version 2, because in semantic versioning that is no longer compatible with the API you were working with.
And so it'll automatically let you use the most up-to-date
versions of your dependencies that your code is still compatible with.
Normally these are pulled from a central repository called crates.io, but you can also spin up your own
private repository if you have libraries, or "crates" in the Rust ecosystem lingo, that you don't want to be public to the world.
There's also support for having custom build steps. These could be things like: I depend on a C library,
so you can write essentially a build file, which is also written in Rust, as a program that invokes CMake
or make or Ninja or Maven to build some artifact that your pipeline depends on.
Or even just to dynamically generate some of these dependencies.
The Rust toolchain also knows about the difference between different types of builds. By default it does a
debug build, with debug symbols. You can tell it to build an optimized build, a small build…
When you're in cargo test, it knows that there are some dependencies that might only be used for testing
and it'll only download and build those if you are running the tests, not if you're just building the library or binary.
All in all, this just makes for a very smooth experience whenever you're working with third-party code.
Or, code that's not your own, that depends on external code that you do not necessarily have control over.
The second point here are the standard tools that we've come to expect exist for our programming languages.
These all come shipped with Rust.
They're not third-party; they're developed by… not necessarily the same Rust team,
but they're developed within the Rust organization. For example, there's
cargo fmt (rustfmt), which is a standardized code formatter for Rust code.
There's cargo doc, which generates the documentation from the documentation comments I showed you earlier.
There's cargo clippy, which is a built-in linter that covers many kinds of lints,
whether those are for safety or just for being idiomatic.
There's also the RLS and rust-analyzer, which give you really nice compiler integration for IDEs.
So you get like go to definition, auto-completion, and all of that stuff that you're used to.
One thing in particular I want to mention here is, because all of this is so integrated,
there's a website called docs.rs, which is a site that automatically generates all of the documentation for
every version of every library that is uploaded to the standard repository. And it's automatically interlinked
between all those libraries, and with the standard library, so if you upload a library to crates.io, your
documentation will automatically be built, and be made public, and any type you exposed there, you can just
click on the type and you'll be taken to the documentation of that type in some other library
or that type in the standard library. And all of this is integrated and just works out of the box.
And this means that the documentation experience in Rust is often extremely good
and one thing that people that come from other languages are amazed by.
And the last thing has a similar kind of flavor.
Rust has really good support for writing Rust programs that manipulate Rust programs.
This is often known as metaprogramming — being able to manipulate the AST, the syntax tree of your program.
In C++ or C, you might be used to these as macros, where you can #define basically anything to basically
anything else. But Rust macros are a lot more controlled than that. They're not this weird extra replacement
language — they're fully fledged Rust programs that have a well-defined syntax for transforming Rust syntax trees.
For example, here's how you write the assert equals macro. Here I'm saying that this macro is gonna take
two arguments, and those arguments are both going to be parsed as expressions.
So they're not gonna be identifiers or types, they're going to be expressions.
And then this is the body — I'm just gonna assign the expression to one variable, then assign
the other expression to another variable, and if they're not the same, I'm gonna panic.
But, the nice thing about this is…
Rust enforces that the input is valid Rust.
The output is valid Rust.
And it checks it at compile time, and gives you nice errors if you do not follow those rules.
Also, the macros are what is known as "hygienic": any identifier you define inside the macro
cannot contaminate the space outside the macro.
And similarly, any identifier you have outside the macro, you cannot modify inside the macro.
Unless you pass it in.
For example, if at the place where I call assert_eq I have a variable
called left, or a variable called right, this code will still work exactly the way I expect.
Because the macros just live in their own space; they're hygienic compared to the code they're called from.
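A simplified, hedged sketch of such a macro; the real assert_eq! in the standard library is more elaborate, but the shape is the same:

```rust
macro_rules! my_assert_eq {
    // Both arguments are parsed as expressions.
    ($left:expr, $right:expr) => {{
        let l = $left;
        let r = $right;
        if l != r {
            panic!("assertion failed: {:?} != {:?}", l, r);
        }
    }};
}

fn main() {
    // Hygiene: the caller's own `l` cannot collide with the macro's internal `l`.
    let l = 100;
    my_assert_eq!(2 + 2, 4);
    println!("{}", l);
}
```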
Rust also has something called "procedural macros", where you write a Rust function from a stream of tokens
to a new stream of tokens. And this lets you do more elaborate rewritings of Rust code.
One of the really cool things you could do with this is, there's a library called "serde". So, "serde" is a library that
provides serialization and deserialization for any type automatically for any format.
Think about that for a second.
This is a library, you can take any type, like the type I've defined here, you can just say:
"Give me an implementation of serialization and deserialization for this type", and it will just do it.
It will generate the code to walk this type, and it will do so in a generic way where I can then take this type
and pass it to a JSON serializer, or a CSV serializer, or a binary encoding serializer, or a gRPC serializer
and it will just work.
And it does the same for deserialization.
It is magic the first time you use it. And it is implemented using these macros.
It is given as input this entire type's definition, and what it produces is essentially a visitor over all the fields
and it knows about all their types, and then it generates the
appropriate implementations to walk those fields for the given format.
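A hedged sketch of what that looks like in practice, assuming the serde crate (with its derive feature) and serde_json as dependencies:

```rust
use serde::{Deserialize, Serialize};

// The derive macro receives this type's definition and generates the code
// to walk its fields, independent of any particular data format.
#[derive(Serialize, Deserialize, Debug)]
struct User {
    name: String,
    age: u32,
}

fn main() -> Result<(), serde_json::Error> {
    let u = User { name: "ferris".to_string(), age: 10 };
    let json = serde_json::to_string(&u)?; // JSON is just one possible format
    println!("{}", json);
    let back: User = serde_json::from_str(&json)?;
    println!("{:?}", back);
    Ok(())
}
```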
Okay, so, that is the tooling that we see in Rust, and there's one last thing I
want to talk about when it comes to primary features of the Rust language.
And that is: Rust has built-in support for asynchronous code.
If you come from Python you might think of something like Twisted,
which lets you write asynchronous code. In JavaScript, promises.
I don't know whether Java has something similar, but I believe they do.
In C++ you also have std::future, which is sort of becoming a thing slowly but surely.
And the idea here is that you have some piece of code that you want to execute
but at various points it might block. And instead of it blocking, you want something else to run instead.
Often, the way this is implemented is using some kind of event loop. You can think of things like goroutines
for example, where you want many of these to run at the same time and cooperatively schedule with each other
without necessarily going to the kernel.
Rust has language-level support for this, where the language has specific keywords
that let you write these kinds of functions. And what's neat about Rust compared to
other languages that have similar functionality, is that you can choose your own runtime.
There's not one runtime that's dictated by the language the way there is in nodejs or Go or Java.
Here, instead, you define what's known as an executor,
that can take these things from the standard library — these "futures" — and run them, and cooperatively schedule them.
And you can choose which one you like best.
Which one has the performance characteristics you need, or the operating system support you need.
This is a space that's still evolving — futures and executors landed relatively recently in the standard
library — but it's an ecosystem that's working very well.
And is surprisingly performant as well.
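As a hedged sketch, here is what this looks like with one popular executor, tokio (assumed as a dependency; any executor that can run futures would do):

```rust
use tokio::time::{sleep, Duration};

async fn fetch_value() -> u32 {
    // While this task waits, the executor is free to run other tasks.
    sleep(Duration::from_millis(10)).await;
    42
}

#[tokio::main] // sets up the tokio executor and runs the top-level future
async fn main() {
    let value = fetch_value().await;
    println!("{}", value);
}
```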
So we've talked so far about a number of the advantages and nice features that
Rust has. But I told you in the beginning of this talk that my goal is not to tell
you that Rust is the right language. My goal is to give you the information you
need to decide whether Rust is right for you. And as a part of that is talking
genuinely and frankly about some of Rust's drawbacks.
And so I'm gonna walk you through some of the common complaints we see about Rust,
and some of the reservations that people have about adopting Rust.
The first of these is the Rust learning curve.
Rust is a very different language from what you're used to.
The syntax isn't so different….
Like, the code I've shown throughout this I think you can generally read and understand what it does…
but the borrow checker is a weird way to think about your code.
Very often, you'll hear the phrase "fighting with the borrow checker", and it's because it's a somewhat…
…different… way to reason about the behavior of the pointers in your program.
And oftentimes, the compiler will reject your code because it says
that it's unsafe, but you're not used to thinking about your code in that way.
Usually, the borrow checker is right, and you have a bug,
but it can be hard to learn to understand the borrow checker, and to work with it rather than against it.
In some sense, it can feel unfair, and hard, at times, even though it is trying to help you.
Rust also does not have an object-oriented model.
And for people who come from Python or C++ or Java — or many other languages like this —
who are used to thinking about classes and objects, this can be pretty alien.
It's a different way to write code.
You still have things like interfaces, although they have slightly different names and slightly different semantics
so you can express similar concepts, it's just that they don't work the same.
And this is often another barrier that you have to overcome in order to learn the language.
Mostly, these changes are made for very good reasons. Things like the zero cost abstractions we talked about,
and the ability to ensure thread safety at compile time. But the cost is that it takes longer to learn this language
than others you may have picked up over the years.
You can think of this as sort of upfront cognitive load; you have to spend more time getting your program
to compile. But when it compiles, it is more likely to run correctly.
And this is a shift in your mental model in programming that can be hard to get used to in the beginning.
The second is the ecosystem.
Rust is a relatively young language.
The ecosystem is somewhat small, and it has relatively few maintainers, although I will say
this is something that is changing. As we've seen increasing company support for Rust, and more
companies using Rust, the number of libraries is growing quite large and quite quickly.
The biggest place where we see this missing is in integration with enterprise libraries in particular.
Like integrations with things like enterprise databases such as Oracle or DB2 and those sorts of things,
where the open source community doesn't generally need those bindings, while an enterprise community would.
There are some really high quality libraries out there though. For example, Rust has one of the best regular
expression libraries of all in its crates.io ecosystem, both in terms of performance and API and functionality.
And so it's a space that's definitely worth watching. I've mentioned some of the libraries already like
rayon and serde, that can do things you just couldn't do in a library in other languages.
And that is really exciting.
There's also, as an artifact of this, a lot of churn in the ecosystem at the moment.
Because many of these crates are in the early stages, the APIs are changing, and you might have to work a
little bit to keep up. But we are seeing that, too, stabilize. And more and more of these libraries are now hitting
1.0: stable releases with stable APIs that you can then depend on.
I will also say, as a note on the ecosystem, the ecosystem is very friendly. In general I've found that
if you have trouble with Rust code, or if you're looking for a library, or you need support on a library, you will
usually find people very willing to help. You'll also find that the documentation is really good for most things.
Not all! But for most, for some of the reasons we've discussed already.
And that helps that transition into even relatively small and new libraries.
I've also touted this idea of no runtime as a benefit of Rust, and it really is, but there are some downsides to it.
The first of these is that when you don't have a runtime, you also don't get runtime reflection.
If your program is, like, in the middle of executing something, in Java you can just sort of attach to it
and then use the runtime to inspect various things about its execution.
In a program that doesn't have a runtime, you can't really do the same.
You can use gdb or lldb or any of these kinds of tools to try to tease out that information, but the lack of
a runtime means that that information is less rich. It also means that the program can't really introspect
itself at runtime. You don't have reflection on values in the same way you have in Java.
Oftentimes you don't need it because of the type system, but it is something to be aware of.
It also means you don't get runtime assisted debugging. Oftentimes you want a runtime there to help you inspect
values that you didn't know when you compile the program that you wanted to look at.
And you can get this, again, with gdb and those types of things, but it might be something you find that you miss.
So that's worth keeping in mind.
Compile times are another one — so, because Rust is a fully compiled language and uses LLVM, there's sort of a
long pipeline to get to the final binary, and everything has to be compiled.
The Rust compiler is getting pretty good at doing incremental compilation, so only compiling the things
that have changed. But you should be aware that the compile times are somewhat longer than what you
might be used to in something like Python, where there's just no compile time at all. Or if you come from Java or
C++ where many of the compilers have seen decades' worth of optimization to make them really fast.
Rust still has a little bit to go there.
And part of the challenge here is, in Rust, there aren't really pre-built libraries you can download.
For many of these other ecosystems you can download like a .so or a .dll file and just link against it, and then
you're done. In Rust, you can't really do that — you sort of need to compile from source.
And the reason is because of things like generics. If I want to call a function in your library, and your library
function is generic, I need to compile the version of your method for the type I'm using. And that type might be
defined in my library. So I need the source in order to do that compilation. This is something where other
languages have similar problems. Swift, for example, recently has introduced a proposal for how to do this.
But it turns out it's really hard to build these pre-compiled artifacts when you have a generic language.
The upside of course is that when you compile with generics, and you compile the source yourself,
you often get optimizations
that the compiler couldn't do if you just linked directly against a dynamically linked library.
Another one is vendor support.
Often, when you work with very large software projects you have dependencies on things that you didn't build.
That were just given to you by a vendor.
Intel has a library, or Oracle has a library, that you just have to use the way it is, and you can't rewrite it.
It's just a giant pile of C++, or even just a dynamically linked library file you just have to link with.
And there you run into some challenges doing that with Rust. The FFI interface works pretty well as long as the
library you get has an API that isn't too complicated. And you can always manually write the integration
if you wish. But sometimes, working with these large vendored things can be a bit of a pain. And wrapping
them can also be a bit of a pain. Although writing the glue is something that's possible, and also something
we've seen the ecosystem step up to. Often, for these vendor libraries, you will find people who have written
Rust libraries that are just sort of a shim to the C++ vendor library.
Similarly, if you work with a vendor that has some kind of tooling that they provide when you use their library
that tooling might not work if you're working through Rust code
because they might assume that you're just writing C++ code.
And so that's something worth keeping in mind if you are forced into a position where you have to
work with one of these giant piles of vendored stuff.
The last one here is Windows.
The Rust compiler and the Rust standard library have full Windows support.
And it's a tier one platform they have support for, and everything like that.
However the ecosystem, and the libraries in particular,
because it's mainly open source developers, is very Linux and macOS focused.
And there's relatively little support for Windows in at least some of these low level libraries.
For most of them, they just depend on the standard library, and get Windows support for free.
But for things that implement like, the asynchronous executors
or anything that interacts with the operating system directly, there you might find the Windows support
is a little bit more lacking, if that is important for your particular use case.
I want to end the presentation on a note about long term viability, because Rust is a relatively young language.
The ecosystem is young. And when choosing to adopt a language that's that young,
you want to think about whether it has the potential to keep going for a long time.
Is it sustainable to start using it for a big project you're considering?
And I would say that Rust is a pretty good bet into the future, and there are a couple of reasons for that.
First, it was rated the most loved programming language four years in a row in the StackOverflow
developer survey. That seems to be pretty promising.
Also, we've seen a lot of adoption by big companies in the recent years. This is everything from
Microsoft to Google to Facebook to Amazon to CloudFlare, Mozilla of course, Atlassian, even npm
are using Rust code in some of their stuff now.
And so this means that there's increasing company and enterprise buy-in into the language, which will both help
develop the enterprise part of the ecosystem, and help the language itself build more maturity over the years.
The fact that Rust has a really good interoperability story helps a lot too, because it means you don't have to
rewrite your project wholesale in Rust in order to get some of its benefits. You can rewrite that core
concurrency… that little core that needs to be highly concurrent in Rust, because you need the safety there
or you need the performance there, and then leave the rest of your application the way it was.
You can do this incremental rewrite, which helps reduce the risk a lot.
The increased company involvement in Rust itself is also pretty encouraging — we're seeing more developers
from these large companies help out with the standard library, help out with the compiler and the infrastructure
and that of course helps a lot too — it means that they have some buy-in into these continuing to be developed
and remaining and becoming even more high-quality.
And finally, the Rust, sort of, world is expanding a lot. There are now 10 yearly conferences spanning the globe
and more and more cropping up every year. There are just hundreds of meetups in big cities around the world
and so the developer community is expanding of people who either know Rust or are interested in Rust.
And I can say personally too, that Rust is the first language in many years that I'm excited to work with.
I actively want to rewrite old projects in Rust because I enjoy working with the language.
And that is not very common.
I also think that the Rust ecosystem has this really good focus on developer experience.
The Rust compiler developers have been excellent at writing good compiler error messages.
This might seem silly and trivial, but it's just… it makes such a big difference, especially when you're new
to the language. The compiler will tell you what you did wrong, and how to fix it. Rather than getting pages of
cryptic output from GCC, you now get a compiler that highlights your code and says
"that variable is misspelled, maybe you meant this". And that makes a huge difference when you're working
on a large refactor, or you're new to the language and not super familiar with the borrow checker.
This helps. And the fact that the Rust team focuses on this is a pretty good sign.
And with that, thanks for listening. I hope you will consider Rust at least, and that's all I have. Thank you!