Why Rust
By Tomasz Kuczma
Usually, I’m skeptical about new technologies and programming languages. I take every novelty with a pinch of salt. Not because of the steep learning curve and lack of time but because I saw too many examples where promises were not delivered. The majority of the new technologies provided only minor improvements which were not really worth the migration time.
When I heard about Rust for the first time, my feelings were the same: if you need a strongly typed, high-performance native language without GC, why not just use modern C++11 (or newer) instead of a new language? Is it really worth creating a new language instead of providing a C++ library that can be added to existing projects? A few months ago, I finally dived deeper into Rust and started writing some code in it. Rust surprised me very positively and became my choice for high-performance applications. In this article, I would like to present my take on why Rust.
The right language for the right job
First of all, I have to present my 4 categories of programming languages. I use them to decide which language is the right one to solve a particular problem. Rust is no exception, so let me present them to set the tone for this article.
Those categories are:
No. | Category | Task description | Language of my choice | Other languages |
---|---|---|---|---|
1 | Quick scripting and automation | Write a mini program to automate things and repeatable actions. E.g. a script that cleans the dev-env and rebuilds the entire application, a script that automates builds and deployments, a simple ETL pipeline, a simple job that runs once a day and checks a few services, logs, etc. Also, ad-hoc analysis, e.g. of logs and traffic. | Python, sometimes Bash (mostly aliases) | Ruby |
2 | Quick and robust business logic implementation | Write an enterprise application that needs to be available, maintained, and developed by a group of engineers. E.g. a web application, banking application, server application, robust ETL pipeline with critical data | Java or Kotlin (maybe even Rust) | C#, Scala |
3 | High performance | The performance of the application is a first-class citizen, one of the main requirements. E.g. a low latency application like high-frequency trading, the Linux kernel, a system-level application, a limited execution environment like embedded devices (Internet of Things), an expensive execution environment like GPU computing (Deep learning), a multimedia processing application (Youtube, gaming). Also, companies that run global scale systems (Google, AWS, Facebook) might need high performance to reduce the cost. Simply, even a 1% performance improvement gives huge savings if you multiply it by hundreds of thousands of servers. | Rust (before C++) | C, Assembler |
4 | Don’t have a choice | Sometimes you don’t have a choice. Somebody (e.g. architect) forces some solution or you simply join the team that maintains and develops e.g. Java application. There is no option to drop 10 years of development in Java and start fresh in Rust or Python. Or, you use a framework/solution that supports only 1 language (e.g. Lua scripts in Nginx). Also, if you want to do something specific e.g. transform one XML into another, it might be a good solution to just go with XSLT instead of reinventing the wheel. | Lua, XSLT, AWK… |
Note that the order of those categories matters. Every category raises the requirements significantly, from “it does the job” through “being robust” and “highly performant” to “it’s a specific solution”, and also increases development cost. I still have some trouble deciding whether Rust fully belongs to the second category, but it definitely belongs to the third. It might be something in between. You will see why by the end of this article.
Also, note that this table has a lot of personal preferences e.g. I didn’t put JavaScript there as I’m not a front-end engineer (for me it’d land in category 4, for others it’d be 2).
Why not modern C++?
Now, since you know that Rust is a language with high performance as a first-class citizen (no GC, compiled to native code), it is good to answer the question from the introduction. Is it worth mastering a new language instead of modern C++, which has been on the market for years? Where is the benefit?
The answer is yes, as it addresses so many concerns found in other programming languages (including C++) that it really brings new quality. It is a major upgrade in coding standards. It’s hard to explain all the reasons in just a few words as all of them are very mature decisions coming from experience. Moreover, all of Rust’s benefits are provided by a few core features, which testifies to conscious design tenets and provides the necessary simplicity. Let’s then start with the most obvious improvement over C++ - memory management.
Memory management in C++
Before I explain Rust memory management, you have to understand the problem with the C++ one.
C++ uses the RAII pattern.
In short, if you create an object in a scope, it will be destroyed at the end of that scope.
The compiler will take care of it (generate the proper code). It also prevents resource leaks (e.g. a memory leak or a lock leak) if an exception is thrown. The compiler will always generate the proper destruction code. E.g. this is how the std library leverages the RAII pattern in std::lock_guard:
#include <mutex>
#include <string>

void my_function(int foo) {
    static std::mutex mutex;
    std::lock_guard<std::mutex> lock(mutex); // lock is automatically acquired as part of lock_guard constructor
    std::string message = "Bar " + std::to_string(foo); // to_string is required: "Bar " + foo would be pointer arithmetic
    // critical section that can throw exception
    // lock is automatically released here as part of lock_guard destructor even if exception is thrown
    // no extra code is required
    // message string destructor is also called here by compiler and memory is released
}
If you return a value from a function, it is either copied to a new object (bytes are copied, e.g. an array or a structure is copied) or moved (the array pointer is copied to the new object and the old pointer is set to NULL) before the memory is freed. Of course, the compiler has room for optimization here (e.g. copy elision).
In theory, everything works. Problems start when you need to manage memory manually via pointers for performance reasons.
There are a few memory management bugs that can occur:
- Double free - you call free twice
- Use after free - you free memory and then use it
- Null pointer - you read the value of a NULL pointer (Java’s favorite)
- Use of uninitialized memory
- Memory leak - free was never called but you stopped using that memory, so it’s still considered used by your application from the OS point of view (note that this is not going to crash your app immediately, it “just” loses a resource over time, which might be acceptable for some use cases).
Some would say that if you use references in C++ that wouldn’t happen. Nothing could be more wrong. All those bugs can happen while using references too. Moreover, even though technically null references are illegal by C++ standards, nothing prevents you from creating the following code:
int& as_ref(int* ptr) {
    return *ptr;
}
int& null_ref = as_ref(nullptr);
This is extreme code to demonstrate the point. I’d imagine that in production code, the cast to a reference is hidden in thousands of lines of code.
So in the end, it is the programmers’ responsibility to manage memory properly, which might not be such a simple task, especially in complex, concurrent code.
Memory management in Rust
Rust also uses the RAII pattern to manage memory but uses different terminology than C++.
There is always a single object that “owns” memory and this is where RAII is applied.
You can copy that object (and corresponding memory) so you have 2 owners but each owns only 1 memory area.
You can also move ownership to a different method (e.g. via return).
Taking a reference to an object is called “borrowing” as this is de facto what you do - just borrow this memory area without taking ownership (and the corresponding obligation to free memory where it is not used).
A single owner solves the “double free” problem and prevents most “memory leaks” (a leak is still possible, e.g. through a cycle of reference-counted pointers, but it does not break memory safety).
What about “use after free”? Rust prevents the use of borrowed memory (a reference) after the memory is freed (the owner is destroyed) with a mechanism called the borrow checker, which tracks lifetimes. The compiler checks (based on variable scope) that a reference cannot live longer than the owner.
That’s it. This way Rust provides memory management at compile time! No garbage collection mechanism (like in Java) is used here and no extra bookkeeping code is generated (e.g. reference counting in smart pointers). This is the biggest benefit of Rust. It moves the responsibility of memory management from humans (programmers and code reviewers) to machines (the compiler) with zero overhead in the produced code. It lets code fail fast at compile time, at the very beginning of the release cycle, instead of failing in production. This also moves code review to a higher level since you don’t have to look for memory management bugs.
Technically, Rust provides a way to access unsafe raw pointers and some solutions for smart pointers (Rc and Arc) but I’ll mention that later.
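The ownership rules described above can be sketched in a few lines. This is a minimal illustration, not code from a real project: the function names total and consume are made up, and the commented-out lines show what the compiler would reject.

```rust
/// Borrows the values: the caller keeps ownership (hypothetical helper).
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Takes ownership: the vector is freed when this function returns (hypothetical helper).
fn consume(values: Vec<i32>) -> usize {
    values.len()
} // `values` is dropped here and its memory is released

fn main() {
    let data = vec![1, 2, 3]; // `data` owns the heap allocation
    let sum = total(&data); // shared borrow, ownership stays with `data`
    println!("sum = {sum}");

    let copy = data.clone(); // explicit copy: two owners, two memory areas
    let len = consume(data); // ownership moves into `consume`
    println!("len = {len}, copy = {copy:?}");

    // println!("{:?}", data); // would not compile: `data` was moved

    // let dangling: &i32;
    // {
    //     let short_lived = 5;
    //     dangling = &short_lived; // would not compile: `short_lived` does not live long enough
    // }
    // println!("{dangling}");
}
```

Uncommenting either rejected fragment turns a potential runtime bug (use after free) into a compilation error.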
Myth: Pros can manage memory in C++
Some people would say that if you are a good programmer you can manage memory in C++ without bugs so Rust is not needed. Unfortunately, it is not what the data says.
Philipp Oppermann analyzed Linux kernel CVEs (Common Vulnerabilities and Exposures) in his presentation. On slides 13 and 14, you can see that almost half of the bugs are related to memory management.
Moreover, Nature (in this article) says:
According to researchers at Microsoft, 70% of the security bugs that the company fixes each year relate to memory safety
IMHO, this myth is clearly busted :)
Rust design
Besides safe memory management, Rust also has a few other major features that make it a modern, robust language.
Zero cost struct
Let’s start with something big. In Rust, you can create a new type (struct) to increase the safety and readability of your code without paying performance cost at runtime.
struct Pair {
    x: i32,
    y: i32,
}

impl Pair {
    fn sum(&self) -> i32 {
        self.x + self.y
    }
}

pub fn print_sum(x: i32, y: i32) -> i32 {
    let p = Pair { x, y };
    p.sum()
}

pub fn print_sum_raw(x: i32, y: i32) -> i32 {
    x + y
}
In the above code, I created a new type Pair, constructed an object, and called a method that sums all the fields.
Rust compiles that to exactly the same native code as the “raw” version, where I just use the + operator.
The assembly code generated for this is:
example::print_sum:
lea eax, [rdi + rsi]
ret
example::print_sum_raw:
lea eax, [rdi + rsi]
ret
You can play with it yourself in this online tool.
Of course, gcc -O3 will do the same for C++ code in this simple example.
That feature can prevent mistakes worth millions of dollars without any runtime cost. I mentioned Rust as a solution for them in one of my previous articles: How the type error cost NASA $327 million.
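To make the idea of a dedicated type catching such mistakes concrete, here is a minimal sketch. The unit types Meters and Feet are hypothetical and chosen only for illustration; the point is that two quantities backed by the same primitive type can no longer be mixed by accident.

```rust
use std::ops::Add;

/// Hypothetical unit types; each wraps a plain f64.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(f64);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Feet(f64);

impl Add for Meters {
    type Output = Meters;
    fn add(self, other: Meters) -> Meters {
        Meters(self.0 + other.0)
    }
}

impl From<Feet> for Meters {
    fn from(feet: Feet) -> Meters {
        Meters(feet.0 * 0.3048) // 1 foot is 0.3048 m
    }
}

fn main() {
    let altitude = Meters(1000.0);
    let correction = Feet(100.0);

    // let wrong = altitude + correction; // would not compile: mismatched types
    let total = altitude + Meters::from(correction); // the conversion has to be explicit
    println!("{total:?}");
}
```

Each wrapper holds a single f64, so, as with Pair above, I would expect an optimizing build to treat it like the bare number.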
Note that the Rust compiler is based on LLVM, so improvements made in LLVM benefit Rust programs too. The compiler can do tons of optimizations thanks to immutability (see next section) and the single-owner pattern. Rust also keeps values on the stack by default; heap allocation has to be requested explicitly.
Immutability by default
Variables in Rust are immutable by default.
That aligns with the modern approach to programming, increases code safety, and is checked by the compiler (e.g. compilation fails if you take a mutable reference to an immutable object).
You have to add the mut keyword to create a mutable variable.
let immutable_counter = 0;
immutable_counter += 1; // fails at compilation
let mut counter = 0;
counter += 1; // works fine
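The same rule applies to references. A small sketch, assuming a made-up helper named increment: a function that wants to modify its argument has to ask for a mutable borrow, and the caller can only hand one out for a variable declared with mut.

```rust
/// Hypothetical helper: needs a mutable borrow to change the counter.
fn increment(counter: &mut u32) {
    *counter += 1;
}

fn main() {
    let frozen = 0;
    // increment(&mut frozen); // would not compile: cannot borrow `frozen` as mutable

    let mut counter = 0;
    increment(&mut counter);
    increment(&mut counter);
    println!("frozen = {frozen}, counter = {counter}");
}
```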
Non-nullability by default
This is the feature that I’ve also fallen in love with as an experienced Java engineer and the reason why I started looking into Kotlin in the past.
Unlike C++, safe Rust has no notion of uninitialized memory or NULL.
Simply, memory always has to be initialized before it is used and this is checked by the compiler.
So how can we express optional values?
Just as Option:
let mut foo = None; // no value, equivalent of `Option::None`
foo = Some(10); // value 10, equivalent of `Option::Some`
It’s similar to Java’s Optional but, unlike in Java (and Kotlin’s platform types), you can never assign null to a “normal” variable.
That gives extra safety and avoids hundreds of lines of “null checks” in every public method, as I saw people do in Java (paranoid defensive programming).
Certainly, Optional was introduced to Java to give a clear way to express that a method returns “no result”.
But still, the compiler does not stop us from assigning null to an Optional in a way that is “forbidden” by the specification:
import java.util.Optional;
...
Optional<Boolean> foo;
foo = Optional.of(true); // state 1
foo = Optional.of(false); // state 2
foo = Optional.empty(); // state 3
foo = null; // state 4
At some point, somebody in a rush can take a shortcut and express the 4th state with null instead of introducing a new enum type, and the compiler will accept that.
In Rust, this is not possible.
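A short sketch of how an Option is typically consumed. The function names and the port table below are made up for illustration; the relevant part is that the “no value” case has to be handled before the inner value can be used.

```rust
/// Hypothetical lookup: returns the default port for a known service name.
fn default_port(service: &str) -> Option<u16> {
    match service {
        "http" => Some(80),
        "https" => Some(443),
        _ => None,
    }
}

/// Hypothetical helper that turns the lookup result into a message.
fn describe(service: &str) -> String {
    // The compiler rejects a `match` that forgets the `None` arm.
    match default_port(service) {
        Some(port) => format!("{service} uses port {port}"),
        None => format!("{service} has no default port"),
    }
}

fn main() {
    println!("{}", describe("https"));
    println!("{}", describe("gopher"));

    // A fallback value can be supplied without any null check.
    let port = default_port("gopher").unwrap_or(8080);
    println!("fallback port: {port}");
}
```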
Enums
In fact, the Option mentioned above is not a language keyword but just an enum type.
Enums in Rust are a mix of a C-like union and a C-like enum.
Option is defined as:
pub enum Option<T> {
    None,
    Some(T),
}
So you have an enumerated type with the values None and Some, but a Some value also carries extra associated data.
It is not possible to read the “T” value if the Option is None.
In this example, a tuple-based variant is used but a struct-based variant (one that stores a struct) is possible too.
If you have ever played with unions in C/C++, you know that you usually also have to create an enum to know which type is really stored in the union.
Rust does that in one type definition, which also increases the safety of the code as you cannot interpret the value incorrectly.
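As a sketch of the variants mentioned above, here is a made-up Shape enum (not taken from any library) that combines a variant without data, a tuple-based variant, and a struct-based variant. The match statement is the only way to reach the data, so the stored value cannot be read as the wrong type.

```rust
/// Hypothetical type mixing a unit variant, a tuple variant, and a struct variant.
#[derive(Debug)]
enum Shape {
    Empty,
    Circle(f64),
    Rectangle { width: f64, height: f64 },
}

fn area(shape: &Shape) -> f64 {
    // Every variant has to be covered, otherwise the code does not compile.
    match shape {
        Shape::Empty => 0.0,
        Shape::Circle(radius) => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
    }
}

fn main() {
    let shapes = [
        Shape::Empty,
        Shape::Circle(1.0),
        Shape::Rectangle { width: 2.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("{shape:?} has area {}", area(shape));
    }
}
```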
Exceptions and errors handling
Rust does not handle exceptions in the “classic” way known from C++ or Java.
All errors (exceptions) have to be returned by the function, and to achieve that Rust leverages an enum (similar to Option):
pub enum Result<T, E> {
    Ok(T),
    Err(E),
}
This aligns with Rust’s defensive approach as all errors need to be handled explicitly and there are no hidden unchecked exceptions like in Java.
It looks similar to Go’s approach but, in my opinion, it is taken to the next level.
Note that not every method needs to return Result as not every method produces errors:
fn try_div(a: u32, b: u32) -> Result<u32, &'static str> {
    if b == 0 {
        Err("Division by zero")
    } else {
        Ok(a / b)
    }
}

fn sub(a: u32, b: u32) -> u32 {
    a - b
}
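On the caller’s side, a Result is usually either matched or propagated with the ? operator. The sketch below is only an illustration under my own assumptions: the CalcError type and the parse_and_div function are hypothetical, and only the standard library is used.

```rust
use std::num::ParseIntError;

/// Hypothetical error type covering both failure modes.
#[derive(Debug, PartialEq)]
enum CalcError {
    Parse(ParseIntError),
    DivisionByZero,
}

impl From<ParseIntError> for CalcError {
    fn from(err: ParseIntError) -> Self {
        CalcError::Parse(err)
    }
}

fn parse_and_div(a: &str, b: &str) -> Result<u32, CalcError> {
    let a: u32 = a.trim().parse()?; // `?` returns early with the converted error
    let b: u32 = b.trim().parse()?;
    if b == 0 {
        return Err(CalcError::DivisionByZero);
    }
    Ok(a / b)
}

fn main() {
    for (a, b) in [("10", "2"), ("10", "0"), ("ten", "2")] {
        match parse_and_div(a, b) {
            Ok(value) => println!("{a} / {b} = {value}"),
            Err(err) => println!("{a} / {b} failed: {err:?}"),
        }
    }
}
```

The error path stays visible in the function signature, yet the happy path reads almost like code that uses exceptions.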
UTF-8 strings
Rust’s Strings are UTF-8 encoded.
That is a good choice for the performance of network-based applications, as this is how text data is usually transferred over the wire.
Unlike Java, which keeps Strings UTF-16 encoded (or Latin-1 if you use compact strings since Java 9), Rust avoids transcoding the representation from/to UTF-8.
It is also ASCII compatible!
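A small sketch of what that means in practice, assuming a hypothetical decode helper that receives raw bytes (e.g. from a socket). The bytes are validated in place and viewed as text without creating a second representation.

```rust
/// Hypothetical helper: interprets bytes received from the network as text.
fn decode(payload: &[u8]) -> Result<&str, std::str::Utf8Error> {
    // Validates the bytes in place; no copy and no transcoding happen here.
    std::str::from_utf8(payload)
}

fn main() {
    let word = "caf\u{e9}"; // "café", the last letter is outside ASCII
    println!("bytes: {}, chars: {}", word.len(), word.chars().count());

    // Pure ASCII input is already valid UTF-8.
    println!("{:?}", decode(b"GET /index.html"));

    // 0xC3 0xA9 is the UTF-8 encoding of U+00E9.
    println!("{:?}", decode(&[b'c', b'a', b'f', 0xC3, 0xA9]));

    // A lone 0xFF byte is never valid UTF-8, so decoding fails explicitly.
    println!("{}", decode(&[0xFF]).is_err());
}
```

Note that len() counts bytes, not characters, which is the price of the compact encoding.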
Slices (C++’s array views, std::span, std::string_view)
Rust provides the concept of a “slice”, which looks very similar to array views in C++ (implemented as std::span and std::string_view).
Basically, the idea is to provide a subview of an array so you can borrow only a fragment of the array (without copying memory).
It’s done by storing a pointer to the first element and the length of the sub-array.
Simple use cases are e.g. skipping a prefix of a string or the first array element, splitting a String, grouping array elements into fixed batches, etc.
fn parse(input: &str) -> Result<(), &'static str> {
    // input is a string slice
    let split = input.split_once('=');
    match split {
        Some((left, right)) => {
            // left and right are slices of input
            // ... do something with them
            Ok(())
        }
        None => Err("Missing delimiter ="),
    }
}

fn main() {
    let result = parse("foo=bar");
    println!("{:?}", result);
}
A careful reader will notice that it is very useful for fast parsing of data streams.
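The other use cases listed above (skipping a prefix, skipping the first element, fixed batches) can be sketched with standard library methods only. The helper names batch_sums and without_prefix are made up for this example.

```rust
/// Hypothetical helper: sums each fixed-size batch of the input.
/// Note: `chunks` panics if `batch_size` is 0.
fn batch_sums(values: &[u32], batch_size: usize) -> Vec<u32> {
    values
        .chunks(batch_size) // each chunk is a slice into `values`, nothing is copied
        .map(|batch| batch.iter().sum())
        .collect()
}

/// Hypothetical helper: returns the part of `line` after a known prefix.
fn without_prefix<'a>(line: &'a str, prefix: &str) -> &'a str {
    line.strip_prefix(prefix).unwrap_or(line)
}

fn main() {
    let readings = [1, 2, 3, 4, 5, 6, 7];

    let tail = &readings[1..]; // skip the first element
    println!("{tail:?}");

    println!("{:?}", batch_sums(&readings, 3));
    println!("{}", without_prefix("ERROR: disk full", "ERROR: "));
}
```

The lifetime 'a in without_prefix tells the compiler that the returned slice borrows from line, so it cannot outlive the original string.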
Macros made right
Unlike C++’s preprocessor macros, Rust’s macros do not work on plain text: they operate on parsed tokens and syntax fragments.
That fits the defensive approach and helps to write robust metaprograms.
It is not a side feature!
It is leveraged in the core libraries.
Even println! is a macro, so the compiler can check that the format string matches the number of arguments.
Macros are also used to design DSLs (domain-specific languages) in libraries like structopt.
Look how simple argument parsing is with the structopt macros:
use structopt::StructOpt;

#[derive(Debug, StructOpt)]
#[structopt(name = "MyApp", about = "This is my super app")]
struct Opt {
    #[structopt(short = "d", long = "debug")]
    debug: bool,
    #[structopt(short = "f", long = "file")]
    input_file: String,
}

fn main() {
    let opt = Opt::from_args();
    println!("Parsed arguments {:?}", opt);
}
What I love about them is the simplicity of code generation, similar to the annotation-based approach in Java (especially Lombok and the Spring framework). You can read more about macros in the official doc.
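To show the difference from text substitution, here is a minimal sketch of two declarative macros. Both square! and max_of! are made-up names written for this example; they are not part of the standard library.

```rust
/// Hypothetical macro: multiplies an expression by itself.
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

/// Hypothetical macro: returns the largest of one or more expressions.
macro_rules! max_of {
    // Base case: a single expression.
    ($x:expr) => { $x };
    // Recursive case: compare the first expression with the max of the rest.
    ($x:expr, $($rest:expr),+) => {{
        let first = $x;
        let others = max_of!($($rest),+);
        if first > others { first } else { others }
    }};
}

fn main() {
    // `1 + 2` is passed as one expression, so the result is 9.
    // A text-based macro would expand to `1 + 2 * 1 + 2`, which is 5.
    println!("{}", square!(1 + 2));

    let a = 3;
    println!("{}", max_of!(a, 7, 5));
}
```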
Compiler & Documentation
Rust provides some of the best documentation I’ve ever seen. In fact, for the first time, I’m willing to look things up in the documentation instead of just “guessing” methods from IDE hints. The compiler also keeps the bar very high. It has meaningful error descriptions and hints on how to fix errors. It’s a completely different world compared to a C++ compiler that can produce output looking like “random characters”, especially for templates.
Minor
There are also some minor things I like:
- Rust programs are drop-in replacements for C++ apps (native apps). If you can run C++ code, you can replace that app with a Rust one.
- It is a high-level object-oriented language that allows entering the low level in a safe way when needed.
- It is more likely that an average Rust app will be faster than a C++ one because you can experiment with optimizations and fail safely at compile time instead of in production. The risk in case of failure is low, so programmers should be more willing to optimize.
- Rust comes with a build and dependency management system named Cargo. It also has some default approaches for testing. Adding a new library is just 1 line. That definitely simplifies the work compared to C++ and CMake (IMHO).
Grain of salt
I’m not going to lie. I’m really impressed with Rust and that is visible in this article. But still, it has some (small IMHO) drawbacks so let me mention them to be fair:
- The size of a “hello world” program is 3.2 MB as Rust uses static linking by default (including the Rust standard library). Fortunately, you can use a compiler switch to bypass that:
cargo rustc --release -- -C prefer-dynamic
- Binding to C/C++ is, of course, error-prone - mostly because of uncaught exceptions.
- Generally, every departure from the “safe” world is painful and, of course, risky. I experienced that in my program that uses the low-level X11 Xlib API.
- There is a steep learning curve to learn the Rust way - the lack of the classic inheritance model and exceptions known from C++ and Java.
Intentionally skipped
I intentionally skipped some Rust features that are worth mentioning if you are going to learn this language. If you do so, I would leave them for the end:
- raw pointers and unsafe
- Rc, Arc aka smart pointers (reference counting)
- asynchronous Rust and the Tokio library
Who is using Rust?
It is always good to know if other people are looking into similar things. That helps judge if a technology is just a temporary trend or something bigger. Right now, many FAANG companies are using Rust or adopting it:
- Facebook used Rust to rewrite their code repository system: https://engineering.fb.com/2021/04/29/developer-tools/rust/
- Google is going to introduce support in Android: https://security.googleblog.com/2021/05/integrating-rust-into-android-open.html?m=1
- Tilde reported that they reduced memory consumption in one of their services from 5GB to 50MB by migrating from Java to Rust: https://www.rust-lang.org/static/pdfs/Rust-Tilde-Whitepaper.pdf
- Nature reports that scientists are moving into Rust for performance and safety: https://www.nature.com/articles/d41586-020-03382-2
It doesn’t look like temporary hype.
Summary
Rust is a language that delivers performance and provides multi-level safety at compile time at its core.
All the features are coherently designed around it (e.g. immutability by default, the borrow checker mechanism, Option and errors being enums). All of that composes a solid language that provides high-level mechanisms (metaprogramming, generics, objects, code generation) and performance (slices, zero cost structs) when needed.
Benefits:
- Strongly and statically typed
- Compiles to native code
- GC free, zero overhead safe memory management checked at compile time
- Embedded safety on multiple levels
- Zero runtime overhead on those mechanisms
In my opinion, Rust is the future of high-performance app development (language category 3) as it provides the safety that C++ is missing. It might also become a popular solution for quick and robust business logic implementation (language category 2) as it definitely has the potential to do so.
Let me know in the comments what you think about Rust. Have you used it yet?
Software engineer with a passion. Interested in computer networks and large-scale distributed computing. He loves to optimize and simplify software on various levels of abstraction starting from memory ordering through non-blocking algorithms up to system design and end-user experience. Geek. Linux user.