Rust Smart Pointers
- Box<T> has a single owner.
- Box<T> lets you store data on the heap.
- Box<T> is a pointer to data on the heap.
- When a box goes out of scope, it is deallocated: both the box itself (on the stack) and the data it points to (on the heap).
- Implementing the Deref trait lets you customize the behavior of the dereference operator *.
- When a type implements the Drop trait, Rust automatically calls drop when an instance goes out of scope.
- You are not allowed to call the Drop trait’s drop method explicitly.
- You can drop a value early with std::mem::drop.
- Variables are dropped in the reverse order of their creation.
- Rc<T> allows a single value to have multiple owners.
- Rc<T> allows only immutable borrows, checked at compile time.
- Rc<T> gives only immutable access to its data.
- Rc<T> is only for use in single-threaded scenarios.
- Rc::clone does not make a deep copy, so it does not take much time.
- The Drop trait decreases the reference count automatically when an Rc<T> value goes out of scope.
- RefCell<T> has a single owner.
- RefCell<T> allows immutable or mutable borrows, checked at runtime.
- RefCell<T> panics at runtime when you try to violate the borrowing rules.
- RefCell<T> is only for use in single-threaded scenarios.
1. Using Box to Point to Data on the Heap
Boxes let you store data on the heap rather than the stack.
You will use them most often in these situations:
- When you have a type whose size can’t be known at compile time
- When you have a large amount of data and want to transfer ownership but ensure the data won’t be copied
- When you want to own a value and care only that it’s a type that implements a particular trait rather than being of a specific type
fn main() {
    let b = Box::new(5);
    println!("b = {}", b);
}
Here we define the variable b to have the value of a Box that points to the value 5, which is allocated on the heap. When a box goes out of scope, as b does at the end of main, it is deallocated. The deallocation happens both for the box (stored on the stack) and the data it points to (stored on the heap).
// This code does not compile
enum List {
    Cons(i32, List),
    Nil,
}
use crate::List::{Cons, Nil};
fn main() {
    let list = Cons(1, Cons(2, Cons(3, Nil)));
}
This code does not compile because Rust can’t figure out how much space to allocate for recursively defined types.
To fix this, put a Box<T> inside the Cons variant instead of another List value directly. The Box<T> points to the next List value, which lives on the heap rather than inside the Cons variant.
enum List {
    Cons(i32, Box<List>),
    Nil,
}
use crate::List::{Cons, Nil};
fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
}
Because a Box<T> is a pointer, Rust always knows how much space a Box<T> needs: a pointer’s size doesn’t change based on the amount of data it’s pointing to.
We now know that any List value will take up the size of an i32 plus the size of a box’s pointer data. By using a box, we have broken the infinite, recursive chain, so the compiler can figure out the size it needs to store a List value.
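As a rough check of this claim, std::mem::size_of reports the sizes the compiler computed. The sketch below reuses the List definition from above; the helper names box_size and list_size are made up for this illustration. The exact number printed for List depends on the target and on padding, so no output is shown.

```rust
use std::mem::size_of;

// Same definition as above: the recursive field sits behind a Box.
#[allow(dead_code)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

/// Size in bytes of a box pointing to a List (hypothetical helper).
fn box_size() -> usize {
    size_of::<Box<List>>()
}

/// Size in bytes of a single List value (hypothetical helper).
fn list_size() -> usize {
    size_of::<List>()
}

fn main() {
    // A Box<List> is one pointer wide, no matter how long the list is.
    println!("Box<List> takes {} bytes", box_size());
    // A List value has room for an i32 and a box, plus any padding.
    println!("List takes {} bytes", list_size());
}
```

The size of the box matches the size of a plain pointer, which is why the compiler can finish computing the size of List.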
Boxes provide only the indirection and heap allocation; they don’t have any other special capabilities.
The Box<T> type is a smart pointer because it implements the Deref trait, which allows Box<T> values to be treated like references. When a Box<T> value goes out of scope, the heap data that the box is pointing to is cleaned up as well because of the Drop trait implementation.
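The third situation from the list at the start of this section (owning a value and caring only about the trait it implements) can be sketched as follows. The Shape trait, the Square and Rect types, and the total_area function are hypothetical names invented for this example.

```rust
// Hypothetical trait and types, for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Each element is a box owning some value that implements Shape.
// The concrete types differ, but every box has the same size.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Rect(2.0, 3.0))];
    println!("total area = {}", total_area(&shapes));
}
```

Without the box, a Vec could not hold Square and Rect side by side, because their sizes differ.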
2. Treating Smart Pointers Like Regular References with the Deref Trait
Implementing the Deref trait lets you customize the behavior of the dereference operator *. By implementing Deref in such a way that a smart pointer can be treated like a regular reference, you can write code that operates on references and use that code with smart pointers too.
// This code does not compile
// Using MyBox<T> like a reference
struct MyBox<T>(T);
impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}
fn main() {
    let x = 5;
    let y = MyBox::new(x);
    assert_eq!(5, x);
    assert_eq!(5, *y); // error: type `MyBox<{integer}>` cannot be dereferenced
}
The MyBox<T> type can’t be dereferenced because we haven’t implemented that ability on our type. To enable dereferencing with the * operator, we implement the Deref trait.
use std::ops::Deref;
impl<T> Deref for MyBox<T> {
    type Target = T;
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
Without the Deref trait, the compiler can only dereference & references. The deref method gives the compiler the ability to take a value of any type that implements Deref and call the deref method to get a & reference that it knows how to dereference.
When we entered *y, behind the scenes Rust actually ran this code: *(y.deref()).
Each time we use a * in our code, the * operator is replaced with a call to the deref method and then a call to the * operator, just once. Because the substitution of the * operator does not recurse infinitely, we end up with data of type i32, which matches the 5 in assert_eq!.
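Putting the pieces of this section together, a self-contained sketch might look like the following. It reuses MyBox<T> as defined above and checks that *y and *(y.deref()) give the same value and refer to the same place in memory.

```rust
use std::ops::Deref;

struct MyBox<T>(T);

impl<T> MyBox<T> {
    fn new(x: T) -> MyBox<T> {
        MyBox(x)
    }
}

impl<T> Deref for MyBox<T> {
    type Target = T;

    // Hand out a plain reference to the wrapped value.
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

fn main() {
    let y = MyBox::new(5);
    // What we write:
    assert_eq!(5, *y);
    // What the compiler runs behind the scenes:
    assert_eq!(5, *(y.deref()));
    // Both forms reach the same i32 inside the box.
    assert!(std::ptr::eq(&*y, y.deref()));
    println!("both forms read {}", *y);
}
```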
Implicit Deref Coercions with Functions and Methods
Deref coercion converts a reference to a type that implements the Deref trait into a reference to another type. For example, deref coercion can convert &String to &str because String implements the Deref trait such that it returns &str.
fn hello(name: &str) {
    println!("Hello, {name}!");
}
fn main() {
    let m = MyBox::new(String::from("Rust"));
    // &MyBox<String> is coerced to &String, then to &str
    hello(&m);
    // Without deref coercion, we would have to write:
    // hello(&(*m)[..]);
}
When the Deref trait is defined for the types involved, Rust will analyze the types and use Deref::deref as many times as necessary to get a reference to match the parameter’s type. The number of times that Deref::deref needs to be inserted is resolved at compile time, so there is no runtime penalty for taking advantage of deref coercion.
How Deref Coercion Interacts with Mutability
Rust performs deref coercion in three cases:
- From &T to &U when T: Deref<Target=U>
- From &mut T to &mut U when T: DerefMut<Target=U>
- From &mut T to &U when T: Deref<Target=U>
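The three cases can be exercised with a small sketch. Wrapper, len_of, and shout are hypothetical names used only here; Wrapper implements both Deref and DerefMut with String as its target.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical wrapper type, for illustration only.
struct Wrapper(String);

impl Deref for Wrapper {
    type Target = String;

    fn deref(&self) -> &String {
        &self.0
    }
}

impl DerefMut for Wrapper {
    fn deref_mut(&mut self) -> &mut String {
        &mut self.0
    }
}

fn len_of(s: &str) -> usize {
    s.len()
}

fn shout(s: &mut String) {
    s.push('!');
}

fn main() {
    let mut w = Wrapper(String::from("hi"));
    // Case 1: &Wrapper is coerced to &String and then to &str.
    println!("length = {}", len_of(&w));
    // Case 2: &mut Wrapper is coerced to &mut String through DerefMut.
    shout(&mut w);
    // Case 3: &mut Wrapper is coerced to an immutable &str.
    println!("length = {}", len_of(&mut w));
}
```

There is no fourth case going from an immutable reference to a mutable one: that would require the immutable reference to be the only reference to the data, which the borrowing rules don’t guarantee.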
3. Running Code on Cleanup with the Drop Trait
You specify the code to run when a value goes out of scope by implementing the Drop trait. The Drop trait requires you to implement one method named drop that takes a mutable reference to self.
struct CustomSmartPointer {
    data: String,
}
impl Drop for CustomSmartPointer {
    fn drop(&mut self) {
        println!("Dropping CustomSmartPointer with data `{}`!", self.data);
    }
}
fn main() {
    let c = CustomSmartPointer {
        data: String::from("my stuff"),
    };
    let d = CustomSmartPointer {
        data: String::from("other stuff"),
    };
    println!("CustomSmartPointers created.");
}
Rust doesn’t let us call the Drop trait’s drop method explicitly, because Rust would still automatically call drop on the value at the end of main. This would cause a double free error because Rust would be trying to clean up the same value twice.
Dropping a Value Early with std::mem::drop
fn main() {
    let c = CustomSmartPointer {
        data: String::from("some data"),
    };
    println!("CustomSmartPointer created.");
    drop(c); // std::mem::drop is in the prelude
    println!("CustomSmartPointer dropped before the end of main.");
}
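To see both rules from this section at once (an early drop with std::mem::drop and reverse-order drops at the end of a scope), here is a minimal sketch. Noisy and drop_order are made-up names, and the shared log uses Rc<RefCell<...>>, which sections 4 and 5 cover.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical type that records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the names in the order the values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Noisy { name: "a", log: Rc::clone(&log) };
        let b = Noisy { name: "b", log: Rc::clone(&log) };
        let _c = Noisy { name: "c", log: Rc::clone(&log) };
        drop(b); // std::mem::drop: b is cleaned up right here
    } // end of scope: _c is dropped first, then _a
    let order = log.borrow().clone();
    order
}

fn main() {
    println!("{:?}", drop_order()); // prints ["b", "c", "a"]
}
```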
4. Rc, the Reference Counted Smart Pointer
You enable multiple ownership explicitly by using the Rust type Rc<T>, which is an abbreviation for reference counting. The Rc<T> type keeps track of the number of references to a value to determine whether or not the value is still in use. If there are zero references to a value, the value can be cleaned up without any references becoming invalid.
We use the Rc<T> type when we want to allocate some data on the heap for multiple parts of our program to read and we can’t determine at compile time which part will finish using the data last. If we knew which part would finish last, we could just make that part the data’s owner, and the normal ownership rules enforced at compile time would take effect.
Rc<T> is only for use in single-threaded scenarios.
Using Rc to Share Data
Suppose two lists, b and c, both need to share ownership of a third list, a. Trying to implement this scenario using our definition of List with Box<T> won’t work:
// This code does not compile
enum List {
    Cons(i32, Box<List>),
    Nil,
}
use crate::List::{Cons, Nil};
fn main() {
    let a = Cons(5, Box::new(Cons(10, Box::new(Nil))));
    let b = Cons(3, Box::new(a)); // `a` is moved into `b` here
    let c = Cons(4, Box::new(a)); // error: use of moved value: `a`
}
When we try to use a again when creating c, we are not allowed to because a has been moved.
enum List {
    Cons(i32, Rc<List>),
    Nil,
}
use crate::List::{Cons, Nil};
use std::rc::Rc;
fn main() {
    let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
    let b = Cons(3, Rc::clone(&a));
    let c = Cons(4, Rc::clone(&a));
}
We change the definition of List to use Rc<T> in place of Box<T>. Each Cons variant now holds a value and an Rc<T> pointing to a List. When we create b, instead of taking ownership of a, we clone the Rc<List> that a is holding, thereby increasing the number of references from one to two and letting a and b share ownership of the data in that Rc<List>. We also clone a when creating c, increasing the number of references from two to three. Every time we call Rc::clone, the reference count to the data within the Rc<List> increases, and the data won’t be cleaned up unless there are zero references to it.
We could have called a.clone() rather than Rc::clone(&a), but Rust’s convention is to use Rc::clone in this case. The implementation of Rc::clone doesn’t make a deep copy of all the data like most types’ implementations of clone do. The call to Rc::clone only increments the reference count, which doesn’t take much time. Deep copies of data can take a lot of time. By using Rc::clone for reference counting, we can visually distinguish between the deep-copy kinds of clones and the kinds of clones that increase the reference count.
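The difference between the two kinds of clone can be observed directly. In this sketch, compare_clones is a hypothetical helper: it clones an Rc<Vec<i32>> handle with Rc::clone, deep-copies the inner Vec with its own clone, and compares where the data lives.

```rust
use std::rc::Rc;

// Returns (handles share one allocation,
//          deep copy has its own buffer,
//          strong count after Rc::clone).
fn compare_clones() -> (bool, bool, usize) {
    let a = Rc::new(vec![1, 2, 3]);
    // Reference-count clone: a new handle to the same data.
    let b = Rc::clone(&a);
    // Deep copy: clone the Vec itself into a fresh heap buffer.
    let deep: Vec<i32> = (*a).clone();
    (
        Rc::ptr_eq(&a, &b),
        a.as_slice().as_ptr() != deep.as_slice().as_ptr(),
        Rc::strong_count(&a),
    )
}

fn main() {
    let (shared, separate, count) = compare_clones();
    println!("Rc::clone shares the allocation: {shared}");
    println!("Vec::clone copies the elements: {separate}");
    println!("strong count = {count}");
}
```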
Cloning an Rc<T> Increases the Reference Count
fn main() {
    let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil)))));
    println!("count after creating a = {}", Rc::strong_count(&a));
    let b = Cons(3, Rc::clone(&a));
    println!("count after creating b = {}", Rc::strong_count(&a));
    {
        let c = Cons(4, Rc::clone(&a));
        println!("count after creating c = {}", Rc::strong_count(&a));
    }
    println!("count after c goes out of scope = {}", Rc::strong_count(&a));
}
At each point in the program where the reference count changes, we print the reference count, which we get by calling the Rc::strong_count function.
This code prints the following:
count after creating a = 1
count after creating b = 2
count after creating c = 3
count after c goes out of scope = 2
The Drop trait decreases the reference count automatically when an Rc<T> value goes out of scope. Using Rc<T> allows a single value to have multiple owners, and the count ensures that the value remains valid as long as any of the owners still exist.
5. RefCell and the Interior Mutability Pattern
Interior mutability is a design pattern in Rust that allows you to mutate data even when there are immutable references to that data. To mutate data, the pattern uses unsafe code inside a data structure to bend Rust’s usual rules that govern mutation and borrowing. Unsafe code indicates to the compiler that we’re checking the rules manually instead of relying on the compiler to check them for us.
Enforcing Borrowing Rules at Runtime with RefCell
The RefCell<T> type represents single ownership over the data it holds.
With references and Box<T>, the borrowing rules’ invariants are enforced at compile time. With RefCell<T>, these invariants are enforced at runtime. With references, if you break these rules, you’ll get a compiler error. With RefCell<T>, if you break these rules, your program will panic and exit.
Because RefCell<T> allows mutable borrows checked at runtime, you can mutate the value inside the RefCell<T> even when the RefCell<T> is immutable.
RefCell<T> is only for use in single-threaded scenarios and will give you a compile-time error if you try using it in a multithreaded context.
Mutating the value inside an immutable value is the interior mutability pattern.
Interior Mutability: A Mutable Borrow to an Immutable Value
// This code does not compile
fn main() {
    let x = 5;
    let y = &mut x; // error: cannot borrow as mutable
}
Using RefCell<T> is one way to get interior mutability, but RefCell<T> doesn’t get around the borrowing rules completely: the borrow checker in the compiler allows this interior mutability, and the borrowing rules are checked at runtime instead. If you violate the rules, you’ll get a panic! instead of a compiler error.
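In its smallest form, interior mutability might look like this sketch. The bump function is a hypothetical helper: it receives only a shared reference to a RefCell<i32>, yet it can change the value inside.

```rust
use std::cell::RefCell;

// Hypothetical helper: increments the value behind a shared reference
// and returns the new value.
fn bump(counter: &RefCell<i32>) -> i32 {
    // borrow_mut hands out mutable access, checked at runtime.
    *counter.borrow_mut() += 1;
    // The mutable borrow ended with the statement above,
    // so an immutable borrow is allowed here.
    *counter.borrow()
}

fn main() {
    // Note that x is not declared with `mut`.
    let x = RefCell::new(5);
    println!("after first bump: {}", bump(&x));
    println!("after second bump: {}", bump(&x));
}
```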
Example:
pub trait Messenger {
    fn send(&self, msg: &str);
}
pub struct LimitTracker<'a, T: Messenger> {
    messenger: &'a T,
    value: usize,
    max: usize,
}
impl<'a, T> LimitTracker<'a, T>
where
    T: Messenger,
{
    pub fn new(messenger: &'a T, max: usize) -> LimitTracker<'a, T> {
        LimitTracker {
            messenger,
            value: 0,
            max,
        }
    }
    pub fn set_value(&mut self, value: usize) {
        self.value = value;
        let percentage_of_max = self.value as f64 / self.max as f64;
        if percentage_of_max >= 1.0 {
            self.messenger.send("Error: You are over your quota!");
        } else if percentage_of_max >= 0.9 {
            self.messenger
                .send("Urgent warning: You've used up over 90% of your quota!");
        } else if percentage_of_max >= 0.75 {
            self.messenger
                .send("Warning: You've used up over 75% of your quota!");
        }
    }
}
// This code does not compile
#[cfg(test)]
mod tests {
    use super::*;
    struct MockMessenger {
        sent_messages: Vec<String>,
    }
    impl MockMessenger {
        fn new() -> MockMessenger {
            MockMessenger {
                sent_messages: vec![],
            }
        }
    }
    impl Messenger for MockMessenger {
        fn send(&self, message: &str) {
            // error: `self` is a `&` reference, so the data it refers to
            // cannot be borrowed as mutable
            self.sent_messages.push(String::from(message));
        }
    }
    #[test]
    fn it_sends_an_over_75_percent_warning_message() {
        let mock_messenger = MockMessenger::new();
        let mut limit_tracker = LimitTracker::new(&mock_messenger, 100);
        limit_tracker.set_value(80);
        assert_eq!(mock_messenger.sent_messages.len(), 1);
    }
}
We can’t modify the MockMessenger to keep track of the messages, because the send method takes an immutable reference to self. We also can’t use &mut self instead, because then the signature of send wouldn’t match the signature in the Messenger trait definition.
Instead, we can use RefCell<T> to mutate an inner value while the outer value is considered immutable.
#[cfg(test)]
mod tests {
    use super::*;
    use std::cell::RefCell;
    struct MockMessenger {
        sent_messages: RefCell<Vec<String>>,
    }
    impl MockMessenger {
        fn new() -> MockMessenger {
            MockMessenger {
                sent_messages: RefCell::new(vec![]),
            }
        }
    }
    impl Messenger for MockMessenger {
        fn send(&self, message: &str) {
            // Mutable borrow of the inner Vec through an immutable &self
            self.sent_messages.borrow_mut().push(String::from(message));
        }
    }
    #[test]
    fn it_sends_an_over_75_percent_warning_message() {
        // --snip--
        assert_eq!(mock_messenger.sent_messages.borrow().len(), 1);
    }
}
Keeping Track of Borrows at Runtime with RefCell
When creating immutable and mutable references, we use the & and &mut syntax, respectively. With RefCell<T>, we use the borrow and borrow_mut methods, which are part of the safe API that belongs to RefCell<T>. The borrow method returns the smart pointer type Ref<T>, and borrow_mut returns the smart pointer type RefMut<T>. Both types implement Deref, so we can treat them like regular references.
The RefCell<T> keeps track of how many Ref<T> and RefMut<T> smart pointers are currently active. Every time we call borrow, the RefCell<T> increases its count of how many immutable borrows are active. When a Ref<T> value goes out of scope, the count of immutable borrows goes down by one. Just like the compile-time borrowing rules, RefCell<T> lets us have many immutable borrows or one mutable borrow at any point in time.
If we try to violate these rules, rather than getting a compiler error as we would with references, the implementation of RefCell<T> will panic at runtime.
// This code panics
impl Messenger for MockMessenger {
    fn send(&self, message: &str) {
        let mut one_borrow = self.sent_messages.borrow_mut();
        // panics here: already borrowed: BorrowMutError
        let mut two_borrow = self.sent_messages.borrow_mut();
        one_borrow.push(String::from(message));
        two_borrow.push(String::from(message));
    }
}
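The same runtime check can be observed without a panic by using try_borrow_mut, which returns a Result instead of panicking. This is a minimal sketch on a standalone RefCell rather than the MockMessenger; try_two_mutable_borrows is a hypothetical helper name.

```rust
use std::cell::RefCell;

// Returns (second mutable borrow refused while the first is alive,
//          mutable borrow accepted after the first is released).
fn try_two_mutable_borrows() -> (bool, bool) {
    let cell = RefCell::new(vec![1]);
    let first = cell.borrow_mut();
    // A second mutable borrow breaks the rules, so we get an Err.
    let refused_while_held = cell.try_borrow_mut().is_err();
    // Releasing the first borrow brings the active count back to zero.
    drop(first);
    let accepted_after_release = cell.try_borrow_mut().is_ok();
    (refused_while_held, accepted_after_release)
}

fn main() {
    let (refused, accepted) = try_two_mutable_borrows();
    println!("second borrow refused while first is held: {refused}");
    println!("borrow accepted after release: {accepted}");
}
```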
Having Multiple Owners of Mutable Data by Combining Rc and RefCell
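A common way to use RefCell<T> is in combination with Rc<T>. Rc<T> lets you have multiple owners of some data, but it only gives immutable access to that data. An Rc<T> that holds a RefCell<T> gives a value that can have multiple owners and that you can mutate.
The following code uses Rc<RefCell<i32>> to create a List that we can mutate.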
#[derive(Debug)]
enum List {
    Cons(Rc<RefCell<i32>>, Rc<List>),
    Nil,
}
use crate::List::{Cons, Nil};
use std::cell::RefCell;
use std::rc::Rc;
fn main() {
    let value = Rc::new(RefCell::new(5));
    let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil)));
    let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a));
    let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a));
    *value.borrow_mut() += 10;
    println!("a after = {:?}", a);
    println!("b after = {:?}", b);
    println!("c after = {:?}", c);
}
This code prints the following:
a after = Cons(RefCell { value: 15 }, Nil)
b after = Cons(RefCell { value: 3 }, Cons(RefCell { value: 15 }, Nil))
c after = Cons(RefCell { value: 4 }, Cons(RefCell { value: 15 }, Nil))
We have an outwardly immutable List value, but we can use the methods on RefCell<T> that provide access to its interior mutability, so we can modify the data when we need to. The runtime checks of the borrowing rules protect us from data races, and it’s sometimes worth trading a bit of speed for this flexibility in our data structures.