
#08 Ownership in Rust

Introduction

As I wrote in my previous post, Rust has some unique features compared with other languages such as C, C++, and Java. The most notable is ownership, which lets Rust guarantee memory safety without needing a garbage collector. Thanks to this feature, programmers can write code while worrying less about memory bugs. A garbage collector is the more common way to get that safety, but it is not always the best solution for memory management. So in this post I will introduce the ownership concept that Rust uses to manage memory safely.

Memory, Memory management, Memory safety...

Before talking about ownership, I want to touch on memory briefly. As you know, memory is one of the essential parts of a computer and is used to store data. Memory capacity, especially RAM, is finite. If a program keeps allocating memory without freeing it, the program may eventually crash, or even bring down the operating system. An interesting blog post reports that, according to Google engineers, roughly 70% of all serious security bugs in the Chrome codebase are memory management and safety bugs, and Microsoft has reported a similar problem. That is why memory management is such an important topic and why every engineer has to care about it. However, it is hard to keep memory in mind while coding, so over time two families of programming languages have emerged: those that put control first and those that put safety first.

Control First

One reason programmers cannot avoid memory bugs is that some languages, such as C and C++, have you manage memory manually. These are often called "unsafe" languages. Why the unflattering name? Because they leave you in charge of freeing memory: your program's memory consumption is entirely in your hands. That control makes the following kinds of bugs possible.

  • Dangling Pointers
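    A dangling pointer is a pointer that points to data that is invalid or no longer valid. In the example below, object2 still points to the object after it has been deleted:

    Class *object = new Class();
    Class *object2 = object;   // second pointer to the same object
    delete object;             // the object is freed here
    object = nullptr;          // object is reset, but object2 now dangles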

  • Double free bugs
    Double free bugs occur when the program tries to free a region of memory that has already been freed, and perhaps already been allocated again.

  • Certain kinds of memory leaks
    According to Wikipedia, a memory leak is "a type of resource leak that occurs when a computer program incorrectly manages memory allocations in a way that memory which is no longer needed is not released."

Controlling memory manually is a nice feature if you never make mistakes, but as the report mentioned above shows, nobody gets it right 100% of the time. That is why another type of language was created.

Safety First

For example, when you write web applications in languages like JavaScript or Go, you don't have to think much about memory safety, and you may rarely even hear about it. This is because those languages have garbage collection, a form of automatic memory management. The garbage collector attempts to reclaim memory that was allocated by the program but is no longer referenced; such memory is called garbage. The programmer therefore does not have to deallocate memory manually, which helps a lot and avoids errors.
It seems very convenient and useful, right? However, relying on garbage collection means relinquishing control over exactly when objects get freed to the collector. In general, garbage collectors are surprising beasts, and understanding why memory wasn't freed when you expected can be a challenge.

How about Rust?

I have introduced two approaches to managing memory. But wait, what about Rust? I have not said anything about it yet. Rust takes a third way by restricting how your programs can use pointers. This concept is called ownership; it is built into the language itself and enforced by compile-time checks.

Ownership Rules

  • Each value in Rust has an owner

  • There can only be one owner at a time

  • When the owner goes out of scope, the value will be dropped

Every value has a single owner that determines its lifetime. When the owner is freed (dropped, in Rust terminology), the owned value is dropped too. These rules are meant to make it easy for you to find any given value's lifetime simply by inspecting the code, giving you the control over its lifetime that a systems language should provide.
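The third rule can be made visible with a small sketch. The `Tea` type, the `DROPPED` counter, and the function name below are hypothetical names invented for this illustration; the only standard pieces are the `Drop` trait and `AtomicUsize`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many `Tea` values have been dropped (illustration only).
static DROPPED: AtomicUsize = AtomicUsize::new(0);

// A hypothetical type that reports when it is dropped.
struct Tea {
    name: String,
}

impl Drop for Tea {
    fn drop(&mut self) {
        println!("dropping {}", self.name);
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

// Returns how many values were dropped by the end of the inner scope.
fn dropped_after_inner_scope() -> usize {
    let before = DROPPED.load(Ordering::SeqCst);
    {
        // `_green` is the single owner of this value.
        let _green = Tea { name: "green tea".to_string() };
    } // `_green` goes out of scope here, so the value is dropped
    DROPPED.load(Ordering::SeqCst) - before
}

fn main() {
    let n = dropped_after_inner_scope();
    println!("values dropped at the end of the inner scope: {n}");
}
```

Nothing in the code frees the value explicitly; the drop happens because the owner's scope ends.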

{                               // s is not valid here; it's not declared yet
    let s = "Hello, World!";    // s is valid from this point forward
}                               // this scope is now over, and s is no longer valid

fn print_cities() {
    let mut nums = vec![1, 1, 1];               // allocated here
    for i in 3..10 {
        let next = nums[i-3] + nums[i-2];       // sum of two earlier elements
        nums.push(next);
    }
    println!("{nums:?}");
}                                               // dropped here!

fn main() {
    print_cities();
}

cargo run
# [1, 1, 1, 2, 2, 3, 4, 5, 7, 9]

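The stack frame holds these three:

  • buffer (nums's pointer to the heap)

  • capacity

  • length

The heap holds

  • the elements themselves, stored in the vector's buffer

The capacity is how many elements the buffer can hold, and the length is how many it currently holds. The vector's buffer is allocated on the heap.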

When the variable nums goes out of scope at the end of the function, the program drops the vector.
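The three fields described above can be inspected directly. This is a minimal sketch; `vec_parts` is a helper name made up for this example, while `as_ptr`, `len`, and `capacity` are standard `Vec` methods.

```rust
// Hypothetical helper: returns the vector's buffer pointer, length, and capacity.
fn vec_parts(v: &Vec<i32>) -> (*const i32, usize, usize) {
    (v.as_ptr(), v.len(), v.capacity())
}

fn main() {
    let mut nums = vec![1, 1, 1];
    let (ptr, len, cap) = vec_parts(&nums);
    println!("buffer at {ptr:p}, length {len}, capacity {cap}");

    // Pushing past the capacity makes the vector allocate a larger heap buffer,
    // so the pointer and capacity may change while the length grows by one.
    nums.push(2);
    let (ptr, len, cap) = vec_parts(&nums);
    println!("buffer at {ptr:p}, length {len}, capacity {cap}");

    // The Vec value itself (pointer, capacity, length) is three machine words.
    println!("size of the Vec value: {} bytes", std::mem::size_of::<Vec<i32>>());
}
```

The printed addresses and capacities depend on the allocator, so they will differ from run to run and machine to machine.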

Stack and Heap

For those who don't know about the stack and the heap, let me explain a little.
In short, the stack and the heap are both parts of memory available to your code at runtime, but they are structured in different ways.

Stack

A stack is a linear data structure: elements are added and removed in a fixed order. You can think of it like a stack of plates or books. You can add and remove books at the top of the stack, but you can't reach any in the middle or at the bottom. The stack works on the Last In, First Out (LIFO) principle, so the last item added is the first one removed. In a running program, each function call pushes a frame holding its local variables, and that frame is popped when the function returns.

Heap

The heap is a section of memory that allows dynamic allocation and is not bound by the same rules as the stack. Allocating and deallocating memory on the heap has a performance cost: it is not as quick as adding and removing items on the stack. But the heap is suitable for storing large amounts of data, or data whose size is not known at compile time.

Difference between stack and heap

| Feature | Stack | Heap |
|---|---|---|
| Access Speed | Fast | Slow |
| Memory Allocation | Handled automatically by runtime | Only automatically handled in high level languages |
| Performance Cost | Less | More |
| Size | Fixed Size | Dynamic Size |
| Variable Access | Local variables only | Global variable access |
| Data Structure | Linear data structure | Hierarchical Data Structure |
| Main issue | Small fixed amount of memory | Memory fragmentation over time |
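In Rust the difference can be sketched with a plain integer and a `Box`. The function name `sum_stack_and_heap` and the values are assumptions chosen for illustration; `Box::new` is the standard way to place a single value on the heap.

```rust
// Hypothetical helper: adds a stack value and a heap value.
fn sum_stack_and_heap() -> i32 {
    let on_stack: i32 = 30;               // stored directly in this function's stack frame
    let on_heap: Box<i32> = Box::new(12); // the i32 lives on the heap; the Box (a pointer) is on the stack
    on_stack + *on_heap                   // `*` reads the heap value through the pointer
} // the frame is popped; `on_heap` is dropped and its heap allocation is freed

fn main() {
    println!("sum = {}", sum_stack_and_heap());
    // A Box to a sized value is one pointer wide, regardless of what it points to.
    println!("size of Box<i32>: {} bytes", std::mem::size_of::<Box<i32>>());
}
```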

Moves

In Rust, for most types, operations like assigning a value to a variable, passing it to a function, or returning it from a function don't copy the value: they move it. So what does it mean? Let's see some code.

  let s = vec!["green tea".to_string(), "black tea".to_string(), "milk tea".to_string()];
  let t = s;
  let u = s;

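The Rust compiler refuses to compile this code, because the value of s has already been moved into t and so cannot be used again to initialize u. Fortunately, the compiler explains the error clearly. (The output below comes from a variant that declares s as a fixed-size array, which is why the reported type is `[String; 3]`; with vec! the type would be `Vec<String>`, and the error is otherwise the same.)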


  error[E0382]: use of moved value: `s`
    --> src/main.rs:21:13
     |
  15 |     let s = [
     |         - move occurs because `s` has type `[String; 3]`, which does not implement the `Copy` trait
  ...
  20 |     let t = s;
     |             - value moved here
  21 |     let u = s;
     |             ^ value used here after move
     |
  help: consider cloning the value if the performance cost is acceptable
     |
  20 |     let t = s.clone();
     |              ++++++++
         

In addition, the compiler often suggests a way to make the code compile.

Following its suggestion here, we can use the clone() method, which makes a full copy of the vector and its strings so that s remains valid.

let s = vec!["green tea".to_string(), "black tea".to_string(), "milk tea".to_string()];
let t = s.clone();
let u = s.clone();
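Moves also happen when a value is passed to a function or returned from one. The following is a minimal sketch of that, combined with clone(); the function names `count_teas` and `add_tea` are made up for this example.

```rust
// Takes ownership of the vector; the caller can no longer use its argument.
fn count_teas(teas: Vec<String>) -> usize {
    teas.len()
} // `teas` is dropped here

// Takes ownership and hands it back to the caller through the return value.
fn add_tea(mut teas: Vec<String>, tea: String) -> Vec<String> {
    teas.push(tea);
    teas
}

fn main() {
    let s = vec!["green tea".to_string(), "black tea".to_string()];
    let t = s.clone();                          // independent copy; `s` stays valid
    let s = add_tea(s, "milk tea".to_string()); // `s` moves in and comes back out
    println!("t = {t:?}");
    println!("s = {s:?}");

    let n = count_teas(s);                      // `s` is moved into the function
    // println!("{s:?}");                       // would not compile: error[E0382], `s` was moved
    println!("count = {n}");
}
```

Because t was cloned before add_tea ran, it still holds only the original two strings.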

Copy Types: The Exception to Moves

The examples I have shown so far of values being moved involve vectors, strings, and other types that could potentially use a lot of memory and be expensive to copy. Moves keep ownership of such types clear and assignment cheap. But for simpler types like integers or characters, this sort of careful handling really isn't necessary.

  let string1 = "some value".to_string();   // allocate string1 to store "some value" on heap
  let string2 = string1;                    // move to string2
  
  let num1: i32 = 30;                       // allocate num1 to store 30 on stack
  let num2 = num1;                          // copy to num2

Moving a value leaves the source of the move uninitialized. But whereas it serves an essential purpose to treat string1 as valueless, treating num1 that way is pointless; no harm could result from continuing to use it. The advantages of a move don't apply here, and it's inconvenient.
This is where Copy types come in. Assigning a value of a Copy type copies the value rather than moving it. The standard Copy types include all the machine integer and floating-point numeric types, the char and bool types, and a few others. A tuple or fixed-size array of Copy types is itself a Copy type.
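A short sketch of the rule above, using only the Copy types named in the text. The function name `copy_demo` and the values are assumptions for illustration.

```rust
// Hypothetical helper: uses both the source and the copy after each assignment.
fn copy_demo() -> (i32, (i32, bool), [char; 2]) {
    let num1: i32 = 30;
    let num2 = num1;        // i32 is Copy: num1 is still usable

    let pair1 = (1, true);
    let pair2 = pair1;      // a tuple of Copy types is itself Copy

    let arr1 = ['a', 'b'];
    let arr2 = arr1;        // a fixed-size array of Copy types is itself Copy

    // Every original is still valid here, which would not be true for a String.
    (
        num1 + num2,
        (pair1.0 + pair2.0, pair1.1 && pair2.1),
        [arr1[0], arr2[1]],
    )
}

fn main() {
    println!("{:?}", copy_demo());
}
```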

Rc and Arc: Shared Ownership

Though most values have unique owners in typical Rust code, in some cases it's difficult to find every value a single owner that has the lifetime you need; you'd like the value to simply live until everyone's done using it. For these cases, Rust provides the reference-counted pointer types Rc and Arc.
The Rc and Arc types are very similar; the only difference between them is that an Arc is safe to share between threads directly (the name is short for atomic reference count), whereas a plain Rc uses faster non-thread-safe code to update its reference count.

  use std::rc::Rc;
  
  let s: Rc<String> = Rc::new("milk tea".to_string());
  let t: Rc<String> = s.clone();
  let u: Rc<String> = s.clone();
   

Python, incidentally, uses reference counts to manage its values' lifetimes. Each of the three Rc pointers s, t, and u refers to the same block of memory, which holds a reference count and space for the String. A value owned by an Rc pointer is immutable, however, so the following code fails to compile.

use std::rc::Rc;
 
let s: Rc<String> = Rc::new("milk tea".to_string());
s.push_str(", black tea"); // error: cannot borrow data in an `Rc` as mutable
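The shared block and its reference count can be observed with `Rc::strong_count` and `Rc::ptr_eq`, and the thread-safe variant with `Arc` and `std::thread::spawn`. This is a minimal sketch; the function names `rc_demo` and `arc_demo` are made up for this example.

```rust
use std::rc::Rc;
use std::sync::Arc;
use std::thread;

// Returns the reference count and whether two clones point to the same allocation.
fn rc_demo() -> (usize, bool) {
    let s: Rc<String> = Rc::new("milk tea".to_string());
    let t: Rc<String> = s.clone(); // bumps the count; the String is not copied
    let u: Rc<String> = s.clone();
    (Rc::strong_count(&s), Rc::ptr_eq(&t, &u))
}

// Shares one String with another thread and returns the length read there.
fn arc_demo() -> usize {
    let shared: Arc<String> = Arc::new("milk tea".to_string());
    let for_thread = shared.clone();
    // An Rc could not be moved into the thread; the compiler would reject it.
    let handle = thread::spawn(move || for_thread.len());
    handle.join().unwrap()
}

fn main() {
    let (count, same) = rc_demo();
    println!("strong count = {count}, same allocation = {same}");
    println!("length read in the other thread = {}", arc_demo());
}
```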
 

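One known problem with reference counting is that two Rc values that point to each other keep each other's count above zero, so neither is ever freed. It is therefore possible to leak values in Rust, but such situations are rare. Rust also provides a way to create mutable portions of otherwise immutable values; this is called interior mutability. I will cover it in a future post.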

Conclusion

As you can see, Rust has unique features that other programming languages do not have. Ownership helps a lot with memory management, and unlike garbage collection it does not sacrifice efficiency. The price is that learning Rust is actually quite hard. It is hard for me too, honestly. I will keep trying to explain what I have learned so far as well as I can.
Thank you for reading this post. See you in the next one.

References