Skip to content

Commit

Permalink
Merge pull request #9 from orxfun/documentation
Browse files Browse the repository at this point in the history
documentation revised wrt LinkedList impl
  • Loading branch information
orxfun committed Jan 6, 2024
2 parents fdea967 + 153d7ce commit 9ba3256
Show file tree
Hide file tree
Showing 4 changed files with 23 additions and 305 deletions.
6 changes: 3 additions & 3 deletions Cargo.toml
Original file line number Diff line number Diff line change
@@ -1,12 +1,12 @@
[package]
name = "orx-pinned-vec"
version = "1.0.0"
version = "1.0.1"
edition = "2021"
authors = ["orxfun <orx.ugur.arikan@gmail.com>"]
description = "`PinnedVec` trait defines the interface for vectors which guarantee that the elements are pinned to their memory locations unless explicitly changed by the caller."
description = "`PinnedVec` trait defines the interface for vectors which guarantee that elements are pinned to their memory locations with the aim to enable convenient self-referential collections."
license = "MIT"
repository = "https://github.com/orxfun/orx-pinned-vec/"
keywords = ["vec", "array", "vector", "pinned", "memory"]
categories = ["data-structures"]
categories = ["data-structures", "rust-patterns"]

[dependencies]
151 changes: 10 additions & 141 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# orx-pinned-vec

`PinnedVec` trait defines the interface for vectors which guarantee that elements are pinned to their memory locations unless explicitly changed by the caller, with the aim to enable convenient self-referential collections.
`PinnedVec` trait defines the interface for vectors which guarantee that elements are pinned to their memory locations with the aim to enable convenient self-referential collections.

## A. Implementations

Expand All @@ -15,7 +15,7 @@ There might be various situations where pinned elements are helpful.
* It is somehow required for async code, following [blog](https://blog.cloudflare.com/pin-and-unpin-in-rust) could be useful for the interested.
* It is crucial in representing self-referential types with thin references.

This crate focuses more on the latter. Particularly, it aims to make it safe and convenient to build **performant self-referential collections** such as linked lists, trees or graphs.
This crate focuses on the latter. Particularly, it aims to make it safe and convenient to build **performant self-referential collections** such as linked lists, trees or graphs.

As explained in rust-docs [here](https://doc.rust-lang.org/std/pin/index.html), there exist types `Pin` and `Unpin` for this very purpose. Through the theoretical discussions, one can easily agree on the safety. However, the solution is complicated with all words `PhantomPinned`, `NonNull`, `dangling`, `Box::pin`, etc. which are alien to the self-referential data structures we are trying to build.

Expand Down Expand Up @@ -49,142 +49,7 @@ The table below presents the complete list of traits which suffice to define all

## D. Safety

In order to safely build a self-referential collection, we have two requirements

* allowing for immutable growth, and
* disallowing non-growth structural mutations.

To give concrete examples to the arguments, we will use an example throughout the section. Assume we would like to build a tree where each node keeps a vector of references to its children (we omit the parent for brevity). Our tree will simply be a vector. Each node will be an element of this vector. However, vector is nothing but a storage. We will have our regular tree traversals through references.

We will need the index method and the push method of the vector:

* `fn push(&mut self, element: T)`.

The node will simply look as below, free of alien words. We need to add the lifetimes `'a`. However, all lifetimes will trivially be equal to each other since we know that the nodes will live together as elements of the same vector.

```rust
struct Node<'a, T> {
data: T,
children: Vec<&'a Node<'a, T>>,
}

impl<'a, T> Node<'a, T> {
fn new(data: T, children: Vec<&'a Node<'a, T>>) -> Self {
Self { data, children }
}
}
```

Finally, assume the tree we are trying to build is:

```text
a
/ \
b c
|
d
```


## D.1. Safety - Immutable Push

We need to be able to push to the vector with an immutable reference. It might (will) sound counter-intuitive. We will first discuss why it is necessary and then why it is okay (opinionated).

We immediately notice that we cannot start building the tree from the root 'a':

```rust
let mut tree = vec![];

let root = Node::new('a', vec![]); // ? root's children b and c do not exist yet!
tree.push(root);
```

So we start from the leaves:

```rust
let mut tree = vec![];

let d = Node::new('d', vec![]);
tree.push(d);

let ref_d = &tree[0];
let b = Node::new('b', vec![ref_d]); // so far so good

// tree.push(b); // this line won't compile!
```

According to the borrow checker, the reason we cannot push is that we are holding a reference to the `tree` which is `ref_d`, and we cannot mutate the tree unless we drop this reference.

But why?

There are two reasons:

1. **Correctness:** First, we are mutating a structure while holding a reference to it. This is bad, it might lead to wrong assumptions and bugs.
2. **Undefined Behavior:** Second, the underlying data of the vector can be copied around while increasing the vector capacity. This could cause `ref_d` to be a dangling reference.

Seperating these two issues is useful in understanding the reasons and providing a solution.

### C.1.a - Correctness

Here we ignore the dangling reference issue, as it will be discussed separately.

How bad of a mutation is pushing an element to a vector? Priorly defined and pushed elements will not change. Why is it different than defining a new variable in the scope?

There is actually one problem, aggregations. By pushing a new element to the vector:

* we change its `len`,
* we change the average value of some numeric property of its elements,
* etc.

If we rely on the values of these functions to remain unchanged without a `mut` reference, we would be wrong. On the other hand, such a `push` operation with an immutable reference fits well the builders or constructors where we keep creating things rather than performing computations on the created ones. This is the goal we want to achieve here.

### C.1.b - Undefined Behavior

This is very critical. Therefore, we must make sure that `push` operation does <ins>not</ins> cause implicit moves of the data. Every `PinnedVec` must provide this guarantee that all elements will be **pinned** while growing with methods such as push or extend.

[`FixedVec`](https://crates.io/crates/orx-fixed-vec) and [`SplitVec`](https://crates.io/crates/orx-split-vec) are two examples providing these guarantees.

### C.1.c - Building Acyclic Self-Referential Collections with Imp

As the requirement is established, assume that our hypothetical `Vec` provides us the immutable push method `fn imp(&self, element: T)`. Then, the following code would compile and allow us to build our tree.

```rust ignore
let tree = vec![];

tree.imp(Node::new('d', vec![]));
let ref_d = &tree[0];

tree.imp(Node::new('b', vec![ref_d]));
let ref_b = &tree[1];

tree.imp(Node::new('c', vec![]));
let ref_c = &tree[2];

tree.imp(Node::new('a', vec![ref_b, ref_c]));
```

This is already significantly simpler than alternatives such as `Pin`. With some work, we can get rid of the indexed-accesses and make the code more concise and less error prone (see [push_get_ref](https://docs.rs/orx-imp-vec/0.9.7/orx_imp_vec/struct.ImpVec.html#method.push_get_ref) method of [ImpVec](https://crates.io/crates/orx-imp-vec)).

Actually, immutable push method is sufficient to very easily build all acyclic self-referential collections such as directed acyclic graphs. It also covers the tree here.

When the collection has cyclic relations, such as regular trees where nodes have both children and parent information, or cyclic graphs, or linked lists, we need a little bit more book keeping. However, immutable push still provides the main convenience to build such data structures. See [orx_linked_list::LinkedList](https://crates.io/crates/orx-linked-list) for an example.

### C.1.d - Relation with [ImpVec](https://crates.io/crates/orx-imp-vec)

Note that `PinnedVec` does <ins>not</ins> provide the immutable push method. It only provides the required behavior.

Any `V` implementing `PinnedVec` can be wrapped by `ImpVec`. While wrapped, `ImpVec` safely allows the immutable push operation relying on the guarantees by the `PinnedVec`.

This relation allows for a clear separation of the building stage. For instance, one may

* create an empty `V: PinnedVec` (initiate)
* wrap it in `ImpVec` (enter building phase)
* build the self referential collection using the immutable push operation (build)
* unwrap the `ImpVec` and get back the built data structure as `V` (leave the building phase)

## D.2. Safety - Non-growing Mutations & Clone

With self referential collections, some other mutating methods can lead to critical problems as well. These are the methods which change positions of already pushed elements or remove elements from the vector:
With self referential collections, some mutating methods can lead to critical problems. These are the methods which change positions of already pushed elements or remove elements from the vector:

* `insert`
* `remove`
Expand All @@ -194,20 +59,24 @@ With self referential collections, some other mutating methods can lead to criti

These methods can invalidate the references among elements. Therefore, `PinnedVec` defines them as **unsafe**. One exception is the `clear` method which is safe since all elements are removed together with their references at once.

Finally, `clone` method as well is **unsafe**, since the elements of the clone would be referencing the elements of the original vector.
In addition, `clone` method as well is **unsafe**, since the elements of the clone would be referencing the elements of the original vector.

These are due to the fact that, naive implementations would cause false references. This does not mean that it is not possible to provide a safe implementation. Instead, it means that each data structure would need a different implementation (insert of a tree and linked-list cannot be implemented in the same way).
These are due to the fact that, naive implementations would cause false references. This does not mean that it is not possible to provide a safe implementation. Instead, it means that each data structure would need a different implementation (insert method of a tree and that of a linked-list cannot be implemented in the same way, they will need to update references differently).

Implementors can provide careful safe implementations, such as `orx_linked_list::LinkedList` safely implement `Clone`, although it uses any `PinnedVec` as the underlying storage.

Further, there are a few cases other than self referencing collections, where a `PinnedVec` is useful. And there is no reason to treat these methods as unsafe if the elements are not referencing each other. For this purpose, `NotSelfRefVecItem` marker trait is defined. This trait works as follows:
There are a few cases other than self referencing collections, where a `PinnedVec` is useful. And there is no reason to treat these methods as unsafe if the elements are not referencing each other. For this purpose, `NotSelfRefVecItem` marker trait is defined. This trait works as follows:

* if `V` implements `PinnedVec<T>`, and
* if `T` implements the marker trait `NotSelfRefVecItem`,
* => then, `V` also implements `PinnedVecSimple<T>` which provides the safe versions of the abovementioned methods.

`NotSelfRefVecItem` trait is implemented for most primitives; however, one needs to implement for new types to explicitly state that the type is <ins>not</ins> self-referential.

## E. Relation with the `ImpVec`

Providing pinned memory location elements with `PinnedVec` is the first block for building self referential structures; the second building block is the [`ImpVec`](https://crates.io/crates/orx-imp-vec). An `ImpVec` wraps any `PinnedVec` implementation and provides specialized methods built on the pinned element guarantee in order to allow building self referential collections.

## License

This library is licensed under MIT license. See LICENSE for details.
Loading

0 comments on commit 9ba3256

Please sign in to comment.