Patch/pinned init v2 by BennoLossin · Pull Request #986 · Rust-for-Linux/linux

BennoLossin · 2023-03-13T17:24:38Z

No description provided.

nbdd0121 · 2023-03-17T01:19:13Z

+            fn drop($self: $st, only_call_from_drop: $crate::init::OnlyCallFromDrop) {
+                let _ = only_call_from_drop;


Suggested change:

-            fn drop($self: $st, only_call_from_drop: $crate::init::OnlyCallFromDrop) {
-                let _ = only_call_from_drop;
+            fn drop($self: $st, _: $crate::init::OnlyCallFromDrop) {

nbdd0121 · 2023-03-17T01:33:50Z

+        @ty_generics($($ty_generics:tt)*),
+        @where($($whr:tt)*),
+        // We found a PhantomPinned field, this should generally be pinned!
+        @fields_munch($field:ident : ::core::marker::PhantomPinned, $($rest:tt)*),


Instead of duplicating these checks, maybe just use a single check for `$($(::)? core::)? $(marker::)? PhantomPinned`?

Nobody should be defining a custom type named `PhantomPinned` anyway.

They should not, but it might result from some macro that takes a name from somewhere else (yes, this is pretty contrived, but I still want to catch the simplest cases). Ideally we could use a type-system solution to catch exactly the right cases, but I could not come up with one.
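The single-pattern check proposed in this thread can be tried outside the kernel with a small declarative macro. This is only a minimal sketch: `is_phantom_pinned!` is a hypothetical name chosen for illustration and is not part of the patch, which matches the field inside its `@fields_munch` rules instead.

```rust
/// Hypothetical helper (not part of the patch): expands to `true` when the
/// given type tokens spell `PhantomPinned`, with or without a path prefix.
macro_rules! is_phantom_pinned {
    // One arm covers `PhantomPinned`, `marker::PhantomPinned`,
    // `core::marker::PhantomPinned` and `::core::marker::PhantomPinned`.
    ($($(::)? core ::)? $(marker ::)? PhantomPinned) => {
        true
    };
    // Anything else is treated as an ordinary field type.
    ($($other:tt)*) => {
        false
    };
}
```

Because the check is purely token-based, it also accepts spellings such as `core::PhantomPinned` and any user-defined type that happens to be named `PhantomPinned`, which is the trade-off discussed in the replies.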

nbdd0121 · 2023-03-17T12:59:29Z

Tip: you can use `--fixup commit_hash` when committing, which allows automatic squashing with `git rebase --autosquash`.

Add the `quote!` macro for creating `TokenStream`s directly via the given Rust tokens. It also supports repetitions using iterators. It will be used by the pin-init API proc-macros to generate code.

Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Benno Lossin <y86-dev@protonmail.com>

Adds the `assume_init` function to `UniqueArc<MaybeUninit<T>>` that unsafely assumes the value to be initialized and yields a value of type `UniqueArc<T>`. This function is used when manually initializing the pointee of a `UniqueArc`.

Signed-off-by: Benno Lossin <y86-dev@protonmail.com>
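The conversion described in this commit message can be illustrated in userspace Rust. The sketch below assumes `Box` as a stand-in for the kernel-only `UniqueArc`; `assume_init_box` and `boxed_pair` are hypothetical names, and the kernel function may be implemented differently.

```rust
use std::mem::MaybeUninit;

/// Userspace stand-in for `UniqueArc<MaybeUninit<T>>::assume_init`, written
/// against `Box` (an assumption made so the sketch builds with `std`).
///
/// # Safety
///
/// The caller must have fully initialized the pointee.
unsafe fn assume_init_box<T>(boxed: Box<MaybeUninit<T>>) -> Box<T> {
    // SAFETY: `MaybeUninit<T>` has the same layout as `T`, and the caller
    // guarantees that the value is initialized.
    unsafe { Box::from_raw(Box::into_raw(boxed).cast::<T>()) }
}

/// Manually initializes the pointee, then converts to the initialized type.
fn boxed_pair() -> Box<(u32, u64)> {
    let mut slot: Box<MaybeUninit<(u32, u64)>> = Box::new(MaybeUninit::uninit());
    slot.write((7, 9));
    // SAFETY: the value was written on the line above.
    unsafe { assume_init_box(slot) }
}
```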

This API is used to facilitate safe pinned initialization of structs. It replaces cumbersome `unsafe` manual initialization with elegant safe macro invocations.

--

In this section the problem that the new pin-init API solves is outlined. For a more granular explanation and additional information on pinning and this issue, view [1].

Pinning is Rust's way of enforcing the address stability of a value. When a value gets pinned it will be impossible for safe code to move it to another location. This is done by wrapping pointers to said object with `Pin<P>`. This wrapper prevents safe code from creating mutable references to the object, preventing mutable access, which is needed to move the value. `Pin<P>` provides `unsafe` functions to circumvent this and allow modifications regardless. It is then the programmer's responsibility to uphold the pinning guarantee.

Many kernel data structures require a stable address, because there are foreign pointers to them which would get invalidated by moving the structure. Since these data structures are usually embedded in structs to use them, this pinning property propagates to the container struct, resulting in most structs in both Rust and C code needing to be pinned.

So if we want to have a `mutex` field in a Rust struct, this struct also needs to be pinned, because a `mutex` contains a `list_head`. Additionally, initializing a `list_head` requires already having the final memory location available, because it is initialized by pointing it to itself. But this presents another challenge in Rust: values have to be initialized at all times. There is the `MaybeUninit<T>` wrapper type, which allows handling uninitialized memory, but this requires using `unsafe` raw pointers and casting the type to the initialized variant.

This problem gets exacerbated when considering encapsulation and the normal safety requirements of Rust code. The fields of the Rust `Mutex<T>` should not be accessible to normal driver code. After all, if anyone can modify the fields, there is no way to ensure the invariants of the `Mutex<T>` are upheld. But if the fields are inaccessible, then initialization of a `Mutex<T>` needs to be somehow achieved via a function or a macro. Because the `Mutex<T>` must be pinned in memory, the function cannot return it by value. It also cannot allocate a `Box` to put the `Mutex<T>` into, because that is an unnecessary allocation and indirection which would hurt performance.

The current solution was to split this function into two parts:

1. A `new` function that returns a partially initialized `Mutex<T>`,
2. An `init` function that requires the `Mutex<T>` to be pinned and that fully initializes the `Mutex<T>`.

Both of these functions have to be marked `unsafe`: a call to `new` needs to be accompanied by a call to `init`, otherwise using the `Mutex<T>` could result in UB, and calling `init` twice is also not safe. While `Mutex<T>` initialization cannot fail, other structs might also have to allocate memory, which would make successful initialization conditional and require even more manual accommodation work.

Combine this with the problem of pin-projections -- the way of accessing fields of a pinned struct -- which also have an `unsafe` API, and pinned initialization is riddled with `unsafe`, resulting in very poor ergonomics. Not only that, but having to call two functions, possibly multiple lines apart, makes it very easy to forget one of them, either outright or during refactoring.

Here is an example of the current way of initializing a struct with two synchronization primitives (see [2] for the full example):

    struct SharedState {
        state_changed: CondVar,
        inner: Mutex<SharedStateInner>,
    }

    impl SharedState {
        fn try_new() -> Result<Arc<Self>> {
            let mut state = Pin::from(UniqueArc::try_new(Self {
                // SAFETY: `condvar_init!` is called below.
                state_changed: unsafe { CondVar::new() },
                // SAFETY: `mutex_init!` is called below.
                inner: unsafe { Mutex::new(SharedStateInner { token_count: 0 }) },
            })?);

            // SAFETY: `state_changed` is pinned when `state` is.
            let pinned = unsafe { state.as_mut().map_unchecked_mut(|s| &mut s.state_changed) };
            kernel::condvar_init!(pinned, "SharedState::state_changed");

            // SAFETY: `inner` is pinned when `state` is.
            let pinned = unsafe { state.as_mut().map_unchecked_mut(|s| &mut s.inner) };
            kernel::mutex_init!(pinned, "SharedState::inner");

            Ok(state.into())
        }
    }

The pin-init API of this patch solves this issue by providing a comprehensive solution composed of macros and traits. Here is the example from above using the pin-init API:

    #[pin_data]
    struct SharedState {
        #[pin]
        state_changed: CondVar,
        #[pin]
        inner: Mutex<SharedStateInner>,
    }

    impl SharedState {
        fn new() -> impl PinInit<Self> {
            pin_init!(Self {
                state_changed <- new_condvar!("SharedState::state_changed"),
                inner <- new_mutex!(
                    SharedStateInner { token_count: 0 },
                    "SharedState::inner",
                ),
            })
        }
    }

Notably, the way the macro is used here requires no `unsafe` and thus comes with the usual Rust promise of safe code not introducing any memory violations. Additionally, it is now up to the caller of `new()` to decide the memory location of the `SharedState`. At the moment they can choose `Arc<T>`, `Box<T>` or the stack.

--

The API has the following architecture:

1. Initializer traits `PinInit<T, E>` and `Init<T, E>` that act like closures.
2. Macros to create these initializer traits safely.
3. Functions to allow manually writing initializers.

The initializers (an `impl PinInit<T, E>`) receive a raw pointer pointing to uninitialized memory and their job is to fully initialize a `T` at that location. If initialization fails, they return an error (`E`) by value.

This way of initializing cannot be safely exposed to the user, since it relies upon these properties outside of the control of the trait:

- the memory location (slot) needs to be valid memory,
- if initialization fails, the slot should not be read from,
- the value in the slot should be pinned, so it cannot move and the memory cannot be deallocated until the value is dropped.

This is why using an initializer is facilitated by another trait that ensures these requirements.

These initializers can be created manually by just supplying a closure that fulfills the same safety requirements as `PinInit<T, E>`. But this is an `unsafe` operation. To allow safe initializer creation, the `pin_init!` macro is provided along with three other variants: `try_pin_init!`, `try_init!` and `init!`. These take a modified struct initializer as a parameter and generate a closure that initializes the fields in sequence.

The macros take great care in upholding the safety requirements:

- A shadowed struct type is used as the return type of the closure instead of `()`. This is to prevent early returns, as these would prevent full initialization.
- To ensure every field is only initialized once, a normal struct initializer is placed in unreachable code. The type checker will emit errors if a field is missing or specified multiple times.
- When initializing a field fails, the whole initializer will fail and automatically drop fields that have been initialized earlier.
- Only the correct initializer type is allowed for unpinned fields. You cannot use an `impl PinInit<T, E>` to initialize a structurally not pinned field.

To ensure the last point, an additional macro `#[pin_data]` is needed. This macro annotates the struct itself and the user specifies structurally pinned and not pinned fields.

Because dropping a pinned struct is also not allowed to break the pinning invariants, another macro attribute `#[pinned_drop]` is needed.

These two macros also have mechanisms to ensure the overall safety of the API. Additionally, they utilize a combined proc-macro, declarative macro design: first a proc-macro enables the outer attribute syntax `#[...]` and does some important pre-parsing. Notably this prepares the generics such that the declarative macro can handle them using token trees. Then the actual parsing of the structure and the emission of code is handled by a declarative macro.

For pin-projections the crates `pin-project` [3] and `pin-project-lite` [4] had been considered, but were ultimately rejected:

- `pin-project` depends on `syn` [5] which is a very big dependency, around 50k lines of code.
- `pin-project-lite` is a more reasonable 5k lines of code, but contains a very complex declarative macro to parse generics. On top of that it would require modification that would need to be maintained independently.

Link: https://rust-for-linux.com/the-safe-pinned-initialization-problem [1]
Link: https://github.com/Rust-for-Linux/linux/blob/f509ede33fc10a07eba3da14aa00302bd4b5dddd/samples/rust/rust_miscdev.rs [2]
Link: https://crates.io/crates/pin-project [3]
Link: https://crates.io/crates/pin-project-lite [4]
Link: https://crates.io/crates/syn [5]
Co-developed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Gary Guo <gary@garyguo.net>
Signed-off-by: Benno Lossin <y86-dev@protonmail.com>
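The closure-like initializer design from this commit message can be sketched in userspace Rust. This is a simplified illustration under stated assumptions, not the kernel implementation: `PinInitSketch`, `from_closure`, `pin_box`, `ListHead` and `new_pinned_head` are hypothetical names, `Box` stands in for the kernel allocation types, and the safety macros (`pin_init!`, `#[pin_data]`) are left out entirely.

```rust
use std::convert::Infallible;
use std::marker::PhantomPinned;
use std::mem::MaybeUninit;
use std::pin::Pin;
use std::ptr::addr_of_mut;

/// Simplified stand-in for the initializer trait: writes a `T` into `slot`.
///
/// # Safety
///
/// Implementers must leave a fully initialized `T` in `slot` when returning
/// `Ok(())` and nothing that needs dropping when returning `Err`.
unsafe trait PinInitSketch<T, E> {
    /// # Safety
    ///
    /// `slot` must be valid for writes and treated as pinned afterwards.
    unsafe fn init_in_place(self, slot: *mut T) -> Result<(), E>;
}

/// Wrapper that turns a closure into an initializer.
struct FromClosure<F>(F);

/// # Safety
///
/// `f` must uphold the implementer contract of `PinInitSketch`.
unsafe fn from_closure<T, E, F>(f: F) -> impl PinInitSketch<T, E>
where
    F: FnOnce(*mut T) -> Result<(), E>,
{
    FromClosure(f)
}

// SAFETY: `from_closure` is the only constructor and passes the contract on.
unsafe impl<T, E, F> PinInitSketch<T, E> for FromClosure<F>
where
    F: FnOnce(*mut T) -> Result<(), E>,
{
    unsafe fn init_in_place(self, slot: *mut T) -> Result<(), E> {
        (self.0)(slot)
    }
}

/// Safe entry point: provides a valid, pinned slot and runs the initializer.
fn pin_box<T, E, I: PinInitSketch<T, E>>(init: I) -> Result<Pin<Box<T>>, E> {
    let mut slot: Box<MaybeUninit<T>> = Box::new(MaybeUninit::uninit());
    // SAFETY: the slot is valid heap memory and is pinned before it is exposed.
    // On error the box is freed without reading or dropping a `T`.
    unsafe { init.init_in_place(slot.as_mut_ptr())? };
    // SAFETY: the initializer reported success, so the slot holds a `T`.
    let boxed = unsafe { Box::from_raw(Box::into_raw(slot).cast::<T>()) };
    Ok(Pin::from(boxed))
}

/// Self-referential list head, loosely modelled on the C `list_head`.
struct ListHead {
    next: *const ListHead,
    prev: *const ListHead,
    _pin: PhantomPinned,
}

impl ListHead {
    /// Returns an initializer instead of a value, so the final address is
    /// known when the self-pointers are written.
    fn new() -> impl PinInitSketch<Self, Infallible> {
        let init = |slot: *mut Self| -> Result<(), Infallible> {
            // SAFETY: `init_in_place` callers guarantee `slot` is writable.
            unsafe {
                addr_of_mut!((*slot).next).write(slot as *const Self);
                addr_of_mut!((*slot).prev).write(slot as *const Self);
            }
            // `_pin` is zero-sized, so there is nothing to write for it.
            Ok(())
        };
        // SAFETY: the closure writes every non-zero-sized field and never fails.
        unsafe { from_closure(init) }
    }

    fn is_empty(&self) -> bool {
        std::ptr::eq(self.next, self)
    }
}

/// The caller picks the memory location; here it is a pinned `Box`.
fn new_pinned_head() -> Pin<Box<ListHead>> {
    let res: Result<Pin<Box<ListHead>>, Infallible> = pin_box(ListHead::new());
    match res {
        Ok(head) => head,
        Err(never) => match never {},
    }
}
```

The `unsafe` blocks in `ListHead::new` are what the `pin_init!` family of macros is described as generating and checking on the user's behalf; in this sketch they are written by hand.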

Add helper functions to more easily initialize `Opaque<T>` via FFI. These functions take a function pointer to the FFI-initialization function and between 0 and 4 other arguments. They then return an initializer that uses the FFI function along with the given arguments to initialize an `Opaque<T>`.

Signed-off-by: Benno Lossin <y86-dev@protonmail.com>
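A rough userspace sketch of the one-argument case described in this commit message follows. Everything in it is an assumption made for illustration: the `Opaque` type is a simplified stand-in for the kernel type, `ffi_init1`, `FfiInit1`, `Counter`, `counter_init` and `init_counter_via_ffi` are hypothetical names, and a Rust `extern "C"` function plays the role of the C initialization function.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Simplified stand-in for the kernel's `Opaque<T>`.
#[repr(transparent)]
struct Opaque<T>(UnsafeCell<MaybeUninit<T>>);

/// Initializer that remembers an FFI init function plus one argument.
struct FfiInit1<T, A> {
    init_func: unsafe extern "C" fn(*mut T, A),
    arg: A,
}

impl<T> Opaque<T> {
    fn uninit() -> Self {
        Opaque(UnsafeCell::new(MaybeUninit::uninit()))
    }

    fn get(&self) -> *mut T {
        self.0.get().cast::<T>()
    }

    /// # Safety
    ///
    /// `init_func` must fully initialize the `T` it is given when called
    /// with `arg`.
    unsafe fn ffi_init1<A>(
        init_func: unsafe extern "C" fn(*mut T, A),
        arg: A,
    ) -> FfiInit1<T, A> {
        FfiInit1 { init_func, arg }
    }
}

impl<T, A> FfiInit1<T, A> {
    /// # Safety
    ///
    /// `slot` must be valid for writes.
    unsafe fn init(self, slot: *mut Opaque<T>) {
        // SAFETY: `Opaque<T>` is `repr(transparent)` over storage for a `T`,
        // and the caller guarantees that `slot` is writable.
        unsafe { (self.init_func)(slot.cast::<T>(), self.arg) }
    }
}

#[repr(C)]
struct Counter {
    value: i32,
}

/// Plays the role of a C initialization function.
unsafe extern "C" fn counter_init(ptr: *mut Counter, start: i32) {
    // SAFETY: callers pass a pointer that is valid for writes.
    unsafe { ptr.write(Counter { value: start }) }
}

/// Builds the initializer, runs it on a local slot and reads the result back.
fn init_counter_via_ffi(start: i32) -> i32 {
    let mut opaque = Opaque::<Counter>::uninit();
    // SAFETY: `counter_init` writes a complete `Counter`.
    let init = unsafe { Opaque::<Counter>::ffi_init1(counter_init, start) };
    // SAFETY: `opaque` is a live local, so the pointer is valid for writes.
    unsafe { init.init(addr_of_mut!(opaque)) };
    // SAFETY: the value was initialized on the line above.
    unsafe { (*opaque.get()).value }
}
```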

`UniqueArc::try_new_uninit` calls `Arc::try_new(MaybeUninit::uninit())`. This results in the uninitialized memory being placed on the stack, which may be arbitrarily large due to the generic `T` and thus could cause a stack overflow for large types.

Change the implementation to use the pin-init API which enables in-place initialization. In particular it avoids having to first construct and then move the uninitialized memory from the stack into the final location.

Signed-off-by: Benno Lossin <y86-dev@protonmail.com>
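The stack-usage concern in this commit message can be illustrated with a userspace analogue. The sketch below deliberately uses a different technique from the commit: it asks the global allocator for the storage directly instead of going through the pin-init API, and `try_box_uninit` and `try_big_zeroed` are hypothetical names. It only shows that large uninitialized storage can be obtained and filled without first building the value on the stack.

```rust
use std::alloc::{alloc, Layout};
use std::mem::MaybeUninit;

/// Allocates uninitialized heap storage for `T` without creating a
/// `MaybeUninit<T>` temporary on the stack. Returns `None` if the
/// allocation fails.
fn try_box_uninit<T>() -> Option<Box<MaybeUninit<T>>> {
    let layout = Layout::new::<MaybeUninit<T>>();
    if layout.size() == 0 {
        // Zero-sized types need no real allocation.
        return Some(Box::new(MaybeUninit::uninit()));
    }
    // SAFETY: the layout has a non-zero size.
    let ptr = unsafe { alloc(layout) }.cast::<MaybeUninit<T>>();
    if ptr.is_null() {
        return None;
    }
    // SAFETY: the pointer comes from the global allocator with the layout of
    // `MaybeUninit<T>`, which needs no initialization.
    Some(unsafe { Box::from_raw(ptr) })
}

const LEN: usize = 1 << 20;

/// Fills a 1 MiB buffer in place on the heap.
fn try_big_zeroed() -> Option<Box<[u8; LEN]>> {
    let mut slot = try_box_uninit::<[u8; LEN]>()?;
    // SAFETY: the slot is valid for `LEN` bytes of writes.
    unsafe { slot.as_mut_ptr().cast::<u8>().write_bytes(0, LEN) };
    // SAFETY: every byte of the array was initialized above.
    Some(unsafe { Box::from_raw(Box::into_raw(slot).cast::<[u8; LEN]>()) })
}
```

By contrast, `Box::new(MaybeUninit::<[u8; LEN]>::uninit())` may materialize the 1 MiB temporary on the stack before moving it, which is the pattern the commit message describes for `Arc::try_new(MaybeUninit::uninit())`.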

BennoLossin · 2023-03-22T16:40:50Z

continued at #989

BennoLossin mentioned this pull request Mar 13, 2023

Patch/pinned init v1 #985

Closed

ojeda reviewed Mar 13, 2023


Comment thread rust/macros/quote.rs Outdated

Comment thread rust/macros/quote.rs Outdated

BennoLossin force-pushed the patch/pinned-init-v2 branch 3 times, most recently from 747af8a to deae20e Compare March 16, 2023 09:45

nbdd0121 reviewed Mar 17, 2023


BennoLossin force-pushed the patch/pinned-init-v2 branch 4 times, most recently from 4e9b017 to e618adb Compare March 21, 2023 18:18

nbdd0121 and others added 5 commits March 21, 2023 20:08

BennoLossin force-pushed the patch/pinned-init-v2 branch from e618adb to 82c88d8 Compare March 21, 2023 19:09

BennoLossin closed this Mar 22, 2023

BennoLossin deleted the patch/pinned-init-v2 branch September 14, 2023 11:30

BennoLossin wants to merge 5 commits into Rust-for-Linux:rust-next from BennoLossin:patch/pinned-init-v2


3 participants
