Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow files to share state. #145

Merged
merged 3 commits into from
Apr 8, 2021
Merged

Conversation

wedsonaf
Copy link

chrdev will need to be updated once we decide how we want it to share state. For now, it doesn't allow sharing.

drivers/char/rust_example.rs Outdated Show resolved Hide resolved
drivers/char/rust_example.rs Outdated Show resolved Hide resolved
drivers/char/rust_example.rs Outdated Show resolved Hide resolved
@ojeda
Copy link
Member

ojeda commented Mar 25, 2021

Can you please give some context / explanation to the changes?

@wedsonaf
Copy link
Author

Can you please give some context / explanation to the changes?

Sure. Consider the example that I added: how would you implement it without my changes? The only way to do it would be with global variables, and even then it wouldn't be great because if we registered more than one device they'd all share the global state.

My change allows us to have per-device state: when we register a device, we provide some state; then each file gets a reference to the shared state on its open method (and does with it whatever it wants).

This is a very common pattern in the kernel, and each type of device saves this shared state in a different place. For example, miscdev uses file->private_data, while chrdev uses inode->i_cdev to store pointers to miscdev and cdev. These are usually embedded in an outer struct, so using container_of we can access the shared state.

In C, this is all untyped and one could easily mess this up. In Rust, the unsafe code is wrapped and it exposes a safe API.

In binder, we need this because each device is a "universe" -- processes that can access a device can all do IPC amongst themselves, totally independently of what's happening in another device (e.g., normally we have three instances: binder, hwbinder, and vndbinder).

}

impl SharedState {
fn try_new() -> KernelResult<Arc<Self>> {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alex -- this is an example of the ugliness brought by Pin I mentioned in one of the meetings. A bunch of unsafe that wouldn't need to be unsafe if we had better support for pinning. Two aspects are relevant here:

  1. Initialisation: CondVar::new and Mutex::new are unsafe because they require a call to init after the condvar/mutex are in their final pinned location.
  2. Projection: the compiler can't (or won't) verify for us that we don't move out of the fields of a pinned structure, so we need unsafe. (Though admittedly I don't explicitly say that state is pinned, mostly because it doesn't give me anything.)

It would be nice if we could avoid this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. When you say "we had better support for pinning", do you think this is something that Rust-for-Linux can improve itself, or do you think this requires better language level support?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we could choose to ignore Pin and assume that everything that is heap-allocated is pinned, but my opinion is that this isn't desirable because Pin is actually useful. (We sort of discussed this in the meeting.)

So I think we need better language support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshtriplett as our resident friend upstream, what's the best way for us to share these observations/desires with the team?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RalfJung -- since you commented on something else related to this project, I thought I'd ask you if you have any thoughts about improving pinning support. As it is today, it forces us (unreasonably in my opinion) to use too much unsafe code.

Copy link
Member

@bjorn3 bjorn3 Apr 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Projection: the compiler can't (or won't) verify for us that we don't move out of the fields of a pinned structure, so we need unsafe. (Though admittedly I don't explicitly say that state is pinned, mostly because it doesn't give me anything.)

The pin-project crate has a proc macro to help with this.

Copy link

@RalfJung RalfJung Apr 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that writing code with pin projections currently requires more unsafe code than it should. I am not sure what the best way forward for that is; the pin-project crate mentioned by @bjorn3 is a great start showing how far one can go even without any language support. (I did not carefully audit it though.)

The "new and then init" pattern seems mostly orthogonal to pinning; that pattern inherently requires exposing the user to an unsafe pre-init state. If you want to expose that state safely, you need to give it a separate type.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you commented on something else related to this project, I thought I'd ask you

Sure, feel free to. :) I don't have a ton of time these days, but I am super excited about Rust having a realistic chance of being used in the upstream Linux kernel, so if I can help in small ways I'm more than happy to do that.

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had one pass through this, and I'm still working to make sure I fully understand it, to be honest.

One small thing though -- do the changes to make read/write return counts need to be a part of this? Could we split them out, just to reduce the review surface area.

rust/kernel/miscdev.rs Show resolved Hide resolved
@wedsonaf
Copy link
Author

I've had one pass through this, and I'm still working to make sure I fully understand it, to be honest.

Thanks for reviewing. I'm happy to try to clarify if there's anything that doesn't make sense.

One small thing though -- do the changes to make read/write return counts need to be a part of this? Could we split them out, just to reduce the review surface area.

It is needed by the example. I could split out the example if you think it'll help, although it would be counter to what reviewers have been asking -- they always [rightfully] ask for example usage.

@alex
Copy link
Member

alex commented Mar 30, 2021 via email

@wedsonaf
Copy link
Author

Can we split the baby :-) The changes for read/write to return usize into their own PR, but keep the core changes to the example that demo the shared state here.

Sure, give me a few minutes.

@wedsonaf
Copy link
Author

wedsonaf commented Apr 1, 2021

I'm splitting the last commit (or most of it) into its own PR to simplify things here.

@wedsonaf
Copy link
Author

wedsonaf commented Apr 1, 2021

Actually, I'm going to split it into more PRs.

@wedsonaf
Copy link
Author

wedsonaf commented Apr 1, 2021

Only 37ee4f8 and c8e1fdb are part of this PR now.

@wedsonaf wedsonaf requested a review from alex April 1, 2021 17:25
Copy link
Member

@ojeda ojeda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple bits that I noticed...

rust/kernel/file_operations.rs Outdated Show resolved Hide resolved
drivers/char/rust_example.rs Show resolved Hide resolved
rust/kernel/file_operations.rs Show resolved Hide resolved
@wedsonaf
Copy link
Author

wedsonaf commented Apr 7, 2021

@ojeda @alex -- is there anything about this that is bothering you or doesn't sound right? I'm happy to discuss if that's the case.

@alex
Copy link
Member

alex commented Apr 7, 2021

I lost track of the fact that I owed you a re-review. Added to my TODO list for this evening.

@ojeda
Copy link
Member

ojeda commented Apr 7, 2021

is there anything about this that is bothering you or doesn't sound right?

Not at all! Just marked things as resolved (I usually let the PR sender to resolve them).

@alex
Copy link
Member

alex commented Apr 7, 2021

@ojeda did you want to take another pass at this before it's merged?

@ojeda ojeda merged commit 8d2489b into Rust-for-Linux:rust Apr 8, 2021
@ojeda
Copy link
Member

ojeda commented Apr 8, 2021

Let's move forward!

@wedsonaf wedsonaf deleted the file-open branch April 8, 2021 10:40
@wedsonaf wedsonaf mentioned this pull request May 27, 2021
ojeda pushed a commit that referenced this pull request May 4, 2022
This is a partial revert of commit 0faf20a ("powerpc/64s/interrupt:
Don't enable MSR[EE] in irq handlers unless perf is in use").

Prior to that commit, we always set the decrementer in
timer_interrupt(), to clear the timer interrupt. Otherwise we could end
up continuously taking timer interrupts.

When high res timers are enabled there is no problem seen with leaving
the decrementer untouched in timer_interrupt(), because it will be
programmed via hrtimer_interrupt() -> tick_program_event() ->
clockevents_program_event() -> decrementer_set_next_event().

However with CONFIG_HIGH_RES_TIMERS=n or booting with highres=off, we
see a stall/lockup, because tick_nohz_handler() does not cause a
reprogram of the decrementer, leading to endless timer interrupts.
Example trace:

  [    1.898617][    T7] Freeing initrd memory: 2624K^M
  [   22.680919][    C1] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:^M
  [   22.682281][    C1] rcu:     0-....: (25 ticks this GP) idle=073/0/0x1 softirq=10/16 fqs=1050 ^M
  [   22.682851][    C1]  (detected by 1, t=2102 jiffies, g=-1179, q=476)^M
  [   22.683649][    C1] Sending NMI from CPU 1 to CPUs 0:^M
  [   22.685252][    C0] NMI backtrace for cpu 0^M
  [   22.685649][    C0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-rc2-00185-g0faf20a1ad16 #145^M
  [   22.686393][    C0] NIP:  c000000000016d64 LR: c000000000f6cca4 CTR: c00000000019c6e0^M
  [   22.686774][    C0] REGS: c000000002833590 TRAP: 0500   Not tainted  (5.16.0-rc2-00185-g0faf20a1ad16)^M
  [   22.687222][    C0] MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24000222  XER: 00000000^M
  [   22.688297][    C0] CFAR: c00000000000c854 IRQMASK: 0 ^M
  ...
  [   22.692637][    C0] NIP [c000000000016d64] arch_local_irq_restore+0x174/0x250^M
  [   22.694443][    C0] LR [c000000000f6cca4] __do_softirq+0xe4/0x3dc^M
  [   22.695762][    C0] Call Trace:^M
  [   22.696050][    C0] [c000000002833830] [c000000000f6cc80] __do_softirq+0xc0/0x3dc (unreliable)^M
  [   22.697377][    C0] [c000000002833920] [c000000000151508] __irq_exit_rcu+0xd8/0x130^M
  [   22.698739][    C0] [c000000002833950] [c000000000151730] irq_exit+0x20/0x40^M
  [   22.699938][    C0] [c000000002833970] [c000000000027f40] timer_interrupt+0x270/0x460^M
  [   22.701119][    C0] [c0000000028339d0] [c0000000000099a8] decrementer_common_virt+0x208/0x210^M

Possibly this should be fixed in the lowres timing code, but that would
be a generic change and could take some time and may not backport
easily, so for now make the programming of the decrementer unconditional
again in timer_interrupt() to avoid the stall/lockup.

Fixes: 0faf20a ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless perf is in use")
Reported-by: Miguel Ojeda <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants