Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pausing execution of interpreter #687

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

resulknad
Copy link
Collaborator

Enable preemption / pausing of wasm applets behind a feature flag.

  • pass AtomicBool as argument
  • check the bool on function calls and back edges
  • wasm test case with infinite loop and infinite recursion added

@resulknad resulknad requested a review from ia0 as a code owner November 12, 2024 15:51
Copy link
Member

@ia0 ia0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The main point is that we need a way to know whether execution was interrupted and a way to resume from that. Right now we don't. The design is not clear. I guess one option would be to have some interrupted: Option<Thread<'m>> in Store and set it in unbounded_continue() before returning Interrupt. Then add some is_interrupted() -> bool similar to last_call() and make sure in last_call that we are not interrupted, and add some Store::resume() function to resume from interrupt.

Similarly, the tests/interrupt.rs file should not only test that we can interrupt, but also that we can resume. We could have 2 tests (and run the test with --test-threads=1 if we care about timings, but maybe we shouldn't):

  • One that we can break of an infinite loop.
  • One that is essentially something like loop { for _ in 0..count() {} } where count() is a host function that in practice can always return some relatively big number and starts a thread that waits then interrupts (like above). And we can run it a few times alternating between host call and interrupt.

Actually, maybe only the second test is enough since it contains the first one. We also don't need to test timings anymore because the fact that the host function is eventually called is already a proof that something gets executed.

crates/interpreter/src/exec.rs Outdated Show resolved Hide resolved
crates/interpreter/examples/hello.rs Outdated Show resolved Hide resolved
crates/interpreter/Cargo.toml Outdated Show resolved Hide resolved
crates/interpreter/examples/hello.rs Outdated Show resolved Hide resolved
crates/interpreter/src/exec.rs Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should ideally be in binary format in the tests/interrupt.rs file with the text format that generates it (like examples/hello.rs has). In particular, this test is not in the specification, so it shouldn't be in tests/spec.rs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, good point. I have created both a .wat and .wasm file. Let me know if you would prefer to have them embedded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to have the .wasm file embedded like examples/hello.rs to avoid having too many binary files in the repository. We currently have pdf and png because there's no real workaround, so I'd like to avoid adding wasm to this list.

crates/interpreter/tests/spec.rs Outdated Show resolved Hide resolved
crates/interpreter/test.sh Outdated Show resolved Hide resolved
crates/interpreter/src/exec.rs Outdated Show resolved Hide resolved
crates/interpreter/src/exec.rs Outdated Show resolved Hide resolved
@resulknad
Copy link
Collaborator Author

Thanks for the feedback! I really like not passing the bool as an argument everywhere, it's much cleaner that way.

As for the resume after the interruption, I decided to piggy back on how it's done for other host calls. I was thinking it's the same exact mechanism and requirements, so no need to replicate. Let me know if you think otherwise.

For the test case, have a look at interrupt.rs. Basically it loops a few times, triggering an interrupt each time and resuming again. Ultimately we check that a certain amount of interrupts have happened.

@ia0 ia0 added crate:interpreter Modifies the interpreter for:usability Improves users (and maintainers) life labels Dec 2, 2024
Copy link
Member

@ia0 ia0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to piggy back on how it's done for other host calls. I was thinking it's the same exact mechanism and requirements, so no need to replicate. Let me know if you think otherwise.

Yes I think otherwise but while this is gated by a feature I'd say it's fine.

The reason I think otherwise is because it's not the same mechanism and requirements :-) and it's very easy for a user to do a mistake and reply to an interrupt as if it was a call (discarding data) or reply to a call as if it was an interrupt (possibly panicking).

For the test case, have a look at interrupt.rs. Basically it loops a few times, triggering an interrupt each time and resuming again. Ultimately we check that a certain amount of interrupts have happened.

I like the idea of stopping after a deterministic condition. And when we make the count dynamic (or adaptative) we should instead check that a certain amount of interrupts happened.

self.interrupt,
);

// Disable interrupts for the start section.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want an abstraction to avoid losing the atomic bool.

fn Store::without_interrupt<R>(&mut self, operation: impl FnOnce(&mut Store) -> R) -> R {
    #[cfg(feature = "interrupt")]
    let saved = self.interrupt.take();
    let result = operation(self);
    #[cfg(feature = "interrupt")]
    self.interrupt = saved;
    result
}

Then here we would:

self.without_interrupt(|this| {
    let result = thread.run(this)?;
    assert!(matches!(...));
})?;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, if we do not pass the atomic bool when creating the thread just above, there is no need to manipulate anything on the store right?

Also I was running into some lifetime issues with the snippet above. Without any lifetime annotations, it will require the Store to live forever ('static). My first attempt at adding some lifetime annotations wasn't really successful but we can look more into that in case we decide that we do need this function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, my code snippet was wrong, we need to specify the store lifetime:

    fn without_interrupt<R>(&mut self, operation: impl FnOnce(&mut Store<'m>) -> R) -> R {

And then it can be used like that:

            let thread = Thread::new(parser, Frame::new(inst_id, 0, &[], locals));
            self.without_interrupt(|x| {
                let result = thread.run(x)?;
                assert!(matches!(result, RunResult::Done(x) if x.is_empty()));
                Ok(())
            })?;

@@ -470,6 +514,10 @@ pub enum RunResult<'a, 'm> {

/// Execution is calling into the host.
Host(Call<'a, 'm>),

// Execution was interrupted by the host.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Execution was interrupted by the host.
/// Execution was interrupted by the host.

@@ -58,6 +63,9 @@ pub struct Store<'m> {
// functions in `funcs` is stored to limit normal linking to that part.
func_default: Option<(&'m str, usize)>,
threads: Vec<Continuation<'m>>,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change

Comment on lines 1113 to 1115
if self.interrupt.is_some_and(|interrupt| {
interrupt.compare_exchange_weak(true, false, Relaxed, Relaxed).is_ok()
}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.interrupt.is_some_and(|interrupt| {
interrupt.compare_exchange_weak(true, false, Relaxed, Relaxed).is_ok()
}) {
if self.interrupt.is_some_and(|x|x.swap(false, Relaxed)) {

crates/interpreter/src/exec.rs Show resolved Hide resolved
crates/interpreter/tests/interrupt.rs Outdated Show resolved Hide resolved
crates/interpreter/tests/interrupt.rs Outdated Show resolved Hide resolved
crates/interpreter/tests/interrupt.rs Outdated Show resolved Hide resolved
crates/interpreter/tests/interrupt.rs Outdated Show resolved Hide resolved
thread::sleep(time::Duration::from_millis(1));
interrupt.store(true, SeqCst);
});
result = call.resume(&[Val::I32(1000)]).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit brittle. We could use a dynamic count (starting with 1000) and doubling each time we hit the next loop without first hitting an interrupt. But this could be done in a follow-up PR if needed.

Copy link
Collaborator Author

@resulknad resulknad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have some questions around how to pass a mutable reference of the store into pop_label and maybe related to that, if we actually do need to set the store's atomic bool reference to None.

parser,
Frame::new(inst_id, 0, &[], locals),
#[cfg(feature = "interrupt")]
self.interrupt,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not passing &mut Store to pop_label because let inst: &mut Instance<'m> = &mut store.insts[inst_id]; already has a mutable reference to store. So rustc will complain with cannot borrow store as mutable more than once at a time.

This is why I passed a mutable reference to store.threads only. Do you have a suggestion on how to work around this?

self.interrupt,
);

// Disable interrupts for the start section.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, if we do not pass the atomic bool when creating the thread just above, there is no need to manipulate anything on the store right?

Also I was running into some lifetime issues with the snippet above. Without any lifetime annotations, it will require the Store to live forever ('static). My first attempt at adding some lifetime annotations wasn't really successful but we can look more into that in case we decide that we do need this function.

crates/interpreter/test.sh Outdated Show resolved Hide resolved
crates/interpreter/tests/interrupt.rs Outdated Show resolved Hide resolved
crates/interpreter/src/exec.rs Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
crate:interpreter Modifies the interpreter for:usability Improves users (and maintainers) life
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants