Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

boot-qemu: Always use -no-reboot #65

Merged
merged 1 commit into from
Jul 25, 2022
Merged

Conversation

kees
Copy link
Contributor

@kees kees commented Jul 25, 2022

To make things easier for shutting down after a reboot (or a reboot on
panic), always use -no-reboot regardless of architecture.

Signed-off-by: Kees Cook [email protected]

To make things easier for shutting down after a reboot (or a reboot on
panic), always use -no-reboot regardless of architecture.

Signed-off-by: Kees Cook <[email protected]>
Copy link
Member

@msfjarvis msfjarvis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks reasonable from what I've read about -no-reboot and -no-shutdown but I'll let someone who uses this on the daily merge the PR.

@kees kees merged commit 2f696f1 into ClangBuiltLinux:main Jul 25, 2022
@nathanchance
Copy link
Member

Hmmm, we might want to rethink doing this unconditionally, as I think it will cause us to miss out on timeouts for certain crashes. android-4.14-stable has been failing to boot for two months but there is no failure because the machine does not continuously reboot, which would trigger the timeout.

[    0.902647] Freeing SMP alternatives memory: 132K
[    0.937461] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in:         (ptrval)
[    0.937461] 
[    0.938272] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.305 #1
[    0.938448] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.1-1-1 04/01/2014
[    0.938786] Call Trace:
[    0.940000]  panic+0x11b/0x570
[    0.940000]  ? acpi_read_bit_register+0x66/0xa8
[    0.940000]  ? start_kernel+0xc5d/0xc90
[    0.940000]  ? early_idt_handler_array+0x120/0x120
[    0.940000]  __stack_chk_fail+0x14/0x20
[    0.940000]  start_kernel+0xc5d/0xc90
[    0.940000]  x86_64_start_reservations+0x30a/0x310
[    0.940000]  x86_64_start_kernel+0x44a/0x450
[    0.940000]  secondary_startup_64+0xa5/0xb0
[    0.940000] Rebooting in 5 seconds..

@kees
Copy link
Contributor Author

kees commented Mar 17, 2023

But why didn't the running notice it never reached userspace?

@nathanchance
Copy link
Member

But why didn't the running notice it never reached userspace?

Because QEMU exited cleanly, which seems expected to me in that case with -no-reboot (the machine goes to reboot but we told QEMU to just exit on reboot). Without timeout, we would have to always use pvpanic (I think?) to catch these issues, which won't scale with CI. It is possible that -action panic=exit-failure might do what we want but I am not sure that will catch everything, I will try to play around with it later.

@ojeda
Copy link
Member

ojeda commented Mar 17, 2023

What about matching the log against some string that you print within QEMU at the end of the run (and checking it is there only once)? Poor man's solution, but it may be a good idea and can be done in addition to another approach too. We did something like that for tests etc.

@nathanchance
Copy link
Member

What about matching the log against some string that you print within QEMU at the end of the run (and checking it is there only once)? Poor man's solution, but it may be a good idea and can be done in addition to another approach too. We did something like that for tests etc.

Thanks for the suggestion, I have gone ahead and filed #96 to further discuss that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants