I see. Indeed the lower clock states are marked as “inefficient”.
We may have some luck if we enable ftrace for the kernel and poke around…
Edit: note to self:

loc | comment |
---|---|
kernel/power/suspend.c:L391 | suspend_enter |
L425 | call to pm_sleep_disable_secondary_cpus, then cpuidle_pause. We observe a log sequence shutting down the non-boot cores. |
L434 | call to syscore_suspend |
drivers/base/syscore.c:L47 | syscore_suspend – goes through registered syscore_ops. cpu_pm is one of them (maybe irrelevant). |
arch/arm64/kernel/sleep.S, arch/arm64/kernel/suspend.c | arm64 sleep. It’s hooked up to cpuidle. |

In between, there are two paths that may cause an immediate resume:
- pm_sleep_disable_secondary_cpus fails, or suspend_test fails (update: that’s for pm_test).
- syscore_suspend() fails.
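The bail-out paths can be modelled in plain userspace C. This is only a sketch of the control flow, not kernel code: the struct suspend_model fields and the bail_point names are hypothetical stand-ins for the return values of the real calls, the pm_test branches (suspend_test) are left out, and I added the case where suspend_ops->enter itself comes back with an error.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for what the real kernel calls would return. */
struct suspend_model {
    int secondary_cpus_err; /* pm_sleep_disable_secondary_cpus() */
    int syscore_err;        /* syscore_suspend() */
    bool wakeup_pending;    /* pm_wakeup_pending() */
    int enter_err;          /* suspend_ops->enter() */
};

/* Where the model gave up; the names are made up for this sketch. */
enum bail_point {
    BAIL_NONE,           /* the platform really suspended */
    BAIL_SECONDARY_CPUS, /* could not take the non-boot CPUs down */
    BAIL_SYSCORE,        /* a syscore_ops suspend callback failed */
    BAIL_WAKEUP,         /* a wakeup event was already pending */
    BAIL_ENTER,          /* suspend_ops->enter() returned an error */
};

/* Walks the same checks, in the same order, as the tail of suspend_enter(). */
static enum bail_point model_suspend_enter(const struct suspend_model *m,
                                           int *error)
{
    *error = m->secondary_cpus_err;
    if (*error)
        return BAIL_SECONDARY_CPUS;

    *error = m->syscore_err;
    if (*error)
        return BAIL_SYSCORE;

    if (m->wakeup_pending) {
        *error = -EBUSY;
        return BAIL_WAKEUP;
    }

    *error = m->enter_err;
    return *error ? BAIL_ENTER : BAIL_NONE;
}
```

Every return other than BAIL_NONE looks the same from the outside: the machine resumes right away.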
Update:
After a few rounds of printk hacking…

    error = syscore_suspend();
    pr_err("after syscore_suspend, error=%d", error);
    if (!error) {
        *wakeup = pm_wakeup_pending();
        pr_err("after pm_wakeup_pending");
        if (!(suspend_test(TEST_CORE) || *wakeup)) {
            trace_suspend_resume(TPS("machine_suspend"),
                state, true);
            pr_err("before suspend_ops->enter");
            error = suspend_ops->enter(state);
            pr_err("after suspend_ops->enter");
            trace_suspend_resume(TPS("machine_suspend"),
                state, false);
        } else if (*wakeup) {
            error = -EBUSY;
            pr_err("wakeup BUSY");
        }
        syscore_resume();
    } else {
        pr_err("syscore_suspend fails: %d", error);
    }
The resulting log:

    [ 965.639523] PM: suspend entry (deep)
    [ 965.930618] Filesystems sync: 0.291 seconds
    [ 965.974514] Freezing user space processes ... (elapsed 0.003 seconds) done.
    [ 965.978488] OOM killer disabled.
    [ 965.978494] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
    [ 965.980077] printk: Suspending console(s) (use no_console_suspend to debug)
    [ 966.526141] PM: suspend devices took 0.550 seconds
    [ 966.529183] PM: after platform_suspend_prepare_noirq
    [ 966.529248] Disabling non-boot CPUs ...
    [ 966.532391] psci: CPU1 killed (polled 0 ms)
    [ 966.536375] psci: CPU2 killed (polled 0 ms)
    [ 966.539808] psci: CPU3 killed (polled 0 ms)
    [ 966.542967] psci: CPU4 killed (polled 0 ms)
    [ 966.545572] psci: CPU5 killed (polled 0 ms)
    [ 966.546524] PM: after syscore_suspend, error=0
    [ 966.546524] PM: after pm_wakeup_pending
    [ 966.546524] PM: before suspend_ops->enter
    [ 966.546524] PM: after suspend_ops->enter
    [ 966.546543] Enabling non-boot CPUs ...
    [ 966.547097] Detected VIPT I-cache on CPU1
So it’s suspend_ops->enter that is not doing its work: it returns right away instead of suspending.
On arm64, suspend should be handled by PSCI (drivers/firmware/psci/psci.c).
Down there (PSCI firmware), it’s really out of my reach. But I’ll finish the tracing anyway.

    [ 0.000000] psci: psci_init_smccc: feature = 0
    [ 0.000000] psci: SMC Calling Convention v1.2
    [ 0.000000] psci: psci_init_cpu_suspend: feature = 0
    [ 0.000000] psci: psci_init_system_suspend: ret = 0
    [ 0.000000] psci: psci_system_reset2: ret = -1
    [ 80.919482] Disabling non-boot CPUs ...
    [ 80.923640] psci: CPU1 killed (polled 0 ms)
    [ 80.927784] psci: CPU2 killed (polled 0 ms)
    [ 80.931379] psci: CPU3 killed (polled 0 ms)
    [ 80.935126] psci: CPU4 killed (polled 10 ms)
    [ 80.938110] psci: CPU5 killed (polled 0 ms)
    [ 80.939568] PM: after syscore_suspend, error=0
    [ 80.939568] PM: after pm_wakeup_pending
    [ 80.939568] PM: before suspend_ops->enter
    [ 80.939568] psci: enter psci_system_suspend_enter
    [ 80.939568] psci: enter psci_system_suspend, pa_cpu_resume=312F490
    [ 80.939568] PM: after suspend_ops->enter
    [ 80.939599] Enabling non-boot CPUs ...
So… PSCI is correctly initialized, but invoke_psci_fn(PSCI_FN_NATIVE(1_0, SYSTEM_SUSPEND), pa_cpu_resume, 0, 0); bounces back immediately, which indicates an error.
I suspect PSCI is not properly configured in the device tree.
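To see what the firmware actually answered, the raw return value of the call can be decoded. A small userspace sketch: the numeric codes are the ones from include/uapi/linux/psci.h, and the errno mapping mirrors psci_to_linux_errno() in drivers/firmware/psci/psci.c as far as I can tell. The helper names psci_ret_name and model_psci_to_errno are made up here.

```c
#include <errno.h>

/* PSCI return codes, values as in include/uapi/linux/psci.h. */
enum {
    PSCI_RET_SUCCESS          = 0,
    PSCI_RET_NOT_SUPPORTED    = -1,
    PSCI_RET_INVALID_PARAMS   = -2,
    PSCI_RET_DENIED           = -3,
    PSCI_RET_ALREADY_ON       = -4,
    PSCI_RET_ON_PENDING       = -5,
    PSCI_RET_INTERNAL_FAILURE = -6,
    PSCI_RET_NOT_PRESENT      = -7,
    PSCI_RET_DISABLED         = -8,
    PSCI_RET_INVALID_ADDRESS  = -9,
};

/* Hypothetical helper: readable name for a raw PSCI return value. */
static const char *psci_ret_name(int ret)
{
    switch (ret) {
    case PSCI_RET_SUCCESS:          return "SUCCESS";
    case PSCI_RET_NOT_SUPPORTED:    return "NOT_SUPPORTED";
    case PSCI_RET_INVALID_PARAMS:   return "INVALID_PARAMS";
    case PSCI_RET_DENIED:           return "DENIED";
    case PSCI_RET_ALREADY_ON:       return "ALREADY_ON";
    case PSCI_RET_ON_PENDING:       return "ON_PENDING";
    case PSCI_RET_INTERNAL_FAILURE: return "INTERNAL_FAILURE";
    case PSCI_RET_NOT_PRESENT:      return "NOT_PRESENT";
    case PSCI_RET_DISABLED:         return "DISABLED";
    case PSCI_RET_INVALID_ADDRESS:  return "INVALID_ADDRESS";
    default:                        return "UNKNOWN";
    }
}

/* Hypothetical helper: the errno the kernel would hand back to the caller. */
static int model_psci_to_errno(int ret)
{
    switch (ret) {
    case PSCI_RET_SUCCESS:
        return 0;
    case PSCI_RET_NOT_SUPPORTED:
        return -EOPNOTSUPP;
    case PSCI_RET_INVALID_PARAMS:
    case PSCI_RET_INVALID_ADDRESS:
        return -EINVAL;
    case PSCI_RET_DENIED:
        return -EPERM;
    default:
        return -EINVAL; /* everything else collapses to -EINVAL */
    }
}
```

For example, the psci_system_reset2: ret = -1 line in the boot log decodes to NOT_SUPPORTED. Printing the raw value returned by the SYSTEM_SUSPEND call the same way would tell us which of these the firmware is answering with.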
This is the original patch: [PATCH 0/2] PSCI: system suspend support
psci_system_suspend_enter -> cpu_suspend -> psci_system_suspend
arch/arm64/kernel/suspend.c, L119:

    if (__cpu_suspend_enter(&state)) { // state is 0 for system suspend
        /* Call the suspend finisher */
        ret = fn(arg); // but PSCI doesn't like it...
        /*
         * Never gets here, unless the suspend finisher fails.
         * Successful cpu_suspend() should return from cpu_resume(),
         * returning through this code path is considered an error
         * If the return value is set to 0 force ret = -EOPNOTSUPP
         * to make sure a proper error condition is propagated
         */
        if (!ret)
            ret = -EOPNOTSUPP;
    } else {
        RCU_NONIDLE(__cpu_suspend_exit());
    }
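Put differently: if the finisher comes back at all, cpu_suspend() reports a failure. A userspace sketch of just that rule, assuming nothing about the firmware; model_cpu_suspend and the two finishers are hypothetical, and the real code saves and restores CPU context around the call.

```c
#include <errno.h>

typedef int (*finisher_fn)(unsigned long arg);

/* Models only the error handling of cpu_suspend() once the finisher returns. */
static int model_cpu_suspend(unsigned long arg, finisher_fn fn)
{
    int ret = fn(arg);

    /* Getting here at all means the suspend did not happen. */
    if (!ret)
        ret = -EOPNOTSUPP;
    return ret;
}

/* Hypothetical finisher: the firmware rejects the request outright. */
static int finisher_rejected(unsigned long arg)
{
    (void)arg;
    return -EINVAL;
}

/* Hypothetical finisher: the firmware claims success but never suspends. */
static int finisher_noop(unsigned long arg)
{
    (void)arg;
    return 0;
}
```

Either finisher ends in a negative error, so suspend_ops->enter fails and the kernel goes straight to re-enabling the non-boot CPUs, which matches the log.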
Possibly related: 0012-add-suspend-to-rk3399-PBP.patch · master · manjaro-arm / packages / core / linux-pinebookpro · GitLab
The patch looks promising. It uses a rockchip-specific way of talking to the trusted firmware (ATF).
It cannot be applied directly to DevTerm, because it uses ACPI pm_power_off_prepare. (That hook will be removed soon anyway: [v7,20/20] reboot: Remove pm_power_off_prepare() - Patchwork)
If we want to do it quick and dirty, we can just plug it into psci.c without even touching the device tree ((
If clockworkpi cares enough they should be able to pick it up and polish…