I recently got my DevTerm A06 and, after assembling it, started looking around. What I’ve noticed on both the original 5.10 kernel and the 5.17 kernel used by Manjaro is that the rk808 pmic_int_l line seems to be stuck in the low state. This generates a flood of IRQs which are basically no-ops, since the rk808 interrupt status register does not indicate that any interrupt was raised. From what I can see in the rk808 IO_POL_REG, the interrupt is configured as active low, which matches the description in the device tree. The A06 Core Mainboard V3.14 schematic also confirms that GPIO1_C5 should be used as the interrupt line for rk808. So the software side seems to be correct WRT this interrupt.
Is the schematic correct in this case? Maybe the final device uses a different GPIO pin to service this IRQ, or maybe the INT line from rk808 is not actually connected to anything?
I’m also wondering about this. Their DT implementation may not be 100% correct. During boot the kernel complains about pin conflicts, and then there’s IRQ #98 that “nobody cares”.
Also if you are correct about this, it could be really good news for power management
I think the pin conflict problem is a simple mistake. The kernel complains about gpio4 RK_PA0. This GPIO pin is referenced from i2s0_8ch_bus in rk3399.dtsi, which is in turn referenced from the i2s0 node. A06 wants to use gpio4 PA0 for i2s_8ch_mclk_pin, so the A06 DTS overrides i2s0_8ch_bus from rk3399.dtsi. It also tries to override the i2s0 node, probably to make sure it uses the overridden i2s0_8ch_bus. Unfortunately, the overridden version of i2s0 uses i2s0_2ch_bus instead of i2s0_8ch_bus as pinctrl-0. This somehow results in the final dtb still referencing gpio4 RK_PA0 from two different nodes, leading to a conflict. It’s not fatal, and I think it can be easily solved by the following patch.
@@ -742,7 +742,7 @@
 	rockchip,playback-channels = <8>;
 	rockchip,capture-channels = <8>;
-	pinctrl-0 = <&i2s0_2ch_bus>;
+	pinctrl-0 = <&i2s0_8ch_bus>;
 	status = "okay";
This is at least what I did in my local tree where I no longer see the message about the conflict.
Maybe this hibernation resume error log is relevant to your observation on rk808:
[ 2.009815] PM: hibernation: Basic memory bitmaps created
[ 2.141058] PM: Using 3 thread(s) for decompression
[ 2.141077] PM: Loading and decompressing image data (286471 pages)...
[ 2.141084] hibernate: Hibernated on CPU 0 [mpidr:0x0]
[ 3.562412] PM: Image loading progress: 0%
[ 5.604308] PM: Image loading progress: 10%
[ 6.990064] PM: Image loading progress: 20%
[ 9.713276] PM: Image loading progress: 30%
[ 11.235375] PM: Image loading progress: 40%
[ 12.762288] PM: Image loading progress: 50%
[ 14.191396] PM: Image loading progress: 60%
[ 15.595667] PM: Image loading progress: 70%
[ 17.035060] PM: Image loading progress: 80%
[ 18.496474] PM: Image loading progress: 90%
[ 19.579049] PM: Image loading progress: 100%
[ 19.579217] PM: Image loading done
[ 19.579229] PM: hibernation: Read 1145884 kbytes in 17.43 seconds (65.74 MB/s)
[ 19.585161] PM: Image successfully loaded
[ 19.585193] printk: Suspending console(s) (use no_console_suspend to debug)
[ 20.967989] rk3x-i2c ff3c0000.i2c: timeout, ipd: 0x1b, state: 2
[ 20.968044] rk808 0-001b: Failed to read IRQ status: -110
[ 21.007718] Disabling non-boot CPUs ...
[ 21.007732] Wakeup pending. Abort CPU freeze
[ 21.007739] Non-boot CPUs are not disabled
[ 21.034016] mmc_host mmc0: Bus speed (slot 0) = 400000Hz (slot req 400000Hz, actual 400000HZ div = 0)
[ 21.179275] mmc_host mmc0: Bus speed (slot 0) = 148500000Hz (slot req 150000000Hz, actual 148500000HZ div = 0)
[ 21.339876] dwmmc_rockchip fe310000.mmc: Successfully tuned phase to 160
[ 21.554816] PM: hibernation: Failed to load image, recovering.
[ 21.711442] PM: hibernation: Basic memory bitmaps freed
[ 21.711456] OOM killer enabled.
[ 21.711458] Restarting tasks ... done.
[ 21.711934] PM: hibernation: resume failed (-16)
So it took about 17 seconds to load the image and prepare, and then rk808 poked the cores.
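The figures in that log are at least self-consistent, which a quick calculation shows. This is only a sanity check and assumes the usual 4 KiB page size, which the log itself does not state:

```python
PAGE_KIB = 4       # assumption: 4 KiB pages (not stated in the log)

pages = 286471     # "Loading and decompressing image data (286471 pages)"
seconds = 17.43    # "Read 1145884 kbytes in 17.43 seconds"

image_kib = pages * PAGE_KIB
rate_mb_s = image_kib / seconds / 1000

print(image_kib)             # 1145884, matches the "kbytes" figure in the log
print(round(rate_mb_s, 2))   # 65.74, matches the reported MB/s
```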
Haven’t looked into hibernation/s2ram yet, so I can’t comment much on that. What I know is that the kernel stops being interested in the rk808 IRQ after 100k attempts to service it. When that happens it prints the “nobody cared” message and a stack trace. I guess after that it just masks the IRQ and everything keeps running.
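For anyone wondering where the 100k comes from: as far as I understand it, the kernel counts interrupts per line in windows of 100,000 and disables the line if nearly all of them in a window went unhandled. Below is a toy Python model of that accounting. It is not kernel code. The class and function names are made up for illustration, the 99,900 threshold is from my reading of kernel/irq/spurious.c, and the time-based decay the real code applies is left out:

```python
class IrqDesc:
    """Toy model of the per-IRQ counters (hypothetical names)."""
    WINDOW = 100_000     # interrupts per accounting window
    BAD_LIMIT = 99_900   # assumption: more unhandled than this disables the line

    def __init__(self):
        self.irq_count = 0
        self.irqs_unhandled = 0
        self.disabled = False

    def note_interrupt(self, handled):
        if self.disabled:
            return
        if not handled:
            self.irqs_unhandled += 1
        self.irq_count += 1
        if self.irq_count < self.WINDOW:
            return
        # End of window: decide, then start a fresh window.
        if self.irqs_unhandled > self.BAD_LIMIT:
            self.disabled = True   # this is where "nobody cared" would be printed
        self.irq_count = 0
        self.irqs_unhandled = 0


def interrupts_until_disabled(desc, handled=False, limit=1_000_000):
    """Feed identical interrupts until the line is disabled; None if it never is."""
    for n in range(1, limit + 1):
        desc.note_interrupt(handled)
        if desc.disabled:
            return n
    return None
```

With every interrupt unhandled, which is what the rk808 line looks like here, the model disables the line on interrupt number 100,000. If a handler claims them, it never does.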
The only IRQ that rk808 is responsible for according to /proc/interrupts is the RTC alarm. I guess this means you could not instruct the device to come out of suspend after a specific time has passed. What I think is probably also broken by this is the hwclock command. For example, if I try running hwclock -rv it always fails while waiting for a clock tick. I guess it just waits for some IRQ from the RTC that never arrives?
Short version: I don’t think it’s crucial to have this IRQ working, but it would be really nice to understand why it’s broken. If it’s just the wrong GPIO specified in the DTB and schematics, the fix should be easy. If there is some HW problem, I guess we could just tell the kernel not to enable interrupts for rk808. This should save some CPU cycles during early device startup and also get rid of the ugly message in dmesg.
One way to isolate the problem is to probe the rk808 interrupt leg for its behavior…
Not sure if there’s a way to access the other end, gpio1 c5 (meh if there is, problem solved already, a multimeter would do).
emm did I just hide the reply from @tworaz on my side? read the first line, clicking around, and it disappeared
@yatli Nope, I got a message from Discourse about my reply being hidden by the antispam filter. Apparently it’s awaiting moderator approval.
I’ve made a few more experiments.
- The hwclock command works as expected if executed during the initial rk808 IRQ flood. It stops working after the “nobody cared” message, indicating that hwclock indeed requires a working RTC alarm IRQ.
- Switching the rk808 IRQ to active-high and adjusting the GPIO1_C5 configuration accordingly in the dts stops the IRQ flood. This shows GPIO1_C5 stays stuck in the low state in that setup too, and that it’s not rk808 itself that issues bogus IRQ requests.
- If I schedule an RTC alarm to fire in, let’s say, 5 seconds, GPIO1_C5 never changes state (when configured as active-high). If I read rk808 INT_STS_REG1 after the alarm timeout has passed, the register does indicate that the RTC alarm interrupt is pending. Since this IRQ is not masked in INT_MSK_REG1, rk808 should change the state of its INT line. My guess is it does, but the INT line is not connected to GPIO1_C5 and the SoC never sees the interrupt.
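The check in the last bullet (pending in INT_STS_REG1 and not masked in INT_MSK_REG1) can be sketched as a small decoder. The bit positions are from my reading of the rk808 header in the kernel tree, and the "1 = masked" convention is also an assumption, so verify both against the datasheet. The function name is made up for illustration:

```python
# Assumed bit layout of rk808 INT_STS_REG1 / INT_MSK_REG1 (verify against the datasheet).
RK808_INT1_BITS = {
    0: "VOUT_LO",
    1: "VB_LO",
    2: "PWRON",
    3: "PWRON_LP",
    4: "HOTDIE",
    5: "RTC_ALARM",
    6: "RTC_PERIOD",
}


def pending_unmasked(status, mask):
    """Names of interrupts that are pending and not masked (assumes 1 = masked)."""
    active = status & ~mask & 0xFF
    return [name for bit, name in sorted(RK808_INT1_BITS.items())
            if active & (1 << bit)]


# Example: only the RTC alarm bit set, nothing masked.
print(pending_unmasked(0x20, 0x00))   # ['RTC_ALARM']
```

Any non-empty result means rk808 should be asserting its INT line. The two register values come from however you read them off the PMIC.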
there’s IRQ #98 that “nobody cares”.
cd arch/arm64 && rg 'interrupt.*98' gives me the uart1 interrupt. uart1 is not enabled in the device tree. So maybe that’s a false alarm on my part. rk808 should have irq 21 (I’m not super familiar with devicetree, pls educate me if otherwise). How do you observe the irq flood? I don’t see anything in dmesg about it.
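On the 21: with the usual Rockchip pin naming, that is the pin offset inside the gpio1 bank (groups A to D with 8 pins each, so C5 is 2*8+5). It is not the Linux IRQ number, which is assigned at runtime and need not match anything in the device tree. A minimal sketch of the conversion, with a made-up helper name:

```python
import re


def parse_rockchip_pin(name):
    """Convert a name like 'GPIO1_C5' into (bank, offset within the bank)."""
    m = re.fullmatch(r"GPIO(\d)_([A-D])([0-7])", name.upper())
    if not m:
        raise ValueError(f"not a Rockchip-style pin name: {name!r}")
    bank = int(m.group(1))
    group = ord(m.group(2)) - ord("A")   # A=0, B=1, C=2, D=3
    return bank, group * 8 + int(m.group(3))


print(parse_rockchip_pin("GPIO1_C5"))   # (1, 21)
```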
OOh @tworaz you know what?
“PMU Interrupt GPIO1_C5 is relocated here”
Coincidence or what?
On the DevTerm schematics though, GPIO3_B2 is connected to DDR2 pin 5, marked “GPIO0”, then to PMU-SDA (AXP228 not rk808)
||I2C8_SCL (PI-2 bus)
So both PBP and RkP64 “moved” from GPIO1_C5 to GPIO3_B2, but that is occupied on DT by axp228
The devterm device tree also enables sdio0_bus4 that configures RK_PC5.
The devterm device tree configures RK_PC5 to be pull-up, so if it’s not connected it should at least stay high.
@m4xm4n has a better idea about that, as he wrote the dts for mainline.
It is good that the whole community is helping with reviewing his work.
@yatli I’ve seen both the pine64 schematics and the patches changing the relevant DTS files. I suspect something similar needs to be done here, but I have no clue which GPIO pin should be used instead.
As for the IRQ flood, just check /proc/interrupts. It should show that IRQ #98 is actually the GPIO1_C5 (21) pin, which is connected to rk808. Right after you boot the device, the value in the first column should be increasing really fast, which means this IRQ is triggered almost constantly. The flood stops after 100K attempts at servicing this interrupt.
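If you would rather have a number than watch the counter, here is a rough Python sketch that samples /proc/interrupts twice and reports the rate. The function names are made up, and it assumes the usual layout (a header row of CPU columns, then one row per IRQ with that many counters):

```python
import time


def irq_total(proc_text, irq):
    """Sum the per-CPU counters for one IRQ in /proc/interrupts text."""
    lines = proc_text.splitlines()
    ncpu = len(lines[0].split())   # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if sep and label.strip() == str(irq):
            return sum(int(x) for x in rest.split()[:ncpu])
    raise KeyError(f"IRQ {irq} not found")


def irq_rate(irq, interval=1.0, path="/proc/interrupts"):
    """Interrupts per second for one IRQ, measured over `interval` seconds."""
    with open(path) as f:
        before = irq_total(f.read(), irq)
    time.sleep(interval)
    with open(path) as f:
        after = irq_total(f.read(), irq)
    return (after - before) / interval
```

Calling irq_rate(98) shortly after boot should show a large value during the flood and drop to zero once the kernel has given up on the line. The IRQ number may differ on your kernel, so check the rk808 row first.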
Thanks @tworaz! I’ll see if setting it to high interrupt will allow s2ram/s2disk.
Some suspend drama here: Getting suspend to work properly on A06 - #12 by yatli
@spikerguy the PMU stuff is all unmodified from the original ClockworkPi patches and I haven’t touched it yet for mainline. it’s sort of related to the rk808 patches I’m trying to fix for 5.18. The interrupt stuff is definitely incorrect (@veritazz flagged that a while back when we were working on the initial Manjaro/mainline work).
The pin conflict w/ i2s0 I believe is my mistake though (at least on the Manjaro/mainline side)
@yatli sounds like you’re onto something w/ this though, so I will happily keep an eye on this space (I’m very slowly working on cleaning up and upstreaming all of this stuff)
@m4xm4n What problems are you facing with 5.18? I’m using 5.18.1 on my A06 right now and besides IRQ problems described here (which also happen on 5.17 and 5.10) everything else seems to be fine.
@tworaz other than having to tweak the rk808 patches a bit to apply, it no longer seems to shut off the power LED on shutdown (doesn’t seem to be related to the pm_poweroff_prepare stuff, since that just seems to shutdown LDO4)
Where are you running 5.18.1? (Via armbian? Manjaro? Arch? self-compiled?)
@m4xm4n wrt the power LED…
Took mine apart and found R116 is NC [doge]
If that’s the “common practice” (who knows, maybe each a06 is different ), then power LED not turning off = 3v3 still active
Both the system and kernel I’m using are self-compiled. For the kernel part I just took your 5.17.9 Manjaro patches, applied them on top of the upstream stable 5.17 tree, tweaked them a bit to my liking, and then at some point rebased my branch on top of 5.18. I don’t remember running into any conflicts during the 5.18 rebase.
WRT the power LED, I have a bit of experience with that. Make sure the axp228 power_off function is run after the custom pm_power_off_prepare code the Clockwork guys have added to the rk808 driver. The original patch is a bit of a hack. Apparently, to properly shut down DevTerm you have to first turn off rk808 LDO4, which AFAIU cuts power to sdmmc. If you leave it turned on, the device will shut down but the green LED will remain active. If you only disable rk808 LDO4 and skip the axp228 shutdown procedure, the LED will also remain active, but it’ll be a bit brighter.
IMO, to upstream those changes, use of the axp228 pm_power_off procedure should be controlled via a DT property like system-power-controller. If such a property is present in the axp dts node, then it should unconditionally overwrite the pm_power_off global variable. On DevTerm pm_power_off is currently always non-NULL due to PSCI providing this shutdown hook. This is why the “pm_power_off = 0” bit is present in the power patch AFAIU.
I currently have no idea how to best cut power to rk808 LDO4 in a way that is upstream friendly.
Do you guys have success with the miniusb uart port? I’m going to need it to debug suspend. I have tried a few options to no avail:
- Enable uart2 in devicetree
chosen line with baudrate 115200n
- Update /boot/extlinux/extlinux.conf to enable console at 115200,n
- Compile uboot and bring the baudrate down to 115200,n
- Update uboot CONFIG_BAUDRATE to be 115200
- Update ch340 Windows driver, tried two versions (202201 and 202203 ones)
- Play with ch340 settings in Device Manager: turning off FIFO? Adjusting FIFO size?
- 8N1? 8N2? Software flow control?
All I can get is partial text. Works for a while after I unplug-plug it to the computer.
Edit: … after all the attempts, tried another usb cable, problem solved
@m4xm4n @tworaz any ideas on: Getting suspend to work properly on A06 - #18 by yatli ?
I’m trying to use PWR_KEY_L as a wake up source, but it always reads 1.