[Letux-kernel] Pandora: XUDF and other issues
H. Nikolaus Schaller
hns at goldelico.com
Sat Feb 19 17:25:00 CET 2022
> On 18.02.2022 at 22:47, H. Nikolaus Schaller <hns at goldelico.com> wrote:
>> So let's check if reverting this commit makes a difference. If not the
>> bisect was on the wrong track (due to randomness effects).
> Hm. It was a false alarm.
> The file drivers/firmware/ti_sci.c is not even compiled
> since we don't have CONFIG_TI_SCI_PROTOCOL in the defconfig.
> So I'll repeat the bisect.
Well, I did repeat it,
with a result that is really difficult to explain:
this time, the first bad commit is the one I initially specified as bad (v5.3).
This means the XUDF problem was not reproducible any more
over ca. 15 installs of new kernel variants, rebooting, waiting,
running high-load for 30 seconds and then checking dmesg for signs
of XUDF issues.
After a while of experimentation I realized that I had made one
more change: I had enabled charging at 800 mA over USB because
the battery was only at ca. 12%. And there is plenty of time while
git bisect compiles a new kernel. So why not take the chance to
recharge the battery during this idle time?
This brought me to another experiment. I ran ./high-load >/dev/null &
and experimented with /sys/class/power_supply/twl4030_usb/input_current_limit.
Indeed, increasing it to 800000 made the XUDF issues go away and
500000 made them appear occasionally. The backlight going off also
made them go away, and pressing a key turned the backlight on and
XUDF was back.
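
The experiment above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes that the affected kernel log lines contain the string "XUDF", that ./high-load is in the current directory and can be stopped with kill, and that the script runs as root. The helper names count_xudf and run_trial are made up for illustration.

```shell
#!/bin/sh
# Sketch: count new XUDF log lines at different USB input current limits.
# Assumption: problem lines in dmesg contain the string "XUDF".

LIMIT_FILE=/sys/class/power_supply/twl4030_usb/input_current_limit

count_xudf() {
    # Count lines on stdin that mention XUDF.
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    grep -c 'XUDF' || true
}

run_trial() {
    # $1 = input current limit in microamps (e.g. 500000 or 800000)
    echo "$1" > "$LIMIT_FILE"
    before=$(dmesg | count_xudf)
    ./high-load >/dev/null &
    load_pid=$!
    sleep 30
    # Assumption: high-load can simply be killed; ignore it if it already exited.
    kill "$load_pid" 2>/dev/null || true
    after=$(dmesg | count_xudf)
    echo "limit=$1 new_xudf_lines=$((after - before))"
}

# Only touch the hardware when explicitly asked: sh script.sh run
if [ "${1:-}" = "run" ]; then
    run_trial 500000
    run_trial 800000
fi
```

Backlight state and battery charge level are not controlled here, so they should be kept constant by hand between trials. The count can also be off if the kernel log buffer wraps during a trial.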
So this explains a lot: it is a hardware issue or limitation.
It seems to depend on total power consumption (i.e. backlight and
processor load, which can differ between kernel versions)
and battery charge level and how much the USB power supply can
Maybe batteries are 10 years old and getting weak? Although they
reach full capacity and seem to charge normally.
This may also explain why you first saw it when running Letux
kernels (which can have higher power demand) and why it still occurred
when booting the Pandora OS from NAND.
Could you please try to find out if it depends on USB charging, backlight
or similar effects on your unit?
So it seems to be nothing we can solve by bisecting for a kernel bug.
I just started to cross-check letux-5.17-rc4. At the moment it only shows
[ 330.002105] ti-soc-thermal 48002524.bandgap: eocz timed out waiting high
This does not appear to depend on input_current_limit or anything else.
But it also occurs in a high-load situation.
This looks like a "real bug" - which hopefully can be bisected more
BR and thanks,