[Letux-kernel] Pandora: XUDF and other issues

H. Nikolaus Schaller hns at goldelico.com
Sun Feb 20 14:12:40 CET 2022


Hi,


> On 20.02.2022 at 08:21, Grond <grond66 at riseup.net> wrote:
> 
> On Sat, Feb 19, 2022 at 05:25:00PM +0100, H. Nikolaus Schaller wrote:
>> Hi Grond,
>> 
>> 
>> Please can you try to find out if it depends on USB charging, backlight
>> or similar effects on your unit?
> I will try. But there are a few caveats with this. We appear to be
> experiencing two different sets of symptoms. For you, the bus timeout/
> XUDF issue occurs when power consumption goes above some threshold. On
> my unit it appears to be caused by something totally jamming the SCL
> line (it is completely pulled down starting at boot time, and never goes
> high). It is constant (100% of bus transactions fail) and happens
> reliably across kernel versions. Right now, I have the battery pulled
> for a few days in the probably vain hope that this changes anything, but
> when I'm done with that experiment, I'll give this a try.

Well, what I suspect is that the bq27xxx fuel gauge chip is running
outside of its specs in some cases. This may be more or less harmful on
some devices.

Looks like I have to think about this and study the schematics/data sheets...

It could also be that the bq27xxx on your unit is broken or wrongly
programmed. I remember there was a tool to write it - but that requires
i2c to work.

Ah, this brings me to another idea. Can you break into the U-Boot console
and use the i2c commands to check whether U-Boot can communicate on
this i2c bus?

> 
> What happens on your unit when the battery is removed and it is powered
> via the barrel jack? If this is related to the battery being unable to
> provide enough power this should presumably keep the symptoms from
> manifesting by running the whole system in constant voltage mode...

Haven't tried yet but will do asap. Most likely it will behave as with
a 100% charged battery. Expectations are good, but experiments are better :)

> 
>> 
>> So it seems to be nothing we can solve by bisecting for a kernel bug.
>> 
>> I just started to cross-check letux-5.17-rc4. At the moment it only shows
>> the
>> 
>> [  330.002105] ti-soc-thermal 48002524.bandgap: eocz timed out waiting high
>> 
>> This does not appear to depend on input_current_limit or anything else.
>> But it also occurs in a high-load situation.
>> 
>> This looks like a "real bug" - which hopefully can be bisected more
>> repeatable.

Yes, this was easily bisected and seems to be this issue:

# first bad commit: [514cbabb01422d501d533a6495b924e4c22d4822] thermal: ti-soc-thermal: Simplify polling with iopoll

What I suspect here is that the minimal waiting time mentioned in the commit
message may not be enough for the omap3530-600MHz models, although it seems
to suffice for all other OMAP variants.
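
To illustrate the suspicion, here is a minimal sketch of a poll-with-timeout
loop. It is not the driver code and not the kernel's iopoll implementation;
the struct, the function names and all timing values are hypothetical, and
the hardware is replaced by a simulated clock. It only shows how the same
timeout can succeed on a fast unit and fail on a slower one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the hardware: the EOCZ bit reads high
 * only once `ready_after_us` microseconds of simulated time passed. */
struct fake_sensor {
    unsigned int now_us;          /* simulated clock */
    unsigned int ready_after_us;  /* assumed conversion time */
};

static bool eocz_is_high(const struct fake_sensor *s)
{
    return s->now_us >= s->ready_after_us;
}

/* Check the bit every `sleep_us` until it is high or until
 * `timeout_us` has been waited. Returns 0 on success, -1 on timeout. */
static int poll_eocz_high(struct fake_sensor *s, unsigned int sleep_us,
                          unsigned int timeout_us)
{
    unsigned int waited = 0;

    assert(sleep_us > 0);  /* a zero step would never advance the clock */

    for (;;) {
        if (eocz_is_high(s))
            return 0;
        if (waited >= timeout_us)
            return -1;            /* corresponds to "timed out waiting high" */
        s->now_us += sleep_us;    /* simulated delay instead of a real sleep */
        waited += sleep_us;
    }
}
```

With a 1000 us timeout and a 100 us step, a simulated unit that needs 500 us
returns 0, while one that needs 1500 us returns -1. If the real timeout is
simply too tight for the omap3530, raising it should have the same effect
as the revert.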

Whatever the reason is, a git revert works here and silences the eocz messages.

I want to look into the omap3530/dm3730 TRM and propose a fix for upstream.

BR,
Nikolaus


