[Letux-kernel] Pandora: XUDF and other issues

H. Nikolaus Schaller hns at goldelico.com
Fri Feb 18 19:59:18 CET 2022


Hi,


> On 18.02.2022 at 11:52, H. Nikolaus Schaller <hns at goldelico.com> wrote:
> 
> Hi,
> this topic isn't forgotten but really difficult to solve.
> 
> ...
> 
> To summarize: there are still pieces missing to be able to run the final bisect
> and find out how XUDF depends on some kernel patches.
> 
> And today I once got the XUDF messages when running 5.1. Something
> I had never observed before. So it may be that the symptom has some random
> component which usually makes the complete bisect fail. And a random
> influence is either some uninitialized variable or some hardware effect?
> 

Thanks to help from Notaz I could improve my setup so that the Pandora
can reboot from the SD card with LetuxOS unattended. With this, and after
some manual bisecting to reduce the search range to v5.1..v5.3, I was finally
able to automatically bisect the kernel to a commit after which the
likelihood of XUDF errors increases to 100%:

0b88bc9292515550b4f7b30ac7d7ecd80f5bc1aa is the first bad commit
commit 0b88bc9292515550b4f7b30ac7d7ecd80f5bc1aa
Author: Stephen Boyd <sboyd at kernel.org>
Date:   Mon Jun 24 18:06:15 2019 -0700

    clk: ti: Use int to check return value from of_property_count_elems_of_size()
    
    This function can return a negative number when it fails, but res->sets
    is at most a u16 which can't hold that negative number. Let's store the
    result into an int, ret, and then assign that to res->sets when it works
    to avoid this logical impossibility.
    
    Signed-off-by: Stephen Boyd <sboyd at kernel.org>

 drivers/firmware/ti_sci.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

It has something to do with TI chips and clocks. But what I don't know is what
ti_sci is and what firmware it relates to.

What does this mean? This patch is also present on other platforms, e.g. the
GTA04 or the Pyra.

So, not having analysed anything yet (not even whether a revert solves the
issue; it may be a false positive), I would assume that the change in logic
either makes the function return in a case where it did not return before,

or makes it return a different value for some case which is special for the
Pandora with its omap3530. It could be some error path that now runs differently.
It could be something related to the setup of the I2C clocks of the omap35xx (only).
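
To illustrate the kind of logic change the commit message describes, here is a
minimal sketch. It is not the actual ti_sci.c code: count_elems(), old_pattern()
and new_pattern() are hypothetical names, and the failing lookup is simulated.
It only shows how a negative error code stored in a 16-bit unsigned field can no
longer be detected by a "< 0" check, while checking an int first does take the
error path.

```c
#include <stdint.h>

#define DEMO_EINVAL 22 /* illustrative error number, same value as Linux EINVAL */

/* Hypothetical stand-in for of_property_count_elems_of_size():
 * returns an element count, or a negative error code on failure. */
static int count_elems(int fail)
{
    return fail ? -DEMO_EINVAL : 4;
}

/* Pattern before the commit: the result goes straight into a u16.
 * Returns 0 on success, a negative error code if the error was noticed. */
static int old_pattern(int fail, uint16_t *sets)
{
    *sets = (uint16_t)count_elems(fail); /* -22 wraps to 65514 */
    int seen = *sets;                    /* value a later "< 0" check sees */
    return seen < 0 ? seen : 0;          /* never negative: error is missed */
}

/* Pattern after the commit: check an int first, then assign. */
static int new_pattern(int fail, uint16_t *sets)
{
    int ret = count_elems(fail);
    if (ret < 0)
        return ret;                      /* error path is now taken */
    *sets = (uint16_t)ret;
    return 0;
}
```

With a simulated failure, old_pattern() reports success and leaves 65514 in
*sets, while new_pattern() returns -22 and leaves *sets untouched. So on a board
where the lookup fails, the commit would change which path runs.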

So let's check if reverting this commit makes a difference. If not, the
bisect was on the wrong track (due to randomness effects).
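
As a rough sketch of how randomness can derail a bisect: assume a good kernel
still shows XUDF with some probability q per boot, and that boots are
independent. Both are assumptions, nothing here is measured. A single-boot test
then marks a good kernel as bad with probability q, while requiring XUDF on k
boots in a row reduces this to q^k. The hypothetical helper below finds the
smallest such k for a chosen limit.

```c
/* Smallest number of consecutive XUDF boots to require before calling a
 * kernel "bad", so that a good kernel with per-boot XUDF probability q is
 * misjudged with probability at most max_false_bad.
 * Assumes independent boots. Returns -1 for arguments outside (0, 1). */
static int repeats_needed(double q, double max_false_bad)
{
    if (q <= 0.0 || q >= 1.0 || max_false_bad <= 0.0 || max_false_bad >= 1.0)
        return -1;

    double false_bad = 1.0; /* probability that all boots so far showed XUDF */
    int k = 0;
    while (false_bad > max_false_bad) {
        false_bad *= q;     /* one more boot that must also show XUDF */
        k++;
    }
    return k;
}
```

For example, repeats_needed(0.5, 0.01) gives 7. A kernel that always shows XUDF
still fails all k boots, so the repetition only costs test time.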

BR,
Nikolaus



