[Letux-kernel] X1600 / LX16 support - here: fw_getenv() issue

H. Nikolaus Schaller hns at goldelico.com
Tue Feb 6 19:24:34 CET 2024



> Am 06.02.2024 um 17:15 schrieb H. Nikolaus Schaller <hns at goldelico.com>:
> 
> 
> 
>> Am 06.02.2024 um 09:28 schrieb H. Nikolaus Schaller <hns at goldelico.com>:
>> 
>> So I added some more ll_printk() and it appears as if the _fw_argv processing is ok,
>> but someone overwrites some pointers of the _fw_envp so that scanning for env strings
>> fails.
>> 
>> The values overwriting the pointers look like ASCII characters "3=me" and "0@M2".
>> 
>> Next I'll check values at the beginning of the initialization, right after kernel_entry
>> in start_kernel. Maybe it is possible to "bisect" the code location where these pointers
>> are overwritten.
> 
> First analysis results indicate that the env string pointers are already damaged when
> start_kernel() is called.
> 
> Next, I plan to do comparable tests on the letux-5.10.y-lx16 kernel (which is a little
> difficult as we can't printk() so early in kernel startup).
> 
> One new hypothesis comes to my mind: the new kernel is bigger than the letux-5.10.y-lx16
> kernel. So loading the kernel may overwrite something which U-Boot assumes to be safe.

v6.8: 4.8 MB
   Data Size:    5024758 Bytes = 4.8 MiB
   Load Address: 80010000
   Entry Point:  80848e0c

v5.10: 4.2 MB
   Data Size:    4419937 Bytes = 4.2 MiB
   Load Address: 80010000
   Entry Point:  806d88b0

On the other hand, the env strings are stored at 0xa1fxxxxx addresses, which should not
overlap the kernel. But we only have 32 MB of RAM, so some address bits are "don't care"
anyway.
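
To make the aliasing argument concrete, here is a minimal Python sketch of the
address arithmetic. It assumes the standard MIPS32 fixed mapping, where KSEG0
(0x80000000) and KSEG1 (0xa0000000) both translate to physical = virtual & 0x1fffffff.
The wrap-around at 32 MB is only my model of the "don't care" address bits and is
not verified on the X1600. All function names are made up for illustration, and
0xa1f00000 merely stands in for the 0xa1fxxxxx range.

```python
KSEG_MASK = 0x1FFFFFFF        # KSEG0/KSEG1 -> physical (standard MIPS32 mapping)
RAM_SIZE = 32 * 1024 * 1024   # 32 MB of RAM; wrapping above this is an ASSUMPTION


def kseg_to_phys(vaddr):
    """Physical address of an unmapped KSEG0/KSEG1 virtual address."""
    if not 0x80000000 <= vaddr < 0xC0000000:
        raise ValueError("not a KSEG0/KSEG1 address: %#x" % vaddr)
    return vaddr & KSEG_MASK


def ram_offset(vaddr, ram_size=RAM_SIZE):
    """Offset into RAM, assuming higher address bits are simply ignored."""
    return kseg_to_phys(vaddr) % ram_size


def image_range(load_addr, data_size):
    """Half-open (start, end) RAM range covered by the loaded image data."""
    start = ram_offset(load_addr)
    return start, start + data_size


def overlaps(a, b):
    """True if two half-open ranges (start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


if __name__ == "__main__":
    v68 = image_range(0x80010000, 5024758)       # values from the v6.8 uImage
    env = (ram_offset(0xA1F00000), RAM_SIZE)     # stand-in for 0xa1fxxxxx
    print("v6.8 image: %#x..%#x" % v68)
    print("env area:   %#x..%#x" % env)
    print("overlap:", overlaps(v68, env))
```

Under these assumptions the image data itself stays far below the env strings.
Note, however, that the entry point 0x80848e0c lies beyond load address plus data
size, presumably because the uImage is compressed, so the footprint of the running
kernel is larger than the Data Size shown above.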

> 
> Another hypothesis is that it is compiler dependent. At least on my setup - but Paul
> has the same issue. I compile the 5.10 kernels with gcc 4.9.2 and the 6.8 kernel with
> gcc 6.3.0. That is not a big deal to test.

There is only a small difference in uImage size and the entry point moves a little.
But no functional difference. At least with 5.10.y. 6.8 requires at least gcc 5.1.0.

What I can't exclude so far is that the bug is also in letux-5.10.y-lx16 but does
not make the kernel boot hang, because the "damaging" of the environment variables
has a different result.

So maybe I should try to backport the ll_printk stuff to the 5.10.y kernel and try
to see what it is doing differently at start_kernel().
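
For comparing the dumps from both kernels, a small host-side helper could flag
which 32-bit words still look like pointers and which look like ASCII text. This is
only a sketch: it assumes the dumped words are little-endian, that valid env pointers
lie in KSEG0/KSEG1, and the function names are hypothetical.

```python
import struct


def looks_like_kseg_pointer(word):
    """True if the word falls into the KSEG0/KSEG1 range 0x80000000..0xbfffffff."""
    return 0x80000000 <= word < 0xC0000000


def looks_like_ascii(word):
    """True if all four bytes of the word are printable ASCII."""
    return all(0x20 <= b <= 0x7E for b in struct.pack("<I", word))


def classify(word):
    """Label a dumped 32-bit word (little-endian byte order is an ASSUMPTION)."""
    if looks_like_kseg_pointer(word):
        return "pointer"
    if looks_like_ascii(word):
        return "ascii:" + struct.pack("<I", word).decode("ascii")
    return "other"


if __name__ == "__main__":
    # Example words: one plausible pointer, one word built from the text "3=me".
    for w in (0xA1F00010, struct.unpack("<I", b"3=me")[0], 0x00000000):
        print("%#010x -> %s" % (w, classify(w)))
```

Printable ASCII bytes are all below 0x80, so a word made of text can never be
mistaken for a KSEG0/KSEG1 pointer by this check.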

Any better ideas/suggestions how to identify this issue?

BR,
Nikolaus



