[Letux-kernel] [Gta04-owner] New LetuxOS Kernels

Tony Lindgren tony at atomide.com
Wed Jun 20 06:26:53 CEST 2018


* Andreas Kemnade <andreas at kemnade.info> [180619 21:35]:
> Hi,
> 
> On Tue, 19 Jun 2018 20:07:23 +0200
> "H. Nikolaus Schaller" <hns at goldelico.com> wrote:
> 
> > [technical discussions should go to the proper mailing lists]
> > 
> > Hi Andreas,
> > 
> > > Am 19.06.2018 um 19:46 schrieb H. Nikolaus Schaller <hns at goldelico.com>:
> > > 
> > > Hi,
> > >   
> > >> Am 19.06.2018 um 19:22 schrieb Andreas Kemnade <andreas at kemnade.info>:
> > >> 
> > >> On Tue, 19 Jun 2018 11:38:26 +0200
> > >> "H. Nikolaus Schaller" <hns at goldelico.com> wrote:
> > >>   
> > >>> Hi,
> > >>> we did hunt a while for a big bug in the kernel.
> > >>> It was a little flaky and random but the main
> > >>> symptom was a kernel NULL pointer panic in
> > >>> strcmp().
> > >>> 
> > >>> The result was that the GTA04 (and also the Pyra)
> > >>> did not continue to boot properly. Sometimes it
> > >>> came to a login: but then several drivers were
> > >>> missing.
> > >>> 
> > >>> After hunting down the bug it was not a device
> > >>> driver but a race and dangling pointer problem
> > >>> in the pincontrol subsystem.
> > >>> 
> > >>> Now we got some patches from the maintainer and
> > >>> the problem seems to have disappeared.
> > >>> 
> > > >>> Well, has it or hasn't it?
> > >> 
> > >> compiled dca26f608a765008b869991bf29fa241769599fb + my compile fix
> > >> 
> > >> result: strcmp problem again.
> > >> [    7.335571] [<c074abc0>] (strcmp) from [<c04266bc>] (pinctrl_generic_add_group+0x50/0xc4)
> > >> [    7.344116] [<c04266bc>] (pinctrl_generic_add_group) from [<c042b920>] (pcs_dt_node_to_map+0x484/0x820)
> > >> [    7.353912] [<c042b920>] (pcs_dt_node_to_map) from [<c04298f4>] (pinctrl_dt_to_map+0x220/0x2bc)
> > >> [    7.363006] [<c04298f4>] (pinctrl_dt_to_map) from [<c04269c4>] (create_pinctrl+0x5c/0x318)  
> > > 
> > > Hm...
> > > 
> > > I did a quick boot - and on first boot I also got a strcmp(NULL).
> > > From the SD card which I had used for extensive testing yesterday.
> > > 
> > > What the hell is going on here?  
> > 
> > Maybe it is still a bug to devm_kzalloc something and store in the radix
> > tree and leave it there, even if the driver is detached?

Or are you guys using an older version of the patches? A check that
refuses to add NULL-named entries was added. Not sure how you would
end up with NULL names though, unless some parts are still freed on
deferred probe. Care to try with the updated patches and add a
dump_stack() for NULL names?
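
As a rough userspace illustration of the kind of check meant above (this is
not the actual pinctrl code; add_group(), group_names and MAX_GROUPS are
made-up names for the sketch), a lookup-then-add helper could reject a NULL
name up front and report it. In the kernel, the reporting spot is where a
dump_stack() call would go:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_GROUPS 8	/* hypothetical table size, for illustration only */

/* Hypothetical stand-in for the group names tracked by the core. */
static const char *group_names[MAX_GROUPS];
static int num_groups;

/*
 * Look up a group by name and add it if it is missing.
 * Returns the group index, or a negative errno value on failure.
 */
static int add_group(const char *name)
{
	int i;

	if (!name) {
		/* Kernel code would call dump_stack() here to find the caller. */
		fprintf(stderr, "add_group: NULL name rejected\n");
		return -EINVAL;
	}

	for (i = 0; i < num_groups; i++) {
		/* Skip NULL stored names instead of handing them to strcmp(). */
		if (group_names[i] && !strcmp(group_names[i], name))
			return i;
	}

	if (num_groups >= MAX_GROUPS)
		return -ENOMEM;

	group_names[num_groups] = name;
	return num_groups++;
}
```

The sketch only guards against NULL pointers. It would not catch a stale
pointer to memory that was freed on deferred probe, which is the other
suspect in this thread.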

> > Then we still try to access this memory region by scanning the tree.
> > 
> > For test purposes we could replace the devm_kzalloc by kzalloc. This
> > would leak a little memory, but my hope is that the problem disappears.
> > 
> > Do you have a repeatable (at least >some%) scenario to reproduce the
> > bug?
> 
> Unfortunately not. Maybe we should pass init=/modprobe-mess.sh on the
> kernel command line and create a worst-case modprobe scenario there,
> so we can control the probing order more.

Funny how I have not seen these. Probably because I got rid of that
PID 1 software years ago.

Regards,

Tony

