[Letux-kernel] omap-gpmc gta04 regression 4.7-rc1

Roger Quadros rogerq at ti.com
Wed Jun 8 11:31:59 CEST 2016


Marek,

On 07/06/16 19:54, Grygorii Strashko wrote:
> On 06/07/2016 07:22 PM, Linus Walleij wrote:
>> On Tue, Jun 7, 2016 at 5:29 PM, Grygorii Strashko
>> <grygorii.strashko at ti.com> wrote:
>>> On 06/07/2016 04:33 PM, Linus Walleij wrote:
>>>> On Tue, Jun 7, 2016 at 9:50 AM, Roger Quadros <rogerq at ti.com> wrote:
>>>>
>>>>> Linus, any idea why of_gpiochip_find_and_xlate() would do a NULL pointer dereference?
>>>>
>>>> Nope. I see this uses the array fetch function which I did patch around but
>>>> "should" not affect this.
>>>>
>>>>> We have a case here where the GPMC driver registers a gpiochip but after a while
>>>>> unregisters it due to some other orthogonal resource not being available.
>>>>> Is this registering and unregistering considered acceptable from gpiochip point of view?
>>>>
>>>> That would be if it doesn't really go away aftert gpiochip_remove() because
>>>> some refcount doesn't go zero, so double-check that. It should be fine
>>>> anyway though ... just the instance floating around somewhere.
>>>>
>>>
>>> Linus, I've tried to find place in gpiolib.c where gpiochip is removed
>>> from gpio_devices list as part of gpiochip_remove(), but I can't. Am I missed smth.?
>>
>> It is supposed to happen in  gpiodevice_release() as an effect of the
>> references to the kobject in the device going to zero when the last
>> put_device() is called in gpiochip_remove().
>>
>> Unless there is already some other user that has taken a descriptor,
>> such as a kernel driver or userspace.
> 
> Ah. Thanks for the info.
> 
>>
>>> If that is the case then such kind of crash is possible, because gpiochip_find()
>>> uses this list.
>>
>> Hm I suspect it could be that the reference count is off, so the device
>> isn't removed from the list?
>>
>> But we should still numb the device somehow so that the crash doesn't
>> happen even if the device is still in the list.
> 
> Seems there is a window for the race in
> void gpiochip_remove(struct gpio_chip *chip)
> {
> 	struct gpio_device *gdev = chip->gpiodev;
> 	struct gpio_desc *desc;
> 	unsigned long	flags;
> 	unsigned	i;
> 	bool		requested = false;
> 
> 	/* FIXME: should the legacy sysfs handling be moved to gpio_device? */
> 	gpiochip_sysfs_unregister(gdev);
> 	/* Numb the device, cancelling all outstanding operations */
> 	gdev->chip = NULL;
> ^^^^^ Once we've reached this point the gpiochip_find() will start crashing :(
> 
> 	struct gpio_chip *gpiochip_find(void *data,
> 	{
> 		[...]
> 		spin_lock_irqsave(&gpio_lock, flags);
> 		list_for_each_entry(gdev, &gpio_devices, list)
> 			if (match(gdev->chip, data))
> 					^^^^ chip is null, match = of_gpiochip_find_and_xlate()
> 				break;
> 
> 
> 
> 	gpiochip_irqchip_remove(chip);
> 	acpi_gpiochip_remove(chip);
> 	gpiochip_remove_pin_ranges(chip);
> 	gpiochip_free_hogs(chip);
> 	[...]
> 
> 	/*
> 	 * The gpiochip side puts its use of the device to rest here:
> 	 * if there are no userspace clients, the chardev and device will
> 	 * be removed, else it will be dangling until the last user is
> 	 * gone.
> 	 */
> 	put_device(&gdev->dev);
> ^^^ if i understand right gpiodev/chip will be in gpio_devices list till this point
>     [if not used]
> 
> 
> 
> 
Do these 2 patches fix the NULL pointer issue? They should be in by v4.7-rc3

http://git.kernel.org/cgit/linux/kernel/git/linusw/linux-gpio.git/commit/?h=fixes&id=11f33a6d15bfa397867ac0d7f3481b6dd683286f
http://git.kernel.org/cgit/linux/kernel/git/linusw/linux-gpio.git/commit/?h=fixes&id=f4833b8cc7edab57d3f3033e549111a546c2e02b

Then let's investigate why gpmc_cs_request is failing.

Linus & Grygorii, thanks for jumping in on the gpiochip issue. :)

cheers,
-roger


More information about the Letux-kernel mailing list