[Letux-kernel] omap-gpmc gta04 regression 4.7-rc1
Roger Quadros
rogerq at ti.com
Wed Jun 8 11:31:59 CEST 2016
Marek,
On 07/06/16 19:54, Grygorii Strashko wrote:
> On 06/07/2016 07:22 PM, Linus Walleij wrote:
>> On Tue, Jun 7, 2016 at 5:29 PM, Grygorii Strashko
>> <grygorii.strashko at ti.com> wrote:
>>> On 06/07/2016 04:33 PM, Linus Walleij wrote:
>>>> On Tue, Jun 7, 2016 at 9:50 AM, Roger Quadros <rogerq at ti.com> wrote:
>>>>
>>>>> Linus, any idea why of_gpiochip_find_and_xlate() would do a NULL pointer dereference?
>>>>
>>>> Nope. I see this uses the array fetch function which I did patch around but
>>>> "should" not affect this.
>>>>
>>>>> We have a case here where the GPMC driver registers a gpiochip but after a while
>>>>> unregisters it due to some other orthogonal resource not being available.
>>>>> Is this registering and unregistering considered acceptable from gpiochip point of view?
>>>>
>>>> That would be if it doesn't really go away aftert gpiochip_remove() because
>>>> some refcount doesn't go zero, so double-check that. It should be fine
>>>> anyway though ... just the instance floating around somewhere.
>>>>
>>>
>>> Linus, I've tried to find place in gpiolib.c where gpiochip is removed
>>> from gpio_devices list as part of gpiochip_remove(), but I can't. Am I missed smth.?
>>
>> It is supposed to happen in gpiodevice_release() as an effect of the
>> references to the kobject in the device going to zero when the last
>> put_device() is called in gpiochip_remove().
>>
>> Unless there is already some other user that has taken a descriptor,
>> such as a kernel driver or userspace.
>
> Ah. Thanks for the info.
>
>>
>>> If that is the case then such kind of crash is possible, because gpiochip_find()
>>> uses this list.
>>
>> Hm I suspect it could be that the reference count is off, so the device
>> isn't removed from the list?
>>
>> But we should still numb the device somehow so that the crash doesn't
>> happen even if the device is still in the list.
>
> Seems there is a window for the race in
> void gpiochip_remove(struct gpio_chip *chip)
> {
> struct gpio_device *gdev = chip->gpiodev;
> struct gpio_desc *desc;
> unsigned long flags;
> unsigned i;
> bool requested = false;
>
> /* FIXME: should the legacy sysfs handling be moved to gpio_device? */
> gpiochip_sysfs_unregister(gdev);
> /* Numb the device, cancelling all outstanding operations */
> gdev->chip = NULL;
> ^^^^^ Once we've reached this point the gpiochip_find() will start crashing :(
>
> struct gpio_chip *gpiochip_find(void *data,
> {
> [...]
> spin_lock_irqsave(&gpio_lock, flags);
> list_for_each_entry(gdev, &gpio_devices, list)
> if (match(gdev->chip, data))
> ^^^^ chip is null, match = of_gpiochip_find_and_xlate()
> break;
>
>
>
> gpiochip_irqchip_remove(chip);
> acpi_gpiochip_remove(chip);
> gpiochip_remove_pin_ranges(chip);
> gpiochip_free_hogs(chip);
> [...]
>
> /*
> * The gpiochip side puts its use of the device to rest here:
> * if there are no userspace clients, the chardev and device will
> * be removed, else it will be dangling until the last user is
> * gone.
> */
> put_device(&gdev->dev);
> ^^^ if i understand right gpiodev/chip will be in gpio_devices list till this point
> [if not used]
>
>
>
>
Do these 2 patches fix the NULL pointer issue? They should be in by v4.7-rc3
http://git.kernel.org/cgit/linux/kernel/git/linusw/linux-gpio.git/commit/?h=fixes&id=11f33a6d15bfa397867ac0d7f3481b6dd683286f
http://git.kernel.org/cgit/linux/kernel/git/linusw/linux-gpio.git/commit/?h=fixes&id=f4833b8cc7edab57d3f3033e549111a546c2e02b
Then let's investigate why gpmc_cs_request is failing.
Linus & Grygorii, thanks for jumping in on the gpiochip issue. :)
cheers,
-roger
More information about the Letux-kernel
mailing list