[Letux-kernel] weird problem with pwm_bl on omap3

H. Nikolaus Schaller hns at goldelico.com
Tue May 29 19:29:38 CEST 2018


Hi,

> On 29.05.2018 at 18:36, H. Nikolaus Schaller <hns at goldelico.com> wrote:
> 
> Hi Andreas,
> 
>> On 29.05.2018 at 18:27, Andreas Kemnade <andreas at kemnade.info> wrote:
>> 
>> On Tue, 29 May 2018 17:41:53 +0200
>> "H. Nikolaus Schaller" <hns at goldelico.com> wrote:
>> 
>>> Hi Tony,
>>> 
>>>> On 17.05.2018 at 20:08, Tony Lindgren <tony at atomide.com> wrote:
>>>> 
>>>> * H. Nikolaus Schaller <hns at goldelico.com> [180517 12:17]:
>>>>> Hi Tony,
>>>>> we have been using the dmtimer/pwm for a long time on
>>>>> the GTA04 to drive the display panel backlight.
>>>>> 
>>>>> Starting a while ago (I am not sure when, but it may
>>>>> be 4.17-rc1), the device randomly fails to boot
>>>>> with a NULL pointer dereference in strcmp().
>>>>> Booting again usually runs fine.
>>>> 
>>>> Hmm maybe enable CONFIG_DEBUG_SLAB=y and POISON options
>>>> and see if that catches something. It might be some
>>>> array out-of-bounds type issue.
>>>> 
>>>> Regards,
>>>> 
>>>> Tony
>>> 
>>> 
>>> I have restarted hunting the issue but it is very
>>> elusive. Every time I try a different test method
>>> it disappears, and if I remove my printk statements or
>>> /etc/modprobe/blacklist.conf it comes back.
>>> 
>> Have you tried to load every module piece by piece before running udev,
>> by booting the kernel with an init=/something.sh script,
>> so that you have the load order under control?
> 
> Well, if I load modules manually everything is fine.
> The problem appears to be the concurrency of loading
> and deferred probing of modules.
> 
> If I block pwm_bl and omap3dss and some others and load
> them all manually, there is never a problem :(
> 
>> 
>> Another thing: Are our fb patches evil again?
> 
> It also occurs with omapdss blacklisted.
> 
>> 
>> The offmode patch? Maybe it uncovers some other bug.
> 
> Hm. I am not sure if I have included it. I am running
> mainline with just a handful of our recent patches (not
> everything).
> 
> But I can try it as soon as I find time (kernel compiling has
> got much slower in recent years so that I have new ideas
> faster than a test result :).
> 
>> 
>> Just to sort out some non-mainline stuff
>> 
>> I am compiling and will try to boot with my scripts this evening.
> 
> Yes, please. At least to check if you can observe the same issue.
> It seems to happen more often on a GTA04A5 than a GTA04A4.
> 
> And the strcmp(NULL, "backlight_pins_pinmux") occurred
> first in 4.17-rc1. But it might just be a more visible
> symptom and the bug may be older.

Latest boot log:

[    8.381225] pwm_backlight_probe
[    8.384613] pwm-backlight backlight: backlight supply power not found, using dummy regulator
[    8.503112] pwm_backlight_probe probe error -517
[    8.517517] wwan_on_off_init: wwan_on_off_init
[    8.549804] (NULL device *): hwmon: 'gta04-battery' is not a valid name attribute, please fix
[    8.565521] pps_core: LinuxPPS API ver. 1 registered
[    8.570709] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti at linux.it>
[    8.604522] pinctrl_get_group_selector: strcmp: (null) backlight_pins_pinmux
[    8.612243] iio_charge:-749
[    8.622192] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    8.641662] pgd = (ptrval)
[    8.644897] [00000000] *pgd=00000000
[    8.652252] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[    8.657897] Modules linked in: pps_core(+) encoder_opa362(+) wwan_on_off(+) snd_soc_gtm601 connector_analog_tv pwm_omap_dmtimer omapdss_base generic_adc_battery pwm_bl wlcore_sdio bmp280_spi bq27xxx_battery_hdq bq27xxx_battery omap_hdq omap2430 ov9655 v4l2_fwnode v4l2_common snd_soc_omap_mcbsp snd_soc_omap snd_pcm_dmaengine bmp280_i2c bmp280 videodev bmc150_magn_i2c tsc2007 bmc150_accel_i2c bmc150_magn at24 bmc150_accel_core leds_tca6507 industrialio_triggered_buffer media kfifo_buf phy_twl4030_usb gpio_twl4030 musb_hdrc twl4030_pwrbutton twl4030_vibra snd_soc_twl4030 twl4030_madc twl4030_charger industrialio gnss_w2sg0004 w2cbw003_bluetooth gnss ehci_omap
[    8.718811] CPU: 0 PID: 917 Comm: kworker/0:2 Not tainted 4.17.0-rc3-letux+ #2356
[    8.726623] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[    8.733184] Workqueue: events deferred_probe_work_func
[    8.738555] PC is at strcmp+0x0/0x34
[    8.742309] LR is at pinctrl_get_group_selector+0x6c/0xa8
[    8.747955] pc : [<c071a6e0>]    lr : [<c0436880>]    psr: 60000013
[    8.754516] sp : ee6a9e08  ip : 00000000  fp : 0000001a
[    8.759979] r10: 00000017  r9 : 0000001a  r8 : c075b144
[    8.765472] r7 : 00000000  r6 : ee3e4d80  r5 : ef7c4a20  r4 : 00000017
[    8.772308] r3 : 00000000  r2 : 00000002  r1 : ef7c4a20  r0 : 00000000
[    8.779144] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[    8.786621] Control: 10c5387d  Table: ad02c019  DAC: 00000051
[    8.792633] Process kworker/0:2 (pid: 917, stack limit = 0x(ptrval))
[    8.799285] Stack: (0xee6a9e08 to 0xee6aa000)
[    8.803833] 9e00:                   ee6a9e2c ee3e4d80 ef7c4a20 00000000 ed0cce10 00000000
[    8.812408] 9e20: ed071140 c0437848 ed0d8210 00000001 00000002 ed265d00 ed071180 ed0711c0
[    8.820953] 9e40: ed0cce10 00000000 00000000 ed071140 00000000 c0436080 00000014 ee26e4c0
[    8.829498] 9e60: c0b5a0b0 c0890054 ee2b8010 00000000 ed265d50 ee2b8010 c0bccf44 fffffdfb
[    8.838073] 9e80: bf1ac014 00000029 c0b95730 c0436248 00000000 ee2b8010 ed071290 c04c8cac
[    8.846618] 9ea0: ee2b8010 00000000 c0bccf48 c04acd3c 00000000 ee6a9ee8 c04ad048 ee2b8044
[    8.855194] 9ec0: ef7baf00 c0b03d00 00000000 c04ab59c ee020e6c ed146b38 ee2b8010 c0b64550
[    8.863769] 9ee0: 00000001 c04acbbc ee2b8010 00000001 00000000 ee2b8010 c0b64550 ee2b8010
[    8.872314] 9f00: c0b9f060 c04ac178 ee2b8010 c0b6433c c0b64358 c04ac688 c04ac59c ee6caa80
[    8.880859] 9f20: c0b64370 ef7b7c40 00000000 ef7baf00 c0b03d00 c01444f0 ee6caa80 c0b64370
[    8.889404] 9f40: ffff8e2b ee6caa80 ef7b7c40 ef7b7c40 ee6a8000 ef7b7c58 c0b03d00 ee6caa98
[    8.897979] 9f60: 00000008 c0144cc4 ee223280 ee374180 ee374240 00000000 ee6caa80 c0144a04
[    8.906555] 9f80: ee0bbef0 ee37419c 00000000 c0148cbc ee374240 c0148b88 00000000 00000000
[    8.915100] 9fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
[    8.923645] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    8.932220] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    8.940795] [<c071a6e0>] (strcmp) from [<c0436880>] (pinctrl_get_group_selector+0x6c/0xa8)
[    8.949462] [<c0436880>] (pinctrl_get_group_selector) from [<c0437848>] (pinmux_map_to_setting+0x158/0x1a0)
[    8.959655] [<c0437848>] (pinmux_map_to_setting) from [<c0436080>] (create_pinctrl+0x1f0/0x2f8)
[    8.968749] [<c0436080>] (create_pinctrl) from [<c0436248>] (devm_pinctrl_get+0x2c/0x6c)
[    8.977233] [<c0436248>] (devm_pinctrl_get) from [<c04c8cac>] (pinctrl_bind_pins+0x3c/0x138)
[    8.986083] [<c04c8cac>] (pinctrl_bind_pins) from [<c04acd3c>] (driver_probe_device+0xe8/0x318)
[    8.995178] [<c04acd3c>] (driver_probe_device) from [<c04ab59c>] (bus_for_each_drv+0x84/0x94)
[    9.004119] [<c04ab59c>] (bus_for_each_drv) from [<c04acbbc>] (__device_attach+0x88/0xfc)
[    9.012695] [<c04acbbc>] (__device_attach) from [<c04ac178>] (bus_probe_device+0x28/0x80)
[    9.021240] [<c04ac178>] (bus_probe_device) from [<c04ac688>] (deferred_probe_work_func+0xec/0x120)
[    9.030731] [<c04ac688>] (deferred_probe_work_func) from [<c01444f0>] (process_one_work+0x244/0x464)
[    9.040313] [<c01444f0>] (process_one_work) from [<c0144cc4>] (worker_thread+0x2c0/0x3ec)
[    9.048889] [<c0144cc4>] (worker_thread) from [<c0148cbc>] (kthread+0x134/0x150)
[    9.056610] [<c0148cbc>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[    9.064178] Exception stack(0xee6a9fb0 to 0xee6a9ff8)
[    9.069458] 9fa0:                                     00000000 00000000 00000000 00000000
[    9.078002] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    9.086578] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    9.093505] Code: e3520000 e5e32001 1afffffb e12fff1e (e4d03001)
[    9.563262] ---[ end trace 27838669a01b24aa ]---
[   11.037506] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[   11.078430] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[   11.100646] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[   11.119567] cfg80211: failed to load regulatory.db
[   22.575958] random: crng init done

This was with
CONFIG_HIBERNATION=y
CONFIG_DEBUG_KERNEL=y
CONFIG_PAGE_POISONING=y
CONFIG_PAGE_POISONING_NO_SANITY=y
CONFIG_DEBUG_SLAB=y

The GTA04A5 did finally boot to a login:

root@letux:~# lsmod
Module                  Size  Used by
bnep                   20480  2
bluetooth             282624  5 bnep
ecdh_generic           24576  1 bluetooth
usb_f_ecm              16384  1
g_ether                16384  0
usb_f_rndis            20480  2 g_ether
u_ether                20480  3 usb_f_ecm,g_ether,usb_f_rndis
libcomposite           36864  3 usb_f_ecm,g_ether,usb_f_rndis
configfs               32768  4 usb_f_ecm,usb_f_rndis,libcomposite
ipv6                  319488  16
wl18xx                 86016  1
wlcore                163840  1 wl18xx
mac80211              503808  2 wl18xx,wlcore
cfg80211              491520  3 wl18xx,wlcore,mac80211
panel_tpo_td028ttec1    16384  0
pps_gpio               16384  1
snd_soc_simple_card    16384  1
snd_soc_simple_card_utils    16384  1 snd_soc_simple_card
snd_soc_omap_twl4030    16384  1
pps_core               16384  1 pps_gpio
encoder_opa362         16384  1
wwan_on_off            16384  1
snd_soc_gtm601         16384  0
connector_analog_tv    16384  0
pwm_omap_dmtimer       16384  0
omapdss_base           16384  3 connector_analog_tv,encoder_opa362,panel_tpo_td028ttec1
generic_adc_battery    16384  0
pwm_bl                 16384  0
wlcore_sdio            16384  0
bmp280_spi             16384  0
bq27xxx_battery_hdq    16384  0
bq27xxx_battery        20480  1 bq27xxx_battery_hdq
omap_hdq               16384  0
omap2430               16384  0
ov9655                 20480  0
v4l2_fwnode            16384  1 ov9655
v4l2_common            16384  1 ov9655
snd_soc_omap_mcbsp     24576  0
snd_soc_omap           16384  1 snd_soc_omap_mcbsp
snd_pcm_dmaengine      16384  1 snd_soc_omap
bmp280_i2c             16384  0
bmp280                 20480  2 bmp280_spi,bmp280_i2c
videodev              139264  3 v4l2_fwnode,v4l2_common,ov9655
bmc150_magn_i2c        16384  0
tsc2007                16384  0
bmc150_accel_i2c       16384  0
bmc150_magn            16384  1 bmc150_magn_i2c
at24                   20480  0
bmc150_accel_core      20480  1 bmc150_accel_i2c
leds_tca6507           16384  0
industrialio_triggered_buffer    16384  2 bmc150_accel_core,bmc150_magn
media                  24576  2 videodev,ov9655
kfifo_buf              16384  1 industrialio_triggered_buffer
phy_twl4030_usb        16384  3
gpio_twl4030           16384  0
musb_hdrc             106496  2 omap2430,phy_twl4030_usb
twl4030_pwrbutton      16384  0
twl4030_vibra          16384  0
snd_soc_twl4030        49152  0
twl4030_madc           16384  0
twl4030_charger        20480  0
industrialio           53248  9 bmc150_accel_core,tsc2007,generic_adc_battery,twl4030_charger,bmp280,bmc150_magn,industrialio_triggered_buffer,twl4030_madc,kfifo_buf
gnss_w2sg0004          16384  0
w2cbw003_bluetooth     16384  0
gnss                   16384  1 gnss_w2sg0004
ehci_omap              16384  0
root@letux:~# modprobe omapdss
[  319.651580] omapdss: unknown parameter 'def_disp' ignored


^C^C^C^C^C^C^C


^C^C^C^C

but then it hung.


I am starting to think that something in the pinctrl group list management is
lacking a mutex, so that the list searched by pinctrl_get_group_selector() gets
corrupted.

Debugging concurrent thread issues is very very difficult...
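
For illustration only, here is a minimal user-space sketch of the suspected
failure mode and of the kind of protection that would avoid it. It is not the
kernel's pinctrl code: struct pin_group, group_add(), group_selector() and
groups_lock are hypothetical names made up for this example, and it assumes
(unconfirmed) that a lookup can run while a group entry is only half
registered. The lookup takes the same lock as the registration and skips
entries whose name is still NULL instead of passing them to strcmp().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a pin group table (not the kernel structures). */
struct pin_group {
	const char *name;	/* NULL until the entry is fully set up */
};

#define MAX_GROUPS 32

static struct pin_group groups[MAX_GROUPS];
static unsigned int num_groups;
static pthread_mutex_t groups_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register a group under the lock; returns its selector or -1. */
int group_add(const char *name)
{
	int selector = -1;

	pthread_mutex_lock(&groups_lock);
	if (name != NULL && num_groups < MAX_GROUPS) {
		/* Fill in the entry before it becomes visible via num_groups. */
		groups[num_groups].name = name;
		selector = (int)num_groups++;
	}
	pthread_mutex_unlock(&groups_lock);
	return selector;
}

/* Look up a group by name under the same lock; returns selector or -1. */
int group_selector(const char *wanted)
{
	int selector = -1;
	unsigned int i;

	if (wanted == NULL)
		return -1;

	pthread_mutex_lock(&groups_lock);
	assert(num_groups <= MAX_GROUPS);
	for (i = 0; i < num_groups; i++) {
		const char *gname = groups[i].name;

		/* Guard: never hand a NULL name to strcmp(). */
		if (gname == NULL)
			continue;
		if (strcmp(gname, wanted) == 0) {
			selector = (int)i;
			break;
		}
	}
	pthread_mutex_unlock(&groups_lock);
	return selector;
}
```

In this model, group_selector("backlight_pins_pinmux") returns -1 until
group_add("backlight_pins_pinmux") has completed, and the registered selector
afterwards. The NULL guard alone would only hide the symptom seen in the oops
(r0 = 0 on entry to strcmp); the shared lock is the part that corresponds to
the missing-mutex hypothesis.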

BR,
Nikolaus
