[Letux-kernel] Strange bug

H. Nikolaus Schaller hns at goldelico.com
Thu May 17 11:19:31 CEST 2018


> On 16.05.2018 at 21:55, H. Nikolaus Schaller <hns at goldelico.com> wrote:
> 
> Hi,
> I think since 4.17-rc1 (maybe before), I am sometimes
> hitting a strange bug, a NULL dereference in strcmp().
> 
> The call stack isn't very helpful since it happens
> in some pinctrl setup:
> 
> [   13.274017] omap-iommu 480bd400.mmu: 480bd400.mmu: version 1.1
> [   13.306854] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP CCP2 was not initialized!
> [   13.360168] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP CSI2a was not initialized!
> [   13.423461] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP CCDC was not initialized!
> [   13.473449] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP preview was not initialized!
> [   13.494995] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP resizer was not initialized!
> [   13.525878] pinctrl_generic_add_group: backlight_pins_pinmux (1)
> [   13.532379] pwm-backlight backlight: backlight supply power not found, using dummy regulator
> [   13.557037] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP AEWB was not initialized!
> [   13.584777] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP AF was not initialized!
> [   13.593536] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP histogram was not initialized!
> [   13.603454] gab_probe: channel 2 error: -19
> [   13.609344] (NULL device *): hwmon: 'gta04-battery' is not a valid name attribute, please fix
> [   13.635070] read_channel: 17 1 (ptrval) (ptrval)
> [   13.643951] pinctrl_generic_add_group: backlight_pins_pinmux (1)
> [   13.650787] iio_charge:-749
> [   13.658721] pinctrl_get_group_selector: (null) backlight_pins_pinmux
> [   13.671844] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> [   13.687347] pinctrl_generic_add_group: pinmux_mcbsp4_pins (3)
> [   13.693389] pgd = (ptrval)
> [   13.696380] [00000000] *pgd=00000000
> [   13.700134] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
> [   13.705780] Modules linked in: snd_soc_gtm601 pwm_omap_dmtimer connector_analog_tv generic_adc_battery pwm_bl bq27xxx_battery_hdq bq27xxx_battery omap3_isp(+) videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common omap_hdq omap2430 snd_soc_omap_mcbsp(+) snd_soc_omap snd_pcm_dmaengine bmp280_i2c(+) bmp280 ov9655(+) v4l2_fwnode v4l2_common itg3200 at24 videodev phy_twl4030_usb tsc2007 hmc5843_i2c hmc5843_core bma180 musb_hdrc industrialio_triggered_buffer lis3lv02d_i2c media leds_tca6507 lis3lv02d kfifo_buf input_polldev gpio_twl4030 snd_soc_twl4030 twl4030_vibra twl4030_charger twl4030_pwrbutton twl4030_madc industrialio w2sg0004 gps_core w2cbw003_bluetooth ehci_omap omapdss omapdss_base cec
> [   13.771331] CPU: 0 PID: 937 Comm: kworker/0:2 Not tainted 4.17.0-rc2-letux+ #2290
> [   13.779174] Hardware name: Generic OMAP36xx (Flattened Device Tree)
> [   13.785736] Workqueue: events deferred_probe_work_func
> [   13.791107] PC is at strcmp+0x0/0x34
> [   13.794860] LR is at pinctrl_get_group_selector+0x6c/0xa8
> [   13.800506] pc : [<c0705afc>]    lr : [<c0429840>]    psr: 600f0013
> [   13.807067] sp : dddf9e08  ip : 00000000  fp : 00000011
> [   13.812530] r10: 0000000f  r9 : 00000010  r8 : c0745b0c
> [   13.817993] r7 : 00000000  r6 : ddc41600  r5 : df9c5274  r4 : 0000000e
> [   13.824829] r3 : 600f0013  r2 : 00000002  r1 : df9c5274  r0 : 00000000
> [   13.831665] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
> [   13.839111] Control: 10c5387d  Table: 9c88c019  DAC: 00000051
> [   13.845123] Process kworker/0:2 (pid: 937, stack limit = 0x(ptrval))
> [   13.851776] Stack: (0xdddf9e08 to 0xdddfa000)
> [   13.856323] 9e00:                   dddf9e2c ddc41600 df9c5274 00000000 dcc99f10 00000000
> [   13.864868] 9e20: dcc98b80 c042a808 dcc84110 00000001 00000002 dcc98e40 dcc5bd80 dcc85f40
> [   13.873413] 9e40: dcc99f10 00000000 00000000 dcc98b80 00000000 c0429040 00000014 dda424c0
> [   13.881988] 9e60: c0a58df0 c0876b8c dda47c10 00000000 dca09110 dda47c10 c0aca804 fffffdfb
> [   13.890563] 9e80: bf282014 0000002c c0a94390 c0429208 00000000 dda47c10 dca09150 c04bb85c
> [   13.899108] 9ea0: dda47c10 00000000 c0aca808 c049fb58 00000000 dddf9ee8 c049fe64 dda47c44
> [   13.907714] 9ec0: df9bbe00 c0a02d00 00000000 c049e3b8 dd81be6c dcc3d738 dda47c10 c0a63258
> [   13.916259] 9ee0: 00000001 c049f9d8 dda47c10 00000001 00000000 dda47c10 c0a63258 dda47c10
> [   13.924804] 9f00: c0a9dba0 c049ef94 dda47c10 c0a63044 c0a63060 c049f4a4 c049f3b8 dde97900
> [   13.933380] 9f20: c0a63078 df9b8c00 00000000 df9bbe00 c0a02d00 c0145dac dde97900 c0a63078
> [   13.941955] 9f40: ffff9023 dde97900 df9b8c00 df9b8c00 dddf8000 df9b8c18 c0a02d00 dde97918
> [   13.950500] 9f60: 00000008 c0146580 dda48240 dde09ec0 dde16640 00000000 dde97900 c01462c0
> [   13.959075] 9f80: dd8b1ef0 dde09edc 00000000 c014a598 dde16640 c014a464 00000000 00000000
> [   13.967620] 9fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
> [   13.976196] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   13.984741] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [   13.993316] [<c0705afc>] (strcmp) from [<c0429840>] (pinctrl_get_group_selector+0x6c/0xa8)
> [   14.001953] [<c0429840>] (pinctrl_get_group_selector) from [<c042a808>] (pinmux_map_to_setting+0x158/0x1a0)
> [   14.012145] [<c042a808>] (pinmux_map_to_setting) from [<c0429040>] (create_pinctrl+0x1f0/0x2f8)
> [   14.021240] [<c0429040>] (create_pinctrl) from [<c0429208>] (devm_pinctrl_get+0x2c/0x6c)
> [   14.029724] [<c0429208>] (devm_pinctrl_get) from [<c04bb85c>] (pinctrl_bind_pins+0x3c/0x138)
> [   14.038574] [<c04bb85c>] (pinctrl_bind_pins) from [<c049fb58>] (driver_probe_device+0xe8/0x318)
> [   14.047668] [<c049fb58>] (driver_probe_device) from [<c049e3b8>] (bus_for_each_drv+0x84/0x94)
> [   14.056610] [<c049e3b8>] (bus_for_each_drv) from [<c049f9d8>] (__device_attach+0x88/0xfc)
> [   14.065155] [<c049f9d8>] (__device_attach) from [<c049ef94>] (bus_probe_device+0x28/0x80)
> [   14.073730] [<c049ef94>] (bus_probe_device) from [<c049f4a4>] (deferred_probe_work_func+0xec/0x120)
> [   14.083221] [<c049f4a4>] (deferred_probe_work_func) from [<c0145dac>] (process_one_work+0x244/0x464)
> [   14.092803] [<c0145dac>] (process_one_work) from [<c0146580>] (worker_thread+0x2c0/0x3ec)
> [   14.101348] [<c0146580>] (worker_thread) from [<c014a598>] (kthread+0x134/0x150)
> [   14.109100] [<c014a598>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> [   14.116638] Exception stack(0xdddf9fb0 to 0xdddf9ff8)
> [   14.121917] 9fa0:                                     00000000 00000000 00000000 00000000
> [   14.130462] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [   14.139038] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [   14.145935] Code: e3520000 e5e32001 1afffffb e12fff1e (e4d03001) 
> [   14.282989] twl4030_voice_set_tristate codec=(ptrval) 1
> [   14.294921] TPS Voice IF is tristated
> [   14.299560] omap-mcbsp 49022000.mcbsp: ASoC: Failed to create component debugfs directory
> 
> After that, the GTA04 kernel doesn't recover and never finishes booting.
> 
> The problem does not always appear; after rebooting once or twice everything
> seems to be fine.
> 
> What I found as a "cure" is to add:
> 
> root@letux:~# more /etc/modprobe.d/blacklist.conf
> blacklist pwm_bl
> blacklist omap3_isp
> root@letux:~#

Commenting out the omap3_isp blacklist entry makes no difference.

But not blacklisting pwm_bl (even without loading omap3_isp) makes it hang again:

[    6.365997] bq27xxx_battery_settings
[    6.384368] bq27xxx_battery_settings: power_supply_get_battery_info failed ret=-1088446444
[    6.466979] pwm-backlight backlight: backlight supply power not found, using dummy regulator
[    6.559082] wwan_on_off_init: wwan_on_off_init
[    6.569183] (NULL device *): hwmon: 'gta04-battery' is not a valid name attribute, please fix
[    6.615905] pps_core: LinuxPPS API ver. 1 registered
[    6.637145] iio_charge:-749
[    6.642486] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    6.666229] pgd = (ptrval)
[    6.669219] [00000000] *pgd=00000000
[    6.672973] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[    6.678619] Modules linked in: snd_soc_simple_card_utils snd_soc_omap_twl4030(+) pps_core(+) encoder_opa362 wwan_on_off(+) snd_soc_gtm601 connector_analog_tv pwm_omap_dmtimer generic_adc_battery pwm_bl bq27xxx_battery_hdq bq27xxx_battery bmp280_spi wlcore_sdio ov9655 v4l2_fwnode v4l2_common omap_hdq omap2430 bmp280_i2c videodev bmp280 at24 bmc150_accel_i2c tsc2007 leds_tca6507 bmc150_magn_i2c bmc150_accel_core bmc150_magn media industrialio_triggered_buffer kfifo_buf phy_twl4030_usb snd_soc_omap_mcbsp snd_soc_omap snd_pcm_dmaengine gpio_twl4030 musb_hdrc snd_soc_twl4030 gnss_w2sg0004 twl4030_vibra twl4030_madc twl4030_charger twl4030_pwrbutton industrialio gnss w2cbw003_bluetooth ehci_omap omapdss omapdss_base cec
[    6.744812] CPU: 0 PID: 43 Comm: kworker/0:1 Not tainted 4.17.0-rc5-letux+ #2330
[    6.752532] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[    6.759094] Workqueue: events deferred_probe_work_func
[    6.764465] PC is at strcmp+0x0/0x34
[    6.768218] LR is at pinctrl_get_group_selector+0x44/0x78
[    6.773864] pc : [<c070d12c>]    lr : [<c0429aac>]    psr: a0000113
[    6.780395] sp : ee23de10  ip : ed1dce90  fp : 0000001e
[    6.785827] r10: 0000001b  r9 : ed0c8840  r8 : 0000001e
[    6.791290] r7 : c074cca8  r6 : ef7c4a20  r5 : ee42a600  r4 : 0000001b
[    6.798095] r3 : c0427cf0  r2 : 00000000  r1 : ef7c4a20  r0 : 00000000
[    6.804901] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[    6.812347] Control: 10c5387d  Table: ae7a4019  DAC: 00000051
[    6.818328] Process kworker/0:1 (pid: 43, stack limit = 0x(ptrval))
[    6.824890] Stack: (0xee23de10 to 0xee23e000)
[    6.829437] de00:                                     ee42a600 ef7c4a20 00000000 ed1d8810
[    6.837982] de20: 00000000 c042aa6c ed206fd0 00000001 00000002 ed0c8cc0 ed0c8880 ed0c88c0
[    6.846527] de40: ed1d8810 00000000 00000000 ed0c8840 00000000 c04292d4 00000014 ee25fa40
[    6.855072] de60: c0a58e30 c087f8cc ee25d010 00000000 ed0c8d10 ee25d010 c0aca944 fffffdfb
[    6.863616] de80: bf202014 0000002c c0a94490 c042949c 00000000 ee25d010 ed0c8990 c04bbb58
[    6.872161] dea0: ee25d010 00000000 c0aca948 c049fe50 00000000 ee23dee8 c04a015c ee25d044
[    6.880706] dec0: ef7baf00 c0a02d00 00000000 c049e6b0 ee01be6c ed1fad38 ee25d010 c0a632d0
[    6.889251] dee0: 00000001 c049fcd0 ee25d010 00000001 00000000 ee25d010 c0a632d0 ee25d010
[    6.897766] df00: c0a9dce0 c049f28c ee25d010 c0a630bc c0a630d8 c049f79c c049f6b0 ee1dbe80
[    6.906311] df20: c0a630f0 ef7b7c40 00000000 ef7baf00 c0a02d00 c0145ef0 ee1dbe80 c0a630f0
[    6.914855] df40: ffff8d67 ee1dbe80 ef7b7c40 ef7b7c40 ee23c000 ef7b7c58 c0a02d00 ee1dbe98
[    6.923400] df60: 00000008 c01466c4 00000000 ee2097c0 ee209800 00000000 ee1dbe80 c0146404
[    6.931915] df80: ee0b5ef0 ee2097dc 00000000 c014a6dc ee209800 c014a5a8 00000000 00000000
[    6.940460] dfa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
[    6.949005] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    6.957519] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[    6.966064] [<c070d12c>] (strcmp) from [<c0429aac>] (pinctrl_get_group_selector+0x44/0x78)
[    6.974700] [<c0429aac>] (pinctrl_get_group_selector) from [<c042aa6c>] (pinmux_map_to_setting+0x158/0x1a0)
[    6.984893] [<c042aa6c>] (pinmux_map_to_setting) from [<c04292d4>] (create_pinctrl+0x1f0/0x2f8)
[    6.993957] [<c04292d4>] (create_pinctrl) from [<c042949c>] (devm_pinctrl_get+0x2c/0x6c)
[    7.002410] [<c042949c>] (devm_pinctrl_get) from [<c04bbb58>] (pinctrl_bind_pins+0x3c/0x138)
[    7.011230] [<c04bbb58>] (pinctrl_bind_pins) from [<c049fe50>] (driver_probe_device+0xe8/0x318)
[    7.020294] [<c049fe50>] (driver_probe_device) from [<c049e6b0>] (bus_for_each_drv+0x84/0x94)
[    7.029205] [<c049e6b0>] (bus_for_each_drv) from [<c049fcd0>] (__device_attach+0x88/0xfc)
[    7.037750] [<c049fcd0>] (__device_attach) from [<c049f28c>] (bus_probe_device+0x28/0x80)
[    7.046295] [<c049f28c>] (bus_probe_device) from [<c049f79c>] (deferred_probe_work_func+0xec/0x120)
[    7.055755] [<c049f79c>] (deferred_probe_work_func) from [<c0145ef0>] (process_one_work+0x244/0x464)
[    7.065277] [<c0145ef0>] (process_one_work) from [<c01466c4>] (worker_thread+0x2c0/0x3ec)
[    7.073822] [<c01466c4>] (worker_thread) from [<c014a6dc>] (kthread+0x134/0x150)
[    7.081542] [<c014a6dc>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[    7.089080] Exception stack(0xee23dfb0 to 0xee23dff8)
[    7.094360] dfa0:                                     00000000 00000000 00000000 00000000
[    7.102905] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    7.111450] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    7.118347] Code: e3520000 e5e32001 1afffffb e12fff1e (e4d03001) 
[    7.127014] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti at linux.it>
[    7.586273] ---[ end trace 3a04ee80f8726c81 ]---
[    8.385223] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    8.422821] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    8.446533] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    8.455566] cfg80211: failed to load regulatory.db
[   19.887054] random: crng init done


So it seems to be the pwm_bl driver alone which causes the trouble...
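
Both traces fault at strcmp+0x0 with r0 = 00000000, and the debug print in
the first trace shows "(null)" next to backlight_pins_pinmux just before the
oops. As a rough illustration of a lookup that tolerates unset names, here is
a user-space sketch in C. The struct group_table type and find_group() are
hypothetical names made up for this example; this is not the kernel's code,
and such a guard would only hide the symptom, not explain why a name is NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; not the kernel's data structures. */
struct group_table {
    const char **names;   /* entries may be NULL if not (yet) set up */
    unsigned int count;
};

/*
 * Return the index of the group called `wanted`, or -1 if none matches.
 * NULL entries are skipped instead of being handed to strcmp().
 */
static int find_group(const struct group_table *t, const char *wanted)
{
    unsigned int i;

    if (t == NULL || wanted == NULL)
        return -1;

    for (i = 0; i < t->count; i++) {
        const char *name = t->names[i];

        if (name != NULL && strcmp(name, wanted) == 0)
            return (int)i;
    }
    return -1;
}
```

With a table that contains a NULL entry between two valid names, the lookup
skips the hole and still finds the later entry instead of crashing.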

On a first code scan of https://elixir.bootlin.com/linux/v4.17-rc5/source/drivers/video/backlight/pwm_bl.c
I couldn't find anything obviously harmful, except that there is no explicit handling for the enable-gpio
lookup returning -EPROBE_DEFER.
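
To make clear what "explicit handling" would mean, here is a small user-space
model in C of a probe routine that hands -EPROBE_DEFER back to its caller
before touching any state. Everything except the EPROBE_DEFER value (517 in
the kernel's errno definitions) is an assumption for illustration:
lookup_enable_gpio(), struct fake_dev and model_probe() are hypothetical and
are not taken from pwm_bl.c.

```c
#include <stddef.h>

#define EPROBE_DEFER 517  /* kernel value for "driver requests probe retry" */

/* Hypothetical device state for this model. */
struct fake_dev {
    int acquired;  /* set once the resource has been obtained */
    int probed;    /* set once probing has completed */
};

/*
 * Hypothetical stand-in for a resource lookup such as an enable GPIO:
 * returns 0 when the provider is ready, -EPROBE_DEFER when it is not.
 */
static int lookup_enable_gpio(int provider_ready)
{
    return provider_ready ? 0 : -EPROBE_DEFER;
}

/* Model probe: propagate the error before modifying the device state. */
static int model_probe(struct fake_dev *dev, int provider_ready)
{
    int ret = lookup_enable_gpio(provider_ready);

    if (ret < 0)
        return ret;  /* includes -EPROBE_DEFER; caller may retry later */

    dev->acquired = 1;
    dev->probed = 1;
    return 0;
}
```

The point of the pattern is that a deferred attempt leaves nothing half
initialized, so a second attempt starts from a clean state.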

Well, there is one thing: the code seems to require an explicitly specified "power" regulator.
AFAIR, we don't have one, but the code reports that it substitutes a dummy regulator:

pwm-backlight backlight: backlight supply power not found, using dummy regulator


> 
> Then, of course, there is no backlight. I can start it manually with
> 
> root@letux:~# modprobe pwm_bl
> 
> I am not sure who the troublemaker is: the omap3isp or the pwm_bl.
> 
> In any case it does not seem to be the driver itself, because pinctrl_get_group_selector()
> is called before the driver is probed.
> 
> So we either have some other driver which damages the pinctrl groups in a way that
> those drivers fail, or we have a bug in our device tree.
> 
> Or something with the pinctrl group setup isn't properly locked so that there is
> a strcmp() on some name before it is initialized.
> 
> Or some driver has a bad error path so that the deferred_probe_work_func() gets into
> trouble on the second attempt.
> 
> I had also tried to read the pinctrl groups via debugfs and, if I remember correctly,
> there were multiple entries for some devices.
> 
> Anyone with similar experiences?
> Any ideas how to debug further?
> 
> BR and thanks,
> Nikolaus
> 


