[Letux-kernel] Strange bug
H. Nikolaus Schaller
hns at goldelico.com
Thu May 17 11:19:31 CEST 2018
> On 16.05.2018 at 21:55, H. Nikolaus Schaller <hns at goldelico.com> wrote:
>
> Hi,
> I think since 4.17-rc1 (maybe before), I am sometimes
> hitting a strange bug, a NULL dereference in strcmp().
>
> The call stack isn't very helpful since it happens
> in some pinctrl setup:
>
> [ 13.274017] omap-iommu 480bd400.mmu: 480bd400.mmu: version 1.1
> [ 13.306854] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP CCP2 was not initialized!
> [ 13.360168] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP CSI2a was not initialized!
> [ 13.423461] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP CCDC was not initialized!
> [ 13.473449] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP preview was not initialized!
> [ 13.494995] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP resizer was not initialized!
> [ 13.525878] pinctrl_generic_add_group: backlight_pins_pinmux (1)
> [ 13.532379] pwm-backlight backlight: backlight supply power not found, using dummy regulator
> [ 13.557037] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP AEWB was not initialized!
> [ 13.584777] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP AF was not initialized!
> [ 13.593536] omap3isp 480bc000.isp: Entity type for entity OMAP3 ISP histogram was not initialized!
> [ 13.603454] gab_probe: channel 2 error: -19
> [ 13.609344] (NULL device *): hwmon: 'gta04-battery' is not a valid name attribute, please fix
> [ 13.635070] read_channel: 17 1 (ptrval) (ptrval)
> [ 13.643951] pinctrl_generic_add_group: backlight_pins_pinmux (1)
> [ 13.650787] iio_charge:-749
> [ 13.658721] pinctrl_get_group_selector: (null) backlight_pins_pinmux
> [ 13.671844] Unable to handle kernel NULL pointer dereference at virtual address 00000000
> [ 13.687347] pinctrl_generic_add_group: pinmux_mcbsp4_pins (3)
> [ 13.693389] pgd = (ptrval)
> [ 13.696380] [00000000] *pgd=00000000
> [ 13.700134] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
> [ 13.705780] Modules linked in: snd_soc_gtm601 pwm_omap_dmtimer connector_analog_tv generic_adc_battery pwm_bl bq27xxx_battery_hdq bq27xxx_battery omap3_isp(+) videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common omap_hdq omap2430 snd_soc_omap_mcbsp(+) snd_soc_omap snd_pcm_dmaengine bmp280_i2c(+) bmp280 ov9655(+) v4l2_fwnode v4l2_common itg3200 at24 videodev phy_twl4030_usb tsc2007 hmc5843_i2c hmc5843_core bma180 musb_hdrc industrialio_triggered_buffer lis3lv02d_i2c media leds_tca6507 lis3lv02d kfifo_buf input_polldev gpio_twl4030 snd_soc_twl4030 twl4030_vibra twl4030_charger twl4030_pwrbutton twl4030_madc industrialio w2sg0004 gps_core w2cbw003_bluetooth ehci_omap omapdss omapdss_base cec
> [ 13.771331] CPU: 0 PID: 937 Comm: kworker/0:2 Not tainted 4.17.0-rc2-letux+ #2290
> [ 13.779174] Hardware name: Generic OMAP36xx (Flattened Device Tree)
> [ 13.785736] Workqueue: events deferred_probe_work_func
> [ 13.791107] PC is at strcmp+0x0/0x34
> [ 13.794860] LR is at pinctrl_get_group_selector+0x6c/0xa8
> [ 13.800506] pc : [<c0705afc>] lr : [<c0429840>] psr: 600f0013
> [ 13.807067] sp : dddf9e08 ip : 00000000 fp : 00000011
> [ 13.812530] r10: 0000000f r9 : 00000010 r8 : c0745b0c
> [ 13.817993] r7 : 00000000 r6 : ddc41600 r5 : df9c5274 r4 : 0000000e
> [ 13.824829] r3 : 600f0013 r2 : 00000002 r1 : df9c5274 r0 : 00000000
> [ 13.831665] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [ 13.839111] Control: 10c5387d Table: 9c88c019 DAC: 00000051
> [ 13.845123] Process kworker/0:2 (pid: 937, stack limit = 0x(ptrval))
> [ 13.851776] Stack: (0xdddf9e08 to 0xdddfa000)
> [ 13.856323] 9e00: dddf9e2c ddc41600 df9c5274 00000000 dcc99f10 00000000
> [ 13.864868] 9e20: dcc98b80 c042a808 dcc84110 00000001 00000002 dcc98e40 dcc5bd80 dcc85f40
> [ 13.873413] 9e40: dcc99f10 00000000 00000000 dcc98b80 00000000 c0429040 00000014 dda424c0
> [ 13.881988] 9e60: c0a58df0 c0876b8c dda47c10 00000000 dca09110 dda47c10 c0aca804 fffffdfb
> [ 13.890563] 9e80: bf282014 0000002c c0a94390 c0429208 00000000 dda47c10 dca09150 c04bb85c
> [ 13.899108] 9ea0: dda47c10 00000000 c0aca808 c049fb58 00000000 dddf9ee8 c049fe64 dda47c44
> [ 13.907714] 9ec0: df9bbe00 c0a02d00 00000000 c049e3b8 dd81be6c dcc3d738 dda47c10 c0a63258
> [ 13.916259] 9ee0: 00000001 c049f9d8 dda47c10 00000001 00000000 dda47c10 c0a63258 dda47c10
> [ 13.924804] 9f00: c0a9dba0 c049ef94 dda47c10 c0a63044 c0a63060 c049f4a4 c049f3b8 dde97900
> [ 13.933380] 9f20: c0a63078 df9b8c00 00000000 df9bbe00 c0a02d00 c0145dac dde97900 c0a63078
> [ 13.941955] 9f40: ffff9023 dde97900 df9b8c00 df9b8c00 dddf8000 df9b8c18 c0a02d00 dde97918
> [ 13.950500] 9f60: 00000008 c0146580 dda48240 dde09ec0 dde16640 00000000 dde97900 c01462c0
> [ 13.959075] 9f80: dd8b1ef0 dde09edc 00000000 c014a598 dde16640 c014a464 00000000 00000000
> [ 13.967620] 9fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
> [ 13.976196] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 13.984741] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
> [ 13.993316] [<c0705afc>] (strcmp) from [<c0429840>] (pinctrl_get_group_selector+0x6c/0xa8)
> [ 14.001953] [<c0429840>] (pinctrl_get_group_selector) from [<c042a808>] (pinmux_map_to_setting+0x158/0x1a0)
> [ 14.012145] [<c042a808>] (pinmux_map_to_setting) from [<c0429040>] (create_pinctrl+0x1f0/0x2f8)
> [ 14.021240] [<c0429040>] (create_pinctrl) from [<c0429208>] (devm_pinctrl_get+0x2c/0x6c)
> [ 14.029724] [<c0429208>] (devm_pinctrl_get) from [<c04bb85c>] (pinctrl_bind_pins+0x3c/0x138)
> [ 14.038574] [<c04bb85c>] (pinctrl_bind_pins) from [<c049fb58>] (driver_probe_device+0xe8/0x318)
> [ 14.047668] [<c049fb58>] (driver_probe_device) from [<c049e3b8>] (bus_for_each_drv+0x84/0x94)
> [ 14.056610] [<c049e3b8>] (bus_for_each_drv) from [<c049f9d8>] (__device_attach+0x88/0xfc)
> [ 14.065155] [<c049f9d8>] (__device_attach) from [<c049ef94>] (bus_probe_device+0x28/0x80)
> [ 14.073730] [<c049ef94>] (bus_probe_device) from [<c049f4a4>] (deferred_probe_work_func+0xec/0x120)
> [ 14.083221] [<c049f4a4>] (deferred_probe_work_func) from [<c0145dac>] (process_one_work+0x244/0x464)
> [ 14.092803] [<c0145dac>] (process_one_work) from [<c0146580>] (worker_thread+0x2c0/0x3ec)
> [ 14.101348] [<c0146580>] (worker_thread) from [<c014a598>] (kthread+0x134/0x150)
> [ 14.109100] [<c014a598>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
> [ 14.116638] Exception stack(0xdddf9fb0 to 0xdddf9ff8)
> [ 14.121917] 9fa0: 00000000 00000000 00000000 00000000
> [ 14.130462] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> [ 14.139038] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000
> [ 14.145935] Code: e3520000 e5e32001 1afffffb e12fff1e (e4d03001)
> [ 14.282989] twl4030_voice_set_tristate codec=(ptrval) 1
> [ 14.294921] TPS Voice IF is tristated
> [ 14.299560] omap-mcbsp 49022000.mcbsp: ASoC: Failed to create component debugfs directory
>
> After that, the GTA04 kernel doesn't recover and finish booting.
>
> The problem does not always appear; after rebooting once or twice everything
> seems to be fine.
>
> What I found as a "cure" is to add:
>
> root@letux:~# more /etc/modprobe.d/blacklist.conf
> blacklist pwm_bl
> blacklist omap3_isp
> root@letux:~#
Commenting out the omap3_isp entry makes no difference.
But not blacklisting pwm_bl (even without loading omap3_isp) makes it hang again:
[ 6.365997] bq27xxx_battery_settings
[ 6.384368] bq27xxx_battery_settings: power_supply_get_battery_info failed ret=-1088446444
[ 6.466979] pwm-backlight backlight: backlight supply power not found, using dummy regulator
[ 6.559082] wwan_on_off_init: wwan_on_off_init
[ 6.569183] (NULL device *): hwmon: 'gta04-battery' is not a valid name attribute, please fix
[ 6.615905] pps_core: LinuxPPS API ver. 1 registered
[ 6.637145] iio_charge:-749
[ 6.642486] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 6.666229] pgd = (ptrval)
[ 6.669219] [00000000] *pgd=00000000
[ 6.672973] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[ 6.678619] Modules linked in: snd_soc_simple_card_utils snd_soc_omap_twl4030(+) pps_core(+) encoder_opa362 wwan_on_off(+) snd_soc_gtm601 connector_analog_tv pwm_omap_dmtimer generic_adc_battery pwm_bl bq27xxx_battery_hdq bq27xxx_battery bmp280_spi wlcore_sdio ov9655 v4l2_fwnode v4l2_common omap_hdq omap2430 bmp280_i2c videodev bmp280 at24 bmc150_accel_i2c tsc2007 leds_tca6507 bmc150_magn_i2c bmc150_accel_core bmc150_magn media industrialio_triggered_buffer kfifo_buf phy_twl4030_usb snd_soc_omap_mcbsp snd_soc_omap snd_pcm_dmaengine gpio_twl4030 musb_hdrc snd_soc_twl4030 gnss_w2sg0004 twl4030_vibra twl4030_madc twl4030_charger twl4030_pwrbutton industrialio gnss w2cbw003_bluetooth ehci_omap omapdss omapdss_base cec
[ 6.744812] CPU: 0 PID: 43 Comm: kworker/0:1 Not tainted 4.17.0-rc5-letux+ #2330
[ 6.752532] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[ 6.759094] Workqueue: events deferred_probe_work_func
[ 6.764465] PC is at strcmp+0x0/0x34
[ 6.768218] LR is at pinctrl_get_group_selector+0x44/0x78
[ 6.773864] pc : [<c070d12c>] lr : [<c0429aac>] psr: a0000113
[ 6.780395] sp : ee23de10 ip : ed1dce90 fp : 0000001e
[ 6.785827] r10: 0000001b r9 : ed0c8840 r8 : 0000001e
[ 6.791290] r7 : c074cca8 r6 : ef7c4a20 r5 : ee42a600 r4 : 0000001b
[ 6.798095] r3 : c0427cf0 r2 : 00000000 r1 : ef7c4a20 r0 : 00000000
[ 6.804901] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 6.812347] Control: 10c5387d Table: ae7a4019 DAC: 00000051
[ 6.818328] Process kworker/0:1 (pid: 43, stack limit = 0x(ptrval))
[ 6.824890] Stack: (0xee23de10 to 0xee23e000)
[ 6.829437] de00: ee42a600 ef7c4a20 00000000 ed1d8810
[ 6.837982] de20: 00000000 c042aa6c ed206fd0 00000001 00000002 ed0c8cc0 ed0c8880 ed0c88c0
[ 6.846527] de40: ed1d8810 00000000 00000000 ed0c8840 00000000 c04292d4 00000014 ee25fa40
[ 6.855072] de60: c0a58e30 c087f8cc ee25d010 00000000 ed0c8d10 ee25d010 c0aca944 fffffdfb
[ 6.863616] de80: bf202014 0000002c c0a94490 c042949c 00000000 ee25d010 ed0c8990 c04bbb58
[ 6.872161] dea0: ee25d010 00000000 c0aca948 c049fe50 00000000 ee23dee8 c04a015c ee25d044
[ 6.880706] dec0: ef7baf00 c0a02d00 00000000 c049e6b0 ee01be6c ed1fad38 ee25d010 c0a632d0
[ 6.889251] dee0: 00000001 c049fcd0 ee25d010 00000001 00000000 ee25d010 c0a632d0 ee25d010
[ 6.897766] df00: c0a9dce0 c049f28c ee25d010 c0a630bc c0a630d8 c049f79c c049f6b0 ee1dbe80
[ 6.906311] df20: c0a630f0 ef7b7c40 00000000 ef7baf00 c0a02d00 c0145ef0 ee1dbe80 c0a630f0
[ 6.914855] df40: ffff8d67 ee1dbe80 ef7b7c40 ef7b7c40 ee23c000 ef7b7c58 c0a02d00 ee1dbe98
[ 6.923400] df60: 00000008 c01466c4 00000000 ee2097c0 ee209800 00000000 ee1dbe80 c0146404
[ 6.931915] df80: ee0b5ef0 ee2097dc 00000000 c014a6dc ee209800 c014a5a8 00000000 00000000
[ 6.940460] dfa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
[ 6.949005] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 6.957519] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[ 6.966064] [<c070d12c>] (strcmp) from [<c0429aac>] (pinctrl_get_group_selector+0x44/0x78)
[ 6.974700] [<c0429aac>] (pinctrl_get_group_selector) from [<c042aa6c>] (pinmux_map_to_setting+0x158/0x1a0)
[ 6.984893] [<c042aa6c>] (pinmux_map_to_setting) from [<c04292d4>] (create_pinctrl+0x1f0/0x2f8)
[ 6.993957] [<c04292d4>] (create_pinctrl) from [<c042949c>] (devm_pinctrl_get+0x2c/0x6c)
[ 7.002410] [<c042949c>] (devm_pinctrl_get) from [<c04bbb58>] (pinctrl_bind_pins+0x3c/0x138)
[ 7.011230] [<c04bbb58>] (pinctrl_bind_pins) from [<c049fe50>] (driver_probe_device+0xe8/0x318)
[ 7.020294] [<c049fe50>] (driver_probe_device) from [<c049e6b0>] (bus_for_each_drv+0x84/0x94)
[ 7.029205] [<c049e6b0>] (bus_for_each_drv) from [<c049fcd0>] (__device_attach+0x88/0xfc)
[ 7.037750] [<c049fcd0>] (__device_attach) from [<c049f28c>] (bus_probe_device+0x28/0x80)
[ 7.046295] [<c049f28c>] (bus_probe_device) from [<c049f79c>] (deferred_probe_work_func+0xec/0x120)
[ 7.055755] [<c049f79c>] (deferred_probe_work_func) from [<c0145ef0>] (process_one_work+0x244/0x464)
[ 7.065277] [<c0145ef0>] (process_one_work) from [<c01466c4>] (worker_thread+0x2c0/0x3ec)
[ 7.073822] [<c01466c4>] (worker_thread) from [<c014a6dc>] (kthread+0x134/0x150)
[ 7.081542] [<c014a6dc>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[ 7.089080] Exception stack(0xee23dfb0 to 0xee23dff8)
[ 7.094360] dfa0: 00000000 00000000 00000000 00000000
[ 7.102905] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 7.111450] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 7.118347] Code: e3520000 e5e32001 1afffffb e12fff1e (e4d03001)
[ 7.127014] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti at linux.it>
[ 7.586273] ---[ end trace 3a04ee80f8726c81 ]---
[ 8.385223] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 8.422821] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 8.446533] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 8.455566] cfg80211: failed to load regulatory.db
[ 19.887054] random: crng init done
So it seems to be the pwm_bl driver alone that causes the trouble.
On a first code scan of https://elixir.bootlin.com/linux/v4.17-rc5/source/drivers/video/backlight/pwm_bl.c
I couldn't find anything obviously harmful, except that there is no explicit handling of the enable-gpio
lookup returning -EPROBE_DEFER.
Well, there is one more thing: the code seems to require an explicitly specified "power" regulator.
AFAIR, we don't have one, but the code reports that it substitutes a dummy regulator:
pwm-backlight backlight: backlight supply power not found, using dummy regulator
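
Both oopses show r0 = 00000000 at strcmp+0x0, and the first log contains the line "pinctrl_get_group_selector: (null) backlight_pins_pinmux", so the first string handed to strcmp() appears to be NULL. As a minimal userspace sketch of that failure mode (struct group_table and find_group() are made-up names for illustration, not the kernel's data structures or its actual code), a lookup loop that tolerates unnamed slots could look like this:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a pin controller's group table.
 * This is NOT the kernel's data structure. */
struct group_table {
    const char *const *names; /* names[i] may be NULL for an unused slot */
    unsigned count;           /* number of selectors to scan */
};

/* Return the selector of 'wanted', or -1 if no group has that name. */
int find_group(const struct group_table *t, const char *wanted)
{
    assert(wanted != NULL);

    for (unsigned sel = 0; sel < t->count; sel++) {
        const char *gname = t->names[sel];

        /* Without the NULL check, strcmp(NULL, wanted) would read
         * from address 0, which matches the oops signature above. */
        if (gname && strcmp(gname, wanted) == 0)
            return (int)sel;
    }
    return -1;
}
```

With a table such as { "pinmux_mcbsp4_pins", NULL, "backlight_pins_pinmux" } and count 3, find_group() returns 2 for "backlight_pins_pinmux" instead of faulting on the NULL slot. Such a guard would only hide the symptom; it does not explain why a selector without a name exists in the first place.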
>
> Then, of course, there is no backlight. I can start it manually with
>
> root@letux:~# modprobe pwm_bl
>
> I am not sure who the troublemaker is: the omap3isp or the pwm_bl.
>
> In any case it seems not to be the driver, because the pinctrl_get_group_selector()
> is called before the driver is probed.
>
> So we either have some other driver which damages the pinctrl groups in a way that
> those drivers fail, or we have a bug in our device tree.
>
> Or something with the pinctrl group setup isn't properly locked so that there is
> a strcmp() on some name before it is initialized.
>
> Or some driver has a bad error path so that the deferred_probe_work_func() gets into
> trouble on the second attempt.
>
> I had also tried to read the pinctrl groups via debugfs and, if I remember correctly,
> there were multiple entries for some devices.
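
One hypothetical way a failed and then retried probe could produce both duplicate registrations and a nameless selector is sketched below. This is a simplified userspace model under stated assumptions (struct registry, add_group(), remove_group() and count_holes() are invented for illustration and are not the pinctrl core): the error path clears a slot but does not shrink the selector range, so the retry appends the same name again and leaves a hole behind.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUPS 8

/* Hypothetical registry; a simplified model, not the kernel's pinctrl core. */
struct registry {
    const char *names[MAX_GROUPS];
    unsigned count; /* number of selectors handed out so far */
};

/* Register a group and return its selector, or -1 if the table is full. */
int add_group(struct registry *r, const char *name)
{
    assert(name != NULL);
    if (r->count >= MAX_GROUPS)
        return -1;
    r->names[r->count] = name;
    return (int)r->count++;
}

/* Assumed error path of a failed probe: the slot is cleared, but 'count'
 * is left alone, so a NULL name stays inside the scanned range. */
void remove_group(struct registry *r, int sel)
{
    if (sel >= 0 && (unsigned)sel < r->count)
        r->names[sel] = NULL;
}

/* Count the NULL names a lookup loop over 0..count-1 would run into. */
unsigned count_holes(const struct registry *r)
{
    unsigned holes = 0;

    for (unsigned sel = 0; sel < r->count; sel++)
        if (r->names[sel] == NULL)
            holes++;
    return holes;
}
```

In this model, registering "backlight_pins_pinmux", removing it on the failed attempt, and registering it again on the retry yields selectors 0 and 1, with selector 0 left as a hole. Whether the real code path behaves like this is an assumption that would have to be checked against the pinctrl sources.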
>
> Anyone with similar experiences?
> Any ideas how to debug further?
>
> BR and thanks,
> Nikolaus
>