[Letux-kernel] [PATCH RFC] net: hso: register netdev later to avoid a race condition
H. Nikolaus Schaller
hns at goldelico.com
Tue Apr 25 15:29:53 CEST 2017
> On 25.04.2017 at 15:13, H. Nikolaus Schaller <hns at goldelico.com> wrote:
>
> Hi,
>
>> On 24.04.2017 at 23:20, Andreas Kemnade <andreas at kemnade.info> wrote:
>>
>> Hi,
>>
>> On Mon, 24 Apr 2017 22:52:34 +0200
>> "H. Nikolaus Schaller" <hns at goldelico.com> wrote:
>>
>>> Hi,
>>>
>>>> On 24.04.2017 at 22:41, Andreas Kemnade <andreas at kemnade.info> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On Mon, 24 Apr 2017 22:36:11 +0200
>>>> Andreas Kemnade <andreas at kemnade.info> wrote:
>>>>
>>>>> If the netdev is accessed before the urbs are initialized,
>>>>> there will be NULL pointer dereferences
>>>>>
>>>>> Signed-off-by: Andreas Kemnade <andreas at kemnade.info>
>>>>
>>>> this should be a replacement for
>>>>
>>>> drivers: net: hso: hack to avoid NULL pointer dereferencing - the reason is unknown. Perhaps some race condition in the USB stack?
>>>>
>>>> I cannot reproduce the problem here even without that hack patch. Was
>>>> there some dhcpd running? How was it started?
>>>
>>> I don't exactly remember. I think when starting the modem and during enumeration of the interfaces.
>>>
>>> If I remember correctly, it started to occur with 4.11-rc1 or -rc2, or maybe even earlier, and was never seen again after applying the hack.
>>>
>>
>> from git show 64c8821c4167d1a7a13caaedb419366296c2de30
>> [ 672.198486] CPU: 0 PID: 2403 Comm: dhcpcd Tainted: G W 4.10.0-letux+ #848
>>
>> So it was 4.10, and there was a dhcpcd running. It was probably
>> very quick at attaching to the device, so it triggered the race
>> condition: it accesses the device when it is not yet ready.
>> But my patch should prevent that.
>>
>> There are no recent changes in hso anyway, so bugs may just randomly
>> get uncovered or hidden again.
>
> I removed my patch and applied yours, and the bug is back,
> at least on GTA04A5.
On GTA04A4 it does not seem to happen.
>
> It occurs right after the hso driver has been loaded for the first time.
>
>
>> root at letux:~# lsusb
>> Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>> root at letux:~# ./wwan-status
>> [ 357.913024] wwan_on_off_rfkill_set_block: blocked: 0
>> [ 357.919616] modem: set_power 1
>> [ 357.930999] state 0
>> [ 357.933502] modem: send impulse
>> [ 358.676696] modem: done
>> [ 360.946136] usb 1-2: new high-speed USB device number 2 using ehci-omap
>> [ 361.151885] usb 1-2: New USB device found, idVendor=0af0, idProduct=8800
>> [ 361.159393] usb 1-2: New USB device strings: Mfr=3, Product=2, SerialNumber=0
>> [ 361.168518] usb 1-2: Product: Globetrotter HSUPA Modem
>> [ 361.173950] usb 1-2: Manufacturer: Option N.V.
>> [ 361.420532] hso: drivers/net/usb/hso.c: Option Wireless
root at letux:~# ./wwan-status
[ 103.105682] wwan_on_off_rfkill_set_block: blocked: 0
[ 103.111480] modem: set_power 1
[ 103.118621] USB phy event 0
[ 103.121643] state 0
[ 103.124114] modem: send impulse
[ 103.865570] USB phy event 0
[ 103.868591] modem: done
[ 106.495788] usb 2-2: new high-speed USB device number 2 using ehci-omap
[ 106.702758] usb 2-2: New USB device found, idVendor=0af0, idProduct=8800
[ 106.713409] usb 2-2: New USB device strings: Mfr=3, Product=2, SerialNumber=0
[ 106.724822] usb 2-2: Product: Globetrotter HSUPA Modem
[ 106.730926] usb 2-2: Manufacturer: Option N.V.
[ 107.132171] hso: drivers/net/usb/hso.c: Option Wireless
[ 107.293060] usbcore: registered new interface driver hso
AT$QCVOLT
3706
OK
AT$QCSIMSTAT?
$QCSIMSTAT: 0,UNKNOWN
OK
^Croot at letux:~#
Works.
>> [ 361.477783] Unable to handle kernel NULL pointer dereference at virtual address 00000030
>> [ 361.503814] pgd = eca7c000
>> [ 361.510253] hso 1-2:1.5: Not our interface
>> [ 361.520660] [00000030] *pgd=aca4a831, *pte=00000000, *ppte=00000000
^^^ these lines are missing on GTA04A4.
>> [ 361.529022] usbcore: registered new interface driver hso
>> [ 361.551696] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
>> [ 361.557464] Modules linked in: hso bnep bluetooth usb_f_ecm g_ether usb_f_rndis u_ether libcomposite configfs ipv6 arc4 wl18xx wlcore mac80211 cfg80211 bq27xxx_battery omapdrm drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm panel_tpo_td028ttec1 snd_soc_simple_card snd_soc_simple_card_utils snd_soc_omap_twl4030 encoder_opa362 wwan_on_off twl4030_madc_hwmon snd_soc_gtm601 connector_analog_tv pwm_omap_dmtimer generic_adc_battery pwm_bl extcon_gpio omap3_isp videobuf2_dma_contig videobuf2_memops wlcore_sdio videobuf2_v4l2 w1_bq27000 videobuf2_core ov9650 omap2430 omap_hdq snd_soc_omap_mcbsp v4l2_common snd_soc_omap bmp280_i2c snd_pcm_dmaengine videodev bmp280 bmg160_i2c bmg160_core bmc150_accel_i2c bmc150_magn_i2c phy_twl4030_usb bmc150_magn bmc150_accel_core
>> [ 361.632415] at24 industrialio_triggered_buffer media nvmem_core tsc2007 leds_tca6507 bno055 kfifo_buf musb_hdrc gpio_twl4030 twl4030_pwrbutton twl4030_vibra snd_soc_twl4030 twl4030_charger twl4030_madc industrialio w2sg0004 ehci_omap omapdss
Well, I spoke too soon... After another reboot:
root at letux:~# ./wwan-status
[ 156.716522] wwan_on_off_rfkill_set_block: blocked: 0
[ 156.722320] modem: set_power 1
[ 156.728851] USB phy event 0
[ 156.731872] state 0
[ 156.734313] modem: send impulse
[ 157.475524] USB phy event 0
[ 157.478576] modem: done
[ 160.665771] usb 1-2: new high-speed USB device number 2 using ehci-omap
[ 160.875244] usb 1-2: New USB device found, idVendor=0af0, idProduct=8800
[ 160.882934] usb 1-2: New USB device strings: Mfr=3, Product=2, SerialNumber=0
[ 160.893341] usb 1-2: Product: Globetrotter HSUPA Modem
[ 160.900909] usb 1-2: Manufacturer: Option N.V.
[ 161.330993] hso: drivers/net/usb/hso.c: Option Wireless
[ 161.431152] Unable to handle kernel NULL pointer dereference at virtual address 00000030
[ 161.459716] pgd = dca78000
[ 161.462768] [00000030] *pgd=9dc8d831, *pte=00000000, *ppte=00000000
[ 161.478607] usbcore: registered new interface driver hso
[ 161.494812] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
[ 161.500579] Modules linked in: hso bnep bluetooth usb_f_ecm g_ether usb_f_rndis u_ether libcomposite configfs ipv6 libertas_sdio libertas cfg80211 bq27xxx_battery omapdrm drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea drm snd_soc_simple_card panel_tpo_td028ttec1 snd_soc_simple_card_utils snd_soc_omap_twl4030 encoder_opa362 snd_soc_gtm601 wwan_on_off twl4030_madc_hwmon pwm_omap_dmtimer connector_analog_tv generic_adc_battery pwm_bl extcon_gpio omap3_isp videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 w1_bq27000 videobuf2_core ov9650 snd_soc_omap_mcbsp omap2430 omap_hdq snd_soc_omap v4l2_common snd_pcm_dmaengine bmp280_i2c bmp280 phy_twl4030_usb videodev hmc5843_i2c itg3200 at24 leds_tca6507 bma180 tsc2007 media musb_hdrc hmc5843_core nvmem_core
[ 161.574890] industrialio_triggered_buffer gpio_twl4030 kfifo_buf twl4030_charger twl4030_madc snd_soc_twl4030 twl4030_vibra twl4030_pwrbutton industrialio w2sg0004 ehci_omap omapdss
[ 161.591979] CPU: 0 PID: 2688 Comm: dhcpcd Not tainted 4.11.0-rc8-letux+ #1012
[ 161.599456] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[ 161.606018] task: db49c280 task.stack: d1022000
[ 161.610809] PC is at hso_start_net_device+0x50/0xc0 [hso]
[ 161.616485] LR is at hso_net_open+0x68/0x84 [hso]
[ 161.621398] pc : [<bf22a304>] lr : [<bf22aed8>] psr: a0030013
[ 161.621398] sp : d1023e20 ip : 00000000 fp : ffffffff
[ 161.633422] r10: 00000000 r9 : ddc83c0c r8 : db53962c
[ 161.638885] r7 : bf22aef4 r6 : db539600 r5 : 00000000 r4 : dcc39840
[ 161.645751] r3 : 00000000 r2 : c0030280 r1 : 00000000 r0 : db3b4c00
[ 161.652587] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 161.660064] Control: 10c5387d Table: 9ca78019 DAC: 00000051
[ 161.666107] Process dhcpcd (pid: 2688, stack limit = 0xd1022218)
[ 161.672393] Stack: (0xd1023e20 to 0xd1024000)
[ 161.676971] 3e20: dcc39840 db539698 00000001 00000000 db539000 db539600 db539660 00000000
[ 161.685546] 3e40: ddc83c0c bf22aed8 bf22ae70 db539000 00000001 bf22e234 db539030 c068e530
[ 161.694122] 3e60: c068e490 db539000 00000001 00001091 00001090 c068e77c db539000 00001090
[ 161.702667] 3e80: db539140 db539000 00000000 c068e838 00000000 00000001 ddc83c00 db539000
[ 161.711242] 3ea0: 00000000 c06ff760 c0ca39c2 beef782c 00000020 00000000 306f7368 00000000
[ 161.719818] 3ec0: 00000000 00000000 00001091 00000000 00000000 00000000 00000000 00008914
[ 161.728393] 3ee0: dd67d020 beef782c c0c9bfc0 dd67d000 00000005 00000000 0003d160 c066b50c
[ 161.736968] 3f00: beef782c dd67d020 db569700 c02839a0 00000005 c0282e80 0000c000 c0283844
[ 161.745544] 3f20: c09b16c9 c098c994 00000001 c0c5e513 c0c5e513 00000000 c0151444 c01a20f4
[ 161.754119] 3f40: c0c5e513 c01a3168 d1022000 c01a3194 db49c830 60030013 00000000 db49c280
[ 161.762664] 3f60: db49c774 00000000 db569700 db569700 beef782c 00008914 00000005 00000000
[ 161.771240] 3f80: 0003d160 c02839a0 00001091 0003a1f0 000533a8 0003a248 00000036 c01071e4
[ 161.779815] 3fa0: d1022000 c0107040 0003a1f0 000533a8 00000005 00008914 beef782c 00001091
[ 161.788391] 3fc0: 0003a1f0 000533a8 0003a248 00000036 0003ac0c 000533a8 000533b0 0003d160
[ 161.796966] 3fe0: 0003a0ac beef7824 000167eb b6f57106 40030030 00000005 04000046 800000c0
[ 161.805572] [<bf22a304>] (hso_start_net_device [hso]) from [<bf22aed8>] (hso_net_open+0x68/0x84 [hso])
[ 161.815338] [<bf22aed8>] (hso_net_open [hso]) from [<c068e530>] (__dev_open+0xa0/0xf4)
[ 161.823638] [<c068e530>] (__dev_open) from [<c068e77c>] (__dev_change_flags+0x8c/0x130)
[ 161.832031] [<c068e77c>] (__dev_change_flags) from [<c068e838>] (dev_change_flags+0x18/0x48)
[ 161.840881] [<c068e838>] (dev_change_flags) from [<c06ff760>] (devinet_ioctl+0x348/0x714)
[ 161.849487] [<c06ff760>] (devinet_ioctl) from [<c066b50c>] (sock_ioctl+0x2b0/0x308)
[ 161.857513] [<c066b50c>] (sock_ioctl) from [<c0282e80>] (vfs_ioctl+0x20/0x34)
[ 161.865020] [<c0282e80>] (vfs_ioctl) from [<c0283844>] (do_vfs_ioctl+0x82c/0x93c)
[ 161.872833] [<c0283844>] (do_vfs_ioctl) from [<c02839a0>] (SyS_ioctl+0x4c/0x74)
[ 161.880523] [<c02839a0>] (SyS_ioctl) from [<c0107040>] (ret_fast_syscall+0x0/0x1c)
[ 161.888458] Code: e3822103 e3822080 e1822781 e5981014 (e5832030)
[ 162.010528] ---[ end trace 40c25a7f10cd84e6 ]---
AT$QCVOLT
3673
OK
^Croot at letux:~#
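As a cross-check of the oops above, the faulting word in the "Code:" line (the one in parentheses, e5832030) can be decoded by hand. The following is only a minimal sketch in Python, not a disassembler: it handles nothing but the A32 LDR/STR immediate-offset encoding, and the function name decode_ldr_str_imm is made up for illustration. The value r3 = 00000000 is taken from the register dump.

```python
def decode_ldr_str_imm(word):
    """Decode an A32 LDR/STR with a 12-bit immediate offset (offset addressing only)."""
    cond = (word >> 28) & 0xF
    # Bits 27..25 must be 0b010 for the immediate-offset load/store form.
    if cond == 0xF or (word >> 25) & 0b111 != 0b010:
        raise ValueError("not an immediate-offset LDR/STR encoding")
    pre_indexed = (word >> 24) & 1  # P bit
    add_offset = (word >> 23) & 1   # U bit
    byte_access = (word >> 22) & 1  # B bit
    writeback = (word >> 21) & 1    # W bit
    is_load = (word >> 20) & 1      # L bit
    if not pre_indexed or writeback:
        raise ValueError("only plain offset addressing is handled here")
    rn = (word >> 16) & 0xF         # base register
    rd = (word >> 12) & 0xF         # transferred register
    imm12 = word & 0xFFF
    offset = imm12 if add_offset else -imm12
    mnemonic = ("ldr" if is_load else "str") + ("b" if byte_access else "")
    return mnemonic, rd, rn, offset


# Faulting instruction word from the "Code:" line of the oops.
mnemonic, rd, rn, offset = decode_ldr_str_imm(0xE5832030)
print(f"{mnemonic} r{rd}, [r{rn}, #{offset:#x}]")  # str r2, [r3, #0x30]

# r3 is 00000000 in the register dump, so the store targets offset 0x30
# from a NULL base pointer.
r3 = 0x00000000
fault_address = (r3 + offset) & 0xFFFFFFFF
print(f"fault address: {fault_address:#010x}")     # fault address: 0x00000030
```

If this reading is right, the instruction is a store through a NULL base pointer plus a field offset of 0x30, which matches the reported fault address 00000030.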
So my conclusion is:
* dhcpcd notices quite quickly that a new interface is being enabled
* and tries to do some ioctl, which makes the kernel oops
* and crashes dhcpcd
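The ioctl in question can be narrowed down from the stack dump, which contains the words 00008914, 00001091/00001090 and 306f7368. Reading them as ioctl request, interface flags and interface name is an interpretation, not something the log states, but it fits the devinet_ioctl -> dev_change_flags -> __dev_open call chain. A small Python sketch of that decoding follows; the constants are the values from the Linux UAPI headers, the helper names are made up.

```python
# Values as defined in the Linux UAPI headers (linux/sockios.h, linux/if.h).
SIOCSIFFLAGS = 0x8914
IFF_FLAGS = {
    0x0001: "IFF_UP",
    0x0002: "IFF_BROADCAST",
    0x0004: "IFF_DEBUG",
    0x0008: "IFF_LOOPBACK",
    0x0010: "IFF_POINTOPOINT",
    0x0020: "IFF_NOTRAILERS",
    0x0040: "IFF_RUNNING",
    0x0080: "IFF_NOARP",
    0x0100: "IFF_PROMISC",
    0x0200: "IFF_ALLMULTI",
    0x0400: "IFF_MASTER",
    0x0800: "IFF_SLAVE",
    0x1000: "IFF_MULTICAST",
}


def flag_names(value):
    """Return the names of all IFF_* bits set in value, lowest bit first."""
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


def word_to_ascii(word):
    """Interpret a 32-bit little-endian stack word as four ASCII characters."""
    return word.to_bytes(4, "little").decode("ascii")


print(0x8914 == SIOCSIFFLAGS)     # True
print(flag_names(0x1091))         # ['IFF_UP', 'IFF_POINTOPOINT', 'IFF_NOARP', 'IFF_MULTICAST']
print(flag_names(0x1090))         # ['IFF_POINTOPOINT', 'IFF_NOARP', 'IFF_MULTICAST']
print(word_to_ascii(0x306F7368))  # hso0
```

Under these assumptions the request looks like SIOCSIFFLAGS on hso0, with IFF_UP as the only bit that differs between the two flag words, i.e. an attempt to bring the interface up.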
The fix we are talking about causes this to be ignored.
It might simply be that dhcpcd responds with an ioctl() faster than it did with older kernels.
So let's keep the patch for the moment.
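For reference, the ordering idea from the patch subject ("register netdev later") and its description ("if the netdev is accessed before the urbs are initialized") can be modelled in a few lines of Python. This is a toy model, not driver code: ToyNetdev, eager_daemon and both probe functions are hypothetical names, and the model says nothing about why the oops still showed up on GTA04A5 with that patch applied.

```python
class ToyNetdev:
    """Toy stand-in for a network device whose URBs are set up during probe."""

    def __init__(self):
        self.urb = None  # models the not yet initialized URBs

    def open(self):
        # Models the open path: using a missing URB stands for the NULL dereference.
        if self.urb is None:
            raise RuntimeError("NULL pointer dereference (modelled)")
        return "open ok"


def eager_daemon(dev):
    """Hypothetical daemon that opens an interface the moment it becomes visible."""
    try:
        return dev.open()
    except RuntimeError as exc:
        return f"oops: {exc}"


def probe_register_first(on_register):
    """Device is made visible before its URBs exist."""
    dev = ToyNetdev()
    result = on_register(dev)  # registration: userspace may react right here
    dev.urb = object()         # initialization arrives too late
    return result


def probe_init_first(on_register):
    """Device is made visible only after its URBs exist."""
    dev = ToyNetdev()
    dev.urb = object()
    return on_register(dev)


print(probe_register_first(eager_daemon))  # oops: NULL pointer dereference (modelled)
print(probe_init_first(eager_daemon))      # open ok
```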
BR,
Nikolaus