[Letux-kernel] OMAP5: Debugging kernel not starting...

H. Nikolaus Schaller hns at goldelico.com
Tue Sep 11 15:07:49 CEST 2018


Hi Tero,

> Am 11.09.2018 um 08:02 schrieb Tero Kristo <t-kristo at ti.com>:
> 
> On 10/09/18 23:50, Tony Lindgren wrote:
>> Hi,
>> * H. Nikolaus Schaller <hns at goldelico.com> [180910 20:31]:
>>> Now, I get something for further analysis, which indicates how the kernel
>>> is stumbling...
>> Oh that same old omap5 reboot clock issue. Adding Tero and Peter to Cc.
>> Can you please also set this up as a bug at bugzilla.kernel.org? We've
>> already seen this before but I don't think there was any solution.
> 
> You should try to reproduce the issue on omap5uevm. Nothing much I can do if this issue is only present on Pyra HW.

Strangely it is not reproducible on OMAP5EVM. With the same kernel
uImage, defconfig and user-space.

Differences are of course DTBs and some u-boot specialities, but
both are derived from standard omap5 files from mainline.

I have digged through the old discussion on e2e and the gitorius code
gave some hints about which registers are interesting and should be
checked.

I did hack some code into U-Boot to print their values in misc_init_r()
after "power-on" and "reboot". Ans wrote a shell script wrapper for devmem2
to read the same registers after successful boot.

Here are the values of interesting registers on the Pyra:

> Pyra before boot (printed from U-Boot)
> 
> 4a0041e4 00000002: CM_IDLEST_DPLL_ABE
> 4a0041e8 00000000: CM_AUTOIDLE_DPLL_ABE
> 4a0041f0 00000001: CM_DIV_M2_DPLL_ABE
> 4a0041f4 00000001: CM_DIV_M3_DPLL_ABE
> 4a004500 00000003: CM_ABE_CLKSTCTRL
> 4ae06108 00000000: CM_CLKSEL_WKUPAON
> 4ae0610c 00000001: CM_CLKSEL_ABE_PLL_REF
> 
> Pyra after boot in Linux
> 
> Value at address 0x4A0041E4 (0xb6fe71e4): 0x12
> Value at address 0x4A0041E8 (0xb6f681e8): 0x1
> Value at address 0x4A0041F0 (0xb6fc01f0): 0x1
> Value at address 0x4A0041F4 (0xb6f901f4): 0x2
> Value at address 0x4A004500 (0xb6f6f500): 0x803
> Value at address 0x4AE06108 (0xb6fa1108): 0x0
> Value at address 0x4AE0610C (0xb6fa910c): 0x1
> 

> Pyra in U-Boot after "reboot" (printed from U-Boot)
> 
> 4a0041e4 00000012: CM_IDLEST_DPLL_ABE
> 4a0041e8 00000000: CM_AUTOIDLE_DPLL_ABE
> 4a0041f0 00000001: CM_DIV_M2_DPLL_ABE
> 4a0041f4 00000001: CM_DIV_M3_DPLL_ABE
> 4a004500 00000003: CM_ABE_CLKSTCTRL
> 4ae06108 00000000: CM_CLKSEL_WKUPAON
> 4ae0610c 00000001: CM_CLKSEL_ABE_PLL_REF
> 
> Pyra kernel does not boot (clock: dpll_abe_ck failed transition to 'locked')

and the same on OMAP5EVM:

> EVM before boot (printed from U-Boot)
> 
> 4a0041e4 00000002: CM_IDLEST_DPLL_ABE
> 4a0041e8 00000000: CM_AUTOIDLE_DPLL_ABE
> 4a0041f0 00000001: CM_DIV_M2_DPLL_ABE
> 4a0041f4 00000001: CM_DIV_M3_DPLL_ABE
> 4a004500 00000003: CM_ABE_CLKSTCTRL
> 4ae06108 00000000: CM_CLKSEL_WKUPAON
> 4ae0610c 00000001: CM_CLKSEL_ABE_PLL_REF
> 
> EVM after boot in Linux
> 
> Value at address 0x4A0041E4 (0xb6fae1e4): 0x12
> Value at address 0x4A0041E8 (0xb6fb91e8): 0x1
> Value at address 0x4A0041F0 (0xb6f251f0): 0x1
> Value at address 0x4A0041F4 (0xb6f961f4): 0x2
> Value at address 0x4A004500 (0xb6fcf500): 0x3
> Value at address 0x4AE06108 (0xb6f9b108): 0x0
> Value at address 0x4AE0610C (0xb6f5310c): 0x1
> 
> EVM in U-Boot after "reboot"
> 
> 4a0041e4 00000002: CM_IDLEST_DPLL_ABE
> 4a0041e8 00000000: CM_AUTOIDLE_DPLL_ABE
> 4a0041f0 00000001: CM_DIV_M2_DPLL_ABE
> 4a0041f4 00000001: CM_DIV_M3_DPLL_ABE
> 4a004500 00000003: CM_ABE_CLKSTCTRL
> 4ae06108 00000000: CM_CLKSEL_WKUPAON
> 4ae0610c 00000001: CM_CLKSEL_ABE_PLL_REF
> 
> EVM in Kernel after successful reboot
> 
> Value at address 0x4A0041E4 (0xb6fe81e4): 0x12
> Value at address 0x4A0041E8 (0xb6f7a1e8): 0x1
> Value at address 0x4A0041F0 (0xb6f4c1f0): 0x1
> Value at address 0x4A0041F4 (0xb6f2c1f4): 0x2
> Value at address 0x4A004500 (0xb6f1f500): 0x3
> Value at address 0x4AE06108 (0xb6f63108): 0x0
> Value at address 0x4AE0610C (0xb6fca10c): 0x1

So there is no diff in U-Boot after power-on between
EVM and Pyra and before the kernel starts.

After kernel has booted, there is a single difference
in CM_ABE_CLKSTCTRL.

The relevant bit is: CLKACTIVITY_DPLL_ABE_X2_ CLK

After reboot, there are bigger differences not wiped
out by u-boot.

So the question is what disturbs this clock,
especially as CLKACTIVITY_DPLL_ABE_X2_ seems
to say that the clock is active (on Pyra) but
not on EVM (I had expected the reverse to be more
likely).

And why things are not cleaned up before reboot.

I have not yet analysed in detail and carefully
all differences between u-boot code and device tree
source, but I currently only aware that there
are some additional audio subsystems enabled
on the Pyra, e.g. vibramotor, handsfree speakers.
Maybe one of these keeps the clock active?

So I'll continue there and report if I find something
which I can disable...

BR,
Nikolaus


> 
> -Tero
> 
>> Regards,
>> Tony
>>> root at letux:~# dmesg|fgrep early
>>> [    0.000000] Malformed early option 'console'
>>> [    0.000000] earlycon: omapserial0 at MMIO 0x48020000 (options '')
>>> 
>>> ^^^ ok, earlycon is working :)
>>> 
>>> [    0.000000] Kernel command line: console= root=PARTUUID=7d45bc20-02 rw rootfstype=ext4 rootwait console=ttyO2,115200n8 vram=12M omapfb.vram=0:8M,1:4M omapfb.rotate_type=0 omapdss.def_disp=lcd rootwait twl4030_charger.allow_usb=1 log_buf_len=8M ignore_loglevel earlyprintk earlycon
>>> [    0.000000] early log buf free: 63696(97%)
>>> root at letux:~# reboot
>>> 
>>> Broadcast message from root at letux (console) (Mon Sep 10 20:05:16 2018):
>>> 
>>> The system is going down for reboot NOW!
>>> 
>>> ....
>>> 
>>> [  164.453708] reboot: Restarting system
>>> 
>>> U-Boot SPL 2016.11-00292-g12720df (Aug 12 2018 - 18:32:51)
>>> OMAP5432-GP ES2.0
>>> 
>>> ....
>>> 
>>> ## Booting kernel from Legacy Image at 82000000 ...
>>>    Image Name:   Linux-4.18.0-letux+
>>>    Image Type:   ARM Linux Kernel Image (uncompressed)
>>>    Data Size:    4901632 Bytes = 4.7 MiB
>>>    Load Address: 80008000
>>>    Entry Point:  80008000
>>>    Verifying Checksum ... OK
>>> ## Flattened Device Tree blob at 88000000
>>>    Booting using the fdt blob at 0x88000000
>>>    Loading Kernel Image ... OK
>>>    Loading Device Tree to 8ffed000, end 8fffff57 ... OK
>>> 
>>> Starting kernel ...
>>> 
>>> [    0.000000] Booting Linux on physical CPU 0x0
>>> [    0.000000] Linux version 4.18.0-letux+ (hns at iMac.fritz.box) (gcc version 4.9.2 (GCC)) #2664 SMP PREEMPT Mon Sep 10 20:46:49 CEST 2018
>>> [    0.000000] CPU: ARMv7 Processor [412fc0f2] revision 2 (ARMv7), cr=10c5387d
>>> [    0.000000] CPU: div instructions available: patching division code
>>> [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, PIPT instruction cache
>>> [    0.000000] OF: fdt: Machine model: Pyra-Handheld-V5.1.2ff
>>> [    0.000000] Malformed early option 'console'
>>> [    0.000000] debug: ignoring loglevel setting.
>>> [    0.000000] earlycon: omapserial0 at MMIO 0x48020000 (options '')
>>> [    0.000000] bootconsole [omapserial0] enabled
>>> [    0.000000] Memory policy: Data cache writealloc
>>> [    0.000000] cma: Reserved 16 MiB at 0xfd800000
>>> [    0.000000] OMAP4: Map 0xfee00000 to (ptrval) for dram barrier
>>> [    0.000000] On node 0 totalpages: 519680
>>> [    0.000000]   Normal zone: 1536 pages used for memmap
>>> [    0.000000]   Normal zone: 0 pages reserved
>>> [    0.000000]   Normal zone: 196608 pages, LIFO batch:31
>>> [    0.000000]   HighMem zone: 323072 pages, LIFO batch:31
>>> [    0.000000] OMAP5432 ES2.0
>>> [    0.000000] random: get_random_bytes called from start_kernel+0x80/0x470 with crng_init=0
>>> [    0.000000] percpu: Embedded 17 pages/cpu @(ptrval) s40320 r8192 d21120 u69632
>>> [    0.000000] pcpu-alloc: s40320 r8192 d21120 u69632 alloc=17*4096
>>> [    0.000000] pcpu-alloc: [0] 0 [0] 1
>>> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 518144
>>> [    0.000000] Kernel command line: console= root=PARTUUID=7d45bc20-02 rw rootfstype=ext4 rootwait console=ttyO2,115200n8 vram=12M omapfb.vram=0:8M,1:4M omapfb.rotate_type=0 omapdss.def_disp=lcd rootwait twl4030_charger.allow_usb=1 log_buf_len=8M ignore_loglevel earlyprintk earlycon
>>> [    0.000000] log_buf_len: 8388608 bytes
>>> [    0.000000] early log buf free: 63696(97%)
>>> [    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
>>> [    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
>>> [    0.000000] Memory: 2025568K/2078720K available (6823K kernel code, 653K rwdata, 1824K rodata, 1024K init, 221K bss, 36768K reserved, 16384K cma-reserved, 1275904K highmem)
>>> [    0.000000] Virtual kernel memory layout:
>>> [    0.000000]     vector  : 0xffff0000 - 0xffff1000   (   4 kB)
>>> [    0.000000]     fixmap  : 0xffc00000 - 0xfff00000   (3072 kB)
>>> [    0.000000]     vmalloc : 0xf0800000 - 0xff800000   ( 240 MB)
>>> [    0.000000]     lowmem  : 0xc0000000 - 0xf0000000   ( 768 MB)
>>> [    0.000000]     pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
>>> [    0.000000]     modules : 0xbf000000 - 0xbfe00000   (  14 MB)
>>> [    0.000000]       .text : 0x(ptrval) - 0x(ptrval)   (7816 kB)
>>> [    0.000000]       .init : 0x(ptrval) - 0x(ptrval)   (1024 kB)
>>> [    0.000000]       .data : 0x(ptrval) - 0x(ptrval)   ( 654 kB)
>>> [    0.000000]        .bss : 0x(ptrval) - 0x(ptrval)   ( 222 kB)
>>> [    0.000000] Preemptible hierarchical RCU implementation.
>>> [    0.000000]  RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=2.
>>> [    0.000000]  Tasks RCU enabled.
>>> [    0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=2
>>> [    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
>>> [    0.000000] GIC: Using split EOI/Deactivate mode
>>> [    0.000000] clock: dpll_abe_ck failed transition to 'locked'
>>> [    0.000000] OMAP clockevent source: timer1 at 32768 Hz
>>> [    0.000000] arch_timer: cp15 timer(s) running at 6.14MHz (phys).
>>> [    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x16ac02862, max_idle_ns: 440795202218 ns
>>> [    0.000005] sched_clock: 56 bits at 6MHz, resolution 162ns, wraps every 4398046511085ns
>>> [    0.008218] Switching to timer-based delay loop, resolution 162ns
>>> [    0.015171] clocksource: 32k_counter: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 58327039986419 ns
>>> [    0.025273] OMAP clocksource: 32k_counter at 32768 Hz
>>> [    0.031722] Console: colour dummy device 80x30
>>> [    0.036209] Calibrating delay loop (skipped), value calculated using timer frequency.. 12.28 BogoMIPS (lpj=61440)
>>> [    0.046753] pid_max: default: 32768 minimum: 301
>>> [    0.051557] Security Framework initialized
>>> [    0.055723] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes)
>>> [    0.062482] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes)
>>> [    0.070125] CPU: Testing write buffer coherency: ok
>>> [    0.075087] CPU0: Spectre v2: firmware did not set auxiliary control register IBE bit, system vulnerable
>>> [    0.084971] /cpus/cpu at 0 missing clock-frequency property
>>> [    0.090363] /cpus/cpu at 1 missing clock-frequency property
>>> [    0.095794] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
>>> [    0.101610] smp: CPU1 parked within kernel, needs reset (0x0 0x801248a0)
>>> [    0.161732] Setting up static identity map for 0x80100000 - 0x80100060
>>> [    0.168443] Hierarchical SRCU implementation.
>>> [    0.212907] smp: Bringing up secondary CPUs ...
>>> [    0.273213] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
>>> [    0.273219] CPU1: Spectre v2: using ICIALLU workaround
>>> [    0.284313] smp: Brought up 1 node, 2 CPUs
>>> [    0.288434] SMP: Total of 2 processors activated (24.57 BogoMIPS).
>>> [    0.294807] CPU: All CPU(s) started in HYP mode.
>>> [    0.299509] CPU: Virtualization extensions available.
>>> [    0.305759] devtmpfs: initialized
>>> [    0.319718] VFP support v0.3: implementor 41 architecture 4 part 30 variant f rev 0
>>> [    0.327882] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
>>> [    0.337934] futex hash table entries: 512 (order: 3, 32768 bytes)
>>> [    0.344594] pinctrl core: initialized pinctrl subsystem
>>> [    0.350243] really_probe: driver reg-dummy for reg-dummy bus platform
>>> [    0.357023] really_probe: driver reg-dummy for reg-dummy bus platform ok (1)
>>> [    0.364644] NET: Registered protocol family 16
>>> [    0.370796] DMA: preallocated 256 KiB pool for atomic coherent allocations
>>> [    0.378566] omap_hwmod: l3_main_3 using broken dt data from ocp
>>> [    0.385175] omap_hwmod: l3_main_2 using broken dt data from ocp
>>> [    1.723766] clock: dpll_abe_ck failed transition to 'locked'
>>> 
>>> ^^^ this seems to be different
>>> 
>>> [    3.028640] clock: dpll_abe_ck failed transition to 'locked'
>>> [    4.333541] clock: dpll_abe_ck failed transition to 'locked'
>>> [    5.638412] clock: dpll_abe_ck failed transition to 'locked'
>>> [    6.939455] clock: dpll_abe_ck failed transition to 'locked'
>>> [    6.947796] ------------[ cut here ]------------
>>> [    6.952460] WARNING: CPU: 0 PID: 1 at drivers/clk/clk.c:814 clk_core_disable_lock+0x18/0x24
>>> [    6.961024] aess_fclk already disabled
>>> [    6.964864] Modules linked in:
>>> [    6.967995] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.18.0-letux+ #2664
>>> [    6.974954] Hardware name: Generic OMAP5 (Flattened Device Tree)
>>> [    6.981136] [<c01112c8>] (unwind_backtrace) from [<c010c87c>] (show_stack+0x10/0x14)
>>> [    6.989075] [<c010c87c>] (show_stack) from [<c078f894>] (dump_stack+0x7c/0x9c)
>>> [    6.996489] [<c078f894>] (dump_stack) from [<c0132848>] (__warn+0xd8/0x108)
>>> [    7.003632] [<c0132848>] (__warn) from [<c01328bc>] (warn_slowpath_fmt+0x44/0x6c)
>>> [    7.011313] [<c01328bc>] (warn_slowpath_fmt) from [<c049d44c>] (clk_core_disable_lock+0x18/0x24)
>>> [    7.020335] [<c049d44c>] (clk_core_disable_lock) from [<c01200b8>] (_disable_clocks+0x18/0x70)
>>> [    7.029177] [<c01200b8>] (_disable_clocks) from [<c012116c>] (_enable+0x1ec/0x230)
>>> [    7.036947] [<c012116c>] (_enable) from [<c012259c>] (_setup.part.16+0xc8/0x404)
>>> [    7.044537] [<c012259c>] (_setup.part.16) from [<c01215c8>] (_setup+0x14/0x1c)
>>> [    7.051949] [<c01215c8>] (_setup) from [<c01216b4>] (omap_hwmod_for_each+0x3c/0x64)
>>> [    7.059810] [<c01216b4>] (omap_hwmod_for_each) from [<c0a0db90>] (__omap_hwmod_setup_all+0xa4/0xc0)
>>> [    7.069099] [<c0a0db90>] (__omap_hwmod_setup_all) from [<c0102eb4>] (do_one_initcall+0xc0/0x268)
>>> [    7.078121] [<c0102eb4>] (do_one_initcall) from [<c0a01050>] (kernel_init_freeable+0x220/0x36c)
>>> [    7.087050] [<c0a01050>] (kernel_init_freeable) from [<c07a25dc>] (kernel_init+0x8/0x110)
>>> [    7.095443] [<c07a25dc>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
>>> [    7.103207] Exception stack(0xed8adfb0 to 0xed8adff8)
>>> [    7.108390] dfa0:                                     00000000 00000000 00000000 00000000
>>> [    7.116786] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>>> [    7.125180] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
>>> [    7.131971] ---[ end trace 02345015e7dc8e1f ]---
>>> [    7.136720] omap_hwmod: aess: _wait_target_ready failed: -16
>>> [    7.142511] omap_hwmod: aess: cannot be enabled for reset (3)
>>> [    8.447580] clock: dpll_abe_ck failed transition to 'locked'
>>> [    9.752444] clock: dpll_abe_ck failed transition to 'locked'
>>> 
>>> --- and then everything hangs
>>> 
>>> If I enforce a warm RESET, the Pyra boots again up to the same point of failure.
>>> I have to power-off and on again through Palmas to get it booting again.
>>> 
>>> Searching for "clock: dpll_abe_ck failed transition to 'locked'" leads to:
>>> 
>>> 	https://e2e.ti.com/support/omap/f/885/t/328485?dpll-abe-ck-failed-transition-to-locked-
>>> 
>>> And:
>>> 
>>> 	https://www.pyra-handheld.com/pipermail/kernel/2016-May/000877.html
>>> 
>>> So we had discussed this already a while ago but did not find a solution...
>>> 
>>> Maybe I should try to disable ABE/AESS for the next experiment (not today).
>>> An try the same on OMAP5EVM (to find out if there are hardware dependencies involved).
>>> 
>>> BR and thanks,
>>> Nikolaus
>>> 
> 
> --
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki



More information about the Letux-kernel mailing list