[Letux-kernel] 1.5GHz problems

Tue Aug 2 08:45:55 CEST 2016

Hi Matthijs,

> On 02.08.2016 at 02:55, Matthijs van Duin <matthijsvanduin at gmail.com> wrote:
> 
> I'm missing the initial part of the thread and haven't read it all in
> detail yet, but I can offer my first thoughts:
> 
> 
> I recently posted that I discovered that the Cortex-A15 subsystem's
> async bridges to the L3 and ABE interconnects are being overclocked
> (according to the datasheet) when the cpu is run at 1.5 GHz.  The OS
> is supposed to reduce their clock dividers prior to switching to 1.5
> GHz clock, but this did not seem to be happening.
> 
> If you check CM_MPU_MPU_CLKCTRL (0x4a004320), bits 25 and 26 are zero
> by default, which means L3bridge = cpu/4 and ABEbridge = cpu/8. They
> should be set to 1 before going to 1.5 GHz according to the datasheet,
> which makes the dividers /8 and /16 respectively. Unfortunately though
> this means that at OPP_HIGH they are clocked 25% slower than at
> OPP_NOM, and (based on testing) the bridge clock speed actually seems
> to be the bottleneck for L3 access. Yuck.

So they should be modified dynamically when switching CPU frequencies?
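
To make the numbers above concrete, here is a minimal sketch of the divider
arithmetic and of the ordering a dynamic switch would need. The register
address, bit numbers and divider values are taken from the description quoted
above. The OPP_NOM frequency of 1.0 GHz is an assumption (it is not stated in
this thread, but it is what the 25% and 50% figures imply), the function names
are hypothetical, and the ordering on the way back down is my assumption too.

```python
CM_MPU_MPU_CLKCTRL = 0x4A004320            # register address quoted above
BRIDGE_DIV_MASK = (1 << 25) | (1 << 26)    # the two divider bits quoted above

OPP_NOM_MHZ = 1000.0    # assumption: not stated in this thread
OPP_HIGH_MHZ = 1500.0   # the 1.5 GHz operating point


def set_bridge_bits(reg_value, slow):
    """Return reg_value with both bridge divider bits set (slow) or cleared."""
    if slow:
        return (reg_value | BRIDGE_DIV_MASK) & 0xFFFFFFFF
    return reg_value & ~BRIDGE_DIV_MASK & 0xFFFFFFFF


def bridge_clocks_mhz(cpu_mhz, reg_value):
    """Return (L3 bridge, ABE bridge) clocks in MHz.

    Only the two cases described above are modelled: both bits clear
    (cpu/4 and cpu/8) or both bits set (cpu/8 and cpu/16).
    """
    slow = (reg_value & BRIDGE_DIV_MASK) == BRIDGE_DIV_MASK
    l3_div, abe_div = (8, 16) if slow else (4, 8)
    return cpu_mhz / l3_div, cpu_mhz / abe_div


def transition_steps(old_mhz, new_mhz):
    """Return the ordered steps for a CPU frequency change.

    Going up past OPP_NOM, the bridges are slowed down first, as the quoted
    text says the OS is supposed to do. Going back down, clearing the bits
    only after the CPU clock is lowered is an assumption, chosen so that the
    bridges are never clocked above their OPP_NOM rate.
    """
    if old_mhz <= OPP_NOM_MHZ < new_mhz:
        return ["set bridge divider bits", "raise CPU clock"]
    if new_mhz <= OPP_NOM_MHZ < old_mhz:
        return ["lower CPU clock", "clear bridge divider bits"]
    return ["change CPU clock"]
```

With these assumptions the L3 bridge runs at 250 MHz at OPP_NOM, at 375 MHz
(50% over) at 1.5 GHz with the bits left at zero, and at 187.5 MHz (25% under)
at 1.5 GHz with the bits set.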

> 
> No idea why you'd have problems with it though when the uEVM seems to
> be fine with the L3 bridge apparently being overclocked by 50% (!!!).
> 
> This theory does have the benefit of being dependent specifically on
> cpu clock frequency and not on how heavily the cpu is actually being
> exercised. For everything else I can think of the only explanation
> would be "it just happens to be triggered by the way you exercise the
> hardware" which seems very weak.  E.g. the cortex-A15 does have a few
> "nice" errata resulting in core deadlock. Of course it doesn't hurt to
> try enabling CONFIG_OMAP5_ERRATA_801819 if it isn't already (see
> https://patchwork.kernel.org/patch/6960921/ ).

I had tried both (enabling the erratum workaround and the
CM_MPU_MPU_CLKCTRL change) but did not see an obvious difference.

Which does not prove that we don't need them anyway...
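
For anyone who wants to check whether the bits are actually set on a running
system, one possible way is to read the register from userspace through
/dev/mem. This is only a sketch, not necessarily how the change was tested
here: it assumes root access and a kernel that permits mapping this region
through /dev/mem, and read_reg32 is a hypothetical helper name.

```python
import mmap
import os
import struct

CM_MPU_MPU_CLKCTRL = 0x4A004320   # register address quoted above


def read_reg32(phys_addr, path="/dev/mem"):
    """Read one little-endian 32-bit word at phys_addr via an mmap of path."""
    gran = mmap.ALLOCATIONGRANULARITY
    base = phys_addr & ~(gran - 1)     # mmap offsets must be aligned
    offset = phys_addr - base
    fd = os.open(path, os.O_RDONLY | os.O_SYNC)
    try:
        # Map a single aligned window read-only and unpack the word from it.
        with mmap.mmap(fd, gran, access=mmap.ACCESS_READ, offset=base) as mem:
            return struct.unpack_from("<I", mem, offset)[0]
    finally:
        os.close(fd)
```

Calling read_reg32(CM_MPU_MPU_CLKCTRL) on the board and masking the result
with (1 << 25) | (1 << 26) would then show the state of the two divider bits.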

> 
> 
> Have you been able to check using JTAG in what sort of state the
> core(s) or SoC in general are?  Even if the ARM subsystem is totally
> locked up, if you can connect to DAP then you can inspect things like
> PRCM registers and use I2C to check the configuration of the PMIC. (if
> you need it, I have some javascript helper code for CCS debugserver to
> perform I2C requests)

Yes, that would be a really good tool for finding out what is going on, and
for making some post-mortem register dumps.

Unfortunately I have no JTAG equipment, and we have no access to the
JTAG interface on the Pyra CPU board, since neither is needed in normal
situations. For a big pile of money we could have built some Pyra CPU
boards where these signals are made available on external test points.

BR,
Nikolaus