[Letux-kernel] 1.5GHz problems

H. Nikolaus Schaller hns at goldelico.com
Sat Jul 30 21:54:13 CEST 2016


Hi,

> On 30.07.2016 at 21:14, Michael Mrozek <EvilDragon at openpandora.org> wrote:
> 
> On Sat, 30 Jul 2016 20:02:14 +0200, "H. Nikolaus Schaller"
> <hns at goldelico.com> wrote:
> 
> Hi,
> 
>> The CPU board I am running these tests on has additional wires which
>> allow measuring the voltages VDD_MPU, VDD_CORE, VDD_DDR3, VDD_MM,
>> 1V8, VSYS.
> 
> This might be a good idea. Maybe the combination of Palmas and the
> charger / battery chip causes an issue here and the voltage isn't
> increased for some reason.

It is very unlikely that the charger / battery has such an influence. How
would it know that I am increasing the CPU frequency?

There are other factors (e.g. speakers on/off, display on/off) which also have
a strong influence on VSYS and on the currents flowing through connectors, but
turning them on or off makes no such difference as changing cpufreq does.

The Palmas has several programmable voltage regulators between VSYS and
all the voltages listed above, so they should block changes in VSYS.

If it turns out that the voltages are not increased, that would indicate a
software bug.
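
To separate "software never requested a higher voltage" from "the regulator
did not deliver it", the values the kernel reports can be compared with what
is measured on the test wires. A minimal sketch in Python, assuming the Palmas
rails appear under /sys/class/regulator with the standard `name` and
`microvolts` attributes; the function names are made up for illustration and
no Pyra-specific rail names are assumed:

```python
from pathlib import Path


def read_regulator_voltages(base="/sys/class/regulator"):
    """Return {regulator name: microvolts} as currently reported by the kernel."""
    voltages = {}
    for entry in sorted(Path(base).glob("regulator.*")):
        name_file = entry / "name"
        uv_file = entry / "microvolts"
        # Not every regulator exposes a voltage, so skip incomplete entries.
        if not (name_file.is_file() and uv_file.is_file()):
            continue
        try:
            voltages[name_file.read_text().strip()] = int(uv_file.read_text())
        except (OSError, ValueError):
            continue  # unreadable or non-numeric entry
    return voltages


def changed_rails(before, after):
    """Return {name: (old_uV, new_uV)} for rails whose reported voltage differs."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }
```

Taking one snapshot before and one after the frequency change and passing
both to changed_rails() shows which rails the kernel believes it has changed.
If that list is empty, the problem is on the software side; if it is not, the
multimeter readings tell whether the hardware followed.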

> 
> What you could also test would be setting 1GHz to run with a higher
> voltage and measure whether the voltage REALLY is increased.
> This way, the unit doesn't crash.

Yes, we will see that.

> 
> Another thing I'd like to remind here:
> 1.5 GHz works fine (even with full load) with the EVM and the iGEPV5
> board.

I haven't tested the Letux kernel on the EVM for a while, but I should do
that as well.

> So it probably is something that's different between those and our
> boards.
> 
> * Are there differences in the board files that could cause
>  our CPU to misbehave?

Yes. The Pyra has a different DT file than the EVM. And for the IGEP I
don't even know what it uses. There is omap5-board-common, but
we have a lot of extensions.

And there could be some problem in a driver for a device that does
not exist on either EVM or IGEP.

In addition there are a lot of changes going on in this area, like:

https://lkml.org/lkml/2016/5/4/735

> 
> * Is there a difference in our hardware setup that could lead to that?
>  We got different RAM, a different quartz, different power setup (we
>  don't have a simple AC like the devboards, but a battery / charger
>  circuit)

RAM and the power setup / battery / charger are something I would remove
from the list of potential reasons, because they are exactly the same
at 500 MHz, 1 GHz or 1.5 GHz. Nothing there is controlled differently.

All these external clocks run at the same speed and voltage, i.e.
neither RAM nor I2C nor video signals run faster when changing the
cpufreq.

The quartz also runs at the same frequency and voltage.

The only thing we had discussed a while ago was a potential temperature
dependency. But I can also exclude that with the tests done today.

The SoC die temperature when running at 500 MHz is ~55°C. After
switching to 1.5 GHz it rises to 65°C within 2-3 seconds and then
the CPU hangs. This does not change the quartz temperature.
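
Since the hang comes within seconds, a logger that syncs every line to disk
can preserve the last temperature and frequency seen before it. A minimal
sketch in Python; the sysfs paths are assumptions (which thermal zone is the
SoC die depends on the board and DT), and the log file name is arbitrary:

```python
import os
import time
from pathlib import Path

# Assumed paths; the thermal zone index for the SoC die may differ.
TEMP_PATH = "/sys/class/thermal/thermal_zone0/temp"
FREQ_PATH = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq"


def read_sample(temp_path=TEMP_PATH, freq_path=FREQ_PATH):
    """Return (temperature in deg C, CPU frequency in MHz)."""
    millideg = int(Path(temp_path).read_text())  # sysfs reports millidegrees C
    khz = int(Path(freq_path).read_text())       # cpufreq reports kHz
    return millideg / 1000.0, khz / 1000.0


def log_samples(count, interval_s=0.1, out_path="cpufreq-temp.log", **paths):
    """Append `count` samples to out_path, syncing after every line so the
    last reading before a hang is not lost in a write buffer."""
    with open(out_path, "a") as out:
        for _ in range(count):
            temp_c, freq_mhz = read_sample(**paths)
            out.write(f"{time.time():.3f} {freq_mhz:.0f} MHz {temp_c:.1f} C\n")
            out.flush()
            os.fsync(out.fileno())
            time.sleep(interval_s)
```

Started in the background before cpufreq-set is called, e.g. with
log_samples(600), this records how far the temperature got before the hang.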

> 
> 
>> Anyways, please all kernel developers think about potential kernel
>> issues (scheduling, SMP, locking, interrupts, I&D-caches) that might
>> lead to such a behaviour. And potential tests (I can add printk etc.
>> where needed).

What puzzles me most is that the system hangs at low system load, as soon
as cpufreq-set goes above approx. 1 GHz.

In that situation the OMAP should be idle 98% of the time and is just
blinking some I2C LEDs and waiting for me to type the next command over the
UART console. Well, DSS is also running in the background, but not much
more.
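
The idle figure can be checked from two snapshots of the aggregate "cpu" line
in /proc/stat. A minimal sketch in Python; the function names are made up for
illustration, and iowait is counted as idle here, which is a choice and not
the only possible one:

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def idle_percent(before_text, after_text):
    """Percentage of time spent idle between two /proc/stat snapshots."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [a - b for a, b in zip(after, before)]
    # Fields: user nice system idle iowait irq softirq steal guest guest_nice.
    # Guest time is already included in user/nice, so sum only the first eight.
    total = sum(deltas[:8])
    if total <= 0:
        raise ValueError("snapshots are identical or out of order")
    idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)
    return 100.0 * idle / total


def sample_idle_percent(interval_s=1.0, path="/proc/stat"):
    """Take two snapshots interval_s apart and return the idle percentage."""
    with open(path) as stat:
        before = stat.read()
    time.sleep(interval_s)
    with open(path) as stat:
        after = stat.read()
    return idle_percent(before, after)
```

Running sample_idle_percent() at 500 MHz and again at 1 GHz shows whether the
load really stays that low as the frequency goes up.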

BR,
Nikolaus


