[Letux-kernel] thermal madness

H. Nikolaus Schaller hns at goldelico.com
Sun Sep 15 14:36:39 CEST 2019


Hi Andreas,

> Am 15.09.2019 um 14:04 schrieb Andreas Kemnade <andreas at kemnade.info>:
> 
> Hi,
> 
> On Sat, 14 Sep 2019 12:28:06 +0200
> "H. Nikolaus Schaller" <hns at goldelico.com> wrote:
> 
>>> Am 14.09.2019 um 12:23 schrieb Andreas Kemnade <andreas at kemnade.info>:
>>> 
>>> On Fri, 13 Sep 2019 22:27:11 +0200
>>> Andreas Kemnade <andreas at kemnade.info> wrote:
>>> 
>>>> some more testing here:
>>>> root@(none):/# cpufreq-info 
>>>> cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
>>>> Report errors and bugs to cpufreq at vger.kernel.org, please.
>>>> analyzing CPU 0:
>>>> driver: cpufreq-dt
>>>> CPUs which run at the same hardware frequency: 0
>>>> CPUs which need to have their frequency coordinated by software: 0
>>>> maximum transition latency: 300 us.
>>>> hardware limits: 300 MHz - 1000 MHz
>>>> available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
>>>> available cpufreq governors: conservative, userspace, powersave, ondemand, performance
>>>> current policy: frequency should be within 300 MHz and 1000 MHz.
>>>>                 The governor "ondemand" may decide which speed to use
>>>>                 within this range.
>>>> current CPU frequency is 600 MHz (asserted by call to hardware).
>>>> cpufreq stats: 300 MHz:94.86%, 600 MHz:2.49%, 800 MHz:0.92%, 1000 MHz:1.73%  (51)  
>>> 
>>> Shouldn't this be 1000 MHz: 0%, if the boost switch is not enabled?
>> 
>> I had it enabled... 
>> 
>> And we can remove the turbo-mode; tag anyways as soon as thermal throttling works.
>> 
> I am not sure about that. How do we test that 1GHz works well?

Under heavy load it will run permanently at 1GHz, until thermal
throttling switches to 800/600MHz for a moment.

Mine currently says (with turbo-mode tags already removed):

root@letux:/# cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq at vger.kernel.org, please.
analyzing CPU 0:
  driver: cpufreq-dt
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 300 us.
  hardware limits: 300 MHz - 1000 MHz
  available frequency steps: 300 MHz, 600 MHz, 800 MHz, 1000 MHz
  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
  current policy: frequency should be within 300 MHz and 1000 MHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 1000 MHz (asserted by call to hardware).
  cpufreq stats: 300 MHz:77.01%, 600 MHz:7.83%, 800 MHz:0.52%, 1000 MHz:14.64%  (56341)
root@letux:/# /root/temperatures
Sun Sep 15 12:33:55 UTC 2019 67° 4181mV 1000MHz
root@letux:/#

Maybe there is a misunderstanding about what the turbo-mode tag does.
It just disables the 1GHz OPP until /sys/.../boost is enabled.
Then the OPP becomes available and no longer shows "1000 MHz: 0.00%".

Everything else runs the same, depending on the governor.
And you can still cpufreq-set -f 1g.

So it does *not* switch to "always 1GHz", and removing it
just gives the green light to the ondemand governor.
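
To check from the shell whether the 1GHz OPP is really being used, one
could pull its share out of the "cpufreq stats" line. This is only a
minimal sketch, assuming the cpufreq-info output format shown above;
opp_share is a made-up helper name, not part of cpufrequtils:

```shell
# opp_share FREQ: read cpufreq-info output on stdin and print the
# time-in-state percentage for one frequency step.
# FREQ must be written as cpufreq-info prints it, e.g. "1000 MHz".
# (opp_share is a hypothetical helper, not a cpufrequtils command.)
opp_share() {
    # Keep only the stats line, put each "FREQ:PERCENT%" entry on its
    # own line, then print the percentage of the requested entry.
    # The [^0-9] guard keeps "300 MHz" from matching inside "1300 MHz".
    grep 'cpufreq stats:' \
        | tr ',' '\n' \
        | sed -n "s/.*[^0-9]$1:\([0-9.]*\)%.*/\1/p"
}
```

Usage would be something like: cpufreq-info | opp_share "1000 MHz"
With the OPP disabled by turbo-mode and boost off, I would expect it to
print 0.00, and some non-zero value once the OPP is available.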

Basically I added turbo-mode to the 1GHz OPP because we
can't guarantee thermal limits without thermal management.
So it is limited until someone explicitly allows it to be used,
either by the echo command or by adding thermal management to
the kernel.

> I remember hw troubles not showing up in memtest but in gcc giving
> internal errors and corrupted filesystems. So maybe compiling the
> kernel at 1GHz would be a good test. Or compiling anything with a
> reproducible build infrastructure. Debian pbuild system e.g. should
> give the same build results everywhere.

Yes, that looks like a good test.
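
A rough sketch of how such a check could look: run the same command
twice and compare checksums of what it writes to stdout. same_output is
a made-up helper name, and it only tells us something if the command is
deterministic to begin with (no timestamps etc. in the output):

```shell
# same_output CMD [ARGS...]: run CMD twice and succeed only if both runs
# produce byte-identical stdout (compared through sha256sum).
# (same_output is a hypothetical helper; the exit status of CMD itself
# is not checked, only its output.)
same_output() {
    first=$("$@" | sha256sum)
    second=$("$@" | sha256sum)
    [ "$first" = "$second" ]
}
```

For example, with some larger source file big.c (hypothetical) and the
CPU pinned by cpufreq-set -f 1g:

  same_output sh -c 'gcc -O2 -c big.c -o big.o && cat big.o' || echo MISMATCH

A mismatch, or gcc dying with an internal error, would point at the
hardware not being stable at that OPP.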

My test uses the high_load script, which does some heavy NEON calculations
(I haven't checked what Grazvydas has really coded there). But it would
report floating point calculation errors by cross- and double-checking
the results.

What is not clear is how it uses the caches, and it might have low
memory bandwidth needs.

But memory access speed is independent of the OPP (at least as far as the clock is concerned).

BR,
Nikolaus


