[Letux-kernel] 1.5GHz problems

H. Nikolaus Schaller hns at goldelico.com
Sun Jul 31 23:37:17 CEST 2016

More test results / observations:

A) Pyra: MIPI/DSI is not working if governor "powersave" and 500 MHz are selected at boot (and the OMAP5 is running @500MHz only)
I get "DSI protocol violations" which means the CPU is too slow to process something.
To me this indicates that something is CPU clock frequency dependent which should not be the case since DSI should be asynchronous, or not?

B) Pyra: LEDs seem to blink slightly irregularly when booted @500 MHz only (may be harmless)
Sometimes the serial console appears to show delays (looks as if some IRQ-blocking process runs a little slow?)

C) 5432EVM (letux-4.7.0 w/o Pyra Mainboard connected)
I have modified the DT so that it also has 500 / 750 / 1250 MHz OPPs like for the Pyra.
Temperature measurements (with cooling fingers installed)

idle at 500MHz => ca. 46°C
high-load at 500MHz => ca. 52°C

idle at 1000MHz => ca. 50°C
high-load at 1000MHz => ca. 73°C

idle at 1500MHz => ca. 55°C
high-load at 1500MHz => ca. 99°C
cools down to 73° within 5 seconds...

But in one attempt @1.5GHz my EVM did suddenly hang!
Unfortunately not reproducible.
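
For anyone who wants to repeat the temperature readings from test C, here is a minimal Python sketch. It assumes the values are taken from the generic kernel thermal sysfs interface (/sys/class/thermal/thermal_zone*/temp, reported in millidegrees Celsius); the mail does not say how the figures above were actually obtained, and the function name is made up for illustration.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {zone type: temperature in degrees C} for each readable zone.

    If two zones report the same type, the later one wins.
    """
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            name = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # zone is missing, disabled or unreadable: skip it
        readings[name] = millideg / 1000.0
    return readings
```

Calling read_thermal_zones() once per second while ./high-load runs would give a log comparable to the idle and high-load figures above.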

D) 5432EVM with Pyra Mainboard connected (booted at 500 MHz)

* bq2429x_charger 1-006b: bq24296_battery_probe(): Failed in reading register 0x0a
   I think it is normal that it needs either battery or USB power to respond on I2C
* ssd2858 was initialized w/o errors - but does not show anything
* re-initialization of ssd fails with DSI protocol errors (have not analysed further - may be a setup problem of the old prototype hardware)

* kernel did hang once @1.5GHz after typing "reboot" and did not shut down.
* did hang another time @1.5GHz for no obvious reason
* kernel did hang once immediately after typing cpufreq-set -f 1.5GHz over ssh/ethernet and ./high-load

hang means:
* heartbeat LEDs stop blinking (usually they keep blinking until the kernel prints the lines below)
	[info] Will now halt.
	[  293.933938] dsi: mipi_debug_disable()
	[  293.969283] reboot: Power down
* Palmas does not react to power-on (because it is still on)
* EVM reboots on Reset

So on my EVM there is also a problem with 1.5GHz although not as strong as on the Pyra CPU board.
For better analysis someone else should be able to reproduce and then apply JTAG to find out what the
OMAP is doing (or not doing) in that state.

E) Pyra CPU Temperature measurements (same as test C)

idle at 500MHz => ca. 42°C
high-load at 500MHz => ca. 54°C

idle at 1000MHz => ca. 62°C
high-load at 1000MHz => ca. 90°C

idle at 1500MHz 
heartbeat stops almost immediately after switching to 1.5 GHz

BTW: display also doesn't work (maybe it is related to 500 MHz boot) any more in the current setup

I booted again and was able to make this log

root@letux:~# cpufreq-set -f 1.5ghz
root@letux:~# ls
b_host.sh       femtocom.c       mic-test        tam
batt            findhwmon        mipi-test       temperatures
bl              findiio          modprobe-test   test_omapfb_vsync.c
blanviewd.c     findregulator    mqtmoko         tvout
boe-w677l       findthermal      palmas-dump     twl
boe-w677l.prog  flash-nand       panelselect     useful
bootstrap       fm-demo          pin             vibra.py
bt-scan         gps-demo         pppd            video-demo
camera-demo     gps-on           profile         volumed
capture-demo    hello.c          sd-maximise     wlan-on
capture.c       high-load        si4721.c        wlan-scan
charger         hw-test          somefile.mp3    wwan
config.tgz      kbl              somefile.wav    wwan-off
dcs             ledtest          somefile4.mp3   wwan-on
debugdsi        lsHSO            sound-demo      wwan-status
dial            makerootfs       ssd2858         wwan.conf
fbpng.c         measure-power    success-s90451  x
fbtest          measure-suspend  suspend-test
femtocom        mic-present      sys-profile
root@letux:~# ls -l

Then everything hangs. Heartbeat stops blinking.

Interestingly, I have *never* seen any console message when it hangs!
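
Since the console stays silent, one way to narrow down when the hang happens is to write a log line to storage before and after every frequency switch. The following is only a sketch under assumptions: it uses the standard cpufreq sysfs files scaling_governor and scaling_setspeed (which need the userspace governor, as cpufreq-set -f does), and the function names, dwell time and log location are hypothetical.

```python
import os
import time


def _write(path, text):
    """Write a value to a sysfs-style attribute file."""
    with open(path, "w") as f:
        f.write(text)


def _log(log_path, message):
    """Append a line and force it to storage before returning."""
    with open(log_path, "a") as f:
        f.write(message + "\n")
        f.flush()
        os.fsync(f.fileno())


def step_frequencies(cpufreq_dir, freqs_khz, log_path, dwell_s=10.0):
    """Walk through freqs_khz, logging before and after every switch."""
    _write(os.path.join(cpufreq_dir, "scaling_governor"), "userspace")
    for khz in freqs_khz:
        _log(log_path, f"setting {khz} kHz")
        _write(os.path.join(cpufreq_dir, "scaling_setspeed"), str(khz))
        time.sleep(dwell_s)
        _log(log_path, f"survived {khz} kHz for {dwell_s} s")
```

On the target this would be called with cpufreq_dir set to /sys/devices/system/cpu/cpu0/cpufreq and the OPP frequencies in kHz (500000 up to 1500000). After a hang and reset, the last line in the log shows whether the board died during the switch itself or some time after it.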

This raises another question: what would happen if one core runs wild?
Does the other core continue in SMP mode? Would it be able to print some
messages? Is there some special CONFIG for kernel debugging of SMP
or scheduling?

F) Pyra: Voltage measurements before and after cpufreq-set -f 1.5GHz

Measured by Oscilloscope (GND reference on Mainboard!)
              after boot    cpufreq-set   cpufreq-set
                            1GHz          1.5GHz
VDD_MPU:      0.96V         1.16V         1.4V (in one case the Palmas shut down, so I could not measure)
VDD_DDR3:     1.32V         1.32V         1.32V
VDD_CORE:     1.12V         1.16V         1.16V

Measurement by external Voltmeter (reference to CPU GND)
              after boot    cpufreq-set   cpufreq-set   hangs
                            1GHz          1.5GHz        @1.5GHz
VDD_MPU:      0.843V        1.053V        1.241V        1.247V
VDD_CORE:     1.035V        1.034V        1.036V        1.036V
VDD_DDR3:     1.209V        1.210V        1.201V        1.203V
VDD_MM:       1.041V        1.041V        1.041V        1.040V
1V8:          1.788V        1.789V        1.799V        1.790V
VSYS:         3.624V        3.622V        3.627V        3.562V

	operating-points = <
		/* kHz    uV */
		500000 850000
		750000 950000
		1000000 1060000
		1250000 1150000
		1500000 1250000
	>;

So the VDD_MPU is very precise! Just 7mV (9mV at 1.5GHz) lower than defined by software.
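
The comparison between the OPP table and the voltmeter readings can be written out as a small Python sketch. It assumes that the "after boot" column corresponds to the 500 MHz OPP (the board was booted in powersave mode); the names are chosen for illustration only, and everything is kept in integer millivolts to avoid rounding noise.

```python
# OPP table from the device tree fragment: kHz -> microvolts.
OPP_UV = {
    500000: 850000,
    750000: 950000,
    1000000: 1060000,
    1250000: 1150000,
    1500000: 1250000,
}

# Voltmeter readings of VDD_MPU in millivolts (after boot, 1 GHz, 1.5 GHz).
# Assumption: "after boot" means the 500 MHz OPP.
MEASURED_MV = {500000: 843, 1000000: 1053, 1500000: 1241}


def mpu_deficit_mv(opp_uv, measured_mv):
    """Return {kHz: nominal minus measured VDD_MPU, in mV}."""
    return {khz: opp_uv[khz] // 1000 - mv for khz, mv in measured_mv.items()}


print(mpu_deficit_mv(OPP_UV, MEASURED_MV))
```

With the values from this mail the result is 7 mV at 500 MHz and 1 GHz, and 9 mV at 1.5 GHz.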

other observations:
* no visible difference when high-load is started @500MHz
* it appears as if noise increases on VDD_MPU if high-load is started @1GHz
* it appears as if noise increases even more on VDD_MPU if high-load is started @1.5GHz - until activities break down

Interestingly, the Palmas sometimes shuts down after cpufreq-set to 1.5 GHz instead
of the CPU just hanging. But I saw this for the first time after connecting the oscilloscope
probes. This would mean that the Palmas detects some regulator overcurrent or something
else going wrong.

Sometimes the CPU immediately hangs after switching to 1.5GHz. But one time it continued
to blink the LEDs so that I could measure all the voltages. After starting high-load it stopped.

And it might be that we have too much noise on VDD_MPU @1.5GHz.
This is now a test scenario that I can compare to the EVM as soon as I find time for doing that.

This was a good hint to boot in "powersave" mode and manually switch to 1.5GHz.
Before this, I wasn't able to do such measurements. And without measurements
it is not possible to judge if something is ok or not ok - or if some change is an improvement.

Anyways these results now need some interpretation. Especially why VSYS goes down
from 3.6V to 3.5V! And the other voltages remain the same. This either means that the
Palmas draws even more current from VSYS - or the bq24297 has reset itself to the default
3.5V regulation mode. Or something else?
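
To make the "only VSYS moves" observation explicit, here is a short Python sketch that compares the voltmeter readings while running at 1.5GHz with those taken after the hang. The 10 mV threshold is an arbitrary assumption, the VDD_MM reading in the hang column is assumed to be 1.040 V, and the names are illustrative.

```python
# Voltmeter readings in millivolts: (running at 1.5 GHz, after the hang).
# Assumption: the VDD_MM reading after the hang is taken as 1.040 V.
RAILS_MV = {
    "VDD_MPU": (1241, 1247),
    "VDD_CORE": (1036, 1036),
    "VDD_DDR3": (1201, 1203),
    "VDD_MM": (1041, 1040),
    "1V8": (1799, 1790),
    "VSYS": (3627, 3562),
}


def rails_that_moved(rails_mv, threshold_mv=10):
    """Return {rail: change in mV} for rails that moved by more than threshold_mv."""
    return {
        rail: after - before
        for rail, (before, after) in rails_mv.items()
        if abs(after - before) > threshold_mv
    }


print(rails_that_moved(RAILS_MV))
```

With these numbers only VSYS is reported, with a drop of 65 mV; all other rails stay within 10 mV.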

Strange. Weird. Odd. Puzzling.

But we come closer to an understanding.

