[Letux-kernel] [PATCH 00/20] A bunch of JZ4730 fixups for letux-kernel

H. Nikolaus Schaller hns at goldelico.com
Tue Dec 29 11:38:25 CET 2020


> Am 29.12.2020 um 09:28 schrieb H. Nikolaus Schaller <hns at goldelico.com>:
> 
> Hi Paul,
> 
>> Am 28.12.2020 um 23:29 schrieb Paul Boddie <paul at boddie.org.uk>:
>> 
>> 
>>> I still have the printk sched_clock timer issue. Which also may have
>>> an influence since I don't know if it is used by
>>> wait_for_completion_timeout() and triggers "early" timeout. Then I would
>>> try to debug this in the i2c driver...
>> 
>> [...]
>> 
>>> To me the pattern looks like integer overflow during
>>> multiplication / division or expansion from 32 bit
>>> to 64 bit time value.
>>> 
>>> Question: how does sched_clock() work on the jz4730?
>>> 
>>> Obviously it does not use the simple jiffies based clock:
>>> 
>>> https://elixir.bootlin.com/linux/latest/source/kernel/sched/clock.c#L64
>>> 
>>> It likely does use
>>> 
>>> https://elixir.bootlin.com/linux/latest/source/kernel/time/sched_clock.c#L82
>>> 
>>> But I could not identify a call to sched_clock_register()
>>> or how the setup works. And which registers sched_clock() reads.
>>> 
>>> So we probably should focus on that first before I
>>> continue with the i2c driver.
>> 
>> So, among the other things I have been looking at is this. I went searching 
>> for timer/counter details starting at time_init (arch/mips/kernel/time.c). 
>> This led me to some likely definitions applicable to the JZ4730 such as 
>> init_r4k_clocksource (arch/mips/kernel/csrc-r4k.c) which uses a variable 
>> called mips_hpt_frequency that is defined in plat_time_init (arch/mips/
>> generic/init.c).
>> 
>> It is in the plat_time_init function that an attempt is made to get a clock, 
>> but we haven't defined one in the device tree for the JZ4730. So, with the 
>> assumption that this might make a difference, I have made a patch that should 
>> add a CPU node with a reference to the appropriate clock.
>> 
>> Paul<0001-Add-missing-CPU-node-to-make-timer-available.patch>
> 
> i have added the patch, which looks good to have in ay case, but it has no
> effect.
> 
> I took the opportunity to print /proc/interrupts and test the "sleep" command:
> it took real time 30 seconds for the "sleep 10"...
> 
> root at letux:~# cat /proc/interrupts && sleep 10 && cat /proc/interrupts 
>           CPU0       
>  2:      24680      MIPS   2  SoC intc cascade interrupt
>  8:       8277       TCU   0  TCU0
>  9:        413      INTC   1  10042000.i2c
> 10:       1563      INTC   9  ttyS0
> 14:       8656      INTC  14  10021000.mmc
> 15:       6190      INTC  13  ohci_hcd:usb1
> 16:          0     GPIOA   0  matrix-keypad
> 17:          0     GPIOA   1  matrix-keypad
> 18:          0     GPIOA   2  matrix-keypad
> 19:          0     GPIOA   3  matrix-keypad
> 20:          0     GPIOA   4  matrix-keypad
> 21:          0      INTC  21  13020000.dma
> 29:          0     GPIOA   5  matrix-keypad
> 30:          0      INTC  30  ingenic-drm
> 31:          0     GPIOA   6  matrix-keypad
> 32:          0     GPIOA   7  matrix-keypad
> 33:          0     GPIOC   0  10021000.mmc cd
> 34:          0     GPIOA  16  Touchpad Left
> 35:          0     GPIOA  13  Touchpad Right
> ERR:          0
>           CPU0       
>  2:      26964      MIPS   2  SoC intc cascade interrupt
>  8:       9278       TCU   0  TCU0
>  9:        413      INTC   1  10042000.i2c
> 10:       1676      INTC   9  ttyS0
> 14:       8663      INTC  14  10021000.mmc
> 15:       7355      INTC  13  ohci_hcd:usb1
> 16:          0     GPIOA   0  matrix-keypad
> 17:          0     GPIOA   1  matrix-keypad
> 18:          0     GPIOA   2  matrix-keypad
> 19:          0     GPIOA   3  matrix-keypad
> 20:          0     GPIOA   4  matrix-keypad
> 21:          0      INTC  21  13020000.dma
> 29:          0     GPIOA   5  matrix-keypad
> 30:          0      INTC  30  ingenic-drm
> 31:          0     GPIOA   6  matrix-keypad
> 32:          0     GPIOA   7  matrix-keypad
> 33:          0     GPIOC   0  10021000.mmc cd
> 34:          0     GPIOA  16  Touchpad Left
> 35:          0     GPIOA  13  Touchpad Right
> ERR:          0
> root at letux:~# 
> 
> exactly 1001 TCU0 interrupts, some for ttyS0 as expected
> and 1200 USB interrupts.
> 
> All that is not really fast.
> 
> So we should IMHO understand which version of sched_clock()
> is used and which registers it is reading.
> 
> kernel/time/sched_clock.o exists so I assume that it uses
> this version which calls some read_sched_clock function
> pointer.
> 
> But what is the function behind the read pointer?
> 
> I think this needs some more printk - but printk is
> not yet available before the really early stage setup
> is done.

Ok, I managed to hack printing the values of the "cd" struct:

https://elixir.bootlin.com/linux/latest/source/kernel/time/sched_clock.c#L60

[    2.902204] adapter_test: read_sched_clock=ingenic_tcu_timer_read+0x0/0xb8
[    1.979722] adapter_test: epoch_ns=1800520832
[    2.117934] adapter_test: epoch_cyc=3950
[    2.242847] adapter_test: sched_clock_mask=65535
[    2.388993] adapter_test: mult=2275555556
[    2.522934] adapter_test: shift=17
[    2.633836] adapter_test: ingenic_tcu_timer_read+0x0/0xb8

So we do an ingenic_tcu_timer_read() which makes the hickups...

This is supposed to return an u64 from a regmap:

https://elixir.bootlin.com/linux/latest/source/drivers/clocksource/ingenic-timer.c#L52

Our code expands this to

static u64 notrace ingenic_tcu_timer_read(void)
{
	struct ingenic_tcu *tcu = ingenic_tcu;
	unsigned int reload;
	unsigned int count;

	if (tcu->soc_info->jz4740_regs) {
		regmap_read(tcu->map, TCU_REG_TCNTc(tcu->cs_channel), &count);
	} else {
		regmap_read(tcu->map, TCU_JZ4730_REG_TCNTc(tcu->cs_channel), &count);
		regmap_read(tcu->map, TCU_JZ4730_REG_TRDRc(tcu->cs_channel), &reload);
		count = (u16)reload - (u16)count;
	}

	return count;
}

Hm. 16 bit. If I get it right we access the OTCNT0 and OTDR0 at 0x10002014 and 0x10002010.
These are 32 bit registers. So why do we 16 bit truncation?
Strangely, experiment shows that removal of the (u16) doesn't make a difference.

So here is the devmem2 of the related registers:

root at letux:~# devmem2 0x10002000
/dev/mem opened.
Memory mapped at address 0x77eb6000.
Value at address 0x10002000 (0x77eb6000): 0xFFFFFF03
root at letux:~# devmem2 0x10002010
/dev/mem opened.
Memory mapped at address 0x77eb4000.
Value at address 0x10002010 (0x77eb4010): 0x23F
root at letux:~# devmem2 0x10002014
/dev/mem opened.
Memory mapped at address 0x77ed8000.
Value at address 0x10002014 (0x77ed8014): 0x9D7B
root at letux:~# devmem2 0x10002018
/dev/mem opened.
Memory mapped at address 0x77ece000.
Value at address 0x10002018 (0x77ece018): 0xFFFF0022
root at letux:~# devmem2 0x1000201c
/dev/mem opened.
Memory mapped at address 0x77ef3000.
Value at address 0x1000201C (0x77ef301c): 0x9D7B
root at letux:~# devmem2 0x10002010
/dev/mem opened.
Memory mapped at address 0x77ec5000.
Value at address 0x10002010 (0x77ec5010): 0x23F
root at letux:~# devmem2 0x10002010
/dev/mem opened.
Memory mapped at address 0x77ea4000.
Value at address 0x10002010 (0x77ea4010): 0x23F
root at letux:~# devmem2 0x10002030
/dev/mem opened.
Memory mapped at address 0x77e09000.
Value at address 0x10002030 (0x77e09030): 0xFFFF
root at letux:~# devmem2 0x10002034
/dev/mem opened.
Memory mapped at address 0x77e19000.
Value at address 0x10002034 (0x77e19034): 0xABA3
root at letux:~# devmem2 0x10002038
/dev/mem opened.
Memory mapped at address 0x77e76000.
Value at address 0x10002038 (0x77e76038): 0xFFFF0042
root at letux:~# devmem2 0x1000203c
/dev/mem opened.
Memory mapped at address 0x77e5c000.
Value at address 0x1000203C (0x77e5c03c): 0xD1C6
root at letux:~# devmem2 0x10002030
/dev/mem opened.
Memory mapped at address 0x77e88000.
Value at address 0x10002030 (0x77e88030): 0xFFFF
root at letux:~# devmem2 0x10002034
/dev/mem opened.
Memory mapped at address 0x77e09000.
Value at address 0x10002034 (0x77e09034): 0x9E61
root at letux:~# 

Looks as if we are using OS Timer 1? Right?

Now I also understand why it does not make a difference to remove the (u16)
because the timer is running on 16 bit precision anyways and initialized
with 2^16-1.

OTCSR1 is 0x0042 which means PCLK/64. What frequency is PCLK?
If it were let's say 100MHz, we still get a 16 bit overflow only
after 40 msec.

Hm. I think here I am lost with further analyses...
Or I have a compiler error expanding to u64.

Or one of the calculations following outside the ingenic
specific code is going wrong.

Maybe one of the values in

[    1.979722] adapter_test: epoch_ns=1800520832
[    2.117934] adapter_test: epoch_cyc=3950
[    2.242847] adapter_test: sched_clock_mask=65535
[    2.388993] adapter_test: mult=2275555556
[    2.522934] adapter_test: shift=17

leads to some integer overflow.

BR,
Nikolaus

 



More information about the Letux-kernel mailing list