[Letux-kernel] [PATCH 00/20] A bunch of JZ4730 fixups for letux-kernel
H. Nikolaus Schaller
hns at goldelico.com
Tue Dec 29 16:26:58 CET 2020
Hi Paul,
I had another idea.
> Am 29.12.2020 um 11:38 schrieb H. Nikolaus Schaller <hns at goldelico.com>:
>
> OTCSR1 is 0x0042 which means PCLK/64. What frequency is PCLK?
> If it were let's say 100MHz, we still get a 16 bit overflow only
> after 40 msec.
>
> Hm. I think I am lost here with further analysis...
> Or there is a compiler error in the expansion to u64.
>
> Or one of the calculations following outside the ingenic
> specific code is going wrong.
>
> Maybe one of the values in
>
> [ 1.979722] adapter_test: epoch_ns=1800520832
> [ 2.117934] adapter_test: epoch_cyc=3950
> [ 2.242847] adapter_test: sched_clock_mask=65535
> [ 2.388993] adapter_test: mult=2275555556
> [ 2.522934] adapter_test: shift=17
>
> leads to some integer overflow.
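First, regarding the overflow question: the worst case inside cyc_to_ns() is
cyc = 65535 with mult = 2275555556, i.e. 65535 * 2275555556 ~= 1.5 * 10^14,
which fits comfortably into a u64 (max ~1.8 * 10^19). So the multiplication
itself cannot overflow, provided it is really carried out in 64 bit.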
New debugging code (reads the raw OTCNT1 counter next to sched_clock() and jiffies):

{
	int i;
	unsigned int start = 0x10002034;	/* physical address of OTCNT1 */
	void __iomem *addr;

	addr = ioremap(start, 4);
	if (!addr) {
		printk("ioremap failed %08x\n", start);
		return;
	}
	/* print sched_clock(), jiffies and the raw counter side by side */
	for (i = 0; i < 20; i++)
		printk("%s: sched_clock=%llu jiffies=%lu OTCNT1=%u\n",
		       __func__, sched_clock(), jiffies, readl(addr));
	iounmap(addr);
}
[ 2.101805] adapter_test: sched_clock=2100486110 jiffies=4294937380 OTCNT1=44298
[ 2.341874] adapter_test: sched_clock=2340156249 jiffies=4294937381 OTCNT1=30493
[ 2.576336] adapter_test: sched_clock=2575104165 jiffies=4294937381 OTCNT1=16960
[ 2.810277] adapter_test: sched_clock=2809027776 jiffies=4294937381 OTCNT1=3485
[ 1.903802] adapter_test: sched_clock=1902552082 jiffies=4294937381 OTCNT1=55698
[ 2.137743] adapter_test: sched_clock=2136510415 jiffies=4294937381 OTCNT1=42223
[ 2.377586] adapter_test: sched_clock=2375885415 jiffies=4294937382 OTCNT1=28436
[ 2.612048] adapter_test: sched_clock=2610798610 jiffies=4294937382 OTCNT1=14903
[ 2.845972] adapter_test: sched_clock=2844722221 jiffies=4294937382 OTCNT1=1429
[ 1.939479] adapter_test: sched_clock=1938246526 jiffies=4294937382 OTCNT1=53642
[ 2.173420] adapter_test: sched_clock=2172170137 jiffies=4294937382 OTCNT1=40168
[ 2.413246] adapter_test: sched_clock=2411545138 jiffies=4294937383 OTCNT1=26381
[ 2.647725] adapter_test: sched_clock=2646493054 jiffies=4294937383 OTCNT1=12848
[ 2.881649] adapter_test: sched_clock=2880416665 jiffies=4294937383 OTCNT1=64909
[ 1.977829] adapter_test: sched_clock=1976562499 jiffies=4294937383 OTCNT1=51435
[ 2.211753] adapter_test: sched_clock=2210520832 jiffies=4294937383 OTCNT1=37959
[ 2.451649] adapter_test: sched_clock=2449947915 jiffies=4294937384 OTCNT1=24170
[ 2.686076] adapter_test: sched_clock=2684861110 jiffies=4294937384 OTCNT1=10638
[ 1.782239] adapter_test: sched_clock=1780989582 jiffies=4294937384 OTCNT1=62700
[ 2.016180] adapter_test: sched_clock=2014930554 jiffies=4294937384 OTCNT1=49225
What we can deduce is that OTCNT1 is properly decrementing, and in parallel
the sched_clock increments. But each time OTCNT1 wraps around (restarts at
0xffff) the sched_clock jumps backwards.
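Doing the math on the log above: across the wrap between [ 2.810277] and
[ 1.903802] the sched_clock goes back by 2809027776 - 1902552082 = 906475694 ns,
while consecutive prints normally advance by ~234 ms. So the clock falls short
by about 906 + 234 = 1140 ms, which is almost exactly one full 16-bit counter
span: (65536 * 2275555556) >> 17 = 1137777778 ns. In other words, exactly one
wrap-around's worth gets lost, not some random amount.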
I guess since the sched_clock code is prepared to handle a limited-precision
timer it should detect this case and add the wrap-around value, i.e. it should
keep its own extension of the 16 bits of the timer read through
read_sched_clock. This logic seems to have a wrong assumption about what to
add. Rather, it subtracts something...
Now let's try to understand how a call to sched_clock() is translated into
a call to ingenic_tcu_timer_read() and where this wrap-around detection takes
place.
BTW: the rate at which OTCNT1 wraps around seems to be almost the same as the
jiffies increment, just a little faster. You can see that OTCNT1 is roughly
30000 each time jiffies increments, but about 2000 counts lower each time
(30493, 28436, 26381, 24170 in the log above).

So if the wrap-around detection is driven by the jiffies interrupt it may miss
a double wrap every now and then. Should we then run OTCNT1 a little slower or
jiffies a little faster? If not, we don't care.
After some searching I think I have found how it works:

https://elixir.bootlin.com/linux/latest/source/kernel/time/sched_clock.c#L127

Line 135 reads the clock register (value 0..65535).
Line 136 subtracts the previous value stored in rd.epoch_cyc and masks the
result with sched_clock_mask (65535) to correctly handle wrap-around. Then it
calls cyc_to_ns() to convert to ns and adds the last ns value.
Lines 138 & 139 update the stored values.

Calling sched_clock() takes the last epoch_ns value and adds however much the
counter has advanced since then.
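Condensed, the mechanism is roughly this (paraphrased, not the verbatim kernel
source; the seqcount protocol is omitted and the read hook is assumed to
present an up-counting value):

static u64 cyc_to_ns(u64 cyc, u32 mult, u32 shift)
{
	return (cyc * mult) >> shift;
}

/* called periodically from an hrtimer, ideally before the counter can wrap */
static void update_sched_clock(void)
{
	/* here: ingenic_tcu_timer_read(), returning 0..65535 */
	u64 cyc = read_sched_clock();

	rd.epoch_ns += cyc_to_ns((cyc - rd.epoch_cyc) & rd.sched_clock_mask,
				 rd.mult, rd.shift);
	rd.epoch_cyc = cyc;
}

The masked delta handles a single wrap correctly, e.g. epoch_cyc = 60000 and a
current reading of 100 give (100 - 60000) & 0xffff = 5636 elapsed cycles. But
if more than one full wrap lies between two updates, whole 65536-cycle spans
silently disappear.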
So there are two potential reasons why the clock jumps backwards then...
a) the cyc_to_ns() function has a (compiler?) bug and returns negative values
b) update_sched_clock() is not called before the counter wraps around
I can't exclude a compiler bug in the 64-bit multiply and shift in cyc_to_ns(),
but it is unlikely to be a general bug. Other architectures use the same code
as well.
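To rule that out we could run a quick test with the values from the log above;
something like this (hypothetical, expected value computed by hand):

{
	/* sanity check for the 64-bit multiply+shift done by cyc_to_ns() */
	u64 cyc = 65535;		/* worst case for a 16-bit counter */
	u64 mult = 2275555556ULL;
	u32 shift = 17;

	/* should print 1137760416; anything else points at the compiler */
	printk("cyc_to_ns test: %llu\n", (cyc * mult) >> shift);
}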
Now, who is calling update_sched_clock() regularly? Which timer/interrupt?
It seems to be some hrtimer. Is it firing faster than OTCNT1 wraps around?
Or is the irq disabled for too long?
Basically the hrtimer is initialized by generic_sched_clock_init(),
and its period is defined by cd.wrap_kt. I have not yet printed that
value but can easily do so if needed.
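For reference, the poll handler is essentially this (again paraphrased):

static enum hrtimer_restart sched_clock_poll(struct hrtimer *hrt)
{
	update_sched_clock();
	hrtimer_forward_now(hrt, cd.wrap_kt);	/* re-arm with the same period */
	return HRTIMER_RESTART;
}

And a hypothetical one-liner at the end of sched_clock_register() in
kernel/time/sched_clock.c would show the period to compare against the
~1138 ms wrap time:

	pr_info("sched_clock: wrap_kt=%lld ns\n", ktime_to_ns(cd.wrap_kt));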
So I'd say the problem is NOT OTCNT1, but we have learned that the (u16) cast
can be removed. It is the hrtimer-driven update which seems to be broken.
BR,
Nikolaus