[Letux-kernel] Timers (was Re: [PATCH 00/20] A bunch of JZ4730 fixups for letux-kernel)
H. Nikolaus Schaller
hns at goldelico.com
Tue Jan 19 10:15:48 CET 2021
> On 19.01.2021 at 09:49, H. Nikolaus Schaller <hns at goldelico.com> wrote:
>
> Hi Paul,
>
>> On 18.01.2021 at 14:01, H. Nikolaus Schaller <hns at goldelico.com> wrote:
>>
>> Hi Paul,
>>
>>> On 18.01.2021 at 01:03, Paul Boddie <paul at boddie.org.uk> wrote:
>>>
>>> Nikolaus (and others),
>>>
>>> Just following up to something from a while ago...
>>>
>>> On Wednesday, 30 December 2020 18:02:40 CET Paul Boddie wrote:
>>>> On Wednesday, 30 December 2020 15:10:05 CET H. Nikolaus Schaller wrote:
>>>>>
>>>>> Which counter on the chip is used for that so that I could check its
>>>>> settings with devmem2?
>>>>
>>>> Unless I am mistaken, the r4k_read_sched_clock function (arch/mips/kernel/
>>>> csrc-r4k.c) will be used, and this takes advantage of the Count coprocessor
>>>> register (CP0 register 9 select 0).
>>>
>>> In fact, I am not sure this is being included in the build. The arch/mips/
>>> Kconfig file does not set CSRC_R4K for any of the Ingenic-related options.
>>>
>>> I was just reviewing the timer-related code, and I can only suggest the
>>> improvements covered by the patch included here, with the casting to 16-bit
>>> values removed, and with an adjustment setting the counter upon reload
>>> appropriately remembering that the OST counts down, not up, unlike the TCU on
>>> the JZ4740.
>>>
>>> Looking at my Fiasco timer initialisation code, the input clock frequency
>>> being EXCLK or 3686400Hz and a "scheduler granularity" of 1000 results in a
>>> reload value for the timer of 3686, so in that environment a 16-bit value is
>>> sufficient.
>>>
>>> The other patch included with this message trims redundant definitions from
>>> the clock driver.
>>
>> I'll try both, but I am quite sure that the OST is running properly and that
>> having only 16 bits is neither a limitation nor a reason for the observed
>> problems. Extending to 32 bits may only hide them better so that they occur
>> more rarely.
>>
>> I already found that there are calculations to set up an hr-timer to read the
>> OST regularly and early enough to handle its 16-bit overflow.
>>
>> The problem seems to be that this hrtimer isn't scheduled as fast as requested.
>> So it is a problem of the hrtimer and not the jz4730 specific OST.
>>
>> The main reason I can imagine is that somewhere a spinlock is held or
>> interrupts are disabled for too long.
>>
>> Anyway, it will be interesting to see what your patches really change. Maybe
>> we get another hint towards a debugging strategy.
>
> I could not apply the second patch. Maybe it contains fixes that I have already
> done.
>
> The first one has no visible effect. Still:
>
> root at letux:~# cat </dev/tcp/time.nist.gov/13 && sleep 10 && cat </dev/tcp/time.nist.gov/13
>
> 59233 21-01-19 08:38:37 00 0 0 738.1 UTC(NIST) *
>
> 59233 21-01-19 08:38:55 00 0 0 448.1 UTC(NIST) *
> root at letux:~#
>
> Hm. A good theory would be that the hr-timers are running at half speed
> compared to what the timeout calculations assume. This would explain the
> doubled sleep time and why the sched_clock() overflow detection fails (if the
> counter is not sampled fast enough).
>
> Here are some findings about hr-timers:
>
> https://elinux.org/High_Resolution_Timers
> https://lwn.net/Articles/167897/
>
> According to the second article, timers can be attached to processes (threads)
> by setting ->data and leaving the callback NULL. They then wake the process
> when they time out.
>
> I have not checked how the sleep syscall is implemented but the best idea would
> be to start a hr-timer, make the process sleep and wake up after timeout.
>
> Studying how this all works could reveal why we get twice the expected
> sleep time...
Indeed, the nanosleep() syscall is implemented using an hrtimer:
https://elixir.bootlin.com/linux/v5.11-rc3/source/kernel/time/hrtimer.c#L1950
And I assume that the standard sleep() call is no longer a real syscall
but a glibc wrapper around nanosleep().
Yes, it is:
https://github.com/lattera/glibc/blob/master/sysdeps/posix/sleep.c (line 55).
So we have learned that sleep 10 indeed calls hrtimer_nanosleep() and, through
it, do_nanosleep(), and somewhere there we get a factor of 2 compared to what
user-space assumes.
We are getting closer...
BR,
Nikolaus