[Letux-kernel] Debugging 4.11-rc6 on GTA04

H. Nikolaus Schaller hns at goldelico.com
Fri Apr 14 18:52:31 CEST 2017


Hi Andreas,

> Am 14.04.2017 um 16:02 schrieb Andreas Kemnade <andreas at kemnade.info>:
> 
> On Thu, 13 Apr 2017 17:56:08 +0200
> "H. Nikolaus Schaller" <hns at goldelico.com> wrote:
> 
>> Hi Andreas,
>> 
>>> Am 12.04.2017 um 07:07 schrieb Andreas Kemnade <andreas at kemnade.info>:
>> 
>> --- snip ---
>> 
>>>> Maybe mainline has another bug. Or our latest patches trigger that in 4.11
>>>> but not in 4.10.
>>>> 
>>> Display problems seem to start with this patch:
>>> 
>>> 897145d0c7010b4e07fa9bc674b1dfb9a2c6fff9 is the first bad commit
>>> commit 897145d0c7010b4e07fa9bc674b1dfb9a2c6fff9
>>> Author: Jyri Sarha <jsarha at ti.com>
>>> Date:   Fri Jan 27 12:04:55 2017 +0200
>>> 
>>>   drm/omapdrm: Move commit_modeset_enables() before commit_planes()
>>> 
>>>   Move drm_atomic_helper_commit_modeset_enables() call to before
>>>   drm_atomic_helper_commit_planes() call and have a
>>>   omap_atomic_wait_for_completion() call after both.
>>> 
>>>   With the current dss dispc implementation we have to enable the new
>>>   modeset before we can commit planes. The dispc ovl configuration
>>>   relies on the video mode configuration been written into the HW when
>>>   the ovl configuration is calculated.
>>> 
>>>   This approach is not ideal because after a mode change the plane
>>>   update is executed only after the first vblank interrupt. The dispc
>>>   implementation should be fixed so that it is able use uncommitted drm
>>>   state information.
>> 
>> ***should*** probably means it does not :(
>> 
>> I wonder why Tomi did approve that. I think we should discuss with him
>> or ask to revert?
>> 
> yes, I think so.
> 
>> Maybe this bug is hidden with my setup of loaded kernel modules and other
>> letux-base patches which modify initialization sequence? Or it depends
>> on other details.
>> 
> I could reproduce that with letux_defconfig. Then I reverted said patch.
> No display problem.
> Then I checked out HEAD^ (so again at 1467f43)
> compiled modules and just copied omapdrm to sd card.
> Again no display problem!
> I changed kernel commandline to init=/init-measure-current.sh
> which loads stuff piece by piece and measures current
> Now the display problem is there again.
> So it depends on module load order/timing. And the order is perhaps
> influenced by the order in which the modules are copied to the sdcard.

I think it does not depend on the order on the SD card (because modules are
located by name and not by file location). But it depends on the device tree and
on the presence of other modules which match earlier processed DT nodes. And initialization
seems to run in multiple threads, especially for i2c devices. Those can
need more or less time to complete probing depending on subtle hardware
fluctuations (e.g. speed of chip initialization). I think +/-0.1 seconds
is not uncommon and the order may swap between boot attempts. On the other hand
some drivers synchronize because they check the availability of all resources
they depend on and return -EPROBE_DEFER if any one is not yet initialized.

Of course the speed of the SD card may have an influence because loading a
module file takes more or less time...

So probing isn't predictable since EPROBE_DEFER was introduced, even on
single processor boards. On SMP systems it is even more randomized. I don't
know if there is a config option to enforce some strict probing sequence.
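
To illustrate the effect, here is a rough userspace toy model in Python (not
kernel code). The device names and their dependencies are hypothetical, and
retrying in whole passes is a simplification of what the driver core does. It
only shows that a probe returning -EPROBE_DEFER keeps dependencies in the right
order, while independent devices can still bind in a different order depending
on where probing starts.

```python
import random

# Numeric value of the kernel's EPROBE_DEFER; a probe function returns it negated.
EPROBE_DEFER = 517

# Hypothetical devices and the devices they depend on (illustration only).
DEPENDS_ON = {
    "i2c-bus": [],
    "regulator": ["i2c-bus"],
    "charger": ["i2c-bus"],
    "panel": ["regulator"],
    "display": ["panel", "regulator"],
}


def probe(dev, bound):
    """Return 0 on success, -EPROBE_DEFER if a dependency is not bound yet."""
    if all(dep in bound for dep in DEPENDS_ON[dev]):
        return 0
    return -EPROBE_DEFER


def boot(initial_order):
    """Probe devices in the given order and retry deferred ones in passes.

    Returns the order in which devices finally bound and the number of
    deferrals that were needed.
    """
    pending = list(initial_order)
    bound = []
    deferrals = 0
    while pending:
        deferred = []
        for dev in pending:
            if probe(dev, bound) == 0:
                bound.append(dev)
            else:
                deferred.append(dev)
                deferrals += 1
        if len(deferred) == len(pending):
            # No progress in a whole pass: a dependency can never be satisfied.
            raise RuntimeError("unresolvable dependency")
        pending = deferred
    return bound, deferrals


if __name__ == "__main__":
    devices = list(DEPENDS_ON)
    for seed in range(3):
        order = devices[:]
        random.Random(seed).shuffle(order)  # stand-in for timing differences
        bound, deferrals = boot(order)
        print(order, "->", bound, "deferrals:", deferrals)
```

In this model "charger" and "regulator" do not depend on each other, so which
one binds first depends only on the initial order, while "display" always binds
after "panel" and "regulator".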

> 
> [...]
>>> kernel config used is attached. I have compiled statically most of the
>>> twl4030 / usb stuff because we have patches in our feature branches to
>>> make that work as modules.
>> 
>> Yes, indeed. I should try to upstream such things...
>> 
> Especially I am talking about this:
> commit ea858c8cdbcb2758458cbcc003e61041d75aa31f
> Author: H. Nikolaus Schaller <hns at goldelico.com>
> 
>    drivers:power:twl4030-charger: don't check if battery is present
> 
> commit fe605a5f153e8612475c460ba0714e45f3aecd3e
> Author: H. Nikolaus Schaller <hns at goldelico.com>
> 
>    fix: reorder to check resources first
> 
> commit ee7bfe0377e7cfc1e14d80f241eb990a0b8cf8d9
> Author: H. Nikolaus Schaller <hns at goldelico.com>
> 
>    drivers:power:twl4030-charger: fix problem with EPROBE_DEFER
> 
> commit 9d7b2d776637c2d132883728b6c9ff4e939d1861
> Author: H. Nikolaus Schaller <hns at goldelico.com>
> 
>    drivers:power:twl4030-charger: don't return after allocating irq

Yes. This is all in the feature branch work/hns/power/twl4030_charger-v3

Now that I can reliably boot with the latest -rc, I think I should
give high priority to fixing/reworking these patches and upstreaming
the most important ones.

This time not all in one big patch set, because that lengthens discussions
and people comment with different views :).

So it seems to be a better strategy to submit single commits, and as soon
as one is accepted, submit the next one. This works as long as they are
independent.

BR,
Nikolaus
