[Letux-kernel] LetuxOS: Replicant for GTA04

H. Nikolaus Schaller hns at goldelico.com
Wed Oct 21 09:39:29 CEST 2020

Hi Andreas,

> Am 21.10.2020 um 09:01 schrieb Andreas Kemnade <andreas at kemnade.info>:
> On Tue, 20 Oct 2020 20:31:25 +0200
> "H. Nikolaus Schaller" <hns at goldelico.com> wrote:
>> While we are waiting for v5.10-rc1, I have invested some
>> time to resurrect our Replicant 4.2 image for the GTA04.
>> I have some progress towards running a more modern kernel :)
>> The biggest missing piece was that my build wrapper did not call
>> the Letux/scripts/replicant-modules script. So there were no
>> modules in the right format to install and the kernel failed
>> to load them.
>> The next problem was that init.gta04.rc is not maintained
>> in the kernel tree but depends on it. So the version I originally
>> installed did not even call load_modules.sh.
>> I have changed that, so that the kernel and the startup tricks
>> and helpers can be kept more easily in sync.
>> So far I can now build a new µSD through makesd by
>> 	DEV=/dev/sdb makesd -v latest replicant -r https://download.goldelico.com/gta04-replicant/4.2/20170423-kernel-4.10-replicant.tbz
>> and then manually install a newly built kernel, device tree
>> and repacked modules.
> Yes, the old image I have built. It already runs on the GTA04A5.

Good! What I want to achieve by moving to newer kernel releases
is to switch to a longterm kernel and to get some newer kernel features,
and potentially to be able to run it on other devices which are
not supported by the 4.10 kernel.

>> As soon as I run the next full build of all letux kernels
>> (maybe next Monday when v5.10-rc1 is out) there should be
>> new replicant compatible kernel packages so that makesd can
>> do a full install. I'll report how to do this later.
>> The status of my tests is:
>> Kernel 4.14.202 boots up to the Replicant logo and then gets stuck.
>> Kernel 4.19.152 writes ANDROID to the framebuffer but then reboots - this kernel should already support the GTA04A5 if I remember correctly.
>> Kernel 5.4.72 panics somewhere in omapdrm.
>> Kernel 5.9.1 drivers/staging/android/logger.c does not compile (problems with struct timespec).
>> So there is some progress and potentially fixable issues.
>> The most important one is that I do not know how to debug that there
>> is no progress after the Replicant logo with 4.14 kernel. There is
>> no console message, processes are running but nothing visible happens.
>> Touch remains inactive. Console shell works.
>> Ideas? Suggestions?
> What I am wondering about is whether it is really a good idea to stick
> with replicant 4.2. Maybe it is easier to get replicant 6 running.

Well, from my perspective it is much more complex since I have never had
a working replicant 6 image... And I can't build new rootfs images at
the moment.

So it seems much easier to modernize the kernel and patch some minor issues
with replicant 4.2. And there is only a small piece missing (the hang at the boot logo)
where I simply lack ideas on how to debug it.

As soon as this is really working (and easy to install), we can have
a look at building replicant (4.2, 6, >6) from sources.

AFAIK there is no strace available to check what /system/bin/bootanimation
is doing (what it is waiting for).

Well, I was wrong. There *is* strace :)

root@android:/ # kill 2044
[  212.338470] binder_alloc: 2044: binder_alloc_buf, no vma
root@android:/ # [  212.350494] binder: 2024:2043 transaction failed 29189/-3, size 12-0 line 2933
[  212.358978] binder: send failed reply for transaction 1159 to 2044:2063
[  212.366851] binder: undelivered TRANSACTION_ERROR: 29189

root@android:/ # 
root@android:/ # strace /system/bin/bootanimation
execve("/system/bin/bootanimation", ["/system/bin/bootanimation"], [/* 22 vars */]) = 0
mprotect(0x40035000, 4096, PROT_READ)   = 0
gettid()                                = 14072
set_tls(0x40037218, 0x400371dc, 0x40037314, 0x40, 0x400371dc) = 0
getpid()                                = 14072
sigaction(SIGILL, {0x4002bc41, [], SA_RESTART|SA_SIGINFO}, NULL, 0x36f8) = 0
sigaction(SIGABRT, {0x4002bc41, [], SA_RESTART|SA_SIGINFO}, NULL, 0x36f8) = 0
sigaction(SIGBUS, {0x4002bc41, [], SA_RESTART|SA_SIGINFO}, NULL, 0x36f8) = 0
sigaction(SIGFPE, {0x4002bc41, [], SA_RESTART|SA_SIGINFO}, NULL, 0x36f8) = 0
sigaction(SIGSEGV, {0x4002bc41, [], SA_RESTART|SA_SIGINFO}, NULL, 0x36f8) = 0
sigaction(SIGSTKFLT, {0x4002bc41, [], SA_RESTART|SA_SIGINFO}, NULL, 0x36f8) = 0
sigaction(SIGPIPE, {0x4002bc41, [], SA_RESTART|SA_SIGINFO}, NULL, 0x36f8) = 0


setpriority(PRIO_PROCESS, 0, -4)        = 0
open("/dev/binder", O_RDWR)             = 3
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
ioctl(3, 0xc0046209, 0xbeefca68)        = 0
ioctl(3, 0x40046205, 0xbeefca6c)        = 0
mmap2(NULL, 1040384, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 3, 0) = 0x40435000
madvise(0x40435000, 1040384, 0xc /* MADV_??? */) = -1 EINVAL (Invalid argument)
madvise(0x40533000, 1048576, 0xc /* MADV_??? */) = -1 EINVAL (Invalid argument)
mprotect(0x40533000, 4096, PROT_NONE)   = 0
clone(child_stack=0x40632ef8, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|0x400000) = 14258
brk(0x1129000)                          = 0x1129000
gettid()                                = 14072
getpid()                                = 14072
getuid32()                              = 0
ioctl(3, 0xc0186201, 0xbeefc840)        = 0
ioctl(3, 0xc0186201, 0xbeefc9c0)        = 0
ioctl(3, 0xc0186201, 0xbeefc8e8)        = 0
madvise(0x40733000, 1048576, 0xc /* MADV_??? */) = -1 EINVAL (Invalid argument)
mprotect(0x40733000, 4096, PROT_NONE)   = 0
clone(child_stack=0x40832ef8, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|0x400000) = 14260
write(4, "14072", 5)                    = 5
ioctl(3, 0xc0186201

This means that bootanimation hangs in a binder ioctl (request 0xc0186201) which
succeeded three times before.
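
To see which binder requests these raw numbers are, they can be taken apart by hand.
The following is only a rough sketch: it assumes the generic Linux _IOC bit layout
(8 bits number, 8 bits type, 14 bits size, 2 bits direction), and the request names
in the table are my reading of the binder UAPI header, not something strace printed.

```python
# Decode a raw Linux ioctl request number into its _IOC fields.
# Assumption: the generic asm-generic/ioctl.h layout (nr: 8 bits, type: 8 bits,
# size: 14 bits, direction: 2 bits), which is what 32-bit ARM uses.
_IOC_WRITE = 1
_IOC_READ = 2

# Names as read from the binder UAPI header (an assumption, not trace output).
# Only the three requests seen in the trace are listed.
BINDER_NAMES = {
    1: "BINDER_WRITE_READ",
    5: "BINDER_SET_MAX_THREADS",
    9: "BINDER_VERSION",
}


def decode_ioctl(request):
    """Split an ioctl request number into name, type, nr, size and direction."""
    nr = request & 0xFF
    type_char = chr((request >> 8) & 0xFF)
    size = (request >> 16) & 0x3FFF
    direction = (request >> 30) & 0x3
    flags = []
    if direction & _IOC_READ:
        flags.append("read")
    if direction & _IOC_WRITE:
        flags.append("write")
    # Binder requests use the type character 'b'.
    name = BINDER_NAMES.get(nr, "unknown") if type_char == "b" else "unknown"
    return {
        "name": name,
        "type": type_char,
        "nr": nr,
        "size": size,
        "dir": "|".join(flags) or "none",
    }


if __name__ == "__main__":
    # The three request numbers that appear in the strace output.
    for request in (0xC0046209, 0x40046205, 0xC0186201):
        print(hex(request), decode_ioctl(request))
```

If the assumptions hold, the request that hangs decodes as type 'b', number 1, with a
24 byte argument passed in both directions, i.e. the read/write call that carries
the actual transactions.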

Or the write(4, "14072", 5) causes trouble a little later. It is strange that file descriptor 4
isn't the result of any open or creat or dup. So it must be inherited from the parent?

Hm. If I run with 4>&1 it will switch to use fd = 5...
But I can't find a syscall returning this file descriptor.
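
One way to find out what such a descriptor refers to is to read the symlinks in
/proc/<pid>/fd. A minimal sketch, assuming a Linux /proc is mounted. There is no
Python on the Replicant image, so on the device the same information would have to
come from something like ls -l /proc/<pid>/fd; the helper name is made up for
illustration.

```python
import os


def list_open_fds(pid="self"):
    """Map each open fd number of a process to the target of its /proc symlink."""
    fd_dir = "/proc/%s/fd" % pid
    table = {}
    for entry in os.listdir(fd_dir):
        try:
            table[int(entry)] = os.readlink(os.path.join(fd_dir, entry))
        except OSError:
            # The fd was closed between listdir() and readlink(), for example
            # the descriptor listdir() itself used to read the directory.
            continue
    return table


if __name__ == "__main__":
    for fd, target in sorted(list_open_fds().items()):
        print(fd, "->", target)
```

Run against the pid of the hanging bootanimation, this should show whether fd 4 (or 5)
is a regular file, a pipe or a device node.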

Maybe something else triggers a bug in /system/bin/bootanimation?

Debugging is laborious and painstaking :)

Ah, a new idea: the clone() syscall with CLONE_FILES shares the
file descriptor table with the child. So the child may also interfere with
file descriptor 3.
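
This sharing may also explain the "missing" open: strace without -f only follows the
thread it started, so a descriptor opened by one of the cloned threads just shows
up in the shared table without a visible syscall. A small sketch of the effect,
using Python threads (which are created with a shared descriptor table on Linux);
the function names are only for illustration.

```python
import os
import threading


def open_in_thread(result):
    # Runs in a second thread; the new pipe fds land in the shared fd table.
    result["fds"] = os.pipe()


def demo():
    """Let a helper thread open a pipe, then use it from the main thread."""
    result = {}
    worker = threading.Thread(target=open_in_thread, args=(result,))
    worker.start()
    worker.join()
    read_fd, write_fd = result["fds"]
    # The main thread writes to an fd it never opened itself.
    os.write(write_fd, b"14072")
    data = os.read(read_fd, 5)
    os.close(read_fd)
    os.close(write_fd)
    return data


if __name__ == "__main__":
    print(demo())
```

If the strace build on the image supports -f, running it that way should show which
thread opens the descriptor.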

So at the moment the prime suspect is /dev/binder and its driver (since that
may depend on kernel version). Hm. Binder is a standard feature of mainline
kernels... So it should be tested by many others.

Binder has a debugfs and several BINDER_DEBUG flags. Maybe this is something
to play with...
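
As a starting point, the debugfs files could simply be collected in one go. This is
a sketch under assumptions: that debugfs is mounted on /sys/kernel/debug, and that
the binder driver exposes the files named below (names as I remember them from the
mainline driver, they may differ per kernel version). It would have to run on a
host with Python or be translated to a few cat calls on the device.

```python
import os

# Assumed location; depends on the kernel configuration and on debugfs being mounted.
BINDER_DEBUGFS = "/sys/kernel/debug/binder"
# Assumed file names, recalled from the mainline binder driver.
LOG_FILES = ("failed_transaction_log", "transaction_log", "transactions", "state", "stats")


def collect_binder_debug(base=BINDER_DEBUGFS, names=LOG_FILES):
    """Return {file name: text} for every readable debugfs file under base."""
    found = {}
    for name in names:
        path = os.path.join(base, name)
        try:
            with open(path, "r", errors="replace") as handle:
                found[name] = handle.read()
        except OSError:
            # File missing or not readable on this kernel; skip it.
            continue
    return found


if __name__ == "__main__":
    for name, text in collect_binder_debug().items():
        print("==== " + name)
        print(text)
```

Comparing the output of a working 4.10 kernel with the 4.14 one might show where the
transaction gets lost.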

>  The
> most interesting problem there is afaicr graphics consuming too much
> cpu power, even the "bootanimation" slowing down everything far away
> from usability. 

Well, we should have the SVRSGX infrastructure running on dm3730 now...
But I have no idea if that can be activated in a replicant environment.

