[Letux-kernel] [BUG] mmc: core: adjust polling interval for CMD1

Ulf Hansson ulf.hansson at linaro.org
Tue Mar 1 14:38:35 CET 2022


On Thu, 17 Feb 2022 at 21:12, H. Nikolaus Schaller <hns at goldelico.com> wrote:
>
> Hi,
> Jean Rene Dawin did report to me a problem on the Beagle Bone Black starting
> with our disto kernel based on v5.17-rc1:
>
> >> since kernel 5.17-rc1 I noticed slower emmc performance on Beaglebone
> >> Black, but didn't check the logs.
> >> When I tried to run 5.17.0-rc3-letux+ it booted fine, but during IO
> >> traffic there were messages like
> >>
> >> [  662.529584] mmc1: error -110 doing runtime resume
> >> [  669.293590] mmc1: Card stuck being busy! __mmc_poll_for_busy
> >>
> >> [  739.076072] mmc1: Timeout waiting for hardware interrupt.
> >> [  739.145676] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
> >> [  739.231053] mmc1: sdhci: Sys addr:  0x00000000 | Version:  0x00003101
> >> [  739.316472] mmc1: sdhci: Blk size:  0x00000200 | Blk cnt:  0x00000400
> >> [  739.401937] mmc1: sdhci: Argument:  0x00342d30 | Trn mode: 0x00000023
> >> [  739.487439] mmc1: sdhci: Present:   0x01f70000 | Host ctl: 0x00000000
> >> [  739.573007] mmc1: sdhci: Power:     0x0000000f | Blk gap:  0x00000000
> >> [  739.658609] mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00003c07
> >> [  739.744224] mmc1: sdhci: Timeout:   0x00000007 | Int stat: 0x00000002
> >> [  739.829896] mmc1: sdhci: Int enab:  0x027f000b | Sig enab: 0x027f000b
> >> [  739.915623] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000001
> >> [  740.001394] mmc1: sdhci: Caps:      0x07e10080 | Caps_1:   0x00000000
> >> [  740.087208] mmc1: sdhci: Cmd:       0x0000193a | Max curr: 0x00000000
> >> [  740.173051] mmc1: sdhci: Resp[0]:   0x00000900 | Resp[1]:  0x00000000
> >> [  740.258928] mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
> >> [  740.344854] mmc1: sdhci: Host ctl2: 0x00000000
> >> [  740.402796] mmc1: sdhci: ============================================
> >>
> >> and finally IO errors and a corrupted filesystem.
> >>
> >> 5.17.0-rc4-letux+ shows the same behaviour.
>
> I checked with my BeagleBoard Black (am3358) and can confirm this observation.
> It happens only with the integrated eMMC but not with the µSD connected to
> the other mmc interface.
>
> A git bisect found:
>
> 76bfc7ccc2fa9d382576f6013b57a0ef93d5a722 is the first bad commit
> commit 76bfc7ccc2fa9d382576f6013b57a0ef93d5a722
> Author: Huijin Park <huijin.park at samsung.com>
> Date:   Thu Nov 4 15:32:31 2021 +0900
>
>   mmc: core: adjust polling interval for CMD1
>
>   In mmc_send_op_cond(), loops are continuously performed at the same
>   interval of 10 ms.  However the behaviour is not good for some eMMC
>   which can be out from a busy state earlier than 10 ms if normal.
>
>   Rather than fixing about the interval time in mmc_send_op_cond(),
>   let's instead convert into using the common __mmc_poll_for_busy().
>
>   The reason for adjusting the interval time is that it is important
>   to reduce the eMMC initialization time, especially in devices that
>   use eMMC as rootfs.
>
>   Test log(eMMC:KLM8G1GETF-B041):
>
>   before: 12 ms (0.311016 - 0.298729)
>   [    0.295823] mmc0: starting CMD0 arg 00000000 flags 000000c0
>   [    0.298729] mmc0: starting CMD1 arg 40000080 flags 000000e1<-start
>   [    0.311016] mmc0: starting CMD1 arg 40000080 flags 000000e1<-finish
>   [    0.311336] mmc0: starting CMD2 arg 00000000 flags 00000007
>
>   after: 2 ms (0.301270 - 0.298762)
>   [    0.295862] mmc0: starting CMD0 arg 00000000 flags 000000c0
>   [    0.298762] mmc0: starting CMD1 arg 40000080 flags 000000e1<-start
>   [    0.299067] mmc0: starting CMD1 arg 40000080 flags 000000e1
>   [    0.299441] mmc0: starting CMD1 arg 40000080 flags 000000e1
>   [    0.299879] mmc0: starting CMD1 arg 40000080 flags 000000e1
>   [    0.300446] mmc0: starting CMD1 arg 40000080 flags 000000e1
>   [    0.301270] mmc0: starting CMD1 arg 40000080 flags 000000e1<-finish
>   [    0.301572] mmc0: starting CMD2 arg 00000000 flags 00000007
>
>   Signed-off-by: Huijin Park <huijin.park at samsung.com>
>   Link: https://lore.kernel.org/r/20211104063231.2115-3-huijin.park@samsung.com
>   Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
>
> Reverting this makes v5.17-rc[1-4] work.
>
> Any suggestions or fixes?
>
> BR and thanks,
> Nikolaus Schaller
>
> Reported-by: jdawin at math.uni-bielefeld.de
>

Thanks for reporting and bisecting!

What changed with the offending commit is two things:

1) We are sending the CMD1 more frequently, initially in the loop in
the __mmc_poll_for_busy. Step by step, we increase the polling period.
2) We may end up using a slightly shorter total timeout for polling
time, compared to what we used before the offending commit.

Hopefully the problem is related to 2), in which case I think the
below patch should help. Can you please give it a try?

Kind regards
Uffe

From: Ulf Hansson <ulf.hansson at linaro.org>
Date: Tue, 1 Mar 2022 14:24:21 +0100
Subject: [PATCH] mmc: core: Extend timeout to 2s for MMC_SEND_OP_COND

It looks like the timeout for the MMC_SEND_OP_COND (CMD1) might have become
a bit too small due to recent changes. Therefore, let's extend it to 2s,
which is probably more inline with its previous value, to fix the reported
timeout problems.

While at it, let's add a define for the timeout value, rather than using
a hard-coded value for it.

Reported-by: Jean Rene Dawin <jdawin at math.uni-bielefeld.de>
Reported-by: H. Nikolaus Schaller <hns at goldelico.com>
Cc: Huijin Park <huijin.park at samsung.com>
Fixes: 76bfc7ccc2fa ("mmc: core: adjust polling interval for CMD1")
Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
---
 drivers/mmc/core/mmc_ops.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
index d63d1c735335..1f57174b3cf3 100644
--- a/drivers/mmc/core/mmc_ops.c
+++ b/drivers/mmc/core/mmc_ops.c
@@ -21,6 +21,7 @@

 #define MMC_BKOPS_TIMEOUT_MS           (120 * 1000) /* 120s */
 #define MMC_SANITIZE_TIMEOUT_MS                (240 * 1000) /* 240s */
+#define MMC_OP_COND_TIMEOUT_MS         2000 /* 2s */

 static const u8 tuning_blk_pattern_4bit[] = {
        0xff, 0x0f, 0xff, 0x00, 0xff, 0xcc, 0xc3, 0xcc,
@@ -232,7 +233,8 @@ int mmc_send_op_cond(struct mmc_host *host, u32
ocr, u32 *rocr)
        cmd.arg = mmc_host_is_spi(host) ? 0 : ocr;
        cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R3 | MMC_CMD_BCR;

-       err = __mmc_poll_for_busy(host, 1000, &__mmc_send_op_cond_cb, &cb_data);
+       err = __mmc_poll_for_busy(host, MMC_OP_COND_TIMEOUT_MS,
+                                 &__mmc_send_op_cond_cb, &cb_data);
        if (err)
                return err;

-- 
2.25.1


More information about the Letux-kernel mailing list