[Letux-kernel] [BUG] mmc: core: adjust polling interval for CMD1

Ulf Hansson ulf.hansson at linaro.org
Thu Mar 3 11:40:55 CET 2022


On Wed, 2 Mar 2022 at 10:40, H. Nikolaus Schaller <hns at goldelico.com> wrote:
>
> Hi,
>
> > Am 02.03.2022 um 09:20 schrieb Jean Rene Dawin <jdawin at math.uni-bielefeld.de>:
> >
> > Ulf Hansson wrote on Tue  1/03/22 14:38:
> >> On Thu, 17 Feb 2022 at 21:12, H. Nikolaus Schaller <hns at goldelico.com> wrote:
> >>>
> >>
> >> From: Ulf Hansson <ulf.hansson at linaro.org>
> >> Date: Tue, 1 Mar 2022 14:24:21 +0100
> >> Subject: [PATCH] mmc: core: Extend timeout to 2s for MMC_SEND_OP_COND
> >>
> >> It looks like the timeout for the MMC_SEND_OP_COND (CMD1) might have become
> >> a bit too small due to recent changes. Therefore, let's extend it to 2s,
> >> which is probably more inline with its previous value, to fix the reported
> >> timeout problems.
> >>
> >> While at it, let's add a define for the timeout value, rather than using
> >> a hard-coded value for it.
> >>
> >> Reported-by: Jean Rene Dawin <jdawin at math.uni-bielefeld.de>
> >> Reported-by: H. Nikolaus Schaller <hns at goldelico.com>
> >> Cc: Huijin Park <huijin.park at samsung.com>
> >> Fixes: 76bfc7ccc2fa ("mmc: core: adjust polling interval for CMD1")
> >> Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
> >> ---
> >> drivers/mmc/core/mmc_ops.c | 4 +++-
> >> 1 file changed, 3 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
> >> index d63d1c735335..1f57174b3cf3 100644
> >> --- a/drivers/mmc/core/mmc_ops.c
> >> +++ b/drivers/mmc/core/mmc_ops.c
> >> @@ -21,6 +21,7 @@
> >>
> >> #define MMC_BKOPS_TIMEOUT_MS           (120 * 1000) /* 120s */
> >> #define MMC_SANITIZE_TIMEOUT_MS                (240 * 1000) /* 240s */
> >> +#define MMC_OP_COND_TIMEOUT_MS         2000 /* 2s */
> >>
> >> static const u8 tuning_blk_pattern_4bit[] = {
> >>        0xff, 0x0f, 0xff, 0x00, 0xff, 0xcc, 0xc3, 0xcc,
> >> @@ -232,7 +233,8 @@ int mmc_send_op_cond(struct mmc_host *host, u32
> >> ocr, u32 *rocr)
> >>        cmd.arg = mmc_host_is_spi(host) ? 0 : ocr;
> >>        cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R3 | MMC_CMD_BCR;
> >>
> >> -       err = __mmc_poll_for_busy(host, 1000, &__mmc_send_op_cond_cb, &cb_data);
> >> +       err = __mmc_poll_for_busy(host, MMC_OP_COND_TIMEOUT_MS,
> >> +                                 &__mmc_send_op_cond_cb, &cb_data);
> >>        if (err)
> >>                return err;
> >>
> >> --
> >> 2.25.1
> >
> > Hi,
> >
> > thanks. But testing with this patch still gives the same errors:
> >
> > [   52.259940] mmc1: Card stuck being busy! __mmc_poll_for_busy
> > [   52.273380] mmc1: error -110 doing runtime resume
> >
> > and the system gets stuck eventually.
>
> Same result from my tests.
>
> BR and thanks,
> Nikolaus

Alright, it starts to sound like we might need a revert (or at least a
way to restore the previous behaviour) - even if that would just paper
over the real problem. The real problem is more likely related to how
the host driver manages restoring of the power to the card, which
happens when runtime resuming it.

In any case, just to make sure the loop timeout itself isn't the
problem, can you run the below debug patch please? The intent is to
figure out how long the worst case timeout is, when it's working with
CMD1. As soon as the timeout exceeds the worst case, there is a
message printed to the log.

The below patch is based upon that the offending commit has been reverted.

Kind regards
Uffe

From: Ulf Hansson <ulf.hansson at linaro.org>
Date: Thu, 3 Mar 2022 11:00:04 +0100
Subject: [PATCH] mmc: core: DEBUG - Measure and log worst case CMD1 loop

Signed-off-by: Ulf Hansson <ulf.hansson at linaro.org>
---
 drivers/mmc/core/mmc_ops.c | 11 +++++++++++
 include/linux/mmc/host.h   |  1 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/mmc/core/mmc_ops.c b/drivers/mmc/core/mmc_ops.c
index 9946733a34c6..3f03d9e8a3a4 100644
--- a/drivers/mmc/core/mmc_ops.c
+++ b/drivers/mmc/core/mmc_ops.c
@@ -177,11 +177,15 @@ int mmc_send_op_cond(struct mmc_host *host, u32
ocr, u32 *rocr)
 {
        struct mmc_command cmd = {};
        int i, err = 0;
+       s64 cmd1_ms;
+       ktime_t time_start;

        cmd.opcode = MMC_SEND_OP_COND;
        cmd.arg = mmc_host_is_spi(host) ? 0 : ocr;
        cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R3 | MMC_CMD_BCR;

+       time_start = ktime_get();
+
        for (i = 100; i; i--) {
                err = mmc_wait_for_cmd(host, &cmd, 0);
                if (err)
@@ -211,6 +215,13 @@ int mmc_send_op_cond(struct mmc_host *host, u32
ocr, u32 *rocr)
                        cmd.arg = cmd.resp[0] | BIT(30);
        }

+       cmd1_ms = ktime_to_ms(ktime_sub(ktime_get(), time_start));
+       if (!err && cmd1_ms > host->cmd1_ms) {
+               pr_warn("%s: CMD1 timeout increased to %lld ms, loop=%d\n",
+                       mmc_hostname(host), cmd1_ms, i);
+               host->cmd1_ms = cmd1_ms;
+       }
+
        if (rocr && !mmc_host_is_spi(host))
                *rocr = cmd.resp[0];

diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 7afb57cab00b..c2ca3bb05620 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -301,6 +301,7 @@ struct mmc_host {
        unsigned int            f_min;
        unsigned int            f_max;
        unsigned int            f_init;
+       s64                     cmd1_ms;
        u32                     ocr_avail;
        u32                     ocr_avail_sdio; /* SDIO-specific OCR */
        u32                     ocr_avail_sd;   /* SD-specific OCR */
-- 
2.25.1


More information about the Letux-kernel mailing list