[Letux-kernel] [PATCH 0/6] bq2429x: Clean up driver for future fixes

H. Nikolaus Schaller hns at goldelico.com
Sun Aug 16 22:40:13 CEST 2020

> Am 16.08.2020 um 22:33 schrieb David Shah <dave at ds0.me>:
> Hi Nikolaus,
> I also saw the high temperature issue on 5.6 when charging sometimes - I guess
> you know already from your previous bq work but "60°" just means too hot to
> charge as there is no exact readout. This also resulted in a "HOT" print from
> the driver.

Yes, there are only three levels: too cold, of, too hot. And these are mapped
to -10°C, 22.5°C and 60°C.

But the battery wasn't charging (battery is full) and after a reboot it did read
out correctly. So it was a driver glitch. And I never had that with the old driver.

I remember that these bits reporting NTC temperature are not easy to be handled
correctly by the driver so the rework may have changed something subtle.

> Maybe we should be lowering the charge current a bit for safer and more reliable
> charging, I don't know how much control over this the bq gives. The battery
> didn't feel that hot externally, but the thermistor will be sensing the
> temperature inside the pack which will be higher.

If we would (could) do a redesign, there are successor chips of the bq2429x which
have a real ADC that can be read out through i2c so that we would know the temperature
much better...


> Best
> David
> On Sun, 2020-08-16 at 22:09 +0200, H. Nikolaus Schaller wrote:
>> Hi Nick,
>> I have started a little to test your patches on top of v5.8-rc7 (including
>> also David's patches so that the Panel is working again).
>> There was a missing comma - easy to fix for compilation.
>> Basically the driver seems to work, but OTG doesn't. It does not even try to
>> switch power on, although DWC3 detects it.
>> Unplugging the OTG cable once did lead to:
>> root at letux:~# [  397.149733] ------------[ cut here ]------------
>> [  397.154707] WARNING: CPU: 0 PID: 37 at drivers/regulator/core.c:2603
>> _regulator_disable+0x3c/0x168
>> [  397.170413] unbalanced disables for otg_path
>> And once the temperature report was 60°C.
>> So I think there is something wrong with interrupts / status change detection.
>> Just a note: I have tested on an older Pyra prototype with tca6424 and there,
>> IRQs are definitively broken. So we must poll the status bits.
>> Really nice is reading the bit fields from
>> 	cat /sys/class/power_supply/bq24297/device/registers
>> Much better information than the pure register values.
>> Still, using regmap still allows to read them the "old" way for comparison
>> (except the volatile ones):
>> 	cat /sys/kernel/debug/regmap/1-006b/registers
>> There are differences in registers 0 and 1 compared to the old driver. The
>> other registers are the same (except 8 and 9).
>> But this is just a first and quick "hello" test, nothing fundamental. I'll try
>> to find out more in the next days as I find time.
>> And after upgrading to v5.9-rc1.
>> BR and thanks,
>> Nikolaus
>>> Am 13.08.2020 um 17:51 schrieb Nick Elsmore <nicholaselsmore at gmail.com>:
>>> Hi Nikolaus,
>>>> * Interrupts/poll events are often missed, possibly due to hangups in the
>>>> work queue.
>>> I haven't observed this in my tests. Do you have a specific example? Or a
>>> theory when and why it can happen?
>>> I instrumented the driver to print before var_lock was acquired (I left
>>> loglevel below debug, because currently there are lots of debug prints which
>>> flood the output and make it difficult to filter out whats important).  I
>>> noticed that I would often not see bq2429x_usb_detect executing until a
>>> number of seconds or minutes after I plugged or unplugged the USB OTG.  If I
>>> plugged or unplugged multiple times, I would eventually get all of the calls
>>> occurring at once after the delay.  This leads me to believe that the
>>> workqueue is hanging somewhere.
>>> Interestingly, I would sometimes see no new interrupts for this device
>>> triggered in /proc/interrupts.  This contradicts the theory that the issue
>>> is a hangup in the workqueue, but it could be a combination of both issues
>>> in the workqueue and IRQ handler.
>>>> * Plug/unplug events are not properly detected, due to either work queue
>>>> hangups as described above or faulty logic determining plug status
>>> Yes, plug/unplug detection is quite some guesswork. A better strategy would
>>> be to connect to the USB-PHY...
>>> I do not see why we would need to connect to the USB-PHY.  Interrupts from
>>> the bq2429x should be sufficient to determine whether or not the bq2429x is
>>> connected to VBUS and whether or not the bq2429x has determined this to be a
>>> good power source and is using it.  I have patches which implement the
>>> correct logic for handling this in my queue.
>>>> * Watchdog is left on (if not disabled by bootloader) and not serviced. 
>>> It should be disabled by bootloader and kept in disabled state by the
>>> driver.
>>> Contrary to what we all would suspect it is not a safety watchdog which
>>> would e.g. cut processor power or charging after a kernel panic or stall.
>>> It just a mechanism to auto-reset the bq2429x registers to default values
>>> which could also be done through writing all registers.
>>> So I do not see how to use it for practical purposes.
>>> Yes.  We should not assume, however, that the bootloader has disabled the
>>> watchdog.  We should make sure its disabled.
>>>> This is not properly handled.
>>>> * The following properties are missing:
>>> I am not sure if the charger can report a reasonable value for health.
>>> We'll see.  I have patches in my queue that determine reasonable values for
>>> health, but we will see how they fare in review.
>>> Generally I find the patches good from a cursory glance, but I have to do a
>>> life check since it is not easy to see from the git diffs, what is changed
>>> (any why) and if all functions are transformed correctly. This takes at
>>> least until next week.
>>> Yep, while these patches don't (shouldn't) modify any of the logic, they are
>>> rather large and introduce a good amount of restructuring.  I have submitted
>>> these patches first before integrating any logic modifications as its
>>> possible they have introduced regression.  From my tests, the behavior seems
>>> the same with and without the patches.
>>> Thanks,
>>> Nick Elsmore
>>> _______________________________________________
>>> Kernel mailing list
>>> Kernel at pyra-handheld.com
>>> http://pyra-handheld.com/cgi-bin/mailman/listinfo/kernel
>> _______________________________________________
>> Kernel mailing list
>> Kernel at pyra-handheld.com
>> http://pyra-handheld.com/cgi-bin/mailman/listinfo/kernel

More information about the Letux-kernel mailing list