[Gta04-owner] Speex echo cancelation now working?

NeilBrown neilb at suse.de
Mon Apr 16 02:59:31 CEST 2012

On Thu, 29 Mar 2012 23:21:33 +0000 Radek Polak <psonek2 at seznam.cz> wrote:

> Hi,
> i think i have fixed speex echo cancellation in gsm voice routing program. 
> It's now pushed on github:
> 	https://github.com/radekp/gta04-gsm-voice-routing

Hi Radek,

 I finally got up to the stage of making real phone calls on my GTA04 and
 this was very helpful!  Thanks.

 I've been examining it to make sure I understand what is happening and there
 are a few peculiarities.  The thing that stood out for me was the apparent
 need to set start_threshold so high.  I would have thought we want to start
 playing samples as soon as possible, but the setting you use doesn't start
 playing until the buffer is full.
 So I tried reducing it and got terrible clicks and over-runs (as I'm sure
 you know).

 Continuing exploration found two more interesting things.

 1/ At the point where you do echo cancellation, the two input buffers are
    different ages.  One was captured just recently (over the last 32ms)
    while the other was captured before that (between 64 and 32 ms ago).
    You can show this by calling snd_pcm_delay() on each handle (r0.handle
    and r1.handle).  I found that one handle was consistently 256 samples old
    while the other was fresh.

    The sound devices don't actually start capturing until the first read()
    call (or a call to snd_pcm_start()).
    Your code repeatedly reads from the GSM source until it gets a successful
    read, then it reads from the microphone.  So we don't start recording
    from the microphone until we already have a 32ms buffer (256 samples)
    from the GSM source.  This means we are always 32ms out of sync.

    This can easily be addressed by inserting:

     while (route_stream_read(&r1))
             ;  /* empty body: retry until one GSM period has been read */

    before starting the "while (!terminating) {" loop.

    Doing this discards the first full period received from the GSM
    source, but allows the two streams to be more in-sync: The
    snd_pcm_delay is the same for both. This might allow us to reduce the
    size of the 'tail' given to speex_echo_state_init() which is higher than
    it should need to be.

    However the snd_pcm_delay() values are not both 0!!  They alternate
    between both being around 50 (which is acceptable I think) and 256
    (which is much too high).
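
    To put those snd_pcm_delay() numbers into time units, here is a minimal
    sketch of the conversion.  It assumes a sample rate of 8000 Hz, which is
    only inferred from 256 samples making up a 32ms period; frames_to_us()
    and ASSUMED_RATE_HZ are names made up for illustration and are not part
    of the routing program.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 256 frames per 32 ms period implies an 8000 Hz sample rate. */
#define ASSUMED_RATE_HZ 8000

/* Convert a frame count (e.g. a value reported by snd_pcm_delay())
 * into microseconds at the given sample rate. */
static int64_t frames_to_us(int64_t frames, int64_t rate_hz)
{
    assert(rate_hz > 0);
    return frames * 1000000 / rate_hz;
}
```

    At that assumed rate a delay of 256 frames is a whole 32ms period, while
    50 frames is only about 6ms.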

 2/ I sprinkled some calls to gettimeofday() around the loop and printed out
    the time differences.  Just doing this for the first 10 times through the
    loop is enough to see clear patterns without generating too much output.

    It seems that reading from the 'microphone' device sometimes takes well
    over 50ms which is much too long considering that each period is only
    32ms long.
    One read will take 55 msec, the next (starting another 8 msec later due to
    the other processing that happens) takes less than one msec.
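
    For anyone wanting to repeat the measurement, the timing can be sketched
    roughly as below.  This is not the code I actually ran: elapsed_ms() and
    read_overran_period() are hypothetical helpers, and the idea is simply to
    call gettimeofday() before and after the read being measured and compare
    the difference with the 32ms period.

```c
#include <assert.h>
#include <sys/time.h>

/* Milliseconds elapsed from *start to *end, both as filled in by
 * gettimeofday().  The microsecond difference may be negative when the
 * seconds field rolled over; the sum still comes out right. */
static double elapsed_ms(const struct timeval *start, const struct timeval *end)
{
    return (end->tv_sec - start->tv_sec) * 1000.0
         + (end->tv_usec - start->tv_usec) / 1000.0;
}

/* Hypothetical check: did a single read block for longer than one period? */
static int read_overran_period(double read_ms, double period_ms)
{
    assert(period_ms > 0.0);
    return read_ms > period_ms;
}
```

    A read that takes 55 msec against a 32 msec period is flagged by this
    check, while the following sub-millisecond read is not.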

    So it seems that the sound device is waiting until two periods have been
    recorded before returning anything.  Then it returns the first and the
    second is immediately available.

    So either we have something wrong in the configuration, or there is a bug.

    One thing I would like to try is to use "hw:0" rather than "default" for
    the mic/speaker device.
    The important difference there is that hw:0 insists on stereo while
    "default" makes use of the "plug" ALSA module to convert between mono and
    stereo as required.  I wonder if the strange timing issue is caused by
    that.

    As we are only using one channel of the stereo it should be fairly easy
    to use snd_pcm_readn() and snd_pcm_writen() to transfer the real sound on
    one channel (left, I think) and silence on the other channel, and keep
    "plug" completely out of the picture.

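The buffer handling for that idea might look something like the sketch below.
It is untested against real hardware and makes several assumptions: samples
are signed 16-bit, the device accepts non-interleaved access, and the left
channel carries the voice.  mono_to_planar_stereo() is a made-up name; the
sketch only prepares the per-channel pointer array that a non-interleaved
write such as snd_pcm_writen() takes, and does not call ALSA itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Split one mono period into two per-channel buffers: the real sound on
 * the left, silence on the right.  On return bufs[0] and bufs[1] point at
 * the two channel buffers, which is the layout a non-interleaved write
 * (e.g. snd_pcm_writen(handle, bufs, frames)) expects.
 */
static void mono_to_planar_stereo(const int16_t *mono, size_t frames,
                                  int16_t *left, int16_t *right,
                                  void *bufs[2])
{
    assert(mono && left && right && bufs);
    memcpy(left, mono, frames * sizeof(int16_t));  /* real sound */
    memset(right, 0, frames * sizeof(int16_t));    /* silence */
    bufs[0] = left;
    bufs[1] = right;
}
```

The capture side would be the mirror image: read into two channel buffers
with snd_pcm_readn() and keep only the one that carries the microphone.
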
I probably won't have time to play with this for a few days, so I thought
I'd explain where I was up to in the hope that someone else might like to
try experimenting.

My main goal is to be able to increase the volume on calls.  If I try
that, I start getting really bad echo.  I'm hoping that if we can sort
out the timing issues so that there is less delay between record and
play, then the echo cancellation might be able to do a better job.

