sound clipping detection algorithm

18 posts
#1

I am building a sound application similar to
AVR335: Digital Sound Recorder with AVR and Serial DataFlash.

The ADC, with a 5 V reference, captures 8-bit sound data at an 8 kHz
sampling rate, with min = 0 and max = 255.
The whole thing is working OK.

Now I want to detect clipping of the audio input, in order to
tell the user to decrease the volume control manually
(or better, do it automatically with a digital potentiometer,
like a digital AGC).

A sine waveform quantized to 8-bit samples (generated with MATLAB)
shows 7 consecutive samples at 255 ($FF) and at 0 ($00) respectively.

So the clipping detection algorithm should not trigger on
only 7 consecutive $FF or $00 samples, but on more. How many more, though?

Or is this not a good way to detect clipping?

Practice Safe hex.

#2

You can't hear distortion till it gets to about 10%. You can see 1% clipping easily, but you can't hear it. I'd look for 5% of the samples to be clipped, say 10 out of 256. At this point you have most of a DSP AGC or compressor; the next step is to adjust the gain to keep the clipping minimized.

Imagecraft compiler user

Last Edited: Mon. Sep 26, 2005 - 02:41 AM
#3

How about this: every time you detect an $FF, increase a clipping counter by 1. At the end of 256 samples, check the value of the clipping counter. To determine the maximum acceptable number of clipped samples per window, simply listen to it: turn the volume up till you can hear the distortion, use some kind of debug output from the AVR to read the clipping counter at that point, and then set your clipping routine to trigger at a slightly lower value (you want it to do something before you would normally notice it). A single-byte counter will give you a response time and window of about 3 ms. If you try this out, let me know how well it works.
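A minimal sketch of this fixed-window scheme, with the window size from this post and Bob's 10-out-of-256 threshold (the function name and the action taken on clipping are assumptions):

```c
/* Fixed-window clipping detector: count clipped samples, check the
 * count every 256 samples.  Threshold of 10 follows the "10 out of
 * 256" suggestion in the thread; tune it by ear as described above. */
#include <stdint.h>
#include <stdbool.h>

#define WINDOW      256
#define CLIP_LIMIT   10   /* max clipped samples per window before we act */

static uint16_t sample_cnt = 0;
static uint8_t  clip_cnt   = 0;

/* Call once per ADC sample (e.g. from the 8 kHz sample interrupt).
 * Returns true when a window just ended with too many clipped samples. */
bool clip_check(uint8_t s) {
    if (s == 0x00 || s == 0xFF)
        clip_cnt++;
    if (++sample_cnt >= WINDOW) {
        bool clipped = (clip_cnt > CLIP_LIMIT);
        sample_cnt = 0;
        clip_cnt   = 0;
        return clipped;   /* turn on the LED / step the digital pot here */
    }
    return false;
}
```

When it returns true, light the clipping LED or step the digital pot down a notch.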

-Curiosity may have killed the cat
-But that's why they have nine lives

#4

Or you could use an NPN transistor as a switch: one side connected to an input pin, the other to your analog input, set up to turn on at the clipping level.

#5

What you want to do is a "sound compressor".

I'd say all values from 1 to 254 are valid;
the values 0 and 255 may be valid, or may be clipped.

So why not spend the 0.1 dB of dynamic range and treat EVERY 0 and 255 as clipping?

Klaus
********************************
Look at: www.megausb.de (German)
********************************

#6

Thanx,

@ MegaUSBFreak:
-----------------------------
If every $00/$FF signals clipping, it would hurt the dynamics of the signal, since my sound will probably contain a few $00/$FF noise spikes, because it is live audio from a MIC.
More samples provide more reliable results.
But I admit that in applications with good-quality samples (already compressed, derived from a digital source), it would be better and easier.

Actually, what I was looking for was the 5% figure that Bob gave.
And Sceadwian, the 256-sample time window is what I had in mind, but I'm curious: how much better would it be if that window were still 256 samples, but shifted by 1 sample at every sampling instant instead of at fixed boundaries? That way it would not miss a possible clipping event that falls across two consecutive windows. It's more difficult to implement, though. Is it worth the trouble?
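The sliding-window variant asked about here can be sketched with a ring buffer holding one clip flag per sample plus a running count, so each sample costs only a bit test and a bit set (a hypothetical sketch; the names and the threshold are illustrative):

```c
/* Sliding-window clipping detector: a 256-entry ring buffer of clip
 * flags (1 bit each) plus a running count, updated every sample. */
#include <stdint.h>
#include <stdbool.h>

#define WINDOW     256    /* must be a power of two for the cheap wrap */
#define CLIP_LIMIT  10

static uint8_t  flags[WINDOW / 8];  /* 1 bit per sample: clipped or not */
static uint16_t head = 0;           /* slot about to be overwritten     */
static uint16_t clip_cnt = 0;       /* clipped samples in the window    */

/* Call once per sample; returns true while the last WINDOW samples
 * contain too many clipped ones. */
bool clip_check_sliding(uint8_t s) {
    uint8_t mask  = 1 << (head & 7);
    uint8_t *byte = &flags[head >> 3];

    if (*byte & mask) { clip_cnt--; *byte &= (uint8_t)~mask; } /* drop oldest */
    if (s == 0x00 || s == 0xFF) { clip_cnt++; *byte |= mask; }
    head = (head + 1) & (WINDOW - 1);
    return clip_cnt > CLIP_LIMIT;
}
```

Compared with the fixed window, this never misses a burst of clipping that straddles a window boundary, at the cost of 32 bytes of RAM for the flags.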

Practice Safe hex.

#7

8-bit is really ±7-bit: at any instant the signal can be at most 127 steps away from the midpoint, so the S/N is about 42 dB. That's worse than AM radio, worse than records. Why not use 10-bit? 8-bit sounds baaad, and there ain't much you can do about it.

Imagecraft compiler user

#8

In this application, the 8-bits are enough. No need for more.

Practice Safe hex.

#9

No need for more.
=====================
Unless intelligibility has any importance in the app

Imagecraft compiler user

#10

Believe me, Bob,
I do see your point. It's just that in this specific application, 1 or 2 more bits would only increase the cost, not the performance or quality.
Nevertheless, all similar applications in this part of the industry use 8 bits. It's an unofficial standard: it has been tested and works OK. :)

Practice Safe hex.

#11

Can you describe the app? Telephones are fairly popular audio devices. They digitize at 13 or 14 bits and use a u-law table. This makes sense because you are trading range for resolution. Recorded speech is OK with 8 bits, because it can be recorded at 16 bits and smoothed to fit in 8 bits. But if you expect someone to talk into a mic and control the dynamic range of his voice in the presence of outside noise, he had better be a professional voice-over announcer.

Imagecraft compiler user

#12

Increment the counter for each clipped sample detected. If the value after 32 samples is less than 4, then the distortion is under roughly 10% of the samples; at a window size of 32, every clipped sample is roughly 3% distortion. That gives you a 4 ms window/response time. (I was wrong on my math above: for 256 samples it's actually 32 ms, not 3. Damn decimal places.) Experimenting with the number of samples, and maybe using a little fuzzy logic (increase the counter by 2 if a sample is at the rail, 255 or 0, and by 1 if it's one step away, at 254 or 1), might give you slightly better results, as that makes the counter start rising even before the rail-to-rail clipping starts.
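A sketch of this weighted ("fuzzy") variant, assuming the intended weights are 2 for a sample at the rail and 1 for a sample one LSB away; the threshold value is illustrative:

```c
/* Weighted clipping detector over a 32-sample window: rail samples
 * count double, near-rail samples count single, so the sum rises
 * just before hard rail-to-rail clipping begins. */
#include <stdint.h>
#include <stdbool.h>

#define WINDOW       32
#define CLIP_LIMIT    6   /* trips above ~3 rail samples per window (~10%) */

static uint8_t sample_cnt = 0;
static uint8_t weight_sum = 0;

/* Call once per sample; returns true when a 32-sample (4 ms at 8 kHz)
 * window ends with too much clipping weight accumulated. */
bool clip_check_weighted(uint8_t s) {
    if (s == 0xFF || s == 0x00)      weight_sum += 2;  /* at the rail   */
    else if (s == 0xFE || s == 0x01) weight_sum += 1;  /* one step away */
    if (++sample_cnt >= WINDOW) {
        bool clipped = (weight_sum > CLIP_LIMIT);
        sample_cnt = 0;
        weight_sum = 0;
        return clipped;
    }
    return false;
}
```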

-Curiosity may have killed the cat
-But that's why they have nine lives

#13

1 or 2 bits more are only going to increase the cost and not the performance/quality.
Nevertheless, all similar applications in this part of the industry, use 8-bits
=====================
Which AVR are you using that doesn't already have a 10-bit ADC?

You can sample at 10-bits, detect clipping, then convert to 8-bit u-law, and save.
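For reference, a simplified G.711-style mu-law companding pair, sketched from the standard's sign/segment/mantissa byte layout (a production codec should follow G.711 bit-exactly; a 10-bit ADC reading would first be shifted up to this codec's 14-bit range):

```c
/* Simplified mu-law companding: 14-bit signed linear <-> 8-bit code.
 * Illustrative sketch of the G.711 scheme, not a certified codec. */
#include <stdint.h>

#define ULAW_BIAS 33    /* added before taking the segment */
#define ULAW_MAX  8158  /* largest encodable magnitude     */

/* 14-bit signed linear (-8159..8158) -> 8-bit mu-law */
uint8_t ulaw_encode(int16_t pcm) {
    uint8_t sign = 0;
    if (pcm < 0) { sign = 0x80; pcm = (int16_t)-pcm; }
    if (pcm > ULAW_MAX) pcm = ULAW_MAX;
    uint16_t v = (uint16_t)pcm + ULAW_BIAS;           /* 33..8191 */
    uint8_t seg = 0;
    while (v >= (uint16_t)(64u << seg)) seg++;        /* segment 0..7 */
    uint8_t mantissa = (v >> (seg + 1)) & 0x0F;
    return (uint8_t)~(sign | (seg << 4) | mantissa);  /* bytes are inverted */
}

/* 8-bit mu-law -> 14-bit signed linear */
int16_t ulaw_decode(uint8_t u) {
    u = (uint8_t)~u;
    uint8_t seg = (u >> 4) & 0x07;
    int16_t mag = (int16_t)(((2 * (u & 0x0F) + 33) << seg) - ULAW_BIAS);
    return (u & 0x80) ? (int16_t)-mag : mag;
}
```

The decode error stays within half a quantization step of the current segment (at most 128 at full scale), which is what buys the wide dynamic range from only 8 bits.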

#14

After some thinking, even the 1-byte clipping detection is enough for now. In this version, the input gain is fixed to a value calibrated by a pot. If, during assembly, I apply a 3 Vp-p sine and an LED turns on at clipping, I can quickly calibrate the input range to its limits without errors. But for later calibrations, where a steady sine may not be easy to get, the multi-sample technique is necessary.

Now, the audio output cannot be more than 8-bit with this setup (m128 PWM), and the specs said 8-bit PCM. What benefit do I get if I use u-law? Convert the 10-bit input to 8-bit u-law, store it, and then decode it to what? 8-bit linear? Is that possible?

However, you caught me. I hadn't thought of u-law. Going to search for u-law examples.

P.S.
I'm not sure the processing overhead of u-law will fit in the AVR's real-time sound processing budget. It does a few other things, and I don't want to increase the interrupt dead time any further. I could lose a byte from the UART.
P.S.2
I'm not a newbie with AVRs.

Practice Safe hex.

#15

A WM8816 has a peak detector.

Vref = k/256 * 18V

#16

Is it for speech or tones? Speech has peaks that ALWAYS clip, according to Murphy's law. DTMF tones might be perfectly OK! They fit inside the allowable range, and 42 dB S/N is plenty good enough for them.

Imagecraft compiler user

#17

8-bit u-law is typically used to carry 12- to 14-bit linear audio. Decoding 8-bit u-law to 8-bit linear would actually corrupt your signal and offer no benefit. u-law would be most useful if you had a 12-bit ADC and DAC but only an 8-bit bus.

If you want to compress the audio a little for bandwidth or space, look into ADPCM encoding. For regular waves (sine waves) it compresses quite well. As bandwidth increases, ADPCM compression rates go down; past a certain point you end up spending extra data just on the header and getting no compression at all. For simple single-voice notes and low-quality speech, ADPCM is ideal, and smoothing routines on 'raw' audio can increase its compressibility. It would take a trained ear or a computer to detect the compression, and ADPCM is the most microcontroller-friendly as far as code implementation goes.

-Curiosity may have killed the cat
-But that's why they have nine lives

#18

It's mainly for speech (any other audio source is possible too, but not common). This clipping feature is just one extra feature that I thought I could add to the existing design; it's only software. For the first calibration, the simple 1-sample peak detection on a steady sine, shown by an LED, is enough.

I did some research on u-law. It is used to compress more than 8 bits down to 8 bits and then decompress back to the original width. Doing a 10-bit ADC sound reading to 8-bit u-law and then to 8-bit linear is of no use.
ADPCM is nice; I have done tests in the past for various bit lengths. This project does not need that compression. I have managed to fulfill my needs with the existing memories.

In this project, all sound-related stuff is 8-bit. Only the ADC readings could be 10-bit, but what could I do with them? So I ended up with the ADC at 11.0592 MHz/64, 8-bit, left-adjusted, and the whole ADC capture done in a timer's compare interrupt every 125 us, consuming a total of about 15 us. I decided to do the AD conversion in the interrupt to make sure there will be no I/O activity during the conversion (except PWM): one noise source less. 8-bit proved to be OK.
The same timer compare interrupt is also used for some other sound-related functions, selected by flags in a global register (asm), so the flag checks consume some time by themselves.

Practice Safe hex.