Streaming audio files from AVR


Hi, I want to store a few small audio files on an AVR, and play them through a small audio circuit/speaker. The files are simply the words 'red' 'blue' 'green' 'yellow' 'orange' and 'purple'.

I'm not sure what format to save them in to take up the smallest amount of memory, and still sound reasonable. Could they be stored in a single AVR without having to interface with any additional memory chips (boardspace is also an issue)?

Also, I'd like suggestions on how to interface the AVR with the speaker using as little boardspace as possible. Could I simply have an output pin of the AVR drive a square wave which changes in frequency? I already have a DAC available as part of another chip in the circuit, so would it be better to use this, and a VCO?

Thank you


You can do what you want with an AVR. Doesn't the Butterfly have a similar capability in the demo firmware?

Anyway, if this is the primary or only activity for your AVR, then consider using one of the Winbond (now Nuvoton) chips made for answering machine/note recorder and similar. I've used the 1700 series:
http://www.nuvoton.com/hq/enu/Pr...

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


AVR335


Roman Black's technique is very efficient:

http://www.romanblack.com/picsound.htm

It shouldn't be difficult to port the code to the AVR, someone might have already done it.

Leon Heller G1HSM


You can also find some discussions in avrfreaks like here https://www.avrfreaks.net/index.p... where some R-R DACs are evaluated.

If you need telephone voice quality then 8khz 10-bit could do, but it depends on your needs.

Carlos


First use audacity to record the words on a pc as 16 bit mono 22050 (11KHz freq resp), then use samplerate conversion to save as 16KHz (8KHz freq resp) and also try 8 bit mono at the low samp rate. I think you have about 4 seconds of sound, so you might need 32k or more of flash to store it, but mega128s are pretty cheap. I think you can play the samps right into the pwm OCR register, and have an opamp lopass filter driving a small amp and speaker.

Imagecraft compiler user
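
The storage estimate above can be checked with a little arithmetic. This is only a sketch: `pcm_bytes` is a hypothetical helper name, and it assumes uncompressed mono PCM with each sample stored in whole bytes.

```c
#include <stdint.h>

/* Bytes of uncompressed mono PCM needed for a clip.
 * bits_per_sample is rounded up to whole bytes, so a 10-bit sample
 * costs 2 bytes here (no bit packing is attempted in this sketch). */
uint32_t pcm_bytes(uint32_t sample_rate_hz, uint8_t bits_per_sample,
                   uint32_t duration_ms)
{
    uint32_t bytes_per_sample = (bits_per_sample + 7u) / 8u;
    uint64_t samples = (uint64_t)sample_rate_hz * duration_ms / 1000u;
    return (uint32_t)(samples * bytes_per_sample);
}
```

For the roughly 4 seconds of speech mentioned above, `pcm_bytes(8000, 8, 4000)` gives 32000 bytes and `pcm_bytes(16000, 8, 4000)` gives 64000 bytes, so the sample rate and bit depth decide which AVR flash size is needed once the program code is added on top.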


Maybe you can find some useful information here: http://elm-chan.org/works/sd8p/report.html

/Martin.


Telephones (POTS) used to be 3.4kHz bandwidth (6.8kHz minimum sampling rate; digital telephony samples at 8kHz), to give you a rough benchmark for voice quality.


Getting a 3.4khz voice message might have been marvelous in 1940, but record a 'hi there' at a standard rate, then samplerateconvert it down to 6khz and listen to it... it sounds like you're talking thru a pillow. A little faster sounds a lot better.

Imagecraft compiler user


The first samplers/synths used a 33kHz sampling rate -> 33000/2.2* = 15000 Hz audio bandwidth.
(*Nyquist factor of 2, plus some margin for the anti-aliasing filter) (15kHz bandwidth is also used in FM stereo broadcast)

RES


Thanks all, so I would store the sound files in flash program memory, alongside my program code, and not in EEPROM?

I've already decided I'm going to record the sounds with audacity, I just need to borrow a microphone. Quality is not of massive importance; it really just depends on the size of the sound files, which will depend on what format I save them in. Since I have a spare DAC in the circuit already (part of a quad DAC, with the other 3 already being used), I might as well use it, since boardspace is an issue and I want to keep additional circuitry to a minimum. I can drive the audio amplifier circuit straight off the DAC, can't I?

The DAC is interfaced with using the AVR's SPI.
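
A minimal sketch of the playback side, assuming 8-bit unsigned samples and a timer interrupt that fires once per sample period. All names here (`player_t`, `player_tick`, `capture_output`) are hypothetical. The output routine is passed in as a function pointer because the actual DAC part and its SPI frame format are not given in this thread; on the real board that routine would send the sample to the DAC over SPI, or write a PWM compare register instead.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*sample_out_fn)(uint8_t sample);

typedef struct {
    const uint8_t *data;   /* 8-bit unsigned PCM, mid-scale = 128 */
    size_t length;         /* number of samples in the clip */
    size_t pos;            /* index of the next sample to output */
    sample_out_fn out;     /* writes one sample to the DAC or PWM */
} player_t;

void player_start(player_t *p, const uint8_t *data, size_t length,
                  sample_out_fn out)
{
    p->data = data;
    p->length = length;
    p->pos = 0;
    p->out = out;
}

/* Call once per sample period (e.g. from a timer interrupt).
 * Returns 1 while the clip is playing, 0 once it has finished;
 * after the end it parks the output at mid-scale (silence). */
int player_tick(player_t *p)
{
    if (p->pos >= p->length) {
        p->out(128);
        return 0;
    }
    p->out(p->data[p->pos++]);
    return 1;
}

/* Host-side stand-in for the real output routine, for trying the logic
 * on a PC. It only remembers the last value written. */
uint8_t last_output;
void capture_output(uint8_t sample) { last_output = sample; }
```

This version reads the clip through an ordinary pointer, so it assumes the samples are reachable as normal data. Samples stored in AVR program flash would need the flash read macros from avr-libc instead.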


I don't think you will have enough space in EEPROM. Flash should be fine. I made some 'sound players' a while ago, and you just have to take into account the chip you are using so you have enough flash for program + audio data. Sampling rate and bit depth therefore have an important influence on flash usage as well.

Carlos


Yeah, you can drive the amp with the DAC, unless the amp or the DAC really blow. (Er, low impedance input amp or high impedance output / uA-limited output DAC.) Some DACs can even kick out a few watts themselves.


Just use the mic on your laptop or cheap mic on a headset. You're not recording a Stradivarius.

Imagecraft compiler user


bobgardner wrote:
First use audacity to record the words on a pc as 16 bit mono 22050 (11KHz freq resp), then use samplerate conversion to save as 16KHz (8KHz freq resp) and also try 8 bit mono at the low samp rate. I think you have about 4 seconds of sound, so you might need 32k or more of flash to store it, but mega128s are pretty cheap. I think you can play the samps right into the pwm OCR register, and have an opamp lopass filter driving a small amp and speaker.

Hi, is 'samplerate conversion' an extra tool to use after recording? I've recorded it in 16-bit mono at 22050Hz, and in the same quality tab there are options for 'real time sample rate converter' and 'high quality sample rate converter', whose settings can be either 'fast sinc interpolation' or 'high quality sinc interpolation', and I'm not sure what to set these as.

I'm also not sure what to put as the dithering settings in this tab, and also what file format to export them as (I'm guessing WAV, then I don't need any decoding?)


The wav file has a 44 byte header that you can keep and skip over, or just delete it. A couple of avrfreaks have contributed their 'wav to c' converter utility. The output is suitable for including in the project for compilation. After you record the word and edit off the silence before and after the actual utterance, save it and play it back. Sounds ok? Try converting it to 8 bit. Play it back. Still sounds ok? Try it at 11025. Still sounds ok? Try it at 8khz. Yuck. Sounds crappy. Don't use that one.

Imagecraft compiler user
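
A rough sketch of what such a 'wav to c' step might look like on the PC side. It is not one of the contributed utilities, and `wav_payload` and `emit_c_array` are hypothetical names. It assumes the simple 44-byte header described above. WAV files that carry extra chunks have a longer header, and a proper converter would walk the chunk list to find the data chunk.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAV_HEADER_BYTES 44u   /* assumption: canonical PCM header only */

/* Point *samples at the audio data that follows the 44-byte header.
 * Returns the number of sample bytes, or 0 (and *samples = NULL) if the
 * buffer is too short or lacks the RIFF/WAVE signature. */
size_t wav_payload(const uint8_t *file, size_t file_len,
                   const uint8_t **samples)
{
    if (file_len <= WAV_HEADER_BYTES ||
        memcmp(file, "RIFF", 4) != 0 ||
        memcmp(file + 8, "WAVE", 4) != 0) {
        *samples = NULL;
        return 0;
    }
    *samples = file + WAV_HEADER_BYTES;
    return file_len - WAV_HEADER_BYTES;
}

/* Write the samples as a C array definition, 12 values per line. */
void emit_c_array(FILE *out, const char *name,
                  const uint8_t *samples, size_t count)
{
    fprintf(out, "const unsigned char %s[%lu] = {", name,
            (unsigned long)count);
    for (size_t i = 0; i < count; i++) {
        fprintf(out, "%s%s0x%02X", i ? "," : "",
                (i % 12 == 0) ? "\n    " : " ", (unsigned)samples[i]);
    }
    fprintf(out, "\n};\n");
}
```

The generated text can be saved as a source file and compiled into the project alongside the program code.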


Definitely use a stand-alone serial EEPROM and an AVR that has a TWI (which is the same as I2C) interface. Devices like the 24LC256 are very cheap and easy to work with.

I have a serial EEPROM programmer in the Projects section that works well. You can program your spoken-word audio files into the serial EEPROM directly from the PC with this programmer. Then you can put the serial EEPROM onto the AVR board. Use PWM and an RC low-pass filter to make the audio waveform and a cheap, found-everywhere LM386 audio amplifier to drive a little speaker or headphones.

With a $2 serial EEPROM you can use a Tiny48 AVR instead of an expensive Mega128 or greater. It doesn't make any sense to use expensive program flash ROM or static RAM to store unchanging data.

Use 8-bit samples for voice. The only difference between 8 and 16 bits is the amount of background hiss.

Instead of using words, could you use tones for each of the six audio events? Say a burst of three high pitch beeps for 'red', a falling 'zip' for 'green', a ripping sound for 'blue', etc... Much more data space efficient than actual word audio events. Think in terms of 1980s video arcade games. All those sounds were made with programmed square waves using AVR-class 8-bit microprocessors that were running the game and the video at the same time as making the sounds, all at 4-8 MIPS clock speeds.

Also consider using 'yel' for 'yellow' and 'pur' for 'purple'. Two syllable audio events take much more memory than one syllable events.
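
For the tone idea, the square wave can come from a hardware timer instead of stored samples. The sketch below only computes the compare value, using the CTC-mode relation f_out = f_cpu / (2 * prescaler * (1 + TOP)) for a pin that toggles on each compare match. `tone_top` is a hypothetical helper, and the 8 MHz clock and prescaler of 8 in the usage note are assumptions, not values from this thread.

```c
#include <stdint.h>

/* Compare (TOP) value for a 16-bit timer in CTC mode that toggles an
 * output pin on every match:
 *     f_out = f_cpu / (2 * prescaler * (1 + TOP))
 * Returns 0 for invalid input or a tone too high to generate, and
 * clamps to 0xFFFF if the tone is too low for the chosen prescaler. */
uint16_t tone_top(uint32_t f_cpu_hz, uint16_t prescaler, uint32_t tone_hz)
{
    if (prescaler == 0 || tone_hz == 0) {
        return 0;
    }
    uint64_t divisor = 2ull * prescaler * tone_hz;
    uint64_t counts = f_cpu_hz / divisor;
    if (counts == 0) {
        return 0;
    }
    counts -= 1;
    return counts > 0xFFFFu ? 0xFFFFu : (uint16_t)counts;
}
```

With an assumed 8 MHz clock and a prescaler of 8, `tone_top(8000000, 8, 1000)` returns 499 for a 1 kHz beep. Stepping the value every few milliseconds gives rising or falling 'zip' effects, and each sound then costs only a few bytes of table data.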


Thanks, audacity only has options for 16-bit, 24-bit and 32-bit floating point (it's version 1.2.6). I'll see how small I can get the file size before I decide how to store it, as price is not that much of an issue (within reason), whereas boardspace is.
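
If the editor can only export 16-bit samples, they can be reduced to 8 bits afterwards. A minimal sketch, assuming signed 16-bit input and the unsigned, 128-centred 8-bit format that 8-bit WAV files use. The function names are hypothetical. It simply keeps the top 8 bits without dithering, which is where the extra background hiss mentioned earlier comes from.

```c
#include <stddef.h>
#include <stdint.h>

/* Signed 16-bit sample -> unsigned 8-bit sample.
 * Flipping the sign bit turns two's complement into offset binary,
 * then the top byte is kept (plain truncation, no dither). */
uint8_t pcm16_to_u8(int16_t s)
{
    return (uint8_t)((((uint16_t)s) ^ 0x8000u) >> 8);
}

/* Convert a whole buffer; out must have room for n bytes. */
void pcm16_to_u8_buffer(const int16_t *in, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = pcm16_to_u8(in[i]);
    }
}
```

Silence (0) maps to 128, full positive scale to 255 and full negative scale to 0, which halves the storage needed per sample.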


Quote:

audacity only has options for 16-bit, 24-bit and 32-bit floating point (it's version 1.2.6).

This is one of the worst things they ever did to Audacity. The belief seemed to be that it would only ever be used for editing audio that would become MP3/WMA/AAC on your iPod, and they dropped the 8-bit audio support - tragic IMHO.

I still use CoolEdit 2000 but sadly it was bought up and is no longer available.


I'd use goldwave, it is amazingly good and has an option to save raw files in practically 'any' format, even u-law: http://www.goldwave.com

The evaluation version is very useful and lets you do most of the normal functions, if not all.


Thanks for the ideas, I'm now using goldwave, and wav files from this page. I'm not sure what sample rate and bit depth these have but I have set goldwave to save them with an 8-bit depth, at 8kHz. Thing is when I open a file, then save it in this format, the file sizes actually increase dramatically instead of decreasing, which I don't understand. I don't think they are compressed in any way... I just thought they were raw PCM wav files.

Also, when I have the wav files how I want them, how can I program them into program memory (even if it is more costly, it's a one-off build and must be compact), alongside the program code? I'm using AVR Studio 4 and an AVRISP mkII.
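
One common way to do this, assuming the project is built with avr-gcc and avr-libc (an assumption; the toolchain is not stated here), is to compile the samples in as a constant array marked `PROGMEM`. The array then becomes part of the same hex file as the program, and the programmer writes both to flash in one go. The sample values and the names `word_red` and `word_red_sample` below are placeholders for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* PROGMEM, pgm_read_byte */
#else
/* Host fallback so the sketch also compiles on a PC. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Placeholder data, NOT real audio: a real table would be the output of
 * a wav-to-C conversion of the recorded word. */
const uint8_t word_red[] PROGMEM = {
    0x80, 0x8C, 0x97, 0x8C, 0x80, 0x74, 0x69, 0x74
};
const size_t word_red_len = sizeof word_red;

/* Fetch one sample from program flash; returns mid-scale (silence)
 * for an index past the end of the clip. */
uint8_t word_red_sample(size_t i)
{
    if (i >= word_red_len) {
        return 0x80;
    }
    return pgm_read_byte(&word_red[i]);
}
```

`pgm_read_byte` covers the lower 64 KB of flash. On larger parts, data placed above that boundary needs the far variants such as `pgm_read_byte_far`.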

Thanks


I would suggest saving in raw format. Goldwave has an option for that, and you can choose the bit depth you need. You would normally get a smaller file size with that.


Quote:

Thing is when I open a file, then save it in this format, the file sizes actually increase dramatically instead of decreasing, which I don't understand. I don't think they are compressed in any way... I just thought they were raw PCM wav files.

Guess again...

This 12K file is really 105K of PCM. When the 22K/16bit is downsampled to 8K/8bit the ".raw" file is 38.3K

(MP3 expanded out to PCM is going to get huge!)

[BTW this is the file-open dialog in CoolEdit 2000]


Oh ok, I thought you were starting from a wav file. Could you find the function to create a raw file? You can then choose the bit depth and encoding.
You should also resample it first to, for instance, 8kHz or whatever sampling frequency you need.

This way, an 8kHz 8-bit mono raw file should be exactly 8000 bytes per second, or 1 byte per sample.