[TUT] [PC] [AVR] [SOFT] Understanding bi-phase-mark coding

Biphase-Mark encoding and decoding - a short guide.

Biphase-Mark encoding is sometimes recommended as a communication method to get data from one place to another. This guide is intended to show how it works, how to code it and decode it, and how to recognise it.

What is it?
It's a method of encoding data onto either an audio signal, or on a logic-level signal path. Its characteristics include:
- self clocking; you don't need to know the data rate to decode it
- average DC is zero; capacitively coupled audio signals will still decode properly
- robust; it depends on phase-inversions to identify bits
- able to handle either synchronous or asynchronous data
- able to manage a wide range of data rates
- able to be safely decoded on changing data rates - e.g. from a spooling tape

The signal is defined such that there is a phase change in the level of the signal at the end of each bit. If the bit is a one, then there is a further phase change in the centre of the bit period. So, a continuing stream of zeros at a 1kbit/s data rate would be a 500Hz signal; a continuing stream of ones would be a 1kHz signal. Normal data would have a mixture of both; it has a very distinctive sound.
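The rule can be sketched in a few lines of C. This is only an illustration: the function name bmc_encode and the choice of emitting two "half-bit" levels per bit are assumptions made here, not something taken from the attached files.

```c
#include <stddef.h>

/* Hypothetical helper (not from biphase.c): encode n bits, each 0 or 1,
 * as two half-bit levels per bit. 'level' is the line level (0 or 1)
 * before the first bit; the final level is returned so that calls can
 * be chained without losing the phase. 'halves' must hold 2*n entries. */
int bmc_encode(const unsigned char *bits, size_t n, int level,
               unsigned char *halves)
{
    for (size_t i = 0; i < n; i++) {
        level = !level;                  /* phase change at every bit boundary */
        halves[2 * i] = (unsigned char)level;
        if (bits[i])
            level = !level;              /* extra phase change mid-bit for a one */
        halves[2 * i + 1] = (unsigned char)level;
    }
    return level;
}
```

A run of zeros therefore changes level once per bit, and a run of ones twice per bit, which is where the 500Hz and 1kHz figures come from.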

Biphase-mark encoding is widely used in the broadcasting industry for the distribution and recording of linear time-code; the audio signal is recorded on analogue video or audio tapes and can be read both during normal playback and while spooling quickly. It's also often distributed around studio areas as a source of time-of-day data; a reader can be plugged into a jack and the time can be immediately displayed.

Another surprisingly common place to find biphase-mark encoding is in the S/PDIF digital connection (and of course the closely related AES/EBU digital audio).

The software
In the attached files, I have included the source code for an encoder - biphase.c - and a decoder - unbiphase.c. Rather than try and provide an example for AVR - where it can be implemented in so many different ways, depending on the type of AVR chosen, resources required by other parts of the program, and so on - I have provided generic C programs for a PC. They are written for GCC under Linux though they should work with minimal changes - if any - on any of the MS C compilers. The only point to note is that they require variables of int=32 bits, short=16 bits, and char = 8 bits.

It should be noted that neither of these programs is robust enough for production software. Both have far too little error checking, particularly when opening, reading from, or writing to files; neither behaves properly in respect of the Microsoft RIFF format files.

Synchronous vs. Asynchronous

Synchronous data is presented as a continuous bitstream; immediately after one byte of data follows the next. There are no start or stop bits (though parity may be included in the bitstream), so some other mechanism must be used to synchronise the start of the data. Also, if the data input stops, the bitstream must continue. This is generally managed by having a 'sync byte' in the bitstream which is output whenever there is no 'real' data to output. The sync byte must be something which is unambiguous and cannot be mistaken for the real data; this therefore reduces the number of data byte options available in the main data - for example, on an eight-bit system, the code '0xFF' might be reserved as a sync byte. However, if there is the possibility of entering the data stream at a random position, you have to ensure that the data itself in any random position cannot simulate the sync byte - for example, the sequence '0x0F', '0xF0' would look like a sync byte.
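The false-sync problem is easy to demonstrate. The sketch below slides an eight-bit window along a buffer one bit at a time; the function name find_sync and the MSB-first bit order are assumptions chosen for the illustration.

```c
#include <stddef.h>

/* Hypothetical helper: return the bit offset (counting MSB-first from the
 * start of buf) at which the 8-bit pattern 'sync' first appears at ANY
 * bit alignment, or -1 if it never appears. */
long find_sync(const unsigned char *buf, size_t n, unsigned char sync)
{
    unsigned window = 0;
    for (size_t bit = 0; bit < n * 8; bit++) {
        unsigned b = (buf[bit / 8] >> (7 - bit % 8)) & 1u;
        window = ((window << 1) | b) & 0xFFu;   /* keep the last eight bits */
        if (bit >= 7 && window == sync)
            return (long)(bit - 7);
    }
    return -1;
}
```

Given the two bytes 0x0F, 0xF0 and a sync value of 0xFF, this reports a match starting four bits in, even though neither byte is the sync byte.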

Linear Time Code - as mentioned above - is designed to be read not only at multiple and varying speeds, but also forwards and backwards. To achieve this, each 80-bit packet includes a sixteen-bit sync block which cannot occur as a legal bit pattern elsewhere in the packet. The packet is repeated once per frame (every 25th of a second at 25 frames per second). The pattern in the sync block can also be used to identify the direction of the data.

Asynchronous data is familiar to most people as the non-return-to-zero RS232-style data. The data can occur at any time, which implies an idle state on the line. On the RS232 format (ignoring the actual transmission levels) this is a '1'. To mark that data is starting, a start bit is used - a zero - followed by a number of data bits. After an optional parity bit, one or more stop bits - '1's - will restore the line to the idle level. As a result, each 8-bit byte of data takes a minimum of ten bit periods to transmit. Because the start bit can occur at random times, the line has to be sampled somewhat faster than the nominal data rate; normally, sixteen times. This enables the following bits to be sampled somewhere near their centre point for improved noise performance.

A disadvantage of such data is that the clock rates of the transmitter and receiver must be close. Once the two rates differ by around 5%, the sampling point of the last bit can slip to the previous or following bit, producing an error. This is a common problem with AVRs clocked on their internal uncalibrated clocks; either the clock must be calibrated in software, or a crystal or other high-accuracy clock must be used. With biphase-mark coding, however, the bit timing is carried in the signal itself, so clock drift is not important.
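The 5% figure can be checked with a rough calculation, assuming the receiver resynchronises on the start-bit edge and samples each bit at its nominal centre (this ignores the small uncertainty in locating the start edge with 16x oversampling). In a ten-bit frame the last bit is sampled 9.5 bit periods after the start edge, so a relative clock error of epsilon moves that sampling point by 9.5 epsilon bit periods. It stays inside the intended bit only while:

```latex
9.5\,\varepsilon < 0.5
\quad\Longrightarrow\quad
\varepsilon < \frac{0.5}{9.5} \approx 5.3\%
```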

After some consideration, I decided to implement a non-synchronous format for my example. The main driver for this is that most people are more familiar with the format, and it provides a more complex example - though not as complex as a finished implementation might be.

My format - when writing to audio - therefore has a short period of '1's as an idle state, followed by the data itself, each byte sent as one start bit, eight data bits (bit 0 first), no parity bit, and one stop bit. After the last data, another short period of '1's is recorded. This is unnecessary in this instance, but in a more complex system (with, say, short frames of data containing some sort of error correction) it would provide an ideal inter-block idle state.
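As a minimal sketch of that framing (the name frame_byte is hypothetical and is not the routine in biphase.c), each byte becomes ten bits ready to be biphase-mark encoded:

```c
/* Hypothetical helper: expand one byte into the ten-bit frame described
 * above. Each entry of bits[] is 0 or 1. */
void frame_byte(unsigned char byte, unsigned char bits[10])
{
    bits[0] = 0;                                      /* start bit */
    for (int i = 0; i < 8; i++)
        bits[1 + i] = (unsigned char)((byte >> i) & 1u);  /* data, bit 0 first */
    bits[9] = 1;                                      /* stop bit, same as idle */
}
```

For example, the letter 'A' (0x41) becomes 0 1 0 0 0 0 0 1 0 1.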

The audio format is mono WAVE format, 16,000 samples per second, sixteen bits per sample. Since I am recording a square wave, this is excessive but means I can perform other unpleasantries on the audio data - filtration, compression etc - with standard tools.

RIFF files
RIFF files are a Microsoft standard packaging stream for - usually - multimedia storage. They may contain audio, video, random data, and metadata, contained in chunks. The WAVE format is reserved for audio applications.

A chunk consists of a four-byte ASCII header with the name of the chunk, four bytes (unsigned long) to give the length of the chunk data, and then the chunk data itself. A chunk can contain sub-chunks to any number or level, though it is unusual to get any deeper than three levels. Note that the WAVE ident does not have an immediate data length following it. Although MS refer to this as a chunk, my feeling is that it should be considered as part of the data - so the chunk consists of the WAVE identifier and a minimum of two subchunks - the fmt and data chunks.

The start of a minimal WAVE file looks like

RIFF		<- the riff chunk ident
xxxx		<- the length of the RIFF chunk (usually the entire remainder of the file)
WAVE		<- the WAVE chunk ident
fmt 		<- the format chunk identifier
xxxx		<- format chunk length
..
..			<- data depends on the length and type of the format chunk
data		<- the data chunk identifier
xxxx		<- data chunk length
..
..			<- many many data samples

The fmt chunk data describes the format of the data (PCM, MP3, or other compression), mono or stereo, bit rate, sample rate and so on. The audio data needs to be understood in the terms of this header. In this case, we're going for a straight-forward PCM mono signal at sixteen bits; hard to get wrong.

I should stress that the method I have used to create the file is not recommended by Microsoft (or me, really) in that all I do is preload the chunk structure as a single 'waveheader' structure. This produces a legal file, and I haven't come across any software which can't read it, but modern wave structures include extra fields. The file reader I provide in unbiphase.c can cheerfully read this format, but may break on other wave files - though it will not crash, you may find that you're treating the extra fields (or indeed, extra chunks) as audio data.
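For reference, a preloaded header of this kind might look like the sketch below. The field names and the init_waveheader helper are assumptions for illustration and may not match the attached source; the layout is the common 44-byte PCM header, and writing the struct straight to disk additionally assumes a little-endian machine with no padding between the fields.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of a minimal PCM WAVE header (44 bytes). */
struct waveheader {
    char     riff_id[4];       /* "RIFF" */
    uint32_t riff_len;         /* file length minus 8 */
    char     wave_id[4];       /* "WAVE" */
    char     fmt_id[4];        /* "fmt " (note the trailing space) */
    uint32_t fmt_len;          /* 16 for plain PCM */
    uint16_t format;           /* 1 = PCM */
    uint16_t channels;         /* 1 = mono */
    uint32_t sample_rate;      /* samples per second */
    uint32_t byte_rate;        /* sample_rate * block_align */
    uint16_t block_align;      /* bytes per sample frame */
    uint16_t bits_per_sample;  /* 16 */
    char     data_id[4];       /* "data" */
    uint32_t data_len;         /* bytes of audio that follow */
};

/* Fill in a mono, 16-bit PCM header for data_bytes of audio. */
void init_waveheader(struct waveheader *h, uint32_t sample_rate,
                     uint32_t data_bytes)
{
    memcpy(h->riff_id, "RIFF", 4);
    h->riff_len = 36 + data_bytes;     /* rest of header plus the audio */
    memcpy(h->wave_id, "WAVE", 4);
    memcpy(h->fmt_id, "fmt ", 4);
    h->fmt_len = 16;
    h->format = 1;
    h->channels = 1;
    h->sample_rate = sample_rate;
    h->byte_rate = sample_rate * 2;
    h->block_align = 2;
    h->bits_per_sample = 16;
    memcpy(h->data_id, "data", 4);
    h->data_len = data_bytes;
}
```

The two length fields are the ones that have to be revisited once the amount of audio written is known.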

The official way to deal with these files is to use Microsoft's RIFF API; creating chunks on demand to write and 'walking the chunks' to find the chunks in which you are interested and ignoring the rest. However, that option is not immediately available in the GCC libraries, and would distract from the point of the examples.
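'Walking the chunks' does not strictly need the API. A minimal sketch over a file image already loaded into memory might look like this; find_chunk and rd_le32 are hypothetical names, only top-level chunks inside the WAVE form are examined, and the sketch relies on the RIFF rule that odd-length chunks are followed by one pad byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit length regardless of host byte order. */
static uint32_t rd_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: find the chunk named 'id' (four characters) in a
 * RIFF/WAVE image. Returns the offset of the chunk data and stores its
 * length in *len, or returns -1 if the chunk is not present. */
long find_chunk(const unsigned char *buf, size_t n, const char *id,
                uint32_t *len)
{
    if (n < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;
    size_t pos = 12;                      /* first sub-chunk after "WAVE" */
    while (pos + 8 <= n) {
        uint32_t clen = rd_le32(buf + pos + 4);
        if (memcmp(buf + pos, id, 4) == 0) {
            *len = clen;
            return (long)(pos + 8);
        }
        pos += 8 + (size_t)clen + (clen & 1u);   /* skip data and pad byte */
    }
    return -1;
}
```

A reader built this way would skip any extra chunks rather than treating them as audio data.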

Coding and decoding
Coding biphase mark data is simple; one needs to arrange a phase change at the bit rate and another phase change in the middle of that period, for the '1'. How that is arranged is up to the programmer. In this version, I've just used an integer number of audio samples to time the phase change. Alternate approaches might include timer interrupts every half a bit period, with a routine that decides whether or not a phase change is required at any particular interrupt. It can also be derived in hardware, using a couple of flip-flops.
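The timer-interrupt approach can be sketched as a small state machine called once per half bit period. The struct and function names here are hypothetical; on real hardware the returned level would be written to an output pin.

```c
/* Hypothetical transmitter state for a half-bit tick. */
struct bmc_tx {
    int level;   /* current output level, 0 or 1 */
    int half;    /* 0 = next tick starts a bit, 1 = next tick is mid-bit */
    int bit;     /* the bit currently being sent */
};

/* Call once per half bit period. 'next_bit' is only consumed on ticks
 * that start a new bit (tx->half == 0). Returns the new output level. */
int bmc_tick(struct bmc_tx *tx, int next_bit)
{
    if (tx->half == 0) {
        tx->bit = next_bit;
        tx->level = !tx->level;      /* bit boundary: always change phase */
    } else if (tx->bit) {
        tx->level = !tx->level;      /* mid-bit: change phase only for a one */
    }
    tx->half = !tx->half;
    return tx->level;
}
```

Sending a one followed by a zero from an initial level of 0 gives the levels 1, 0, 1, 1 on successive ticks.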

Decoding is also reasonably simple, depending on whether you know the data rate and that it will be stable. If you do know it, then it's as easy as looking at the interval since the last phase change. If it's more than 3/4 of the bit period, you've just received a zero; otherwise, you got half of a one. If you already had the first half, this is the second half and you can return a one; otherwise, flag this as a first-half received. Note that because you don't necessarily know the phasing of an arbitrary stream of ones, it's possible that the following half-cycle might be a zero; this unambiguously identifies the zero but data prior to that must be considered suspect.
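A minimal sketch of the known-rate case, working on a list of intervals between phase changes rather than on audio samples (the function name and the interval-array interface are assumptions for the illustration):

```c
#include <stddef.h>

/* Hypothetical helper: decode n intervals between phase changes into bits,
 * given the nominal bit period in the same units. 'bits' must have room
 * for n entries. Returns the number of bits written. */
size_t bmc_decode_fixed(const unsigned *intervals, size_t n,
                        unsigned bit_period, unsigned char *bits)
{
    size_t out = 0;
    int have_half = 0;               /* set after the first half of a one */
    for (size_t i = 0; i < n; i++) {
        if (4u * intervals[i] > 3u * bit_period) {
            bits[out++] = 0;         /* longer than 3/4 of a bit: a zero */
            have_half = 0;           /* a zero also fixes the phasing */
        } else if (have_half) {
            bits[out++] = 1;         /* second short interval completes a one */
            have_half = 0;
        } else {
            have_half = 1;           /* first half of a one; wait for the rest */
        }
    }
    return out;
}
```

With a bit period of 16, the intervals 16, 8, 8, 16, 8, 8, 8, 8 decode as 0, 1, 0, 1, 1.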

If you don't know the bit rate, it's a little more complex. Now, you need not only the period of the current half-cycle but that of the previous half-cycle. If these two are the same (within 0.75 and 1.5 times) then we can safely assume that we have another of what we previously had... but we might not know yet what it is. If the current period is more than 1.5 times the previous, then we have definitely received a zero. If instead, it is less than 0.75 times the previous period, then we have definitely received the first half of a one. Only after such an event can we unambiguously decode the continuing bit stream. Once the bit stream is identified, ones and zeros will follow automatically, even if the data rate changes, provided the change is less than approximately one sixth of a bit period per bit.
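The ratio test can be sketched in the same style. Again the names are hypothetical, and anything received before the first definite long/short change is simply discarded; a finished decoder would probably want to be more careful about that.

```c
#include <stddef.h>

enum iv_kind { IV_UNKNOWN, IV_LONG, IV_SHORT };

/* Hypothetical helper: decode intervals without knowing the bit rate, by
 * comparing each interval with the one before it. iv[0] is used only as
 * a reference. 'bits' must have room for n entries. Returns the number
 * of bits written. */
size_t bmc_decode_adaptive(const unsigned *iv, size_t n, unsigned char *bits)
{
    size_t out = 0;
    enum iv_kind kind = IV_UNKNOWN;
    int have_half = 0;
    for (size_t i = 1; i < n; i++) {
        if (2u * iv[i] > 3u * iv[i - 1]) {          /* more than 1.5x previous */
            kind = IV_LONG;
        } else if (4u * iv[i] < 3u * iv[i - 1]) {   /* less than 0.75x previous */
            kind = IV_SHORT;
            have_half = 0;          /* definitely the FIRST half of a one */
        }                           /* otherwise: same kind as before */
        if (kind == IV_LONG) {
            bits[out++] = 0;
            have_half = 0;
        } else if (kind == IV_SHORT) {
            if (have_half) {
                bits[out++] = 1;
                have_half = 0;
            } else {
                have_half = 1;
            }
        }
    }
    return out;
}
```

Because each interval is only compared with its neighbour, a slow drift in the data rate is followed automatically.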

Biphase
Biphase requires two parameters: the name of the file to be encoded, and the file for the output data. It will produce a (quite large) audio file as described above. You will need to provide the complete filename; I recommend .wav as the suffix. The input file can be anything; text, executables, whatever. Note that the output file will cheerfully overwrite anything already existing with the same filename.

Most of the code is actually concerned with managing the creation of the audio file; creating the file, setting the header values, writing the header, and then after the data is written going back to the header to fill in the file sizes.

The biphase coding section is only three subroutines - and that for clarity... write_1() and write_0() make a fast cycle or a slow half-cycle respectively; 'last' holds the value written by a subroutine on its last half-cycle (and returned by the routine) so that the phase is maintained between calls. write_byte() packages a given byte by wrapping it in a start and stop bit, and then calling write_1() or write_0() as appropriate to create the audio data. It also passes the 'last' value back. Everything is wrapped up in a simple routine that reads bytes from the input file and calls write_byte() for each byte.

Sequential bytes follow with no idle time other than the stop bit which is always a one. This gives the fastest possible transmission time for the selected bit rate. The bit rate itself can be modified quite a lot; provided there's at least one sample at each level between transition times it can be decoded - so with the audio package as described, it will work as fast as 8kb/s. Of course, if you are using a direct connection between processors, the signal can be arbitrarily fast, provided you have sufficient time to decode. On a 16MHz AVR I would expect (I haven't tried!) to be able to read at least half a megabit per second.

That's pretty much it... simple!

Unbiphase
Unbiphase also expects two parameters: a mono 16-bit wave file - the sample rate is unimportant provided that the data is recorded at 1000 bits per second - and a filename to write the decoded output. It will overwrite any existing file of the same name.

The main part of the program looks after opening the audio file. As discussed above, this is not a standard way of reading RIFF files but it suffices for the purposes of this demonstration. The wave header information is read and sufficient checks are made to ensure that the file is of a suitable type. Because we have a defined bit rate, we can use the simpler of the methods discussed above to decode the signal; since we have access to the sample rate in the wave header, we can easily calculate how many samples mark the necessary periods.

Reading the data itself is broken into four layers of increasing complexity:
- At the lowest level, is_edge() decides whether the current sample represents a phase-change aka zero crossing
- get_period(), the next level up, repeatedly calls is_edge on each sample to return the period between phase changes
- get_bit() calls get_period once or twice as necessary to return a bit as either a one or a zero
- get_byte(), the highest level, unpacks the RS232-style packaging by waiting for the start bit and then shifting in the next eight bits. The stop bit provides somewhere for the next iteration of get_byte() to wait for the next start bit.
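The four layers might fit together roughly as follows. This is a sketch under stated assumptions, not the code in unbiphase.c: it reads from an array of samples instead of a file, the reader struct and reader_init are inventions for the example, and bit_period is given in samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state over an in-memory block of samples. */
struct reader {
    const int16_t *s;   /* audio samples */
    size_t n;           /* number of samples */
    size_t pos;         /* next sample to examine */
    int sign;           /* sign of the previous sample (1 = non-negative) */
};

void reader_init(struct reader *r, const int16_t *s, size_t n)
{
    r->s = s;
    r->n = n;
    r->sign = (n > 0 && s[0] >= 0);
    r->pos = (n > 0) ? 1 : 0;    /* treat sample 0 as the start of a half-cycle */
}

/* Lowest level: has the signal crossed zero since the previous sample? */
static int is_edge(struct reader *r, int16_t sample)
{
    int sign = (sample >= 0);
    int edge = (sign != r->sign);
    r->sign = sign;
    return edge;
}

/* Samples from the previous edge to the next one; 0 when the data runs out. */
static size_t get_period(struct reader *r)
{
    size_t count = 0;
    while (r->pos < r->n) {
        count++;
        if (is_edge(r, r->s[r->pos++]))
            return count;
    }
    return 0;
}

/* One bit: 0, 1, or -1 at end of data. A long second half means we were
 * out of phase in a run of ones; it is reported as the zero it must be. */
static int get_bit(struct reader *r, size_t bit_period)
{
    size_t p = get_period(r);
    if (p == 0) return -1;
    if (4 * p > 3 * bit_period) return 0;
    p = get_period(r);
    if (p == 0) return -1;
    return (4 * p > 3 * bit_period) ? 0 : 1;
}

/* Highest level: skip idle/stop ones, take the start bit, then shift in
 * eight data bits, bit 0 first. Returns the byte, or -1 at end of data. */
int get_byte(struct reader *r, size_t bit_period)
{
    int bit;
    do {
        bit = get_bit(r, bit_period);
    } while (bit == 1);
    if (bit < 0) return -1;
    int byte = 0;
    for (int i = 0; i < 8; i++) {
        bit = get_bit(r, bit_period);
        if (bit < 0) return -1;
        byte |= bit << i;
    }
    return byte;
}
```

A main routine would call reader_init() once and then get_byte() repeatedly until it returns -1.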

Decoding from a stored audio signal is simple since the timing information can be extracted effectively instantly; on something like the AVR the timing can be done several ways. If actual audio is being decoded, then the ADC complete interrupt can be used as a regular tick, or a logic-level signal can be used to drive an interrupt pin. An internal timer/counter is stored after every interrupt, and reset for the next period (the interrupt sense can be inverted after every interrupt).

The main routine simply calls get_byte() while there are still audio samples remaining.

Neil

Last Edited: Mon. Dec 3, 2007 - 05:58 PM

Biphase mark coding - part II - performance

Having shown that the data can be successfully encoded and decoded, I decided to see how much abuse the signal could take before it fell over.

The original audio file test.wav was created by encoding the source code for biphase.c; all further processing was done within Audacity under Kubuntu Linux using the standard defaults. There are a touch under ten thousand characters in the original text file.

'diff' was used to check the accuracy of the transcription; the audio after processing was exported from Audacity as a .wav file and then decoded using unbiphase.

Remember that these are all being decoded with the basic version of the code, with simplistic zero crossing/edge detection.

Test 1 - no processing (test.wav)
On inspection, the audio has a classic square wave signal shape. As expected, the file produced no change between the input and the output.

Test 2 - Low Pass (testlopass.wav)
This Audacity process - with the corner frequency set at 993Hz - shows classic exponential rise-times of the square wave after filtration. However, the decoded audio once again had no errors.

Test 3 - High Pass (testhipass.wav)
This Audacity process results, as expected, in a file full of spikes at each bit edge and little DC component; however, once again, the result decodes perfectly.

More to follow...


Biphase mark coding - part III - more performance

Since the coding survives basic frequency response changes successfully, let's have a look at more extreme changes to the signal. In all these cases, we're deliberately inducing distortions in the signal.
In these tests, we use Audacity to speed up and slow down the signal, changing the pitch. I don't know the method used by Audacity, but the normal translation method involves oversampling the original signal, digital filtration to the desired bandwidth, and resampling the resulting file. So not only do we have data edges arriving at the wrong speed, but they're likely to have added jitter - since they're now unlikely to be at exact integer multiples of samples. On top of that, we won't have exact square waves any more - as an examination of the waveform will soon show.

Speeding it up - testfast15.wav
Well, well, well... it can cope with this; no errors on decoding. At 25% (test file not included) it doesn't work at all well, though. I had expected it to cope with 25% fast, and I believe the errors are rather to do with the extra jitter and sample position rather than the limits of the decoding - remember that we are still using the simple decoder which is expecting a fixed data rate.

Slowing it down - testslow25.wav
And sure enough, once again there are no errors. Much slower than this, though, and this version will fail. However, an adaptive decoder, where the previous bit periods are used to detect one/zero differences, should be able to cope.


Biphase mark coding - part IV - Taking it to extremes

Ok, we've demonstrated that it can cope with a fair amount of mangling; what happens if we try compressing the signal?

Well, it happens that zip or rar will actually compress the original un-meddled-with square-wave signal very well indeed; it only contains two different values for the audio signal. However, more general audio compression methods are psycho-acoustic models; rather than trying to reproduce the original waveform they endeavour to produce a signal which the ear will accept as similar to the original. In this process, both the phase of the signal components, and some of the original components themselves, are lost.

Therefore, we might reasonably expect the data to be somewhat mangled after it's been through this type of perceptual encoding. In this experiment, I treat it to both mp3 coding and ogg coding. Using the default parameters for Audacity, the mp3 file is compressed at a lower bitrate than the ogg file; the original 2.3MB file compresses to about 220kB in mp3, and 400kB in ogg.

In both cases, we compress the original audio using Audacity, close the file, reopen the resultant .ogg or .mp3 file, and export the result as a .wav file - so this has tested both the compression and decompression paths.

Note that the attachment regime on AVRFreaks does not allow posting of ogg or mp3 files directly; also, if I try and rar the decoded wav files they end up the same size as the original, so instead I have renamed the original ogg and mp3 files with a .txt suffix; remove this and save before decoding.

MP3 - the people's favourite

A curious result. What we see is that the data has failed horribly - but not as might be expected. Instead of a complete failure, or randomly correct bytes throughout the file, we find that it appears to have worked in short bursts - first a few dozen random garbage characters, then a few dozen error free. I don't know why this should be, but I believe it may be due to the way the mp3 codec manages its framing gain stages.

OGG - license free compression

But here's a surprising result - a completely error free output. To be fair, the compression is not as severe as the mp3 compression, and back-to-back subjective listening tests have shown ogg to give better sound quality than mp3 at the same bit rate on the same source material.

However, it seems that if one wanted - for example - to control something using a spare audio channel, compressed .ogg files are the way to go.


Biphase-mark coding - part V - the bit you've been waiting for: AVR code

So, finally, a short application to allow the AVR to use biphase mark coded signals. I should stress that this has not been built on real hardware; it has, however, been tested under emulation/simulation on the WinAVR platform with a stimulus file created in a manner similar to the code above. The code and the associated stimulus file are attached below.

In this application, aimed at the ATmega8 running at 8MHz, the logic-level data is applied at the PD2 pin - INT0. The output is piped directly to PD0 - TX - and is presented as a bit-by-bit copy. Since the original code is at 1000 bits per second, the output is perfectly valid RS232 at 1000 baud. The start and stop bits are present in the original coding.

I'd recommend - for playing - using the audio output from a PC to deliver the code; use an op-amp or comparator to detect the zero crossings and provide a logic-level signal to the input pin.

The code is in two parts.

In main, the direction is set for Port D. We enable interrupts on INT0 on any change of logic level (which provides interrupts on every edge) and finally enable interrupts globally. We also set up timer 1: having calculated that a count of about 4,000 or 8,000 clocks will occur between edges, the sixteen-bit timer 1 is the right size, though it would be simple to use, for example, one of the eight-bit counters with a prescale of 64. Then, we just sit in a loop and amuse ourselves with anything else going on.

In the interrupt section, we decode in a very similar way - but structurally simpler - to the PC examples earlier. We can't hang around in the interrupt waiting for the second half of a '1' so instead, we hold a static flag that remembers when we've had the first half. Rather than building the data byte in memory, in this case we just spit it straight out of the TX pin, as each bit is decoded. Nothing to stop *you* managing the code, of course...
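The decision made on each edge can be sketched as a plain C function so that it can be tried on a PC. This is not the attached AVR source: the name on_edge and the return convention are assumptions, and on the AVR the same logic would sit inside the INT0 handler, be given the timer 1 count, and be followed by a timer reset and a write to the TX pin.

```c
#include <stdint.h>

#define BIT_CLOCKS 8000u    /* 8MHz clock / 1000 bits per second */

/* Hypothetical core of the edge interrupt. Call on every edge with the
 * number of clocks since the previous edge. Returns 0 or 1 when a bit
 * has been decoded, or -1 when only the first half of a one has arrived. */
int on_edge(uint16_t clocks_since_last_edge)
{
    static uint8_t have_half = 0;    /* remembers the first half of a one */

    if (4ul * clocks_since_last_edge > 3ul * BIT_CLOCKS) {
        have_half = 0;
        return 0;                    /* long interval: a zero */
    }
    if (have_half) {
        have_half = 0;
        return 1;                    /* second short interval: a one */
    }
    have_half = 1;                   /* first half; nothing to output yet */
    return -1;
}
```

Nothing waits inside the function; the static flag carries the half-received one across to the next edge.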

Conclusion
So, here it is. I think I've demonstrated that biphase mark encoding is a versatile, robust, and easy method of moving data or commands around. It's easy to decode on an AVR (there are only fifteen lines of actual code in the routine described above!) and uses very little processor time. It will work over a huge range of speeds, and over very long distances (I've actually used it over an audio circuit between the BBC's Delhi office and Bush House in London). It can be carried as a logic signal, as audio, over wire, optical fibre, or radio.

It's not as fast as USB2, or even I2C. The AVR can manage RS232 data considerably faster, certainly faster than you could decode biphase mark - but biphase mark is much more robust in terms of data rate, jitter, and changing clock rates.

In a production environment, you'd want to stiffen things up a little. You'd probably want to use an algorithm which compares the length of a previous cycle to decide whether a particular bit is a one or a zero, rather than using an absolute length. You'd probably want to provide some sort of packaging for the message - framing information, a more complex protocol, whatever. This is what you would do anyway in any comms method where you don't have full control over both ends of the link, and the link itself.

Enjoy.

Neil


Sir, how can I decode wave sounds that have jitter? How can I eliminate the jitter? Using an oscilloscope?
thx