Possible bug in atmega128


Hi. I think I have found a bug in the ATmega128. Here is the situation:

I get RANDOM erratic values in r10 and r11 in my Timer 1 interrupt. After a long debugging session, I have concluded that there is a problem with these two registers on the ATmega128.

Here is my code.

First, I initialise my Timer 1 interrupt routine and also the stack. After all that, I put an endless loop in the main program, like this:

START:
rjmp START

In my interrupt routine, I have this code:

Interrupt1:

push r10
push r11

ldi r10,0xff ;init the two registers to 255
ldi r11,0xff

; here is the test routine just to know if the CPU is doing something

lds r2,testvar
inc r2
sts testvar,r2
out porta,r2

;here is the trap to see if there is a problem with r10 and r11
mov r20,r10
cpi r20,0xff
brne DoNothing
mov r20,r11
cpi r20,0xff
brne DoNothing
pop r11
pop r10
reti

DoNothing:
rjmp DoNothing

Now, guess what? I monitor the PORTA output, and after a short, random time it stops incrementing.

But when I run the same thing in the AVR Studio simulator for two entire days, all is OK.

If someone can confirm whether this is a known problem with the ATmega128 or any other ATmega, please let me know... Otherwise, will I have to send it to Atmel?

Thanks.

Cedric, www.innovativedevice.com


Hi,

You exit the interrupt doing nothing. And exactly then you don't fetch back r10 and r11. This causes an interrupt overflow.

Klaus
********************************
Look at: www.megausb.de (German)
********************************


ldi register,value is only valid for R16-R31.
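
The restriction follows from the opcode layout. Below is a small host-side C sketch (not AVR code) that assumes the LDI encoding `1110 KKKK dddd KKKK` given in the AVR instruction set manual; the function name `avr_encode_ldi` is hypothetical and only serves the illustration. With four bits for the register, and those bits holding Rd - 16, only r16..r31 can be expressed.

```c
#include <assert.h>
#include <stdint.h>

/* Encode "ldi Rd, K" as the 16-bit opcode 1110 KKKK dddd KKKK.
 * Returns the opcode, or -1 if the register cannot be encoded.
 * Only four bits (dddd) are available for the register and they hold
 * Rd - 16, so the instruction can only address r16..r31. */
long avr_encode_ldi(unsigned reg, uint8_t k)
{
    if (reg < 16 || reg > 31)
        return -1;                                  /* "ldi r10,0xff" ends up here */

    unsigned d = reg - 16;                          /* 0..15, fits in dddd */
    unsigned opcode = 0xE000u
                    | ((unsigned)(k & 0xF0u) << 4)  /* high nibble of K -> bits 11..8 */
                    | (d << 4)                      /* register      -> bits 7..4  */
                    | (unsigned)(k & 0x0Fu);        /* low nibble of K  -> bits 3..0  */

    assert((opcode >> 12) == 0xEu);                 /* fixed LDI prefix is intact */
    return (long)opcode;
}
```

Calling `avr_encode_ldi(10, 0xFF)` returns -1, which mirrors the assembler error, while r16 and above encode normally. Loading a low register therefore takes two instructions: `ldi` into r16..r31, then `mov`.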


No, because normally the CPU doesn't go there, since the values of r10 and r11 are never changed... It's just a trap to see if there is a problem. The problem is that the CPU does go there because r10 and r11 are not equal to 255, even though nothing in the code changes the register values.

Anyway, maybe the bug is somewhere else in the code... the program is so complex...

Cedric, www.innovativedevice.com


mabcom has a good point. You can not have assembled the code you wrote without an error - there is no instruction code for "ldi r10,0xFF". If you think you're running this code, single-step through it and see what happens when you reach the "ldi r10" line.


True, I have not assembled the code. I just composed it on the fly to show you the problem, because the original code is too long and too messy to post here. Just replace the LDI r10,0xff with:

ldi r16,0xff
mov r10,r16

Cedric


Are there other interrupts running? Corrupting SREG?

Klaus
********************************
Look at: www.megausb.de (German)
********************************


Hi Folks..

OK, I have found the problem. And I think the solution can be useful for a lot of people:

The problem was real. There was false, erratic data in a few registers, and I tried to debug it all without success...

I drive very fast outputs on two ports (one output every 4 clocks at 18 MHz).

So this creates a lot of noise. In a noisy environment, you MUST PROGRAM the CKOPT fuse... It's the second time I have had bugs because of this problem...

The CKOPT fuse gives a rail-to-rail output swing on the crystal oscillator, useful especially in very noisy environments... (Moreover, I'm overclocked... That can't help the crystal stability...)
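
For readers who want to work out the fuse value, here is a minimal host-side C sketch. It assumes, from the ATmega128 datasheet, that CKOPT is bit 4 of the high fuse byte and that the factory default high fuse byte is 0x99; check both against the datasheet for your exact part before writing any fuses. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions taken from the ATmega128 datasheet; verify for your part. */
#define HFUSE_CKOPT_BIT  4u      /* CKOPT position in the high fuse byte */
#define HFUSE_DEFAULT    0x99u   /* factory default high fuse byte       */

/* AVR fuses are inverted: "programmed" means the bit reads as 0. */
uint8_t fuse_program_bit(uint8_t fuse, unsigned bit)
{
    assert(bit < 8);
    return (uint8_t)(fuse & ~(1u << bit));
}

/* Returns nonzero if the given fuse bit is programmed (reads as 0). */
int fuse_is_programmed(uint8_t fuse, unsigned bit)
{
    assert(bit < 8);
    return (fuse & (1u << bit)) == 0;
}
```

Under those assumptions, `fuse_program_bit(HFUSE_DEFAULT, HFUSE_CKOPT_BIT)` gives 0x89, the high fuse byte with CKOPT programmed and every other fuse left at its default.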

Thanks anyway for your help.

Too often we search the code when the bug is in the hardware itself...

Cedric, www.innovativedevice.com


A bug in the hardware - yes, your hardware, not the CPU. All bets are off when you start overclocking processors. Some may work well, others may fail miserably.

Randy


hi,

same as randy.

Quote:
MUST PROGRAM the CKOPT

This is not my experience.
I have switched high-power lines (2500 V / 4000 A) next to an AVR and never had problems with the noise. A proper layout is recommended. But if my AVRs fail in future, thanks for the hint.

Klaus
********************************
Look at: www.megausb.de (German)
********************************


I doubt that highly. You switched 10 megawatts how close to the AVR? I don't have my field strength calculator handy, but I'm sure the AVR was quite some distance away.

Go electric!
Happy electric car owner / builder


mast3rbug

I found a bug in the mega128 too, back when I was in my senior year of college doing a project on a motor controller.

The problem is that I can't change the output signal while I'm in the interrupt routine (a timer interrupt). I also think that is a CPU bug. I use a very fast crystal, about 33 MHz.

My code was written in C with the GCC compiler. Someone thought it was a code optimization problem, but the code I tested is short, about 10 lines.

I am not sure that this is the same problem.

"Chill out with Atmel Corp."
- Scud88.


I think you should re-read Randy's post above.

Go electric!
Happy electric car owner / builder


I don't think it's a problem to overclock an ATmega128 to 18 MHz. Do you remember the ATmega128 cards that were used for watching satellite TV illegally? All those cards were clocked at 18 MHz without problems.

Cedric, www.innovativedevice.com


I will restate what has already been said for effect.

[rant]
The mega128 has an ABSOLUTE MAXIMUM CLOCK FREQUENCY OF 16 MHz. If you decide to overclock the processor, whether by 2 MHz or by 16 MHz, the result is the same: ALL BETS ARE OFF - the processor MAY FAIL. These are not bugs, these are YOUR DESIGN ERRORS because YOU have EXCEEDED THE ABSOLUTE MAXIMUM RATINGS. Just because it worked in one case does not mean it will work in all cases.
[/rant]

Writing code is like having sex.... make one little mistake, and you're supporting it for life.


Quote:

in my Interrupt routine, I have this code:

Interrupt1:

push r10
push r11
...

No real need to go any further, is there? No preservation of SREG.
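
To see why an ISR that touches the flags must save SREG, here is a host-side C model (not AVR code). It assumes only that the Zero flag is bit 1 of SREG and that `cpi` and `inc` both update it; the struct and function names are hypothetical. The thread's own main loop is a bare `rjmp`, so this shows the general hazard rather than the symptom reported above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SREG_Z 0x02u   /* Zero flag: bit 1 of the AVR status register */

struct cpu { uint8_t sreg; };

/* Model of "cpi Rd,K": Z is set when the operands are equal. */
static void op_cpi(struct cpu *c, uint8_t rd, uint8_t k)
{
    assert(c != NULL);
    if (rd == k) c->sreg |= SREG_Z;
    else         c->sreg &= (uint8_t)~SREG_Z;
}

/* Model of "inc Rd": Z is set when the result is zero. */
static uint8_t op_inc(struct cpu *c, uint8_t rd)
{
    assert(c != NULL);
    rd = (uint8_t)(rd + 1u);
    if (rd == 0) c->sreg |= SREG_Z;
    else         c->sreg &= (uint8_t)~SREG_Z;
    return rd;
}

/* Handler that increments a counter, like the test code in the first post. */
static void timer_isr(struct cpu *c, uint8_t *counter, bool preserve_sreg)
{
    uint8_t saved = c->sreg;            /* models: in r16,SREG on entry   */
    *counter = op_inc(c, *counter);     /* clobbers Z                     */
    if (preserve_sreg)
        c->sreg = saved;                /* models: out SREG,r16 on exit   */
}

/* Main-line code compares two equal values, is interrupted, then tests Z.
 * Returns true if a following "breq" would still be taken. */
bool branch_taken_after_interrupt(bool preserve_sreg)
{
    struct cpu c = { 0 };
    uint8_t counter = 5;

    op_cpi(&c, 0xFF, 0xFF);                     /* Z = 1: values are equal   */
    timer_isr(&c, &counter, preserve_sreg);     /* lands between cpi and breq */
    return (c.sreg & SREG_Z) != 0;
}
```

With `preserve_sreg` false, the handler's `inc` clears Z and the interrupted code takes the wrong branch even though the compared values were equal; with it true, the branch is unaffected.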

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


@SGOMES.

It's installed in a silicon-producing plant.
In fact it's not 10 MW, because we have different voltages and currents.
The power transformer is about 1 MW: 2500 V @ 400 A, the other windings 200 V @ 4000 A. The copper wires (not in the transformer) are two lines of about 20 mm x 150 mm, water cooled, carrying the 4000 A. 100 Hz switching frequency. The AVR is about 300-500 mm away, in an ABS case with NO additional shielding. Data signal lines are fiber optic. Supply is from 24 V DC or 230 V AC with a specially made transformer (low AC coupling PRI-SEC). Sounds unbelievable, but it works fine (so far). But I know it is critical, and maybe changing a single wire could be a problem.

Klaus
********************************
Look at: www.megausb.de (German)
********************************


mast3rbug wrote:
In a noisy environment, you MUST PROGRAM the CKOPT fuse... It's the second time I have had bugs because of this problem...

This almost matches the description given in the data sheet word for word, why call it a bug?

scud88 wrote:
I found a bug in the mega128 too, back when I was in my senior year of college doing a project on a motor controller.

The problem is that I can't change the output signal while I'm in the interrupt routine (a timer interrupt). I also think that is a CPU bug. I use a very fast crystal, about 33 MHz.

My code was written in C with the GCC compiler. Someone thought it was a code optimization problem, but the code I tested is short, about 10 lines.

Sounds more like abuse and bad programming, why blame Atmel?


The problem doesn't come from the overclocking. The problem was the same at 16 MHz.
The problem was really the noisy-environment clock option. My design is very noisy, like my other past design, which was not overclocked. Anyway, the only known problem with the ATmega128 @ 18 MHz is access to the internal EEPROM, and I don't use it. Anyway, my project can't run at a lower speed than that. I must live with that.

Cedric. www.innovativedevice.com


glitch
There are none so deaf as those who will not listen.

Keep it simple it will not bite as hard


@mast3rbug: 1) Please don't design any medical equipment EVER. 2) When you get out into the "real" world you are going to have design reviews. They are just part of the job. During these reviews, engineers with WAY more experience and knowledge are going to tell you to make changes you might not understand. JUST DO IT.

@MegaUSBFreak: I stand corrected! Scary stuff! I design laser controllers for a living. We switch kilowatts and have trouble with emissions. And of course from my sig you can tell I have a car that switches tens of kilowatts - can't even listen to the radio in that bad boy. I congratulate you on your success! I'm sure it was tough!

Go electric!
Happy electric car owner / builder


Atmel bug? Don't think so...

"Stuck on stupid"? Ah ha...

If you push the product beyond the guaranteed specifications, don't be blaming the maker of an excellent product for your own stupidity!!!

Finding a true bug in any of Atmel's devices is about as likely as me being the next president of the United States.

And if there is a design bug, find another way!

You can avoid reality, for a while.  But you can't avoid the consequences of reality! - C.W. Livingston


@sgomes:

Quote:
Please don't design any medical equipment EVER

I totally agree with you.

High current switching:
I think we don't have the HF pulses you get with laser pulsing.
We switch at 100 Hz and without that high a dU/dt. That helps a lot.

Klaus
********************************
Look at: www.megausb.de (German)
********************************


Hope you weren't confused. The "please don't design any medical equipment EVER" comment wasn't meant for you. Although if you can't avoid Megawatts in your designs I guess it COULD apply to you too.. hehehe :lol:

Go electric!
Happy electric car owner / builder


@sgomes

Quote:
comment wasn't meant for you

I know.

Klaus
********************************
Look at: www.megausb.de (German)
********************************


Microcarl: Just take a look at the Atmel website and check for the WORKAROUND PDF... You will see a lot of bugs, and that's normal. Designing a microcontroller is not an easy task. The real bugs often come out in the real world, on real designs. So don't talk about things you don't know. There is already one workaround found by me on the Microchip website. Who is stupid?

Sgomes: I know that. The present design is not going to be released as a medical or mass-market product. I did not want to talk about it right away, but I can explain it briefly and you will understand:

I'm making a color videogame system (8 colors) with all the sync generated in software. The game system will have a complete set of libraries/functions to make a complete game easily, including music, fast scrolling and sound effects. All software generated.

-96x96, 8-color, interrupt-driven video map with all the library routines for block copy/move/collision checking.

-3 sound channels (one square, one sawtooth, one sine wave, with special effects on each channel, and 7 complete octaves)

This kit will be available soon as a learning kit on how to program videogames on hardware. I'm presently working on the Bitmap Exporter utility. Here is a screenshot of it.

I will post short videos of the graphics engine in a new post in a few days. I will let you know.

Cedric, www.innovativedevice.com



I find my Gameboy Advance works better as a game machine than does my Mega128!

Where are the monkeys and barrels?


Kartman: I really don't want to compete with any other game system. It's just for fun, and for people who want to learn how games work.

A lot of people know ATmega ASM, but the Game Boy Advance? Hmm...


--Changes in operation when outside specified parameters, especially "Absolute Maximum Ratings", are NOT "bugs".

--As I said, a "bug" in an ISR where there is no consideration to SREG is NOT a "bug".

--Given the errata lists on the AVRs I have worked with, I would not say that AVRs have an excessive number of items; I wouldn't even say "lots" as you did. I haven't dug up the "workaround" pdf -- neither Google nor a site search uncovers any document of that name on Atmel's site -- so I can't comment further. You imply that it is a composite document. If it indeed covers every Atmel microcontroller model in every revision and every speed grade then yes, I guess I would expect quite a few entries. Hundreds of new introductions in the last 10 years + a few errata in first silicon = quite a few entries, just to start. That still ain't lots of bugs.

[rant] If you know that your design is noisy (when all bets are off anyway--it AIN'T BUGS when you have ground bounce, etc.) and you know you are running outside stated parameters, WHY did you even post "Possible bug in atmega128"? Even if we DID all agree with you wholeheartedly, what do you expect us to do about it? [/rant]

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


Theusch, hear, hear!!!

mast3rbug,

Quote:
Microcarl: Just take a look at the Atmel website and check for the WORKAROUND PDF... You will see a lot of bugs, and that's normal. Designing a microcontroller is not an easy task. The real bugs often come out in the real world, on real designs. So don't talk about things you don't know. There is already one workaround found by me on the Microchip website. Who is stupid?

And overclocking a component beyond the manufacturer's specifications and then finding that the device doesn't work properly isn't a bug. I personally view this practice as very risky business - a practice looking for a disastrous opportunity.

There is no doubt that Atmel has experienced bugs with their products. But when one doesn't follow the rules, that is not good design practice, and it's not Atmel's problem.

While you think I don't know what I'm talking about, I've been working in the electronics field and doing embedded design since the mid 1970's - starting with the Fairchild F8, then on to the Z80, the 6502, the 6800, the 68HC11, and now the Atmel line is my controller of choice. Every design that I have undertaken has worked. I have never found a real bug in a CPU/MCU. But then too, I don't violate the manufacturer's specifications. I have found issues that I didn't particularly care for, though.

And then too, I'm not one of those who are continually coming on to this site wanting answers to even the simplest of things.

As to the very first question at the beginning of this thread:

Quote:
I get RANDOM erratic values in r10 and r11 in my Timer 1 interrupt. After a long debugging session, I have concluded that there is a problem with these two registers on the ATmega128.

Obviously, you should spend much more time reading the datasheets and learning how interrupts work - before you start claiming "It's a Bug".

I have completed several Mega 8535, Mega 32, Mega, Mega 64, and Mega 128 designs, all with multiple interrupts running simultaneously - without conflicts or inconsistencies.
A couple of these designs run in an industrial environment 24/7. They are one of the few designs that never have to be worked on in the facility.

You can avoid reality, for a while.  But you can't avoid the consequences of reality! - C.W. Livingston


I did not include the SREG backup in the example because it was not necessary for the example. It was only a minimal reconstruction of the situation, to demonstrate that the two registers changed INSIDE the interrupt, even after loading new data. Also, all these problems occurred at 16 MHz. OK, my design runs at 18, but not in the development phase.

ldi r20,255
mov r10,r20

few nop...

cpi
and here, the data is changed without changing it..

reti...

And it was not an overclock problem, just a fuse setting.

The register changed its value BECAUSE my hardware is very noisy, and not because of SREG, which is saved in my code but not in the example.

Anyway, it appears that you don't want to understand.


In either case, it is still not a bug in the silicon, but rather a bug in your implementation of the hardware. Next time be more careful about declaring a bug in the chip. While Bugs do exist, and one might find them, unknown ones are pretty rare, especially for a chip that's been out as long as the mega128. So when you encounter a problem, be sure you exhaust all possibilities, before claiming to have found a bug in the silicon.

Writing code is like having sex.... make one little mistake, and you're supporting it for life.


Quote:

Anyway, it appears that you don't want to understand.

Oh, no I do. I fully understand that you are crying Wolf on a silicon bug without having any controlled conditions to demonstrate it.

And if you want positive responses from me, you need to post code that exhibits the situation.

So let's say I'm using an AVR or any other chip that is rated to 85C. I run them at 100C, and then complain on newsgroups and to the manufacturers that there is a silicon bug?

AVRs aren't particularly sensitive to noise, probably due in part to the rather sedate frequencies that they run at. They probably aren't the champions in that area. I can say from several production designs in noisy environments that the AVRs were not the weak link in any of the designs when the root cause(s) were ferreted out. Heck, we were getting noise spikes up to 50V from an AC drive until we tamed it, which beat the heck out of DS485 transceivers. But never had it actually damage an AVR. Yes, signals moved wrt ground and it >>loooked<< like the AVR was screwing up, but it was really only acting on the signal level presented at the input pin--just as the datasheet said.

Hmmm--maybe I should post some piece of crap program and claim that model x and model y and model z of AVR has a severe silicon problem, and don't ever use them. [I'll conveniently use the models that we need for production but are in short supply.]

Then outerspace will read the posts, and shelve all of his AVRs of that model.

Weird world.

Actually, I retract my statement above, there IS one thing I don't understand. Exactly what did you attempt to accomplish with your post? What would have made you happy to see me type?

Lee

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


My 2 cents …

mast3rbug wrote:
And it was not an overclock problem, just a fuse setting.

The register changed its value BECAUSE my hardware is very noisy, and not because of SREG, which is saved in my code but not in the example.

Anyway, it appears that you don't want to understand.

As you keep pointing out, your problem is caused by a noisy environment.

WHY IS THIS A CHIP BUG?

mast3rbug wrote:
So this creates a lot of noise. In a noisy environment, you MUST PROGRAM the CKOPT fuse... It's the second time I have had bugs because of this problem...

The CKOPT fuse gives a rail-to-rail output swing on the crystal oscillator, useful especially in very noisy environments...

The data sheet discusses how to deal with noisy environments.

BUT YOU KEEP CLAIMING IT'S A CHIP BUG.

You made a mistake, everyone does now and then, not a big deal (even though you were reckless and irresponsible to blame your error on a chip bug).

Your ‘staying the course’ is what smells.

squiggy


??

Yeah, I said that in the first message only... Once I had found the bug, all was OK.

I know that the problem was in my hardware. So what is the problem?

When I talked about the CKOPT fuse, it was just to explain what I did to fix the problem.

The CKOPT fuse is for that, and I know it.


A question just for fun: does anyone know the percentage of hobbyists using Atmel micros vs. Microchip?

And what about the industry?

Personally I prefer Atmel. A little more power, and a more flexible instruction set.