Lose RS485 wire A in one direction, not the other


Here's one that's got me stumped.

 

I have an RS485 link from my head end controller to my pool's RTU.  It runs at 19.2k baud and the cable length is about 50 feet.

 

Here's the cable:

https://www.markertek.com/produc...

 

This link has been running 24/7 for several years without a problem up until the other day where I no longer get readings displayed on the touchpanel.

 

The RTU is indicating that it is getting valid packets from the head end and is responding to them.  Its TX and RX LEDs are driven by the AVR, so if they are not lit, it means the AVR is not getting something it recognises.

 

At the Head end I left out the LEDs - Stoopid me - but a scope and Logic Analyser show that I am not seeing return data from the RTU on the "A" wire.  Hmm, Driver chip died?  Ok, I'll replace it even though the RTU says all is good.

 

I replaced the Driver in the RTU and things are still the same, no data on the "A" wire.  I metered the lines and the data lines show 60 ohms, which is correct as the 120 ohm termination resistors across the A and B lines sit in parallel.  The 18 gauge lines provide power to the RTU and they meter out good too.  I do not see a short from either data line to the shield either.

 

Next I replace the Driver at the Head End, but still, NO DATA ON THE "A" wire at the head end.  I see data leaving the Driver at the Head end, but I only see a reply on wire "B"....WTF?

 

So, I take the Laptop and the Logic Analyser and go to the RTU at the pool.  Put the Analyser on the "A" and "B" lines and I see Data coming FROM the Head End on BOTH "A" and "B" and I see data LEAVING the RTU on both the "A" and "B" lines.

 

How the hell is that possible?  I get data on both lines in one direction, but on the reverse only one line?

 

I yank the termination resistors and no change.  Same thing.  No data on the "A" line in the reverse direction.  The RTU is Sending the data, but it's not showing up on the "A" line.

 

A cursory check of the wire shows nothing, but I will look again in a little while.  I do not expect to see any damage as the lines are all metering out ok, but who knows.

 

My question is, Has anyone ever seen this before?  Data from the head end goes through to the remote, remote responds, but data only comes through on one of the lines?

 

I have confirmed that the RTU driver is operating by connecting the logic analyser at the connectors pins and I see data on both "A" and "B" lines.

 

As a last ditch effort I can change the line out, but I would like to see if I can determine why this is happening first.

 

Jim

 

This topic has a solution.

If you want a career with a known path - become an undertaker. Dead people don't sue! - Kartman

Why is there a "Highway to Hell" and only a "Stairway to Heaven"? A prediction of the expected traffic load?  - Lee "theusch"

 

Speak sweetly. It makes your words easier to digest when at a later date you have to eat them ;-)  - Source Unknown

Please Read: Code-of-Conduct

Atmel Studio6.2/AS7, DipTrace, Quartus, MPLAB user

Last Edited: Thu. Aug 9, 2018 - 03:07 AM

I had a somewhat similar situation, might have even been the A side... I kept swapping parts and finally tracked it down a failed leg of a TVS diode. 


Thanks for the feedback!

 

I have two issues against your suggestion....

 

1) I am getting data out of the driver on both A and B from the RTU at the RTU.  I do not have data at the Head end A line.

 

2) I do not have any TVS/suppression diodes on this link!  OOPS!  The RTU was made with spare parts I had lying around, and the Head End not having any was a brain fart

 

I pulled the RTU from the pool area, and pulled back the communications line.  Cursory inspection checks out ok, now to put it on the test bench and see what's going on.

 

 

Jim


Hairline break in the A line, acting like a cat's whisker, or a 'fox hole radio'?  Long shot, I know ;-) ... but if so, and if the break was disturbed when you pulled back the line, then the behaviour may cease.  Either it may not work at all, or it may work fine.

"Experience is what enables you to recognise a mistake the second time you make it."

"Good judgement comes from experience.  Experience comes from bad judgement."

"Wisdom is always wont to arrive late, and to be a little approximate on first possession."

"When you hear hoofbeats, think horses, not unicorns."

"Fast.  Cheap.  Good.  Pick two."

"Read a lot.  Write a lot."

"We see a lot of arses on handlebars around here." - [J Ekdahl]

 

Last Edited: Mon. Sep 25, 2017 - 02:04 AM

Did you ever sort this out?

No, as I have been dealing with some other pressing issues that were/are more important.

 

I did move the cable in question, and the RTU, to the lab and set it up right next to the head end, and it still does it.  I also noticed that the same thing is happening on a 3 metre long piece of cable as well.  There is also a small hiccup at the end of both the poll and the response that does not seem to affect the RTU, but might be causing the Head end to take issue.  I will try and get to this in the next few weeks and report back.

 

Jim


Been a while since I reported back on this issue. 

 

I have the RTU and the suspect cable connected to the head end controller and things were still not right with the A line.  I started poking around some more and could not find anything wrong with the RS485 line.  EVERYTHING meters out as OK.  All connections are where they belong, and are solidly made.

 

What I DID find is the ground/shield line for the RS232 line to the Crestron system was loose.  I re-terminated the two data lines and the ground/shield and...........things are back to working again!

 

 

For the life of me I cannot understand why the ground/shield for the RS232 affected the RS485, and ONLY the A wire, and in ONLY one direction.  The ground/shield for both the 485 and the 232 are connected together on the PCB and to the power supply ground.  Both Driver chips are +5vdc powered and both are powered from the same supply.

 

If time allows I will look into this some more as I cannot come up with a viable reason for the cause/effect on this.

 

Cannot mark SOLVED just yet

Jim


  

David (aka frog_jr)


Are the RS-232 and RS-485 cables running together?  I've had cases where the RS-232 was spiking other signals in the same cable.


No. Separate lines.

Jim


Pool system RTU has been running for a few days now in its normal environment and all seems fine.  I have no idea why that ground wire on the RS-232 line messed up the A wire on the RS-485 line but it is what it is I guess.

 

Just wanted to close the thread.

 

Jim


This reply has been marked as the solution. 

Well well well, it's back again, and this time I think I found the problem.

 

We had a violent rain/lightning storm last night.  So bad that two of the 485 Drivers in the network took one for the team.  After replacing them, everyone came back online EXCEPT the Pool RTU.  Same issue.  I can see the data TX and RX LEDs blinking, but the head end Crestron panel is not displaying temperatures or pressures.

 

WTF!?  I thought this was fixed.

 

Ok, so I get the logic analyser out and throw it on the comms lines and I can see the packets and they look ok.  I then check the RX pin of the 485 driver that feeds the Host AVR, and I notice that the analyser is reporting a framing error in the delimiter byte!  That's not good, as it means the packet will be rejected and hence nothing shows on the display panel.  Why is this happening?  Hmmmmm...  Lemme take a peek at the code in the RTU that does the transmitting and see what I have going on.

 

#define F_CPU 4915200ul
.
.
.
.
.
.

COMM_485 = COMM_485 | (1<<COMM_485_TR);		//set driver for transmission
for(uint8_t i = 0x00; i < 24; i++)
{
	putchar0(sense_buffer[i]);
}//end of for loop

putchar0(0x0D);		//send delimiter
_delay_ms(1);		//WAIT BEFORE FLIPPING THE PIN
COMM_485 = COMM_485 & ~(1<<COMM_485_TR);	//back to receive mode

Ok, this all looks fine to me.  Lemme throw the analyser on the RTU's driver and see what spits out........HEY WHAT THE F........!?

 

_delay_ms(1) is by no means a 1ms delay!  More like 15 MICROseconds!  That's not gonna work.

 

I change it to 10ms and THAT works.  The analyser shows 9.9ms and the screen is back!  Yipee, but that does not answer why 1ms is 15us.

 

I change _delay_ms(10) to _delay_ms(2), and take a peek...not much better...35 microseconds.

 

I change _delay_ms(2) to _delay_ms(3), and take a peek...better...1.25 milliseconds, and the screen is back.

 

I change _delay_ms(3) to _delay_ms(5), and take a peek... better...4.8 milliseconds, and the screen works.

 

So it would appear that the issue is that the RTU is releasing the line BEFORE the last byte has completely left the AVR.

 

Now, for those of you C programming x-perts I went and cleaned up my act and did this:

COMM_485 = COMM_485 | (1<<COMM_485_TR);		//set driver for transmission
for(uint8_t i = 0x00; i < 24; i++)
{
	putchar0(sense_buffer[i]);
}//end of for loop

putchar0(0x0D);		//send delimiter
while ( !( UCSR0A & (1<<TXC0)) );	//wait for complete transmission
COMM_485 = COMM_485 & ~(1<<COMM_485_TR);	//back to receive mode
UCSR0A = UCSR0A | (1<<TXC0);	//clear the flag (TXC0 is cleared by writing a 1 to it)

And this too works perfectly so this is the way it will stay.
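
One caveat with polling TXC0 this way: the flag can already be set when the wait starts if an earlier frame finished while the data register was empty, for example when an interrupt stalls the loop for longer than a frame time. A minimal sketch of one way to guard against that is to clear the flag right after queuing the final byte. The rs485_hal_t hooks and the fake_* helpers below are hypothetical stand-ins so the logic can be run on a PC; they are not part of the RTU code. On the AVR the hooks would wrap COMM_485_TR, putchar0() and the TXC0 bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware hooks (assumed names, not from the RTU source). */
typedef struct {
    void (*driver_enable)(bool on);   /* true = transmit, false = receive */
    void (*put_byte)(uint8_t b);      /* blocks until the data register is empty */
    void (*clear_tx_complete)(void);  /* on the AVR: write a 1 to TXC0 */
    bool (*tx_complete)(void);        /* on the AVR: UCSR0A & (1 << TXC0) */
} rs485_hal_t;

void rs485_send(const rs485_hal_t *hal, const uint8_t *buf, size_t len)
{
    assert(hal != NULL && buf != NULL);
    if (len == 0) {
        return;
    }
    hal->driver_enable(true);
    for (size_t i = 0; i < len; i++) {
        hal->put_byte(buf[i]);
        if (i == len - 1) {
            /* Drop any stale flag right after queuing the final byte, so the
             * wait below can only be satisfied by this byte's stop bit. */
            hal->clear_tx_complete();
        }
    }
    while (!hal->tx_complete()) {
        /* busy-wait until the last stop bit has been shifted out */
    }
    hal->driver_enable(false);
}

/* ---- Host-side fake, only so the sketch can be exercised on a PC ---- */
static char   fake_log[32];     /* 'E' enable, 'W' write, 'C' clear, 'D' disable */
static size_t fake_len;
static int    fake_polls_left;  /* tx_complete() reports false this many times */

static void fake_note(char c)
{
    if (fake_len < sizeof fake_log - 1) {
        fake_log[fake_len++] = c;
    }
}
static void fake_enable(bool on) { fake_note(on ? 'E' : 'D'); }
static void fake_put(uint8_t b)  { (void)b; fake_note('W'); }
static void fake_clear(void)     { fake_note('C'); fake_polls_left = 3; }
static bool fake_done(void)      { return fake_polls_left-- <= 0; }

const rs485_hal_t fake_hal = { fake_enable, fake_put, fake_clear, fake_done };
```

Sending three bytes through fake_hal leaves the trace E W W W C D in fake_log. On real hardware the write of the final byte and the clear should happen back to back (interrupts briefly off); if the clear were delayed past the end of the frame it would wipe the genuine flag and the wait would never end.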

 

Incidentally with this change the tx/rx line flips 52 microseconds after the delimiter completes transmission.  Which has me wondering if the 15 microseconds I was originally getting was the edge of tolerance for the packet to go through successfully.

 

As an aside observation I failed to mention in the OP, the Crestron Display screen behavior was always erratic at times.  The readings would scramble, and it would take a minute or two to resync the display.  I always thought this was because of what I had to do in the Crestron Programming to get the gear communicating, but now I know it was the RTU the whole time.  The display is far more stable and has hiccuped only once, but recovered in a couple of seconds.

 

Jim


Last Edited: Thu. Aug 9, 2018 - 03:06 AM

For lightning protection I use tranzorbs on each end and use a 0 volt wire. It is the wiring that picks up the lightning, so the tranzorbs channel the energy back into the wire rather than through your 485 transceivers.
I had a site where the 0V wasn't wired - it died in a thunderstorm. It was only on investigation that I found there was no 0V wire, as the engineer thought it wasn't necessary. Other sites had the 0V wire and survived.


Nice Troubleshoot'n there Jim!

 

Thanks for the update.

 

 

 

Jim

 


 

 

 


Kartman wrote:
For lightning protection I use tranzorbs on each end and use a 0 volt wire.

I usually do as well, but this was a throw-together of parts I had lying around, and the original intent was to stay inside the house, so adding the pool RTU was not in the spec.  I am going to add some external tranzorbs to the connector with the next Digikey order.

 

The lightning storm was particularly violent.  So much so that it knocked out a substation for the Long Island Railroad - Burned it out pretty bad - that has been causing horrific delays for three days now:

https://newyork.cbslocal.com/201...

 

 

Jim


jgmdesign wrote:

_delay_ms(1) is by no means a 1ms delay!  More like 15 MICROseconds!

 

Are you going to investigate why that happened ?


angelu wrote:

jgmdesign wrote:

_delay_ms(1) is by no means a 1ms delay!  More like 15 MICROseconds!

 

Are you going to investigate why that happened ?

 

I thought about it, and then thought that if I simply put in the thread what I found....9 times out of 10 someone here knows the answer already and would chime in.  Quite frankly I would not know where to start digging.....Where's Cliff?

 

Jim


I usually do RS485 direction handling in the Transmission complete interrupt, which is called (if enabled) after the last stop bit has been shifted out and there is no more data in UDR.

 

No need to wait in software.

ISR (USART_TXC_vect) {
/* UART transmission complete interrupt.
The last stop bit has been shifted out of the UART shift register.
A whole packet has been sent. Restart with the 2nd packet or do some cleanup. */

	NETWORK_RS485_PORT &= ~NETWORK_RS485_ENABLE_BIT;	// Set the RS485 chip to receive mode.
	/* Disable this interrupt to prevent it from triggering prematurely in
	case the CPU is too busy with other things and the UART runs out of data. */
	UCSRB &= ~(1<<TXCIE);	// I disable myself.
}

 

The main gotcha here is that you have to keep UDR filled, and the transmission of a packet must be continuous.

So the UDRE interrupt vector is used to keep the transmission going:

ISR (USART_UDRE_vect) {
/* UART data register empty interrupt routine.
Sends the next byte of a packet to the UART. If the last byte is put in UDR,
this interrupt disables itself and enables the transmission complete interrupt
to clean up. */

	UDR = *pTxd;				DEBUG( 0x60);
	TxDBytes--;				// A byte has been transmitted, so decrement...
	pTxd++;
	if( ! TxDBytes) {			DEBUG( 0x61);
		// If the last byte has been sent.
		UCSRB &= ~(1<<UDRIE);	// Then I'm finished and disable myself.
		// Bug: Clear interrupt flag for txcie before enabling interrupt?
		UCSRB |=  (1<<TXCIE);	// Transmission complete interrupt takes over.
	}
}

Setting up a transmission and catching the data on the other end are a bit more involved.
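
As a rough sketch of the setup side only (an illustration, not the code from the post above), the sequence might look like the following. fake_usart_t is a hypothetical stand-in for the hardware so the state machine can be stepped on a PC, and rs485_tx_start, rs485_on_udre and rs485_on_txc are assumed names. On an AVR the two handler bodies would live in the UDRE and TXC ISRs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware. On an AVR the fields map to UDR,
 * the UDRIE/TXCIE bits in UCSRB, the TXC flag and the transceiver enable pin. */
typedef struct {
    uint8_t udr;
    bool    udrie;
    bool    txcie;
    bool    txc_flag;
    bool    driver_on;
} fake_usart_t;

typedef struct {
    fake_usart_t  *hw;
    const uint8_t *next;  /* plays the role of pTxd */
    size_t         left;  /* plays the role of TxDBytes */
} rs485_tx_t;

/* Begin a packet. Returns false if a packet is still in flight. */
bool rs485_tx_start(rs485_tx_t *tx, const uint8_t *buf, size_t len)
{
    assert(tx != NULL && tx->hw != NULL && buf != NULL);
    if (len == 0 || tx->hw->udrie || tx->hw->txcie) {
        return false;
    }
    tx->next = buf;
    tx->left = len;
    tx->hw->driver_on = true;   /* transceiver to transmit mode */
    tx->hw->txc_flag  = false;  /* drop any stale transmit-complete flag */
    tx->hw->udrie     = true;   /* the UDRE handler now feeds the bytes */
    return true;
}

/* Body of the data-register-empty handler. */
void rs485_on_udre(rs485_tx_t *tx)
{
    tx->hw->udr = *tx->next++;
    if (--tx->left == 0) {
        tx->hw->udrie = false;  /* nothing more to queue */
        tx->hw->txcie = true;   /* transmit-complete handler takes over */
    }
}

/* Body of the transmit-complete handler. */
void rs485_on_txc(rs485_tx_t *tx)
{
    tx->hw->driver_on = false;  /* back to receive mode */
    tx->hw->txcie     = false;
}
```

The start routine refuses a new packet while either interrupt is still enabled, which is one simple way to tell that the previous packet has not fully left the wire yet.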

 

Also:

If 2 of the transceivers on the same bus have been taken out by lightning, the others also become suspect, even if they still work.

Damage can sometimes be observed by a change in the logic levels of the bits or artifacts (lower slew rate) during switching of a bit.

 

Paul van der Hoeven.
Bunch of old projects with AVR's:
http://www.hoevendesign.com


The _delay_ms() version was timing from the last insertion into the UDR0 transmit buffer. With 8N1, and with a one-deep transmit buffer plus the transmit shift register, there will be between 10 and 20 bits left to transmit (depending on how many bits were left in the transmit shift register at the moment of the last insertion). At 19.2k, that's between 0.52 and 1.04 ms. Since you're sending more than two bytes, your putchar function will spin in a busy loop for each additional byte until the transmit shift register is emptied and the buffered byte is transferred to it, so by the time you get to the _delay_ms(), you're more or less guaranteed to have 20 pending bits, i.e. 1.04 ms left. Shutting down after only 1ms means that the last bit gets truncated.
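
The arithmetic above can be checked with a couple of helper functions. This is only a sketch; bits_to_us and frame_bits are names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bits per UART frame: 1 start bit + data bits + parity bits + stop bits. */
uint32_t frame_bits(uint32_t data_bits, uint32_t parity_bits, uint32_t stop_bits)
{
    return 1u + data_bits + parity_bits + stop_bits;
}

/* Time in microseconds (rounded to nearest) needed to shift out `bits`
 * bit periods at `baud` bits per second. */
uint32_t bits_to_us(uint32_t bits, uint32_t baud)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)bits * 1000000u + baud / 2u) / baud);
}
```

For 8N1 at 19200 baud, frame_bits(8, 0, 1) is 10, bits_to_us(10, 19200) is 521 and bits_to_us(20, 19200) is 1042. With 20 bits pending, a 1000 us delay therefore ends roughly 42 us before the final stop bit does.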

joeymorin wrote:
you're more or less guaranteed to have 20 pending bits

 

Not so sure about 20 bits, but I see the logic behind why _delay_ms() looks to be not working properly.

 

angelu wrote:
Are you going to investigate why that happened ?

 

Now that joey has provided some food for thought I have a very simple way of testing _delay_ms() now.  Stay tuned, I'll do this later.

 

Jim

 

 

Tested and confirmed that there is nothing wrong with the _delay_ms(1).  As joey mentions, the USART is buffered; I brain farted when I wrote the app a few years back and did not account for it.  So, in fact, the _delay_ms() was being called as soon as 'putchar0()' returned, while the USART was still spitting out bits.

 

The 'putchar0' function:

#define DATA_REGISTER_EMPTY (1<<UDRE0)
.
.
.

void putchar0(char c)
{
	while ((UCSR0A & DATA_REGISTER_EMPTY)==0);
	UDR0=c;
}

As you can see I am not checking the TXC flag here, so once the UDRE0 flag is set the function returns while there are still bits getting sent out the TX pin.  After the final putchar0() call returned, the next command was the _delay_ms(), which was timing out just as the last bit was leaving.  Bad programming on my part.
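
To see why roughly two frames can still be pending when the last putchar0() returns, here is a small idealised model of a UART with one data register in front of the shift register. It counts time in bit periods and assumes the CPU is infinitely fast compared with the UART; simulate_polled_tx and tx_timing_t are hypothetical names, not anything from avr-libc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t last_write;  /* when the final byte was written to the data register */
    uint32_t line_idle;   /* when the final stop bit has left the TX pin */
} tx_timing_t;

/* Model of sending n_bytes with a putchar that only waits for "data
 * register empty". All times are in bit periods. */
tx_timing_t simulate_polled_tx(uint32_t n_bytes, uint32_t bits_per_frame)
{
    uint32_t now = 0;         /* current time */
    uint32_t shift_free = 0;  /* when the shift register next becomes free */
    bool buffered = false;    /* a byte is waiting in the data register */
    tx_timing_t t = {0, 0};

    assert(n_bytes > 0 && bits_per_frame > 0);

    for (uint32_t i = 0; i < n_bytes; i++) {
        if (buffered) {
            /* Spin until the waiting byte moves into the shift register. */
            if (now < shift_free) {
                now = shift_free;
            }
            shift_free += bits_per_frame;
            buffered = false;
        }
        if (now >= shift_free) {
            shift_free = now + bits_per_frame;  /* straight into the shift register */
        } else {
            buffered = true;                    /* parked behind the frame in progress */
        }
        t.last_write = now;
    }
    t.line_idle = buffered ? shift_free + bits_per_frame : shift_free;
    return t;
}
```

simulate_polled_tx(25, 10) models the 24 data bytes plus the delimiter at 8N1: the last write happens at bit 230 and the line goes idle at bit 250, i.e. 20 bit periods later. A single byte gives 10.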

 

My revised code I posted earlier fixes this issue thankfully and I can move on with the other modules I am making for the system.

 

I have realised that I need to put together a better protocol now that this network will be getting more modules added than the two I have.  There are two more on the bench that will be going to PCB fab shortly so I really need to get a more robust protocol in place.

 

Jim

 

 


Last Edited: Thu. Aug 9, 2018 - 07:49 PM