TWI twi_master_write() Hanging


Pertinent info:

SAMG55 Micro

AS 7.0.1417

ASF 3.35.1

 

I've been having trouble with a project occasionally hanging, an issue that is not easily reproducible.  I know why it's hanging, but not how it got to that state.  The trouble lies in the TWI ASF function twi_master_write(), this chunk of code in particular:


while (1) {
    status = p_twi->TWI_SR;
    if (status & TWI_SR_NACK) {
        return TWI_RECEIVE_NACK;
    }

    if (status & TWI_SR_TXRDY) {
        break;
    }
}

Execution gets stuck here forever if neither the NACK nor the TXRDY (Transmit Holding Register Ready) bit is set in the TWI status register.

 

My status register at the time of infinite loop is 0x0100F008.  The only other piece of information I have is that if I pause execution in this loop, and then step through it, SOMETIMES it will resolve and I can resume normal code operation.    

 

1. Any ideas on how it can get to this state, and then stay there?

2. Hard wait loops are a common theme in a lot of these ASF drivers.  Other than using the watchdog, is there a way to force a break out of them if they get hung up?

3. This is a larger scope question - when bugs are found in ASF, what is the best way to fix them and keep track of changes to these pseudo-system files?  I really don't like editing ASF files because if I update to a newer version down the line, my changes are blown away.  However, I've had to do it on multiple occasions and still don't have a great way to track it.  I use git for version control of the project itself.
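
On question 2, one common workaround is to bound the poll with a retry budget so the caller gets an error back instead of spinning forever. The following is only a minimal sketch, not the ASF implementation: the `sketch_` names and the `max_polls` parameter are made up for illustration, and the two masks are assumed to match TWI_SR_TXRDY and TWI_SR_NACK from the device header.

```c
#include <stdint.h>

/* Assumed bit positions for this sketch; in a real build, use the
 * TWI_SR_TXRDY and TWI_SR_NACK macros from the device header instead. */
#define SKETCH_SR_TXRDY (1u << 2)
#define SKETCH_SR_NACK  (1u << 8)

/* Hypothetical result codes, not ASF return values. */
typedef enum {
    SKETCH_OK = 0,
    SKETCH_NACK,
    SKETCH_TIMEOUT
} sketch_status_t;

/* Poll a status register until TXRDY or NACK shows up, but give up
 * after max_polls reads rather than looping indefinitely. */
sketch_status_t sketch_wait_txrdy(volatile const uint32_t *sr, uint32_t max_polls)
{
    while (max_polls-- > 0) {
        uint32_t status = *sr;

        if (status & SKETCH_SR_NACK) {
            return SKETCH_NACK;
        }
        if (status & SKETCH_SR_TXRDY) {
            return SKETCH_OK;
        }
    }
    return SKETCH_TIMEOUT;
}
```

With the status value from this post (0x0100F008) neither bit is set, so the sketch would return SKETCH_TIMEOUT once the budget runs out, and the caller could then reset the peripheral or retry. A poll count is a crude time base; a timer tick would be more predictable if one is available.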

 

 


Hillridge wrote:
My status register at the time of infinite loop is 0x0100F008.

Can you translate that status for us? I don't have a data sheet...

 

Also, can you provide some context? Something like: the SAM is the TWI master talking to xxxxx slave, waiting for xxx data to be received. Bus speed is 400k/100k, talking to a slave device on the same PCB 5 cm away.....

 

What troubleshooting have you done so far? Debug mode was mentioned, but have you placed a logic analyzer on the bus?  Are you getting the response you expect?

Are you getting any response from the slave at all?

 

Jim

 


In my case the SCL, TXBUFE (TX Buffer Empty), RXBUFF (RX Buffer Full), ENDTX (End of TX Buffer), ENDRX (End of RX Buffer), and SVREAD (ignored since I'm not operating as a slave) bits are set.

 

That suggests that the receive buffer filled and is awaiting processing, but getting stuck in the loop prevents processing.
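
For anyone without the data sheet handy, the decoding above can be reproduced with a small lookup table. This is a sketch under assumptions: the bit positions below are the ones I believe the SAMG55 TWI_SR uses (they do account for exactly the six bits set in 0x0100F008), but check them against the device header. The function and table names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint32_t    mask;
    const char *name;
} sketch_sr_flag_t;

/* Assumed TWI_SR bit positions; verify against the device header. */
static const sketch_sr_flag_t sketch_sr_flags[] = {
    { 1u << 2,  "TXRDY"  },
    { 1u << 3,  "SVREAD" },
    { 1u << 8,  "NACK"   },
    { 1u << 12, "ENDRX"  },
    { 1u << 13, "ENDTX"  },
    { 1u << 14, "RXBUFF" },
    { 1u << 15, "TXBUFE" },
    { 1u << 24, "SCL"    },
};

/* Write the space-separated names of the set flags into out and
 * return how many of the known flags were set. */
int decode_twi_sr(uint32_t status, char *out, size_t out_len)
{
    int    count = 0;
    size_t used  = 0;

    if (out_len > 0) {
        out[0] = '\0';
    }
    for (size_t i = 0; i < sizeof sketch_sr_flags / sizeof sketch_sr_flags[0]; i++) {
        if (!(status & sketch_sr_flags[i].mask)) {
            continue;
        }
        count++;
        if (used < out_len) {
            int n = snprintf(out + used, out_len - used, "%s%s",
                             used ? " " : "", sketch_sr_flags[i].name);
            if (n > 0) {
                used += (size_t)n;
            }
        }
    }
    return count;
}
```

Under those assumed positions, 0x0100F008 decodes to six flags, "SVREAD ENDRX ENDTX RXBUFF TXBUFE SCL", with neither TXRDY nor NACK among them, which is why the loop never exits.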

 

The SAMG55 is the master and has multiple slave devices on the bus (all on the same PCB, within a few inches).  In this case it is talking to a Si7021 temperature/humidity sensor that is maybe 2" from it.  3k pullups on SCL and SDA, operating at 100 kHz.  The slave works the vast majority of the time; this hanging condition only happens sporadically.  I'm polling it many times a second, and it can take an hour or more for the error to show up.  Admittedly, I should reduce the polling rate, as there's no need to update the data that fast, but it helped expose the problem faster.

 

I have not confirmed it, but I suspect this same issue has happened with other slave devices.  Another test routine that uses a different I2C device has locked up on me, but I was not debugging at the time.  I will set that up to run now and see if it fails in the same way.

 

 


Hillridge wrote:
3k pullups on SCL and SDA, operating at 100 kHz.

That seems reasonable to me, although the data sheet shows 10k pull ups.

 

 

You did not say if you have an LA or DSO available. I would set up a DSO to capture the bus to see what's happening.

As a test, lower the bus speed (50k) and see if that makes any difference (not likely, but worth the test).

 

Jim

 


That took less time than expected!

I'm seeing a similar failure with a totally different slave device.  This is still in twi_master_write(), just before the previously posted code:

while (cnt > 0) {
    status = p_twi->TWI_SR;
    if (status & TWI_SR_NACK) {
        return TWI_RECEIVE_NACK;
    }

    if (!(status & TWI_SR_TXRDY)) {
        continue;
    }
    p_twi->TWI_THR = *buffer++;

    cnt--;
}

My status register is the same as it was in the previous post as well.

 


I do have an LA, but won't be able to hook it up until later tonight as it is at a different office.  3k seemed to be a good compromise between the recommended pull-up values of all the devices on the bus.

 

I'll drop the speed a bit and re-run the test.  It may actually make it fail sooner if I'm overwhelming the RX Buffer.

Last Edited: Thu. Dec 14, 2017 - 02:25 PM

One more thing I thought of: when the master is reading data from the slave, it is important to NAK the last byte read from the slave so that the slave releases the bus and the master can then send a stop. If this NAK is not sent, the slave will not release the bus and the master will hang trying to send the stop.
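
The ACK/NAK rule above can be sketched in a few lines. This is a generic illustration, not the ASF read routine: `master_read_bytes`, the `read_byte_fn` hook, and the mock slave are all hypothetical names, and the hook stands in for whatever clocks in one byte and then drives the acknowledge bit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical low-level hook: clock in one byte from the slave, then
 * drive ACK (send_ack == true) or NAK (send_ack == false). */
typedef uint8_t (*read_byte_fn)(void *ctx, bool send_ack);

/* Read len bytes, ACKing every byte except the last, which is NAKed
 * so the slave stops driving SDA before the stop condition. */
void master_read_bytes(read_byte_fn read_byte, void *ctx, uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        bool is_last = (i + 1 == len);
        buf[i] = read_byte(ctx, !is_last);
    }
}

/* Mock slave for illustration only: returns incrementing bytes and
 * records which acknowledge bits it saw. */
typedef struct {
    uint8_t next;
    size_t  acks;
    size_t  naks;
    bool    last_was_nak;
} mock_slave_t;

uint8_t mock_read_byte(void *ctx, bool send_ack)
{
    mock_slave_t *slave = ctx;

    if (send_ack) {
        slave->acks++;
        slave->last_was_nak = false;
    } else {
        slave->naks++;
        slave->last_was_nak = true;
    }
    return slave->next++;
}
```

Reading three bytes through the mock should record two ACKs followed by a single NAK, which is the pattern to look for on the analyzer trace.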

 

An LA or DSO can confirm this is being done correctly.

 

Jim