Jack Ganssle's "Reason #4" on why embedded software projects run into trouble


This one is SOOOO prevalent! Been there, done that, got the T-shirt, myself. Optimistic Code, that is. Have a read:

 

https://www.embedded.com/electro...

 

Jim

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net

Last Edited: Wed. Sep 19, 2018 - 06:50 PM

At least they got free shipping.


<mounting my high horse...>

 

Another argument for using Ada.  Even if you don't explicitly handle all possible exceptions, at least Ada code can catch them and report the exact location, making a fix quicker and cheaper.

 

I haven't written much Ada code, but I had a real-life experience with this exception-catching when I was writing a tutorial on Ada for ARM Cortex M parts.  I deliberately added some overflow and out-of-range possibilities in my example code to show how Ada would catch and pinpoint them.  What I didn't know was that I had a hidden intermediate-calculation overflow condition in the same code, which Ada happily brought to my attention.  And all that checking is essentially free to add (and with minimal runtime impact).
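For comparison, C performs no such check on its own: signed overflow in an intermediate result is undefined behavior and goes unreported. A minimal sketch of doing the check by hand, assuming a hypothetical scaling calculation (`scale_reading` and its parameters are invented for illustration and are not from the tutorial mentioned above). Unlike Ada, it only reports that something went wrong, not where:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example: compute (raw * gain) / divisor on 32-bit values.
 * The product raw * gain can overflow int32_t even when the final
 * quotient fits, which is the "hidden intermediate overflow" case. */
bool scale_reading(int32_t raw, int32_t gain, int32_t divisor, int32_t *out)
{
    if (out == 0 || divisor == 0) {
        return false;                   /* bad pointer or division by zero */
    }

    /* Do the intermediate math in 64 bits so the product cannot overflow. */
    int64_t product = (int64_t)raw * (int64_t)gain;
    int64_t result  = product / divisor;

    /* Range-check before narrowing back to 32 bits. */
    if (result > INT32_MAX || result < INT32_MIN) {
        return false;                   /* final result out of range */
    }

    *out = (int32_t)result;
    return true;
}
```

With `raw = 100000`, `gain = 100000` and `divisor = 1000` the product does not fit in 32 bits, yet the call succeeds because only the final result has to fit.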

 

There may be other embedded-suitable languages that can do the same thing, but I only know about Ada's capabilities.

Last Edited: Wed. Sep 19, 2018 - 07:21 PM

I am paranoid about serial receive buffers. Check and double-check pointers/indexes there. Not so rigorous about transmit buffers because "I" populate them; such hubris  will bite some day, pretty sure.
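A rough sketch of what that kind of paranoia can look like for a receive ring buffer. Everything here (`rx_buf_t`, `rx_put`, `rx_get`, the buffer size) is hypothetical, and interrupt safety is deliberately left out to keep it short:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 16u  /* hypothetical buffer size */

typedef struct {
    uint8_t  data[RX_BUF_SIZE];
    uint16_t head;   /* index of next write */
    uint16_t tail;   /* index of next read */
    uint16_t count;  /* bytes currently stored */
} rx_buf_t;

/* True only if every index is inside the array. */
static bool rx_indexes_ok(const rx_buf_t *b)
{
    return b->head < RX_BUF_SIZE && b->tail < RX_BUF_SIZE &&
           b->count <= RX_BUF_SIZE;
}

/* Corrupted state: discard the contents rather than write out of bounds. */
static void rx_reset(rx_buf_t *b)
{
    b->head = 0;
    b->tail = 0;
    b->count = 0;
}

/* Store one received byte; returns false if it could not be stored. */
bool rx_put(rx_buf_t *b, uint8_t byte)
{
    if (b == NULL) return false;
    if (!rx_indexes_ok(b)) { rx_reset(b); return false; }
    if (b->count == RX_BUF_SIZE) return false;   /* full: drop the byte */
    b->data[b->head] = byte;
    b->head = (uint16_t)((b->head + 1u) % RX_BUF_SIZE);
    b->count++;
    return true;
}

/* Fetch the oldest byte; returns false if the buffer is empty or corrupt. */
bool rx_get(rx_buf_t *b, uint8_t *out)
{
    if (b == NULL || out == NULL) return false;
    if (!rx_indexes_ok(b)) { rx_reset(b); return false; }
    if (b->count == 0) return false;
    *out = b->data[b->tail];
    b->tail = (uint16_t)((b->tail + 1u) % RX_BUF_SIZE);
    b->count--;
    return true;
}
```

Resetting on a corrupt index is only one possible policy; logging the event or forcing a restart are equally defensible choices.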

 

Jim

 

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net


This is likely to spur the same debate as checking malloc() returns: in a deeply embedded system, what do you do AFTER you detect the impossible error?

 

It also bugs me that this, combined with various encapsulation and modularization paradigms, means that you're doing the same check many times in quick succession:

//stack-trace:
   ASSERT(validptr(buf));
   myformat(char *buf,...);
      ASSERT(validptr(buf));
      sprintf(buf, ...);
         ASSERT(validptr(buf));
         vsprintf(buf,...);
            ASSERT(validptr(buf));
            strcpy(buf, ...);
               ASSERT(validptr(buf));

     


westfw wrote:

This is likely to spur the same debate as checking malloc() returns: in a deeply embedded system, what do you do AFTER you detect the impossible error?

Agree, it can be hard to impossible to handle unexpected errors once software is in the field.  So catch as many as possible in the development and testing phase, and make those that get through that net as easy as possible to fix.  It's better that the end user never sees an error, but the next best is for the end user to call in and say "I got error 93 (overflow) at line 278 in file doodad.ada"  (which is just about exactly what I got in my example, leading to a 5-minute fix).
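C does not produce a report like that by itself, but an assert-style macro gets close. A minimal sketch, assuming a hypothetical `CHECK` macro and `report_fault` handler (neither comes from this thread or from any library, and the error numbers are arbitrary). On a real target the handler would more likely store the record somewhere it survives a reset than call `printf`:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error numbers. */
enum { ERR_NULL_PTR = 17, ERR_RANGE = 93 };

/* Record of the most recent failed check. */
typedef struct {
    int         code;
    const char *file;
    int         line;
} fault_record_t;

fault_record_t last_fault;

/* Hypothetical handler: remember and print where the check failed. */
void report_fault(int code, const char *file, int line)
{
    last_fault.code = code;
    last_fault.file = file;
    last_fault.line = line;
    printf("error %d at line %d in file %s\n", code, line, file);
}

/* Evaluates to 1 if cond holds; otherwise reports the location and gives 0. */
#define CHECK(cond, code) \
    ((cond) ? 1 : (report_fault((code), __FILE__, __LINE__), 0))

/* Hypothetical use: refuse an out-of-range table index. */
int table_lookup(const int *table, int len, int idx, int *out)
{
    if (!CHECK(table != NULL && out != NULL, ERR_NULL_PTR)) return 0;
    if (!CHECK(idx >= 0 && idx < len, ERR_RANGE)) return 0;
    *out = table[idx];
    return 1;
}
```

Calling `table_lookup` with an index past the end leaves `last_fault` holding the error code plus the file and line of the failed `CHECK`.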

 

Last Edited: Wed. Sep 19, 2018 - 09:17 PM

ka7ehk wrote:
Have a read:
Thanks for another good read!

Exception handlers are exceptionally hard to get right. 

That holds in any computer language, but especially in C, where exception handling takes some effort and may not be portable.

Research (see resources) has shown that code with lots of asserts is less buggy and delivered faster than that without. 

The Ganssle Group

Adding Automatic Debugging to Firmware for Embedded Systems

by Jack Ganssle

Major rewrite: May, 2014

http://www.ganssle.com/item/automatically-debugging-firmware.htm

...

(about 2/3 of the way down the page: Faults/KLOC vs Asserts/KLOC, and Cost to Fix vs Bug (by testing, by asserts))

...

Unfortunately, C is pretty much devoid of runtime error checking. 

At least there's some in C11 Annex H.

(But there’s work being done to add pointer checks to C – see [Checked C research paper].)

The first of two parts of an article in Electronic Design is about Checked C:

https://www.avrfreaks.net/forum/jack-ganssles-reason-8-why-embedded-software-projects-run-trouble#comment-2552346

 


https://www.avrfreaks.net/forum/jack-ganssles-reason-8-why-embedded-software-projects-run-trouble#comment-2541376

...

More on assertions and exceptions at 18m24s for about 12m in

...

https://www.avrfreaks.net/forum/floating-point-math-mega128#comment-2551701

...

fyi, C11 has Language Independent Arithmetic (LIA) in Annex H, which adds integer_overflow.

GCC supports C11 and has integer overflow detection; Microsoft Visual C++ supports C89.
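Separately from whatever Annex H provides, GCC (version 5 and later) and Clang offer `__builtin_add_overflow` and related builtins. These are compiler extensions, not standard C. A small sketch of using one, where `checked_sum` is a hypothetical function made up for this example:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sum an array of int32_t, reporting overflow instead of wrapping.
 * __builtin_add_overflow is a GCC/Clang extension, not ISO C. */
bool checked_sum(const int32_t *values, size_t n, int32_t *total)
{
    int32_t acc = 0;

    for (size_t i = 0; i < n; i++) {
        /* Returns true if the mathematically correct sum does not fit. */
        if (__builtin_add_overflow(acc, values[i], &acc)) {
            return false;   /* *total is left untouched on overflow */
        }
    }

    *total = acc;
    return true;
}
```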

...

 

"Dare to be naïve." - Buckminster Fuller


kk6gm wrote:
And all that checking is essentially free to add (and with minimal runtime impact).
Same with Checked C.

kk6gm wrote:
There may be other embedded-suitable languages that can do the same thing, but I only know about Ada's capabilities.
Rust is one: a memory-safe language suited to embedded work. Rust support for bare boards, frameworks, and RTOSes is still in progress, whereas Ada has had those for quite some time.

Ada is not memory-safe without "adequate" contracts; if you are willing to use formal methods, then SPARK (an Ada variant) is memory-safe.

The appeal of Checked C is that it's close to C.

 


Checked C: Making C Safe by Extension

Archibald Samuel Elliott, University of Washington

Andrew Ruef and Michael Hicks, University of Maryland

David Tarditi, Microsoft Research

https://www.microsoft.com/en-us/research/uploads/prod/2018/09/checkedc-secdev2018-preprint.pdf

(page 6, right column, bottom)

Running time overhead.

...

Compile-time overhead.

...

(top of page 7)

https://www.microsoft.com/en-us/research/project/checked-c/

Memory safe computer languages

https://www.avrfreaks.net/forum/memory-safe-computer-languages

 

"Dare to be naïve." - Buckminster Fuller


ka7ehk wrote:
Not so rigorous about transmit buffers because "I" populate them; such hubris  will bite some day, pretty sure.
starting a transmit then "somehow" starting the next transmit before the current transmit completes.

Asserts in Action, Hardware Registers

https://vimeo.com/223539610 (starts at 25m25s for 1m15s)

 via https://www.avrfreaks.net/forum/jack-ganssles-reason-8-why-embedded-software-projects-run-trouble#comment-2541376

 

"Dare to be naïve." - Buckminster Fuller


I had a posting relatively recently exactly about

 

starting a transmit then "somehow" starting the next transmit before the current transmit completes.

It was really challenging to find. It made a mess of the transmitted character string, not unlike an incorrect clock rate, which was a real "red herring".
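One way to make that kind of bug show itself early is a busy flag that refuses (or asserts on) a second start. A minimal sketch with hypothetical names (`tx_state_t`, `tx_start`, `tx_next_byte`); on real hardware `tx_next_byte` would be called from the "transmit register empty" interrupt, and the shared state would need protection that is omitted here:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *buf;   /* message being sent */
    size_t         len;   /* total bytes in message */
    size_t         pos;   /* next byte to send */
    bool           busy;  /* true while a message is in flight */
} tx_state_t;

/* Start a transmission; refuse if the previous one has not completed. */
bool tx_start(tx_state_t *tx, const uint8_t *buf, size_t len)
{
    if (tx == NULL || buf == NULL || len == 0) return false;
    if (tx->busy) return false;   /* the overlapping-transmit case */
    tx->buf  = buf;
    tx->len  = len;
    tx->pos  = 0;
    tx->busy = true;
    return true;
}

/* Hand out the next byte to send; returns false when nothing is pending. */
bool tx_next_byte(tx_state_t *tx, uint8_t *out)
{
    if (tx == NULL || out == NULL || !tx->busy) return false;
    *out = tx->buf[tx->pos++];
    if (tx->pos >= tx->len) {
        tx->busy = false;         /* message complete, next start allowed */
    }
    return true;
}
```

A caller that ignores the `false` from `tx_start` still loses its message, so the return value is worth asserting on during development.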

 

Jim

 

Jim Wagner Oregon Research Electronics, Consulting Div. Tangent, OR, USA http://www.orelectronics.net


An alternative to a display is MPLAB REAL ICE Instrumented Trace or, more generally, a logic analyzer (a breadcrumb byte sent out a spare port or spare SPI).
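The breadcrumb idea, sketched under the assumption that a spare 8-bit output port is free. A plain variable stands in for the port so this runs on a PC; on a real part it would be a memory-mapped register. The code values and `example_task` are invented for illustration:

```c
#include <stdint.h>

/* Stand-in for a spare 8-bit output port watched by a logic analyzer. */
volatile uint8_t breadcrumb_port;

/* Hypothetical breadcrumb codes, one per point of interest. */
enum {
    BC_TX_START = 0x03,
    BC_TX_DONE  = 0x04,
    BC_FAULT    = 0xEE
};

/* Drop a breadcrumb: a single write, cheap enough to leave in an ISR. */
static inline void breadcrumb(uint8_t code)
{
    breadcrumb_port = code;
}

/* Hypothetical task that marks its progress as it goes. */
void example_task(int simulate_fault)
{
    breadcrumb(BC_TX_START);
    if (simulate_fault) {
        breadcrumb(BC_FAULT);   /* last value on the port shows where it died */
        return;
    }
    breadcrumb(BC_TX_DONE);
}
```

The last code left on the port after a hang or fault tells you how far execution got.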

kk6gm wrote:
So catch as many as possible in the development and testing phase, ...
The best testers are operators but then there's pissed operators and customers (Windows 10 beta testers) (roll eyes)

Static analysis is more effective than testing at identifying defects as complete testing is an improbability.

Some static analysis is a part of Checked C and Rust though there is some dynamic analysis (runtime checks) in both.

Static analyzers are somewhat popular for C, C++, and Ada and are one of the best practices, though they need compute iron (a multi-core or many-core CPU or multiple CPUs, with 1 or 2 GB of RAM per core) and are an order of magnitude more expensive than a linter.

kk6gm wrote:
... and make those that get through that net ...
Telemetry helps here; operators tend not to create and send the defect report, so ease defect reporting for them (for example, web code for the operator's manual with a defect report as a part of it).

Progressive Web Apps (PWA)

https://developers.google.com/web/progressive-web-apps/

https://docs.microsoft.com/en-us/microsoft-edge/progressive-web-apps

PWA support is a recent addition to web browsers on multiple OSes.

 


http://microchipdeveloper.com/realice:trace-and-profiling

 

Edits: strikethrus

 

"Dare to be naïve." - Buckminster Fuller

Last Edited: Thu. Sep 20, 2018 - 01:27 AM

fyi to all, Jack completed his series at Embedded.

ka7ehk wrote:
Optimistic Code, that is.
That's in this month's issue of Jack's newsletter, in his list of best practices:

The Ganssle Group

The Embedded Muse 360

October 15, 2018
by Jack Ganssle

http://www.ganssle.com/tem/tem360.html

(1/3 page)

On Test

[5 paragraphs about testing, remainder about the process which includes testing]

I think that a minimal set of filters for use in firmware development would include:

...

  • Including code that will detect bugs automatically. I call this "proactive debugging."

...

A take on Jack's best practices :

  • Formal requirements analysis at best effort
  • Review all (requirements, designs, code, tests, products, tools, instruments, processes)
  • Coding standard
  • Update the coding standard
  • Assertions and exceptions
  • linter
  • Static analysis if one has deep pockets
  • Unit tests
  • Automated regression testing with continuous integration
  • Metrics and a process (Personal Software Process (PSP), Team Software Process (TSP), etc)
  • Product defect analysis (process escapes)

 


PSP :

https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=30873

 

TSP:

https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=5287

 

The Software Process Dashboard | The Software Process Dashboard Initiative

https://www.processdash.com/

The Software Process Dashboard Project is an open-source initiative to create a PSP(SM) / TSP(SM) support tool.

...

 

linter, metrics, coding standard :

https://www.avrfreaks.net/forum/jack-ganssles-reason-8-why-embedded-software-projects-run-trouble#comment-2541376

...

  • Susan's 600USD tools BOM at 51m00s

...

 

Edit: more on review

 

"Dare to be naïve." - Buckminster Fuller

Last Edited: Wed. Oct 17, 2018 - 09:45 PM

The Microsoft Research Checked C project is hiring an intern and software engineers (senior, principal)

Research intern – Compilers/Security in Redmond, Washington, United States | Research at Microsoft (expired)

renewed 7-Feb'19 :

Research intern – Compilers/Security in Redmond, Washington, United States | Research at Microsoft

 

"Dare to be naïve." - Buckminster Fuller

Last Edited: Sun. Feb 24, 2019 - 02:01 PM

Here's the whole list:

 

How embedded projects run into trouble: Jack’s Top Ten Countdown:

    10 - Not enough resources allocated to a project: https://www.embedded.com/electronics-blogs/say-what-/4460965/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Ten--

    9 - Jumping into coding too quickly: https://www.embedded.com/electronics-blogs/say-what-/4460983/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Nine

    8 – The undisciplined use of C and C++: https://www.embedded.com/electronics-blogs/say-what-/4461001/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Eight

    7 - Bad science: https://www.embedded.com/electronics-blogs/say-what-/4461022/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Seven

    6 - Crummy analog/digital interfacing: https://www.embedded.com/electronics-blogs/say-what-/4461069/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Six

    5 - Weak managers or team leads: https://www.embedded.com/electronics-blogs/say-what-/4461085/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Five

    4 - Writing optimistic code: https://www.embedded.com/electronics-blogs/say-what-/4461093/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Four

    3 – Poor resource planning: https://www.embedded.com/electronics-blogs/say-what-/4461134/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Three

    2 – Quality gets lip service: https://www.embedded.com/electronics-blogs/say-what-/4461162/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-Two--

    1 - Unrealistic schedules: https://www.embedded.com/electronics-blogs/say-what-/4461180/How-embedded-projects-run-into-trouble--Jack---s-Top-Ten-----Number-One

 

 

 


updated yesterday (senior may have been filled) 

https://github.com/Microsoft/checkedc/blob/master/README.md#we-are-hiring

We have a position available for a Principal Software Engineer...

 

"Dare to be naïve." - Buckminster Fuller