strcmp()


Bounced off another odd one, about which I should have known better: strcmp() where one of the two inputs is NULL gives, on this 64-bit Linux/GCC C system, a segfault.

 

It turns out that if either or both inputs are NULL, the behaviour is undefined... which is a reasonable explanation of a segfault - but it looks (from the effects; I haven't looked at the code) as if there is no test for NULLness before the comparison is attempted. Which leads to dereferencing address zero, hence the segfault.

 

I was a bit surprised: I might rather have expected it to be able to make a sane decision on it (e.g. both inputs NULL, result ==, one input NULL, result > or < depending which) but I guess a string comparison is used so often they didn't want that extra time to check it.

 

<shrug>

 

Neil


I'm guessing they leave it up to the caller to assert() the inputs or similar. The reason being that not everyone would want the size/time penalty of a NULL check on the two parameters so, if you want it, you have to do it yourself - your choice. One could envisage a "protected_strcmp()" you might implement where you add such validation checks.
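
A minimal sketch of both ideas, assuming the hypothetical names checked_strcmp() and protected_strcmp() (neither is a standard library function). The first simply asserts that both arguments are non-NULL. The second adopts the ordering suggested in the first post, where two NULLs compare equal and a NULL sorts before any real string; that ordering is an assumption for illustration, not something the C standard defines.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: treat a NULL argument as a caller bug and trap it
   in debug builds. With NDEBUG defined the assert compiles away and a NULL
   argument is undefined behaviour again, exactly as with plain strcmp(). */
int checked_strcmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcmp(a, b);
}

/* Hypothetical wrapper: give NULL a defined place in the ordering.
   Two NULLs compare equal; a NULL sorts before any non-NULL string. */
int protected_strcmp(const char *a, const char *b)
{
    if (a == NULL || b == NULL) {
        /* 0 if both are NULL, -1 if only a is NULL, +1 if only b is NULL */
        return (a != NULL) - (b != NULL);
    }
    return strcmp(a, b);
}
```

The cost is one or two pointer comparisons per call, which is the size/time penalty mentioned above; whether that matters depends on how hot the call site is.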


Yes, it makes sense. I'd come to vaguely the same conclusion. Odd that in forty years though it hasn't managed to bite me before :)

 

Neil


barnacle wrote:
Odd that in forty years though it hasn't managed to bite me before :)
How much of that time was using strcmp() on protected-mode CPUs? ;-)

 

(on small micros something very unpleasant may be happening inside strcmp() if you pass 0x0000 as one of the addresses but, because there's no MMU you won't have known about it!)


Well, I'm much more likely to have used it in a Windows or, more recently, a Linux environment. My embedded stuff rarely needs to compare strings! (And generally I would avoid stdio and strings anyway on embedded to avoid malloc issues).

 

Neil


This raises philosophical questions like... are two things, that are not defined, equal or not equal? 

 

*stares into distance*

/Jakob Selbing


Surely strcmp() expects valid char arrays to compare.   e.g. strcmp("David", "Neil") or strcmp("David", "")

An empty string still has an address.   The address simply holds a NUL character.

 

NULL is a pointer.   NULL refers to an address with value 0.   i.e. an unlikely address for a string.  (that is particularly easy to detect)

Of course it is quite possible to store a string at address = 0 in a Harvard architecture.

 

You generally use NULL as an argument or return value when you want to signal an "impossible" address.

I am horrified when I see NULL being confused with NUL.    They are two distinct items i.e. address and scalar.
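
A small sketch of that distinction, using hypothetical helper names of my own: an empty string is a valid pointer whose first character is NUL ('\0'), while NULL is a pointer value meaning there is no string at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when s points to a valid, zero-length string: the pointer is not
   NULL and the first character is the NUL terminator '\0'. */
bool is_empty_string(const char *s)
{
    return s != NULL && s[0] == '\0';
}

/* True when there is no string at all (the pointer itself is NULL). */
bool is_missing_string(const char *s)
{
    return s == NULL;
}

void null_vs_nul_demo(void)
{
    const char *empty = "";      /* has an address; holds a single NUL */
    const char *missing = NULL;  /* no address to look at */

    assert(is_empty_string(empty));
    assert(!is_missing_string(empty));
    assert(is_missing_string(missing));
    assert(!is_empty_string(missing));

    /* Comparing against an empty string is fine; "David" sorts after "". */
    assert(strcmp("David", empty) > 0);
}
```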

 

David.


david.prentice wrote:
NULL is a pointer.   NULL refers to an address with value 0.

I seem to recall a spirited discussion on this in the past. It boils down to how many names there are for zero, right? ;)

 

And when you are through with that discussion, post all of the naming rules to the new proposed ASSEMBLER-ONLY FORUM for ridicule.

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.

Last Edited: Thu. Nov 5, 2020 - 03:27 PM

Re Windows: an assertion may be raised.

Invalid Parameter Handler Routine | Parameter Validation | Microsoft Docs (second paragraph)

 

P.S.

barnacle wrote:
(And generally I would avoid stdio and strings anyway on embedded to avoid malloc issues).
For those specific instances :

  • wrap the heap
  • manage the heap
  • operate a linter (some)
  • static analysis (if available)
  • a memory-safe computer language (when one is willing and MCU is able)

String handling and standard input/output can be offloaded from the embedded system to a PC, tablet, phone, workstation, or server.

 


Mastering stack and heap for system reliability: Part 3 - Avoiding heap errors - Embedded.com

eheap - A new embedded heap manager - Embedded.com

Rule 08. Memory Management (MEM) - SEI CERT C Coding Standard - Confluence

Rec. 08. Memory Management (MEM) - SEI CERT C Coding Standard - Confluence (Recommendations)

Memory safe computer languages | AVR Freaks

 

"Dare to be naïve." - Buckminster Fuller


theusch wrote:

david.prentice wrote:
NULL is a pointer.   NULL refers to an address with value 0.

I seem to recall a spirited discussion on this in the past. It boils down to how many names there are for zero, right? ;)

 

And when you are through with that discussion, post all of the naming rules to the new proposed ASSEMBLER-ONLY FORUM for ridicule.

 

No, it comes down to using the appropriate type acceptable to the Compiler.

 

It costs you nothing. It removes any ambiguity. stdio.h defines NULL.

 

I first learned C on the M68000 when PC programs ran on i8088.   Early PC hobbyist programs often used int for addresses.    There were seldom function prototypes and PC Compilers were lenient.

The M68000 compilers were equally lax.    But the M68000 hardware was not.

AVR compilers are much better than 1980s M68k.    But the AVR hardware does not throw exceptions.

 

David.


I always expect error handling to be on the shoulders of the programmer.  Many of those older functions will fault if they get bad pointer data (null pointer). Those library functions are written to be as optimized as possible for minimal code space and fast execution. You must always assume responsibility to pass in what every library function expects (valid data) and never anything else.


12oclocker wrote:
I always expect error handling to be on the shoulders of the programmer.

Rationale for International Standard— Programming Languages— C  (C99)

[page 10, line 17]

• Trust the programmer.

 

"Dare to be naïve." - Buckminster Fuller


theusch wrote:
david.prentice wrote:
NULL is a pointer.   NULL refers to an address with value 0.
The integer value zero converts to a null pointer regardless of the null pointer's representation.

That said, I encountered a compiler that would not convert 00 to the null pointer.
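
A short sketch of that point, with hypothetical function names. The constant 0, NULL and (void *)0 all yield null pointers that compare equal, whatever bits the implementation uses to store them. The second function inspects the stored bytes; on mainstream platforms it reports all-zero, but that is an implementation detail and not something the language promises.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the three spellings of a null pointer constant all produce
   pointers that compare equal and test false. The language guarantees this. */
int zero_constants_agree(void)
{
    const char *from_zero = 0;
    const char *from_null = NULL;
    const char *from_cast = (void *)0;

    return from_zero == from_null && from_null == from_cast && !from_zero;
}

/* Returns 1 if this implementation happens to store a null pointer as
   all-zero bytes. Typical, but implementation-specific. */
int null_is_all_bits_zero(void)
{
    const char *p = 0;
    unsigned char zeros[sizeof p];

    memset(zeros, 0, sizeof zeros);
    return memcmp(&p, zeros, sizeof p) == 0;
}

void null_constant_demo(void)
{
    assert(zero_constants_agree());
    /* Deliberately no assert on null_is_all_bits_zero(): portable code
       should not depend on the answer. */
}
```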

Quote:
I seem to recall a spirited discussion on this in the past. It boils down to how many names there are for zero, right? ;)
No.

 

A frequently occurring theme is whether to use a nearly universal implementation detail or to do things according to a standard.

Moderation in all things. -- ancient proverb

Last Edited: Thu. Nov 5, 2020 - 09:54 PM