How do you pass a string containing a " character to an assembler macro?

In every other version of the GNU assembler I have used, I can pass a string delimited with either " chars or ' chars to a macro.

 

If I want to pass a string containing a ' char to a macro, I do

 

  foo "xx'xx"

 

If I want to pass a string that contains a " char to that same macro, I simply enclose the string parameter in ' chars, as follows:

 

  foo 'xx"xx'

 

This does not work with the GNU AVR assembler, so I cannot use the macro; I have to hand-encode the string one byte at a time:

  .byte 'x', 'x', 0x27, 'x', 'x'

  .byte 'x', 'x', 0x22, 'x', 'x'

 

which makes a horrendous mess of my otherwise *EXTREMELY* neat and tidy sources!  (because this is important! :)

 

 

If something can be read without effort then great effort has gone into its writing

Last Edited: Tue. Apr 30, 2019 - 07:21 PM

But GNU as is built from the same source for all targets?


I can't be bothered to test it.   But I suspect that gnu-as will use the regular C rules for literal strings.

 

Yes,   I know that historically different Assemblers have different rules.

 

Yes,  occasionally you have to escape certain characters.    You should try embedding Cyrillic.

If you have a lot of non-ascii characters,   choose UTF-8 or other encoding.

 

It will make your text strings easier to maintain.   But you will need to decode the resultant strings.

 

David.


Using UTF-8 would significantly increase the resulting code size, probably beyond the boot ROM size of the 32U4 that this has to fit into. I tried ".\"" and the \" terminates the original quote. I tried ".""" and the same thing happens.

 

If something can be read without effort then great effort has gone into its writing


Seriously. A bootloader is a one-off. You write it once. It must work 100%. You don't tweak it. You just write it correctly first time.

So I don't see it as a great imposition if you have to write byte arrays instead of literal strings. You can generate the byte arrays with an external program. You can place the original literal in a comment field.

I have vague memories of your exact problem. I just gave in.
I do not have much experience of gnu-as. If you had a real-life problem it would be worth investigating e.g. vast number of strings in an application.

David.
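
The "generate the byte arrays with an external program" idea can be sketched as a small host-side script. This is only an illustration, not something posted in the thread: the name `to_byte_directive` is made up, Python is an arbitrary choice of host language, and the trailing `;` comment assumes avr-as comment syntax in a plain .s file. Letters and digits are written as character constants, everything else (quotes included) as hex, and the original literal is kept in a comment.

```python
def to_byte_directive(text: str) -> str:
    """Render text as a .byte directive with the original literal in a comment.

    Hypothetical helper for illustration. Assumes ASCII input and that ';'
    starts a comment (the avr-as convention, not true of every gas target).
    """
    if not text:
        raise ValueError("nothing to encode")
    operands = []
    for ch in text:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError(f"non-ASCII character: {ch!r}")
        if ch.isalnum():
            # Letters and digits stay readable as character constants.
            operands.append(f"'{ch}'")
        else:
            # Quotes, spaces, punctuation and control codes become hex, so
            # nothing in the operand list ever needs escaping.
            operands.append(f"0x{code:02x}")
    line = "  .byte " + ", ".join(operands)
    if text.isprintable():
        # Only echo the literal when it cannot break the comment line.
        line += "  ; " + text
    return line


if __name__ == "__main__":
    print(to_byte_directive('xx"xx'))
    print(to_byte_directive("xx'xx"))
```

Run on the two strings from the first post, this should reproduce the hand-written `.byte` lines with the readable literal appended, so the source can be regenerated rather than edited by hand.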


I wrote a C file and an S file with some byte arrays:

#include <stdint.h>

uint8_t david[14] = "David\"Prentice";   //.ascii no NUL
uint8_t kbv[] = "K B V\"Controls";       //.asciz with NUL
uint8_t raw[] = { 0x01, 0x02, 'D', 'a', 0x22, 'v', 'i', 'd', 0x00 }; 

and

      .globl david_asm, kbv_asm, raw_asm
      .data
david_asm:
      .ascii "David\"Prentice"
kbv_asm:
      .asciz "K B V\"Controls"
raw_asm:
      .byte 0x01, 0x02, 'D', 'a', 0x22, 'v', 'i', 'd', 0x00

Both files work with main.c

#include <avr/io.h>

extern uint8_t david[], kbv[], raw[];
extern uint8_t david_asm[], kbv_asm[], raw_asm[];

int main(void)
{
    /* Read the first byte of each array so that every symbol is referenced. */
    PORTB = david[0];
    PORTC = kbv[0];
    PORTD = raw[0];
    PORTB = david_asm[0];
    PORTC = kbv_asm[0];
    PORTD = raw_asm[0];
    while (1) ;
}

Yes,  it might upset you to use a backslash escape.

 

Yes,  I can remember different assemblers with different features.   I think that you just have to bite your lip and accept the appropriate gobbledygook.

 

I resigned myself to C conventions 30 years ago.

 

David.


Just to note that the GNU as manual also says pretty much the same thing:

 

https://ftp.gnu.org/old-gnu/Manuals/gas-2.9.1/html_chapter/as_3.html

 

(OK, that's 2.9.1, but I guess the text has not changed much over the years?) It says:

 

\\

    Represents one `\' character.

\"

    Represents one `"' character. Needed in strings to represent this character, because an unescaped `"' would end the string.

EDIT: yup the "latest" manual is a bit different in structure but says essentially the same thing:

 

https://sourceware.org/binutils/docs/as/Strings.html#Strings  
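
For strings handed straight to a directive such as `.ascii`, the two escapes quoted above are enough to build a literal mechanically. A minimal sketch, assuming ASCII input: `gas_string_literal` is a hypothetical name, Python is just a convenient host language, and the three-digit octal form used for control characters is my reading of the same manual page. Whether the result survives being passed as a macro argument is a separate question that this sketch does not address.

```python
def gas_string_literal(text: str) -> str:
    """Quote text as a double-quoted GNU as string literal.

    Hypothetical helper for illustration; assumes ASCII input.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code > 0x7F:
            raise ValueError(f"non-ASCII character: {ch!r}")
        if ch == "\\":
            out.append("\\\\")          # one backslash is written as \\
        elif ch == '"':
            out.append('\\"')           # an unescaped " would end the string
        elif 0x20 <= code < 0x7F:
            out.append(ch)              # other printable ASCII is copied as-is
        else:
            out.append(f"\\{code:03o}")  # control codes as 3-digit octal
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    for name in ['."', '(.")', 'abort"', '(abort")']:
        print("  .ascii " + gas_string_literal(name))
```

Fixed-width octal is used rather than `\x` so that a following character can never be mistaken for part of the numeric code.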

Last Edited: Wed. May 1, 2019 - 10:59 AM

The OP seems to want to construct a macro.

However robust the C escape sequence might be,   the macro execution might barf.

 

The real mystery is why a Bootloader would ever have more than one or two literal strings e.g. USB device name, manufacturer, ...

And just how many would have embedded double-quote characters.

 

David.


David,

 

See his other thread(s) about GNU as. It's clear from the examples posted that he's implementing a Forth interpreter. Now how that relates to a bootloader I don't actually know. (but for Forth there's a lot of strings and word links - hence the above).

 

EDIT: specifically this shows the Forth heritage.... https://www.avrfreaks.net/commen...

Last Edited: Wed. May 1, 2019 - 12:15 PM

In #4 it looks as if it is going into the boot area of a USB chip.

mark4th wrote:

using UTF-8 would significantly increase the resultant code size, probably beyond the boot rom size for the 32u4 that this has to fit into.  i tried ".\"" and the \" terminates the original quote i tried ".""" and the same things happens.

 

All that I know about Forth is that you have a tiny core kernel.   Words are added to dictionary.   The program execution works by pushing and pulling words off a "stack".

 

You might "build" the core kernel with gnu-as.    But you are not going to fit the whole gnu-as assembler into the Forth Dictionary.

It is quite possible that you do implement a "simple" assembler in Forth.

 

David.

Last Edited: Wed. May 1, 2019 - 02:21 PM

But you wouldn't really implement a bootloader in Forth?? So I wonder why the mention of "boot rom size" (which I think is max 4K for a 32U4) ?


The backslash escape is not a problem; I use it in a different version of this code (for ARM Thumb2 Linux, linked below).

There are only a few strings I need with " characters in them. They include:

 

."

(.")

abort"

(abort")

 

In my Thumb2 forth which you can see on github I use backslash escaped " characters.  If you look in io.s on line 145 I define the forth word (.") and the " is escaped in the way suggested. 

 

  https://github.com/mark4th/t4/bl...

 

With this AVR incantation of the GNU assembler this did not work for me, and every time I tried to define things the way they are in the above linked sources I would get an error.  This morning it just worked, so now I'm confused. Was there some other error that was preventing things from working? No idea... but now it's working :/

 

ty for the help

 

 

If something can be read without effort then great effort has gone into its writing


mark4th wrote:
With this AVR incantation of the GNU assembler
You keep saying that as if the AVR version is different from other GNU as versions. While the mnemonic/opcode support varies between targets, things like symbol/string/macro support are common across all variants of as, so you should not find avr-as any different in this sense, as it's all built from the common guts of the GNU assembler. In fact this is, in part, what's so appealing about GCC/GNU: what you know for one target (x86, ARM or whatever) is generally applicable to all targets, with just some very small, machine-specific subtleties.


actually THIS incantation of this forth compiler is specifically so I can use atmel studio to test hundreds of primitives.  Being able to single step through each one here simplifies things enormously.  Once I see that they are working ALL forth headers will be stripped out of the AVR Forth kernel which will reduce the boot sector size significantly. 

 

The entire code base for this forth kernel will be rewritten as a target compiler extension to my already existing Linux forth.  I already have a MUCH more powerful AVR macro assembler 100% coded and working and will no longer have any need for the GNU assembler, my Linux forth will pretty much take these existing sources with only minor modifications and assemble them. 

 

This incantation is simply a small step in the overall design of the forth and can be considered a bootstrap test build.

 

I COULD however keep this version as is (the kernel would use up the entire boot sector) and extend this with my existing forth AVR macro assembler and the entire assembler would probably not take up more than about 2 or 3k of application code space.  It is not that big and is orders of magnitude more powerful than the GNU assembler because it has the power of the entire forth programming language behind it.  

 

I did not as yet release the sources to my AVR assembler but they are also not top secret and will be released eventually.

 

If something can be read without effort then great effort has gone into its writing


I admire Forth enthusiasts for their perseverance.

 

I am aware of its elegance and efficiency.   But my head starts hurting.

 

Embedded programs need to work and be maintainable by humans.

 

Yes,  I am horrified by Megabytes of PC application.

Embedded programs tend to be 8kB - 256kB.   Even C, C++, ... becomes difficult to maintain as memory gets larger.

 

I suppose that I am really saying:

Let's see an example project written in Forth that has dramatic performance (compared to regular C, C++).

 

Some years ago,   I wrote some PostScript code for execution on a printer.   Yes,  it worked but it was a painful process.

 

David.


david.prentice wrote:

I am aware of [Forth's] elegance and efficiency.   But my head starts hurting.

 

I was sold on Forth ... until I got to the IF statement.

 

--Mike

 


I absolutely WOULD implement it in forth because that turns your target into your entire development system and application operating system. 

 

The target forth is tethered to your desktop and the interface between the host and the target is 100% seamless.  You can write code on the host or on the target.  you can run code on the host or on the target.  once the target application has been built you can configure the system to run your application at boot.  The "boot" rom from that point on is basically the operating system for your application code and you have the option to use it as a means of updating your application in the field or not.

 

The advantages of forth are significant, especially in an embedded system.  First of all you dispense with the entire "Edit, compile, link, debug, repeat" development cycle and replace it with "edit, debug, repeat".  Forth is interactive: you define primitives and immediately test them.  These you use to create higher level definitions which are also immediately tested.  At all stages of development, everything that you have already coded has already been tested.

 

Secondly, forth is orders of magnitude more space efficient than the equiv C (think 100 times smaller) and when done right can be more space efficient than the equiv assembler for any non trivial application.  THIS makes forth *THE* solution for pretty much any embedded application you can think of.  

 

The down side?  Almost nobody knows Forth and most people taking their first look at it go screamin for mamma because there are no data types or operator precedence, no built-in protections stopping you from doing "stupid" things and... it is entirely RPN.  If you can't think RPN you won't like this language.

 

for example

 

: fib 0 1 rot 1 do tuck + loop nip ;

 

 

40 fib . xxxxxxx <-- forth prints out the result xxxxx

"oops, that's wrong"

 

forget fib

 

: fib 0 1 rot 1+ 1 do tuck + loop nip ;

 

fixed!

 

 

If something can be read without effort then great effort has gone into its writing


I would HATE to have to write PS code! lol

 

This is a hobby but I have also done it professionally and it is the only language I would consider myself an expert at.  The ***ONLY*** issues I have ever had are related to the development tools I need to use to create them and my own inexperience with them.

 

As for "megabyte" PC applications, those would be considered tiny in this day and age.   I wrote a Linux curses library in forth (8 hours of man 5 terminfo followed by 2 hours of coding).  Libncurses is about 500k in size, my entire library at the time was 5k in size - its now about 8k in size and contains a complete (tho currently broken) text user interface / windowing system with moving / overlapping windows. (really need to fix this code lol).

 

My Linux forth compiler can also compile about 4 megabytes of source code per second (bestest guestimate).

 

 

If something can be read without effort then great effort has gone into its writing


Your problem probably was not the IF statement but the RPN.  Think of it in this way

 

  do a test somewhere in your code.  The test result is left at the top of the parameter stack

 

  x 100 <      \ x goes on stack. 100 goes on stack < tests if x is less than 100.

 

the top of stack now has a "true" or a "false" flag on it.

 

  if ...... else ..... then

 

The IF tests the flag; if it is true it does nothing, so the code between IF and ELSE is executed.  The ELSE is an unconditional jump to the code after the THEN.

 

If the flag at the top of the stack is false the IF is a jump to the code immediately following the else statement.

 

 

: fudge

  x 100 <

  if

    \ TRUE PART

  else

   \ FALSE PART

  then ;
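
The branch behaviour described above can be modelled in a few lines. This is a toy sketch rather than how any particular Forth is built: the cell names `0branch` and `branch` follow a common convention but are assumptions here, Python booleans stand in for Forth flags, and `compile_words`, `run` and `fudge` are illustrative names. IF compiles a conditional jump taken on a false flag, ELSE compiles an unconditional jump past the false part, and THEN compiles nothing and only resolves the pending jump target.

```python
def compile_words(words):
    """Turn a flat word list into cells, resolving if/else/then to jumps."""
    code = []      # each cell is [operation, argument]
    pending = []   # indices of jump cells whose target is not yet known
    for word in words:
        if word == "if":
            pending.append(len(code))
            code.append(["0branch", None])   # jump taken when flag is false
        elif word == "else":
            origin = pending.pop()
            pending.append(len(code))
            code.append(["branch", None])    # true part jumps past false part
            code[origin][1] = len(code)      # false flag lands just after else
        elif word == "then":
            code[pending.pop()][1] = len(code)   # compiles no cell of its own
        else:
            code.append(["call", word])
    return code


def run(code, prims):
    """Execute compiled cells against a fresh parameter stack."""
    stack, pc = [], 0
    while pc < len(code):
        op, arg = code[pc]
        pc += 1
        if op == "0branch":
            if not stack.pop():
                pc = arg
        elif op == "branch":
            pc = arg
        else:
            prims[arg](stack)
    return stack


def fudge(x):
    """Model of ': fudge x 100 < if ... else ... then ;' for a given x."""
    taken = []

    def less_than(stack):
        b = stack.pop()
        a = stack.pop()
        stack.append(a < b)

    prims = {
        "x": lambda stack: stack.append(x),
        "100": lambda stack: stack.append(100),
        "<": less_than,
        "true-part": lambda stack: taken.append("TRUE PART"),
        "false-part": lambda stack: taken.append("FALSE PART"),
    }
    words = ["x", "100", "<", "if", "true-part", "else", "false-part", "then"]
    run(compile_words(words), prims)
    return taken


if __name__ == "__main__":
    print(fudge(5), fudge(200))
```

The point of the model is that the control structure costs two jump cells and nothing else, which is why the flag has to be on the stack before IF is reached.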

 

 

Forth IS not easy to learn at first, it is very different to more traditional languages and you have to THINK differently.  This is a major stumbling block for a lot of people.  If however you push through you will come to recognize the elegance and simplicity of the language and once you become proficient with it you will develop applications much faster with it than you would with C in most cases (assuming NON trivial applications).

 

Think of the forth learning curve as being flatline for the first stage of your learning and then it just clicks and goes ballistic.  Your level of understanding of forth from its lowest levels to its highest become totally intuitive and second nature to you. 

 

With C you learn quickly how to implement extremely bad code.  It then takes years to master.  So the learning curve is steep at first but then goes quickly flatline.  Over the years you learn to recognize not only what code is bad but why it is bad and with experience you tend to not fall into the pitfalls that a beginner would fall into.  x++ + ++x <-- gotta love C lol.

 

Just about every forth programmer in the world has the ability to create a from scratch forth compiler of their own after maybe a month or 2 (or less) of learning it.  Once you understand how to code in forth you understand how forth works internally down to the lowest levels.  You can not say the same about C.

If something can be read without effort then great effort has gone into its writing


david.prentice wrote:
I admire Forth enthusiasts for their perseverance.
My first professional job was almost as a Forth programmer. I did the interview and everything and had the job offer with a view to start (in a place about 100 miles from home) in 2..3 weeks. Then the week after I got the offer that week's Popular Computing Weekly had a small ad from this company called Amstrad (only known for "crap hi-fi" at that stage) who were just getting into the home computer business and looking for someone who could help out a bit in supporting it. Went for an interview the following day and then started the following Monday (about 10 miles from home). 25 years later I left.


mark4th wrote:

Your problem probably was not the IF statement but the RPN.

 

Nope.  I actually quite like the language and even almost finished a project to implement FORTH on an FPGA.  I need to get back to it someday....

The problem with FORTH is it is unreadable.  If you take away the fib tag in your example and ask what does this do, you have to really study the routine and imagine what is happening to the stack at each point in the definition.  You don't even know how many items are supposed to be on the stack when the call is made, how many will be popped off, and how many new ones will be added.

I indicted the IF statement as being the deal-breaker because it makes the language unwriteable as well.  I guess a good point of this is it makes you try hard to eliminate branching as much as possible.

 

--Mike

 


mark4th wrote:
I would HATE to have to write PS code! lol
As part of generating my thesis,

I wrote C code to write encapsulated PS code.

'Twas included in troff.

"Demons after money.
Whatever happened to the still beating heart of a virgin?
No one has any standards anymore." -- Giles


you wouldn't really implement a bootloader in Forth?

SUN Microsystems did, IIRC.

 

The Gnu Assembler macro capability is a bit primitive compared to others that I've used (DEC, even Microsoft MASM.)

After all, the assembler is really just for use by the compilers; it's not as if anyone actually wrote code in assembler any more...

 

I think you might want ".altmacro" https://sourceware.org/binutils/docs/as/Altmacro.html#Altmacro

That's what normally provides the "either single or double quote delimiters for strings" feature, I think. 

It's possible that .altmacro is the default in some assemblers?

 

 

It seems it's not so much a matter of putting the quote characters inside a string, as apparently invoking macros removes a level of quotes...

 

	.macro string s
	.ascii \s
	.endm

main:	string "test"
	string "\"test\""
	string "(.\")"

This generates errors on what it reports as ".ascii test" (note the removed quotes around test). That is on an x86 gas; avr-gcc has less useful error messages.

 

	.altmacro
	.macro string s
	.ascii \s
	.endm

main:	string <"test">
	string <"\"test\"">
	string <"(.\")">

	.end

Seems to do what you want.

 

You could also consider:

	string "anot\x22her"

 

Contents of section .text:
 0000 74657374 22746573 7422282e 2229616e  test"test"(.")an
 0010 6f742268 6572                        ot"her         


westfw wrote:

 

You could also consider:

	string "anot\x22her"

That would offend OP's eye...

mark4th wrote:
which makes a horrendous mess of my otherwise *EXTREMELY* neat and tidy sources!  (because this is important! :)

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


theusch wrote:
westfw wrote:

 

 

You could also consider:

	string "anot\x22her"

 

 

That would offend OP's eye...

 

mark4th wrote:

which makes a horrendous mess of my otherwise *EXTREMELY* neat and tidy sources!  (because this is important! :)

That is a gratuitous nasty.

If the suggestion works, *as a macro argument* mark4th might use it.

mark4th's tidiness comment was a comparison with something much clunkier, but deliberately excluded from theusch's quotation.

mark4th wrote:
this does not work with the GNU avr assembler and thus I can not use the macro, i have to hand encode the string one byte at a time

  .byte 'x', 'x', 0x27, 'x', 'x'

  .byte 'x', 'x', 0x22, 'x', 'x'

 

which makes a horrendous mess of my otherwise *EXTREMELY* neat and tidy sources!  (because this is important! :)

Legibility is important.

To get the effect wanted, mark4th might have to resort to code generation.

 

"Demons after money.
Whatever happened to the still beating heart of a virgin?
No one has any standards anymore." -- Giles


skeeve wrote:

That is a gratuitous nasty.

Unless a contrary theory is laid out, I'll stick behind my comment.  OP wants everything neatly lined up. Escape sequences "ruin" that.

skeeve wrote:
mark4th's tidiness comment was a comparison with something much clunkier, but deliberately excluded from theusch's quotation.

You say "clunkier"; I say "OP wants everything to line up".  Obviously I read it wrong. <insert eye roll here>  I saw pretty strong words for aligning...horrendus extremely neat tida

 

 

You can put lipstick on a pig, but it is still a pig.

I've never met a pig I didn't like, as long as you have some salt and pepper.


My sources are extremely neat and tidy because if they are not then they are no better than an unmade bed.  Doing "....\xx..." does not impact this.  If you look at my github you will see sources that are extremely easy to read.  I have had people who do not code asm OR forth tell me that they can read my sources.  I always look for the cleanest way to implement something; if I cannot do that I go for "whatever works".

If something can be read without effort then great effort has gone into its writing


I took "tidy" to imply that OP wanted legible strings in the design to remain legible strings in the source.

The occasional escape sequence is still preferable to separate bytes.

 

I've been bitten by 2-byte-by-2-byte "strings".

The author needed 2-byte characters.

The C implementation had 4-byte wchar_t's.

Searching for them was annoying.

 

I'd have used code generation or assembly macros.

None of them needed quotes.

"Demons after money.
Whatever happened to the still beating heart of a virgin?
No one has any standards anymore." -- Giles