Idea for compaction and pointer handling

Go To Last Post
20 posts / 0 new
Author
Message
#1
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Before I begin, please disregard whether compaction would even be viable inside AVR. I am just brainstorming here.

 

Assume a heap that has fragmented memory. I want to compact all live objects. I would first compact live objects in terms of physical address, and then update reference pointers. It is this second step that I am trying to work out here.

 

Firstly, to even identify a live object, we'll need some sort of mark-algorithm. Assume we have perfect garbage collection and any dereferenced pointers are immediately removed. My idea was to maintain an array of pointers, where every newly allocated object is added to this array. This array would then hold all live objects. In order to access these objects, you must go through this array as the root stack pointers just point to these array pointers.

 

Then when the memory is physically compacted, the array pointers are updated to their new address. This array could be stored in a reserved area that won't be compacted itself (possibly between the heap and the .bss segment).

 

How realistic is this array in terms of size? Would my math be correct in saying you'd need (heap size/pointer size) number of array values? Have I just misunderstood how pointers work (very possible)?

 

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 1

If you mean a normal malloc/free heap then once a pointer has been allocated how could you possibly get back to the user code and tell it "you know that pointer to 0x1234 I recently issued to you? It now points to 0x2345". There is no such mechanism.
.
Of course on a CPU with an MMU and virtualised memory pointers you can issue a pointer to 0x1111 which behind the scenes is virtually mapped to 0x1234 then later you can change that internally to 0x2345 but the user always believes it's 0x1111 and is never aware of the change. That's why Linux and Windows don't have heap fragmentation problems. But in simple CPUs with no MMU you cannot do such tricks.
.
This is the exact reason for not using malloc /free on non-MMU CPUs.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

If something like that is usefull it's probably already implemented in various ways in different programming languages. Maybe as part of a "garbage collector".

But C / C++ does not do garbage collection.

Stuff like this is only acceptable on systems where random interruptions of milli seconds or more outside of the programmer's influence are acceptable.

 

I even think there are nowadays also all kind of virtual address spaces, where fragmented memory is handled in hardware and transpararent to the programmer.

Modern CPU's are mind blowingly complex.

 

C++ does have "std::vector" and other classes for data storage which can move data around internally if needed.

I have no idea how smart the internals of such classes are.

C++ is a moving target.

It might be a good idea for you to look into the different versions of the iso cpp standards:

https://isocpp.org/

 

But all this stuff is never going to be relevant on uC's the size of AVR's.

Stackoverflow is probably a better forum for questions like this.

Paul van der Hoeven.
Bunch of old projects with AVR's:
http://www.hoevendesign.com

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Is your current program having issues? Are you now trying to have it run as slow as possible....all of this monitoring/overhead will take a huge toll.

When in the dark remember-the future looks brighter than ever.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

clawson wrote:
If you mean a normal malloc/free heap then once a pointer has been allocated how could you possibly get back to the user code and tell it "you know

Could it be that this is the realm of "managed pointers"?

https://duckduckgo.com/html?q=%2...

 

Here a reference to "smart pointers" and it's use in C++:

https://en.wikipedia.org/wiki/Sm...

 

Edit:

I'm a bit out of my depth here. (See Johan below).

Paul van der Hoeven.
Bunch of old projects with AVR's:
http://www.hoevendesign.com

Last Edited: Sat. Jan 27, 2018 - 05:01 PM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Paulvdh wrote:
Could it be that this is the realm of "managed pointers"?

I associate the use of the term  "managed pointers" with .Net environments. After a bit of reading it seems to boil down to something very similar to references in C++. So, for all that I can see, not applicable to the OQ (original question). I read this: https://www.infoq.com/news/2015/... .

 

Paulvdh wrote:
Here a reference to "smart pointers"

I can't see how this applies to the OQ. A smart pointer in C++ is a (automatic allocated) container for a pointer to a dynamically allocated object. In short, it will guarantee that the object gets freed when the container goes out of scope. This is what is often referred to as "RAII". Although the acronym is read out as Resource Allocation Is Initialization, it is mostly about this guarantee of de-allocation which 1) guarantees that the destructor will run and 2) that there will be no memory leak (so the slight connection to the OQ is that it applies to allocations schemes like the C++ one - which does not sport reference counting and garbage collection built into the heap management system).

 

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

clawson wrote:
If you mean a normal malloc/free heap then once a pointer has been allocated how could you possibly get back to the user code and tell it "you know that pointer to 0x1234 I recently issued to you? It now points to 0x2345". There is no such mechanism.

 

Not sure what you mean here. Are you saying there is no way to trace back up a pointer?

 

avrcandies wrote:

Is your current program having issues? Are you now trying to have it run as slow as possible....all of this monitoring/overhead will take a huge toll.

 

I'm on an environment that needs all of this overhead to be evaluated. 

 

Thanks for the responses everyone, I appreciate it :)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

ThePageMan wrote:
I'm on an environment that needs all of this overhead to be evaluated.

 

On small uC's (such as AVR) it is veryunusual to use malloc. All memory allocation is usually static, or on the stack for functions.

If there is not enough memory, then switch to a bigger device.

Alternatively, if the application allows, You can declare a (relatively) big array to hold temporary data and re-use that array for differnt data at different times and places in your program.

 

The easiest and best way to handle memory fragmentation on small uC's it to not fragment thememory at all.

uC's dont run programs for a few hours, but run continously for their whole lifetime ( 20 years or so...).

 

On Harvard archtitecture chips it is often neccessary to tell the compiler to store static data ONLY in flash.

An often made mistake is to declare (static or not) strings, which permanently occupy valuable ram space.

Paul van der Hoeven.
Bunch of old projects with AVR's:
http://www.hoevendesign.com

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

ThePageMan wrote:

clawson wrote:
If you mean a normal malloc/free heap then once a pointer has been allocated how could you possibly get back to the user code and tell it "you know that pointer to 0x1234 I recently issued to you? It now points to 0x2345". There is no such mechanism.

 

Not sure what you mean here. Are you saying there is no way to trace back up a pointer?

 

In a normal C program if I 

char * p = (char *)malloc(20);

Then if this returns 0x1234 then this code using p can never know if the 20 bytes have been moved.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

ThePageMan wrote:
Not sure what you mean here. Are you saying there is no way to trace back up a pointer?

Adding to clawsons comment above, and using his example code snippet.

 

Given the address 0x1234 there is no easy/practical way to know that the pointer p points to it.

 

1) You need to do an exhaustive search of all of memory to find all pointers pointing to that address, and

2) For each memory location so investigated you also need to know it that location actually is a pointer.

 

The information needed for the second point more or less equates to having type information for each used memory location in the memory.

 

Now, for a static variable that might be possible if your build included debugging info, which is seldom for production code. Also, debugging info is never propagated all the way to the flash ROM of an AVR. So right there the possibility falls over on it's nose.

 

Similar for an automatically allocated variable (e.g. a function-local variable), only that it gets somewhat more complicated because of the possibility that the function might recursively call itself. Again, it falls over on it's nose.

 

For dynamically allocated object (i.e. allocated with malloc() ) it is literally impossible in the general case. E.g. you could have a pointer pointing to a dynamically allocated pointer pointing to a dynamically allocated pointer pointing to... (You get the picture.. Falling over.. Nose.. ).

 

The solution to this kind of problem is to implement a quite different approach. Let the pointer be a "heavy-wheight" object that not only holds the basic pointer (i.e. a memory address) but also some tag clearly indicating that it is a pointer (and then you might as well include what type of object it potentially points to. Then let all other types of dynamically allocated objects (e.g. structs) also have an embedded tag indicating the objects type. Now you can pick such an object and then search through all objects that are pointers and see if the first object is pointed to by any (or several) of the pointers you find. This, or something similar, is what e.g. the Java run-time and the .Net framework does. This also allows for objects that no longer are pointed to by any pointer to be discarded in a controlled way. Such a discarding algorithm could e.g. be run when memory becomes scarse, or periodically when there is low load on the system, to free up memory. One problem still prevails, and that is fragmented memory, but this problem can also be solved with the information present. Objects can be moved around to "tidy up the memory" and when an object is moved all pointers pointing to it can also be located and corrected. Since objects that are moved might be pointers pointing to other objects this process can easily become costly in CPU time. And you can see that it is costly in memory  consumption. The "tidying-up" process is called "Garbage Collection".

 

C is a language designed to be highly efficient both when it comes to CPU time and memory consumption. Therefore no such scheme is generated for C programs. C pointers are as light weight as they can be. When a C program is running any pointer is just a simple address (referring to some point in memory). Nothing more.

 

C pointers are sometimes illustrated as arrows, e.g. drawn on a paper, but while generally good that allegory does not hold all the way. On that piece of paper you can clearly see where the arrow starts. Not so for (C) pointers.

 

Think of them more as road signs. Now, imagine yourself standing in a town into which many roads lead. Trying to see what points to a certain memory location is akin to standing in that town and ask "What road-signs point to here?" (actually even worse, because when you do an exhaustive search over all of the road network you immediately can identify everything that serves as a road sign - but as noted above, this information is not present for pointer in memory in a C running program.)

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

Last Edited: Sun. Jan 28, 2018 - 01:48 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Paulvdh wrote:
But C / C++ does not do garbage collection.
Python has garbage collection.

Paulvdh wrote:
C++ does have "std::vector" and other classes for data storage which can move data around internally if needed.
The Boost library can extend C++.

Boost.Pool adds storage pools (iow memory pools) that might be "good enough" though Pool does not have the required compaction.

Paulvdh wrote:
But all this stuff is never going to be relevant on uC's the size of AVR's.
There's a Python 2.5 subset for megaAVR though it's dormant.

There's an active and growing Python 3 for MCU in MicroPython.

Some of the MicroPython ports that may be of interest :

  • dsPIC (one model, 16KB RAM, 32KB RAM at low effort); PIC24 would be some effort though some PIC24 have significant SRAM EBI (8b data, 24b address IIRC)
  • Zephyr RTOS (x86 Quark, x86-64 Atom, SAM4S, SAM E70, multiple other arm, ESP32)
  • ESP32 (shared common effort between MicroPython and Pycom)

PIC32 - If the application isn't real-time or near real-time then could consider porting Python to LiteBSD on PIC32MZ.

PIC32MZ is not coin cell friendly cheeky

 


Python

https://docs.python.org/3/c-api/memory.html#overview

Memory management in Python involves a private heap containing all Python objects and data structures.

...

Consequently, under certain circumstances, the Python memory manager may or may not trigger appropriate actions, like garbage collection, memory compaction or other preventive procedures. 

...

https://theboostcpplibraries.com/boost.pool

Python

Python

https://wiki.python.org/moin/PyMite

What is PyMite

PyMite is a flyweight Python interpreter written from scratch to execute on 8-bit and larger microcontrollers with resources as limited as 64 KiB of program memory (flash) and 4 KiB of RAM. PyMite supports a subset of the Python 2.5 syntax and can execute a subset of the Python 2.5 bytecodes. ...

...

https://micropython.org/

(about 3/4 page)

  • ...
  • a simple, fast and robust mark-sweep garbage collector for heap memory
  • ...
  • support for running Python code on a hard interrupt with minimal latency
  • ...
  • a native emitter that targets machine code directly rather than the bytecode virtual machine
  • ...

Note: MicroPython is in C; should be able to preempt the Python interpreter.

 

https://github.com/micropython/micropython/tree/master/ports

http://www.microchip.com/wwwproducts/en/dsPIC33FJ256GP506A

Zephyr Project

https://www.zephyrproject.org/

Pycom - Next Generation Internet of Things Platform

https://pycom.io/

https://github.com/sergev/LiteBSD/wiki

LiteBSD is variant of 4.4BSD operating system for microcontrollers. Currently, only Microchip PIC32MZ family is supported.

PIC32MZ is a MIPS32 processor with MMU with paging support, and 512kbytes of on-chip RAM. These resources are enough to run 4.4BSD.

...

http://www.microchip.com/design-centers/32-bit/architecture/pic32mz-family

 

"Dare to be naïve." - Buckminster Fuller

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

gchapman wrote:
Paulvdh wrote:
But C / C++ does not do garbage collection.

Python has garbage collection.

 

Yes. So what. Clearly, the question here was in the context of "C"? 

As of January 15, 2018, Site fix-up work has begun! Now do your part and report any bugs or deficiencies here

No guarantees, but if we don't report problems they won't get much of  a chance to be fixed! Details/discussions at link given just above.

 

"Some questions have no answers."[C Baird] "There comes a point where the spoon-feeding has to stop and the independent thinking has to start." [C Lawson] "There are always ways to disagree, without being disagreeable."[E Weddington] "Words represent concepts. Use the wrong words, communicate the wrong concept." [J Morin] "Persistence only goes so far if you set yourself up for failure." [Kartman]

Last Edited: Sun. Jan 28, 2018 - 01:51 AM
  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

ThePageMan wrote:
I'm on an environment that needs all of this overhead to be evaluated.
eheap has piece-meal defrag and is thread-safe.

Micro Digital

eheap vs. dlmalloc
by Ralph Moore, smx Architect
February 2016

http://www.smxrtos.com/articles/eheapvdlmalloc.htm

...

 

Free Chunk Merging
... eheap permits deferred merging

...

 

Fragmentation
...

If merge control fails to stem heap failure, an eheap recovery service is provided to search the heap in order to find and merge adjacent free chunks to satisfy a failed allocation. If this fails, ...

 

Reliable Operation

...

Heap Scanning

The eheap heap scan service is intended to be run from the idle task so that it will not consume valuable processing time. ...

...

 

via

Micro Digital

eheap - Embedded Heap

http://www.smxrtos.com/eheap/

via

embedded.com

eheap - A new embedded heap manager

MARCH 14, 2016

https://www.embedded.com/electronics-blogs/break-points/4441633/eheap---A-new-embedded-heap-manager

 

"Dare to be naïve." - Buckminster Fuller

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

JohanEkdahl wrote:
I associate the use of the term  "managed pointers" with .Net environments.
In instance of that for MCU is .NET Micro (NETMF)

.NET Micro Framework

http://netmf.github.io/

...

The Microsoft® .NET Micro Framework combines the reliability and efficiency of managed code with the premier development tools of Microsoft Visual Studio® to deliver exceptional productivity for developing embedded applications on small devices. The Microsoft .NET Micro Framework SDK supports development of code, including device I/O, in the C# language using a subset of the .NET libraries, and is fully integrated with the Microsoft Visual Studio® development environment.

...

NETMF went legacy that lead to the birth of TinyCLR OS.

TinyCLR.com

TinyCLR OS

http://www.tinyclr.com/

"Dare to be naïve." - Buckminster Fuller

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

In some of our C++ projects we have implemented "smart pointers" which in our case is to do with "locking" but one could use a similar technique to create "indirect pointers" so their physical target address could be changed, but you couldn't do this in C.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

The OP is called ThePageMan and this concept is the answer to his problem.

 

The memory allocator has a big pool of memory from which several memory pages of fixed sizes 16, 32, 64, 128 (or whatever) are allocated to requesters.

Each page is obviously contiguous but subsequent requests, if successful, may not necessarily result in contiguous pages. The module requesting memory for example a large message must be prepared to split that message over several pages if the biggest free block can't handle it.

 

BTW: This idea isn't mine; it was the memory allocation mechanism in a framework I once used.

 

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Or you just use an MMU and an OS that virtualises memory ;-)

 

(In reality most virtualising OS typically setup the MMU with something like 4K pages)

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

Oh dear, I didn't realise I had this many responses.

 

It took me awhile to understand what the problem was, but every time I tried to say "Well surely you could just..." I realised I didn't know how to finish that sentence.

 

I think I see the flaw in my thinking. There's basically two directions we can tackle the problem.

 

If we compact the addresses first, C objects don't have any overhead to say which pointers are pointing at them. So there's no way of updating a pointer that its object is now at a different address.

 

If we traverse all pointers and compact as we go, other pointers pointing at the same address would become dangling pointers. Additionally, if that pointer is just pointing at another pointer, there would be no way of knowing. And that would create a big mess itself.

 

The only solution would be attach all this information as overhead to every single object. Something against the whole idea of C.

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

ThePageMan wrote:
If we compact the addresses first, C objects don't have any overhead to say which pointers are pointing at them. So there's no way of updating a pointer that its object is now at a different address.
In C you could do something like a callback. So when you malloc you not only ask for N bytes but extend malloc() to have a new parameter that is a callback. Something like:

typedef void (*fptr_t)(void *);

void * newMalloc(size_t, fptr_t);

uint8_t * myPtr;

void myPtr_Update(void * newAdr) {
    cli();
    myPtr = (uint8_t *)newAdr;
    sei();
}

int main(void) {
    myPtr = newMalloc(32, myPtr_Update);
    while(1) {
        // use myPtr
    }
}

The update code would need to run in an ISR context (perhaps a regular time "managing" memory?). On the occasion it moves myPtr it then calls back to myPtr_Update. But suppose this client code had made a local copy of myPtr - it won't know so it would need to be very controlled.

 

Using C++ and implementing "smart pointers" where all the things you could do like * dereference or [] array indexing have their operator overloaded. In the core there is a remapping table that maps the issued pointer to a physical address and that remapping can be adjusted.

 

But seriously, for non-MMU micros with limited RAM resources "Just Say No!" to malloc()!

  • 1
  • 2
  • 3
  • 4
  • 5
Total votes: 0

clawson wrote:
But seriously, for non-MMU micros with limited RAM resources "Just Say No!" to malloc()!

Microchip

Developer Help

Avoid Memory Allocation Catastrophe

http://microchipdeveloper.com/c:avoid-memory-allocation-catastrophe

...

(last paragraph)

Don’t fall for the seeming ease and expandability of “dynamic memory allocation” in embedded systems. Do the hard work. Define the performance envelope of your application. Figure out, beforehand, how to fail gracefully – degrading performance along the way if necessary. If you invite catastrophe, it will come.

via https://plus.google.com/+MicrochipTech/posts/XZFtRsVFa54

 

"Dare to be naïve." - Buckminster Fuller