How to rewrite "malloc" from scratch ? (with C)

The purpose of this post is to help you make these functions, but the work remains to you (however, I will give you pseudo-codes or code parts to help)! With these following explanations, you will understand the principle and then you will be able to make the functions by yourself. Right, let's start.

First malloc

This is a "dumb" memory allocation function as I use to call it. We will not keep it, but you can start to introduce your code with this one.

void* my_malloc(size_t size) {
    void* allocated = sbrk(0);
    void* request = sbrk(size);
    if (request == (void*) -1) {
        return nullptr;
    }
    return allocated;
}

Let me explain it :

void* : It's a pointer without an assigned type, then you can have a pointer pointing on any object, no worries about its type.
size_t : You have to create it if you don't use the standard library. It's simply a long unsigned int
nullptr : You also have to create it if you don't use the standard library. I defined it as #define nullptr 0

Now, we have sbrk(). According to its documentation :

brk() and sbrk() change the location of the program break, which defines the end of the process's data segment (i.e., the program break is the first location after the end of the uninitialized data segment). Increasing the program break has the effect of allocating memory to the process; decreasing the break deallocates memory. Source : manpages.ubuntu.com

We are also checking if the request was accepted and then we return a pointer on the allocated block.

Do you see the problem here? When we allocate an object, we want to free it after. But how can we know which memory block was allocated to free it?

Let's solve that problem.

Memory blocks

For each allocation, we will save the information of that block to a structure called BlockMeta : the meta-information of a memory block. It's true, the size in memory for each allocation will be increased but it's not so terrible, no worries.

Create your own from this scheme :

BlockMeta
 - size // how much space you are using in memory for this block
 - next // the following block
 - is_free ? // maybe you already freed it

Help about types

I chose to set a type for this structure : typedef s_block_meta* BlockMeta where s_block_meta is the structure

Before updating the malloc function, using the BlockMeta structure, we have to create two other functions to request memory space :

BlockMeta mem_find_free_block(BlockMeta* last, size_t size);
BlockMeta mem_request_space(BlockMeta last, size_t size);

And a global variable called BLOCK_BASE is set as nullptr (a pointer pointing on a NULL value) for the beginning. It will permit us to know while allocating if it's the first time we allocate memory. This will be seen further while writing our mem_alloc() function.

Find a free block in the memory

Here we are going to create this function:

BlockMeta mem_find_free_block(BlockMeta* last, size_t size);

At first, it creates a BlockMeta current variable and initialises it to BLOCK_BASE. In a loop, the value of last is updated to the current variable. And the current variable is updated to the current->next value.

This loop stops when the next block (the current block is free and if the size is correct for the asked allocation. When it finds a free block of the same size or bigger as the requested size, it breaks the loop and returns the current block being this block.

Request space in memory

Here we are going to create this function:

BlockMeta mem_request_space(BlockMeta last, size_t size);

Thanks to the sbrk() function, it requests a space of the requested size in the memory to the operating system.

Sometimes, sbrk() fails to allocate and returns a null pointer, so we have to check the return value.

Don't forget we actually don't allocate the requested size, but the size of a BlockMeta plus the requested size.

Consequently, a call to sbrk() does not look like this:
void* request = sbrk(size);
but this:
void* request = sbrk(size + sizeof(BlockMeta));

We will also update the last->next value to the value of the created block.

To initialise the created block, we do it as follows:

BlockMeta created_block = sbrk(0);

created_block->size = size;
created_block->next = nullptr;
created_block->is_free = false;

The value of created_block is returned by the function.

Allocate!

The memory allocation function simply looks like this:

void* mem_alloc(size_t size);

Before anything, we have to check of the requested size is not null. Otherwise, a null pointer is returned.

You can define a macro for requesting space in memory, it avoids rewriting the same code each time we want to do it.

#define REQUEST_SPACE(last) \
    created_block = mem_request_space(last, size); \
    if (!created_block) { return nullptr; } // system failure

As you can see, it calls our last function mem_request_space()

When it's the first allocation, so BLOCK_BASE is null, we don't have to find a free block on memory, we can directly request space in memory :

BlockMeta created_block;

if (!BLOCK_BASE) {
    REQUEST_SPACE(nullptr);
    BLOCK_BASE = created_block;
    return created_block + 1;
}

Here we update the value BLOCK_BASE for the next allocations so that we know they are not the first allocation (obviously).

However, when it's not the first allocation, we create the dear BlockMeta last we often have seen in our functions. It is initialised as last = BLOC_BASE.

Now we have to search for a free block in memory, thanks to our mem_find_free_block() function. The returned block is our created_block.

But sometimes no free block is found. We have to directly request space.

created_block = mem_find_free_block(&last, size)

if !created_block {
    request space
    return created_block + 1
}

Otherwise, we set the created_block as non-free and we return created_block + 1.

Free me that memory!

The memory-freeing function is simple:

void mem_free(void* pointer);

When the pointer is null, it returns directly. Otherwise, it has to retrieve to block corresponding to the pointer and set it as free.

But how do we retrieve the block corresponding to a pointer?

This is what we are going to see right as follows.

Retrieve blocks from pointers

Let's be efficient, here is the simple code to do that:

BlockMeta mem_block_from_pointer(void* pointer) {
    return (BlockMeta) pointer - 1;
}

Allocate, again!

The reallocate function looks like a mix of the free function and the allocate function:

void* mem_realloc(void* pointer, size_t size);

Indeed, it retrieves the block corresponding to the pointer. If there is enough size in the retrieved block, the same pointer is returned, no need to reallocate. It is done as follows:

BlockMeta block = block_from_pointer(pointer)

// Enough space, it simply returns the same memory block 
if block->size >= size {
    return pointer
}

But when there is not enough space, we have to reallocate. It's simple, we call our mem_alloc() with the new size (don't forget to check if the returned pointer is null).

Then, we copy the content of the current block to the new block before calling mem_free() on the pointer. The returned value is the new pointer got from our previous call to mem_alloc().

Wait, we have no function to do a memory copy.

This is what we are going to see right as follows.

Utility functions on memory

Here are three useful functions on memory:

void mem_copy(void* dest, const void* source, size_t size);
bool mem_comp(const void* first, const void* second, size_t size);
void mem_set(void* dest, int fill_value, size_t size);

The first one copies the content of the source block to the destination block.
The second one compares the content of two blocks.
The third one fills a destination block with an integer value.

Copy, copy, copy, copy...

Copying is easy, we browse the content of a block and set each value to the other.

for i = 0; i < size; i++ {
    dest[i] = source[i]
}

But this cannot be done as easily in C because we cannot update the pointed value of a void pointer. To access the ith element of a void pointer, we have to convert it to an int pointer as follows:

void* x = ...
// This: `x[i] = ...` cannot be done!
((int*) x)[i] = ...

So, here is the assignment we do in the loop:

((int*) dest)[i] = ((int*) source)[i];

Compare two blocks

It is simple as copying memory but instead of assignment, we do a comparison. We the two values are not equal, the function returns false.

Fill a memory block with a value

It is simple as copying memory but instead of assigning the ith value of a source block, we assign each value to the given fill_value.

Conclusion

Congratulations, you have created a tiny library to allocate, reallocate and free blocks in memory but you have also learnt to do some optimization by reusing the old blocks instead of wasting them.

Here is the repository of my "memlib" project: https://github.com/antoninhrlt/memlib. Take a look if you have doubts.

Don't hesitate to post a comment if you need further help.

Test your functions to be sure you have done everything as well!

Thank you for reading this article.

Antonin Hérault