Simplifying Memory Management in C | Security Journey & HackEDU Help Center

Source:

Memory Management Explained

Memory management is the process of controlling a computer's memory, which involves allocating and managing blocks of memory space for the operating system, applications, and other processes. This is crucial in systems where multiple processes may be running simultaneously. Unlike most other languages, C provides direct access to memory management, making it essential to understand proper techniques and rules to avoid memory leaks, crashes, and other issues.

Efficient memory management determines how well a program runs. The better and more effectively memory is allocated and managed, the smoother and faster the program operates. This is particularly important in embedded systems or any system with limited resources.

The Software Engineering Institute (SEI) has established a rule for memory management to maximize portability, performance, and safety. The information below summarizes that documentation (linked above).

Stack Memory vs. Heap Memory

Before delving into the SEI documentation, it's essential to understand the differences between stack memory and heap memory. If you're already familiar with these concepts, feel free to skip ahead as this will be a review.

Stack memory is allocated and managed automatically during runtime. It follows a Last-In, First-Out (LIFO) data structure, meaning the most recently added item is the first one to be removed. Stack memory is created and removed by the system as functions are called and return. When a function is called, a new stack frame containing local variables, parameters, and the return address is created. When the function returns, its stack frame is deallocated and the memory is released. Stack memory is typically faster than heap memory but is more limited in size, making it unsuitable for large data structures or for objects that must outlive the function that created them.

Heap memory, on the other hand, is dynamically allocated during runtime and must be manually freed by the programmer when no longer needed. Unlike stack memory, heap memory does not follow a specific order, allowing for greater flexibility when allocating and deallocating memory blocks. Heap memory is managed through functions such as malloc, calloc, and realloc, among others. These functions allow developers to request memory of any size at runtime, making it suitable for large data structures or objects with unpredictable lifetimes. However, it is the developer's responsibility to free the allocated memory to avoid memory leaks. Heap memory is generally slower than stack memory due to its non-contiguous allocation and the overhead of memory management functions.
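The difference between the two kinds of storage can be sketched in a few lines. This is a minimal illustration, not code from the SEI documentation; the function names sum_on_stack and copy_to_heap are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the array lives in this function's stack frame
   and is released automatically when the function returns. */
int sum_on_stack(void) {
  int values[4] = {1, 2, 3, 4};
  int total = 0;
  for (size_t i = 0; i < 4; i++) {
    total += values[i];
  }
  return total;
}

/* Hypothetical helper: returns a heap copy of text that outlives this call.
   The caller owns the result and must pass it to free(). */
char *copy_to_heap(const char *text) {
  size_t len = strlen(text) + 1;  /* include the terminating '\0' */
  char *copy = malloc(len);
  if (copy == NULL) {
    return NULL;                  /* allocation failed */
  }
  memcpy(copy, text, len);
  return copy;
}
```

Returning a pointer to values from sum_on_stack() would be an error, because that storage disappears on return. The heap copy stays valid until the caller frees it.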

Learning Checklist

After reading the following sections, you will understand:

1. Proper memory allocation using functions like malloc, calloc, and realloc

2. Importance of memory deallocation using the free function

3. Avoiding common pitfalls like double-free and use-after-free issues

4. Identifying and preventing memory leaks

5. Proper usage of NULL pointers and avoiding NULL pointer dereferences

Key Concepts and Best Practices in Memory Management

Memory Allocation

When using malloc(), calloc(), realloc(), or aligned_alloc(), it is crucial to allocate enough memory to store the objects properly. Additionally, do not pass size arguments taken directly from user input to these functions without validating them, as attackers could manipulate them to cause buffer overflows. While this might seem basic, real-world situations can make it challenging to spot and diagnose such issues. Consider the following example of a program allocating memory to store a 'person' object:

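#include <stdlib.h>

struct person {
  int age;
  char name[];  /* flexible array member */
};

void f(void)
{
  struct person *p;
  /* BUG: sizeof(p) is the size of the pointer, not of struct person */
  p = (struct person *) malloc(sizeof(p));
  if (p == NULL) {
    return;
  }

  /* BUG: no storage was allocated for the flexible array member name */
  *p = (struct person) { .age=25, .name="Charles" };
}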

In the example above, insufficient space is allocated for a 'struct person' object because the size of the pointer is used instead of the size of the pointed-to object. The correct malloc() line should be:

p = (struct person *) malloc(sizeof(*p));

Although it may seem like we're passing a dereferenced uninitialized pointer, it's acceptable in this context. The sizeof operator doesn't evaluate its operand (unless the operand is a variable-length array), so dereferencing an uninitialized or null pointer here is well-defined behavior.

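Another error in the code above relates to flexible array members. sizeof(*p) does not include any storage for the flexible array member 'name', so writing to it could lead to a buffer overflow vulnerability. To fix this, allocate memory for the flexible array member in the malloc() call like this (STRSIZE is assumed here to be the maximum length of the name, excluding the terminating null character):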

p = (struct person *) malloc(sizeof(*p) + (STRSIZE + 1) * sizeof(char));

Additionally, use functions like memcpy() to copy flexible array members, pass structures with flexible array members to functions using pointers, and avoid storing them on the stack.
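Putting these points together, a corrected allocation might look like the sketch below. The function person_create and the bound MAX_NAME_LEN are hypothetical names chosen for illustration; the bound stands in for whatever limit your program enforces on untrusted sizes.

```c
#include <stdlib.h>
#include <string.h>

struct person {
  int age;
  char name[];  /* flexible array member: must be the last member */
};

/* Hypothetical upper bound on name length; not taken from the source. */
enum { MAX_NAME_LEN = 255 };

/* Allocates a person with room for a copy of name.
   Returns NULL if name is too long or the allocation fails.
   The caller owns the result and must free() it. */
struct person *person_create(int age, const char *name) {
  size_t len = strlen(name);
  if (len > MAX_NAME_LEN) {
    return NULL;  /* reject oversized input before computing the size */
  }
  /* sizeof(*p) covers the fixed members; len + 1 covers name and its '\0'. */
  struct person *p = malloc(sizeof(*p) + len + 1);
  if (p == NULL) {
    return NULL;
  }
  p->age = age;
  memcpy(p->name, name, len + 1);
  return p;
}
```

The structure is only ever handled through a pointer, and the name is copied with memcpy() rather than by structure assignment, which would not copy the flexible array member.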

Finally, be cautious when using the realloc() function with allocated objects that have stricter alignment requirements than those guaranteed by malloc(). The C standard only requires realloc() to return a pointer suitably aligned for objects with fundamental alignment, so a stricter alignment obtained from aligned_alloc() may be lost after resizing. For a more detailed example, refer to the MEM36-C section in the source article.
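One way to resize an over-aligned block without relying on realloc() is to allocate a new aligned block, copy the contents, and free the old one. The sketch below assumes a C11 library that provides aligned_alloc(); the helper name resize_aligned is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resizes a block obtained from aligned_alloc() while
   keeping the requested alignment. old must be non-null, alignment must be
   one the implementation supports, and new_size should be a multiple of it.
   Returns NULL on failure, in which case old is untouched and still owned
   by the caller. */
void *resize_aligned(void *old, size_t old_size, size_t new_size,
                     size_t alignment) {
  void *fresh = aligned_alloc(alignment, new_size);
  if (fresh == NULL) {
    return NULL;
  }
  /* Copy only as many bytes as both blocks can hold. */
  memcpy(fresh, old, old_size < new_size ? old_size : new_size);
  free(old);
  return fresh;
}
```

This trades the possible in-place growth of realloc() for a guaranteed alignment, which is usually the right choice when the alignment is a correctness requirement.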

Memory Deallocation

When working with memory management, it's crucial to be cautious while freeing memory. Some common issues include accessing freed memory through a pointer, which can involve dereferencing the pointer, using it in arithmetic operations, type casting it, or using it as the right-hand side of an assignment. Accessing freed memory through any of these methods results in undefined behavior. Such pointers, known as dangling pointers, must be avoided.

Using dangling pointers is dangerous because their values are indeterminate and might be trap representations. A trap representation is an object representation that does not represent a value of the object type. Fetching a trap representation may cause a hardware trap: the processor detects an invalid bit pattern, temporarily halts the current execution flow, and transfers control to a handler before normal operation can resume. If an attacker can trigger this condition, it could lead to a Denial of Service (DoS) vulnerability and affect the system.
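A common defensive habit, shown here as a minimal sketch, is to set a pointer to NULL immediately after freeing it. The helper name free_and_clear is hypothetical. Note that this only protects the one pointer variable passed in; any other copies of the same address remain dangling.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *pp and sets it to NULL so the caller
   is not left holding a dangling pointer. */
void free_and_clear(int **pp) {
  if (pp != NULL) {
    free(*pp);   /* free(NULL) is a no-op, so a repeated call is harmless */
    *pp = NULL;
  }
}
```

Because the pointer is NULL after the first call, a second call does not become a double free, and an accidental dereference is more likely to fail visibly than to silently read freed memory.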

Care must also be taken when freeing complex data structures, such as linked lists. Improperly freeing a node can leave a trail of dangling pointers. The example below, provided by Brian Kernighan and Dennis Ritchie, demonstrates an incorrect way to free memory associated with a linked list. In this example, p is freed before p->next is evaluated in the loop's update expression, causing p->next to read memory that has already been freed.

#include <stdlib.h>

struct node {
  int value;
  struct node *next;
};

void free_list(struct node *head) {
  for (struct node *p = head; p != NULL; p = p->next) {
    free(p);
  }
}

The correct solution is to store a reference to p->next in a temporary pointer q before freeing p, and then advance p to q.

#include <stdlib.h>

struct node {
  int value;
  struct node *next;
};

void free_list(struct node *head) {
  struct node *q;
  for (struct node *p = head; p != NULL; p = q) {
    q = p->next;  /* save the link before the node is freed */
    free(p);
  }
}
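To see the pattern in use, the following self-contained sketch builds a short list and frees it. The helpers push_front and free_list_counted are hypothetical names, and the while loop is simply another way to write the same "save the link, then free" idea.

```c
#include <stdlib.h>

struct node {
  int value;
  struct node *next;
};

/* Hypothetical helper: prepends a node and returns the new head,
   or NULL if allocation fails (the existing list is left unchanged). */
struct node *push_front(struct node *head, int value) {
  struct node *n = malloc(sizeof(*n));
  if (n == NULL) {
    return NULL;
  }
  n->value = value;
  n->next = head;
  return n;
}

/* Frees every node and reports how many were released. */
size_t free_list_counted(struct node *head) {
  size_t count = 0;
  while (head != NULL) {
    struct node *next = head->next;  /* read the link before freeing */
    free(head);
    head = next;
    count++;
  }
  return count;
}
```

Calling push_front() three times on an initially NULL head and then free_list_counted() on the result should release three nodes, and passing NULL releases none.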

Additionally, always free dynamically allocated memory as soon as it is no longer needed. One exception to this is when allocated memory is assigned to a pointer whose lifetime includes program termination. In such cases, the memory doesn't need to be freed.
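Leaks often appear on error paths, where a function returns early after only some of its allocations succeeded. One widely used way to avoid this, sketched below, is to route every exit through a single cleanup label. The function joined_length is a hypothetical example, not part of the SEI documentation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical function: builds "first second" in temporary heap buffers
   and returns its length, or -1 if an allocation fails.
   Every exit path releases whatever was allocated. */
int joined_length(const char *first, const char *second) {
  int result = -1;
  char *left = NULL;
  char *joined = NULL;
  size_t left_len = strlen(first);
  size_t right_len = strlen(second);

  left = malloc(left_len + 1);
  if (left == NULL) {
    goto cleanup;
  }
  memcpy(left, first, left_len + 1);

  /* Room for left, one space, right, and the terminating '\0'. */
  joined = malloc(left_len + 1 + right_len + 1);
  if (joined == NULL) {
    goto cleanup;  /* left is still released below */
  }
  memcpy(joined, left, left_len);
  joined[left_len] = ' ';
  memcpy(joined + left_len + 1, second, right_len + 1);
  result = (int) strlen(joined);

cleanup:
  free(joined);    /* free(NULL) is a no-op */
  free(left);
  return result;
}
```

Initializing the pointers to NULL is what makes the shared cleanup safe: buffers that were never allocated are simply skipped by free().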

Be careful not to accidentally free memory that wasn't dynamically allocated, as it can result in undefined behavior, heap corruption, or other severe errors. The same issue can occur if realloc() is supplied a pointer to non-dynamically allocated memory. This doesn't apply to null pointers since the C standard guarantees that if free() is passed a null pointer, no action occurs.

The example below illustrates erroneous use of realloc(), where the pointer argument buf refers to an array with automatic storage rather than to dynamically allocated memory.

#include <stdlib.h>

enum { BUFSIZE = 256 };

void f(void) {
  char buf[BUFSIZE];
  char *p = (char *) realloc(buf, 2 * BUFSIZE);
  if (p == NULL) {
    // Handle error
  }
}

The solution is to ensure that buf is allocated dynamically. In real-world programs, it can be easy to mistakenly free memory that wasn't dynamically allocated, leave dangling pointers, or incorrectly size allocations for objects. Always keep track of allocations and free them as needed throughout your program. There are tools to assist with various aspects of memory management, including:

Valgrind: An open-source memory debugging tool that can detect memory leaks, buffer overflows, and other memory-related issues in C/C++ programs.
AddressSanitizer (ASan): A fast memory error detector built into the LLVM/Clang and GCC compilers.
LeakSanitizer (LSan): A memory leak detector integrated with AddressSanitizer.
Dr. Memory: A memory monitoring tool for Windows, Linux, and Mac that can detect memory leaks, uninitialized memory reads, and other memory-related issues in C/C++ programs.
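Returning to the realloc() example above, a corrected version might look like the sketch below, where the buffer comes from malloc() and the result of realloc() is held in a temporary so the original block is not lost on failure. The helper name grow_buffer is hypothetical. With GCC or Clang, building such code with -fsanitize=address is one way to have AddressSanitizer check it at runtime.

```c
#include <stdlib.h>

enum { BUFSIZE = 256 };

/* Hypothetical helper: resizes a heap buffer to new_size bytes
   (new_size is assumed to be nonzero).
   Returns 1 on success, with *buf pointing at the resized block.
   Returns 0 on failure, with *buf unchanged and still valid. */
int grow_buffer(char **buf, size_t new_size) {
  char *tmp = realloc(*buf, new_size);
  if (tmp == NULL) {
    return 0;  /* the original block is still owned by the caller */
  }
  *buf = tmp;
  return 1;
}
```

A caller would allocate the initial buffer with malloc(BUFSIZE), call grow_buffer(&buf, 2 * BUFSIZE), and eventually free(buf) regardless of whether the resize succeeded.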

Quiz Yourself

Question 1: What are the key differences between stack memory and heap memory?

a) Stack memory is dynamically allocated, while heap memory is automatically managed.

b) Stack memory follows a LIFO data structure, while heap memory does not have a specific order.

c) Heap memory is faster than stack memory.

d) Stack memory is suitable for large data structures, while heap memory is not.

Question 2: What is a trap representation?

a) A trap representation is a representation of a C program that is difficult to debug.

b) A trap representation is an object representation that does not represent a value of the object type.

c) A trap representation is a way to allocate memory in the heap.

d) A trap representation is a pointer to memory that has been deallocated.

Question 3: What is the primary reason to avoid accidentally freeing memory that wasn't dynamically allocated?

a) It increases program performance.

b) It simplifies the program's code structure.

c) It reduces the memory footprint of the program.

d) It results in undefined behavior and can cause heap corruption or other serious errors.

Correct answers: B, B, D