0x322 Memory

1. Virtual Memory
2. Segmentation translation
- 2.1. Paging translation
3. Pages
- 3.1. Zones
- 3.2. Allocation
4. Kernel API
- 4.1. memory management
5. POSIX
- 5.1. Shared Memory
- 5.2. Memory Allocation
6. Reference

Suppose we are running multiple processes concurrently. How should the memory space be assigned to them?

One way is to give the entire memory to one process at a time and save the entire memory to disk when switching processes. However, the switching cost is too high.

The other way is to have all the processes reside concurrently in the same memory space. This makes protection an important issue. Additionally, to make it easy for users to access memory, we need to introduce memory virtualization.

There are 3 goals in designing memory virtualization:

transparency: the program should not be aware of the fact that memory is virtualized.
efficiency: it should be efficient in terms of both time and space.
protection: it should protect processes from one another by enabling isolation.

1. Virtual Memory

There are three types of addresses the kernel handles.

logical address: used in the instruction set, consists of segment + offset. This address is visible in user space.
linear address (virtual address): the address translated from the logical address by the segmentation unit.
physical address: the actual address of the memory cell, translated from the linear address by the paging unit.

The kernel is responsible for setting up segmentation and paging to perform virtual memory translation (the actual translation is done by the MMU, with the TLB caching recent translations). Segmentation is effectively disabled in current desktop OSes, because the segments selected by the segment registers (cs, ds, ss (stack segment)...) are set up with a base address of 0.

The kernel handles these two translations differently in real mode and in protected mode. On x86, protected mode is enabled by the PE bit of the CR0 register.

2. Segmentation translation

real mode: the translation rule is segment * 16 + offset, which yields a 20-bit address. The entire memory space is 1MB (20 bits), and each segment spans 64KB.
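The real-mode rule can be sketched in a few lines of C. The function name real_mode_linear is made up for illustration, and the final mask to 20 bits is an assumption: it models a machine where addresses past 1MB wrap around (A20 line disabled).

```c
#include <stdint.h>

/* Real-mode translation: linear = segment * 16 + offset.
 * Both inputs are 16-bit register values.
 * Assumption: the result is masked to 20 bits, so addresses
 * past 1MB wrap around (as with the A20 line disabled). */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return linear & 0xFFFFFu;
}
```

For example, segment 0x1234 with offset 0x5678 gives 0x12340 + 0x5678 = 0x179B8.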

protected mode: the segment register stores a segment selector instead of an address. The segment selector is used to select a segment descriptor in the GDT (global descriptor table); the base linear address and the size of the segment are retrieved from the segment descriptor.
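To make the selector idea concrete, here is a small sketch that splits a 16-bit x86 segment selector into its fields. The struct and function names are hypothetical. The field layout (descriptor index in bits 15..3, table indicator in bit 2, requested privilege level in bits 1..0) comes from the x86 architecture rather than from the notes above, which only mention the GDT.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (names are illustrative). */
struct selector_fields {
    uint16_t index; /* bits 15..3: index of the descriptor in the table */
    uint8_t  ti;    /* bit 2: table indicator, 0 = GDT, 1 = LDT */
    uint8_t  rpl;   /* bits 1..0: requested privilege level */
};

struct selector_fields decode_selector(uint16_t selector)
{
    struct selector_fields f;
    f.index = (uint16_t)(selector >> 3);
    f.ti    = (uint8_t)((selector >> 2) & 0x1);
    f.rpl   = (uint8_t)(selector & 0x3);
    return f;
}
```

Decoding the value 0x23, for instance, yields index 4 in the GDT with privilege level 3.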

2.1. Paging translation

On a 32-bit architecture, the address space is 0x00000000 - 0xffffffff. With the default 3GB/1GB split on 32-bit x86 Linux:

user space is 0x00000000 - 0xbfffffff
kernel space is 0xc0000000 - 0xffffffff

3. Pages

3.1. Zones

3.2. Allocation

kmalloc/kfree
vmalloc

4. Kernel API

4.1. memory management

copy_from_user: may block due to a page fault
copy_to_user

5. POSIX

This section describes the memory-related APIs in user space.

5.1. Shared Memory

API (mmap, munmap (2)) mmap creates a mapping and munmap removes it. mmap can select whether the mapping is private (MAP_PRIVATE) or shared (MAP_SHARED).

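mmap can map a file or map anonymous memory (an option for memory allocation).
On Linux, offset must be a multiple of the page size, and addr must be page aligned when MAP_FIXED is used (otherwise it is only a hint). length is rounded up to a multiple of the page size (the page size can be retrieved with getpagesize(2) or sysconf(_SC_PAGESIZE)).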
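A minimal sketch of anonymous mapping, assuming Linux with glibc. The helper names anon_mapping_roundtrip and round_up_to_page are made up for illustration; the second one only shows the rounding that the kernel effectively applies to length.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to a multiple of the page size. */
size_t round_up_to_page(size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return (len + page - 1) / page * page;
}

/* Map len bytes of private anonymous memory, write to it, unmap it.
 * Returns 0 on success, -1 on failure. */
int anon_mapping_roundtrip(size_t len)
{
    if (len == 0)
        return -1; /* mmap rejects a zero length */

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);
    int ok = (p[0] == 0xAB && p[len - 1] == 0xAB);

    if (munmap(p, len) != 0)
        return -1;
    return ok ? 0 : -1;
}
```

Passing MAP_SHARED together with a file descriptor instead of MAP_PRIVATE | MAP_ANONYMOUS would map a file whose modifications are visible to other processes.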

API (mprotect, madvise, mlock, msync (2))

mprotect: change protection (PROT_READ, PROT_WRITE, PROT_EXEC) of a region
madvise: tell the OS the expected access pattern so that it can make good guesses (e.g. random or sequential)
mlock: lock the region in RAM to prevent it from being swapped out
msync: force changes in a mapped region to be written back to the file (sync or async)
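The following sketch exercises two of these calls on one anonymous page, assuming Linux with glibc. The function name protect_and_advise_demo is hypothetical. mlock and msync are left out on purpose: mlock can fail under a low RLIMIT_MEMLOCK, and msync is only meaningful for file-backed mappings.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and madvise on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if every call succeeds, -1 otherwise. */
int protect_and_advise_demo(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    int rc = -1;

    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 'x'; /* allowed: the page is still writable */

    /* Hint that the region will be accessed sequentially. */
    if (madvise(p, page, MADV_SEQUENTIAL) != 0)
        goto out;

    /* Drop write permission; a later store would raise SIGSEGV. */
    if (mprotect(p, page, PROT_READ) != 0)
        goto out;

    if (p[0] != 'x') /* reads are still permitted */
        goto out;

    rc = 0;
out:
    munmap(p, page);
    return rc;
}
```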

5.2. Memory Allocation

API (brk, sbrk (2)) heap allocation system calls; they change the current program break.

brk specifies the new program break address
sbrk specifies an increment and returns the previous program break. sbrk(0) returns the current program break
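As a rough illustration, the sketch below moves the program break with sbrk and measures how far it moved. The function name break_growth is made up, and the code assumes a single-threaded program on a C library whose sbrk accepts non-zero increments (glibc does; some libraries always fail with ENOMEM, in which case the function returns -1).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes sbrk on glibc */
#endif
#include <stdint.h>
#include <unistd.h>

/* Grow the heap by `bytes`, report how far the program break moved,
 * then shrink it back. Returns -1 if sbrk fails. */
long break_growth(intptr_t bytes)
{
    char *before = sbrk(0);            /* current program break */
    if (before == (char *)-1)
        return -1;

    if (sbrk(bytes) == (void *)-1)     /* returns the previous break */
        return -1;

    char *after = sbrk(0);
    sbrk(-bytes);                      /* give the memory back */
    return (long)(after - before);
}
```

Mixing direct sbrk calls with malloc in real programs is fragile, since the allocator may be moving the same break.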

API (malloc, free (3)) memory allocation can be implemented with brk, sbrk or mmap.

free memory blocks are managed (as a linked list) in user space to reduce syscalls

malloc first searches for a free block in its current memory lists. If one is found, it returns the block and marks it as used. If none is found, it calls brk or sbrk to obtain new memory. Reusing blocks this way avoids issuing a system call for every allocation. free returns the block to the managed memory lists (in user space) without calling brk or sbrk.

first fit strategy: used by the K&R malloc and by malloc implementations in embedded systems. It finds the first block whose size is large enough. The problem is memory fragmentation.
best fit strategy: used by the glibc malloc implementation. It finds the smallest block whose size is large enough.
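The first fit idea can be sketched with a toy allocator. This is not the K&R or glibc code: all names (ff_alloc, ff_free, block_t, ARENA_SIZE) are hypothetical, the arena is a single fixed-size region obtained once with mmap, and the 16-byte alignment is an assumption. A real malloc would request more memory where this sketch returns NULL.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>

#define ARENA_SIZE 4096u
#define ALIGNMENT  16u

typedef struct block {
    size_t size;        /* payload bytes that follow the header */
    int free;           /* 1 if the block is available */
    struct block *next; /* next block in address order */
} block_t;

static block_t *head = NULL;

static size_t align_up(size_t n)
{
    return (n + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
}

void *ff_alloc(size_t n)
{
    size_t hdr = align_up(sizeof(block_t));
    if (n == 0 || n > ARENA_SIZE)
        return NULL;
    n = align_up(n);

    if (head == NULL) { /* lazily create one big free block */
        void *arena = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                           MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (arena == MAP_FAILED)
            return NULL;
        head = arena;
        head->size = ARENA_SIZE - hdr;
        head->free = 1;
        head->next = NULL;
    }

    for (block_t *b = head; b != NULL; b = b->next) {
        if (!b->free || b->size < n)
            continue; /* first fit: take the first block big enough */
        if (b->size >= n + hdr + ALIGNMENT) { /* split off the remainder */
            block_t *rest = (block_t *)((unsigned char *)b + hdr + n);
            rest->size = b->size - n - hdr;
            rest->free = 1;
            rest->next = b->next;
            b->size = n;
            b->next = rest;
        }
        b->free = 0;
        return (unsigned char *)b + hdr;
    }
    return NULL; /* a real malloc would call brk/sbrk or mmap here */
}

void ff_free(void *p)
{
    if (p == NULL)
        return;
    size_t hdr = align_up(sizeof(block_t));
    block_t *b = (block_t *)((unsigned char *)p - hdr);
    b->free = 1;
    /* Merge with following free blocks to limit fragmentation. */
    while (b->next != NULL && b->next->free) {
        b->size += hdr + b->next->size;
        b->next = b->next->next;
    }
}
```

Note that ff_free never issues a system call; it only updates the list kept in user space, which is the behaviour described above. A best fit variant would scan the whole list and remember the smallest block that is large enough instead of stopping at the first match.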

API (calloc, realloc (3))

calloc: malloc with zero initialization
realloc: resizes an existing allocation, possibly moving it; can be used in vector, map implementations
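As an illustration of the vector use case, here is a minimal growable array of int built on calloc and realloc. The type and function names (int_vec, int_vec_init_zeros, int_vec_push, int_vec_free) are made up, and doubling the capacity is just one common growth policy.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    int *data;
    size_t len; /* elements in use */
    size_t cap; /* elements allocated */
} int_vec;

/* Create a vector holding n zeros. Returns 0 on success, -1 on failure. */
int int_vec_init_zeros(int_vec *v, size_t n)
{
    size_t cap = n ? n : 1;
    v->data = calloc(cap, sizeof *v->data); /* zero-initialized */
    if (v->data == NULL)
        return -1;
    v->len = n;
    v->cap = cap;
    return 0;
}

/* Append a value, doubling the capacity when full.
 * Returns 0 on success, -1 on failure (the vector is left unchanged). */
int int_vec_push(int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return -1; /* the old buffer is still valid */
        v->data = p;   /* realloc may have moved the block */
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

void int_vec_free(int_vec *v)
{
    free(v->data);
    v->data = NULL;
    v->len = v->cap = 0;
}
```

Assigning the result of realloc to a temporary first avoids leaking the original buffer when the call fails.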

API (memalign, posix_memalign (3)) allocate memory with a specific alignment. Useful for SIMD instruction sets such as SSE and AVX.
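A short sketch of an aligned allocation with posix_memalign follows. The wrapper name alloc_aligned_floats is hypothetical. posix_memalign requires the alignment to be a power of two and a multiple of sizeof(void *); 32 bytes is the natural alignment for 256-bit AVX loads.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* exposes posix_memalign */
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate `count` floats aligned to `alignment` bytes.
 * Returns NULL on failure (including an invalid alignment).
 * The result is released with free(). */
float *alloc_aligned_floats(size_t count, size_t alignment)
{
    void *p = NULL;
    if (posix_memalign(&p, alignment, count * sizeof(float)) != 0)
        return NULL;
    return p;
}
```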

API (alloca (3)) allocate memory on the stack; it is released automatically when the calling function returns
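A small sketch of alloca used as scratch space, assuming a platform that provides alloca.h (such as Linux with glibc). The function is_palindrome is only an illustration, and it is suitable for short strings only, since a large request can overflow the stack.

```c
#include <alloca.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if s reads the same forwards and backwards, 0 otherwise. */
int is_palindrome(const char *s)
{
    size_t n = strlen(s);
    char *rev = alloca(n + 1); /* lives in this function's stack frame */

    for (size_t i = 0; i < n; i++)
        rev[i] = s[n - 1 - i];
    rev[n] = '\0';

    return strcmp(s, rev) == 0; /* no free(): released on return */
}
```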

6. Reference

[1] Kerrisk, Michael. The Linux programming interface: a Linux and UNIX system programming handbook. No Starch Press, 2010.