When the NULL Pointer is Valid

James Smith

2023-11-15

Dereferencing a NULL pointer usually results in a segfault, as anyone with experience in C/C++ knows. However, while working on a toy compiler, I discovered that there are ways to make the zero address a valid and usable address on Linux.

01. The ELF and the Program Loader

ELF, which stands for Executable and Linkable Format, is the standard binary executable file format on Linux and most other Unix-like systems. One of its most important functions is to define the virtual addresses at which the program is loaded. The structure responsible for this is called a program header. Each LOAD program header is an instruction to the kernel to map a chunk of the file to a specific virtual address. You may be familiar with the sections of a typical ELF executable, such as .text and .data; how these sections are loaded into memory is defined by the program headers.
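To make the role of the program header concrete, here is a minimal Python sketch that lists the LOAD entries of an ELF image. It assumes a little-endian ELF64 file (as on x86-64 Linux); the function name load_segments is made up for illustration and is not part of the toy compiler.

```python
import struct

# ELF64 little-endian layouts; field order follows the ELF-64 specification.
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")      # 56 bytes
PT_LOAD = 1  # segment type that asks the loader to map file data into memory


def load_segments(image: bytes):
    """Return (offset, vaddr, filesz, memsz, flags) for each PT_LOAD entry."""
    fields = ELF_HEADER.unpack_from(image, 0)
    ident, e_phoff, e_phentsize, e_phnum = fields[0], fields[5], fields[9], fields[10]
    # Magic bytes, then class 2 (64-bit) and data encoding 1 (little-endian).
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    segments = []
    for i in range(e_phnum):
        (p_type, p_flags, p_offset, p_vaddr,
         _p_paddr, p_filesz, p_memsz, _p_align) = PROGRAM_HEADER.unpack_from(
            image, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD:
            segments.append((p_offset, p_vaddr, p_filesz, p_memsz, p_flags))
    return segments
```

Passing the bytes of an executable (for example, the contents of /bin/true read in binary mode) to load_segments should return one tuple per LOAD header; the second element of each tuple is the virtual address the kernel is asked to map that chunk at.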

While working on the compiler, I changed the virtual address in the program header to zero to see what would happen. This should create a memory mapping starting at virtual address zero. To my surprise, it just works: accessing the NULL pointer no longer segfaults.

$ cat samples/null_deref.txt
(peek (cast (ptr int) 0))   ;; dereference the NULL pointer
$ ./pl_comp.py -o hello --vaddr=0 ./samples/null_deref.txt
$ ./hello   # no segfaults

The objdump command shows that we have indeed created the zero address mapping and dereferenced the NULL pointer.

hello:     file format elf64-x86-64
hello
architecture: i386:x86-64, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x0000000000000080

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**12
         filesz 0x00000000000000e0 memsz 0x00000000000000e0 flags r-x

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
SYMBOL TABLE:
no symbols

Disassembly of section .data:
...
  d0:   31 c0                   xor    eax,eax
  d2:   48 89 03                mov    QWORD PTR [rbx],rax
...
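For readers without the toy compiler, the following Python sketch shows roughly what choosing the virtual address amounts to at the byte level. It is not the code of pl_comp.py: build_image and EXIT_CODE are hypothetical names, the layout (one ELF header, one LOAD program header, then code) is the simplest one I could think of, and the embedded machine code only calls exit(0) on x86-64 Linux.

```python
import struct

ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")      # 56 bytes
ET_EXEC, EM_X86_64, PT_LOAD = 2, 62, 1
PF_X, PF_R = 1, 4

# x86-64 machine code: mov eax, 60 (SYS_exit); xor edi, edi; syscall
EXIT_CODE = bytes.fromhex("b83c000000" "31ff" "0f05")


def build_image(vaddr: int, code: bytes = EXIT_CODE) -> bytes:
    """Build a minimal ELF64 image whose single LOAD segment starts at vaddr."""
    headers_size = ELF_HEADER.size + PROGRAM_HEADER.size
    total = headers_size + len(code)
    # Magic, 64-bit class, little-endian, ELF version 1, then padding.
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    ehdr = ELF_HEADER.pack(
        ident, ET_EXEC, EM_X86_64, 1,
        vaddr + headers_size,   # e_entry: code starts right after the headers
        ELF_HEADER.size,        # e_phoff: program header follows the ELF header
        0, 0,                   # e_shoff, e_flags: no section headers
        ELF_HEADER.size, PROGRAM_HEADER.size, 1,
        0, 0, 0)
    # Map the whole file (headers included) from file offset 0 to vaddr.
    phdr = PROGRAM_HEADER.pack(
        PT_LOAD, PF_R | PF_X, 0, vaddr, vaddr, total, total, 0x1000)
    return ehdr + phdr + code
```

Writing build_image(0) to a file and marking it executable gives an image whose LOAD header has vaddr 0, which readelf -l or objdump -x can confirm. Whether the kernel agrees to load it may depend on the system's configuration.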

02. `mmap` and the Zero Address Page

I dug further after the above experiment. The zero virtual address can also be obtained via the mmap syscall on Linux. However, there is a sysctl knob that can prevent applications from doing this: vm.mmap_min_addr.

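Unprivileged applications can map the zero virtual address only when vm.mmap_min_addr=0. According to the kernel documentation, the knob exists as a defense against kernel NULL pointer dereference bugs, which could otherwise operate on data that userspace placed in the first pages of memory. A side effect is that NULL pointer dereferences in applications reliably segfault. But the interesting thing is that this restriction is bypassed by the ELF program header trick we used above.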

You may wonder why an application would want to mmap the zero page. It appears that emulators such as WINE and dosemu2 use it.
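To check how the mmap restriction behaves on a particular machine, a small probe like the one below can help. It is a sketch that assumes Linux on x86-64: the PROT_* and MAP_* values are hard-coded for that platform, libc's mmap is called through ctypes, and the function names are made up for illustration.

```python
import ctypes
import errno
import os

# Flag values as defined on Linux x86-64 (an assumption of this sketch).
PROT_READ, PROT_WRITE = 0x1, 0x2
MAP_PRIVATE, MAP_FIXED, MAP_ANONYMOUS = 0x02, 0x10, 0x20
PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")
MAP_FAILED = ctypes.c_size_t(-1).value  # (void *) -1 viewed as an unsigned integer


def read_mmap_min_addr(path="/proc/sys/vm/mmap_min_addr"):
    """Return the current vm.mmap_min_addr, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read())
    except (OSError, ValueError):
        return None


def try_map_zero_page():
    """Try to map one anonymous page at address 0.

    Returns (True, None) on success or (False, errno_name) on failure.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    # An integer return type avoids ctypes turning a result of address 0 into None.
    libc.mmap.restype = ctypes.c_size_t
    libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                          ctypes.c_int, ctypes.c_int, ctypes.c_long]
    libc.munmap.restype = ctypes.c_int
    libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

    addr = libc.mmap(None, PAGE_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_FIXED | MAP_ANONYMOUS, -1, 0)
    if addr == MAP_FAILED:
        code = ctypes.get_errno()
        return False, errno.errorcode.get(code, str(code))
    libc.munmap(None, PAGE_SIZE)  # clean up: unmap the page at address 0
    return True, None
```

Comparing read_mmap_min_addr() with the result of try_map_zero_page() before and after changing the knob (for example with sysctl -w vm.mmap_min_addr=0 as root) shows whether the knob is what blocks the mapping.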

03. Conclusion

I gained some useless knowledge:

The NULL pointer can be valid.
mmap can map the zero virtual address, but is restricted by the sysctl knob vm.mmap_min_addr.
ELF program headers can map the zero virtual address without restriction.
