Accessing a NULL pointer usually results in a segfault, which is familiar to anyone with experience in C/C++. However, while working on a toy compiler, I discovered that there are ways to make the zero address a valid and usable address on Linux.
01. The ELF and the Program Loader
ELF, which stands for Executable and Linkable Format, is the standard binary executable file format for Linux and many other Unix-like systems. One of the most important functions of the file format is to define the virtual address where the program is to be loaded. The corresponding structure is called a program header. Each program header is an instruction to the kernel to map a chunk of the file to a specific virtual address. You may be familiar with the sections in a typical ELF executable, such as .text and .data; how these sections are loaded into memory is defined by the program headers.
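To make the layout concrete, here is a minimal Python sketch that packs one 64-bit program header. The field order follows the standard Elf64_Phdr structure; the helper name `pack_phdr` and the sample values are made up for illustration and are not taken from my compiler.

```python
import struct

# Elf64_Phdr, little-endian: p_type, p_flags, p_offset, p_vaddr,
# p_paddr, p_filesz, p_memsz, p_align
PHDR_FORMAT = "<IIQQQQQQ"

PT_LOAD = 1                 # loadable segment
PF_X, PF_W, PF_R = 1, 2, 4  # segment permission bits


def pack_phdr(vaddr, filesz, memsz, flags=PF_R | PF_X, offset=0, align=0x1000):
    """Pack a single PT_LOAD program header (hypothetical helper)."""
    # p_paddr is set to the same value as p_vaddr, as is usual on Linux.
    return struct.pack(
        PHDR_FORMAT, PT_LOAD, flags, offset, vaddr, vaddr, filesz, memsz, align
    )


# 0x400000 is a common load address for non-PIE x86-64 executables.
phdr = pack_phdr(vaddr=0x400000, filesz=0xE0, memsz=0xE0)
print(len(phdr))  # 56
```

The only field that matters for the rest of this post is `vaddr`: it tells the kernel where the segment should appear in the process address space.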
While working on the compiler, I tried changing the virtual address in the program header to zero to see what would happen. This should create a memory mapping starting at virtual address zero. To my surprise, it just works: accessing the NULL pointer no longer segfaults.
$ cat samples/null_deref.txt
(peek (cast (ptr int) 0)) ;; dereference the NULL pointer
$ ./pl_comp.py -o hello --vaddr=0 ./samples/null_deref.txt
$ ./hello # no segfaults
The objdump command shows that we have indeed created the zero address mapping and dereferenced the NULL pointer.
hello: file format elf64-x86-64
hello
architecture: i386:x86-64, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x0000000000000080
Program Header:
LOAD off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**12
filesz 0x00000000000000e0 memsz 0x00000000000000e0 flags r-x
Sections:
Idx Name Size VMA LMA File off Algn
SYMBOL TABLE:
no symbols
Disassembly of section .data:
...
d0: 31 c0 xor eax,eax
d2: 48 89 03 mov QWORD PTR [rbx],rax ...
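The same check can be scripted. The sketch below reads the program header table of a 64-bit little-endian ELF image and lists its PT_LOAD segments, so a mapping at virtual address zero is easy to spot. The function name `load_segments` is hypothetical; the field layouts are the standard Elf64_Ehdr and Elf64_Phdr ones, and other ELF flavors are simply rejected.

```python
import struct

EHDR_FORMAT = "<16sHHIQQQIHHHHHH"  # Elf64_Ehdr, 64 bytes
PHDR_FORMAT = "<IIQQQQQQ"          # Elf64_Phdr, 56 bytes
PT_LOAD = 1


def load_segments(data):
    """Return (vaddr, filesz, memsz, flags) for each PT_LOAD header.

    Assumes a 64-bit little-endian ELF image (hypothetical helper).
    """
    ehdr = struct.unpack_from(EHDR_FORMAT, data, 0)
    ident, phoff, phentsize, phnum = ehdr[0], ehdr[5], ehdr[9], ehdr[10]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("only 64-bit little-endian ELF is handled here")
    segments = []
    for i in range(phnum):
        p_type, flags, _off, vaddr, _paddr, filesz, memsz, _align = (
            struct.unpack_from(PHDR_FORMAT, data, phoff + i * phentsize)
        )
        if p_type == PT_LOAD:
            segments.append((vaddr, filesz, memsz, flags))
    return segments
```

Reading the hello binary into a bytes object and passing it to `load_segments` should report a single segment with a vaddr of 0, matching the LOAD line in the objdump output.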
02. mmap and the Zero Address Page
I dug further after the above experiment. It appears that the zero virtual address can also be obtained via the mmap syscall on Linux. However, there is a sysctl knob that prevents applications from doing this. The knob is vm.mmap_min_addr. Only when vm.mmap_min_addr=0 can applications map the zero virtual address. This restriction is probably set to detect NULL pointer dereferences (by segfaulting). But the interesting thing is that this restriction is bypassed by the ELF program header trick we used above.
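A rough way to try the mmap route from Python is sketched below. It reads the knob from /proc and then asks libc, through ctypes, for one anonymous page fixed at address zero. The helper names are hypothetical, and the MAP_FIXED value is the one used on x86-64 Linux, hardcoded here because Python's mmap module does not export it. Whether the call succeeds depends on the knob and on the privileges of the process.

```python
import ctypes
import mmap
import sys

MAP_FIXED = 0x10  # x86-64 Linux value (assumption for other platforms)
PAGE = mmap.PAGESIZE


def read_mmap_min_addr(path="/proc/sys/vm/mmap_min_addr"):
    """Return the current vm.mmap_min_addr, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def try_map_zero_page():
    """Try to map one anonymous page at address 0.

    Returns True on success, False if the kernel refused, and None on
    non-Linux platforms where the constants above may not apply.
    """
    if not sys.platform.startswith("linux"):
        return None
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mmap.restype = ctypes.c_void_p
    libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                          ctypes.c_int, ctypes.c_int, ctypes.c_long]
    libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS | MAP_FIXED
    addr = libc.mmap(None, PAGE, mmap.PROT_READ | mmap.PROT_WRITE, flags, -1, 0)
    # ctypes converts a NULL c_void_p result to None, so None means the
    # mapping landed at address 0. A failure comes back as MAP_FAILED,
    # which is (void *)-1.
    if addr is None:
        libc.munmap(None, PAGE)  # clean up the page at address 0
        return True
    return False


print("vm.mmap_min_addr =", read_mmap_min_addr())
print("mapped page at 0:", try_map_zero_page())
```

The page is unmapped again right away, since a process with a valid zero page would no longer crash on NULL dereferences in its own C code.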
You may wonder why applications would want to mmap the zero page. It appears that emulators such as WINE and dosemu2 use it.
03. Conclusion
I gained some useless knowledge.
- The NULL pointer can be valid.
- mmap can map the zero virtual address, but is restricted by the sysctl knob vm.mmap_min_addr.
- ELF program headers can map the zero virtual address without restriction.