As C/C++/Rust/OCaml/(insert other language usually compiled to native code here) programmers writing code that targets Linux, our toolchains usually produce a dynamically linked ELF binary as their final output. While this covers almost all use cases for code meant to run in userspace, there are some circumstances in which it is necessary to write code that runs without the dynamic linker present.
In this article, we’ll try to write a standard hello world program in C for an x86-64 Linux system, but with the catch that our compiled ELF binary can’t make use of the runtime dynamic linker. As will be seen, this entails a range of interesting challenges not normally encountered when writing userspace code.
There are two related but distinct programs commonly referred to as “the linker” in the context of ELF binaries, one of which operates at compile time and one of which operates at runtime.
The first of these two programs is ld, provided by GNU binutils, which is
responsible for resolving cross-file and dynamic library references in object
files at compile time and “linking” them together with these references
resolved to produce a single executable binary or shared object. In most
circumstances, ld is invoked under the hood when running gcc as the last step
of building a binary.
The second entity is the ld.so shared object (provided by the C library, which
is glibc on most Linux systems). This is a shared object that is mapped into a
process’s address space by the kernel during a call to exec(2). It is what
receives initial control from the kernel when exec(2) finishes. It identifies
the libraries needed by the program, maps them into the address space, and then
hands off control to the program, later being called into from time to time
when certain symbolic references need to be resolved.
To avoid ambiguity, I will from this point forward refer to ld as the static
linker and ld.so as the dynamic linker.
A full discussion of the dynamic linker is way beyond the scope of this article.
It suffices to know that the dynamic linker is what receives initial control
from the Linux kernel, what loads shared libraries required by the program, and
what gets called into when certain symbolic references (e.g. a call into libc)
need to be resolved at runtime.
So where is the dynamic linker specified in an ELF binary that requires it?
Dynamically linked ELF binaries have a program header (of type INTERP)
specifying the path to the dynamic linker binary to be used. This can be seen
with the readelf utility (a part of GNU binutils), here showing
/lib64/ld-linux-x86-64.so.2 as the dynamic linker for /bin/ls on my system.
$ readelf -l /bin/ls
[...]
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000000040 0x0000000000000040
                 0x00000000000001f8 0x00000000000001f8  R E    0x8
  INTERP         0x0000000000000238 0x0000000000000238 0x0000000000000238
                 0x000000000000001c 0x000000000000001c  R      0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
[...]
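The same lookup can be done programmatically. The following is a minimal sketch, assuming a 64-bit ELF file and the structure definitions from the Linux `<elf.h>` header; `find_interp` is a hypothetical helper name, not something provided by binutils or libc. It walks the program header table and copies out the contents of the INTERP segment, which is simply the NUL-terminated path of the dynamic linker.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of any library): copies the INTERP path of
 * the 64-bit ELF file at `path` into `out`.
 * Returns 1 if an interpreter was found, 0 if the file has no INTERP program
 * header, and -1 on I/O errors, if outlen is 0, or if the file is not a
 * 64-bit ELF. */
int find_interp(const char *path, char *out, size_t outlen)
{
    FILE *f = fopen(path, "rb");
    Elf64_Ehdr eh;
    int result = -1;

    if (f == NULL)
        return -1;
    if (outlen == 0 || fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64)
        goto done;

    result = 0;
    /* The program header table has e_phnum entries of e_phentsize bytes
     * each, starting at file offset e_phoff. */
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;

        if (fseek(f, (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize), SEEK_SET) != 0 ||
            fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;
            goto done;
        }
        if (ph.p_type != PT_INTERP)
            continue;

        /* The segment's file contents are the interpreter path itself. */
        size_t n = ph.p_filesz < outlen - 1 ? (size_t)ph.p_filesz : outlen - 1;
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0 || fread(out, 1, n, f) != n) {
            result = -1;
            goto done;
        }
        out[n] = '\0';
        result = 1;
        break;
    }

done:
    fclose(f);
    return result;
}
```

Called on a typical dynamically linked binary, this should fill the buffer with the same path that readelf reports; a return value of 0 is what we are aiming for in the rest of the article.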
Creating a Binary That Doesn’t Use the Dynamic Linker
Given the information above, our first task at hand is to compile an ELF binary
without an INTERP program header. This will result in no dynamic linker being
mapped into our process’s address space by the kernel on exec(2), thus ensuring
that our program cannot use the dynamic linker at runtime.
Forcing the exclusion of the INTERP program header is very easy with GNU
binutils version 2.26 or later, in which the static linker supports a
--no-dynamic-linker argument. Excluding the dynamic linker in versions of
binutils before 2.26 seems to only be possible by providing the static linker
with a custom linker script. Since I’m using binutils 2.28, I’ll just use the
--no-dynamic-linker argument.
Now that we know how to compile a binary that doesn’t use the dynamic linker, let’s go ahead and try to run one compiled in this way. I’ll use a barebones C program as an example, saved as prog.c: a main function whose body is the single statement return 0;.
We can go ahead and compile this with
gcc, ensuring that it passes
--no-dynamic-linker to the static linker as follows:
$ gcc -Wl,--no-dynamic-linker prog.c -o prog
Let’s verify that no INTERP program header was generated using readelf:
$ readelf -l prog

Elf file type is DYN (Shared object file)
Entry point 0x271
There are 7 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000350 0x0000000000000350  R E    0x200000
  LOAD           0x0000000000000f20 0x0000000000200f20 0x0000000000200f20
                 0x00000000000000e0 0x00000000000000e0  RW     0x200000
  DYNAMIC        0x0000000000000f20 0x0000000000200f20 0x0000000000200f20
                 0x00000000000000e0 0x00000000000000e0  RW     0x8
  NOTE           0x00000000000001c8 0x00000000000001c8 0x00000000000001c8
                 0x0000000000000024 0x0000000000000024  R      0x4
  GNU_EH_FRAME   0x00000000000002b4 0x00000000000002b4 0x00000000000002b4
                 0x0000000000000024 0x0000000000000024  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x0000000000000f20 0x0000000000200f20 0x0000000000200f20
                 0x00000000000000e0 0x00000000000000e0  R      0x1
[...]
Looks good, let’s try to run it:
$ ./prog
Segmentation Fault
Seems like we’ve got a bit more work to do.
Getting our Non-Dynamically-Linked Binary to Run
So we’ve discovered that forcing the program to run without the dynamic linker results in a segfault. Let’s recompile with debugging symbols and take a closer look at the path of execution with GDB:
$ gcc -g -Wl,--no-dynamic-linker prog.c -o prog
$ gdb -q prog
Reading symbols from prog...done.
(gdb) run
Starting program: /home/rhys/prog

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) backtrace
#0  0x0000000000000000 in ?? ()
#1  0x00007ffff7dfd3ba in _start ()
If you aren’t familiar with how the dynamic linker works, this backtrace should
surprise you. Where’s the call to main? And what is this _start function?
Well, it turns out that the exec(2)’ing of an ELF binary is a bit more complex
in userspace than main just receiving control directly from the kernel. As
stated in the background section, in a dynamically linked binary, the kernel
passes initial control to the dynamic linker, which runs initialization code
before control eventually reaches main. For an ELF binary with no INTERP
segment however (as we’re dealing with here), control is initially handed to
the function specified by the ENTRY command of the linker script being used,
which, in the default linker script on most systems, has the name _start. This
can be seen with the following command (ld -verbose dumps the default linker
script):
$ ld -verbose | grep ENTRY
ENTRY(_start)
_start is defined in an object file containing the entry point (and other
bootstrap code), usually called crt0.o (crt1.o on glibc-based systems), that is
statically linked into the binary by GCC. These “start files” can be manually
excluded from the finished binary using GCC’s -nostartfiles option.
Since we want our own function to receive control directly from the kernel,
we’ll pass -nostartfiles to GCC, and rename main to _start so that it gets
control directly from the kernel when exec(2) is done.
Here’s our new program: identical to the previous one, except that the function is now named _start instead of main.
And our new compilation command:
$ gcc -Wl,--no-dynamic-linker -nostartfiles prog.c -o prog
Let’s give this one a shot:
$ ./prog
Segmentation Fault
Another segfault, but this time with a different underlying cause. We can see
the issue at play here by taking a look at the binary’s disassembly with
objdump:
$ objdump -M intel -d prog

prog:     file format elf64-x86-64


Disassembly of section .text:

0000000000000233 <_start>:
 233:   55                      push   rbp
 234:   48 89 e5                mov    rbp,rsp
 237:   b8 00 00 00 00          mov    eax,0x0
 23c:   5d                      pop    rbp
 23d:   c3                      ret
Even without much knowledge of x86 assembly, this function is pretty easy to
dissect. Keep in mind the above five instructions are equivalent to a function
containing the single statement return 0;. The first push and mov pair sets up
the function’s stack frame, mov eax,0x0 sets eax (the return value register
in the System V ABI) to 0, and the final two instructions restore the base
pointer and return to the function’s caller.
The last sentence of that description may have piqued your interest. This
function’s “caller” is the kernel, which doesn’t push a return address to the
stack before calling _start like a normal C function. That wouldn’t make
sense, as userspace code can’t just ret itself into kernel space. This means
the ret at the end of this function will grab whatever is on top of the
stack (which definitely isn’t a return address) and set rip to that.
So what address is the final
ret instruction actually returning to? We can
find out with GDB.
$ gdb -q prog
Reading symbols from prog...done.
(gdb) run
Starting program: /home/rhys/prog

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000001 in ?? ()
It’s returning to 0x0000000000000001, definitely not a valid address. As it
turns out, this is actually the value of argc. As specified in the System V
ABI (page 29), upon userspace code receiving initial control from the kernel
after exec(2), the top eight bytes of the stack represent a little endian
integer specifying the value of argc. Since we ran ./prog with no arguments,
argc is 1.
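To make that layout concrete, here is a minimal sketch of how the initial stack could be decoded, assuming the layout described in the System V x86-64 ABI: argc first, then the argv pointers and a NULL terminator, then the environment pointers and another NULL. The names `startup_info` and `parse_initial_stack` are hypothetical, and the sketch deliberately leaves out how the initial stack pointer is captured, since by the time a C function body runs the compiler’s prologue has already moved rsp.

```c
/* Hypothetical names for illustration; the layout follows the initial
 * process stack described in the System V x86-64 ABI. */
struct startup_info {
    long   argc;   /* the eight bytes at the very top of the stack */
    char **argv;   /* argc pointers, followed by a NULL */
    char **envp;   /* environment pointers, followed by a NULL */
};

/* `sp` must be the value rsp had when the kernel handed over control. */
struct startup_info parse_initial_stack(long *sp)
{
    struct startup_info info;

    info.argc = sp[0];
    info.argv = (char **)(sp + 1);
    /* Skip the argc argv entries and the NULL that terminates them. */
    info.envp = info.argv + info.argc + 1;
    return info;
}
```

Seen this way, the crash is unsurprising: the ret in _start pops sp[0], which is argc, and jumps to it.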
Fixing this requires making an exit(2) syscall instead of using a return
statement (which will get compiled into pop rbp followed by ret). This can
be done in GNU C using some inline assembly to manually perform an exit(2)
syscall.
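One way this might look is sketched below, using GNU C extended inline assembly rather than whatever exact form the original listing used. The helper names `syscall1` and `sys_exit` are my own. The register assignments follow the x86-64 Linux syscall convention (number in rax, first argument in rdi, result in rax, rcx and r11 clobbered), and 60 (0x3c) is the exit syscall number on x86-64.

```c
/* Raw one-argument system call on x86-64 Linux. The syscall number goes in
 * rax and the first argument in rdi; the kernel returns its result in rax
 * and clobbers rcx and r11. */
long syscall1(long number, long arg0)
{
    long ret;

    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(number), "D"(arg0)
                      : "rcx", "r11", "memory");
    return ret;
}

/* exit(2) is syscall number 60 (0x3c) on x86-64 and never returns. */
void sys_exit(int status)
{
    syscall1(60, status);
    for (;;) { }   /* not reached; keeps the compiler from assuming a return */
}
```

Replacing the return statement in _start with a call like sys_exit(0) means control never reaches the final ret, so the bogus “return address” is never used.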
Let’s recompile and rerun as before:
$ gcc -Wl,--no-dynamic-linker -nostartfiles prog.c -o prog
$ ./prog
$ echo $?
0
Success! We’ve managed to create an ELF binary that doesn’t segfault, exits with status 0 and doesn’t use the dynamic linker.
Now let’s get it to print hello world.
Printing to Standard Output Without libc
As previously stated, the dynamic linker resolves symbolic references to shared library code at runtime. Given this, it should be clear that simply calling a libc function such as puts from _start, just before the exit syscall, will not work if our program doesn’t run with the dynamic linker present.
To verify this, let’s try to compile and run as before:
$ gcc -Wl,--no-dynamic-linker -nostartfiles prog.c -o prog
$ ./prog
Segmentation Fault
What’s going on here? Let’s take a look at the disassembly:
$ objdump -M intel -d prog

prog:     file format elf64-x86-64


Disassembly of section .plt:

00000000000002a0 <.plt>:
 2a0:   ff 35 62 0d 20 00       push   QWORD PTR [rip+0x200d62]        # 201008 <_GLOBAL_OFFSET_TABLE_+0x8>
 2a6:   ff 25 64 0d 20 00       jmp    QWORD PTR [rip+0x200d64]        # 201010 <_GLOBAL_OFFSET_TABLE_+0x10>
 2ac:   0f 1f 40 00             nop    DWORD PTR [rax+0x0]

00000000000002b0 <puts@plt>:
 2b0:   ff 25 62 0d 20 00       jmp    QWORD PTR [rip+0x200d62]        # 201018 <puts@GLIBC_2.2.5>
 2b6:   68 00 00 00 00          push   0x0
 2bb:   e9 e0 ff ff ff          jmp    2a0 <.plt>

Disassembly of section .text:

00000000000002c0 <_start>:
 2c0:   55                      push   rbp
 2c1:   48 89 e5                mov    rbp,rsp
 2c4:   48 8d 3d 18 00 00 00    lea    rdi,[rip+0x18]        # 2e3 <_start+0x23>
 2cb:   e8 e0 ff ff ff          call   2b0 <puts@plt>
 2d0:   48 c7 c0 3c 00 00 00    mov    rax,0x3c
 2d7:   48 c7 c7 00 00 00 00    mov    rdi,0x0
 2de:   0f 05                   syscall
 2e0:   90                      nop
 2e1:   5d                      pop    rbp
 2e2:   c3                      ret
The inclusion of a call to puts caused two new symbols to be added to the
binary: .plt and puts@plt. PLT stands for Procedure Linkage Table, and is a
dynamic linker mechanism that adds a layer of indirection to calls to functions
in external shared objects. A full discussion of how the PLT works is beyond the
scope of this article. However, knowing that it works in conjunction with the
dynamic linker and requires it to be present to work properly explains why our
call to puts is segfaulting.
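For intuition only, the indirection can be modelled in plain C with a function pointer. This is a toy model with hypothetical names, not how glibc or ld.so actually implement lazy binding: `got_slot` stands in for the table entry that the PLT stub jumps through, and `resolve_and_call` stands in for the dynamic linker’s resolver, which looks up the real function on the first call and patches the slot so that later calls go straight to it.

```c
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

static size_t resolve_and_call(const char *s);

/* Stands in for the table slot a PLT stub jumps through. It starts out
 * pointing at the resolver rather than at the real function. */
static strlen_fn got_slot = resolve_and_call;

/* Counts how often the stand-in "dynamic linker" had to run. */
static int resolutions;

/* The function the caller actually wants to reach. */
static size_t real_target(const char *s)
{
    return strlen(s);
}

/* Stand-in for the resolver: find the real function, patch the slot,
 * then forward the original call. */
static size_t resolve_and_call(const char *s)
{
    resolutions++;
    got_slot = real_target;
    return got_slot(s);
}

/* Stand-in for the PLT stub: every call is an indirect jump via the slot. */
size_t call_through_plt(const char *s)
{
    return got_slot(s);
}
```

In this model the first call goes through the resolver and every later call does not. In our real binary there is no dynamic linker to set up or patch the table, so the indirect jump in puts@plt ends up at an address that was never made valid.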
So how do we print stuff without the use of our nice libc abstractions? Under
the hood, libc printing functions just make a write(2) syscall to standard
output (technically it’s usually buffered in userspace but we can ignore this).
Because of this, we can use a bit of inline assembly to create our own
primitive print function. While we’re at it, we’ll move the exit syscall out of
_start and into another function to make the whole thing cleaner.
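A sketch of what those helper functions could look like follows. The names `syscall3`, `str_len`, `print` and `sys_exit` are my own choices and the original listing may well differ. It assumes the x86-64 Linux syscall convention (arguments in rdi, rsi and rdx), where write is syscall number 1, exit is number 60, and standard output is file descriptor 1. The string length is computed by hand because strlen lives in libc.

```c
/* Raw three-argument system call on x86-64 Linux: number in rax, arguments
 * in rdi, rsi and rdx, result in rax; rcx and r11 are clobbered. */
long syscall3(long number, long a0, long a1, long a2)
{
    long ret;

    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(number), "D"(a0), "S"(a1), "d"(a2)
                      : "rcx", "r11", "memory");
    return ret;
}

/* Hand-rolled replacement for strlen, since libc is unavailable. */
unsigned long str_len(const char *s)
{
    unsigned long n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/* write(2) is syscall 1; file descriptor 1 is standard output.
 * Returns the number of bytes written, or a negative errno value. */
long print(const char *s)
{
    return syscall3(1, 1, (long)s, (long)str_len(s));
}

/* exit(2) is syscall 60 and never returns. */
void sys_exit(int status)
{
    syscall3(60, status, 0, 0);
    for (;;) { }   /* not reached */
}
```

With helpers like these, the body of _start reduces to a call to print("Hello World!\n") followed by sys_exit(0).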
Compiling and running this, we get:
$ gcc -Wl,--no-dynamic-linker -nostartfiles prog.c -o prog
$ ./prog
Hello World!
$ echo $?
0
Phew! So there we have it, your standard hello world, in C, with no dynamic linker.
Everything written so far is all well and good, but when would you actually have to work in an environment with no dynamic linker? One possible scenario is working on or reimplementing the dynamic linker itself. Another possibility (and something I happen to be working on at the moment) is writing a binary obfuscator.
There are many ways to obfuscate executable binaries such that they’re hard to reverse engineer. One common technique is to encrypt the original binary and add a “stub” loader to it, which takes control directly from the kernel, maps the encrypted binary into memory and decrypts it on the fly during execution. Due to having to take initial control from the kernel, the stub usually runs in an environment without the dynamic linker present, requiring a variety of workarounds similar to the ones shown in this article.
I hope you’ll walk away from this article with a heightened sense of respect for the work the dynamic linker does behind the scenes for userspace code. If you like this sort of thing and are interested in reading further on the subject of binary hacking and analysis, I’m currently reading Dennis Andriesse’s excellent book on the subject and can highly recommend it.