
0x331 Linker

This section covers linkers and object files.

1. Foundation

Concept (definition vs declaration)

  • definition: associates a name with its implementation (storage for a variable, a body for a function)
  • declaration: states that a name and its type exist while the definition lives elsewhere in the program (e.g. the extern keyword in C)
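
The distinction can be sketched in a few lines. This is only an illustration: the names counter, next_id and take_two_ids are made up for the example, and in a real program the declarations would usually sit in a header while the definitions live in a different source file.

```cpp
// Declarations: they introduce the names and their types, but no storage and no body.
extern int counter;          // declaration of a variable defined elsewhere
int next_id();               // declaration (prototype) of a function

// Code can use declared names before their definitions have been seen.
int take_two_ids() {
    next_id();
    return next_id();
}

// Definitions: exactly one of each in the whole program.
int counter = 0;             // definition: allocates storage and sets the initial value
int next_id() {              // definition: provides the body
    return ++counter;
}
```

The compiler only needs the declarations to compile take_two_ids; the linker later connects the references to the single definition of each name.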

Concept (global vs local vs static)

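  • global: global existence (exists for the whole lifetime of the program) + global visibility (accessible everywhere)
  • local: local existence (exists only while its enclosing block runs) + local visibility (accessible only inside that block)
  • static: global existence + local visibility (accessible only inside the file or function that defines it)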
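
A small sketch of the three kinds side by side. All names here are hypothetical and chosen only for illustration.

```cpp
int global_total = 0;            // global: lives for the whole program, visible from other files

static int file_calls = 0;       // static (file scope): lives for the whole program, visible only in this file

int count_calls() {
    static int calls = 0;        // static (function scope): keeps its value between calls, visible only here
    int scratch = 1;             // local: created on each call, gone when the function returns
    calls += scratch;
    file_calls += scratch;
    global_total += scratch;
    return calls;
}
```

Calling count_calls() twice returns 1 and then 2, because calls keeps its value between calls even though no code outside the function can name it.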

1.1. Name Mangling

C: no name mangling

C++: names are mangled with information such as the enclosing class or namespace and the argument types (whether the return type is encoded depends on the ABI). Use extern "C" to link with objects compiled as C.
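
The effect can be illustrated with two overloads and one extern "C" function. The function names are made up for the example, and the mangled names in the comments assume the Itanium C++ ABI used by GCC and Clang on Linux; other compilers (MSVC, for instance) use a different scheme.

```cpp
// Overloads share one source-level name, so the compiler encodes the parameter types in the symbol.
int add(int a, int b)          { return a + b; }   // typically emitted as _Z3addii
double add(double a, double b) { return a + b; }   // typically emitted as _Z3adddd

// extern "C" turns mangling off: the symbol is plain c_add, so C objects can link against it.
// The price is that extern "C" functions cannot be overloaded.
extern "C" int c_add(int a, int b) { return a + b; }
```

Running nm on the resulting object file should show the mangled and unmangled symbols next to each other.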

2. ELF

  • _start is known to the linker ld (on Linux) as the default entry-point symbol (another symbol can be chosen with ld's -e option). It is jumped to rather than called, so it has no return address and must not return.
// This example shows how _start works.
// Build it with: gcc -c program.cpp && ld program.o
// extern "C" is C++ syntax, so the file must be compiled as C++; it keeps the symbol name _start unmangled.
// A plain gcc program.cpp fails because the C runtime startup object (such as crt1.o) that gcc links into executables already defines _start.
// Running the binary and then echo $? prints 23.

extern "C" void _start() {
    asm("mov $60, %eax\n"      // syscall number 60 on x86-64 is sys_exit
        "mov $23, %edi\n"      // exit status
        "syscall\n");
}
  • main (or _main on macOS, or main_ with OpenWatcom) is the name known to the C runtime (e.g. glibc). It is called by the startup code, which is usually linked into the executable.
  • The following is the startup code of such a program, as disassembled by objdump -d:
// int main() { return 23; }

00000000004004f0 <_start>:
  4004f0:   31 ed                   xor    %ebp,%ebp
  4004f2:   49 89 d1                mov    %rdx,%r9
  4004f5:   5e                      pop    %rsi
  4004f6:   48 89 e2                mov    %rsp,%rdx
  4004f9:   48 83 e4 f0             and    $0xfffffffffffffff0,%rsp
  4004fd:   50                      push   %rax
  4004fe:   54                      push   %rsp
  4004ff:   49 c7 c0 80 06 40 00    mov    $0x400680,%r8
  400506:   48 c7 c1 10 06 40 00    mov    $0x400610,%rcx
  40050d:   48 c7 c7 f8 05 40 00    mov    $0x4005f8,%rdi // first argument to __libc_start_main: the address of main
  400514:   e8 c7 ff ff ff          callq  4004e0 <__libc_start_main@plt>
  400519:   f4                      hlt
  40051a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

00000000004005f8 <main>:
  4005f8:   55                      push   %rbp
  4005f9:   48 89 e5                mov    %rsp,%rbp
  4005fc:   b8 17 00 00 00          mov    $0x17,%eax
  400601:   5d                      pop    %rbp
  400602:   c3                      retq
  400603:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  40060a:   00 00 00
  40060d:   0f 1f 00                nopl   (%rax)
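
The hand-off in the disassembly above can be modeled in a few lines. This is a simplified, hypothetical stand-in and not glibc's code: fake_libc_start_main and program_main are invented names, and the real __libc_start_main also runs initialization code (the init and fini arguments passed in %rcx and %r8) before calling main.

```cpp
// Signature of main as the startup code calls it: argc, argv, envp.
using MainFn = int (*)(int, char**, char**);

// Stands in for: int main() { return 23; }
int program_main(int, char**, char**) { return 23; }

// Stand-in for __libc_start_main: it receives the address of main (the %rdi
// argument in the disassembly), calls it, and hands back the value that
// would become the process exit status.
int fake_libc_start_main(MainFn main_fn, int argc, char** argv, char** envp) {
    int status = main_fn(argc, argv, envp);
    return status;   // the real function calls exit(status) and never returns
}
```

This is also why the hlt after the callq is never reached in practice: control does not come back to _start.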

2.1. Sections

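  • data (D/d): initialized global and static variables
  • text, i.e. code (T/t): function definitions
  • bss (B/b): uninitialized (zero-initialized) global and static variables

The letters in parentheses are the symbol types printed by nm: uppercase for global (external) symbols, lowercase for local ones.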
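
As a rough illustration, the definitions below would normally end up in the sections listed above. The names are hypothetical, and the nm letters in the comments (uppercase for global symbols, lowercase for local ones) assume an unoptimized GCC or Clang build on Linux; an optimizer may inline or drop the static items.

```cpp
int initialized_global = 42;       // data: nm typically shows D
int uninitialized_global;          // bss:  nm typically shows B (zero-filled at load time)
static int file_local_value = 7;   // data: nm typically shows d
static int file_local_zero;        // bss:  nm typically shows b

static int helper() {              // text: nm typically shows t
    return file_local_value + file_local_zero;
}

int public_sum() {                 // text: nm typically shows T
    return initialized_global + uninitialized_global + helper();
}
```

Compiling this with -c and running nm on the object file is a quick way to check which section each definition landed in.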

3. Dynamic Linking

  • Windows: DLL (.dll)
  • Linux: shared object (.so)
  • macOS: dynamic library (.dylib)

4. Commands

4.1. Linux

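  • nm: print the symbols of an object file, library, or executable
  • lsof -p PID: list the files a running process has open, including the shared libraries mapped into it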

4.2. Windows

  • dumpbin: print symbols
