0x333 Linker

This section covers the linker and object files.

Basic

definition vs declaration

  • definition: associates a name with its implementation (storage for a variable, a body for a function)
  • declaration: states that a definition of the name exists somewhere in the program (e.g. the extern keyword in C)
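
As a minimal sketch of the distinction, the snippet below first declares a variable and a function and then defines both. The names shared_counter and next_id are hypothetical, chosen only for illustration. In a real project the declarations would usually live in a header and the definitions in exactly one source file.

```cpp
// Declarations: they state that definitions exist somewhere in the program.
extern int shared_counter;   // variable declaration, reserves no storage
int next_id();               // function declaration (prototype, no body)

// Definitions: they associate the names with storage or code.
int shared_counter = 100;    // variable definition

int next_id() {              // function definition
    return ++shared_counter;
}
```

If the two definitions were removed, each file using the names would still compile, but the linker would report undefined references.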

global vs local vs static

  • global: global existence (exists for the whole lifetime of the program) + global visibility (accessible everywhere)
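  • local: local existence (exists only while its enclosing block runs) + local visibility (accessible only inside that block)
  • static: global existence + local visibility (accessible only inside its file or function)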
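
The three kinds can be sketched in a few lines. All names here (global_hits, file_hits, count_call and so on) are hypothetical and exist only to show lifetime and visibility.

```cpp
int global_hits = 0;           // global: lives for the whole program, visible from other files

static int file_hits = 0;      // static: lives for the whole program, visible only in this file

int count_call() {
    int local_hits = 0;        // local: recreated on every call, visible only in this function
    static int kept_hits = 0;  // static local: keeps its value across calls, visible only here
    ++local_hits;
    ++kept_hits;
    ++global_hits;
    ++file_hits;
    return local_hits * 100 + kept_hits;
}
```

Calling count_call() twice returns 101 and then 102: the local variable restarts at zero on each call, while the static one keeps counting.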

Name Mangling

C: no name mangling

C++: names are mangled with the namespace or class name and the argument types (some ABIs, such as MSVC's, also encode the return type). Use extern "C" to link with C-compiled objects.
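
A short sketch of why mangling exists and how extern "C" turns it off. The function names are hypothetical, and the mangled symbols in the comments assume a compiler that follows the Itanium C++ ABI (GCC or Clang on Linux); other toolchains use different schemes.

```cpp
// Overloads are legal in C++ because each one gets a distinct mangled symbol.
// Under the Itanium C++ ABI these are typically _Z3addii and _Z3adddd.
int add(int a, int b) { return a + b; }
double add(double a, double b) { return a + b; }

// extern "C" disables mangling: the symbol is plain "c_add",
// so C code can link against it. Overloading c_add is not allowed.
extern "C" int c_add(int a, int b) { return a + b; }
```

Running nm on the resulting object file should show the mangled names for the two add overloads and the unmangled c_add.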

ELF

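  • _start is known to the linker ld (on Linux) as the default entry point symbol (another symbol can be chosen, e.g. with ld -e). Execution jumps to it when the program starts; it is not called like a function, so it has no return address and must not return.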
// Example showing how _start works.
// Build with: gcc -c program.cpp && ld program.o
// A plain "gcc program.cpp" fails, because gcc links in startup code that
// already defines _start when it creates an executable ELF.
// Running the binary and then "echo $?" prints 23.

// extern "C" keeps the symbol name unmangled so that ld can find _start.
extern "C" void _start(){
    asm("mov $60, %eax\n"      // syscall 60 on x86-64 is sys_exit
        "mov $23, %edi\n"      // exit status
        "syscall\n");          // never returns
}
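  • main (or _main on macOS, or main_ with OpenWatcom) is the entry point known to the C runtime (e.g. glibc). It is called by the startup code, which is usually linked in automatically.
  • The following disassembly (from objdump -d) shows the startup code linked into an executable built from the one-line program in the comment: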
// int main() { return 23; }

00000000004004f0 <_start>:
  4004f0:	31 ed                	xor    %ebp,%ebp
  4004f2:	49 89 d1             	mov    %rdx,%r9
  4004f5:	5e                   	pop    %rsi
  4004f6:	48 89 e2             	mov    %rsp,%rdx
  4004f9:	48 83 e4 f0          	and    $0xfffffffffffffff0,%rsp
  4004fd:	50                   	push   %rax
  4004fe:	54                   	push   %rsp
  4004ff:	49 c7 c0 80 06 40 00 	mov    $0x400680,%r8
  400506:	48 c7 c1 10 06 40 00 	mov    $0x400610,%rcx
  40050d:	48 c7 c7 f8 05 40 00 	mov    $0x4005f8,%rdi // address of main, passed as the first argument to __libc_start_main
  400514:	e8 c7 ff ff ff       	callq  4004e0 <__libc_start_main@plt>
  400519:	f4                   	hlt
  40051a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)

00000000004005f8 <main>:
  4005f8:	55                   	push   %rbp
  4005f9:	48 89 e5             	mov    %rsp,%rbp
  4005fc:	b8 17 00 00 00       	mov    $0x17,%eax
  400601:	5d                   	pop    %rbp
  400602:	c3                   	retq
  400603:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
  40060a:	00 00 00
  40060d:	0f 1f 00             	nopl   (%rax)
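
The hand-off visible in the listing (the address of main is loaded into %rdi and passed to __libc_start_main) can be sketched conceptually as below. This is a simplified illustration, not glibc's actual implementation: sketch_libc_start_main and demo_main are hypothetical names, and the real function takes more arguments and never returns.

```cpp
// Hypothetical stand-in for the program's main, mirroring "int main() { return 23; }".
int demo_main(int, char**, char**) { return 23; }

using MainFn = int (*)(int, char**, char**);

// Simplified sketch of what startup code does with the address of main.
int sketch_libc_start_main(MainFn main_fn, int argc, char** argv, char** envp) {
    // Real startup code would initialise the C library and run constructors here.
    int status = main_fn(argc, argv, envp);
    // Real startup code would now call exit(status) and never return;
    // the sketch returns the status so the flow can be inspected.
    return status;
}
```

This is why the value returned from main ends up as the process exit status, and why _start is followed by hlt: control is not expected to come back.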

Sections

  • data (D/d) : definition of initialized global variables
  • code (text) (T/t) : definition of functions
  • bss (B/b): definition of uninitialized global variables
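
The letters in parentheses are the symbol types printed by nm: uppercase for global symbols, lowercase for local ones. The sketch below annotates where each hypothetical name would be expected to land when compiled with GCC or Clang on Linux without optimisation; the exact placement is an assumption that can vary with compiler and flags.

```cpp
int initialized_global = 42;         // data, nm type D
static int initialized_static = 7;   // data, nm type d (local symbol)

int zero_global;                     // bss, nm type B
static int zero_static;              // bss, nm type b (local symbol)

static int helper() {                // text, nm type t (local symbol)
    return initialized_static + zero_static;
}

int get_answer() {                   // text, nm type T
    return initialized_global + helper();
}
```

Compiling this with gcc -c and running nm -C on the object file should list one line per name with the type letter in the middle column; -C demangles the C++ symbol names.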

Dynamic Linking

  • Windows: DLL (.dll)
  • Linux: shared object (.so)
  • macOS: dynamic library (.dylib)

Commands

Linux

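  • nm: print the symbols of an object file, library, or executable
  • lsof -p <pid>: list the files opened by a running process, including the shared libraries mapped into it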

Windows

  • dumpbin: print symbols (e.g. dumpbin /SYMBOLS)
