0x331 Linker
This section covers the linker and object files.
1. Foundation
Concept (definition vs declaration)
- definition: associates a name with its implementation (the code of a function or the storage of a variable)
- declaration: states that a definition of the name exists somewhere in the program, without providing it (e.g. the extern keyword in C)
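The distinction can be sketched in a few lines of C. The names counter, next_id and take_two are hypothetical and chosen only for illustration. The point is that take_two compiles using only the declarations, while the definitions may come later in the file or from another object file resolved by the linker.

```c
/* Declarations: introduce the names and their types, emit no storage or code. */
extern int counter;   /* variable declaration */
int next_id(void);    /* function declaration (prototype) */

/* Uses the declared names before any definition has been seen. */
int take_two(void) {
    return next_id() + next_id();
}

/* Definitions: storage and code are emitted here, once in the whole program. */
int counter = 100;

int next_id(void) {
    return counter++;
}
```

If the two definitions were moved to a second .c file, the first file would still compile, and the linker would resolve counter and next_id when the objects are combined.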
Concept (global vs local vs static)
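- global: global existence (exists for the whole lifetime of the program) + global visibility (accessible from every file)
- local: local existence (exists only while its enclosing block runs) + local visibility (accessible only inside that block)
- static: global existence + local visibility (accessible only within the file or function that defines it)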
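A minimal C sketch of the three cases follows. All names (global_hits, file_hits, bump) are hypothetical.

```c
int global_hits = 0;       /* global existence + global visibility:
                              other files may use `extern int global_hits;` */
static int file_hits = 0;  /* global existence, visible only in this file */

int bump(void) {
    static int calls = 0;  /* global existence, visible only inside bump */
    int local = 0;         /* local existence: recreated on every call */

    calls++;
    local++;
    global_hits++;
    file_hits++;

    /* calls keeps growing across calls, local is always 1 here */
    return calls * 10 + local;
}
```

Successive calls return 11, 21, 31 and so on: the static variable remembers its value between calls while the local one starts over each time.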
1.1. Name Mangling
C: no name mangling
C++: names get mangled with information such as the class name and the argument types (some ABIs also encode the return type). Use extern "C" to link with C-compiled objects.
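To make the idea concrete, here is a rough sketch of how the Itanium C++ ABI (used by gcc and clang on Linux) mangles the simplest case, a free function with builtin parameter types. The helper mangle_simple is hypothetical and covers only that case. Real mangling also handles namespaces, classes, templates and more, and other compilers such as MSVC use a different scheme.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a real API.
 * name:        unqualified function name, e.g. "foo"
 * param_codes: one Itanium builtin code per parameter, e.g. 'i' for int,
 *              'd' for double, 'c' for char; "v" means no parameters
 * Returns 0 on success, -1 if the output buffer is too small. */
int mangle_simple(const char *name, const char *param_codes,
                  char *out, size_t out_size) {
    /* Layout: "_Z" + <length of name> + <name> + <parameter codes> */
    int n = snprintf(out, out_size, "_Z%zu%s%s",
                     strlen(name), name, param_codes);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

For example, mangle_simple("foo", "i", buf, sizeof buf) produces _Z3fooi, which is the symbol that nm shows for a C++ function foo(int). Declaring the function as extern "C" makes the compiler emit plain foo instead.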
2. ELF
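- _start is known to the linker ld (on Linux) as the default entry-point symbol (another symbol can be selected, e.g. with ld's -e option). It is not called like a function: the kernel jumps to it, so there is no return address and it must never return.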
// Example showing how _start works.
// Build with: gcc -c program.c && ld program.o
// A plain "gcc program.c" fails with a duplicate definition of _start, because
// gcc links its own startup object (which already defines _start) when it
// creates an executable ELF.
// Running the resulting a.out and then "echo $?" prints 23.
// (If compiled as C++, prefix the definition with extern "C" so the name is not mangled.)
void _start(void) {
    __asm__ volatile(
        "mov $60, %eax\n"  // syscall 60 on x86-64 is sys_exit
        "mov $23, %edi\n"  // exit status
        "syscall\n");
}
- main (or _main on macOS, or main_ with OpenWatcom) is the entry point known to the C language. It is called by the "startup code" (provided by glibc on Linux), which is usually linked in automatically.
- The following is that startup code as disassembled by objdump -d:
// int main() { return 23; }
00000000004004f0 <_start>:
4004f0: 31 ed xor %ebp,%ebp
4004f2: 49 89 d1 mov %rdx,%r9
4004f5: 5e pop %rsi
4004f6: 48 89 e2 mov %rsp,%rdx
4004f9: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
4004fd: 50 push %rax
4004fe: 54 push %rsp
4004ff: 49 c7 c0 80 06 40 00 mov $0x400680,%r8
400506: 48 c7 c1 10 06 40 00 mov $0x400610,%rcx
40050d: 48 c7 c7 f8 05 40 00 mov $0x4005f8,%rdi // address of main, passed as the first argument to __libc_start_main
400514: e8 c7 ff ff ff callq 4004e0 <__libc_start_main@plt>
400519: f4 hlt
40051a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
00000000004005f8 <main>:
4005f8: 55 push %rbp
4005f9: 48 89 e5 mov %rsp,%rbp
4005fc: b8 17 00 00 00 mov $0x17,%eax
400601: 5d pop %rbp
400602: c3 retq
400603: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40060a: 00 00 00
40060d: 0f 1f 00 nopl (%rax)
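The hand-off visible in the listing (the address of main loaded into %rdi, then a call into __libc_start_main) can be approximated in C as follows. This is only a conceptual sketch: run_startup and sample_main are hypothetical names, and the real glibc routine does much more (initializers, thread setup, and so on) and ends by calling exit(status) instead of returning.

```c
/* Signature of the entry function that the startup code receives. */
typedef int (*main_fn)(int argc, char **argv, char **envp);

/* Stand-in for the user's main from the listing: int main() { return 23; } */
int sample_main(int argc, char **argv, char **envp) {
    (void)argc;
    (void)argv;
    (void)envp;
    return 23;
}

/* Hypothetical, heavily simplified model of the startup code:
 * it receives the address of main as its first argument and calls it. */
int run_startup(main_fn entry, int argc, char **argv, char **envp) {
    int status = entry(argc, argv, envp);
    /* Real startup code would call exit(status) here and never return. */
    return status;
}
```

Calling run_startup(sample_main, ...) yields 23, the same value that echo $? reports after running the real program.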
2.1. Sections
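The letters in parentheses are the symbol types printed by nm: uppercase for global (external) symbols, lowercase for local ones.
- data (D/d): definitions of initialized global variables
- code (text) (T/t): definitions of functions
- bss (B/b): definitions of uninitialized (zero-initialized) global variables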
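The mapping can be checked with a small C file such as the sketch below (all names are hypothetical). Compiling it with gcc -c at -O0 and running nm on the object file would typically show the letters noted in the comments. The exact output depends on the compiler and flags: older gcc versions may report uninitialized globals as common symbols (C), and optimization can inline a static function away.

```c
int initialized_global = 42;        /* data, nm: D */
static int initialized_static = 7;  /* data, nm: d */

int uninitialized_global;           /* bss,  nm: B (zero at program start) */
static int uninitialized_static;    /* bss,  nm: b (zero at program start) */

static int helper(void) {           /* text, nm: t */
    return initialized_static;
}

int sum_all(void) {                 /* text, nm: T */
    return initialized_global + helper()
         + uninitialized_global + uninitialized_static;
}
```

Because bss variables start out as zero, sum_all returns 42 + 7 = 49 on its first call.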
3. Dynamic Linking
- Windows: DLL (.dll)
- Linux: shared object (.so)
- macOS: dynamic library (.dylib)
4. Commands
4.1. Linux
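- nm: print the symbols of an object file, library, or executable
- lsof -p <pid>: list the files opened by a running process, including the shared libraries it has loaded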
4.2. Windows
- dumpbin: print symbols