0x331 Linker
This section covers linkers and object files.
1. File Format
The first edition of UNIX used the a.out format on PDP machines.
1.1. COFF
COFF (Common Object File Format) is an older format that replaced a.out. It has in turn been largely replaced by ELF on Unix-like systems and by PE on Windows.
1.2. ELF
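_start is known to the linker ld (on Linux) as the default entry-point symbol; another symbol can be chosen with ld's -e option. Control is transferred to it by a jump rather than a call, so there is no return address and it must not return.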
// Example showing how _start works.
// Build with: gcc -c program.cpp && ld program.o
// (the .cpp extension makes gcc compile the file as C++, which extern "C" requires).
// Plain gcc program.cpp fails to link: the startup file that gcc adds to every
// executable already defines _start, and nothing here defines main.
// Running ./a.out and then echo $? prints 23.
extern "C" void _start(){
asm("mov $60, %eax\n" // syscall 60 on x86-64 is sys_exit
"mov $23, %edi\n" // return val
"syscall\n");
}
- main (or _main on OS X, or main_ with OpenWatcom) is known to the C runtime (e.g. glibc) and is called by the startup code, which is usually linked in automatically.
- The following is the startup code of a compiled program, as disassembled by objdump -d:
// int main() { return 23; }
00000000004004f0 <_start>:
4004f0: 31 ed xor %ebp,%ebp
4004f2: 49 89 d1 mov %rdx,%r9
4004f5: 5e pop %rsi
4004f6: 48 89 e2 mov %rsp,%rdx
4004f9: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
4004fd: 50 push %rax
4004fe: 54 push %rsp
4004ff: 49 c7 c0 80 06 40 00 mov $0x400680,%r8
400506: 48 c7 c1 10 06 40 00 mov $0x400610,%rcx
40050d: 48 c7 c7 f8 05 40 00 mov $0x4005f8,%rdi // first argument to __libc_start_main: the address of main
400514: e8 c7 ff ff ff callq 4004e0 <__libc_start_main@plt>
400519: f4 hlt
40051a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
00000000004005f8 <main>:
4005f8: 55 push %rbp
4005f9: 48 89 e5 mov %rsp,%rbp
4005fc: b8 17 00 00 00 mov $0x17,%eax
400601: 5d pop %rbp
400602: c3 retq
400603: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40060a: 00 00 00
40060d: 0f 1f 00 nopl (%rax)
1.2.1. Sections
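Sections are used at link time. The letters in parentheses are the symbol type codes printed by nm (uppercase for global symbols, lowercase for local ones).
- data (D/d): definitions of initialized global variables
- code (text) (T/t): definitions of functions
- bss (B/b): definitions of uninitialized (zero-initialized) global variables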
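To make the mapping above concrete, here is a minimal sketch of a translation unit whose symbols land in each of the three sections. The file name and every identifier are hypothetical, and the type letters in the comments assume GCC or Clang on Linux without optimization.

```cpp
// sections_demo.cpp (hypothetical file name; every identifier is illustrative)
int initialized_global = 42;       // initialized data, external linkage: nm type D
int uninitialized_global;          // zero-initialized, external linkage: nm type B
static int file_local_value = 7;   // initialized data, internal linkage: nm type d
static int file_local_buffer[16];  // zero-initialized, internal linkage: nm type b

// Internal-linkage function: lives in the text section, nm type t.
static int add_offset(int x) { return x + file_local_value; }

// External-linkage function: lives in the text section, nm type T.
int sections_demo_sum() {
    file_local_buffer[0] = initialized_global;  // 42
    uninitialized_global = 1;
    return add_offset(file_local_buffer[0]) + uninitialized_global;  // 42 + 7 + 1
}
```

Compiling with g++ -O0 -c sections_demo.cpp and running nm -C sections_demo.o should list each symbol with the letter noted in its comment; an optimizing build may inline or drop the internal-linkage symbols.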
1.2.2. Segments
Segments are used at run time: the program headers describe how the loader maps groups of sections into memory.
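As a rough illustration of segments being a run-time concept, the sketch below asks the dynamic loader for the program headers of the running executable and counts the loadable (PT_LOAD) segments. It assumes Linux with glibc, where dl_iterate_phdr is available and reports the main program first; the function names are hypothetical.

```cpp
#include <link.h>   // dl_iterate_phdr, struct dl_phdr_info, PT_LOAD (glibc, Linux)
#include <cstddef>

// Callback: count the PT_LOAD program headers of the first object reported,
// which on glibc is the main executable. Returning non-zero stops the walk.
static int count_load_segments_cb(struct dl_phdr_info* info, std::size_t, void* data) {
    int* count = static_cast<int*>(data);
    for (int i = 0; i < info->dlpi_phnum; ++i) {
        if (info->dlpi_phdr[i].p_type == PT_LOAD) {
            ++*count;
        }
    }
    return 1;
}

// Number of loadable segments in the running executable.
int count_load_segments() {
    int count = 0;
    dl_iterate_phdr(count_load_segments_cb, &count);
    return count;
}
```

The same information is available offline from readelf -l, which prints the program headers and the section-to-segment mapping of an ELF file.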
1.3. PE
1.3.1. COFF header
2. Linker
Concept (definition vs declaration)
- definition: associates a name with its implementation (storage for a variable, a body for a function)
- declaration: states that a definition of the name exists somewhere in the program (e.g. the extern keyword in C)
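The distinction can be sketched in a single file as follows. All names are hypothetical; in a real project the declarations would normally live in a header and the definitions in a separate source file, leaving the linker to connect them.

```cpp
// Declarations: they introduce names and types but reserve no storage and
// emit no code. The linker must find a matching definition somewhere.
extern int shared_counter;   // variable declaration
int next_id();               // function declaration

// Code may use declared names before (or without) seeing their definitions.
int id_gap() {
    int first = next_id();
    int second = next_id();
    return second - first;   // consecutive ids differ by one
}

// Definitions: these reserve storage and provide the function body.
int shared_counter = 100;
int next_id() { return ++shared_counter; }
```

If the two definitions at the bottom were removed, the file would still compile, but linking would fail with undefined references to shared_counter and next_id.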
Concept (global vs local vs static)
- global: global existence (exists for the whole lifetime of the program) + global visibility (accessible everywhere)
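- local: local existence (lives only while its enclosing block runs) + local visibility (accessible only inside that block)
- static: global existence + local visibility (accessible only inside its own file or function)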
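A small sketch of the three cases, using hypothetical counters: each variable is incremented on every call, and only the plain local one starts from scratch each time.

```cpp
int global_hits = 0;        // global: program lifetime, reachable from other files via extern
static int file_hits = 0;   // static at file scope: program lifetime, visible only in this file

int count_call() {
    static int function_hits = 0;  // static local: program lifetime, visible only in this function
    int local_hits = 0;            // local: created anew on every call

    ++global_hits;
    ++file_hits;
    ++function_hits;
    ++local_hits;

    // function_hits keeps growing across calls; local_hits is always 1 here.
    return function_hits * 10 + local_hits;
}
```

Calling count_call() twice returns 11 and then 21, because function_hits survives between calls while local_hits does not.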
2.1. Name Mangling
C: no name mangling.
C++: names are mangled with information such as the namespace or class name and the argument types (and, in some ABIs, the return type). Use extern "C" to link with C-compiled objects.
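The following sketch shows why mangling is needed and how extern "C" turns it off. The function names are hypothetical, and the mangled names in the comments assume the Itanium C++ ABI used by GCC and Clang on Linux; MSVC uses a different scheme.

```cpp
// Overloads share a source-level name, so the compiler encodes the parameter
// types into each symbol. Under the Itanium C++ ABI these become
// _Z3addii and _Z3adddd respectively.
int add(int a, int b) { return a + b; }
double add(double a, double b) { return a + b; }

// extern "C" disables mangling: the symbol is plain c_add, so a C object
// file can reference it. The price is that it cannot be overloaded.
extern "C" int c_add(int a, int b) { return a + b; }
```

Running nm on the resulting object file shows the raw symbols, and nm -C demangles them back to add(int, int) and add(double, double).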
3. Loader
4. Linux
4.1. SO
A shared object (.so) is the shared library format on Linux.
Commands:
- nm: print symbols
- lsof -p <pid>: list the shared libraries mapped by a running process
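A process can also inspect its own shared libraries without an external tool. The sketch below is one way to do it, assuming Linux with glibc; loaded_objects is a hypothetical helper that collects the path of every object the dynamic loader has mapped into the current process.

```cpp
#include <link.h>   // dl_iterate_phdr, struct dl_phdr_info (glibc, Linux)
#include <cstddef>
#include <string>
#include <vector>

// Callback: record the path of each loaded object.
static int collect_object_name_cb(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    names->emplace_back(info->dlpi_name ? info->dlpi_name : "");
    return 0;  // zero means "keep iterating"
}

// Paths of the executable and every shared object mapped into this process.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_object_name_cb, &names);
    return names;
}
```

Printing the returned vector from a dynamically linked program typically shows entries such as the C library and the dynamic loader itself, alongside an entry for the main executable.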
5. Windows
5.1. DLL
Dynamic-link library on Windows; watch this YouTube tutorial.
link.exe from MSVC combines object files (.obj) into an executable (.exe). See reference here.
Command:
- dumpbin: print symbols
APIs:
- LoadLibraryExW: loads a module (such as a DLL) into the address space of the calling process
6. OSX
6.1. dylib
A dylib is the dynamic library format on OS X.