$ cat a.c
extern int foo;

int function(void) {
    return foo;
}

$ gcc -c a.c
$ readelf --relocs ./a.o

Relocation section '.rel.text' at offset 0x2dc contains 1 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000004  00000801 R_386_32          00000000   foo

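When a.o is compiled, the compiler does not know the value of the symbol foo, so it emits a relocation record that says "in the final binary, patch the address of foo into offset 4 (relative to the text section)". If you look at the disassembly of a.o, you will see four zero bytes at offset 4 of the text section; these four bytes will eventually be filled in with the real address.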

$ objdump --disassemble ./a.o

./a.o:     file format elf32-i386

Disassembly of section .text:

00000000 <function>:
   0:    55                     push   %ebp
   1:    89 e5                  mov    %esp,%ebp
   3:    a1 00 00 00 00         mov    0x0,%eax
   8:    5d                     pop    %ebp
   9:    c3                     ret
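
To make the relocation record concrete, here is a minimal sketch in C of what resolving an R_386_32 entry amounts to: the symbol's address is added to the 4-byte little-endian addend already stored at the given offset. The helper name apply_r_386_32 is hypothetical, and the symbol address used in the note after the code is made up for illustration; only the .text bytes come from the disassembly above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: apply one R_386_32 relocation (S + A) in place.
 *   section  - bytes of the section being patched (.text here)
 *   offset   - r_offset from the relocation record (4 in the example)
 *   sym_addr - final address of the symbol, chosen by the caller */
void apply_r_386_32(uint8_t *section, size_t offset, uint32_t sym_addr)
{
    assert(section != NULL);

    /* REL-style relocations keep the addend in the bytes being patched. */
    uint32_t addend = 0;
    for (int i = 3; i >= 0; i--)
        addend = (addend << 8) | section[offset + (size_t)i];

    uint32_t value = sym_addr + addend;

    /* Store the result back as four little-endian bytes. */
    for (int i = 0; i < 4; i++)
        section[offset + (size_t)i] = (uint8_t)(value >> (8 * i));
}
```

Applied to the ten bytes 55 89 e5 a1 00 00 00 00 5d c3 with offset 4 and a pretend address of 0x0804a010, the instruction at offset 3 becomes a1 10 a0 04 08, that is, mov 0x804a010,%eax.
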
$ readelf --headers /bin/ls
ELF Header:
  Entry point address:               0x8049bb0

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x08048000 0x08048000 0x16f88 0x16f88 R E 0x1000
  LOAD           0x016f88 0x0805ff88 0x0805ff88 0x01543 0x01543 RW  0x1000

A fixed load address such as 0x08048000 is fine for an executable, because each time you start a new process (fork and exec) you have your own fresh address space. Thus it is a considerable time saving to pre-calculate addresses and have them fixed in the final output (you can make position-independent executables, but that's another story).


$ readelf --headers /lib/libc.so.6
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x236ac 0x236ac R E 0x1000
  LOAD           0x023edc 0x00024edc 0x00024edc 0x0015c 0x001a4 RW  0x1000
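
A shared library also has a second purpose: code sharing. If a hundred processes use a shared library, there is no need for 100 copies of its code in memory. If the code is completely read-only and never modified, every process can share the same copy. This places one constraint on shared libraries, however: each process must still have its own instance of the data. The headers above also show that the data segment sits at a fixed offset from the code segment, so the rule for reaching data is simple: data address = current address + fixed offset.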


$ cat test.c
static int foo = 100;

int function(void) {
    return foo;
}

$ gcc -fPIC -shared -o libtest.so test.c
000000000000056c <function>:
 56c:        55                     push   %rbp
 56d:        48 89 e5               mov    %rsp,%rbp
 570:        8b 05 b2 02 20 00      mov    0x2002b2(%rip),%eax        # 200828 <foo>
 576:        5d                     pop    %rbp
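
The address in the objdump comment can be reproduced by hand. On x86-64 a %rip-relative operand is resolved against the address of the next instruction, so the target is the instruction's address plus its length plus the displacement. The sketch below illustrates this; rip_relative_target is a hypothetical helper name, and the load base mentioned in the note after the code is made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: address referenced by a %rip-relative operand.
 *   insn_addr - address of the instruction itself
 *   insn_len  - encoded length of that instruction in bytes
 *   disp      - signed 32-bit displacement taken from the encoding */
uint64_t rip_relative_target(uint64_t insn_addr, unsigned insn_len, int32_t disp)
{
    assert(insn_len > 0 && insn_len <= 15); /* x86 instructions are at most 15 bytes */

    /* While an instruction executes, %rip holds the address of the NEXT one. */
    uint64_t next = insn_addr + insn_len;
    return next + (uint64_t)(int64_t)disp;
}
```

For the 6-byte mov at 0x570 with displacement 0x2002b2 this gives 0x576 + 0x2002b2 = 0x200828. If the library were loaded at some base such as 0x7f0000000000, the instruction and foo would both move by that base, so the same displacement would still reach foo without any patching of the code.
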
0000040c <function>:
 40c:    55                     push   %ebp
 40d:    89 e5                  mov    %esp,%ebp
 40f:    e8 0e 00 00 00         call   422 <__i686.get_pc_thunk.cx>
 414:    81 c1 5c 11 00 00      add    $0x115c,%ecx
 41a:    8b 81 18 00 00 00      mov    0x18(%ecx),%eax
 420:    5d                     pop    %ebp
 421:    c3                     ret

00000422 <__i686.get_pc_thunk.cx>:
 422:    8b 0c 24               mov    (%esp),%ecx
 425:    c3                     ret
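
The magic here is __i686.get_pc_thunk.cx. i386 does not let us read the address of the current instruction directly, but a call pushes a known, fixed address onto the stack: its return address. The thunk copies that value into %ecx, so after the call %ecx holds 0x414. Some simple arithmetic follows: 0x414 + 0x115c = 0x1570, and the final mov reads 0x18 bytes past that, at 0x1588. Checking the disassembly at that address: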
00001588 <global>:
    1588:       64 00 00                add    %al,%fs:(%eax)
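
The same arithmetic as a small C sketch. The helper name i386_pic_address is hypothetical; its three inputs are the return address pushed by the call, the immediate of the add, and the displacement of the final mov, all read off the disassembly above. The byte 0x64 found at the resulting address is 100, the value foo was initialized with.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the i386 PIC sequence:
 *   call __i686.get_pc_thunk.cx   -> %ecx = return address
 *   add  $base_delta,%ecx         -> %ecx = fixed base address
 *   mov  var_off(%ecx),%eax       -> load from base + var_off */
uint32_t i386_pic_address(uint32_t return_addr, uint32_t base_delta,
                          uint32_t var_off)
{
    uint32_t base = return_addr + base_delta;
    return base + var_off;
}

/* Worked example using the numbers from the listing. */
void i386_pic_address_example(void)
{
    assert(i386_pic_address(0x414, 0x115c, 0) == 0x1570);
    assert(i386_pic_address(0x414, 0x115c, 0x18) == 0x1588);
}
```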




$ cat test.c
extern int foo;

int function(void) {
    return foo;
}

$ gcc -shared -fPIC -o libtest.so test.c
$ objdump --disassemble libtest.so
00000000000005ac <function>:
 5ac:        55                     push   %rbp
 5ad:        48 89 e5               mov    %rsp,%rbp
 5b0:        48 8b 05 71 02 20 00   mov    0x200271(%rip),%rax        # 200828 <_DYNAMIC+0x1a0>
 5b7:        8b 00                  mov    (%rax),%eax
 5b9:        5d                     pop    %rbp
 5ba:        c3                     retq

$ readelf --sections libtest.so
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [20] .got              PROGBITS         0000000000200818  00000818
       0000000000000020  0000000000000008  WA       0     0     8

$ readelf --relocs libtest.so
Relocation section '.rela.dyn' at offset 0x418 contains 5 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200828  000400000006 R_X86_64_GLOB_DAT 0000000000000000 foo + 0
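
The disassembly shows that the value to be returned is loaded through a pointer located 0x200271 bytes past %rip (the address of the next instruction, 0x5b7), that is, at 0x200828. The section headers show that this address lies in the .got section. Looking at the relocations, we find one of type R_X86_64_GLOB_DAT that says "find the value of foo and put it at address 0x200828".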

So, when this library is loaded, the dynamic loader will examine the relocation, go and find the value of foo, and patch the .got entry as required. When it comes time for the code to load that value, the entry will point to the right place and everything just works, without having to modify any code and thus destroy code sharability.
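
The effect of that relocation can be imitated in plain C. This is only a sketch of the idea, not how the dynamic loader is implemented: got_slot_foo stands in for the .got entry at 0x200828, loader_apply_glob_dat for the loader processing R_X86_64_GLOB_DAT, and function_via_got for the two mov instructions in function. All three names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the .got entry; empty until the "loader" fills it in. */
static int *got_slot_foo = NULL;

/* Stand-in for processing R_X86_64_GLOB_DAT: store foo's address in the slot. */
void loader_apply_glob_dat(int *address_of_foo)
{
    got_slot_foo = address_of_foo;
}

/* Mirrors the disassembly: load the pointer from the GOT, then dereference it.
 * Nothing in this function changes when foo lives somewhere else. */
int function_via_got(void)
{
    assert(got_slot_foo != NULL); /* the relocation must have been applied */
    return *got_slot_foo;
}
```

Only the data slot is written at load time; the code that reads through it stays byte-for-byte identical in every process.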


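That handles data, but what about function calls? The indirection used for them is called the procedure linkage table, or PLT. Code does not call an external function directly, but only via a PLT stub.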

$ cat test.c
int foo(void);

int function(void) {
    return foo();
}

$ gcc -shared -fPIC -o libtest.so test.c

$ objdump --disassemble libtest.so
00000000000005bc <function>:
 5bc:        55                     push   %rbp
 5bd:        48 89 e5               mov    %rsp,%rbp
 5c0:        e8 0b ff ff ff         callq  4d0 <foo@plt>
 5c5:        5d                     pop    %rbp

$ objdump --disassemble-all libtest.so
00000000000004d0 <foo@plt>:
 4d0:   ff 25 82 03 20 00       jmpq   *0x200382(%rip)        # 200858 <_GLOBAL_OFFSET_TABLE_+0x18>
 4d6:   68 00 00 00 00          pushq  $0x0
 4db:   e9 e0 ff ff ff          jmpq   4c0 <_init+0x18>

$ readelf --relocs libtest.so
Relocation section '.rela.plt' at offset 0x478 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200858  000400000007 R_X86_64_JUMP_SLO 0000000000000000 foo + 0


$ objdump --disassemble-all libtest.so

Disassembly of section .got.plt:

0000000000200840 <.got.plt>:
  200840:       98                      cwtl
  200841:       06                      (bad)
  200842:       20 00                   and    %al,(%rax)
  200858:       d6                      (bad)
  200859:       04 00                   add    $0x0,%al
  20085b:       00 00                   add    %al,(%rax)
  20085d:       00 00                   add    %al,(%rax)
  20085f:       00 e6                   add    %ah,%dh
  200861:       04 00                   add    $0x0,%al
  200863:       00 00                   add    %al,(%rax)
  200865:       00 00                   add    %al,(%rax)
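
objdump is disassembling data here, so the mnemonics are meaningless; only the bytes matter. Read as a little-endian 64-bit value, the eight bytes starting at 0x200858 give 0x4d6, which is the address of the pushq instruction that follows the jmpq in foo@plt. A minimal sketch of that decoding, where read_le64 and got_plt_entry_foo are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian 64-bit value from raw bytes. */
uint64_t read_le64(const uint8_t *bytes)
{
    assert(bytes != NULL);
    uint64_t value = 0;
    for (int i = 7; i >= 0; i--)
        value = (value << 8) | bytes[i];
    return value;
}

/* The eight bytes objdump shows at 0x200858..0x20085f in .got.plt. */
const uint8_t got_plt_entry_foo[8] = {
    0xd6, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};
```

So before foo has been resolved, the jmpq at 0x4d0 simply lands on the next instruction of its own stub, which pushes 0 and jumps to 0x4c0.
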
00000000000004c0 <foo@plt-0x10>:
 4c0:   ff 35 82 03 20 00       pushq  0x200382(%rip)        # 200848 <_GLOBAL_OFFSET_TABLE_+0x8>
 4c6:   ff 25 84 03 20 00       jmpq   *0x200384(%rip)        # 200850 <_GLOBAL_OFFSET_TABLE_+0x10>
 4cc:   0f 1f 40 00             nopl   0x0(%rax)

What's going on here? What's actually happening is lazy binding: by convention, when the dynamic linker loads a library, it puts an identifier and a resolution function into known places in the GOT. Roughly, then: on the first call of a function, the jump falls through to the default stub, which loads the identifier and calls into the dynamic linker, which at that point has enough information to figure out "hey, this libtest.so is trying to find the function foo". It will go ahead and find it, and then patch the address into the GOT such that the next time the original PLT entry is called, it will jump to the actual address of the function rather than to the lookup stub. Ingenious!
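
As a toy model of this, the following C sketch uses a function pointer in place of the .got.plt slot. Real lazy binding is done by the PLT stubs and the dynamic linker's resolver, not by C code like this; every name here (got_plt_foo, lazy_stub, real_foo, call_foo_via_plt, resolver_calls) is hypothetical, and the return value 42 is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_ptr)(void);

/* The "real" foo that some other library would provide. */
static int real_foo(void) { return 42; }

static int lazy_stub(void);

/* Stand-in for foo's .got.plt slot: starts out pointing at the lookup path. */
static func_ptr got_plt_foo = lazy_stub;

/* Counts how often the simulated dynamic linker had to resolve foo. */
int resolver_calls = 0;

/* Stand-in for the stub plus resolver: look foo up, patch the slot,
 * then continue into the real function. */
static int lazy_stub(void)
{
    resolver_calls++;
    got_plt_foo = real_foo; /* patch the GOT entry */
    return got_plt_foo();
}

/* Equivalent of "callq foo@plt": always an indirect jump through the slot. */
int call_foo_via_plt(void)
{
    assert(got_plt_foo != NULL);
    return got_plt_foo();
}
```

Calling call_foo_via_plt twice runs the resolver only once; the second call goes straight to real_foo through the patched slot.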

Out of this indirection falls another handy thing: the ability to modify the symbol binding order. LD_PRELOAD, for example, simply tells the dynamic loader to insert a library first in the symbol lookup order; therefore, when the above binding happens, if the preloaded library defines a foo, it will be chosen over any other one provided.
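
The "first match wins" rule can be sketched as a lookup over an ordered list. This is a deliberate simplification that models nothing but the search order; struct toy_lib, toy_lookup and any library names passed to it are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A toy "library": its name and the one symbol it defines (NULL for none). */
struct toy_lib {
    const char *name;
    const char *symbol;
};

/* Return the first library in search order that defines `symbol`,
 * or NULL when none does. Earlier entries win, as with preloading. */
const struct toy_lib *toy_lookup(const struct toy_lib *order, size_t count,
                                 const char *symbol)
{
    assert(order != NULL && symbol != NULL);
    for (size_t i = 0; i < count; i++) {
        if (order[i].symbol != NULL && strcmp(order[i].symbol, symbol) == 0)
            return &order[i];
    }
    return NULL;
}
```

Putting a preloaded entry that defines foo at the front of the array makes toy_lookup return it instead of the library that would otherwise have supplied foo.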

In summary: code should always be read-only. So that it can still access data from other libraries and call external functions, those accesses are indirected through a GOT and a PLT, which live at offsets known at compile time.
