
An Analysis of SYSINIT in the FreeBSD Kernel [Repost]

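The FreeBSD kernel is a large system, and like any large system it contains many subsystems and modules, each of which must be initialized at boot. The conventional approach is to call every initialization routine explicitly from one central place. With that approach, adding a module means editing the central startup code so that it calls the new module's initializer. Moreover, because ANSI C expects a function to be declared before it is called, the file that makes those calls ends up including more and more otherwise unrelated header files as the number of initializers grows. Coupling increases, which is poor design.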

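To deal with this, FreeBSD uses a mechanism called SYSINIT. FreeBSD uses ELF as its binary object and executable file format. ELF lets a file be organized internally into distinct parts called sections, and SYSINIT is built on exactly this feature.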

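FreeBSD uses GNU GCC as its C compiler, which allows assembly code to be embedded in C source. FreeBSD embeds assembler directives in C source to add extra sections to the object file. The file /sys/sys/linker_set.h defines the following: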

#ifdef __alpha__
#define MAKE_SET(set, sym)                                              \
        static void const * const __set_##set##_sym_##sym = &sym;       \
        __asm(".align 3");                                              \
        __asm(".section .set." #set ",\"aw\"");                         \
        __asm(".quad " #sym);                                           \
        __asm(".previous")
#else
#define MAKE_SET(set, sym)                                              \
        static void const * const __set_##set##_sym_##sym = &sym;       \
        __asm(".section .set." #set ",\"aw\"");                         \
        __asm(".long " #sym);                                           \
        __asm(".previous")
#endif
#define TEXT_SET(set, sym) MAKE_SET(set, sym)
#define DATA_SET(set, sym) MAKE_SET(set, sym)

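Whenever a program invokes the DATA_SET macro, the corresponding assembler directives are emitted into the object file. For example:
int myint;
DATA_SET(myset, myint);
These two lines create a section for myset in the object file (named .set.myset by the macro above) and place the address of myint in that section.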
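
The same idea can be tried in an ordinary user program. The sketch below is not the FreeBSD macro: it assumes GCC or Clang targeting ELF with a GNU-compatible linker, uses the compiler's section attribute instead of inline assembly, and relies on the __start_/__stop_ symbols that such linkers provide for sections whose names are valid C identifiers. The names SET_ENTRY, set_count and set_sum are made up for illustration.

```c
#include <stddef.h>

/*
 * Userland sketch of a linker set (assumption: GCC or Clang, ELF target,
 * GNU-compatible linker). SET_ENTRY, set_count and set_sum are hypothetical
 * names and do not come from the FreeBSD headers.
 */
#define SET_ENTRY(set, sym)                                     \
        static const void *const set_##set##_sym_##sym          \
            __attribute__((section(#set), used)) = &sym

int first = 10;
int second = 32;

SET_ENTRY(myset, first);   /* address of first goes into section "myset" */
SET_ENTRY(myset, second);  /* the linker appends this to the same section */

/*
 * For a section whose name is a valid C identifier, the linker defines
 * these two symbols at the start and end of the merged section.
 */
extern const void *const __start_myset[];
extern const void *const __stop_myset[];

/* Number of pointers collected in the set. */
size_t set_count(void)
{
        return (size_t)(__stop_myset - __start_myset);
}

/* Sum of the ints the set points to; the order of entries does not matter. */
int set_sum(void)
{
        int total = 0;

        for (const void *const *p = __start_myset; p < __stop_myset; p++)
                total += *(const int *)*p;
        return total;
}
```

Calling set_count() and set_sum() from main walks every entry that any SET_ENTRY line in the program contributed, without the caller naming first or second.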

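System initialization must proceed in a strict order. FreeBSD therefore assigns an order number to each subsystem. These numbers, along with many other SYSINIT definitions, are in the header file /sys/sys/kernel.h: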

enum sysinit_sub_id {
        SI_SUB_DUMMY            = 0x0000000,    /* not executed; for linker*/
        SI_SUB_DONE             = 0x0000001,    /* processed*/
        SI_SUB_CONSOLE          = 0x0800000,    /* console*/
        SI_SUB_COPYRIGHT        = 0x0800001,    /* first use of console*/
        SI_SUB_TUNABLES         = 0x0700000,    /* establish tunable values */
        SI_SUB_VM               = 0x1000000,    /* virtual memory system init*/
        SI_SUB_KMEM             = 0x1800000,    /* kernel memory*/
        SI_SUB_KVM_RSRC         = 0x1A00000,    /* kvm operational limits*/
        SI_SUB_CPU              = 0x1e00000,    /* CPU resource(s)*/
        SI_SUB_KLD              = 0x1f00000,    /* KLD and module setup */
        SI_SUB_INTRINSIC        = 0x2000000,    /* proc 0*/
        SI_SUB_VM_CONF          = 0x2100000,    /* config VM, set limits*/
        SI_SUB_RUN_QUEUE        = 0x2200000,    /* the run queue*/
        SI_SUB_CREATE_INIT      = 0x2300000,    /* create the init process */
        SI_SUB_DRIVERS          = 0x2400000,    /* Let Drivers initialize */
        SI_SUB_CONFIGURE        = 0x3800000,    /* Configure devices */
        SI_SUB_VFS              = 0x4000000,    /* virtual file system*/
        SI_SUB_CLOCKS           = 0x4800000,    /* real time and stat clocks*/
        SI_SUB_MBUF             = 0x5000000,    /* mbufs*/
        SI_SUB_CLIST            = 0x5800000,    /* clists*/
        SI_SUB_SYSV_SHM         = 0x6400000,    /* System V shared memory*/
        SI_SUB_SYSV_SEM         = 0x6800000,    /* System V semaphores*/
        SI_SUB_SYSV_MSG         = 0x6C00000,    /* System V message queues*/
        SI_SUB_P1003_1B         = 0x6E00000,    /* P1003.1B realtime */
        SI_SUB_PSEUDO           = 0x7000000,    /* pseudo devices*/
        SI_SUB_EXEC             = 0x7400000,    /* execve() handlers */
        SI_SUB_PROTO_BEGIN      = 0x8000000,    /* XXX: set splimp (kludge)*/
        ...
};

Within a subsystem, a second number orders the individual elements:
enum sysinit_elem_order {
        SI_ORDER_FIRST          = 0x0000000,    /* first*/
        SI_ORDER_SECOND         = 0x0000001,    /* second*/
        SI_ORDER_THIRD          = 0x0000002,    /* third*/
        SI_ORDER_MIDDLE         = 0x1000000,    /* somewhere in the middle */
        SI_ORDER_ANY            = 0xfffffff     /* last*/
};
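
Taken together, the two enums form a two-level key: the subsystem number decides first, and the element order breaks ties inside one subsystem, with smaller values running earlier. A minimal sketch of that comparison follows. The constants are copied from the listings above, while struct key and runs_before are hypothetical helpers that do not exist in kernel.h.

```c
#include <stdbool.h>

/* Values copied from the listings above. */
enum { SI_SUB_VM = 0x1000000, SI_SUB_KMEM = 0x1800000 };
enum {
        SI_ORDER_FIRST  = 0x0000000,
        SI_ORDER_MIDDLE = 0x1000000,
        SI_ORDER_ANY    = 0xfffffff
};

/* Hypothetical pair holding the two ordering numbers of one initializer. */
struct key {
        unsigned int subsystem;  /* primary key: which subsystem     */
        unsigned int order;      /* secondary key: order inside it   */
};

/* True when a must run strictly before b. */
bool runs_before(struct key a, struct key b)
{
        if (a.subsystem != b.subsystem)
                return a.subsystem < b.subsystem;
        return a.order < b.order;
}
```

With these values, even an SI_ORDER_ANY element of SI_SUB_VM runs before an SI_ORDER_FIRST element of SI_SUB_KMEM, because the subsystem number is compared first.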

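For functions that want to be called during system initialization, FreeBSD defines two function pointer types:
typedef void (*sysinit_nfunc_t) __P((void *));
typedef void (*sysinit_cfunc_t) __P((const void *));
These are the prototypes through which initialization functions are called.
Two important macros arrange for an initialization function to run at system startup: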

#define C_SYSINIT(uniquifier, subsystem, order, func, ident)    \
        static struct sysinit uniquifier ## _sys_init = {       \
                subsystem,                                      \
                order,                                          \
                func,                                           \
                ident                                           \
        };                                                      \
        DATA_SET(sysinit_set,uniquifier ## _sys_init);

#define SYSINIT(uniquifier, subsystem, order, func, ident)      \
        C_SYSINIT(uniquifier, subsystem, order,                 \
        (sysinit_cfunc_t)(sysinit_nfunc_t)func, (void *)ident)

Each initialization function is recorded in a structure like this:
        struct sysinit {
                unsigned int    subsystem;      /* subsystem identifier*/
                unsigned int    order;          /* init order within subsystem*/
                sysinit_cfunc_t func;           /* function             */
                const void      *udata;         /* multiplexer/argument */
        };
The structure holds the subsystem identifier, the order within the subsystem, the address of the initialization function, and the argument passed to that function.

Now suppose a function should be called automatically at boot, and it is known to do preparatory work for the VM subsystem. It can be declared like this:

long myvar;
void init_myvar(void *p)
{
        *(long *)p = 2;
}
SYSINIT(init_myvar, SI_SUB_VM, 1000, init_myvar, &myvar)
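
To see what this declaration amounts to, the sketch below writes out by hand roughly what the SYSINIT line expands to, leaving out the DATA_SET registration so that it compiles outside the kernel. The typedefs and struct mirror the listings above and the SI_SUB_VM value is copied from the enum. run_one and run_example are hypothetical helpers added only to exercise the record.

```c
/* Stand-in definitions mirroring the listings above. */
typedef void (*sysinit_nfunc_t)(void *);
typedef void (*sysinit_cfunc_t)(const void *);

struct sysinit {
        unsigned int    subsystem;      /* subsystem identifier         */
        unsigned int    order;          /* init order within subsystem  */
        sysinit_cfunc_t func;           /* function                     */
        const void      *udata;         /* argument handed to func      */
};

#define SI_SUB_VM 0x1000000

long myvar;

void init_myvar(void *p)
{
        *(long *)p = 2;
}

/*
 * Roughly what SYSINIT(init_myvar, SI_SUB_VM, 1000, init_myvar, &myvar)
 * produces, minus the DATA_SET(sysinit_set, ...) step.
 */
static struct sysinit init_myvar_sys_init = {
        SI_SUB_VM,
        1000,
        (sysinit_cfunc_t)(sysinit_nfunc_t)init_myvar,
        (void *)&myvar
};

/*
 * Hypothetical helper: call the function stored in one record.
 * The kernel calls through the sysinit_cfunc_t type directly; casting back
 * to the function's real type keeps this sketch strictly conforming C.
 */
void run_one(const struct sysinit *si)
{
        ((sysinit_nfunc_t)si->func)((void *)si->udata);
}

/* Hypothetical driver: run the record above and return the new myvar. */
long run_example(void)
{
        run_one(&init_myvar_sys_init);
        return myvar;
}
```

The record carries everything the caller needs, so the code that eventually invokes init_myvar never has to see its declaration or that of myvar.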

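Initialization records declared this way are scattered across many object files. When the link editor ld runs, it merges all data belonging to the same section into one contiguous block of addresses. Because this section holds only pointers to sysinit structures, FreeBSD can treat the block as an array of struct sysinit *: it locates the address of sysinit_set, walks the array, and calls each initialization function. To learn the exact size of the section (reading the ELF file directly is possible but too complicated, and the file system may not even be initialized yet when the kernel runs its initializers), the build includes a tool called gensetdefs. It scans a given list of .o object files, finds every section whose name begins with .set., counts how many initialization functions each one holds, and generates a long integer counter placed at the start of sysinit_set. gensetdefs generates three files:
setdef0.c setdef1.c setdefs.h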

Contents of setdef0.c:

--------------------------------------------------------
/* THIS FILE IS GENERATED, DO NOT EDIT. */

#define DEFINE_SET(set, count)                  \
__asm__(".section .set." #set ",\"aw\"");       \
__asm__(".globl " #set);                        \
__asm__(".type " #set ",@object");              \
__asm__(".p2align 2");                          \
__asm__(#set ":");                              \
__asm__(".long " #count);                       \
__asm__(".previous")

#include "setdefs.h"            /* Contains a `DEFINE_SET' for each set */
--------------------------------------------------------

The effect of this DEFINE_SET is to define the global symbol and the leading count, so that the section can be read as this C structure:
struct linker_set {
        int     ls_length;
        void    *ls_items[1];           /* really ls_length of them,
                                         * trailing NULL */
};

Contents of setdef1.c:

--------------------------------------------------------
/* THIS FILE IS GENERATED, DO NOT EDIT. */

#define DEFINE_SET(set, count)                  \
__asm__(".section .set." #set ",\"aw\"");       \
__asm__(".long 0");                             \
__asm__(".previous")

#include "setdefs.h"            /* Contains a `DEFINE_SET' for each set */
--------------------------------------------------------

This DEFINE_SET places a long 0 into the corresponding section.

Contents of setdefs.h:

DEFINE_SET(cons_set, 3);
DEFINE_SET(kbddriver_set, 2);
DEFINE_SET(periphdriver_set, 5);
DEFINE_SET(scrndr_set, 9);
DEFINE_SET(scterm_set, 1);
DEFINE_SET(sysctl_set, 552);
DEFINE_SET(sysinit_set, 323);
DEFINE_SET(sysuninit_set, 155);
DEFINE_SET(vga_set, 9);
DEFINE_SET(videodriver_set, 4);

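When the kernel is linked, the Makefile places setdef0.o first, so ld puts the counter at the very start of the section. The FreeBSD kernel can then read the counter from the start of the section and learn how many initialization functions there are. FreeBSD's other .o files come in the middle of the link order, and setdef1.o brings up the rear. setdef1.c defines a null pointer that marks the end of the section. I call this arrangement a sandwich.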
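
The sandwich can be pictured as one flat array: a count, then the entries, then a zero. The sketch below builds such an array by hand in user space and walks it both ways. It is only a model of the layout: section_image, count_by_length, count_by_terminator and example_image are hypothetical names, and the count is stored in a pointer-sized slot here, whereas the generated code above emits it as a .long.

```c
#include <stddef.h>
#include <stdint.h>

static int a, b, c;   /* stand-ins for three registered objects */

/* Hypothetical image of one merged section, in link order. */
static const void *const section_image[] = {
        (const void *)(uintptr_t)3,   /* from setdef0.o: element count     */
        &a,                           /* from the ordinary .o files        */
        &b,
        &c,
        NULL                          /* from setdef1.o: terminating zero  */
};

/* Read the leading count, as a user of struct linker_set would. */
size_t count_by_length(const void *const *image)
{
        return (size_t)(uintptr_t)image[0];
}

/* Count entries by scanning for the trailing NULL, skipping the count. */
size_t count_by_terminator(const void *const *image)
{
        size_t n = 0;

        for (const void *const *p = image + 1; *p != NULL; p++)
                n++;
        return n;
}

/* Accessor so a caller can reach the static image. */
const void *const *example_image(void)
{
        return section_image;
}
```

Both walks agree on three entries, which is why a consumer can rely on either the counter or the terminator.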

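The initializers are invoked from the function mi_startup in the kernel file /sys/kern/init_main.c. mi_startup is the first machine-independent C function to run during boot, and the first thing it does is call these initialization functions: it sorts all of them by priority, then calls them in order.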

void
mi_startup(void)
{

        register struct sysinit **sipp;         /* system initialization*/
        register struct sysinit **xipp;         /* interior loop of sort*/
        register struct sysinit *save;          /* bubble*/

restart:

This is the priority sort. It does not use the counter defined in setdef0.c; instead it uses the null pointer defined in setdef1.c as the end marker.

        /*
         * Perform a bubble sort of the system initialization objects by
         * their subsystem (primary key) and order (secondary key).
         */
        for (sipp = sysinit; *sipp; sipp++) {
                for (xipp = sipp + 1; *xipp; xipp++) {
                        if ((*sipp)->subsystem < (*xipp)->subsystem ||
                             ((*sipp)->subsystem == (*xipp)->subsystem &&
                              (*sipp)->order <= (*xipp)->order))
                                continue;       /* skip*/
                        save = *sipp;
                        *sipp = *xipp;
                        *xipp = save;
                }
        }


This is the in-order invocation:
        /*
         * Traverse the (now) ordered list of system initialization tasks.
         * Perform each task, and continue on to the next task.
         *
         * The last item on the list is expected to be the scheduler,
         * which will not return.
         */
        for (sipp = sysinit; *sipp; sipp++) {

                if ((*sipp)->subsystem == SI_SUB_DUMMY)
                        continue;       /* skip dummy task(s)*/

                if ((*sipp)->subsystem == SI_SUB_DONE)
                        continue;

                /* Call function */
                (*((*sipp)->func))((*sipp)->udata);

                /* Check off the one we're just done */
                (*sipp)->subsystem = SI_SUB_DONE;

                /* Check if we've installed more sysinit items via KLD */
                if (newsysinit != NULL) {
                        if (sysinit != (struct sysinit **)sysinit_set.ls_items)
                                free(sysinit, M_TEMP);
                        sysinit = newsysinit;
                        newsysinit = NULL;
                        goto restart;
                }
        }

        panic("Shouldn't get here!");
}
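
The sort-then-dispatch logic can be tried outside the kernel. The sketch below keeps the same two loops but drops the KLD restart and the final panic, and it runs over a small hand-built, NULL-terminated table instead of a linker set. The records, the subsystem numbers other than SI_SUB_DUMMY and SI_SUB_DONE, the trace buffer, and the names record, run_sysinits and run_example are all made up for illustration.

```c
#include <stddef.h>

struct sysinit {
        unsigned int subsystem;
        unsigned int order;
        void (*func)(const void *);
        const void *udata;
};

enum { SI_SUB_DUMMY = 0x0000000, SI_SUB_DONE = 0x0000001 };

/* Hypothetical trace buffer so the call order can be observed. */
static char trace[8];
static size_t trace_len;

static void record(const void *udata)
{
        if (trace_len < sizeof(trace) - 1)
                trace[trace_len++] = *(const char *)udata;
}

/* Four made-up records, deliberately listed out of order. */
static struct sysinit late   = { 0x2000000, 0, record, "C" };
static struct sysinit early  = { 0x1000000, 5, record, "A" };
static struct sysinit middle = { 0x1000000, 9, record, "B" };
static struct sysinit dummy  = { SI_SUB_DUMMY, 0, record, "X" };

static struct sysinit *table[] = { &late, &middle, &dummy, &early, NULL };

/* Same exchange sort and dispatch loop as mi_startup, without the restart. */
void run_sysinits(struct sysinit **list)
{
        struct sysinit **sipp, **xipp, *save;

        for (sipp = list; *sipp; sipp++) {
                for (xipp = sipp + 1; *xipp; xipp++) {
                        if ((*sipp)->subsystem < (*xipp)->subsystem ||
                            ((*sipp)->subsystem == (*xipp)->subsystem &&
                             (*sipp)->order <= (*xipp)->order))
                                continue;       /* already in order */
                        save = *sipp;
                        *sipp = *xipp;
                        *xipp = save;
                }
        }

        for (sipp = list; *sipp; sipp++) {
                if ((*sipp)->subsystem == SI_SUB_DUMMY ||
                    (*sipp)->subsystem == SI_SUB_DONE)
                        continue;               /* never run, or already run */
                (*(*sipp)->func)((*sipp)->udata);
                (*sipp)->subsystem = SI_SUB_DONE;
        }
}

/* Hypothetical driver: run the table and return the recorded call order. */
const char *run_example(void)
{
        run_sysinits(table);
        return trace;
}
```

Marking each record SI_SUB_DONE after its call is what makes the goto restart in the kernel version safe: when the list is re-sorted after a KLD adds entries, records that have already run are skipped.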

SRC=http://www.moon-soft.com/program/bbs/readelite432617.htm