Linux Child Processes

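1. Overview of fork()

In a Linux program, fork() creates a child process. Specifically:

  • Calling fork() creates a copy of the calling process.

  • The calling process is called the parent process; the newly created process is called the child process.

  • Both parent and child continue executing from the point where fork() returns.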


2. Telling parent and child apart


2.1 The fork() function

flying-bird@flying-bird:~$ man fork | more

FORK(2)                    Linux Programmer's Manual                   FORK(2)

NAME
       fork - create a child process

SYNOPSIS
       #include <unistd.h>

       pid_t fork(void);

DESCRIPTION
       fork() creates a new process by duplicating the calling process. The
       new process, referred to as the child, is an exact duplicate of the
       calling process, referred to as the parent, except for the following
       points:

       *  The child has its own unique process ID, and this PID does not match
          the ID of any existing process group (setpgid(2)).

       *  The child's parent process ID is the same as the parent's process
          ID.

       *  The child does not inherit its parent's memory locks (mlock(2),
          mlockall(2)).

       *  Process resource utilizations (getrusage(2)) and CPU time counters
          (times(2)) are reset to zero in the child.

       *  The child's set of pending signals is initially empty
          (sigpending(2)).

       *  The child does not inherit semaphore adjustments from its parent
          (semop(2)).

       *  The child does not inherit record locks from its parent (fcntl(2)).

       *  The child does not inherit timers from its parent (setitimer(2),
          alarm(2), timer_create(2)).

       *  The child does not inherit outstanding asynchronous I/O operations
          from its parent (aio_read(3), aio_write(3)), nor does it inherit any
          asynchronous I/O contexts from its parent (see io_setup(2)).

       The process attributes in the preceding list are all specified in
       POSIX.1-2001. The parent and child also differ with respect to the
       following Linux-specific process attributes:

       *  The child does not inherit directory change notifications (dnotify)
          from its parent (see the description of F_NOTIFY in fcntl(2)).

       *  The prctl(2) PR_SET_PDEATHSIG setting is reset so that the child
          does not receive a signal when its parent terminates.

       *  Memory mappings that have been marked with the madvise(2)
          MADV_DONTFORK flag are not inherited across a fork().

       *  The termination signal of the child is always SIGCHLD (see
          clone(2)).

       Note the following further points:

       *  The child process is created with a single thread--the one that
          called fork(). The entire virtual address space of the parent is
          replicated in the child, including the states of mutexes, condition
          variables, and other pthreads objects; the use of pthread_atfork(3)
          may be helpful for dealing with problems that this can cause.

       *  The child inherits copies of the parent's set of open file
          descriptors. Each file descriptor in the child refers to the same
          open file description (see open(2)) as the corresponding file
          descriptor in the parent. This means that the two descriptors share
          open file status flags, current file offset, and signal-driven I/O
          attributes (see the description of F_SETOWN and F_SETSIG in
          fcntl(2)).

       *  The child inherits copies of the parent's set of open message queue
          descriptors (see mq_overview(7)). Each descriptor in the child
          refers to the same open message queue description as the
          corresponding descriptor in the parent. This means that the two
          descriptors share the same flags (mq_flags).

       *  The child inherits copies of the parent's set of open directory
          streams (see opendir(3)). POSIX.1-2001 says that the corresponding
          directory streams in the parent and child may share the directory
          stream positioning; on Linux/glibc they do not.

RETURN VALUE
       On success, the PID of the child process is returned in the parent, and
       0 is returned in the child. On failure, -1 is returned in the parent,
       no child process is created, and errno is set appropriately.

ERRORS
       EAGAIN fork() cannot allocate sufficient memory to copy the parent's
              page tables and allocate a task structure for the child.

       EAGAIN It was not possible to create a new process because the caller's
              RLIMIT_NPROC resource limit was encountered. To exceed this
              limit, the process must have either the CAP_SYS_ADMIN or the
              CAP_SYS_RESOURCE capability.

       ENOMEM fork() failed to allocate the necessary kernel structures
              because memory is tight.

CONFORMING TO
       SVr4, 4.3BSD, POSIX.1-2001.

NOTES
       Under Linux, fork() is implemented using copy-on-write pages, so the
       only penalty that it incurs is the time and memory required to
       duplicate the parent's page tables, and to create a unique task
       structure for the child.

       Since version 2.3.3, rather than invoking the kernel's fork() system
       call, the glibc fork() wrapper that is provided as part of the NPTL
       threading implementation invokes clone(2) with flags that provide the
       same effect as the traditional system call. The glibc wrapper invokes
       any fork handlers that have been established using pthread_atfork(3).

EXAMPLE
       See pipe(2) and wait(2).

SEE ALSO
       clone(2), execve(2), setrlimit(2), unshare(2), vfork(2), wait(2),
       daemon(3), capabilities(7), credentials(7)

COLOPHON
       This page is part of release 3.35 of the Linux man-pages project. A
       description of the project, and information about reporting bugs, can
       be found at http://man7.org/linux/man-pages/.

Linux                             2009-04-27                           FORK(2)

flying-bird@flying-bird:~$


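2.2 How to tell them apart

After fork(), the program has to distinguish the parent from the child so that each one follows its own code path.


This is done with the value returned by fork(): a return value of 0 means the current process is the child; a value greater than 0 (the child's PID) means it is the parent. A return value of -1 means fork() failed; in that case no child is created and errno is set.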



3. Example

Here is Listing 3.3 from "Advanced Linux Programming" (ALP):

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    pid_t child_pid;

    printf("the main program process ID is %d\n", (int)getpid());

    /* fork() returns the child's PID in the parent and 0 in the child.
       As in the book's listing, the -1 error return is not handled. */
    child_pid = fork();
    if (child_pid != 0) {
        printf("this is the parent process, with id %d\n", (int)getpid());
        printf("the child's process ID is %d\n", (int)child_pid);
    } else {
        printf("this is the child process, with id %d\n", (int)getpid());
    }
    return 0;
}

Output:

flying-bird@flying-bird:~/examples/cpp/fork$ gcc fork_list3_3.c
flying-bird@flying-bird:~/examples/cpp/fork$ ./a.out
the main program process ID is 3161
this is the parent process, with id 3161
the child's process ID is 3162
this is the child process, with id 3162
flying-bird@flying-bird:~/examples/cpp/fork$

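4. Waiting for processes

4.1 When processes exit

After fork(), which finishes first, the parent or the child? In other words, which process terminates first?

The answer: it depends on the code of the two processes themselves. It is not the case that the child always ends first and the parent afterwards.

To illustrate, here is an example in which the parent sleeps for 10 seconds and the child for 20 seconds. Meanwhile we use ps to watch the list of existing processes.

Starting from ALP Listing 3.3, we add two sleep() calls: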

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main()
{
    pid_t child_pid;

    printf("the main program process ID is %d\n", (int)getpid());

    child_pid = fork();
    if (child_pid != 0) {
        printf("this is the parent process, with id %d\n", (int)getpid());
        printf("the child's process ID is %d\n", (int)child_pid);
        sleep(10);   /* the parent exits after 10 seconds */
    } else {
        printf("this is the child process, with id %d\n", (int)getpid());
        sleep(20);   /* the child keeps running for 20 seconds */
    }

    return 0;
}

We open two terminals: one runs the program above, the other repeatedly checks the process list with ps.

Output in the first terminal:

flying-bird@flying-bird:~/examples/cpp/fork$ ./a.out
the main program process ID is 3392
this is the parent process, with id 3392
the child's process ID is 3393
this is the child process, with id 3393
flying-bird@flying-bird:~/examples/cpp/fork$

Observations in the second terminal:

flying-bird@flying-bird:~$ ps -a
  PID TTY          TIME CMD
 3363 pts/3    00:00:00 man
 3374 pts/3    00:00:00 pager
 3392 pts/0    00:00:00 a.out
 3393 pts/0    00:00:00 a.out
 3394 pts/1    00:00:00 ps
flying-bird@flying-bird:~$ ps -a
  PID TTY          TIME CMD
 3363 pts/3    00:00:00 man
 3374 pts/3    00:00:00 pager
 3393 pts/0    00:00:00 a.out
 3395 pts/1    00:00:00 ps
flying-bird@flying-bird:~$ ps -a
  PID TTY          TIME CMD
 3363 pts/3    00:00:00 man
 3374 pts/3    00:00:00 pager
 3398 pts/1    00:00:00 ps
flying-bird@flying-bird:~$

As the second listing shows, the parent (PID 3392) ends first; the child (PID 3393) is still running and ends later.

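4.2 The parent waits for the child

In real projects the parent often has to wait for the child to finish and then decide how to proceed. Beyond that, it needs to know how the child terminated: normally, abnormally, and so on.

This is what the wait family of system calls is for.

The wait family has four forms:

  • wait(): the parent waits for any one of its children to terminate (through exit or abnormally);

  • waitpid(): the parent waits for a specific child to terminate;

  • wait3() and wait4(): like wait() and waitpid() respectively, but they also return resource usage information about the child.

waitpid() is the one used most often.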

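4.3 Ways a child process can terminate

A child can terminate in the following ways:

  • the child calls exit() or returns from main();

  • the child terminates abnormally, for example because of a division by zero;

  • other abnormal termination, for example being killed by a signal sent from another process.

In each case the parent can use wait() to find out how the child terminated. A code example for each case follows; the general calling pattern is: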

int status;
waitpid(child_pid, &status, 0);

/* see "man waitpid" for detail */
if (WIFEXITED(status)) {
    printf("exited, status=%d\n", WEXITSTATUS(status));
} else if (WIFSIGNALED(status)) {
    printf("killed by signal %d\n", WTERMSIG(status));
} else if (WIFSTOPPED(status)) {
    /* reported only if WUNTRACED is passed or the child is traced */
    printf("stopped by signal %d\n", WSTOPSIG(status));
} else {
    printf("unknown\n");
}



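4.3.1 exit() and return

Calling exit() and returning from main() both count as normal termination, and this is the most common way for a child to end. It corresponds to:

if (WIFEXITED(status)) {
    printf("exited, status=%d\n", WEXITSTATUS(status));
}

A complete example: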

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

/* Runs in the child and terminates it with the given exit code.
   Returning exit_code from main() would have the same effect. */
void child_foo(int exit_code)
{
    printf("child_foo()\n");
    exit(exit_code);
}

int get_exit_code(int argc, const char* argv[])
{
    if (argc != 2) {
        printf("Usage: %s exit_code\n", argv[0]);
        exit(-1);
    }

    /* other input errors are ignored for brevity */
    return atoi(argv[1]);
}

int main(int argc, const char* argv[])
{
    pid_t child_pid;
    int status;
    int exit_code = get_exit_code(argc, argv);

    printf("the main program process ID is %d\n", (int)getpid());

    child_pid = fork();
    if (child_pid == -1) {
        perror("fork");
        return 1;
    }

    /* child process */
    if (child_pid == 0) {
        child_foo(exit_code);   /* never returns */
    }

    /* parent process */
    waitpid(child_pid, &status, 0);

    /* see "man waitpid" for detail */
    if (WIFEXITED(status)) {
        printf("exited, status=%d\n", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        printf("killed by signal %d\n", WTERMSIG(status));
    } else if (WIFSTOPPED(status)) {
        printf("stopped by signal %d\n", WSTOPSIG(status));
    } else {
        printf("unknown\n");
    }

    return 0;
}

Output:

flying-bird@flying-bird:~/examples/cpp/wait$ ./a.out 1
the main program process ID is 2635
child_foo()
exited, status=1
flying-bird@flying-bird:~/examples/cpp/wait$ ./a.out 12
the main program process ID is 2637
child_foo()
exited, status=12
flying-bird@flying-bird:~/examples/cpp/wait$

With the example above in mind, here is a passage from ALP section 3.4, "Process Termination":

Normally, a process terminates in one of two ways. Either the executing program calls the exit function, or the program's main function returns. Each process has an exit code: a number that the process returns to its parent. The exit code is the argument passed to the exit function, or the value returned from main.

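4.3.2 Abnormal termination & signals

A process can also terminate because of an error condition, such as a division by zero (SIGFPE), a segmentation fault (SIGSEGV), or the SIGABRT raised by abort().

We modify the code above as follows: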

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

void test_SIGFPE()
{
    int i;

    /* The last iteration divides by zero; on x86 this raises SIGFPE. */
    for (i = 10; i >= 0; i--) {
        printf("%d %d\n", i, 100 / i);
    }
}

void test_SIGABRT()
{
    printf("the child process will abort.\n");
    abort();
    printf("unreachable statement.\n");
}

int get_abnormal_type(int argc, const char* argv[])
{
    int type = 0;

    if (argc != 2 || (type = atoi(argv[1]), type != 1 && type != 2)) {
        printf("Usage: %s [1|2]\n", argv[0]);
        printf("  1: SIGFPE, 2: SIGABRT\n");
        exit(-1);
    }
    return type;
}

int main(int argc, const char* argv[])
{
    pid_t child_pid;
    int status;
    int type = get_abnormal_type(argc, argv);

    child_pid = fork();

    /* child process */
    if (child_pid == 0) {
        if (type == 1) {
            test_SIGFPE();
        } else {
            test_SIGABRT();
        }
        return 0; /* unreachable */
    }

    /* parent process */
    waitpid(child_pid, &status, 0);

    /* see "man waitpid" for detail */
    if (WIFEXITED(status)) {
        printf("exited, status=%d\n", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        printf("killed by signal %d\n", WTERMSIG(status));
    } else if (WIFSTOPPED(status)) {
        printf("stopped by signal %d\n", WSTOPSIG(status));
    } else {
        printf("unknown\n");
    }

    return 0;
}

Output:

flying-bird@flying-bird:~/examples/cpp/wait$ ./a.out 1
10 10
9 11
8 12
7 14
6 16
5 20
4 25
3 33
2 50
1 100
killed by signal 8
flying-bird@flying-bird:~/examples/cpp/wait$ ./a.out 2
the child process will abort.
killed by signal 6
flying-bird@flying-bird:~/examples/cpp/wait$

The signals are defined in /usr/include/asm-generic/signal.h:

flying-bird@flying-bird:~$ cat /usr/include/asm-generic/signal.h
#ifndef __ASM_GENERIC_SIGNAL_H
#define __ASM_GENERIC_SIGNAL_H

#include <linux/types.h>

#define _NSIG       64
#define _NSIG_BPW   __BITS_PER_LONG
#define _NSIG_WORDS (_NSIG / _NSIG_BPW)

#define SIGHUP       1
#define SIGINT       2
#define SIGQUIT      3
#define SIGILL       4
#define SIGTRAP      5
#define SIGABRT      6
#define SIGIOT       6
#define SIGBUS       7
#define SIGFPE       8
#define SIGKILL      9
#define SIGUSR1     10
#define SIGSEGV     11
#define SIGUSR2     12
#define SIGPIPE     13
#define SIGALRM     14
#define SIGTERM     15
#define SIGSTKFLT   16
#define SIGCHLD     17
#define SIGCONT     18
#define SIGSTOP     19
#define SIGTSTP     20
#define SIGTTIN     21
#define SIGTTOU     22
#define SIGURG      23
#define SIGXCPU     24
#define SIGXFSZ     25
#define SIGVTALRM   26
#define SIGPROF     27
#define SIGWINCH    28
#define SIGIO       29
#define SIGPOLL     SIGIO
/*
#define SIGLOST     29
*/
#define SIGPWR      30
#define SIGSYS      31
#define SIGUNUSED   31

/* These should not be considered constants from userland. */
#define SIGRTMIN    32
#ifndef SIGRTMAX
#define SIGRTMAX    _NSIG
#endif



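Provoking a real fault is usually awkward, so for testing you can call kill() directly and pass the signal number that corresponds to the abnormal termination you want to simulate: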

#include <stdio.h>
#include <sys/types.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>

/*
   Usage: ./a.out signal_code
   e.g.   ./a.out 5
*/
int main(int argc, const char* argv[])
{
    pid_t child_pid;
    int status;

    if (argc != 2) {
        printf("Usage: %s signal_code\n", argv[0]);
        return 1;
    }

    child_pid = fork();

    if (child_pid == 0) {
        printf("child process is sleeping.\n");
        sleep(100);
        return 0;
    }

    sleep(2); /* Give the child process a chance to print its line. */
    kill(child_pid, atoi(argv[1]));

    waitpid(child_pid, &status, 0);

    if (WIFEXITED(status)) {
        printf("exited, status=%d\n", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        printf("killed by signal %d\n", WTERMSIG(status));
    } else if (WIFSTOPPED(status)) {
        printf("stopped by signal %d\n", WSTOPSIG(status));
    } else {
        printf("unknown\n");
    }

    return 0;
}

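5. Zombie processes & orphan processes

We briefly discussed earlier which of parent and child exits first; this section looks at the question more closely.

5.1 Process termination information

As discussed above, wait, waitpid and related functions obtain information about how a child terminated, such as whether it exited normally or abnormally. Put differently, when a child terminates, the Linux kernel keeps some information about each terminated child, including its process ID and its termination status (exit code or signal info). The parent retrieves this information when it calls wait or waitpid.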

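5.2 Zombie processes

A zombie process is also called a defunct process. Put simply, it is a child that has already terminated while its parent is still running and has not yet called wait for it. A process that is executing is considered alive; once it has finished or has been terminated, it has "died". Until the parent collects its termination status it remains in the process table, which is why it is called a zombie process or defunct process.

A passage from Wikipedia (http://en.wikipedia.org/wiki/Zombie_process):

The term zombie process derives from the common definition of zombie - an undead person. In the term's metaphor, the child process has "died" but has not yet been "reaped".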

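5.3 Orphan processes

The counterpart of a zombie process is the orphan process: the parent has finished and terminated, but the child is still running. The child's parent no longer exists, hence the name.

Of course, this assumes that the parent terminated without calling wait for the child.

When the parent terminates, the orphan is adopted by the init process (this is called reparenting).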

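5.4 Worked example

Next we illustrate these concepts with an example. The running time of parent and child is controlled by the length of their sleep() calls, and both durations are passed on the command line (the parent's first, then the child's, as the runs below show) to keep the example short. For the same reason, most error handling is left out.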



5.4.1 Zombie process

flying-bird@flying-bird:~/examples/cpp/zombile_orphan_process$ ./a.out 20 10
Parent process's pid: 2944
this is the parent process, with id 2944
the child's process ID is 2945
this is the child process, with id 2945
child process will be terminated.
parent process will be terminated.
flying-bird@flying-bird:~/examples/cpp/zombile_orphan_process$

Use ps to check the state of the two processes (the columns are PID, PPID, STAT and CMD):

1) Parent and child are both running:

 2714  2705 Ss   bash
 2821     1 Sl   gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
 2846     2 S    [kworker/0:3]
 2870  2705 Ss   bash
 2943     2 S    [kworker/0:2]
 2944  2714 S+   ./a.out 20 10
 2945  2944 S+   ./a.out 20 10
 2947  2870 R+   ps -e -o pid,ppid,stat,cmd

2) The child has terminated (zombie) while the parent is still running:

 2714  2705 Ss   bash
 2821     1 Sl   gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
 2846     2 S    [kworker/0:3]
 2870  2705 Ss   bash
 2943     2 S    [kworker/0:2]
 2944  2714 S+   ./a.out 20 10
 2945  2944 Z+   [a.out] <defunct>
 2948  2870 R+   ps -e -o pid,ppid,stat,cmd

3) Parent and child have both terminated:

 2714  2705 Ss+  bash
 2821     1 Sl   gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
 2846     2 S    [kworker/0:3]
 2870  2705 Ss   bash
 2943     2 S    [kworker/0:2]
 2949  2870 R+   ps -e -o pid,ppid,stat,cmd

5.4.2 Meaning of the process state codes shown by ps

From man ps:

PROCESS STATE CODES
       Here are the different values that the s, stat and state output
       specifiers (header "STAT" or "S") will display to describe the state of
       a process:

       D    uninterruptible sleep (usually IO)
       R    running or runnable (on run queue)
       S    interruptible sleep (waiting for an event to complete)
       T    stopped, either by a job control signal or because it is being
            traced.
       W    paging (not valid since the 2.6.xx kernel)
       X    dead (should never be seen)
       Z    defunct ("zombie") process, terminated but not reaped by its
            parent.

       For BSD formats and when the stat keyword is used, additional
       characters may be displayed:

       <    high-priority (not nice to other users)
       N    low-priority (nice to other users)
       L    has pages locked into memory (for real-time and custom IO)
       s    is a session leader
       l    is multi-threaded (using CLONE_THREAD, like NPTL pthreads do)
       +    is in the foreground process group.

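5.4.3 What happens to a zombie when its parent terminates

The example and the ps output above show that the zombie disappears once the parent terminates. The reason is that when a parent terminates, all of its children are inherited by the init process, and init automatically reaps the zombies it inherits.

That is why, in the example above, the child vanished as soon as the parent finished.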

5.4.4 Orphan process

Next we construct an orphan process.

flying-bird@flying-bird:~/examples/cpp/zombile_orphan_process$ ./a.out 10 20
Parent process's pid: 3684
this is the parent process, with id 3684
the child's process ID is 3685
this is the child process, with id 3685
parent process will be terminated.
flying-bird@flying-bird:~/examples/cpp/zombile_orphan_process$ child process will be terminated.

Note that the shell prompt comes back as soon as the parent exits, and the child's last line is printed after it. The ps observations follow.

1) Parent and child are both running:

 2714  2705 Ss   bash
 2821     1 Rl   gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
 2870  2705 Ss   bash
 3038     2 S    [kworker/1:1]
 3079     2 S    [kworker/0:3]
 3646     2 S    [kworker/0:0]
 3666     2 S    [kworker/0:2]
 3684  2714 S+   ./a.out 10 20
 3685  3684 S+   ./a.out 10 20
 3688  2870 R+   ps -e -o pid,ppid,stat,cmd

2) The parent has finished while the child is still running (it has become an orphan and been adopted by init, so its PPID is now 1):

 2714  2705 Ss+  bash
 2821     1 Rl   gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
 2870  2705 Ss   bash
 3038     2 S    [kworker/1:1]
 3079     2 S    [kworker/0:3]
 3646     2 S    [kworker/0:0]
 3666     2 S    [kworker/0:2]
 3685     1 S    ./a.out 10 20
 3689  2870 R+   ps -e -o pid,ppid,stat,cmd

3) The child has finished:

 2714  2705 Ss+  bash
 2821     1 Sl   gedit /home/flying-bird/examples/cpp/zombile_orphan_process/zombie_orphan_process.c
 2870  2705 Ss   bash
 3038     2 S    [kworker/1:1]
 3079     2 S    [kworker/0:3]
 3646     2 S    [kworker/0:0]
 3666     2 S    [kworker/0:2]
 3690  2870 R+   ps -e -o pid,ppid,stat,cmd


References

ALP (Advanced Linux Programming): http://download.csdn.net/download/shmilyy/720746