首页 > 代码库 > 如何捕捉并分析SIGSEGV的现场
如何捕捉并分析SIGSEGV的现场
linux下程序对SIGSEGV信号的默认处理方式是产生coredump并终止程序,可以参考man 7 signal
Signal Value Action Comment ────────────────────────────────────────────────────────────────────── SIGHUP 1 Term Hangup detected on controlling terminal or death of controlling process SIGINT 2 Term Interrupt from keyboard SIGQUIT 3 Core Quit from keyboard SIGILL 4 Core Illegal Instruction SIGABRT 6 Core Abort signal from abort(3) SIGFPE 8 Core Floating point exception SIGKILL 9 Term Kill signal SIGSEGV 11 Core Invalid memory reference SIGPIPE 13 Term Broken pipe: write to pipe with no readers SIGALRM 14 Term Timer signal from alarm(2) SIGTERM 15 Term Termination signal SIGUSR1 30,10,16 Term User-defined signal 1 SIGUSR2 31,12,17 Term User-defined signal 2 SIGCHLD 20,17,18 Ign Child stopped or terminated SIGCONT 19,18,25 Cont Continue if stopped SIGSTOP 17,19,23 Stop Stop process SIGTSTP 18,20,24 Stop Stop typed at terminal SIGTTIN 21,21,26 Stop Terminal input for background process SIGTTOU 22,22,27 Stop Terminal output for background process
对于Action的描述
The entries in the "Action" column of the tables below specify the default disposition for each signal, as follows: Term Default action is to terminate the process. Ign Default action is to ignore the signal. Core Default action is to terminate the process and dump core (see core(5)). Stop Default action is to stop the process. Cont Default action is to continue the process if it is currently stopped.
可以看到产生core这个动作的信号不止SIGSEGV这一个。通常程序中有对内存的Invalid reference就会产生SIGSEGV,具体描述见http://www.cnblogs.com/thammer/p/4737371.html 。
分析段错误的方法:
1.直接使用gdb
如果是容易重现的SIGSEGV直接gdb挂着运行,产生SIGSEGV时gdb会停止并打印提示,这时直接敲入命令bt查看程序此时的函数调用栈帧就可以定位到是哪个函数在什么样的调用情况下出现段错误。
2.使用core文件+gdb
在程序收到SIGSEGV时会产生coredump,core文件就是异常进程在发生异常的那一个时刻的进程内存上下文和cpu寄存器的信息。
首先,设置core文件大小 ulimit -c XXXX,XXXX就是允许产生的core文件大小,通常设置为unlimited,不限定大小
然后,运行程序直至产生core文件,名字一般是core.xxx,xxx为程序进程号,不同发行版本可能有不同的命名规则
然后,运行gdb,敲入命令 core-file corefile-name,再bt即可
3.注册SIGSEGV信号处理函数,在处理函数里面使用一些堆栈回溯的函数打印栈帧信息。
A.使用glibc带的函数backtrace backtrace_symbols backtrace_symbols_fd打印
void SigSegv_handler(int signo) { int j, nptrs; void *buffer[BT_BUF_SIZE]; char **strings; nptrs = backtrace(buffer, BT_BUF_SIZE); printf("backtrace() returned %d addresses\n", nptrs); /* The call backtrace_symbols_fd(buffer, nptrs, STDOUT_FILENO) would produce similar output to the following: */ strings = backtrace_symbols(buffer, nptrs); if (strings == NULL) { perror("backtrace_symbols"); exit(EXIT_FAILURE); } for (j = 0; j < nptrs; j++) printf("%s\n", strings[j]); free(strings);
exit(-1); }
backtrace_symbols 和backtrace_symbols_fd不同在于后者将打印输入到一个fd指定的文件里面。
它有一定的限制:
These functions make some assumptions about how a function‘s return address is stored on the stack. Note the following: * Omission of the frame pointers (as implied by any of gcc(1)‘s nonzero optimization levels) may cause these assumptions to be violated. * Inlined functions do not have stack frames. * Tail-call optimization causes one stack frame to replace another. The symbol names may be unavailable without the use of special linker options. For systems using the GNU linker, it is necessary to use the -rdynamic linker option. Note that names of "static" functions are not exposed, and won‘t be available in the backtrace.
对优化的程序可能失效
对inline函数失效
对static函数仅能打印函数地址
对tail-call优化的函数失效
编译时需要加入 -rdynamic
B.还有其他方法或接口做类似backtrace的事情,以后补充
如何捕捉并分析SIGSEGV的现场