
C++ core dump debugging

Enable core dumps for the current shell session: ulimit -c unlimited
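The same limit can also be raised from inside the process. The following is a minimal sketch, assuming a Linux/POSIX system; `raise_core_limit` is a hypothetical helper name, not part of any library. It raises the soft core-file limit up to the hard limit, which is as far as an unprivileged process can go.

```cpp
#include <cassert>
#include <sys/resource.h>

// Hypothetical helper: raise the soft RLIMIT_CORE to the hard limit.
// Returns true on success. If the hard limit is 0, core dumps stay
// disabled even though the call succeeds.
bool raise_core_limit() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;  // soft limit may not exceed the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling this early in `main` means the program does not depend on the shell that launched it having run `ulimit -c unlimited`.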

1. Generate a core dump

#include <stdio.h>

int main() {
    int* p = NULL;
    *p = 0;              // write through a null pointer: raises SIGSEGV
    printf("%d\n", *p);  // never reached
    return 0;
}

The code above crashes and, with core dumps enabled, produces a core dump. The following methods can be used to debug it.

2. Debugging with dmesg + addr2line

➜  [/data/home/noahyzhang/learn/core] dmesg | grep a.out
[10346599.866661] a.out[25068]: segfault at 0 ip 000000000040055b sp 00007fff0228e090 error 6 in a.out[400000+1000]
➜ [/data/home/noahyzhang/learn/core] addr2line -e a.out -f 000000000040055b
main
/data/home/noahyzhang/learn/core/core.c:6
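  • dmesg: examines and controls the kernel ring buffer. When a process segfaults, the kernel logs the faulting instruction address (ip) there.
  • addr2line: translates an instruction address in an executable image into a file name, line number, and function name. Use -e to specify the executable image and -f to also print the function name.

First use dmesg to find the faulting address, then use addr2line -e to resolve that address to a source line. addr2line can only report file and line if the executable was built with debug information (-g).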
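To make the fields of the kernel log line concrete, here is a small sketch that pulls them apart. The struct and function names are hypothetical, and the error-code decoding assumes the x86 page-fault bit layout (bit 0: page present, bit 1: write access, bit 2: user mode).

```cpp
#include <cassert>
#include <cstdint>
#include <regex>
#include <string>

// Hypothetical record for one "segfault at ..." kernel log line.
struct SegfaultRecord {
    std::uint64_t fault_addr;  // address the program tried to access
    std::uint64_t ip;          // instruction pointer, the input for addr2line
    std::uint64_t error_code;  // page-fault error code (printed in hex)
    std::uint64_t map_base;    // start of the mapping that contains ip
    std::string image;         // name of the mapped file
};

// Parses a line such as the dmesg output shown above; returns false if
// the line does not look like a segfault record.
bool parse_segfault_line(const std::string& line, SegfaultRecord& out) {
    static const std::regex pattern(
        R"(segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp [0-9a-f]+ )"
        R"(error ([0-9a-f]+) in ([^\[]+)\[([0-9a-f]+)\+[0-9a-f]+\])");
    std::smatch m;
    if (!std::regex_search(line, m, pattern)) {
        return false;
    }
    assert(m.size() == 6);  // whole match plus five capture groups
    out.fault_addr = std::stoull(m[1].str(), nullptr, 16);
    out.ip = std::stoull(m[2].str(), nullptr, 16);
    out.error_code = std::stoull(m[3].str(), nullptr, 16);
    out.image = m[4].str();
    out.map_base = std::stoull(m[5].str(), nullptr, 16);
    return true;
}

// x86 page-fault error code bits (assumption: x86/x86_64 kernel).
bool page_was_present(const SegfaultRecord& r) { return (r.error_code & 0x1) != 0; }
bool is_write_access(const SegfaultRecord& r)  { return (r.error_code & 0x2) != 0; }
bool is_user_mode(const SegfaultRecord& r)     { return (r.error_code & 0x4) != 0; }

// Offset of the faulting instruction inside its mapping.
std::uint64_t offset_in_mapping(const SegfaultRecord& r) { return r.ip - r.map_base; }
```

For the log line in this section, the record reads as a user-mode write to address 0 on a page that was not present, which matches the null-pointer store in the sample program.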

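3. Debugging with gdb

Load the core file with gdb a.out core.25068. Do not rerun the program with r: in real projects many crashes are intermittent and may not reproduce. bt shows the line where the program crashed. If gdb is started with only the core file, as in the session below, symbols are missing until the executable is loaded with file ./a.out.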

➜  [/data/home/noahyzhang/learn/core] gdb core.25068 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-119.tl2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
[New LWP 25068]
Missing separate debuginfo for the main executable file
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/9e/58829cf2411a83aa62369519de72c6dbe9b0e8
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000040055b in ?? ()
"/data/home/noahyzhang/learn/core/core.25068" is a core file.
Please specify an executable to debug.
(gdb) bt
#0 0x000000000040055b in ?? ()
#1 0x00007fff0228e180 in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb) file ./a.out
Reading symbols from /data/home/noahyzhang/learn/core/a.out...done.
(gdb)
(gdb)
(gdb) bt
#0 0x000000000040055b in main () at core.c:6
(gdb)

Other gdb commands

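bt: show the call stack (backtrace)
i locals: show the local variables of the current stack frame
i args: show the arguments of the current stack frame
i catch: show the exception handlers of the current stack frame
p a: print the value of variable a
i registers: show the current register values
r: run the program from the start until the first breakpoint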
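The single-frame sample program gives bt, i args, and i locals little to show. A sketch with two nested calls makes them more useful to practice on; the function names `fill` and `process` are hypothetical, and the code should be built with -g -O0 so the frames are not inlined.

```cpp
#include <cassert>
#include <cstddef>

// Writes `value` into the first `count` slots of `dst`.
// There is deliberately no null check: a null `dst` crashes on the first store.
int fill(int* dst, std::size_t count, int value) {
    int written = 0;  // a local to look at with `i locals`
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = value;
        ++written;
    }
    return written;
}

// Outer frame: its arguments are what `i args` reports after selecting
// this frame in gdb.
int process(int* buffer, std::size_t count) {
    assert(count > 0);  // precondition of this sketch
    int marker = 42;
    return fill(buffer, count, marker);
}
```

Calling `process(nullptr, 1)` from `main` should crash inside `fill`, so the core file contains the frames `fill`, `process`, and `main` to inspect.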

4. Debugging with strace + addr2line

strace traces the system calls a process makes and the signals it receives. Commonly used options:

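-c: report the time, number of calls, and number of errors for each system call
-p: attach to the process with the given pid
-i: print the instruction pointer at the time of each system call

Tracing the sample program with -i: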
➜  [/data/home/noahyzhang/learn/core] strace -i ./a.out 
[00007f4c4663fcc7] execve("./a.out", ["./a.out"], [/* 34 vars */]) = 0
[00007f87ae8cfaac] brk(0) = 0x13f5000
[00007f87ae8d07da] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f87aead7000
...
[00007f87add3a27c] brk(0) = 0x13f5000
[00007f87add3a27c] brk(0x1427000) = 0x1427000
[000000000040055b] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
[????????????????] +++ killed by SIGSEGV (core dumped) +++
[1] 32151 segmentation fault strace -i ./a.out

➜ [/data/home/noahyzhang/learn/core] addr2line -e a.out -f 000000000040055b
main
/data/home/noahyzhang/learn/core/core.c:6

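strace -i prints the instruction pointer for each system call and signal. The address on the SIGSEGV line is where the program crashed; pass it to addr2line to get the file, line, and function name.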
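The si_code field on the SIGSEGV line says why the access failed. A small sketch of how the two common values can be told apart, with descriptions taken from the sigaction(2) manual page; `describe_segv_code` is a hypothetical helper name.

```cpp
#include <cassert>
#include <signal.h>
#include <string>

// Maps the si_code of a SIGSEGV to a short description.
std::string describe_segv_code(int si_code) {
    switch (si_code) {
        case SEGV_MAPERR:
            return "SEGV_MAPERR: address not mapped to object";
        case SEGV_ACCERR:
            return "SEGV_ACCERR: invalid permissions for mapped object";
        default:
            return "other si_code";
    }
}
```

In the trace above, SEGV_MAPERR with si_addr=0 means nothing is mapped at address 0, which is what a null-pointer dereference looks like.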

5. Crash monitoring

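#include <stdio.h>
#include <signal.h>
#include <boost/stacktrace.hpp>

// Called on SIGSEGV or SIGABRT.
void signal_handler(int sig_num) {
    // Restore the default action so this signal is not handled here again.
    signal(sig_num, SIG_DFL);
    // Write the raw stack frames to a file; Boost documents safe_dump_to
    // as async-signal-safe.
    boost::stacktrace::safe_dump_to("./backtrace.dump");
    // Terminate the process via SIGABRT.
    raise(SIGABRT);
}

// Register the handler for both crash signals.
void sign_register() {
    signal(SIGSEGV, &signal_handler);
    signal(SIGABRT, &signal_handler);
}

int main() {
    sign_register();
    int* p = NULL;
    *p = 0;              // write through a null pointer: raises SIGSEGV
    printf("%d\n", *p);  // never reached
    return 0;
}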

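Registering handlers for SIGSEGV (segmentation fault) and SIGABRT (abort) lets a C++ program monitor its own crashes. The handler can collect the crash stack and other information, and Boost.Stacktrace makes it easy to record the backtrace.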
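Where Boost is not available, a similar handler can be sketched with glibc's backtrace facilities. This is a swapped-in technique, not the Boost approach above, and it assumes Linux with glibc (`<execinfo.h>`); the function names are hypothetical. Function names only appear in the output if the program is linked with -rdynamic; otherwise the raw addresses can be fed to addr2line.

```cpp
#include <cassert>
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

// Writes the current call stack to `fd` and returns the number of frames.
// backtrace_symbols_fd writes directly to the descriptor without malloc.
int dump_backtrace(int fd) {
    void* frames[64];
    int depth = backtrace(frames, 64);
    backtrace_symbols_fd(frames, depth, fd);
    return depth;
}

extern "C" void crash_handler(int sig) {
    static const char msg[] = "fatal signal, backtrace follows:\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    dump_backtrace(STDERR_FILENO);
    // SA_RESETHAND has already restored the default action, so re-raising
    // terminates the process and still produces a core dump.
    raise(sig);
}

// Installs crash_handler for SIGSEGV and SIGABRT; returns true on success.
bool install_crash_handler() {
    // Call backtrace once up front so its support library is loaded before
    // any signal arrives (the first call may allocate memory).
    void* warmup[1];
    backtrace(warmup, 1);

    struct sigaction sa = {};
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND;  // one-shot: default action after first delivery
    return sigaction(SIGSEGV, &sa, nullptr) == 0 &&
           sigaction(SIGABRT, &sa, nullptr) == 0;
}
```

Using sigaction with SA_RESETHAND replaces the manual signal(sig_num, SIG_DFL) call, and re-raising the original signal keeps the exit status and core dump tied to the real cause rather than to SIGABRT.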