Debugging the Linux kernel TCP/IP stack with qemu and gdb

description
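Is debugging the Linux kernel with gdb easy? Getting to that point takes real effort: no single step is hard, but there is a lot you need to know. There are two ways to debug the Linux kernel with gdb, UML (User-Mode Linux) and qemu. This post covers qemu. Building and installing qemu from source is laborious.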

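Preparing the environment
linux OS: Debian7.5-i386 (the latest Wheezy at the time, installed on VMware 10. I used the network installer and run it in text mode afterwards, because my laptop has limited resources.)
root fs: Debian-Wheezy-x86-root_fs.bz2 (downloaded earlier, apparently Debian 7.0, which is fine because it can be upgraded. Download from http://fs.devloop.org.uk/)
linux kernel source: linux-3.2.59.tar.xz (chosen because it is close to the kernel version shipped with Debian7.5-i386.)
qemu: version 1.1.2, installed with apt-get install. Building from source is a hassle.
/etc/qemu-ifup: assigns an IP address to the tap interface (qemu creates this interface and runs the script when it is started with the -net tap option.)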
      #!/bin/sh
      /sbin/ifconfig $1 10.0.2.11

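Build and debug tools are also required, and network access is a must: whatever is missing can be installed with apt-get install xxx, which is very convenient.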

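Building the kernel
I will skip downloading and unpacking and go straight to the build:
make defconfig
make menuconfig  --> set some debug options here, for example CONFIG_DEBUG_INFO so that vmlinux carries symbols for gdb (not covered in detail in this post)
make -j 8 bzImage  --> start the build and wait, it takes a long time...
make modules_install INSTALL_MOD_PATH=../fs  --> install the kernel modules into the root fs; optional (the modules have to be built with make modules first).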

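The kernel build is now complete, producing linux-3.2.59/arch/x86/boot/bzImage (booted by qemu) and linux-3.2.59/vmlinux (loaded by gdb for symbols).
On disabling optimization: much of the kernel source does not compile without optimization, so -O0 cannot be applied globally. For a file of particular interest, optimization can be turned off per object file as follows:
init/main.c  --> add to init/Makefile: CFLAGS_main.o = -O0
net/socket.c  --> add to net/Makefile: CFLAGS_socket.o = -O0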

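Upgrading the root fs and installing packages
mount -o loop ./Debian-Wheezy-x86-root_fs fs/  --> mount the root fs image on the fs directory, so the root fs can be modified directly through fs.
chroot ./fs  --> switch into the root fs
mount -t proc /proc /proc  --> manually mount proc inside the new root fs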

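Inside the new root fs the network is not usable yet because the nameserver is wrong. Replace the contents of /etc/resolv.conf with those from the original (host) root fs and the network works.
If the root fs has a root password that you do not know, remove it with passwd -d or set a new one.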

apt-get upgrade --> not required; I ran it only to avoid unnecessary errors when installing packages.
apt-get install gcc g++ make gdb openssh-server -y  --> just some packages that might come in handy.

All in all, chroot is a powerful tool: it can be used to build up a new Linux release.

Running qemu and debugging the kernel with gdb
Starting it is simple. Here qemu is launched with the CPU stopped at startup (-S) and its gdbserver listening on tcp::1234 (-s), cmd:
root@debian:~# qemu-system-i386 -kernel ./linux-3.2.59/arch/x86/boot/bzImage -append "console=ttyS0 rdinit=/bin/sh root=/dev/sda rw mem=256M" --boot c -nographic -hda ./Debian-Wheezy-x86-root_fs -m 256 -k en-us -S -s -net nic -net tap
QEMU 1.1.2 monitor - type 'help' for more information
(qemu) QEMU 1.1.2 monitor - type 'help' for more information
(qemu) 
No X Window is needed here: with the -nographic option, everything can be done from an ordinary remote terminal.
The next step is to start gdb and run target remote tcp::1234, cmd:
root@debian:~# gdb ./linux-3.2.59/vmlinux
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/linux-3.2.59/vmlinux...done.
(gdb) target remote tcp::1234
Remote debugging using tcp::1234
0x0000fff0 in ?? ()
(gdb) b start_kernel
Breakpoint 1 at 0xc1881629: file init/main.c, line 469.
(gdb) c
Continuing.
Breakpoint 1, start_kernel () at init/main.c:469
469     {
(gdb) n
473             smp_setup_processor_id();
(gdb) l
468     asmlinkage void __init start_kernel(void)
469     {
470             char * command_line;
471             extern const struct kernel_param __start___param[], __stop___param[];
472
473             smp_setup_processor_id();
474
475             /*
476              * Need to run as early as possible, to initialize the
477              * lockdep hash:
(gdb) 
(gdb) c
Continuing.
As shown above, gdb really can debug the Linux kernel. Let's skip past the boot process and look at qemu's boot log. Only the tail is pasted here:
[....] Cleaning up temporary files.... ok
INIT: Entering runlevel: 2
[info] Using makefile-style concurrent boot in runlevel 2.
[....] Starting enhanced syslogd: rsyslogd. ok
[....] Starting periodic command scheduler: cron. ok
[....] Starting OpenBSD Secure Shell server: sshd. ok

Debian GNU/Linux 7 changeme ttyS0

changeme login: root
Password:
Last login: Mon May 26 15:44:04 UTC 2014 on ttyS0
Linux changeme 3.2.59 #2 SMP Mon May 26 09:40:43 EDT 2014 i686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
root@changeme:~# 
root@changeme:~# uname -a
Linux changeme 3.2.59 #2 SMP Mon May 26 09:40:43 EDT 2014 i686 GNU/Linux
See, the system is up, and the kernel version is the one we built earlier. gcc was installed earlier too, so let's try it, cmd:
root@changeme:~# cat hello.c
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
        printf("hello world!\n");
        return 0;
}
root@changeme:~# gcc hello.c
root@changeme:~# ls -l a.out
-rwxr-xr-x 1 root root 4980 May 27 08:23 a.out
root@changeme:~# ./a.out
hello world!
root@changeme:~# 
gcc works. Now let's look at the network interfaces, cmd:
root@changeme:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:56 
          inet6 addr: fe80::5054:ff:fe12:3456/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:3330 (3.2 KiB)

lo        Link encap:Local Loopback 
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:112 (112.0 B)  TX bytes:112 (112.0 B)
No IP address was obtained via DHCP, so set one manually, cmd:
root@changeme:~# ifconfig eth0 10.0.2.15 netmask 255.255.255.0
root@changeme:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 52:54:00:12:34:56 
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          inet6 addr: fe80::5054:ff:fe12:3456/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:3672 (3.5 KiB)

lo        Link encap:Local Loopback 
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:1 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:112 (112.0 B)  TX bytes:112 (112.0 B)
After the manual setup the interface does have an IP. Can the host ping it? First, the interfaces on the host:
root@debian:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0c:29:65:c4:5c 
          inet addr:192.168.91.136  Bcast:192.168.91.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe65:c45c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3586 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5985 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:247944 (242.1 KiB)  TX bytes:696252 (679.9 KiB)
          Interrupt:19 Base address:0x2000

lo        Link encap:Local Loopback 
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:340 errors:0 dropped:0 overruns:0 frame:0
          TX packets:340 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:22302 (21.7 KiB)  TX bytes:22302 (21.7 KiB)

tap0      Link encap:Ethernet  HWaddr 26:2c:07:73:43:21 
          inet addr:10.0.2.11  Bcast:10.255.255.255  Mask:255.0.0.0
          inet6 addr: fe80::242c:7ff:fe73:4321/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:27 errors:0 dropped:0 overruns:0 frame:0
          TX packets:34 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:500
          RX bytes:5850 (5.7 KiB)  TX bytes:7246 (7.0 KiB)
The host has gained a tap0 interface whose IP, 10.0.2.11, is on the same subnet as 10.0.2.15, the IP of the Linux guest running in qemu. Let's ping the guest from the host, cmd:
root@debian:~# ping 10.0.2.15
PING 10.0.2.15 (10.0.2.15) 56(84) bytes of data.
64 bytes from 10.0.2.15: icmp_req=1 ttl=64 time=0.652 ms
64 bytes from 10.0.2.15: icmp_req=2 ttl=64 time=1.98 ms
64 bytes from 10.0.2.15: icmp_req=3 ttl=64 time=1.04 ms
64 bytes from 10.0.2.15: icmp_req=4 ttl=64 time=0.993 ms
^C
--- 10.0.2.15 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3006ms
rtt min/avg/max/mdev = 0.652/1.168/1.982/0.494 ms
root@debian:~# 
Perfect, ping works. Can we ssh in? First check whether sshd is running on the Linux guest in qemu (it was installed earlier), cmd:
root@changeme:~# netstat -apn          
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      2591/sshd      
tcp6       0      0 :::22                   :::*                    LISTEN      2591/sshd      
udp        0      0 0.0.0.0:13820           0.0.0.0:*                           2397/dhclient  
udp        0      0 0.0.0.0:68              0.0.0.0:*                           2397/dhclient  
udp6       0      0 :::30821                :::*                                2397/dhclient  
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node   PID/Program name    Path
unix  2      [ ACC ]     SEQPACKET  LISTENING     1680     973/udevd           /run/udev/control
unix  4      [ ]         DGRAM                    3497     2526/rsyslogd       /dev/log
unix  2      [ ]         DGRAM                    3610     2397/dhclient      
unix  2      [ ]         DGRAM                    3575     2617/login         
unix  3      [ ]         DGRAM                    1689     973/udevd          
unix  3      [ ]         DGRAM                    1688     973/udevd          
root@changeme:~# 
Perfect, sshd is up. Let's try to ssh in, cmd:
root@debian:~# ssh test@10.0.2.15
test@10.0.2.15's password:
Linux changeme 3.2.59 #2 SMP Mon May 26 09:40:43 EDT 2014 i686

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Tue May 27 08:17:31 2014 from 10.0.2.11
Could not chdir to home directory /home/sam: No such file or directory
$ su root
Password:
root@changeme:/# uname -a
Linux changeme 3.2.59 #2 SMP Mon May 26 09:40:43 EDT 2014 i686 GNU/Linux
root@changeme:/# 
Perfect: logged in as the test user and switched to root with su.

Debugging the kernel TCP/IP stack with gdb
With the network environment in place, we now start a server on the qemu Linux guest and a client on the host that connects to it. I picked two examples from the web:
root@changeme:~# cat server1.c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>

int main(int argc, char *argv[])
{
    int listenfd = 0, connfd = 0;
    struct sockaddr_in serv_addr;

    char sendBuff[1025];
    time_t ticks;

    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    /* Zero the buffers; the original filled them with the character '0' (0x30). */
    memset(&serv_addr, 0, sizeof(serv_addr));
    memset(sendBuff, 0, sizeof(sendBuff));

    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    serv_addr.sin_port = htons(5000);

    bind(listenfd, (struct sockaddr*)&serv_addr, sizeof(serv_addr));

    listen(listenfd, 10);

    while(1)
    {
        connfd = accept(listenfd, (struct sockaddr*)NULL, NULL);

        ticks = time(NULL);
        snprintf(sendBuff, sizeof(sendBuff), "%.24s\r\n", ctime(&ticks));
        write(connfd, sendBuff, strlen(sendBuff));

        close(connfd);
        sleep(1);
     }
}
root@changeme:~# gcc -o server1 server1.c
root@changeme:~# ls -l server1*
-rwxr-xr-x 1 root root 6562 May 27 08:27 server1
-rw-r--r-- 1 root root 1022 May 26 14:40 server1.c
root@changeme:~# ./server1 
root@changeme:~# 

root@debian:~# cat client1.c
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <arpa/inet.h>

int main(int argc, char *argv[])
{
    int sockfd = 0, n = 0;
    char recvBuff[1024];
    struct sockaddr_in serv_addr;

    if(argc != 2)
    {
        printf("\n Usage: %s <ip of server> \n",argv[0]);
        return 1;
    }

    memset(recvBuff, 0, sizeof(recvBuff));
    if((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    {
        printf("\n Error : Could not create socket \n");
        return 1;
    }

    memset(&serv_addr, 0, sizeof(serv_addr));

    serv_addr.sin_family = AF_INET;
    serv_addr.sin_port = htons(5000);

    if(inet_pton(AF_INET, argv[1], &serv_addr.sin_addr)<=0)
    {
        printf("\n inet_pton error occurred\n");
        return 1;
    }

    if( connect(sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0)
    {
       printf("\n Error : Connect Failed \n");
       return 1;
    }

    while ( (n = read(sockfd, recvBuff, sizeof(recvBuff)-1)) > 0)
    {
        recvBuff[n] = 0;
        if(fputs(recvBuff, stdout) == EOF)
        {
            printf("\n Error : Fputs error\n");
        }
    }

    if(n < 0)
    {
        printf("\n Read error \n");
    }

    return 0;
}
root@debian:~# gcc -o client1 client1.c
root@debian:~# ls -l client1*
-rwxr-xr-x 1 root root 6245 May 27 04:26 client1
-rw-r--r-- 1 root root 1351 May 27 04:26 client1.c
root@debian:~# ./client1 10.0.2.15
Tue May 27 08:28:33 2014
As shown above, it works. Now set a breakpoint on sys_bind in gdb and then start server1, cmd:
(gdb) c
Continuing.
^C
Program received signal SIGINT, Interrupt.
default_idle () at arch/x86/kernel/process.c:369
369                     current_thread_info()->status |= TS_POLLING;
(gdb) bt
#0  default_idle () at arch/x86/kernel/process.c:369
#1  0xc1001a3f in cpu_idle () at arch/x86/kernel/process_32.c:116
#2  0xc15ed776 in rest_init () at init/main.c:387
#3  0xc1881920 in start_kernel () at init/main.c:641
#4  0xc18810ac in i386_start_kernel () at arch/x86/kernel/head32.c:68
#5  0x00000000 in ?? ()
(gdb) b sys_bind
Breakpoint 2 at 0xc14a7735: file net/socket.c, line 1431.
(gdb) c
Continuing.
Breakpoint 2, sys_bind (fd=3, umyaddr=0xbf8b93c8, addrlen=16) at net/socket.c:1431
1431            sock = sockfd_lookup_light(fd, &err, &fput_needed);
(gdb) l
1426    {
1427            struct socket *sock;
1428            struct sockaddr_storage address;
1429            int err, fput_needed;
1430
1431            sock = sockfd_lookup_light(fd, &err, &fput_needed);
1432            if (sock) {
1433                    err = move_addr_to_kernel(umyaddr, addrlen, (struct sockaddr *)&address);
1434                    if (err >= 0) {
1435                            err = security_socket_bind(sock,
(gdb) 
.........................
(gdb) bt
#0  inet_bind (sock=0xcf577300, uaddr=0xcfb55ed0, addr_len=16) at net/ipv4/af_inet.c:465
#1  0xc14a77af in sys_bind (fd=3, umyaddr=0xbf8b93c8, addrlen=16) at net/socket.c:1439
#2  0xc14a8b67 in sys_socketcall (call=2, args=0xbf8b8fb0) at net/socket.c:2421
#3  <signal handler called>
#4  0xb7687d22 in ?? ()
#5  0xb75ca723 in ?? ()
(gdb) l
460             unsigned short snum;
461             int chk_addr_ret;
462             int err;
463
464             /* If the socket has its own bind function then use it. (RAW) */
465             if (sk->sk_prot->bind) {
466                     err = sk->sk_prot->bind(sk, uaddr, addr_len);
467                     goto out;
468             }
469             err = -EINVAL;


And with that, debugging the tcp/ip stack is up and running!