
Troubleshooting the "Detected Tx Unit Hang" Problem

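Goal:

Make skb->data point to memory I have already allocated myself, rather than to the data buffer the kernel would normally allocate for the skb (as alloc_skb() does).

Part of the code: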

    /*
     * build a new sk_buff
     */
    //struct sk_buff *send_skb = kmem_cache_alloc_node(skbuff_head_cache, GFP_ATOMIC & ~__GFP_DMA, NUMA_NO_NODE);
    struct sk_buff *send_skb = kmem_cache_alloc(skbuff_head_cache, GFP_ATOMIC & ~__GFP_DMA);

    if (!send_skb) {
        //spin_unlock(&lock);
        return NF_DROP;
    }

    //printk("what2\n");
    /* zero the header fields that precede 'tail', as the kernel's own skb setup does */
    memset(send_skb, 0, offsetof(struct sk_buff, tail));
    atomic_set(&send_skb->users, 2);
    send_skb->cloned = 0;

    /* point the skb at the pre-allocated buffer instead of a kernel-allocated one */
    send_skb->head = mmap_buf + 1024;
    send_skb->data = mmap_buf + 1024;

In the last statement above (the assignment to send_skb->data), mmap_buf is memory that was allocated in advance.

With this code in place, the NIC driver writes the following errors to /var/log/messages:

    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <13>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <0>, <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <0>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <15>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <1>, <1eb>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1eb>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <1>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <14>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <0>, <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <0>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <4>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <0>, <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ea>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <0>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <12>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <5>, <1ef>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ef>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <5>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
    Sep 28 15:36:17 10g-host2 kernel:  Tx Queue             <2>
    Sep 28 15:36:17 10g-host2 kernel:  TDH, TDT             <2>, <1ec>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_use          <1ec>
    Sep 28 15:36:17 10g-host2 kernel:  next_to_clean        <2>
    Sep 28 15:36:17 10g-host2 kernel: ixgbe 0000:03:00.0: eth2: Detected Tx Unit Hang
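
A note on reading these dumps: TDH and TDT are the transmit descriptor head (advanced by the hardware) and tail (advanced by the driver), and ixgbe prints them in hexadecimal. The small userspace sketch below computes how many descriptors are waiting between head and tail. The ring size is an assumption, since the log does not show it, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Number of descriptors the driver has queued but the hardware has not
 * yet consumed, on a circular ring. ring_size is an assumed parameter;
 * the log above does not report it.
 */
unsigned int pending_descriptors(unsigned int head, unsigned int tail,
                                 unsigned int ring_size)
{
    assert(ring_size > 0 && head < ring_size && tail < ring_size);
    return (tail + ring_size - head) % ring_size;
}

/* Print the backlog for one (TDH, TDT) pair copied from the log. */
void report_queue(int queue, unsigned int tdh, unsigned int tdt,
                  unsigned int ring_size)
{
    printf("queue %d: TDH=0x%x TDT=0x%x pending=%u\n",
           queue, tdh, tdt, pending_descriptors(tdh, tdt, ring_size));
}
```

Assuming a 512-entry ring, queue 13 (TDH 0x0, TDT 0x1ea) has 490 descriptors pending, and so does queue 12 (TDH 0x5, TDT 0x1ef). In every queue the head sits at or near zero while the tail is hundreds of entries ahead, which means the hardware has stopped consuming the descriptors the driver keeps posting.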

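After ruling out various other causes, I traced the problem to the way mmap_buf was allocated. Allocating it with vmalloc() is incorrect; after switching to kmalloc(), transmission works normally.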

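Section 12.5 of Linux Kernel Development (《Linux内核设计与实现》) explains the difference. The cause is most likely this: the NIC needs the buffer it transmits from to be physically contiguous, whereas vmalloc() only guarantees that the virtual addresses are contiguous, and the underlying pages may be scattered across physical memory. Memory returned by kmalloc() is contiguous both virtually and physically.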
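
The distinction can be illustrated with a toy userspace model. This is not kernel code: a buffer is represented simply by the list of physical page frame numbers that back its consecutive virtual pages, and all frame numbers below are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model: pfn[i] is the physical page frame backing the i-th virtual
 * page of a buffer. Returns true only if every frame immediately follows
 * the previous one, i.e. the buffer is physically contiguous.
 */
bool is_physically_contiguous(const unsigned long *pfn, size_t npages)
{
    assert(pfn != NULL || npages == 0);
    for (size_t i = 1; i < npages; i++) {
        if (pfn[i] != pfn[i - 1] + 1)
            return false;
    }
    return true;
}

/* Hypothetical frames for a 4-page buffer obtained kmalloc()-style. */
const unsigned long kmalloc_like[4] = { 0x1000, 0x1001, 0x1002, 0x1003 };

/* Hypothetical frames for a 4-page buffer obtained vmalloc()-style. */
const unsigned long vmalloc_like[4] = { 0x1000, 0x2a7f, 0x0153, 0x3e10 };
```

Both buffers look identical through their virtual addresses, but only the first passes is_physically_contiguous(). A device that is handed a single start address and a length will read the wrong physical pages for the second. In real kernel code, is_vmalloc_addr() from <linux/mm.h> can be used to check whether a pointer lies in the vmalloc area before it is handed to a driver.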

 
