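[dpdk] Getting Familiar with the SDK and First Steps (3) (IP Fragmentation Source Analysis)
Getting to know the IP Fragmentation example: running it, and reading its source.
Question 1:
main() looks roughly like this. The three calls rte_eal_init(), rte_eal_mp_remote_launch() and rte_eal_wait_lcore() are the ones relevant to what follows: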
    int
    main(int argc, char **argv)
    {
        ... ...
        /* init EAL */
        ret = rte_eal_init(argc, argv);
        if (ret < 0)
            rte_exit(EXIT_FAILURE, "rte_eal_init failed");
        ... ...
        /* launch per-lcore init on every lcore */
        rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
        RTE_LCORE_FOREACH_SLAVE(lcore_id) {
            if (rte_eal_wait_lcore(lcore_id) < 0)
                return -1;
        }

        return 0;
    }
The function rte_eal_wait_lcore is implemented as follows:
    /*
     * Wait until a lcore finished its job.
     */
    int
    rte_eal_wait_lcore(unsigned slave_id)
    {
        if (lcore_config[slave_id].state == WAIT)
            return 0;

        while (lcore_config[slave_id].state != WAIT &&
               lcore_config[slave_id].state != FINISHED);

        rte_rmb();

        /* we are in finished state, go to wait state */
        lcore_config[slave_id].state = WAIT;
        return lcore_config[slave_id].ret;
    }
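Look at the while loop: it is plainly a busy loop that spins until the state changes! Read literally, once main() has finished the remote launch, the master thread sits in this function waiting for the slave lcores to finish.
Isn't waiting in a spin loop like that a problem? I wanted to debug it and see what was really going on. Getting there took me through Questions 2, 3 and 4 below, and in the end the debugging worked. The answer is as follows.
It is actually quite simple: reading the code of rte_eal_mp_remote_launch() is enough to understand it. Here it is: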
    66 /*
    67  * Check that every SLAVE lcores are in WAIT state, then call
    68  * rte_eal_remote_launch() for all of them. If call_master is true
    69  * (set to CALL_MASTER), also call the function on the master lcore.
    70  */
    71 int
    72 rte_eal_mp_remote_launch(int (*f)(void *), void *arg,
    73                          enum rte_rmt_call_master_t call_master)
    74 {
    75         int lcore_id;
    76         int master = rte_get_master_lcore();
    77
    78         /* check state of lcores */
    79         RTE_LCORE_FOREACH_SLAVE(lcore_id) {
    80                 if (lcore_config[lcore_id].state != WAIT)
    81                         return -EBUSY;
    82         }
    83
    84         /* send messages to cores */
    85         RTE_LCORE_FOREACH_SLAVE(lcore_id) {
    86                 rte_eal_remote_launch(f, arg, lcore_id);
    87         }
    88
    89         if (call_master == CALL_MASTER) {
    90                 lcore_config[master].ret = f(arg);
    91                 lcore_config[master].state = FINISHED;
    92         }
    93
    94         return 0;
    95 }
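Line 90 shows that the master lcore enters the business logic right there, so it has no chance to execute the spin loop above until the program is about to exit. In other words, by the time the master reaches the loop, the other lcores are about to finish too, so the CPU is not left spinning idly for long. But what if the business logic is wrong and a slave lcore does not exit as expected: would the master then get stuck in the loop? I will leave that question open for now.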
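The ordering described above can be tried out without DPDK. The following is only a sketch under stated assumptions: it uses plain pthreads and C11 atomics, all names (sim_lcore, sim_job, sim_wait_lcore, sim_launch_and_wait) are made up for illustration, and the slave threads are created at launch time for brevity. It mirrors the shape of the two functions quoted above: the master runs the job itself first, and only afterwards polls each slave's state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_SLAVES 8

enum sim_state { SIM_WAIT, SIM_RUNNING, SIM_FINISHED };

/* Hypothetical stand-in for one lcore_config[] entry. */
struct sim_lcore {
    _Atomic int state;   /* written by the slave, polled by the master */
    int id;
    int ret;
    pthread_t tid;
};

/* Stand-in for main_loop(); a real job would poll NIC queues until told to stop. */
int sim_job(void *arg)
{
    const struct sim_lcore *lc = arg;
    return lc->id * 10;
}

void *sim_slave_entry(void *p)
{
    struct sim_lcore *lc = p;

    lc->ret = sim_job(lc);
    /* release: make ret visible before the state change is observed */
    atomic_store_explicit(&lc->state, SIM_FINISHED, memory_order_release);
    return NULL;
}

/* Same shape as rte_eal_wait_lcore(): spin until the slave leaves RUNNING. */
int sim_wait_lcore(struct sim_lcore *lc)
{
    if (atomic_load_explicit(&lc->state, memory_order_acquire) == SIM_WAIT)
        return 0;
    while (atomic_load_explicit(&lc->state, memory_order_acquire) == SIM_RUNNING)
        ;   /* busy-wait: burns CPU for as long as the slave keeps running */
    atomic_store_explicit(&lc->state, SIM_WAIT, memory_order_relaxed);
    return lc->ret;
}

/* Launch n slaves, run the job on the master as well, then collect all results. */
int sim_launch_and_wait(int n_slaves)
{
    struct sim_lcore master = { .id = 0 };
    struct sim_lcore slaves[MAX_SLAVES];
    int started = 0, total, i;

    assert(n_slaves >= 0 && n_slaves <= MAX_SLAVES);
    for (i = 0; i < n_slaves; i++) {
        slaves[i].id = i + 1;
        slaves[i].ret = 0;
        atomic_init(&slaves[i].state, SIM_RUNNING);
        if (pthread_create(&slaves[i].tid, NULL, sim_slave_entry, &slaves[i]) != 0)
            break;
        started++;
    }

    /* CALL_MASTER case: the master does its own share before any waiting */
    total = sim_job(&master);

    for (i = 0; i < started; i++) {
        total += sim_wait_lcore(&slaves[i]);
        pthread_join(slaves[i].tid, NULL);
    }
    return started == n_slaves ? total : -1;
}
```

Calling sim_launch_and_wait(3) returns 60 (0 from the master plus 10, 20 and 30 from the slaves). As for the open question, the loop as written suggests the answer: if sim_job never returned on a slave, sim_wait_lcore would spin on that entry indefinitely, and the quoted while loop has the same form.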
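One more thing worth recording: everything that is needed (the per-lcore worker threads) has in fact already been created inside rte_eal_init(). remote_launch() merely passes the other lcores a message telling them to start running.
I have not yet analyzed the content of that message in depth.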
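To make the "create early, start later" idea concrete, here is a minimal sketch. It is an assumption-laden illustration, not the EAL code: the wake-up channel is assumed to be a pipe, the message layout (launch_msg) is invented, and every function name is hypothetical. The thread is created in an init step and sleeps in read() until a launch message arrives.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical launch message; the real EAL message is not analyzed here. */
struct launch_msg {
    int (*f)(void *);
    void *arg;
};

struct sim_worker {
    int fd[2];       /* fd[0]: worker reads, fd[1]: launcher writes */
    int ret;         /* job result, read by the launcher after join */
    pthread_t tid;
};

void *sim_worker_loop(void *p)
{
    struct sim_worker *w = p;
    struct launch_msg msg;

    /* The thread exists since init, but is parked here until a message arrives. */
    if (read(w->fd[0], &msg, sizeof(msg)) == (ssize_t)sizeof(msg))
        w->ret = msg.f(msg.arg);
    return NULL;
}

/* "init" phase: create the pipe and the thread; no job is known yet. */
int sim_worker_init(struct sim_worker *w)
{
    w->ret = -1;
    if (pipe(w->fd) != 0)
        return -1;
    if (pthread_create(&w->tid, NULL, sim_worker_loop, w) != 0) {
        close(w->fd[0]);
        close(w->fd[1]);
        return -1;
    }
    return 0;
}

/* "remote launch" phase: only a message is sent, no thread is created. */
int sim_remote_launch(struct sim_worker *w, int (*f)(void *), void *arg)
{
    struct launch_msg msg = { f, arg };

    assert(f != NULL);
    return write(w->fd[1], &msg, sizeof(msg)) == (ssize_t)sizeof(msg) ? 0 : -1;
}

/* Closing the write end first lets a never-launched worker see EOF and exit. */
int sim_worker_join(struct sim_worker *w)
{
    close(w->fd[1]);
    pthread_join(w->tid, NULL);
    close(w->fd[0]);
    return w->ret;
}

/* Example job: square the integer that arg points to. */
int sim_square(void *arg)
{
    int v = *(int *)arg;
    return v * v;
}

int sim_launch_demo(int v)
{
    struct sim_worker w;

    if (sim_worker_init(&w) != 0)
        return -1;
    (void)sim_remote_launch(&w, sim_square, &v);   /* on failure ret stays -1 */
    return sim_worker_join(&w);
}
```

sim_launch_demo(7) returns 49. Sending the function pointer through the pipe, rather than through shared memory, keeps the sketch free of data races without extra locking; whether the EAL does the same is not something this post establishes.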
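Question 2:
The example would not run, so the plan was to enable DEBUG and trace it with gdb.
The makefiles are rather hard to work with. After some poking around, these looked like the most useful commands: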
[root@dpdk dpdk]# make help
[root@dpdk dpdk]# make V=yes D=yes
Neither of these helped, though. Only after hand-editing each module's Makefile to replace -O3 with -g and rebuilding did debugging work.
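Question 3:
gdb showed that the failure to start was related to NIC features.
a. The default parameters in the initialization code enable offload features such as hardware checksum. The NIC I emulate does not support them, so they had to be turned off.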
    static const struct rte_eth_conf port_conf = {
        .rxmode = {
            .max_rx_pkt_len = JUMBO_FRAME_MAX_SIZE,
            .split_hdr_size = 0,
            .header_split   = 0, /**< Header Split disabled */
            .hw_ip_checksum = 0, /**< IP checksum offload disabled */
            .hw_vlan_filter = 0, /**< VLAN filtering disabled */
            .jumbo_frame    = 1, /**< Jumbo Frame Support enabled */
            .hw_strip_crc   = 0, /**< CRC stripping by hardware disabled */
        },
        .txmode = {
            .mq_mode = ETH_MQ_TX_NONE,
        },
    };
b. The other change: the TX queue flags now include ETH_TXQ_FLAGS_NOXSUMS.
    /* init one TX queue per couple (lcore,port) */
    queueid = 0;
    for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
        if (rte_lcore_is_enabled(lcore_id) == 0)
            continue;

        socket = (int) rte_lcore_to_socket_id(lcore_id);
        printf("txq=%u,%d ", lcore_id, queueid);
        fflush(stdout);

        rte_eth_dev_info_get(portid, &dev_info);
        txconf = &dev_info.default_txconf;
        txconf->txq_flags = 0 | ETH_TXQ_FLAGS_NOXSUMS;
        ret = rte_eth_tx_queue_setup(portid, queueid, nb_txd,
                                     socket, txconf);
        if (ret < 0) {
            printf("\n");
            rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup: "
                     "err=%d, port=%d\n", ret, portid);
        }

        qconf = &lcore_queue_conf[lcore_id];
        qconf->tx_queue_id[portid] = queueid;
        queueid++;
    }
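c. The NIC I had been emulating did not support multiple queues. After some study I got qemu/kvm to support multi-queue, and wrote that up separately:
[Virtualization][qemu][kvm][virtio] Emulating a multi-queue NIC with QEMU/KVM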
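With checksum offload switched off as in (a) and (b), any IPv4 header checksum that has to be valid must be computed in software. Inside a DPDK application rte_ipv4_cksum() from rte_ip.h serves that purpose; the standalone sketch below only illustrates the underlying one's-complement sum (RFC 1071) on a raw byte buffer. The function name ipv4_hdr_cksum is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement checksum over an IPv4 header given as raw bytes in
 * network order. The checksum field (bytes 10 and 11) must be zero when
 * computing; over a header that already carries a correct checksum the
 * result is 0.
 */
uint16_t ipv4_hdr_cksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* an IPv4 header is 20 to 60 bytes long, in multiples of 4 */
    assert(len >= 20 && len <= 60 && len % 4 == 0);

    /* add the header up as big-endian 16-bit words */
    for (i = 0; i < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];

    /* fold the carries back into the low 16 bits */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

For the 20-byte header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01 c0 a8 00 c7 the function returns 0xb861; writing b8 61 into bytes 10 and 11 and running it again returns 0, which is the usual receive-side check.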
Startup succeeded:
    [root@dpdk build]# ./ip_fragmentation -l 6,7 -- -p 3
    EAL: Detected 8 lcore(s)
    EAL: Probing VFIO support...
    EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !
    PMD: bnxt_rte_pmd_init() called for (null)
    EAL: PCI device 0000:00:03.0 on NUMA socket -1
    EAL: probe driver: 1af4:1000 rte_virtio_pmd
    EAL: PCI device 0000:00:04.0 on NUMA socket -1
    EAL: probe driver: 1af4:1000 rte_virtio_pmd
    EAL: PCI device 0000:00:05.0 on NUMA socket -1
    EAL: probe driver: 1af4:1000 rte_virtio_pmd
    IP_FRAG: Creating direct mempool on socket 1
    IP_FRAG: Creating indirect mempool on socket 1
    IP_FRAG: Creating LPM table on socket 1
    IP_FRAG: Creating LPM6 table on socket 1
    Initializing port 0 on lcore 6... Address:00:00:00:01:00:01 txq=6,0 txq=7,1
    Initializing port 1 on lcore 7... Address:00:00:00:01:00:02 txq=6,0 txq=7,1
    IP_FRAG: Socket 1: adding route 100.10.0.0/16 (port 0)
    IP_FRAG: Socket 1: adding route 100.20.0.0/16 (port 1)
    IP_FRAG: Socket 1: adding route 100.30.0.0/16 (port 2)
    IP_FRAG: Socket 1: adding route 100.40.0.0/16 (port 3)
    IP_FRAG: Socket 1: adding route 100.50.0.0/16 (port 4)
    IP_FRAG: Socket 1: adding route 100.60.0.0/16 (port 5)
    IP_FRAG: Socket 1: adding route 100.70.0.0/16 (port 6)
    IP_FRAG: Socket 1: adding route 100.80.0.0/16 (port 7)
    IP_FRAG: Socket 1: adding route 0101:0101:0101:0101:0101:0101:0101:0101/48 (port 0)
    IP_FRAG: Socket 1: adding route 0201:0101:0101:0101:0101:0101:0101:0101/48 (port 1)
    IP_FRAG: Socket 1: adding route 0301:0101:0101:0101:0101:0101:0101:0101/48 (port 2)
    IP_FRAG: Socket 1: adding route 0401:0101:0101:0101:0101:0101:0101:0101/48 (port 3)
    IP_FRAG: Socket 1: adding route 0501:0101:0101:0101:0101:0101:0101:0101/48 (port 4)
    IP_FRAG: Socket 1: adding route 0601:0101:0101:0101:0101:0101:0101:0101/48 (port 5)
    IP_FRAG: Socket 1: adding route 0701:0101:0101:0101:0101:0101:0101:0101/48 (port 6)
    IP_FRAG: Socket 1: adding route 0801:0101:0101:0101:0101:0101:0101:0101/48 (port 7)
    Checking link status done
    Port 0 Link Up - speed 10000 Mbps - full-duplex
    Port 1 Link Up - speed 10000 Mbps - full-duplex
    IP_FRAG: entering main loop on lcore 7
    IP_FRAG: -- lcoreid=7 portid=1
    IP_FRAG: entering main loop on lcore 6
    IP_FRAG: -- lcoreid=6 portid=0
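The "adding route" lines in the log map a destination prefix to an output port. As a rough illustration of what such a table answers, here is a linear longest-prefix-match sketch over the first four IPv4 routes from the log. It is not how the LPM library stores or searches its table; the struct, macro and function names (sim_route, SIM_IPV4, sim_route_lookup) are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a host-order IPv4 address from four octets. */
#define SIM_IPV4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

struct sim_route {
    uint32_t prefix;   /* network address, host byte order */
    uint8_t depth;     /* prefix length in bits, 0..32 */
    int port;          /* output port for matching packets */
};

/* The first four IPv4 routes printed in the startup log. */
const struct sim_route sim_routes[] = {
    { SIM_IPV4(100, 10, 0, 0), 16, 0 },
    { SIM_IPV4(100, 20, 0, 0), 16, 1 },
    { SIM_IPV4(100, 30, 0, 0), 16, 2 },
    { SIM_IPV4(100, 40, 0, 0), 16, 3 },
};

/* Return the port of the longest matching prefix, or -1 if nothing matches. */
int sim_route_lookup(uint32_t dst)
{
    int best_port = -1;
    int best_depth = -1;
    size_t i;

    for (i = 0; i < sizeof(sim_routes) / sizeof(sim_routes[0]); i++) {
        const struct sim_route *r = &sim_routes[i];
        /* a shift by 32 is undefined, so depth 0 gets an explicit empty mask */
        uint32_t mask = r->depth == 0 ? 0 : 0xffffffffu << (32 - r->depth);

        assert(r->depth <= 32);
        if ((dst & mask) == (r->prefix & mask) && (int)r->depth > best_depth) {
            best_depth = r->depth;
            best_port = r->port;
        }
    }
    return best_port;
}
```

With this table, sim_route_lookup(SIM_IPV4(100, 20, 3, 4)) returns 1, and an address outside every prefix, such as 10.0.0.1, returns -1. Note that the run above passes -p 3, so only ports 0 and 1 are actually initialized even though routes for ports 0 to 7 are added.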
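Question 4:
How do you inspect the compile options and the static libraries in use, change the compile options, enable debug, and so on? The only way is through the makefiles. Their structure is quite clear, but reading them still takes a long time.
To print the compile command:
Edit the C_TO_O_DO variable in the file mk/internal/rte.compile-pre.mk under /sdk/@dpdk/dpdk-stable-16.07.1. Line 101 is the added line.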
     99 C_TO_O_DO = @set -e; \
    100         echo $(C_TO_O_DISP); \
    101         echo $(C_TO_O); \
    102         $(C_TO_O) && \
    103         $(PMDINFO_TO_O) && \
    104         echo $(C_TO_O_CMD) > $(call obj2cmd,$(@)) && \
    105         sed 's,'$@':,dep_'$@' =,' $(call obj2dep,$(@)).tmp > $(call obj2dep,$(@)) && \
    106         rm -f $(call obj2dep,$(@)).tmp
To print the link command:
Edit the O_TO_EXE_DO variable in the file mk/rte.app.mk under /sdk/@dpdk/dpdk-stable-16.07.1. Line 209 is the added line.
    207 O_TO_EXE_DO = @set -e; \
    208         echo $(O_TO_EXE_DISP); \
    209         echo $(O_TO_EXE); \
    210         $(O_TO_EXE) && \
    211         echo $(O_TO_EXE_CMD) > $(call exe2cmd,$(@))
The result looks like this:
    [root@dpdk ip_fragmentation]# make
    echo "xxxxccccxxxx"
    xxxxccccxxxx
      CC main.o
    gcc -Wp,-MD,./.main.o.d.tmp -m64 -pthread -march=native -DRTE_MACHINE_CPUFLAG_SSE -DRTE_MACHINE_CPUFLAG_SSE2 -DRTE_MACHINE_CPUFLAG_SSE3 -DRTE_MACHINE_CPUFLAG_SSSE3 -DRTE_MACHINE_CPUFLAG_SSE4_1 -DRTE_MACHINE_CPUFLAG_SSE4_2 -I/root/src/sdk/@dpdk/dpdk-stable-16.07.1/examples/ip_fragmentation/build/include -I/root/dpdk//x86_64-native-linuxapp-gcc/include -include /root/dpdk//x86_64-native-linuxapp-gcc/include/rte_config.h -g -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral -Wformat-security -Wundef -Wwrite-strings -Wno-return-type -o main.o -c /root/src/sdk/@dpdk/dpdk-stable-16.07.1/examples/ip_fragmentation/main.c
      LD ip_fragmentation
    gcc -o ip_fragmentation -m64 -pthread -march=native -DRTE_MACHINE_CPUFLAG_SSE -DRTE_MACHINE_CPUFLAG_SSE2 -DRTE_MACHINE_CPUFLAG_SSE3 -DRTE_MACHINE_CPUFLAG_SSSE3 -DRTE_MACHINE_CPUFLAG_SSE4_1 -DRTE_MACHINE_CPUFLAG_SSE4_2 -I/root/src/sdk/@dpdk/dpdk-stable-16.07.1/examples/ip_fragmentation/build/include -I/root/dpdk//x86_64-native-linuxapp-gcc/include -include /root/dpdk//x86_64-native-linuxapp-gcc/include/rte_config.h -g -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral -Wformat-security -Wundef -Wwrite-strings main.o -L/root/dpdk//x86_64-native-linuxapp-gcc/lib -Wl,-lrte_kni -Wl,-lrte_pipeline -Wl,-lrte_table -Wl,-lrte_port -Wl,-lrte_pdump -Wl,-lrte_distributor -Wl,-lrte_reorder -Wl,-lrte_ip_frag -Wl,-lrte_meter -Wl,-lrte_sched -Wl,-lrte_lpm -Wl,--whole-archive -Wl,-lrte_acl -Wl,--no-whole-archive -Wl,-lrte_jobstats -Wl,-lrte_power -Wl,--whole-archive -Wl,-lrte_timer -Wl,-lrte_hash -Wl,-lrte_vhost -Wl,-lrte_kvargs -Wl,-lrte_mbuf -Wl,-lethdev -Wl,-lrte_cryptodev -Wl,-lrte_mempool -Wl,-lrte_ring -Wl,-lrte_eal -Wl,-lrte_cmdline -Wl,-lrte_cfgfile -Wl,-lrte_pmd_bond -Wl,-lrte_pmd_af_packet -Wl,-lrte_pmd_bnxt -Wl,-lrte_pmd_cxgbe -Wl,-lrte_pmd_e1000 -Wl,-lrte_pmd_ena -Wl,-lrte_pmd_enic -Wl,-lrte_pmd_fm10k -Wl,-lrte_pmd_i40e -Wl,-lrte_pmd_ixgbe -Wl,-lrte_pmd_null -Wl,-lrte_pmd_ring -Wl,-lrte_pmd_virtio -Wl,-lrte_pmd_vhost -Wl,-lrte_pmd_vmxnet3_uio -Wl,-lrte_pmd_null_crypto -Wl,--no-whole-archive -Wl,-lrt -Wl,-lm -Wl,-ldl -Wl,-export-dynamic -Wl,-export-dynamic -L/root/src/sdk/@dpdk/dpdk-stable-16.07.1/examples/ip_fragmentation/build/lib -L/root/dpdk//x86_64-native-linuxapp-gcc/lib -Wl,--as-needed -Wl,-Map=ip_fragmentation.map -Wl,--cref
      INSTALL-APP ip_fragmentation
      INSTALL-MAP ip_fragmentation.map
    [root@dpdk ip_fragmentation]#