Analyzing and Fixing a Memory Leak (Part 2): Using the valgrind Tool

  valgrind is a tool for detecting memory leaks in C and C++ programs on Linux. Besides memory checking it offers many other features; this post covers only its memory checking.

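  First, download the source. The official valgrind site is http://valgrind.org/. At the time of writing the latest version is 3.9, available from http://valgrind.org/downloads/. The download is a tar.bz2 archive, valgrind-3.9.0.tar.bz2, which can be unpacked on Linux with tar: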

  

tar jxf valgrind-3.9.0.tar.bz2

  After unpacking, the software has to be built and installed. Change into the unpacked directory and run the configure script:

./configure

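  To install to a custom location, pass --prefix=<path> to ./configure. By default the binaries are installed in /usr/local/bin. If you choose a custom prefix, add the directory that contains the valgrind binary to the PATH environment variable afterwards. Next, build and install with the following command: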

make install

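  If the command fails because it has no permission to create directories, run it with administrator privileges; on Ubuntu, prefix it with sudo. When installation finishes, run valgrind --version. If version information is printed, the installation succeeded; otherwise it failed.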

mengpl@mengpl-virtual-machine:/usr/local$ valgrind --version
valgrind-3.9.0
mengpl@mengpl-virtual-machine:/usr/local$

  I ran into one more problem during installation: valgrind aborted at startup with a fatal error.

valgrind --leak-check=yes ls -l
==7674== Memcheck, a memory error detector
==7674== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==7674== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==7674== Command: ls -l
==7674==
 
valgrind:  Fatal error at startup: a function redirection
valgrind:  which is mandatory for this platform-tool combination

  In that case the libc6-dbg package needs to be installed. On Ubuntu the command is:

sudo apt-get install libc6-dbg

  A normal memory-check run with valgrind produces output like this:

mengpl@mengpl-virtual-machine:/usr/local$ valgrind --leak-check=yes ls
==8260== Memcheck, a memory error detector
==8260== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==8260== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==8260== Command: ls
==8260==
bin  etc  games  include  lib  man  sbin  share  src
==8260==
==8260== HEAP SUMMARY:
==8260==     in use at exit: 21,373 bytes in 15 blocks
==8260==   total heap usage: 50 allocs, 35 frees, 58,571 bytes allocated
==8260==
==8260== LEAK SUMMARY:
==8260==    definitely lost: 0 bytes in 0 blocks
==8260==    indirectly lost: 0 bytes in 0 blocks
==8260==      possibly lost: 0 bytes in 0 blocks
==8260==    still reachable: 21,373 bytes in 15 blocks
==8260==         suppressed: 0 bytes in 0 blocks
==8260== Reachable blocks (those to which a pointer was found) are not shown.
==8260== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==8260==
==8260== For counts of detected and suppressed errors, rerun with: -v
==8260== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

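  The main use of valgrind for program analysis is memory leak detection with Memcheck, which is also valgrind's default tool. For the full set of options, see the man page. Here I cover only one usage, a single command: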

valgrind --leak-check=full --show-leak-kinds=all -v

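  Put this command in front of the program under test and valgrind prints a full report. The rest of this post focuses on how to read that report. My program is fairly complex and exercises a lot of code, so I grouped its final output into four parts: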

  1. Leak summary

==6622== LEAK SUMMARY:
==6622==    definitely lost: 1,636 bytes in 9 blocks
==6622==    indirectly lost: 9,421 bytes in 92 blocks
==6622==      possibly lost: 186,418 bytes in 840 blocks
==6622==    still reachable: 728,283 bytes in 1,917 blocks
==6622==         suppressed: 0 bytes in 0 blocks

  2. Error summary

ERROR SUMMARY: 3465 errors from 1182 contexts (suppressed: 2 from 2)

  3. Leak details

==6622== 152 bytes in 19 blocks are indirectly lost in loss record 1,814 of 1,979
==6622==    at 0x4C2C857: malloc (vg_replace_malloc.c:291)
==6622==    by 0x7614CDB: link_nfa_nodes (regex_internal.c:1000)
==6622==    by 0x761D115: re_compile_internal (regcomp.c:1227)
==6622==    by 0x7620E7E: regcomp (regcomp.c:506)
==6622==    by 0xCE694B5: sal::routecfg::CalRouteHId(long long, int&) (routesalcfg_info.cpp:555)
==6622==    by 0xFC15A46: MRoute::CalcHorIdByMajorDim(long long) (route_common.cpp:141)
==6622==    by 0xFC38736: MRoute::CIRouteImp::query_routeByStr(SOBSession*, short const&, std::string const&, std::string const&, MRouteDef::SRouteInfo&, CBSErrorMsg&) (route_inf_sdl_i.cpp:411)
==6622==    by 0xFC40E59: MRoute::SearchDbRouteSync(std::string const&, MRouteDef::SRouteInfo&) (route_interface.cpp:10)
==6622==    by 0xFC48D0D: MRoute::search_routing_info(lua_State*) (route_lua_api.cpp:58)
==6622==    by 0xA3899E5: luaD_precall (ldo.cpp:320)
==6622==    by 0xA389C9D: luaD_call (ldo.cpp:377)
==6622==    by 0xA37F1AF: f_call(lua_State*, void*) (lapi.cpp:800)

  4. Error details

==6622== Use of uninitialised value of size 8
==6622==    at 0x9643FD8: ztcedecb (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x96439C1: ztcedencbk (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x9643126: ztcebn (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x9642C46: ztcen (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x7DDE0C8: ztceenc (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x7E91769: ztcrbm (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x7E912F3: ztcrbh (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x7E911B5: ztcrbp (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x7E910BE: ztcr2seed (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x7E91081: ztcrseed3 (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x7DDECC7: ztcsh (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)
==6622==    by 0x7D35A08: kpusattr (in /opt/oracle/product/10.2.0/client_1/lib/libclntsh.so.11.1)

  Start with the leak summary. Here is how the official documentation explains four of its categories:

  1. "definitely lost" means your program is leaking memory -- fix those leaks

    This is a genuine memory leak and must be fixed.

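  2. "indirectly lost" means your program is leaking memory in a pointer-based structure. (E.g. if the root node of a binary tree is "definitely lost", all the children will be "indirectly lost".) If you fix the "definitely lost" leaks, the "indirectly lost" leaks should go away.

    In other words, these blocks are reachable only through pointers stored in another block that is itself lost. This is not a wild (dangling) pointer problem; fixing the corresponding "definitely lost" leak normally clears these as well.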

  3. "still reachable" means your program is probably ok -- it didn’t free some memory it could have. This is quite common and often reasonable. Don’t use --show-reachable=yes if you don’t want to see these reports.

    Your program is probably fine: this memory was simply never freed before the program exited, although a pointer to it still existed.

  4. "suppressed" means that a leak error has been suppressed. There are some suppressions in the default suppression files. You can ignore suppressed errors.

    These can simply be ignored.

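  The leak details list every place where memory may have leaked, and each record can be analyzed one by one. When there are many records, group them into categories first and then analyze the code for each category.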

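  The error details also come in several kinds, such as the "Use of uninitialised value of size 8" in my example, which means memory was used before it was initialized. For more on using valgrind, I recommend this article:

https://www.ibm.com/developerworks/cn/linux/l-cn-valgrind/

  It is very detailed and thorough.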

  As for the memory leak itself, I have not yet pinned down its exact location. If I do, I will add the result here.