Preliminary tests of bloat handling with pg_repack
I. Software installation
1. Software requirements:
postgresql-9.5.2.tar.gz
pg_repack-1.3.4.zip
2. Install pg_repack
[root@localhost pg_repack-1.3.4]# export PATH=/opt/pgsql/9.5.2/bin:$PATH
[root@localhost pg_repack-1.3.4]# export LD_LIBRARY_PATH=/opt/pgsql/9.5.2/lib
[root@localhost pg_repack-1.3.4]# export MANPATH=/opt/pgsql/9.5.2/share/man:$MANPATH
[root@localhost pg_repack-1.3.4]# make
make[1]: Entering directory `/home/soft/pg_repack-1.3.4/bin'
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -I/opt/pgsql/9.5.2/include -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE -c -o pg_repack.o pg_repack.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -I/opt/pgsql/9.5.2/include -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE -c -o pgut/pgut.o pgut/pgut.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -I/opt/pgsql/9.5.2/include -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE -c -o pgut/pgut-fe.o pgut/pgut-fe.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 pg_repack.o pgut/pgut.o pgut/pgut-fe.o -L/opt/pgsql/9.5.2/lib -lpq -L/opt/pgsql/9.5.2/lib -Wl,--as-needed -Wl,-rpath,'/opt/pgsql/9.5.2/lib',--enable-new-dtags -lpgcommon -lpgport -lz -lreadline -lrt -lcrypt -ldl -lm -o pg_repack
make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/bin'
make[1]: Entering directory `/home/soft/pg_repack-1.3.4/lib'
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -fpic -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE -c -o repack.o repack.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -fpic -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE -c -o pgut/pgut-be.o pgut/pgut-be.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -fpic -DREPACK_VERSION=1.3.4 -I. -I./ -I/opt/pgsql/9.5.2/include/server -I/opt/pgsql/9.5.2/include/internal -D_GNU_SOURCE -c -o pgut/pgut-spi.o pgut/pgut-spi.c
( echo '{ global:'; gawk '/^[^#]/ {printf "%s;\n",$1}' exports.txt; echo ' local: *; };' ) >exports.list
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -fpic -shared -Wl,--version-script=exports.list -o pg_repack.so repack.o pgut/pgut-be.o pgut/pgut-spi.o -L/opt/pgsql/9.5.2/lib -Wl,--as-needed -Wl,-rpath,'/opt/pgsql/9.5.2/lib',--enable-new-dtags
sed 's,REPACK_VERSION,1.3.4,g' pg_repack.sql.in > pg_repack--1.3.4.sql;
sed 's,REPACK_VERSION,1.3.4,g' pg_repack.control.in > pg_repack.control
make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/lib'
make[1]: Entering directory `/home/soft/pg_repack-1.3.4/regress'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/regress'
[root@localhost pg_repack-1.3.4]# make install
make[1]: Entering directory `/home/soft/pg_repack-1.3.4/bin'
/bin/mkdir -p '/opt/pgsql/9.5.2/bin'
/usr/bin/install -c pg_repack '/opt/pgsql/9.5.2/bin'
make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/bin'
make[1]: Entering directory `/home/soft/pg_repack-1.3.4/lib'
/bin/mkdir -p '/opt/pgsql/9.5.2/lib'
/bin/mkdir -p '/opt/pgsql/9.5.2/share/extension'
/bin/mkdir -p '/opt/pgsql/9.5.2/share/extension'
/usr/bin/install -c -m 755 pg_repack.so '/opt/pgsql/9.5.2/lib/pg_repack.so'
/usr/bin/install -c -m 644 .//pg_repack.control '/opt/pgsql/9.5.2/share/extension/'
/usr/bin/install -c -m 644 pg_repack--1.3.4.sql pg_repack.control '/opt/pgsql/9.5.2/share/extension/'
make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/lib'
make[1]: Entering directory `/home/soft/pg_repack-1.3.4/regress'
make[1]: Nothing to be done for `install'.
make[1]: Leaving directory `/home/soft/pg_repack-1.3.4/regress'
[root@localhost pg_repack-1.3.4]#
3. Create the initial environment
[postgres@localhost ~]$ createdb bloatdb
[postgres@localhost ~]$ psql -d bloatdb -c "create extension pgstattuple;"
CREATE EXTENSION
[postgres@localhost ~]$ psql -d bloatdb -c "CREATE EXTENSION pg_repack;"
CREATE EXTENSION
[postgres@localhost ~]$
$ psql bloatdb
psql (9.5.2)
Type "help" for help.
bloatdb=# \dx
List of installed extensions
Name | Version | Schema | Description
-------------+---------+------------+--------------------------------------------------------------
pg_repack | 1.3.4 | public | Reorganize tables in PostgreSQL databases with minimal locks
pgstattuple | 1.3 | public | show tuple-level statistics
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(3 rows)
II. Static bloat cleanup tests (no active transactions)
1. Repack a specific index of table tbl
1). Prepare the environment
bloatdb=# create table tbl(id int primary key, first varchar(20),second varchar(20));
CREATE TABLE
bloatdb=# create index idx_tbl_first on tbl (first);
CREATE INDEX
bloatdb=# create index idx_tbl_second on tbl (second);
CREATE INDEX
bloatdb=# SELECT count(*) FROM tbl;
count
-------
0
(1 row)
bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));
pg_size_pretty
----------------
24 kB
(1 row)
bloatdb=# INSERT INTO tbl VALUES(generate_series(1,10000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
INSERT 0 10000
bloatdb=# SELECT count(*) FROM tbl;
count
-------
10000
(1 row)
bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));
pg_size_pretty
----------------
1584 kB
(1 row)
bloatdb=#
Update a column
bloatdb=# UPDATE tbl SET first= 'updated-001';
UPDATE 10000
bloatdb=# SELECT count(*) FROM tbl;
count
-------
10000
(1 row)
bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));
pg_size_pretty
----------------
3376 kB
(1 row)
bloatdb=#
2). Check the bloat ratio
Create the bloat statistics table
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" --create_stats_table
Collect bloat statistics
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.idx_tbl_second.......................................................(52.69%) 417 kB wasted
2. public.idx_tbl_first........................................................(52.64%) 413 kB wasted
3. public.tbl_pkey.............................................................(57.79%) 388 kB wasted
[postgres@localhost ~]$
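The report lines above follow a regular shape (rank, object name, dot padding, percentage, wasted size), so they can be filtered mechanically when deciding which indexes to repack. The following is a minimal Python sketch of such a filter, not part of pg_bloat_check itself. The function names `parse_bloat_line` and `over_threshold` are hypothetical, and the unit table assumes the sizes are printed in pg_size_pretty style (1 kB = 1024 bytes).

```python
import re

# Assumed unit multipliers (pg_size_pretty style: 1 kB = 1024 bytes).
UNITS = {"bytes": 1, "kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

# Matches lines such as:
#   1. public.idx_tbl_second.....(52.69%) 417 kB wasted
LINE_RE = re.compile(
    r"^\s*\d+\.\s+(?P<name>\S+?)\.{2,}"          # rank, object name, dot padding
    r"\((?P<pct>\d+(?:\.\d+)?)%\)\s+"            # bloat percentage
    r"(?P<size>\d+)\s+(?P<unit>bytes|kB|MB|GB)\s+wasted\s*$"
)


def parse_bloat_line(line):
    """Return (name, percent, wasted_bytes), or None if the line is not a report row."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return m["name"], float(m["pct"]), int(m["size"]) * UNITS[m["unit"]]


def over_threshold(lines, min_pct):
    """Return the names of objects whose reported bloat is at least min_pct."""
    names = []
    for line in lines:
        parsed = parse_bloat_line(line)
        if parsed is not None and parsed[1] >= min_pct:
            names.append(parsed[0])
    return names
```

Feeding it the three rows above with a threshold of 55 would select only public.tbl_pkey; shell prompts and other non-report lines are ignored.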
3). Fix the bloat
Repack one specific index in the database
[postgres@localhost ~]$ pg_repack -d bloatdb --index idx_tbl_first
INFO: repacking index "public"."idx_tbl_first"
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.idx_tbl_second.......................................................(52.69%) 417 kB wasted
2. public.tbl_pkey.............................................................(57.79%) 388 kB wasted
3. public.idx_tbl_first.....................................................(0.93%) 3121 bytes wasted
[postgres@localhost ~]$
2. Repack all indexes of table tbl
1). Prepare the environment
bloatdb=# update tbl set second='chris';
UPDATE 10000
bloatdb=# SELECT count(*) FROM tbl;
count
-------
10000
(1 row)
bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));
pg_size_pretty
----------------
3600 kB
(1 row)
bloatdb=#
bloatdb=# update tbl set first='chris';
UPDATE 10000
bloatdb=# SELECT count(*) FROM tbl;
count
-------
10000
(1 row)
bloatdb=# SELECT pg_size_pretty(pg_total_relation_size('tbl'));
pg_size_pretty
----------------
4176 kB
(1 row)
bloatdb=#
2). Check the bloat
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.idx_tbl_second.......................................................(59.94%) 820 kB wasted
2. public.idx_tbl_first........................................................(40.94%) 409 kB wasted
3. public.tbl_pkey.............................................................(28.73%) 193 kB wasted
[postgres@localhost ~]$
3). Repack all indexes of table tbl
[postgres@localhost ~]$ pg_repack -d bloatdb --table tbl --only-indexes
INFO: repacking indexes of "tbl"
INFO: repacking index "public"."idx_tbl_first"
INFO: repacking index "public"."idx_tbl_second"
INFO: repacking index "public"."tbl_pkey"
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.idx_tbl_first.....................................................(1.23%) 3028 bytes wasted
2. public.idx_tbl_second....................................................(1.23%) 3028 bytes wasted
3. public.tbl_pkey..........................................................(1.23%) 3028 bytes wasted
[postgres@localhost ~]$
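To put a number on the effect of the run above, the wasted-space figures from the two reports can be subtracted per index. The sketch below is only illustrative arithmetic on the values printed in this section; `wasted_reduction` is a hypothetical helper, and the kB values assume 1 kB = 1024 bytes. The result is the drop in reported wasted space, which is an estimate rather than the exact number of bytes returned to the file system.

```python
KB = 1024  # assumed: report sizes use 1 kB = 1024 bytes


def wasted_reduction(before, after):
    """Return {object name: drop in reported wasted bytes} for objects in both reports."""
    return {name: before[name] - after[name] for name in before if name in after}


# Wasted space reported before the repack (values copied from the report above).
before = {
    "public.idx_tbl_second": 820 * KB,
    "public.idx_tbl_first": 409 * KB,
    "public.tbl_pkey": 193 * KB,
}
# Wasted space reported after the repack.
after = {
    "public.idx_tbl_second": 3028,
    "public.idx_tbl_first": 3028,
    "public.tbl_pkey": 3028,
}

reduction = wasted_reduction(before, after)
total = sum(reduction.values())
```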
3. Repack both the data and the indexes of tbl
1). Index bloat
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.idx_tbl_first.........................................................(57.87%) 49 MB wasted
2. public.idx_tbl_second........................................................(39.29%) 34 MB wasted
3. public.tbl_pkey..............................................................(51.22%) 26 MB wasted
2). Fix the bloat: online VACUUM FULL of table tbl (data and indexes) in database bloatdb
[postgres@localhost ~]$ pg_repack --no-order --table tbl -d bloatdb
INFO: repacking table "tbl"
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.tbl_pkey..............................................................(0.0%) 0 bytes wasted
2. public.idx_tbl_second........................................................(0.0%) 0 bytes wasted
3. public.idx_tbl_first.........................................................(0.0%) 0 bytes wasted
[postgres@localhost ~]$
III. Dynamic bloat handling (while transactions are running)
1. Repack the whole table
1). Initial conditions
-- clear table data
bloatdb=# select * from tbl;
id | first | second
----+-------+--------
(0 rows)
bloatdb=#
bloatdb=# INSERT INTO tbl VALUES(generate_series(1,100000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
INSERT 0 100000
bloatdb=# UPDATE tbl SET first= 'updated-001';
UPDATE 100000
bloatdb=#
-- check bloat
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.idx_tbl_second........................................................(67.26%) 17 MB wasted
2. public.idx_tbl_first.........................................................(67.46%) 17 MB wasted
3. public.tbl_pkey............................................................(63.91%) 9832 kB wasted
[postgres@localhost ~]$
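2). Repack while a large insert is running
Set statement_timeout=0 and adjust as needed: maintenance_work_mem, wal_keep_segments (streaming, SSD<2000>)
Start the insert first, then repack while it is still running, passing -T (--wait-timeout) with a value of 3600.
-- session 1: insert data
bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
(cursor blinking: still running)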
-- session 2:repack during insert
$ pg_repack -d bloatdb --no-order --table tbl --wait-timeout=3600
INFO: repacking table "tbl"
(cursor blinking: still running)
-- session 1: insert finishes
bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
INSERT 0 2900000
bloatdb=#
-- session 2: finish repack
[postgres@localhost ~]$ pg_repack -d bloatdb --no-order --table tbl --wait-timeout=3600
INFO: repacking table "tbl"
-- session 2: check bloat
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.tbl_pkey..............................................................(0.0%) 0 bytes wasted
2. public.idx_tbl_second........................................................(0.0%) 0 bytes wasted
3. public.idx_tbl_first.........................................................(0.0%) 0 bytes wasted
[postgres@localhost ~]$
-- session 1: check data
bloatdb=# select count(*) from tbl ;
count
---------
3000000
(1 row)
bloatdb=#
2. Repack all indexes of table tbl
1). Prepare the data
--session 1: insert data
bloatdb=# delete FROM tbl;
DELETE 3000000
bloatdb=# INSERT INTO tbl VALUES(generate_series(1,100000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
INSERT 0 100000
bloatdb=# update tbl set first='chris';
UPDATE 100000
bloatdb=#
-- session 2:check bloat
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.tbl_pkey..............................................................(41.14%) 28 MB wasted
2. public.idx_tbl_second.......................................................(4.32%) 4471 kB wasted
3. public.idx_tbl_first........................................................(2.96%) 2889 kB wasted
[postgres@localhost ~]$
2).online insert and repack
--session 1: insert large data
bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
(cursor blinking: still running)
-- session 2: process bloat while session 1 inserts large data
[postgres@localhost ~]$ pg_repack -d bloatdb --table tbl --only-indexes -T 3600
INFO: repacking indexes of "tbl"
INFO: repacking index "public"."idx_tbl_first"
INFO: repacking index "public"."idx_tbl_second"
(cursor blinking: still running)
--session 1:insert finish
bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
INSERT 0 2900000
bloatdb=#
--session 2:repack finish
[postgres@localhost ~]$ pg_repack -d bloatdb --table tbl --only-indexes -T 3600
INFO: repacking indexes of "tbl"
INFO: repacking index "public"."idx_tbl_first"
INFO: repacking index "public"."idx_tbl_second"
INFO: repacking index "public"."tbl_pkey"
3) check table data and index bloat
--session 2:check bloat
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.tbl_pkey..............................................................(0.0%) 0 bytes wasted
2. public.idx_tbl_first.........................................................(0.0%) 0 bytes wasted
3. public.idx_tbl_second........................................................(0.0%) 0 bytes wasted
[postgres@localhost ~]$
--session 1:check table data
bloatdb=# select count(*) from tbl;
count
---------
3000000
(1 row)
bloatdb=#
3. Repack a specific index of table tbl
Note: --index (which rebuilds the index in concurrently mode) and --only-indexes cannot be used together.
[postgres@localhost ~]$ pg_repack -d bloatdb --index idx_tbl_first --only-indexes
ERROR: cannot specify --index (-i) and --only-indexes (-x)
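When pg_repack is driven from a script, the same conflict can be caught before the command is ever launched. The sketch below is a hypothetical wrapper, not part of pg_repack: `build_repack_argv` is an invented name, and it only assembles the options that appear in this article (-d, --table, --index, --only-indexes, --no-order, --wait-timeout). It rejects the one combination that the error above shows pg_repack refusing.

```python
def build_repack_argv(dbname, table=None, index=None, only_indexes=False,
                      no_order=False, wait_timeout=None):
    """Build a pg_repack argument list from the options used in this article.

    Hypothetical helper: it validates only the --index / --only-indexes
    conflict and leaves every other check to pg_repack itself.
    """
    if index is not None and only_indexes:
        # Same wording as the pg_repack error message shown above.
        raise ValueError("cannot specify --index (-i) and --only-indexes (-x)")

    argv = ["pg_repack", "-d", dbname]
    if table is not None:
        argv += ["--table", table]
    if index is not None:
        argv += ["--index", index]
    if only_indexes:
        argv.append("--only-indexes")
    if no_order:
        argv.append("--no-order")
    if wait_timeout is not None:
        argv.append("--wait-timeout={}".format(wait_timeout))
    return argv
```

The returned list can be passed to `subprocess.run` as is; keeping it as a list rather than a single string avoids shell quoting problems with unusual object names.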
1). Prepare the data
-- prepare data
bloatdb=# delete FROM tbl;
DELETE 3000000
bloatdb=# INSERT INTO tbl VALUES(generate_series(1,100000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
INSERT 0 100000
bloatdb=# update tbl set first='chris';
UPDATE 100000
bloatdb=#
-- check bloat
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.idx_tbl_second........................................................(47.57%) 97 MB wasted
2. public.tbl_pkey.............................................................(9.44%) 7206 kB wasted
3. public.idx_tbl_first........................................................(3.11%) 3040 kB wasted
[postgres@localhost ~]$
2).online insert and repack
--session 1: insert large data
bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
(cursor blinking: still running)
-- session 2: process bloat while session 1 inserts large data
[postgres@localhost ~]$ pg_repack -d bloatdb --index idx_tbl_second --wait-timeout=3600
INFO: repacking index "public"."idx_tbl_second"
(cursor blinking: still running)
--session 1:insert finish
bloatdb=# INSERT INTO tbl VALUES(generate_series(100001,3000000), 'first'||(random()*(10^3))::integer, 'second'||(random()*(10^3))::integer);
INSERT 0 2900000
bloatdb=#
--session 2:repack finish
[postgres@localhost ~]$ pg_repack -d bloatdb --index idx_tbl_second --wait-timeout=3600
INFO: repacking index "public"."idx_tbl_second"
[postgres@localhost ~]$
3) check table data and index bloat
--session 2:check bloat
[postgres@localhost ~]$ /home/soft/pg_bloat_check-master/pg_bloat_check.py -c "dbname=bloatdb" -t tbl
1. public.idx_tbl_first........................................................(50.77%) 102 MB wasted
2. public.tbl_pkey...............................................................(47.6%) 65 MB wasted
3. public.idx_tbl_second........................................................(0.0%) 0 bytes wasted
[postgres@localhost ~]$
--session 1:check table data
bloatdb=# select count(*) from tbl;
count
---------
3000000
(1 row)
bloatdb=#
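Test conclusions:
Under the same conditions, indexes generally bloat more readily than table data.
When disk space is tight, repack the indexes one at a time.
Repacking generally needs free disk space of about twice the size of the object being processed, so check the free space before you start.
Check which PostgreSQL versions your pg_repack release supports; as of 2016-11-26, 9.6 was not yet supported. See http://pgxn.org/dist/pg_repack/doc/pg_repack.html#Releases.
When repacking a table or index that has live transactions against it, set the timeout option --wait-timeout, typically to 1800 or 3600 (special thanks to 李海龙 for this suggestion).
Disclaimer: these notes apply only to this test environment. In production, run the repack during off-peak hours, and to keep the data safe, back it up before repacking.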
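The free-space rule of thumb from the conclusions can be turned into a quick pre-flight check. This is a minimal sketch under stated assumptions: the factor of 2 is the rule of thumb from this article rather than a guarantee from pg_repack, and the names `enough_free_space` and `check_path` are hypothetical. The object size in bytes would come from something like pg_total_relation_size, and the path should be on the file system that holds the database's data.

```python
import shutil


def enough_free_space(object_bytes, free_bytes, factor=2.0):
    """Return True if free_bytes covers factor times the object size.

    factor=2.0 reflects the rule of thumb stated in this article.
    """
    return free_bytes >= object_bytes * factor


def check_path(path, object_bytes, factor=2.0):
    """Apply the same rule using the free space of the file system holding path."""
    free_bytes = shutil.disk_usage(path).free
    return enough_free_space(object_bytes, free_bytes, factor)
```

For example, an index of 100 MB would pass the check only if at least 200 MB are free on the target file system.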
This article comes from the "yiyi" blog; please keep this source link: http://heyiyi.blog.51cto.com/205455/1876843