首页 > 代码库 > Mysql Partition 理论知识总结

Mysql Partition 理论知识总结

简述: 
本文内容主要 Giuseppe Maxia 曾在Mysql Conference & Expo 2010发表关于 <Mysql Partition in Mysql 5.1 & 5.5>  经由整理后的内容,原文在下面的
Presentation URL,本文用于自身学习 。我自身关于分区与未分区的测试,打算发表于另一篇博文。
Giuseppe Maxia Blog
http://datacharmer.blogspot.com/
http://datacharmer.com/
Presentation URL http://en.oreilly.com/mysql2010/public/schedule/detail/12431 

------------------------------------------------------------------
1.why partition
  数据量太大,以至于indexes大小超出RAM能的保存范围时
2.物理分区与逻辑分区的区别:
  好比一张纸查一小块区域,物理分区是分成各碎片,逻辑分区则是折叠起来隐藏不需要的部分
------------------------------------------------------------------
3.在5.1中解析分区表的查询
mysql> explain partitions select * from click_statistics where add_time between "2011-07-25" and "2011-07-26" ;
+----+-------------+---------------+-----------------+-------+---------------+--------------+---------+------+-------+-------------+
| id | select_type | table         | partitions      | type  | possible_keys | key          | key_len | ref  | rows  | Extra       |
+----+-------------+---------------+-----------------+-------+---------------+--------------+---------+------+-------+-------------+
|  1 | SIMPLE      | statistics    | p201012,p201107 | range | idx_reg_time  | idx_reg_time | 8       | NULL | 49180 | Using where |
+----+-------------+---------------+-----------------+-------+---------------+--------------+---------+------+-------+-------------+
1 row in set (0.03 sec)
.查询partition信息
select partition_name part,
from_days(partition_expression) expr,
partition_descriptiton val 
from information_schema.partitions 
where table_name=‘t1‘ ; 

------------------------------------------------------------------
4.各分区类型的介绍
.Partition ranges must be listed smallest to greatest and must be integers
.The primary key must include all columns in the partitioning expression.
.HASH(expr) must return an integer
.HASH partition-mysql decides row placement using mod;
.Key partition,unlike HASH,the partitioning expression does not have to be an integer,the      hashing algorithm is similar to PASSWORD();
.RANGE,LIST and HASH must use integer values.
.No stored functions,stored procedures,or UDFs
------------------------------------------------------------------
5
.RANGE,LIST and HASH 分区支持的函数
ABS()
CEILING()
DAY()
DAYOFMONTH()
DAYOFWEEK()
DAYOFYEAR()
DATEDIFF()
EXTRACT()
FLOOR()
HOUR()
MICROSECOND()
MINUTE()
MOD()
MONTH()
QUARTER()
SECOND()
TIME_TO_SEC()
TO_DAYS()
WEEKDAY()
YEAR()
YEARWEEK()
-------------------------------------------------------------------
6.
.Indexed are partitioned along with the data
.partition keys must be unique constraint
.MYISAM - When partition , DATA DIRECTORY and INDEX DIRECTORY parameter can be used per partition
.INNODB - if innodb_file_per_table is not set, the partitions are in the centralized table space,if set each partition has an .ibd file
.when need partition by date , thinking ?
 可取方案 YEAR(date_column) / TO_DAYS(date_column) 意义明了,查询条件确定能使用到分区
 ------------------------------------------------------------------
7.Advanced partitions
   (to differentiate roles on master and slave - fast insert on the master with HASH partitions ,fast statistics in a slave with range partitions)
. Subpartitioning
RANGE or LIST partitions can be subpartitioned with HASH or KEY
.当没有匹配的partition分配时 ERROR 1526,insert 失败,此时partitions as constraints
------------------------------------------------------------------
8.Benchmarking partition
(If needed,remove PK from partitioned table)
(restart the server before each test)
(Do NOT mix partitioned and unpartitioned table in ther same server)
(Measure more than once)
Innodb 
 disk usage(expect more than myisam)
 CPU usage
MyISAM
 2 file handles per partition
 If more than one partitioned table,count total file handles
Archive
 CPU usage
 memory usage

------------------------------------------------------------------
9.Partitons Maintenance
.For RANGE partitioning,partitions can only be added to the end。最末尾,加上值更大的类别
.For LIST partitioning,partitions can be added that use different values 新增一个类别
.For HASH and KEY partitioning,Adding partitions will reorganize the data using the hashing algorithm. 重组
.For HASH and KEY partitioning,DROP PARTITION does not work,instead,use coalesce to merge partititons, 并非删除记录,而是分区数量减小,
    ALTER TABLE ... COALESCE PARTITION 2;
.REORGANIZE PARTITION - split or merge partitions
.REMOVE PARTITIONING - reverts to an unpartitioned table
.CHECK PARTITION - same as check table
.REPAIR PARTITION - same as repair table
.ANALYZE PARTITION -same as analyze table
.OPTIMIZE PARTITION - same as optimize table
.A perl script that creates partitioning statements
.mysqldump_partition_backup
.drop_oldest_partition (procedure+event per month)
.check & repair test - via move .MYI

------------------------------------------------------------------
10.实验部分 (略)
Partitions with InnoDB - Slower than MyISAM / But more robust / Requires more storage
相同记录,不同表引擎的数据文件大小
 group 1 : innodb / myisam / archive
 group 2 : innodb partititoned(whole) / (file per table) /myisam partitioned / archive partitoned
 
 Benchmark results:
Partitions with InnoDB
Partitions with Archive
wget http://launchpad.net/mysql-sandbox/mysql-sandbox-3/mysql-sandbox-3/+download/MySQL-Sandbox-3.0.15.tar.gz
wget http://launchpad.net/test-db/employees-db-1/1.0.6/+download/employees_db-dump-files-1.0.5.tar.bz2
-----------------------------------
leveraging replication 杠杆作用- 以不同组合的复制架构 满足不同的使用需求
Master[INNODB NOT PARTITIONED]"concurrent insert" ---    |
                                                                                                  |---- > Slave1 [INNODB NOT PARTITIONED] "concurrent read"
                                                                                                  |------ > Slave2 [INNODB PARTITIONED BY RANGE]"concurrent batch processing"
                                                                                                  |-------- > Slave3 [MyISAM PARTITIONED BY RANGE] "large batch processing"
解析架构1: 满足大并发读写 ,又满足并发批处理,大型批处理

Master[INNODB PARTITIONED BY HASH]"concurrent insert" ----|
                                                                                                          |---- > Slave1[INNODB NOT PARTITIONED]"concurrent read"
                                                                                                          |------ > Slave2[MYISAM PARTITIONED BY RANGE]"large batch processing"           
解析架构2: 架构2在架构1的基础上削剪了slave2

Master[INNODB PARTITIONED BY HASH]"concurrent insert" ----|
                                                                                                          |---- > Slave1 [ARCHIVE PARTITIONED BY RANGE(locations)]"dimensional processing"
                                                                                                          |------ > Slave2 [ARCHIVE PARTITIONED BY RANGE(date)]"dimensional processing"
                                                                                                          |------ >  Slave3 [ARCHIVE PARTITIONED BY RANGE(product)]"dimensional processing"
解析架构3: 各个按不同的标准分区,以满足快速根据不同条件查询,且使用Archive引擎
-----------------------------------
Mysql 5.5对于partition的增强
PARTITION BY RANGE COLUMNS ,并且可以按两列来分区,典型的如gender+hire_date,男分N个,女分N个
PARTITION BY LIST COLUMNS ,
TO_SECONDS

Mysql Partition 理论知识总结