mysql分区表
时间:2017-08-06 11:15 来源:潇湘夜雨 作者:liangzh 点击:次
一、什么是表分区
通俗地讲表分区是将一大表,根据条件分割成若干个小表。mysql5.1开始支持数据表分区了。
如:某用户表的记录超过了600万条,那么就可以根据入库日期将表分区,也可以根据所在地将表分区。当然也可根据其他的条件分区。
二、为什么要对表进行分区
为了改善大型表以及具有各种访问模式的表的可伸缩性,可管理性和提高数据库效率。
分区的一些优点包括:
1)、与单个磁盘或文件系统分区相比,可以存储更多的数据。
2)、对于那些已经失去保存意义的数据,通常可以通过删除与那些数据有关的分区,很容易地删除那些数据。相反地,在某些情况下,添加新数据的过程又可以通过为那些新数据专门增加一个新的分区,来很方便地实现。通常和分区有关的其他优点包括下面列出的这些。MySQL分区中的这些功能目前还没有实现,但是在我们的优先级列表中,具有高的优先级;我们希望在5.1的生产版本中,能包括这些功能。
3)、一些查询可以得到极大的优化,这主要是借助于满足一个给定WHERE语句的数据可以只保存在一个或多个分区内,这样在查找时就不用查找其他剩余的分区。因为分区可以在创建了分区表后进行修改,所以在第一次配置分区方案时还不曾这么做时,可以重新组织数据,来提高那些常用查询的效率。
4)、涉及到例如SUM()和COUNT()这样聚合函数的查询,可以很容易地进行并行处理。这种查询的一个简单例子如 “SELECT salesperson_id, COUNT (orders) as order_total FROM sales GROUP BY salesperson_id;”。通过“并行”,这意味着该查询可以在每个分区上同时进行,最终结果只需通过总计所有分区得到的结果。
5)、通过跨多个磁盘来分散数据查询,来获得更大的查询吞吐量。
三、分区类型
· RANGE分区:基于属于一个给定连续区间的列值,把多行分配给分区。
· LIST分区:类似于按RANGE分区,区别在于LIST分区是基于列值匹配一个离散值集合中的某个值来进行选择。
· HASH分区:基于用户定义的表达式的返回值来进行选择的分区,该表达式使用将要插入到表中的这些行的列值进行计算。这个函数可以包含MySQL 中有效的、产生非负整数值的任何表达式。
· KEY分区:类似于按HASH分区,区别在于KEY分区只支持计算一列或多列,且MySQL 服务器提供其自身的哈希函数。必须有一列或多列包含整数值。
1.RANGE分区
RANGE分区基于一个给定的连续区间范围,早期版本RANGE主要是基于整数的分区。
在5.7版本中DATE、DATETIME列也可以使用RANGE分区,同时在5.5以上的版本提供
了基于非整形的RANGE COLUMN分区。RANGE分区必须的连续的且不能重叠。使用
“VALUES LESS THAN ()” 来定义分区区间,非整形的范围值需要使用单引号,并且可以使用MAXVALUE作为分区的最高值。
示例:
CREATE TABLE part_date1
( c1 int default NULL,
c2 varchar(30) default NULL,
c3 date default NULL) engine=myisam
partition by range (cast(date_format(c3,'%Y%m%d') as signed))
(PARTITION p0 VALUES LESS THAN (19950101),
PARTITION p1 VALUES LESS THAN (19960101) ,
PARTITION p2 VALUES LESS THAN (19970101) ,
PARTITION p3 VALUES LESS THAN (19980101) )
2.LIST分区
类似于按RANGE分区,区别在于LIST分区是基于列值匹配一个离散值集合中的某个值来进行选择。
LIST分区通过使用“PARTITION BY LIST(expr)”来实现,其中“expr” 是某列值或一个基于某个列值、并返回一个整数值的表达式,然后通过“VALUES IN (value_list)”的方式来定义每个分区,其中“value_list”是一个通过逗号分隔的整数列表。
注释:在MySQL 5.1中,当使用LIST分区时,有可能只能匹配整数列表。
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
);
3.HASH分区
基于用户定义的表达式的返回值来进行选择的分区,该表达式使用将要插入到表中的这些行的列值进行计算。这个函数可以包含MySQL 中有效的、产生非负整数值的任何表达式。
要使用HASH分区来分割一个表,要在CREATE TABLE 语句上添加一个“PARTITION BY HASH (expr)”子句,其中“expr”是一个返回一个整数的表达式。它可以仅仅是字段类型为MySQL 整型的一列的名字。此外,你很可能需要在后面再添加一个“PARTITIONS num”子句,其中num 是一个非负的整数,它表示表将要被分割成分区的数量。
CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
)
PARTITION BY HASH(store_id)
PARTITIONS 4;
4.KSY分区
类似于按HASH分区,区别在于KEY分区只支持计算一列或多列,且MySQL 服务器提供其自身的哈希函数。必须有一列或多列包含整数值。
CREATE TABLE tk (
col1 INT NOT NULL,
col2 CHAR(5),
col3 DATE
)
PARTITION BY LINEAR KEY (col1)
PARTITIONS 3;
在KEY分区中使用关键字LINEAR和在HASH分区中使用具有同样的作用,分区的编号是通过2的幂(powers-of-two)算法得到,而不是通过模数算法。
四、分区表创建方法
1、查看数据库版本是否支持分区表
mysql> show plugins; #查看分区功能是否支持
| INNODB_SYS_DATAFILES | ACTIVE | INFORMATION SCHEMA | NULL | GPL |
| INNODB_SYS_VIRTUAL | ACTIVE | INFORMATION SCHEMA | NULL | GPL |
| BLACKHOLE | ACTIVE | STORAGE ENGINE | NULL | GPL |
| partition | ACTIVE | STORAGE ENGINE | NULL | GPL |
| FEDERATED | DISABLED | STORAGE ENGINE | NULL | GPL |
| ARCHIVE | ACTIVE | STORAGE ENGINE | NULL | GPL |
| ngram | ACTIVE | FTPARSER | NULL | GPL |
+----------------------------+----------+--------------------+---------+---------+
SELECT
PLUGIN_NAME as Name,
PLUGIN_VERSION as Version,
PLUGIN_STATUS as Status
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_TYPE='STORAGE ENGINE';
+--------------------+---------+----------+
| Name | Version | Status |
+--------------------+---------+----------+
| binlog | 1.0 | ACTIVE |
| CSV | 1.0 | ACTIVE |
| MyISAM | 1.0 | ACTIVE |
| MEMORY | 1.0 | ACTIVE |
| MRG_MYISAM | 1.0 | ACTIVE |
| PERFORMANCE_SCHEMA | 0.1 | ACTIVE |
| InnoDB | 5.7 | ACTIVE |
| BLACKHOLE | 1.0 | ACTIVE |
| partition | 1.0 | ACTIVE |
| FEDERATED | 1.0 | DISABLED |
| ARCHIVE | 3.0 | ACTIVE |
+--------------------+---------+----------+
2、创建分区表
CREATE TABLE members2 (
firstname VARCHAR(25) NOT NULL,
lastname VARCHAR(25) NOT NULL,
username VARCHAR(16) NOT NULL,
email VARCHAR(35),
joined DATE NOT NULL
)
PARTITION BY RANGE( YEAR(joined) ) (
PARTITION p0 VALUES LESS THAN (1960),
PARTITION p1 VALUES LESS THAN (1970),
PARTITION p2 VALUES LESS THAN (1980),
PARTITION p3 VALUES LESS THAN (1990),
PARTITION p4 VALUES LESS THAN MAXVALUE
);
3、查看分区表信息
查看information_schema.partitions表
mysql> select
partition_name part,
partition_expression expr,
partition_description descr,
table_rows
from information_schema.partitions where
table_schema = schema()
and table_name='members2';
+------+---------------+----------+------------+
| part | expr | descr | table_rows |
+------+---------------+----------+------------+
| p0 | YEAR(joined) | 1960 | 0 |
| p1 | YEAR(joined) | 1970 | 0 |
| p2 | YEAR(joined) | 1980 | 0 |
| p3 | YEAR(joined) | 1990 | 1 |
| p4 | YEAR(joined) | MAXVALUE | 0 |
+------+---------------+----------+------------+
可以查看表是不是分区表
mysql> show table status;
+---------------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+--------------------+------------+
| Name | Engine | Version | Row_format | Rows | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time | Update_time | Check_time | Collation | Checksum | Create_options | Comment |
+---------------------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+---------------------+------------+-----------------+----------+--------------------+------------+
| loc_term_record_src | InnoDB | 10 | Dynamic | 2 | 8192 | 16384 | 0 | 0 | 0 | NULL | 2017-06-17 10:37:04 | NULL | NULL | utf8_general_ci | NULL | row_format=DYNAMIC | |
| members | InnoDB | 10 | Dynamic | 2 | 49152 | 98304 | 0 | 0 | 0 | NULL | 2017-08-03 17:27:08 | 2017-08-03 17:42:59 | NULL | utf8_general_ci | NULL | partitioned | |
| members2 | InnoDB | 10 | Dynamic | 1 | 81920 | 81920 | 0 | 0 | 0 | NULL | 2017-08-03 17:31:26 | 2017-08-03 17:34:28 | NULL | utf8_general_ci | NULL | partitioned | |
| members3 | InnoDB | 10 | Dynamic | 0 | 0 | 81920 | 0 | 0 | 0 | NULL | 2017-08-03 17:44:37 | NULL | NULL | utf8_general_ci | NULL | partitioned | |
| rp_br_reallocation | InnoDB | 10 | Dynamic | 49431 | 201 | 9977856 | 0 | 0 | 4194304 | 50663 | 2017-08-03 16:09:40 | 2017-08-03 16:09:42 | NULL | utf8_bin | NULL | | 鸽环轨迹表 |
| students | InnoDB | 10 | Dynamic | 2 | 8192 | 16384 | 0 | 0 | 0 | NULL | 2017-03-16 11:28:49 | NULL | NULL | utf8_general_ci | NULL | | |
| tnp | InnoDB | 10 | Dynamic | 0 | 0 | 16384 | 0 | 16384 | 0 | 1 | 2017-08-03 17:25:23 | NULL | NULL | utf8_general_ci | NULL | | |
| uuid_cpp_test | InnoDB | 10 | Dynamic | 2 | 8192 | 16384 | 0 | 0 | 0 | NULL | 2017-06-17 10:37:04 | NULL | NULL | utf8_general_ci | NULL | row_format=DYNAMIC |
4、分析查询语句是否扫描了那些分区表
通过此语句来显示扫描哪些分区,及他们是如何使用的
mysql> explain partitions select * from members2 where joined='1980-10-11';
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | members2 | p3 | ALL | NULL | NULL | NULL | NULL | 1 | 100 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+------+----------+-------------+
五、分区表创建示例
1、根据id号范围创建分区表
mysql> select * from tnp;
+----+-----+
| id | ref |
+----+-----+
| 1 | 12 |
| 2 | 13 |
| 3 | 14 |
+----+-----+
3 rows in set
mysql> alter TABLE tnp PARTITION BY range(id)
(
partition p1 values less than (2),
partition p2 values less than (3)
);
1526 - Table has no partition for value 3
mysql> alter TABLE tnp PARTITION BY range(id)
(
partition p1 values less than (2),
partition p2 values less than (8)
);
Query OK, 3 rows affected
Records: 3 Duplicates: 0 Warnings: 0
mysql> explain partitions select * from tnp where id<3;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | tnp | p1,p2 | range | PRIMARY | PRIMARY | 4 | NULL | 2 | 100 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set
插入数据验证数据是否自动添加到对应分区
mysql> insert into tnp values(4,44);
Query OK, 1 row affected
mysql> explain partitions select * from tnp where id>3;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | tnp | p2 | range | PRIMARY | PRIMARY | 4 | NULL | 1 | 100 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set
mysql> explain partitions delete
from tnp where id<3;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| 1 | DELETE | tnp | p1,p2 | range | PRIMARY | PRIMARY | 4 | const | 2 | 100 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
1 row in set
mysql> explain partitions delete from tnp where id<2
;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| 1 | DELETE | tnp | p1 | range | PRIMARY | PRIMARY | 4 | const | 1 | 100 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
删除分区
mysql> alter table tnp drop partition p1;
Query OK, 0 rows affected
Records: 0 Duplicates: 0 Warnings: 0
mysql> select * from tnp; #数据也没有了
+----+-----+
| id | ref |
+----+-----+
| 2 | 13 |
| 3 | 14 |
| 4 | 44 |
+----+-----+
3 rows in set
mysql> explain partitions select * from tnp where id<3;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | tnp | p2 | range | PRIMARY | PRIMARY | 4 | NULL | 1 | 100 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set
2、按天创建分区
alter TABLE rp_br_reallocation PARTITION BY range(to_days(gps_time))
(
partition p1 values less than (to_days('2017-04-01')),
partition p2 values less than (to_days('2017-06-01')),
partition p3 values less than (to_days('2017-07-01')),
partition p4 values less than (to_days('2017-08-01')),
);
1503 - A PRIMARY KEY must include all columns in the table's partitioning function
错误原因:对于有主键的表,创建分区的字段必须包含在主键中。
解决一: 将gps_time字段也设置为主键
创建分区后添加分区
mysql> alter table rp_br_reallocation add partition (partition p804 values less than (to_days('2017-08-05')));
Query OK, 0 rows affected
Records: 0 Duplicates: 0 Warnings: 0
查询分区数据
select * from rp_br_reallocation PARTITION (p1);
select * from rp_br_reallocation PARTITION (p1,p2);
移除表的所有分区: #不会丢失数据,表分变成普通表
alter TABLE rp_br_reallocation REMOVE PARTITIONING;
删除分区,删除后数据丢失。
alter TABLE rp_br_reallocation drop PARTITION p7,p8;
六、自动创建分区
在实际环境,可能需要按天创建分区,如果没有创建分区,则新增的数据较不能正常写入,
自动创建分区可以通过存储过程和客户端脚本的方式。
自动按天创建分区的脚本
#!/bin/bash
#create partition everyday
db=数据库
tablename="数据表"
user=用户
password=密码
host=主机
#分区的时间参数,这里参数要加1,因为less than的范围是小于,不是小于等于。
day=`date +%F -d'+1 day'`
pname=p
atime=`date +%y%m%d`
#获取最新的分区参数
last1=`mysql -u $user -h $host -p$password -e "select PARTITION_DESCRIPTION from information_schema.partitions where table_schema = '$db' and table_name='rp_br_reallocation' ORDER BY partition_ordinal_position DESC LIMIT 1;" 2>/dev/null | sed -n '2p'`
#要创建的分区参数
value=`echo "$last1 + 1"|bc`
#分区名称
pname=$pname$atime
#创建分区
mysql -u $user -h $host -p$password -e "alter TABLE $tablename add partition (partition $pname values less than (to_days('$day')));"
#获取创建后的分区参数
last2=`mysql -u $user -h $host -p$password -e "select PARTITION_DESCRIPTION from information_schema.partitions where table_schema = '$db' and table_name='rp_br_reallocation' ORDER BY partition_ordinal_position DESC LIMIT 1;" 2>/dev/null | sed -n '2p'`
echo "last1:$last1 vs last2:$last2"
#判断分区是否正确创建
if [ "$value" == "$last2" ]
then
echo "create $pname successful at `date`"
else
echo "create $pname fail at `date`"
fi
将以上脚本的执行加入到每天凌晨的定时任务中,就可以实现每天自动创建当日的分区。
七、常见错误
1、对于已经有数据的表创建分区时报错
alter TABLE rp_br_reallocation PARTITION BY range(id)
(
partition p1 values less than (50)
);
报错:Table has no partition for value 50
原因:对已有数据进行分区,必须包含所以数据,该表中有50000多数据,分到50肯定不够。
解决方法:增大参数最大值,或者最大值改为MAXVALUE
alter TABLE rp_br_reallocation PARTITION BY range(id)
(
partition p1 values less than (3000),
partition p2 values less than (60000)
);
EXPLAIN select count(*) from rp_br_reallocation where id <3000;
1 SIMPLE rp_br_reallocation p1 range PRIMARY PRIMARY 4 1409 100 Using where; Using index
|