MySQL索引对NULL值的处理
# 索引不会包含有NULL值的列
只要列中包含有NULL值都将不会被包含在索引中,复合索引中只要有⼀列含有NULL值,那么这⼀列对于此复合索引就是⽆效的。所以我们在数据库设计时不要让字段的默认值为NULL。
在很多库表设计规范、某某军规的⽂章中,是不是经常会看到类似这样的内容。⼩编也经常看到这样的内容,并且在编写规范的时候,准备也把这⼀条加进去。但在按部就班之余,⼩编抽空验证了⼀下,发现事实却并⾮如此!
⼩编使⽤的MySQL版本是社区版 5.7.21
新建测试表 t1,插⼊不含NULL值得100⾏数据,然后插⼊1⾏带NULL的数据 insert into t1(id) values(101); 表中有主键id,索引a
CREATE TABLE `t1` (
`id` int(11) NOT NULL,
`a` int(11) DEFAULT NULL,
`b` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
测试1,包含NULL单列索引的查询,可以看到即使是查 IS NULL的⾏,也是可以⽤上索引的
测试1:
desc select*from t1 where a >82;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type |table| partitions | type  | possible_keys |key| key_len | ref  | rows | filtered | Extra                |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+---------------------
--+
|1| SIMPLE      | t1    |NULL| range | a            | a    |5|NULL|18|100.00| Using index condition |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
desc select*from t1 where a is NULL;
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
| id | select_type |table| partitions | type | possible_keys |key| key_len | ref  | rows | filtered | Extra                |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
|1| SIMPLE      | t1    |NULL| ref  | a            | a    |5| const |1|100.00| Using index condition |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-----------------------+
desc select*from t1 where a =20or a is null;
+----+-------------+-------+------------+-------------+---------------+--------+---------+-------+------+----------+--------------------------+
| id | select_type |table| partitions | type        | possible_keys |key| key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------------+---------------+--------+---------+-------+------+----------+--------------------------+
|1| SIMPLE      | t1    |NULL| ref_or_null | idx_ab        | idx_ab |5| const |2|100.00| Using where; Using index|
+----+-------------+-------+------------+-------------+---------------+--------+---------+-------+------+----------+--------------------------+
注意对 NULL 值的检索只能使⽤ is null / is not null / <=>,不能使⽤=,<,>这样的运算符(mysql中可以⽤a <=> NULL 表⽰查 a is NULL'的⾏)
测试2,包含NULL复合索引的查询,⾸先加⼀个复合索引 alter table t1 drop index a,add index idx_ab(a,b); 可以看到不管是指定 a is null ,或者指定 b is null ,都可以利⽤上索引 idx_ab(key_len 可以看出)
测试2:
desc select*from t1 where a=50and b>20;
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
| id | select_type |table| partitions | type  | possible_keys |key| key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
|1| SIMPLE      | t1    |NULL| range | idx_ab        | idx_ab |10|NULL|1|100.00| Using where; Using index|
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
desc select*from t1 where a=50and b is null;
+----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
| id | select_type |table| partitions | type | possible_keys |key| key_len | ref        | rows | filtered | Extra                    |
+----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
|1| SIMPLE      | t1    |NULL| ref  | idx_ab        | idx_ab |10| const,const |1|100.00| Using where; Using index|
+----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
查看mysql索引
desc select*from t1 where a is null and b>20;
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
| id | select_type |table| partitions | type  | possible_keys |key| key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
|1| SIMPLE      | t1    |NULL| range | idx_ab        | idx_ab |10|NULL|1|100.00| Using where; Using index|
+----+-------------+-------+------------+-------+---------------+--------+---------+------+------+----------+--------------------------+
desc select*from t1 where a is null and b is null;
+----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
| id | select_type |table| partitions | type | possible_keys |key| key_len | ref        | rows | filtered | Extra                    |
+----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
|1| SIMPLE      | t1    |NULL| ref  | idx_ab        | idx_ab |10| const,const |1|100.00| Using where; Using index|
+----+-------------+-------+------------+------+---------------+--------+---------+-------------+------+----------+--------------------------+
由此,只要列中包含有NULL值都将不会被包含在索引中,复合索引中只要有⼀列含有NULL值,那么这⼀列对于此复合索引就是⽆效的。所以我们在数据库设计时不要让字段的默认值为NULL。这句的前半句是不对的(可参考官⽹说明:),但是后半句的结论确是可以采纳的。
虽然MySQL可以在含有null的列上使⽤索引,但不代表null和其他数据在索引中是⼀样的。不建议列上允许为空,最好限制 not null ,并设置⼀个默认值,⽐如0和''空字符串等,如果是datetime类型,可以设置成'1970-01-01 00:00:00'这样的值。对MySQL来说,null 是⼀个特殊的值,Conceptually, NULL means “a missing unknown value” and it is treated somewhat differently from other values。对null做算术运算的结果都是null,count时不会包括null⾏,null ⽐空字符串需要更多的存储空间等。
附:上⾯说到可⽤通过 key_len 看出使⽤了索引列的个数,a,b 都是 int 类型,4 byte,为什么 key_len 是 5 byte 和 10 byte 呢?是因为如果索引列定义时允许NULL,其key_len还需要再加 1 bytes. 参考好友王的⽂章,可以移步我们的站点查看详情: