Python Basics Tutorial: MySQL Views, Triggers, Transactions, Stored… (2)

'),
('wdc','2010-11-20'),
('qxl','2010-11-12'),
('ffz','2009-09-18'),
('yl','2014-06-16'),
('fqy','2014-06-05'),
('cnj','2012-04-28'),
('bvf','2009-09-18'),
('hge','2012-04-22');

-- Extract the hiredate column, format it as year-month, and group by the formatted value.
select date_format(hiredate,'%Y-%m'), count(id) from emp_info group by date_format(hiredate,'%Y-%m');

-- Result
-- date_format(hiredate,'%Y-%m')    count(id)
-- 2009-09                          3
-- 2010-11                          2
-- 2012-04                          2
-- 2014-06                          2
-- Rows are grouped by time period, giving the number of employees hired in each period.
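The grouping above can be tried without a MySQL server. The following is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` module, where `strftime('%Y-%m', ...)` stands in for MySQL's `date_format`, and it loads only the eight rows visible in the listing (the earlier rows are cut off in the source), so 2009-09 comes out as 2 here rather than 3. The function name `hires_per_month` is made up for this example.

```python
import sqlite3

# Only the rows visible in the listing; earlier rows are truncated in the source.
ROWS = [
    ("wdc", "2010-11-20"), ("qxl", "2010-11-12"), ("ffz", "2009-09-18"),
    ("yl", "2014-06-16"), ("fqy", "2014-06-05"), ("cnj", "2012-04-28"),
    ("bvf", "2009-09-18"), ("hge", "2012-04-22"),
]


def hires_per_month(rows):
    """Count employees per hire month, mirroring the GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emp_info (id INTEGER PRIMARY KEY, name TEXT, hiredate TEXT)"
    )
    conn.executemany("INSERT INTO emp_info (name, hiredate) VALUES (?, ?)", rows)
    # SQLite's strftime('%Y-%m', ...) plays the role of MySQL's date_format.
    result = conn.execute(
        "SELECT strftime('%Y-%m', hiredate) AS ym, COUNT(id) "
        "FROM emp_info GROUP BY ym ORDER BY ym"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for month, count in hires_per_month(ROWS):
        print(month, count)
```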

For more on functions, see this blog post:

http://www.cnblogs.com/linhaifeng/articles/7495918.html#_label2

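6. Flow Control

Flow control means using if/else branches and while loops inside SQL statements; only the syntax differs from what you know from other languages.

# if conditional statement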
delimiter //
CREATE PROCEDURE proc_if ()
BEGIN
    
    declare i int default 0;
    if i = 1 THEN
        SELECT 1;
    ELSEIF i = 2 THEN
        SELECT 2;
    ELSE
        SELECT 7;
    END IF;

END //
delimiter ;

# while loop
delimiter //
CREATE PROCEDURE proc_while ()
BEGIN

    DECLARE num INT ;
    SET num = 0 ;
    WHILE num < 10 DO
        SELECT
            num ;
        SET num = num + 1 ;
    END WHILE ;

END //
delimiter ;
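
For readers coming from Python, the two procedures above map onto ordinary control flow. This is a rough mirror for comparison, not a way of running the procedures: the functions return the values that the procedures would `SELECT`, and the parameters `i` and `limit` are additions for illustration (the SQL versions hard-code 0 and 10).

```python
def proc_if(i=0):
    """Mirror of proc_if: return the value the procedure would SELECT."""
    if i == 1:
        return 1
    elif i == 2:
        return 2
    else:
        return 7


def proc_while(limit=10):
    """Mirror of proc_while: collect each value the loop would SELECT."""
    num = 0
    selected = []
    while num < limit:
        selected.append(num)
        num += 1
    return selected


if __name__ == "__main__":
    print(proc_if())     # i defaults to 0, so the ELSE branch is taken
    print(proc_while())  # one value per loop iteration
```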

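7. Indexes and Slow Query Optimization

First, why indexes exist. Data in a database ultimately lives on disk, so operating on it inevitably involves I/O.
I/O takes time. With a handful of rows the cost is too small to notice, but production databases can hold very large amounts of data, and then the time spent on I/O can no longer be ignored.
To cope with this, MySQL's storage engines maintain a data structure for locating records quickly. That structure is called an index; in MySQL it is also called a "key".
- primary key
- unique key
- index key
- Note: the foreign key we covered earlier is a constraint and is not meant for speeding up queries, so it is not counted here. Of the three keys above, the first two add a constraint on top of speeding up queries (primary key: non-null and unique; unique key: unique), whereas index key carries no constraint at all and only speeds up queries.
To sum up: an index is a data structure, much like the lookup pages at the front of the Xinhua Dictionary. You can search by pinyin or by radical depending on the situation, but the goal is the same: find where a character is as quickly as possible. Indexes work the same way. Almost every database operation starts by locating data, so finding it quickly is the main problem; modifying it afterwards is minor by comparison.
In essence: an index keeps narrowing the range in which the wanted data can lie until only the result remains, turning an aimless scan into a directed lookup.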
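
The idea of narrowing the range can be illustrated with a small sketch. It is a simplification: a binary search over a sorted Python list stands in for an index (MySQL uses B+ trees, not binary search), and the function names are invented for this example. What it shows is how many values each approach has to look at.

```python
def linear_scan(values, target):
    """Check values one by one; return (position, values examined)."""
    for examined, value in enumerate(values, start=1):
        if value == target:
            return examined - 1, examined
    return -1, len(values)


def binary_search(sorted_values, target):
    """Halve the candidate range each step; return (position, probes made)."""
    low, high, probes = 0, len(sorted_values) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if sorted_values[mid] == target:
            return mid, probes
        if sorted_values[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, probes


if __name__ == "__main__":
    ids = list(range(1, 1_000_001))
    print(linear_scan(ids, 900_000))    # examines 900000 values
    print(binary_search(ids, 900_000))  # needs at most 20 probes for a million values
```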
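Indexes have costs as well as benefits:
1. On a table that already holds a lot of data, creating an index is slow, and the index itself takes up storage space.
2. Once the index exists, queries that can use it become much faster, but writes become slower because every write must also update the index.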
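B+ tree

In a B+ tree, only the leaf nodes at the bottom hold the real data; the root and branch nodes hold only keys that direct the search, not actual records.
The number of lookups is determined mainly by the height of the tree: the fewer the levels, the fewer the lookups.
Each disk block also has a limited size, so the smaller each stored item is, the more items fit in one block.
To keep the tree shallow, give each table a primary key id column (a small integer, for example) and use that unique key to locate rows.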
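
A back-of-the-envelope calculation shows why small keys keep the tree shallow. This is a rough model, not how InnoDB actually lays out pages: it assumes a 16 KB block (InnoDB's default page size), a hypothetical 6-byte pointer per entry, and the same fan-out at every level. The function names are invented for this example.

```python
def keys_per_block(block_bytes, key_bytes, pointer_bytes):
    """How many (key, pointer) entries fit in one disk block."""
    return block_bytes // (key_bytes + pointer_bytes)


def tree_levels(row_count, fanout):
    """Smallest number of levels such that fanout ** levels >= row_count."""
    if row_count < 1 or fanout < 2:
        raise ValueError("need row_count >= 1 and fanout >= 2")
    levels, capacity = 1, fanout
    while capacity < row_count:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    rows = 3_000_000
    small_key = keys_per_block(16 * 1024, key_bytes=4, pointer_bytes=6)    # e.g. an int id
    large_key = keys_per_block(16 * 1024, key_bytes=200, pointer_bytes=6)  # e.g. a long string
    print(small_key, tree_levels(rows, small_key))  # 1638 entries per block, 3 levels
    print(large_key, tree_levels(rows, large_key))  # 79 entries per block, 4 levels
```

Under these assumptions, a 4-byte key reaches three million rows in three levels, while a 200-byte key needs four.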
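Clustered index (primary key)
The clustered index is the table's primary key. InnoDB always organizes a table around a clustered index: if no primary key is defined, it uses the first unique index whose columns are all NOT NULL, and failing that it generates a hidden row ID. Creating an InnoDB table produces two files on disk (in MySQL 5.7 and earlier): an .frm file with the table structure and an .ibd file holding both the row data and the index data. Creating a MyISAM table produces three files: an .frm file with the table structure, an .MYD file with the rows, and an .MYI file dedicated to indexes.
Secondary index (unique, index)
Queries cannot always filter on id; other columns are used as conditions too. Such queries cannot use the clustered index, so those columns need indexes of their own, and these are called secondary indexes.
Property: a leaf node of a secondary index stores the primary key value of the matching row. For example, if the secondary index is on name, each leaf stores the name value together with the id of the row it belongs to.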
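
The relationship between the two kinds of index can be modelled with plain dictionaries. This is a toy model with made-up rows, not how InnoDB stores data: the clustered index maps a primary key to the full row, and the secondary index maps a column value to primary keys, so a lookup through the secondary index takes two steps.

```python
# Clustered index: primary key -> full row (the leaf holds the whole record).
# The rows are hypothetical sample data.
clustered = {
    1: {"id": 1, "name": "sgt", "age": 18},
    2: {"id": 2, "name": "jason", "age": 30},
    3: {"id": 3, "name": "sgt", "age": 25},
}


def build_secondary(clustered_index, column):
    """Secondary index: column value -> list of primary keys (not full rows)."""
    secondary = {}
    for pk, row in clustered_index.items():
        secondary.setdefault(row[column], []).append(pk)
    return secondary


def lookup(clustered_index, secondary_index, value, wanted_column):
    """Find primary keys via the secondary index, then fetch rows by primary key."""
    pks = secondary_index.get(value, [])                        # step 1: secondary index
    return [clustered_index[pk][wanted_column] for pk in pks]  # step 2: clustered index


if __name__ == "__main__":
    name_index = build_secondary(clustered, "name")
    print(name_index)
    print(lookup(clustered, name_index, "sgt", "age"))
```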
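```
-- Covering index
select name from user where name = 'sgt';
-- This query is covered by the index:
-- the requested data is found entirely in the leaf nodes of the secondary index.

-- Non-covering index
select age from user where name = 'sgt';
-- This query is not covered: the lookup uses the index on name, but the requested
-- column is age, so the row must then be fetched through the primary key.
```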
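
The difference between the two queries can be observed in a query plan. The sketch below is a stand-in, not MySQL: it uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`, whose wording differs from MySQL's `EXPLAIN` (where a covering index shows up as "Using index" in the Extra column). The table contents and the helper name `query_plan` are assumptions for illustration.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the detail strings that EXPLAIN QUERY PLAN reports for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.execute("CREATE INDEX idx_name ON user (name)")
conn.executemany(
    "INSERT INTO user (name, age) VALUES (?, ?)",
    [("sgt", 18), ("jason", 30)],  # hypothetical sample rows
)

# Only name is requested, and name lives in the index itself.
covering = query_plan(conn, "SELECT name FROM user WHERE name = 'sgt'")
# age is not in the index, so the row has to be fetched as well.
non_covering = query_plan(conn, "SELECT age FROM user WHERE name = 'sgt'")

if __name__ == "__main__":
    print(covering)
    print(non_covering)
```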
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
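The following is optional material; try it out if you are interested.

Index testing

#1. Prepare the table
create table s1(
id int,
name varchar(20),
gender char(6),
email varchar(50)
);

#2. Create a stored procedure that inserts rows in bulk
# Use $$ as the statement delimiter while defining the procedure
delimiter $$
create procedure auto_insert1()
BEGIN
    declare i int default 1;
    while(i<3000000)do
        insert into s1 values(i,'jason','male',concat('jason',i,'@oldboy'));
        set i=i+1;
    end while;
END$$
# Restore the semicolon as the statement delimiter
delimiter ;

#3. Inspect the stored procedure
show create procedure auto_insert1\G

#4. Call the stored procedure
call auto_insert1();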

# With no index on the table
select * from s1 where id=30000;
# Use count() to avoid the time spent printing rows
select count(id) from s1 where id = 30000;
select count(id) from s1 where id = 1;

# Make id the primary key
alter table s1 add primary key(id);  # very slow

select count(id) from s1 where id = 1;  # orders of magnitude faster than before the index existed
select count(id) from s1 where name = 'jason';  # still slow


# Range queries
# Adding an index does not guarantee that every query on that column is fast
select count(id) from s1 where id > 1;  # much slower than id = 1
select count(id) from s1 where id >1 and id < 3;
select count(id) from s1 where id > 1 and id < 10000;
select count(id) from s1 where id != 3;

alter table s1 drop primary key;  # drop the primary key to study the name column on its own
select count(id) from s1 where name = 'jason';  # slow again

create index idx_name on s1(name);  # create an index on the name column of s1
select count(id) from s1 where name = 'jason';  # still slow!
# Recall how a B+ tree works: the indexed values need high selectivity, but every row
# in this table is 'jason', so the values cannot be told apart at all.
# The tree is effectively "a single stick".
select count(id) from s1 where name = 'xxx';
# This one is fast: with a single stick, the first mismatch ends the search
select count(id) from s1 where name like 'xxx';
select count(id) from s1 where name like 'xxx%';
select count(id) from s1 where name like '%xxx';  # slow: a leading wildcard cannot use leftmost-prefix matching

# Don't index low-selectivity columns
drop index idx_name on s1;

# Create an ordinary index on id
create index idx_id on s1(id);
select count(id) from s1 where id = 3;  # fast
select count(id) from s1 where id*12 = 3;  # slow: never put an indexed column inside a calculation

drop index idx_id on s1;
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';
# For a chain of AND conditions like this, MySQL looks for an indexed column with high selectivity, uses it to cut down the candidate rows first, then checks the remaining conditions
create index idx_name on s1(name);
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';  # no speedup

drop index idx_name on s1;
# Indexing low-selectivity columns such as name and gender does not speed up queries

create index idx_id on s1(id);
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';  # fast: id alone narrows the data down to a single row
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # slow: the id condition still matches many rows, and each must be checked against the other columns

drop index idx_id on s1;

create index idx_email on s1(email);
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # fast: email alone pins down the result
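
The tests above keep coming back to selectivity. One common way to put a number on it is distinct values divided by total rows; the sketch below applies that to rows shaped like the `s1` test table, scaled down to 1000 rows. The function name `selectivity` and the scaled-down data are assumptions for illustration, and the ratio is a rule of thumb rather than what MySQL's optimizer computes.

```python
def selectivity(values):
    """Distinct values / total rows: near 1.0 is highly selective, near 0 is not."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Rows shaped like the s1 test table, scaled down to 1000 rows.
rows = [(i, "jason", "male", f"jason{i}@oldboy") for i in range(1, 1001)]
columns = ("id", "name", "gender", "email")

# One selectivity figure per column.
report = {
    column: selectivity(row[position] for row in rows)
    for position, column in enumerate(columns)
}

if __name__ == "__main__":
    for column, value in report.items():
        print(column, value)
```

On this data, id and email score 1.0 while name and gender score 0.001, which matches which indexes helped in the tests.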

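Composite index

select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';
# If all four columns were highly selective, an index on any one of them would speed this up
# Index on email, but the query does not use email
select count(id) from s1 where name='jason' and gender = 'male' and id > 3;
# Index on name, but the query does not use name
select count(id) from s1 where gender = 'male' and id > 3;
# Index on gender, but the query does not use gender
select count(id) from s1 where id > 3;

# The problem: every column ends up with its own index, some queries still use none of them, and the index build cost is paid four times
create index idx_all on s1(email,name,gender,id);  # leftmost-prefix rule: put the most selective column on the left
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # faster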
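
The leftmost-prefix rule can be written down as a small function. This is a simplified model of the rule only: it ignores the optimizer's cost estimates and any other optimizations a real server may apply, and `usable_prefix` is a name invented for this example.

```python
def usable_prefix(index_columns, equality_columns, range_columns=()):
    """Return the leading index columns that a query's conditions can use.

    Walk the index from left to right: an equality condition lets matching
    continue, a range condition is used but ends the walk, and a column with
    no condition ends the walk immediately.
    """
    used = []
    for column in index_columns:
        if column in equality_columns:
            used.append(column)
        elif column in range_columns:
            used.append(column)
            break
        else:
            break
    return used


idx_all = ("email", "name", "gender", "id")

if __name__ == "__main__":
    # All four columns appear in the WHERE clause: the whole index is usable.
    print(usable_prefix(idx_all, {"email", "name", "gender"}, {"id"}))
    # email is missing, so the walk stops at the first column.
    print(usable_prefix(idx_all, {"name", "gender"}, {"id"}))
```

In this model, a query that leaves out email gets nothing from `idx_all`, which is why the column order of a composite index matters.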

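Slow query log

Set a time threshold, record every SQL statement that takes longer than it, and then optimize those statements specifically.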
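
MySQL does this on the server side through the `slow_query_log` and `long_query_time` system variables. The same idea can be sketched on the client side; the following is an illustration only, using Python's built-in `sqlite3` as a stand-in database. The helper `run_logged`, its parameters, and the in-memory list used as the log are all assumptions for this example.

```python
import sqlite3
import time


def run_logged(conn, sql, params=(), threshold_seconds=1.0,
               slow_log=None, clock=time.perf_counter):
    """Run a statement; record it in slow_log if it took longer than the threshold."""
    start = clock()
    rows = conn.execute(sql, params).fetchall()
    elapsed = clock() - start
    if slow_log is not None and elapsed > threshold_seconds:
        slow_log.append((sql, elapsed))
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    slow_log = []
    # A threshold of 0 seconds records every statement, which is handy for a demo.
    run_logged(conn, "SELECT 1", threshold_seconds=0.0, slow_log=slow_log)
    for sql, elapsed in slow_log:
        print(f"{elapsed:.6f}s  {sql}")
```

The `clock` parameter exists so the timing source can be replaced, for example in tests.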
