MySQL 8.0 新特性之横向(LATERAL)派生表
作者: 不剪发的Tony老师
毕业于北京航空航天大学,十多年数据库管理与开发经验,目前在一家全球性的金融公司从事数据库架构设计。CSDN学院签约讲师以及GitChat专栏作者。csdn上的博客收藏于以下地址:https://tonydong.blog.csdn.net
MySQL 将FROM中的子查询称为派生表(Derived Table)。以下查询使用了一个派生表:
SELECT * FROM (SELECT 1) AS dt;
+---+
| 1 |
+---+
| 1 |
+---+
不过,MySQL 中的派生表存在一些限制:
派生表不能是关联子查询
派生表不能引用它所在的SELECT语句中的其他表
在 MySQL 8.0.14 之前,派生表不能引用它所在的SELECT语句外部的表
简单来说,就是派生表必须能够单独运行,而不能依赖其他表。
从 MySQL 8.0.14 开始,派生表支持LATERAL关键字前缀,表示允许派生表引用它所在的FROM子句中的其他表。横向派生表能够完成普通派生表无法完成或者效率低下的操作。
考虑以下应用场景:departments 表存储了部门的信息,employees 表存储了员工信息。如何查找每个部门中薪水最高的 Top 5 和对应的员工?
示例表和数据
首先,使用传统的方法很难实现这样的功能:
SELECT d.department_name,
(SELECT e.salary
FROM employees e
WHERE e.department_id = d.department_id
ORDER BY e.salary DESC LIMIT 5
)
FROM departments d;
ERROR 1242 (21000): Subquery returns more than 1 row
以上查询失败的原因在于SELECT子查询只能返回 1 条数据。
按照需求,我们可以先按照部门编号对员工信息进行分组,获得每个组内的薪水最高的 5 个员工,然后和部门表进行连接查询:
SELECT d.department_name, t.first_name, t.last_name, t.salary
FROM departments d
LEFT JOIN (SELECT e.department_id, e.first_name, e.last_name, e.salary
FROM employees e
WHERE e.department_id = d.department_id
ORDER BY e.salary DESC LIMIT 5) t
ON d.department_id = t.department_id
ORDER BY d.department_name, t.salary DESC;
ERROR 1054 (42S22): Unknown column 'd.department_id' in 'where clause'
以上语句失败的原因在于子查询 t 不能引用外部查询中的 departments 表。
可以利用 MySQL 中的自定义变量实现该功能:
SELECT d.department_name, w.first_name, w.last_name, w.salary, w.rn
FROM departments d
LEFT JOIN (
SELECT *
FROM (
SELECT a.*, if(@did = a.department_id, @rn := @rn+1, @rn := 1) AS rn, @did := a.department_id AS did
FROM (SELECT e.* FROM employees e ORDER BY department_id, salary DESC) a,
(SELECT @rn := 0 rn, @did := 0) b
) AS t
WHERE t.rn <= 5
) AS w
ON d.department_id = w.department_id
ORDER BY d.department_name, w.salary DESC;
+----------------------+-------------+-----------+----------+------+
| department_name | first_name | last_name | salary | rn |
+----------------------+-------------+-----------+----------+------+
| Accounting | Shelley | Higgins | 12008.00 | 1 |
| Accounting | William | Gietz | 8300.00 | 2 |
| Administration | Jennifer | Whalen | 4400.00 | 1 |
| Benefits | NULL | NULL | NULL | NULL |
| Construction | NULL | NULL | NULL | NULL |
| Contracting | NULL | NULL | NULL | NULL |
| Control And Credit | NULL | NULL | NULL | NULL |
| Corporate Tax | NULL | NULL | NULL | NULL |
| Executive | Steven | King | 24000.00 | 1 |
| Executive | Neena | Kochhar | 17000.00 | 2 |
| Executive | Lex | De Haan | 17000.00 | 3 |
| Finance | Nancy | Greenberg | 12008.00 | 1 |
| Finance | Daniel | Faviet | 9000.00 | 2 |
| Finance | John | Chen | 8200.00 | 3 |
| Finance | Jose Manuel | Urman | 7800.00 | 4 |
| Finance | Ismael | Sciarra | 7700.00 | 5 |
| Government Sales | NULL | NULL | NULL | NULL |
| Human Resources | Susan | Mavris | 6500.00 | 1 |
...
51 rows in set (0.00 sec)
可以看出,这种方法比较复杂,也不具有通用性。
现在来看一下如何使用LATERAL派生表实现这个需求:
SELECT d.department_name, t.first_name, t.last_name, t.salary
FROM departments d
LEFT JOIN LATERAL (SELECT e.department_id, e.first_name, e.last_name, e.salary
FROM employees e
WHERE e.department_id = d.department_id
ORDER BY e.salary DESC LIMIT 5) t
ON d.department_id = t.department_id
ORDER BY d.department_name, t.salary DESC;
+----------------------+-------------+-----------+----------+------+
| department_name | first_name | last_name | salary | rn |
+----------------------+-------------+-----------+----------+------+
| Accounting | Shelley | Higgins | 12008.00 | 1 |
| Accounting | William | Gietz | 8300.00 | 2 |
| Administration | Jennifer | Whalen | 4400.00 | 1 |
| Benefits | NULL | NULL | NULL | NULL |
| Construction | NULL | NULL | NULL | NULL |
| Contracting | NULL | NULL | NULL | NULL |
| Control And Credit | NULL | NULL | NULL | NULL |
| Corporate Tax | NULL | NULL | NULL | NULL |
| Executive | Steven | King | 24000.00 | 1 |
| Executive | Neena | Kochhar | 17000.00 | 2 |
| Executive | Lex | De Haan | 17000.00 | 3 |
| Finance | Nancy | Greenberg | 12008.00 | 1 |
| Finance | Daniel | Faviet | 9000.00 | 2 |
| Finance | John | Chen | 8200.00 | 3 |
| Finance | Jose Manuel | Urman | 7800.00 | 4 |
| Finance | Ismael | Sciarra | 7700.00 | 5 |
| Government Sales | NULL | NULL | NULL | NULL |
| Human Resources | Susan | Mavris | 6500.00 | 1 |
...
51 rows in set (0.00 sec)
以上语句基于前面的一个失败示例,只是在LEFT JOIN之后,派生表之前加上一个LATERAL关键字,使得子查询 t 能够引用前面的 departments 表。这种方法能够实现很多类似的复杂功能,简单而且高效。
MySQL 从 8.0.14 开始支持横向派生表,同时存在以下限制:
横向派生表只能出现在FROM子句中,包括使用逗号分隔的表或者标准的连接语句(JOIN、INNER JOIN、CROSS JOIN、LEFT [OUTER] JOIN以及RIGHT [OUTER] JOIN)。
如果横向派生表位于连接操作的右侧,并且引用了左侧的表,连接类型必须为INNER JOIN、CROSS JOIN或者LEFT [OUTER] JOIN。
如果横向派生表位于连接操作的左侧,并且引用了右侧的表,连接类型必须为INNER JOIN、CROSS JOIN或者RIGHT [OUTER] JOIN。
如果横向派生表引用了聚合函数,那么该函数的聚合查询语句不能是横向派生表所在的FROM子句所属的查询语句。
根据 SQL 标准,表函数拥有一个隐式的LATERAL,这与 MySQL 8.0 到 MySQL 8.0.14 之前版本的实现一致。但是,根据标准,函数 JSON_TABLE() 之前不能存在LATERAL关键字,包括隐式的LATERAL。
当然,也可以使用 MySQL 8.0 中新增的窗口函数完成相同的功能:
SELECT *
FROM (SELECT d.department_name, e.first_name, e.last_name, e.salary,
row_number() OVER (PARTITION BY d.department_id ORDER BY e.salary DESC) AS rn
FROM departments d
LEFT JOIN employees e
ON (e.department_id = d.department_id)
) AS t
WHERE t.rn <= 5
ORDER BY t.department_name,t.rn;
+----------------------+-------------+-----------+----------+----+
| department_name | first_name | last_name | salary | rn |
+----------------------+-------------+-----------+----------+----+
| Accounting | Shelley | Higgins | 12008.00 | 1 |
| Accounting | William | Gietz | 8300.00 | 2 |
| Administration | Jennifer | Whalen | 4400.00 | 1 |
| Benefits | NULL | NULL | NULL | 1 |
| Construction | NULL | NULL | NULL | 1 |
| Contracting | NULL | NULL | NULL | 1 |
| Control And Credit | NULL | NULL | NULL | 1 |
| Corporate Tax | NULL | NULL | NULL | 1 |
| Executive | Steven | King | 24000.00 | 1 |
| Executive | Neena | Kochhar | 17000.00 | 2 |
| Executive | Lex | De Haan | 17000.00 | 3 |
| Finance | Nancy | Greenberg | 12008.00 | 1 |
| Finance | Daniel | Faviet | 9000.00 | 2 |
| Finance | John | Chen | 8200.00 | 3 |
| Finance | Jose Manuel | Urman | 7800.00 | 4 |
| Finance | Ismael | Sciarra | 7700.00 | 5 |
| Government Sales | NULL | NULL | NULL | 1 |
| Human Resources | Susan | Mavris | 6500.00 | 1 |
...
51 rows in set (0.00 sec)
相关文档:
MySQL 8.0 横向派生表
其他数据库产品的功能实现:
Oracle 18c 横向(LATERAL)内联视图
Oracle 18c APPLY 连接
SQL Server 2017 APPLY 运算符
PostgreSQL 11 横向(LATERAL)子查询
Db2 11 横向(LATERAL)连接