对检索结果使用 UNIOn, INTERSECT, EXCEPT 来将检索结果进行并,交和差运算
所有的表–以及查询结果–都可以视为集合
SELECt product_id, product_name FROM product UNIOn SELECt product_id, product_name FROM product2;
UNIOn 等集合运算符通常都会除去重复的记录。
select product_id,product_name,product_type,sale_price,purchase_price from product where sale_price>purchase_price*1.5 union select product_id,product_name,product_type,sale_price,purchase_price from product where sale_price<800 select product_id,product_name,product_type,sale_price,purchase_price from product where sale_price>purchase_price*1.5 or sale_price<800
要将两个不同的表中的结果合并在一起, 就得使用 UNIOn
select * from product where sale_price < 1.3*purchase_price or sale_price is null or purchase_price is null; -- 使用 OR 谓词 select * from product where sale_price/purchase_price < 1.3 or sale_price/purchase_price is null; select * from product where sale_price/purchase_price is null union select * from product where sale_price/purchase_price < 1.3包含重复行的集合运算 union All
select product_id,product_name from product union all select product_id,product_name from product
练习题
select * from product where sale_price/purchase_price <1.5 union all select * from product where sale_price<1000隐式数据类型转换
hive中进行join关联时,关联列要避免使用隐式数据类型转换,否则容易导致数据倾斜
select product_id,product_name,'1' from product union select product_id,product_name,sale_price from product2
练习题
测试可兼容
需要用 inner join 来求得交集
SELECt p1.product_id, p1.product_name FROM product p1 INNER JOIN product2 p2 ON p1.product_id=p2.product_id
练习题
select * from product where sale_price>1.5*purchase_price and sale_price < 1500差集,补集与表的减法 MySQL 8.0不支持交运算except->not in
select * from product where product_id not in (select product_id from product2)
练习题
select * from product where sale_price>2000 and product_id not in (select product_id from product where sale_price<1.3*purchase_price)对称差
首先使用UNIOn求两个表的并集, 然后使用INTERSECT求两个表的交集, 然后用并集减去交集, 就得到了对称差。
select * from product where product_id not in (select product_id from product2) union select * from product where product_id not in (select product_id from product)连结join
使用 UNIOn 会增加记录行数,而使用 INTERSECT 或者 EXCEPT 会减少记录行数。
使用关联子查询也可以从其他表获取信息, 但连结更适合从多张表获取信息。
inner join关联子查询像vlookup函数,以表A为主,根表A为主表,根据表A的关联列的每一行的取值,逐个到表 B 中的关联列中去查找取值相等的行。
from table inner join ** on *condition(s)** 在 FROM 子句中使用 INNER JOIN 将两张表连接起来, 并为 ON 子句指定连结条件为 ShopProduct.product_id=Product.product_id
找出东京商店里的衣服类商品的商品名称,商品价格,商品种类,商品数量信息。
Product 表保存了商品编号,商品名称,商品种类
ShopProduct 表里有商店编号名称,商店的商品编号及数量
公共列商品编号将Product和ShopProduct这两张表连接起来
select sp.shop_id,sp.shop_name,sp.product_id,p.product_name,p.product_type,p.sale_price,sp.quantity from shopproduct as sp inner join product as p on sp.product_id=p.product_id;inner join需要注意的三点
- 进行连结的时候需要在from子句中使用多张表
from shopproduct as sp inner join product as p
必须使用on子句指定连结条件
ON 子句是专门用来指定连结条件的, 我们在上述查询的 ON 之后指定两张表连结所使用的列以及比较条件, 基本上, 它能起到与 WHERe 相同的筛选作用
select子句中的列最好按照表名.列名的格式来使用
WHERe 子句写在 ON 子句的后边。
把上述查询作为子查询
select * from ( select sp.shop_id, sp.shop_name, sp.product_id, p.product_name, p.product_type, p.sale_price, sp.quantity from shopproduct as sp inner join product as p on sp.product_id = p.product_id)as step1 where shop_name='东京' and product_type = '衣服';
在做完 INNER JOIN … ON 得到一个新表后, 才会执行 WHERe 子句
FROM 子句->WHERe 子句->SELECt 子句
- 两张表连接 from inner joinwhere子句按照条件进行筛选select子句筛选
可以将 WHERe 子句中的条件直接添加在 ON 子句中, 这时候 ON 子句后最好用括号将连结条件和筛选条件括起来
select sp.shop_id,sp.shop_name,sp.product_id,p.product_name,p.product_type,p.sale_price,sp.quantity from shopproduct as sp inner join product as p on (sp.product_id=p.product_id And sp.shop_name='东京' And p.product_type='衣服')
先连结再筛选的标准写法的执行顺序是, 两张完整的表做了连结inner join之后再做筛选where
或者先用where拆解 后inner join
select sp.shop_id,sp.shop_name,sp.product_id,p.product_name,p.product_type,p.sale_price,sp.quantity from (select * from shopproduct where shop_name='东京')as sp inner join (select * from product where product_type='衣服')as p on sp.product_id=p.product_id;
练习题
select sp.shop_id,sp.shop_name,sp.product_id,p.product_name,p.product_type,p.purchase_price from shopproduct as sp inner join product as p on sp.product_id = p.product_id where p.product_type='衣服';
select sp.*,p.* from shopproduct as sp inner join product as p on sp.product_id = p.product_id where shop_id='000A' and sale_price < 200结合groupBy inner join
select sp.shop_id,sp.shop_name,max(p.sale_price)as max_price from shopproduct as sp inner join product as p on sp.product_id = p.product_id group by sp.shop_id,sp.shop_name;使用连结求交集
SELECt P1.*
FROM Product AS P1
INNER JOIN Product2 AS P2
ON (P1.product_id = P2.product_id
AND P1.product_name = P2.product_name
AND P1.product_type = P2.product_type
AND P1.sale_price = P2.sale_price
AND P1.regist_date = P2.regist_date)
regist_date有空值不能比较
SELECt P1.*
FROM Product AS P1
INNER JOIN Product2 AS P2
ON P1.product_id = P2.product_id
self join
自连结并不是区分于内连结和外连结的第三种连结, 自连结可以是外连结也可以是内连结, 它是不同于内连结外连结的另一个连结的分类方法。
natural joinSELECt * FROM shopproduct NATURAL JOIN Product
---
SELECt SP.product_id,SP.shop_id,SP.shop_name,SP.quantity
,P.product_name,P.product_type,P.sale_price
,P.purchase_price,P.regist_date
FROM shopproduct AS SP
INNER JOIN Product AS P
ON SP.product_id = P.product_id
自然连结并不是区别于内连结和外连结的第三种连结, 它其实是内连结的一种特例–当两个表进行自然连结时, 会按照两个表中都包含的列名来进行等值内连结, 此时无需使用 ON 来指定连接条件。
求表 Product 和表 Product2 中的公共部分
SELECt * FROM Product NATURAL JOIN Product2
在进行自然连结时, 来自于 Product 和 Product2 的运动 T 恤这一行数据在进行比较时, 实际上是在逐字段进行等值连结
两个缺失值进行=比较, 结果都不为真
正确的写法有省略空列
SELECt *
FROM (SELECt product_id, product_name
FROM Product ) AS A
NATURAL JOIN
(SELECt product_id, product_name
FROM Product2) AS B;
外连结 outer join⭐⭐⭐
内连结会丢弃两张表中不满足 ON 条件的行,和内连结相对的就是外连结. 外连结会根据外连结的种类有选择地保留无法匹配到的行。
外连结有三种形式: 左连结, 右连结和全外连结。
左连结会保存左表中无法按照 ON 子句匹配到的行, 此时对应右表的行均为缺失值; 右连结则会保存右表中无法按照 ON 子句匹配到的行, 此时对应左表的行均为缺失值; 而全外连结则会同时保存两个表中无法按照 ON子句匹配到的行, 相应的另一张表中的行用缺失值填充。
所有的内容在调换两个表的前后位置, 并将左连结改为右连结之后, 都能得到相同的结果. 稍后再介绍全外连结的概念。
-- 左连结 FROMLEFT OUTER JOIN ON -- 右连结 FROM RIGHT OUTER JOIN ON -- 全外连结 FROM FULL OUTER JOIN ON
左连结从两个表获取信息
SELECt SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
FROM Product AS P
LEFT OUTER JOIN ShopProduct AS SP
ON SP.product_id = P.product_id;
内连结只能选取出同时存在于两张表中的数据,
如果使用内连结的话,根据 SELECt 语句执行时商店库存状况的不同,结果的行数也会发生改变,生成的单据的版式也会受到影响
外连结能够得到固定行数的结果.
外连结会根据外连结的种类有选择地保留无法匹配到的行。
使用 LEFT 时 FROM 子句中写在左侧的表是主表,使用 RIGHT 时右侧的表是主表
结合 WHERe 子句使用左连结SELECt P.product_id, P.product_name, P.sale_price, SP.shop_id, SP.shop_name, SP.quantity FROM Product AS P LEFT OUTER JOIN - ( SELECt * FROM ShopProduct WHERe quantity < 50 ) AS SP ON SP.product_id = P.product_id全外连接
多表连结
inner join
SELECt SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
,IP.inventory_quantity
FROM ShopProduct AS SP4
---
INNER JOIN Product AS P
ON SP.product_id = P.product_id
INNER JOIN InventoryProduct AS IP
ON SP.product_id = IP.product_id
---
WHERe IP.inventory_id = 'P001';
outer join
SELECt P.product_id
,P.product_name
,P.sale_price
,SP.shop_id
,SP.shop_name
,IP.inventory_quantity
FROM Product AS P
LEFT OUTER JOIN ShopProduct AS SP
ON SP.product_id = P.product_id
LEFT OUTER JOIN InventoryProduct AS IP
ON SP.product_id = IP.product_id
ON 子句进阶–非等值连结 谓词 比较运算符
使用相等判断的等值连结, 也可以使用比较运算符来进行连接. 实际上, 包括比较运算符(<,<=,>,>=, BETWEEN)和谓词运算(LIKE, IN, NOT 等等)在内的所有的逻辑运算都可以放在 ON 子句内作为连结条件。
非等值自左连结(SELF JOIN)SELECt product_id
,product_name
,sale_price
,COUNT(p2_id) AS my_rank
FROM (--使用自左连结对每种商品找出价格不低于它的商品
SELECt P1.product_id
,P1.product_name
,P1.sale_price
,P2.product_id AS P2_id
,P2.product_name AS P2_name
,P2.sale_price AS P2_price
FROM Product AS P1
LEFT OUTER JOIN Product AS P2
ON P1.sale_price <= P2.sale_price
) AS X
GROUP BY product_id, product_name, sale_price
ORDER BY my_rank;
SELECt P1.product_id
,P1.product_name
,P1.sale_price
,P2.product_id AS P2_id
,P2.product_name AS P2_name
,P2.sale_price AS P2_price
FROM Product AS P1
LEFT OUTER JOIN Product AS P2
ON P1.sale_price >= P2.sale_price
ORDER BY P1.sale_price,P1.product_id
对每种商品使用自左连结, 找出比该商品售价价格更低或相等的商品
SELECt product_id, product_name, sale_price
,SUM(P2_price) AS cum_price
FROM
(SELECt P1.product_id, P1.product_name, P1.sale_price
,P2.product_id AS P2_id
,P2.product_name AS P2_name
,P2.sale_price AS P2_price
FROM Product AS P1
LEFT OUTER JOIN Product AS P2
ON ((P1.sale_price > P2.sale_price)
OR (P1.sale_price = P2.sale_price
AND P1.product_id<=P2.product_id))
ORDER BY P1.sale_price,P1.product_id) AS X
GROUP BY product_id, product_name, sale_price
ORDER BY sale_price,cum_price;
cross join 笛卡尔积
-- 1.使用关键字 CROSS JOIN 显式地进行交叉连结
SELECt SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
FROM ShopProduct AS SP
CROSS JOIN Product AS P;
--2.使用逗号分隔两个表,并省略 ON 子句
SELECt SP.shop_id
,SP.shop_name
,SP.product_id
,P.product_name
,P.sale_price
FROM ShopProduct AS SP , Product AS P;
过时的语法不推荐
练习
select * from product where sale_price > 500 union select * from product2 where sale_price > 500;
select * from product where product_id not in (select product_id from product2) union select * from product2 where product_id not in (select product_id from product);
select sp.shop_id, sp.shop_name, sp.quantity,
p.product_id, p.product_name, p.product_type, p.sale_price
from product as p
inner join shopproduct as sp
on sp.product_id = p.product_id;
select product_type, max(sale_price) as maxp from product
group by product_type
select sp.shop_id, sp.shop_name, sp.quantity,
p.product_id, p.product_name, p.product_type, p.sale_price,
mp.maxp as '该类商品的最大售价'
from product as p
inner join shopproduct as sp
on sp.product_id = p.product_id
inner join (
select product_type, max(sale_price) as maxp from product
group by product_type
) as mp
on mp.product_type = p.product_type and p.sale_price = mp.maxp;
select p.product_id, p.product_name, p.product_type, p.sale_price
from product as p
inner join (
select product_type, max(sale_price) as maxp from product
group by product_type
) as mp
on mp.product_type = p.product_type and p.sale_price = mp.maxp;
select p.product_id, p.product_name, p.product_type, p.sale_price
from product as p
where sale_price = (
select max(sale_price) from product as p1
where p.product_type = p1.product_type
group by product_type
);



