栏目分类:
子分类:
返回
名师互学网用户登录
快速导航关闭
当前搜索
当前分类
子分类
实用工具
热门搜索
名师互学网 > IT > 前沿技术 > 大数据 > 大数据系统

【踩坑】SUM(A)+SUM(B)和SUM(A+B)不相等的情况

【踩坑】SUM(A)+SUM(B)和SUM(A+B)不相等的情况

文章目录
  • SUN(A)+SUM(B)!=SUM(A+B)的情况
    • 1. 验证
      • 1.1. 对单独的一列进行SUM
      • 1.2. SUM(A)+SUM(B)
      • 1.3. SUM(A+B)
    • 2. 解决

SUN(A)+SUM(B)!=SUM(A+B)的情况

结论: 在hive、presto、spark中,当字段A或字段B列中存在null值时,SUN(A)+SUM(B)!=SUM(A+B)

tips
1.sum(col) 当col的值为null时,结果为null
2.null+数值=null

1. 验证

示例数据如下

customer_nostock_feeadvertisement_fee
C0001NULL1130.52
C00010.11084.72
C0001NULL1738.32
C0001NULL1817.67
C0001NULL3271.45
C0002NULL2587.76
C0002NULL1397.60
C0002NULL2871.75
C0002NULL812.08
C0002NULL1643.39
C00032635.232587.76
C00031253.691397.60
C0003253.692871.75
C00031528.12812.08
C0003412.531643.39
1.1. 对单独的一列进行SUM
SELECt  customer_no
       ,SUM(stock_fee)
FROM table1
GROUP BY  customer_no;

+--------------+--------------------+
| customer_no  |        _c1         |
+--------------+--------------------+
| C0001        | 0.1                |
| C0002        | NULL               |
| C0003        | 6083.259999999999  |
+--------------+--------------------+
  • 当列值中存在NULL时(C0001),sum结果不会报错,且计算结果正确
  • 当列值中全部为NULL时(C0002),sum结果为null
  • 当列值中不存在NULL值时(C0003),对数值进行sum,无异议
1.2. SUM(A)+SUM(B)
SELECt  customer_no
       ,SUM(stock_fee)+SUM(advertisement_fee)
FROM table1
GROUP BY  customer_no;

+--------------+-----------+
| customer_no  |    _c1    |
+--------------+-----------+
| C0001        | 9042.78   |
| C0002        | NULL      |
| C0003        | 15395.84  |
+--------------+-----------+
  • 对于c0001,SUM(stock_fee)=0.1,SUM(advertisement_fee)=9042.68,因此SUM(stock_fee)+SUM(advertisement_fee)=9042.78
  • 对于c0002,SUM(stock_fee)=null,SUM(advertisement_fee)=9312.58,因此SUM(stock_fee)+SUM(advertisement_fee)=null
  • 对于c0003,SUM(stock_fee)=6083.259999999999,SUM(advertisement_fee)=9312.58,因此SUM(stock_fee)+SUM(advertisement_fee)=15395.84
1.3. SUM(A+B)
SELECt  customer_no
       ,SUM(stock_fee + advertisement_fee)
FROM table1
GROUP BY  customer_no;

+--------------+---------------------+
| customer_no  |         _c1         |
+--------------+---------------------+
| C0001        | 1084.82             |
| C0002        | NULL                |
| C0003        | 15395.839999999998  |
+--------------+---------------------+
  • 对于c0001,只有0.1 + 1084.72 不为null,因此SUM(stock_fee + advertisement_fee)结果为1084.82
  • 对于c0002,只有stock_fee + advertisement_fee都为null,因此SUM(stock_fee + advertisement_fee)结果为null
  • 对于c0003,只有stock_fee + advertisement_fee都不为为null,因此SUM(stock_fee + advertisement_fee)结果为15395.839999999998
2. 解决

想要sum时得到正确结果,在sum时对每个字段值都判断是否为null,当为null时置为0

以下二者结果相同

SELECt  customer_no
       ,SUM(if(stock_fee is null,0,stock_fee) + if(advertisement_fee is null,0,advertisement_fee))
FROM table1
GROUP BY  customer_no;

SELECt  customer_no
       ,SUM(if(stock_fee is null,0,stock_fee)) + SUM(if(advertisement_fee is null,0,advertisement_fee))
FROM table1
GROUP BY  customer_no;

+--------------+-----------+
| customer_no  |    _c1    |
+--------------+-----------+
| C0001        | 9042.78   |
| C0002        | 9312.58   |
| C0003        | 15395.84  |
+--------------+-----------+
转载请注明:文章转载自 www.mshxw.com
本文地址:https://www.mshxw.com/it/663115.html
我们一直用心在做
关于我们 文章归档 网站地图 联系我们

版权所有 (c)2021-2022 MSHXW.COM

ICP备案号:晋ICP备2021003244-6号