Hive: errors related to grouping sets

Contents

missing ) at ',' near ')', with the error position pointing inside the grouping sets parentheses
SemanticException 104:1 [Error 10213]: Grouping sets expression is not in GROUP BY key. Error encountered near token ...

missing ) at ',' near ')', with the error position pointing inside the grouping sets parentheses

Failing SQL

select  tab1.a
       ,tab1.b
       ,sum(tab1.c)
from tab1
group by  tab1.a
         ,tab1.b
grouping sets ((tab1.a, tab1.b))
;

Error message

ParseException line 7:22 missing ) at ',' near ''
line 7:31 extraneous input ')' expecting EOF near ''

Solution

select  tab1.a
       ,tab1.b
       ,sum(tab1.c)
from tab1
group by  tab1.a
         ,tab1.b
grouping sets ((a, tab1.b))
;

Cause

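This is a bug in Hive itself. Reference: https://issues.apache.org/jira/browse/HIVE-6950

In other words, within any grouping set that contains two or more columns, a table-qualified column such as x.column1 cannot be placed in the first position; write it unqualified as column1 instead.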

For example:
Table tab1 has three columns: a, b, c
Table tab2 has two columns: a, d
The query cannot be written as follows:
select  tab1.a
       ,b
       ,d 
       ,sum(tab1.c)
from tab1
join tab2
on tab1.a = tab2.a
group by  tab1.a
         ,b
         ,d
grouping sets ((tab1.a, b, d))
;

It should be changed to:
select  tab1.a
       ,b
       ,d 
       ,sum(tab1.c)
from tab1
join tab2
on tab1.a = tab2.a
group by  tab1.a
         ,b
         ,d
grouping sets ((b, tab1.a, d))
;

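That is, tab1.a must not be the first column of a multi-column grouping set, but a set that contains only tab1.a is fine.

The following is valid: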
select  tab1.a
       ,b
       ,d 
       ,sum(tab1.c)
from tab1
join tab2
on tab1.a = tab2.a
group by  tab1.a
         ,b
         ,d
grouping sets ((tab1.a))
;

SemanticException 104:1 [Error 10213]: Grouping sets expression is not in GROUP BY key. Error encountered near token …

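The columns listed in group by and the columns used in grouping sets must match exactly: every expression that appears in a grouping set must also appear in the group by list.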
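
As a minimal sketch of this rule, reusing the tab1 (a, b, c) table from the earlier example: the first query below is a hypothetical case where column b is used in a grouping set but is missing from group by, which is the kind of mismatch this error reports. The second query lists every grouping-set column in group by. The queries are illustrative only, and the exact error position and token will depend on the actual statement and Hive version.

```sql
-- Assumed failing pattern: b appears in a grouping set
-- but is not part of the group by list.
select  a
       ,sum(c)
from tab1
group by  a
grouping sets ((a, b), (a))
;

-- Corrected sketch: every column used in any grouping set
-- is also listed in group by.
select  a
       ,b
       ,sum(c)
from tab1
group by  a
         ,b
grouping sets ((a, b), (a))
;
```

When the error appears in a long statement, comparing the group by list against each grouping set column by column (including how each expression is written) is usually the quickest way to find the mismatch.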
