I would use the Brickhouse jar; it provides a much better implementation of collect (and it accepts complex data types).
Query:

    add jar /path/to/jar/brickhouse-0.7.1.jar;
    create temporary function collect as 'brickhouse.udf.collect.CollectUDAF';

    select house_id
         , collect(named_struct("first_name", first_name, "last_name", last_name))
    from db.table
    group by house_id;

Output:

    1    [{"first_name":"bob","last_name":"jones"}, {"first_name":"jenny","last_name":"jones"}]
    2    [{"first_name":"sally","last_name":"johnson"}]
    3    [{"first_name":"john","last_name":"smith"},{"first_name":"barb","last_name":"smith"}]


