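You need a recursive approach, but WITH RECURSIVE will create a huge intermediate result, so the query will run out of spool space.
For a similar process I used the following approach (originally using a WHILE loop in a stored procedure):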

    CREATE MULTISET VOLATILE TABLE vt_tmp, NO LOG AS
     (
       SELECT group_id, category_1, category_2,
          -- assign a unique number to each group_id/category_1 combination
          Dense_Rank() Over (ORDER BY group_id, category_1) AS rnk
       FROM match_detail
       -- remove when your source data is unique
       GROUP BY 1,2,3 -- same result as a DISTINCT, but processed before DENSE_RANK
     )
    WITH DATA
    PRIMARY INDEX (category_2)
    ON COMMIT PRESERVE ROWS;

Now repeat the following update until it reports 0 rows processed:

    -- find matching categories and assign them a common number
    UPDATE vt_tmp FROM
     (
       SELECT e2.group_id, e2.category_1, Min(e1.rnk) AS minrnk
       FROM vt_tmp e1 JOIN vt_tmp e2
         ON e1.category_2 = e2.category_2
        AND e1.rnk < e2.rnk
       GROUP BY e2.group_id, e2.category_1
     ) x
    SET rnk = minrnk
    WHERE vt_tmp.group_id = x.group_id
      AND vt_tmp.category_1 = x.category_1;

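The repetition can be automated with a WHILE loop in a stored procedure. The following is only a minimal sketch, not the original procedure: the procedure name `merge_related_categories` and the variable `rows_changed` are hypothetical, and it assumes that `vt_tmp` already exists in the session when the procedure is compiled. It relies on Teradata's `ACTIVITY_COUNT` status variable, which reports the number of rows affected by the previous statement.

```sql
-- Hypothetical procedure name; vt_tmp is assumed to exist in the session
-- when this procedure is compiled and when it is called.
REPLACE PROCEDURE merge_related_categories()
BEGIN
  -- start above zero so the loop body runs at least once
  DECLARE rows_changed INTEGER DEFAULT 1;

  WHILE rows_changed > 0 DO
    -- find matching categories and assign them a common number
    UPDATE vt_tmp FROM
     (
       SELECT e2.group_id, e2.category_1, Min(e1.rnk) AS minrnk
       FROM vt_tmp e1 JOIN vt_tmp e2
         ON e1.category_2 = e2.category_2
        AND e1.rnk < e2.rnk
       GROUP BY e2.group_id, e2.category_1
     ) x
    SET rnk = minrnk
    WHERE vt_tmp.group_id = x.group_id
      AND vt_tmp.category_1 = x.category_1;

    -- number of rows touched by the UPDATE; 0 means nothing is left to merge
    SET rows_changed = ACTIVITY_COUNT;
  END WHILE;
END;
```

Under these assumptions, `CALL merge_related_categories();` would run the update until no row changes. Every pass only ever lowers `rnk` values, so the loop should terminate.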
To get the related categories you finally need:

    SELECT group_id, category_1 AS category, rnk AS related_categories
    FROM vt_tmp
    UNION
    SELECT group_id, category_2, rnk
    FROM vt_tmp

To exactly match your expected result you need to add a DENSE_RANK:

    SELECT group_id, category,
       Dense_Rank() Over (PARTITION BY group_id ORDER BY related_categories)
    FROM
     (
       SELECT group_id, category_1 AS category, rnk AS related_categories
       FROM vt_tmp
       UNION
       SELECT group_id, category_2, rnk
       FROM vt_tmp
     ) AS dt



