SQL查询将src获取到dest路径以连接航班

可以使用特殊语法解决SQL中这些类型的树遍历问题

WITHRECURSIVE

。

为了测试解决方案，让我们用虚拟数据创建下表：

CREATE TABLE flight (  src TEXT, dest TEXT, stt TIMESTAMP, endt TIMESTAMP);INSERT INTO flight VALUES ('SIN', 'DAC', '2016-12-31 22:45:00', '2017-01-01 01:45:00'),('DAC', 'SIN', '2017-01-01 16:30:00', '2017-01-01 21:30:00'),('SIN', 'DAC', '2017-01-01 22:45:00', '2017-01-02 01:45:00'),('DAC', 'DEL', '2017-01-01 10:00:00', '2017-01-01 11:30:00'),('DEL', 'DAC', '2017-01-01 12:30:00', '2017-01-01 14:00:00'),('DAC', 'CCU', '2017-01-01 10:30:00', '2017-01-01 11:15:00'),('CCU', 'DAC', '2017-01-01 11:45:00', '2017-01-01 12:30:00'),('SIN', 'DEL', '2017-01-01 11:00:00', '2017-01-01 15:00:00'),('DEL', 'SIN', '2017-01-01 16:30:00', '2017-01-01 20:30:00'),('CCU', 'DEL', '2017-01-01 12:00:00', '2017-01-01 12:45:00'),('DEL', 'CCU', '2017-01-01 13:15:00', '2017-01-01 14:00:00');

递归CTE很难理解，因此我将逐步构建逻辑。

在递归CTE中，有两个部分。锚点子查询和递归子查询。首先执行锚子查询，然后执行递归子查询，直到返回空结果集。棘手的部分（至少对我而言）正在构造将执行从1个节点到下一个节点的遍历的联接。

锚和递归子查询使用

UNIOn ALL

（有时是

UNIOn

）合并并返回。

由于我们对飞行路线感兴趣，因此从以下开始的简单锚子查询将是：

SELECt src, ARRAY[src] path, dest FROM flight

上面的查询包含3列，分别是开始，路径和结束，在第二列中，我们将

src

字段从转换

TEXT

为

TEXT[]

。尽管并非严格要求这样做，但由于数组易于附加，因此它将大大简化其余步骤。

使用上面的锚点查询，我们现在可以定义递归cte。

WITH RECURSIVE flight_paths (src, path, dest) AS (SELECt  src, ARRAY[src], dest FROM flightUNIOn ALLSELECt  fp.src, fp.path || f.src -- appends another 'flight source', f.dest -- updates the dest to the new destFROM flight fJOIN flight_paths fp ON f.src = fp.dest -- this is the join that links the tree nodesWHERe NOT f.src = ANY(fp.path) -- this condition prevents infinite recursion by not visiting nodes that have already been visited.) SELECt * FROM flight_paths-- finally, we can select from the flight_paths recursive cte

上面的代码返回了136行我的测试数据。前15行如下所示：

src path        destSIN {SIN}       DACDAC {DAC}       SINSIN {SIN}       DACDAC {DAC}       DELDEL {DEL}       DACDAC {DAC}       CCUCCU {CCU}       DACSIN {SIN}       DELDEL {DEL}       SINCCU {CCU}       DELDEL {DEL}       CCUDEL {DEL,CCU}   DACDAC {DAC,CCU}   DACDEL {DEL,CCU}   DELDAC {DAC,CCU}   DELDEL {DEL,CCU}   DELDAC {DAC,CCU}   DEL

这 是树遍历的设置 ，是编写递归cte的最难部分。现在，遍历已建立，我们可以修改上面的内容，以便：

我们从源头开始，到目的地结束
到达目的地时停止迭代
仅在到达（AB）<离开（BC）时才认为转机航班AB和BC有效

在此示例中，让

src := SIN

＆
dest := CCU

WITH RECURSIVE flight_paths (src, path, dest, stt, endt) AS (SELECt  src, ARRAY[src], dest , stt, endtFROM flightUNIOn ALLSELECt  fp.src, fp.path || f.src, f.dest , fp.stt, f.endt -- update endt to the new endtFROM flight fJOIN flight_paths fp ON f.src = fp.dest WHERe NOT f.src = ANY(fp.path)   AND NOT 'CCU' = ANY(fp.path)   -- (2) this new predicate stop iteration when the destination is reached  AND f.stt > fp.endt  -- (3) this new predicate only proceeds the iteration if the connecting flights are valid) SELECt * FROM flight_pathsWHERe src = 'SIN' AND dest = 'CCU'-- (1) specify the start & end

这给出了2条可能的路线，以第一个航班的出发时间为列

stt

，最后一个航班的到达时间为

endt

：

src path dest    stt      endtSIN {SIN,DAC}       CCU     2016-12-31 22:45:00 2017-01-01 11:15:00SIN {SIN,DAC,DEL}   CCU     2016-12-31 22:45:00 2017-01-01 14:00:00

现在可以非常轻松地优化结果集。例如，可以将最终查询修改为仅返回中午之前到达目的地的航班，如下所示：

SELECt * FROM flight_pathsWHERe src = 'SIN' AND dest = 'CCU'   AND endt::time < '12:00:00'::time

从这一点开始，在递归cte中添加计算飞行时间和连接时间的列，然后过滤适合两次谓词的飞行，这也是相当容易的。希望我已经为您提供了足够的详细信息以生成这两种变体。

您也可以通过过滤

path

数组的长度来限制连接数，尽管一旦达到最大连接数，在递归cte中停止迭代可能会更有效。

最后一点：为了使事情变得简单，我使用了除最终目的地之外的路径，例如，

{SIN, DAC, DEL}

而不是航班序列

{SIN-DAC, DAC-DEL,DEL-CCU}

（或停靠站
{DAC, DEL}
），但是我们可以很容易地获得这些表示形式，如下所示：

WITH RECURSIVE flight_paths (src, flights, path, dest, stt, endt) AS (SELECt  src, ARRAY[src || '-' || dest], ARRAY[src], dest , stt, endtFROM flightUNIOn ALLSELECt  fp.src, fp.flights || (f.src || '-' || f.dest), fp.path || f.src, f.dest, fp.stt, f.endtFROM flight fJOIN flight_paths fp ON f.src = fp.dest WHERe NOT f.src = ANY(fp.path)   AND NOT 'CCU' = ANY(fp.path)   AND f.stt > fp.endt) SELECt flights, stt, endt, path[2:] stopoversFROM flight_pathsWHERe src = 'SIN' AND dest = 'CCU'

SQL查询将src获取到dest路径以连接航班

面试问答相关栏目本月热门文章