尽管匈牙利算法将起作用,但在您的情况下,可以使用更简单的算法。您的隐式成本矩阵是一种特殊形式的对称矩阵:
ABS(SUBJ.match_field-CTRL.match_field)
因此,您可以相对容易地证明,在最佳分配 _{SUBJ i,CTRL j
}中,_按
SUBJ.match_field的值
CTRL.match_field排序也将是有序的。
证明: 考虑一个由 { }_排序的赋值 {SUBJ_ i ,CTRL j
}
SUBJ.match_field而不是
CTRL.match_field。然后,您至少会有一个 反转 ,即一对赋值 {SUBJ
i1,CTRL j1 }_和 _这样,
SUBJ.match_fieldi1 <
SUBJ.match_fieldi2,但是
CTRL.match_fieldj1 >
CTRL.match_fieldj2
然后,您可以将反相对替换为非反相对
{SUBJ i1,CTRL j2 }_和 _
小于或等于
SUBJ.match_field(i1,i2)和
CTRL.match_field(j1,j2)的所有六个相对位置的反向分配的成本(链接到Wolfram
Alpha)。
:证明
有了这个观察,就很容易证明以下 动态编程
算法具有最佳分配:
- 让
N
每个主题的复印件; 排序match_field
- 订单控制依据
match_field
- 准备一个
assignments
大小为空的数组N * subject.SIZE
- 准备一个空的2D阵列
mem
尺寸的N * subject.SIZE
由control.SIZE
用于 记忆化 ; 将所有元素设置为-1
Recursive_Assign
在下面的伪代码中定义的调用- 该
assignments
表现在包含N
每个主题i
在N*i
包括(含)和N*(i+1)
排除(排除)之间的作业。
FUNCTION Recursive_Assign // subjects contains each original subj repeated N times PARAM subjects : array of int[subjectLength] PARAM controls: array of int[controlLength] PARAM mem : array of int[subjectLength,controlLength] PARAM sp : int // current subject position PARAM cp : int // current control position PARAM assign : array of int[subjectLength]BEGIN IF sp == subjects.Length THEN RETURN 0 ENDIF IF mem[sp, cp] > 0 THEN RETURN mem[sp, cp] ENDIF int res = ABS(subjects[sp] - controls[cp]) + Recursive_Assign(subjects, controls, mem, sp + 1, cp + 1, assign) assign[sp] = cp IF cp+1+subjects.Length-sp < controls.Length THEN int alt = Recursive_Assign(subjects, controls, mem, sp, cp + 1, assign) IF alt < res THEN res = alt ELSE assign[sp] = cp ENDIF ENDIF RETURN (mem[sp, cp] = res)END
这是在ideone上使用C#的上述伪代码的实现。
该算法已准备好在SQL中重新编写为基于集合的算法。尝试使其适应原始问题设置(按位置分组并制作主题的多个副本)会给已经相当复杂的过程增加不必要的复杂性,因此我将通过使用表来简化一些事情值的SQL
Server参数。我不确定DB2是否提供类似的功能,但是如果没有,您应该能够将它们替换为临时表。
下面的存储过程几乎是将上述伪代码直接转换为SQL Server的存储过程语法:
CREATE TYPE SubjTableType AS TABLE (row int, id int, match_field int)CREATE TYPE ControlTableType AS TABLE (row int, id int, match_field int)CREATE PROCEDURE RecAssign ( @subjects SubjTableType READONLY, @controls ControlTableType READONLY, @sp int, @cp int, @subjCount int, @ctrlCount int) AS BEGIN IF @sp = @subjCount BEGIN RETURN 0 END IF 1 = (SELECt COUNT(1) FROM #MemoTable WHERe sRow=@sp AND cRow=@cp) BEGIN RETURN (SELECt best FROM #MemoTable WHERe sRow=@sp AND cRow=@cp) END DECLARE @res int, @spNext int, @cpNext int, @prelim int, @alt int, @diff int, @sId int, @cId int SET @spNext = @sp + 1 SET @cpNext = @cp + 1 SET @sId = (SELECt id FROM @subjects WHERe row = @sp) SET @cId = (SELECt id FROM @controls WHERe row = @cp) EXEC @prelim = RecAssign @subjects=@subjects, @controls=@controls, @sp=@spNext, @cp=@cpNext, @subjCount=@subjCount, @ctrlCount=@ctrlCount SET @diff = ABS((SELECt match_field FROM @subjects WHERe row=@sp)-(SELECt match_field FROM @controls WHERe row=@cp)) SET @res = @prelim + @diff IF 1 = (SELECt COUNT(1) FROM #Assignments WHERe sRow=@sp) BEGIN UPDATe #Assignments SET cId=@cId, sId=@sId, diff=@diff WHERe sRow=@sp END ELSE BEGIN INSERT INTO #Assignments(sRow, sId, cId, diff) VALUES (@sp, @sId, @cId, @diff) END IF @cp+1+@subjCount-@sp < @ctrlCount BEGIN EXEC @alt = RecAssign @subjects=@subjects, @controls=@controls, @sp=@sp, @cp=@cpNext, @subjCount=@subjCount, @ctrlCount=@ctrlCount IF @alt < @res BEGIN SET @res = @alt END ELSE BEGIN UPDATE #Assignments SET cId=@cId, sId=@sId, diff=@diff WHERe sRow=@sp END END INSERT INTO #MemoTable (sRow, cRow, best) VALUES (@sp, @cp, @res) RETURN @resEND
这是调用此存储过程的方式:
-- The procedure uses a temporary table for memoization:CREATE TABLE #MemoTable (sRow int, cRow int, best int)-- The procedure returns a table with assignments:CREATE TABLE #Assignments (sRow int, sId int, cId int, diff int)DECLARE @subj as SubjTableTypeINSERT INTO @SUBJ (row, id, match_field) SELECT ROW_NUMBER() OVER(ORDER BY match_field ASC)-1 AS row, id, match_field FROM subjectsDECLARE @ctrl as ControlTableTypeINSERT INTO @ctrl (row, id, match_field) SELECt ROW_NUMBER() OVER(ORDER BY match_field ASC)-1 AS row, id, match_field FROM controlsDECLARE @subjCount intSET @subjCount = (SELECt COUNT(1) FROM subjects)DECLARE @ctrlCount intSET @ctrlCount = (SELECt COUNT(1) FROM controls)DECLARE @best intEXEC @best = RecAssign @subjects=@subj, @controls=@ctrl, @sp=0, @cp=0, @subjCount=@subjCount, @ctrlCount=@ctrlCountSELECt @bestSELECT sId, cId, diff FROM #Assignments
上面的呼叫假定两个
subjects和
controls已经由位置滤除,以及
N拷贝
subjects已被插入到在进行呼叫之前的表值参数(或DB2的情况下,临时表)。
这是在sqlfiddle上运行的演示。



