HMM 博客汇总
- 全部(为了防止部分读者厌烦,故而单独发了三个问题的博客)
- 概率计算问题
- 学习问题
- 解码问题
解码问题就是求 a r g m a x I P ( I ∣ O , λ ) underset{I}{argmax} P(I|O,lambda) Iargmax P(I∣O,λ)
Viterbi算法Viterbi算法事实上是一个动态规划的算法
这个图来自知乎
我们把概率当成距离
那么只要确定了唯一的终点,到这个终点的最大距离必然等于到前一个时间轴5个点的最大距离分别乘以这5个点到终点的距离
我们也可以用公式严格推导出这一性质
定义距离为
δ
t
(
i
)
=
m
a
x
i
1
,
i
2
,
.
.
.
,
i
t
−
1
P
(
o
1
,
o
2
,
.
.
.
,
o
t
,
i
1
,
i
2
,
.
.
.
,
i
t
−
1
,
i
t
=
q
i
)
delta_t(i) = underset{i_1,i_2,...,i_{t-1}}{max} P(o_1,o_2,...,o_t,i_1,i_2,...,i_{t-1},i_t=q_i)
δt(i)=i1,i2,...,it−1maxP(o1,o2,...,ot,i1,i2,...,it−1,it=qi)
δ i + 1 ( i ) = m a x i 1 , i 2 , . . . , i t P ( o 1 , o 2 , . . . , o t , o t + 1 , i 1 , i 2 , . . . , i t − 1 , i t , i t + 1 = q i ) = m a x i 1 , i 2 , . . . , i t P ( o t + 1 ∣ i t + 1 = q i ) ⋅ P ( o 1 , o 2 , . . . , o t , i 1 , i 2 , . . . , i t , i t + 1 = q i ) = b i ( o t + 1 ) ⋅ m a x i 1 , i 2 , . . . , i t P ( o 1 , o 2 , . . . , o t , i 1 , i 2 , . . . , i t , i t + 1 = q i ) = b i ( o t + 1 ) ⋅ m a x j m a x i 1 , i 2 , . . . , i t − 1 P ( o 1 , o 2 , . . . , o t , i 1 , i 2 , . . . , i t = q j , i t + 1 = q i ) = b i ( o t + 1 ) ⋅ m a x j m a x i 1 , i 2 , . . . , i t − 1 P ( i t + 1 = q i ∣ i t = q j ) ⋅ P ( o 1 , o 2 , . . . , o t , i 1 , i 2 , . . . , i t = q j ) = b i ( o t + 1 ) ⋅ m a x j m a x i 1 , i 2 , . . . , i t − 1 α j i ⋅ P ( o 1 , o 2 , . . . , o t , i 1 , i 2 , . . . , i t = q j ) = b i ( o t + 1 ) ⋅ m a x j ( α j i ⋅ m a x i 1 , i 2 , . . . , i t − 1 P ( o 1 , o 2 , . . . , o t , i 1 , i 2 , . . . , i t = q j ) ) = b i ( o t + 1 ) ⋅ m a x j ( α j i ⋅ δ t ( j ) ) begin{aligned} delta_{i+1}(i) &= underset{i_1,i_2,...,i_t}{max} P(o_1,o_2,...,o_t,o_{t+1},i_1,i_2,...,i_{t-1},i_t,i_{t+1}=q_i) \ &= underset{i_1,i_2,...,i_t}{max} P(o_{t+1}|i_{t+1}=q_i)cdot P(o_1,o_2,...,o_t,i_1,i_2,...,i_t,i_{t+1}=q_i) \ &=b_i(o_{t+1})cdot underset{i_1,i_2,...,i_t}{max} P(o_1,o_2,...,o_t,i_1,i_2,...,i_t,i_{t+1}=q_i) \ &=b_i(o_{t+1})cdot underset{j}{max}underset{i_1,i_2,...,i_{t-1}}{max}P(o_1,o_2,...,o_t,i_1,i_2,...,i_t=q_j,i_{t+1}=q_i) \ &=b_i(o_{t+1})cdot underset{j}{max}underset{i_1,i_2,...,i_{t-1}}{max}P(i_{t+1}=q_i|i_t=q_j)cdot P(o_1,o_2,...,o_t,i_1,i_2,...,i_t=q_j) \ &=b_i(o_{t+1})cdot underset{j}{max}underset{i_1,i_2,...,i_{t-1}}{max}alpha_{ji}cdot P(o_1,o_2,...,o_t,i_1,i_2,...,i_t=q_j) \ &=b_i(o_{t+1})cdot underset{j}{max}left( alpha_{ji}cdot underset{i_1,i_2,...,i_{t-1}}{max}P(o_1,o_2,...,o_t,i_1,i_2,...,i_t=q_j)right) \ &=b_i(o_{t+1})cdot underset{j}{max}left( alpha_{ji}cdot delta_t(j)right) \ end{aligned} δi+1(i)=i1,i2,...,itmaxP(o1,o2,...,ot,ot+1,i1,i2,...,it−1,it,it+1=qi)=i1,i2,...,itmaxP(ot+1∣it+1=qi)⋅P(o1,o2,...,ot,i1,i2,...,it,it+1=qi)=bi(ot+1)⋅i1,i2,...,itmaxP(o1,o2,...,ot,i1,i2,...,it,it+1=qi)=bi(ot+1)⋅jmaxi1,i2,...,it−1maxP(o1,o2,...,ot,i1,i2,...,it=qj,it+1=qi)=bi(ot+1)⋅jmaxi1,i2,...,it−1maxP(it+1=qi∣it=qj)⋅P(o1,o2,...,ot,i1,i2,...,it=qj)=bi(ot+1)⋅jmaxi1,i2,...,it−1maxαji⋅P(o1,o2,...,ot,i1,i2,...,it=qj)=bi(ot+1)⋅jmax(αji⋅i1,i2,...,it−1maxP(o1,o2,...,ot,i1,i2,...,it=qj))=bi(ot+1)⋅jmax(αji⋅δt(j))
推导完毕
但是上面还没有给出路径
对于给定终点,我们要知道到达它的上一个点
即
ψ
t
+
1
(
i
)
=
argmax
j
δ
t
(
j
)
⋅
a
j
i
psi_{t+1}(i)=underset{j}{operatorname{argmax}} delta_{t}(j)cdot a_{ji}
ψt+1(i)=jargmaxδt(j)⋅aji
(1)初值
δ
1
(
i
)
=
P
(
o
1
,
i
1
=
q
i
)
=
P
(
o
1
∣
i
1
=
q
i
)
⋅
P
(
i
1
=
q
i
)
=
b
i
(
o
1
)
π
i
delta_1(i) = P(o_1,i_1=q_i) = P(o_1mid i_1=q_i)cdot P(i_1=q_i) = b_i(o_1)pi_i
δ1(i)=P(o1,i1=qi)=P(o1∣i1=qi)⋅P(i1=qi)=bi(o1)πi
(2)递推
δ
i
+
1
(
i
)
=
b
i
(
o
t
+
1
)
⋅
m
a
x
j
(
α
j
i
⋅
δ
t
(
j
)
)
ψ
t
+
1
(
i
)
=
argmax
j
(
δ
t
(
j
)
⋅
a
j
i
)
begin{aligned} delta_{i+1}(i) &= b_i(o_{t+1})cdot underset{j}{max}left( alpha_{ji}cdot delta_t(j)right) \ psi_{t+1}(i) &= underset{j}{operatorname{argmax}}(delta_{t}(j)cdot a_{ji}) end{aligned}
δi+1(i)ψt+1(i)=bi(ot+1)⋅jmax(αji⋅δt(j))=jargmax(δt(j)⋅aji)
(3)终止
P
∗
=
m
a
x
1
⩽
i
⩽
N
δ
T
(
i
)
i
T
∗
=
a
r
g
m
a
x
1
⩽
i
⩽
N
[
δ
T
(
i
)
]
begin{array}{c} P^{*}=underset{{1 leqslant i leqslant N}}{max} delta_{T}(i) \ i_{T}^{*}=underset{{1 leqslant i leqslant N}}{argmax}left[delta_{T}(i)right] end{array}
P∗=1⩽i⩽NmaxδT(i)iT∗=1⩽i⩽Nargmax[δT(i)]
(4)回溯(对
t
t
t 从
T
−
1
,
T
−
2
,
.
.
.
,
1
T-1,T-2,...,1
T−1,T−2,...,1)
i
t
∗
=
a
r
g
m
a
x
1
⩽
i
⩽
N
[
δ
t
(
i
)
]
i_t^* = underset{{1 leqslant i leqslant N}}{argmax}left[delta_{t}(i)right]
it∗=1⩽i⩽Nargmax[δt(i)]
import numpy as np
pi= [0.25,0.25,0.25,0.25]
A = [
[0, 1, 0, 0 ],
[0.4,0,0.6,0],
[0,0.4,0,0.6],
[0,0,0.5,0.5]
]
B = [
[0.5,0.5],
[0.3,0.7],
[0.6,0.4],
[0.8,0.2]
]
定义模型
class Model:
def __init__(self,pi,A,B) -> None:
self.pi = pi
self.A = A
self.B = B
self.N = len(A)
self.M = len(B[0])
def decode(self,O):
path = np.zeros(shape=(len(O),self.N),dtype=int)
delta = np.zeros(shape=(self.N,))
delta_cp = np.zeros(shape=(self.N,))
for i in range(self.N):
delta[i] = self.B[i][O[0]]*self.pi[i]
for t in range(1,len(O)):
for i in range(self.N):
max_val = 0
max_index = 0
for j in range(self.N):
temp = self.A[j][i]*delta[j]
if temp>max_val:
max_index = j
max_val = temp
path[t][i] = max_index
delta_cp[i] = self.B[i][O[t]]*max_val
delta = delta_cp.copy()
p_star = np.max(delta)
I = []
index = np.argmax(delta)
for t in reversed(range(0,len(O))):
I.insert(0,index)
index = path[t][index]
return p_star,I
解码
model = Model(pi,A,B) I,O = generate(5) print(I) print(O) model.decode(O)
代码有问题,待修改…



