算法导论32.5 Exercises 答案
32.5-1
如下两表,是分别执行完COMPUTE-SUFFIX-ARRAY的第2-7行和第8行后的结果。
$\begin{array}{ccccl}
i&left\text{-}rank&right\text{-}rank&index&\text{substring}\\hline
1&8&9&1&\texttt{hi}\
2&9&16&2&\texttt{ip}\
3&16&16&3&\texttt{pp}\
4&16&9&4&\texttt{pi}\
5&9&20&5&\texttt{it}\
6&20&25&6&\texttt{ty}\
7&25&8&7&\texttt{yh}\
8&8&15&8&\texttt{ho}\
9&15&16&9&\texttt{op}\
10&16&16&10&\texttt{pp}\
11&16&9&11&\texttt{pi}\
12&9&20&12&\texttt{it}\
13&20&25&13&\texttt{ty}\
14&25&0&14&\texttt{y}\
\end{array}
\qquad
\begin{array}{ccccl}
i&left\text{-}rank&right\text{-}rank&index&\text{substring}\\hline
1&8&9&1&\texttt{hi}\
2&8&15&8&\texttt{ho}\
3&9&16&2&\texttt{ip}\
4&9&20&5&\texttt{it}\
5&9&20&12&\texttt{it}\
6&15&16&9&\texttt{op}\
7&16&9&4&\texttt{pi}\
8&16&9&11&\texttt{pi}\
9&16&16&3&\texttt{pp}\
10&16&16&10&\texttt{pp}\
11&20&25&6&\texttt{ty}\
12&20&25&13&\texttt{ty}\
13&25&0&14&\texttt{y}\
14&25&8&7&\texttt{yh}\
\end{array}$
当$l=2$时,分别执行完COMPUTE-SUFFIX-ARRAY的第11行,第12-17行和第18行的结果如下三个表所示。
$\begin{array}{ccl}
i&rank&\text{substring}\\hline
1&1&\texttt{hi}\
2&3&\texttt{ip}\
3&7&\texttt{pp}\
4&6&\texttt{pi}\
5&4&\texttt{it}\
6&8&\texttt{ty}\
7&10&\texttt{yh}\
8&2&\texttt{ho}\
9&5&\texttt{op}\
10&7&\texttt{pp}\
11&6&\texttt{pi}\
12&4&\texttt{it}\
13&8&\texttt{ty}\
14&9&\texttt{y}\
\end{array}\qquad
\begin{array}{ccccl}
i&left\text{-}rank&right\text{-}rank&index&\text{substring}\\hline
1&1&7&1&\texttt{hipp}\
2&3&6&2&\texttt{ippi}\
3&7&4&3&\texttt{ppit}\
4&6&8&4&\texttt{pity}\
5&4&10&5&\texttt{ityh}\
6&8&2&6&\texttt{tyho}\
7&10&5&7&\texttt{yhop}\
8&2&7&8&\texttt{hopp}\
9&5&6&9&\texttt{oppi}\
10&7&4&10&\texttt{ppit}\
11&6&8&11&\texttt{pity}\
12&4&9&12&\texttt{ity}\
13&8&0&13&\texttt{ty}\
14&9&0&14&\texttt{y}\
\end{array}\qquad
\begin{array}{ccccl}
i&left\text{-}rank&right\text{-}rank&index&\text{substring}\\hline
1&1&7&1&\texttt{hipp}\
2&2&7&8&\texttt{hopp}\
3&3&6&2&\texttt{ippi}\
4&4&9&12&\texttt{ity}\
5&4&10&5&\texttt{ityh}\
6&5&6&9&\texttt{oppi}\
7&6&8&4&\texttt{pity}\
8&6&8&11&\texttt{pity}\
9&7&4&3&\texttt{ppit}\
10&7&4&10&\texttt{ppit}\
11&8&0&13&\texttt{ty}\
12&8&2&6&\texttt{tyho}\
13&9&0&14&\texttt{y}\
14&10&5&7&\texttt{yhop}\
\end{array}$
当$l=4$时,分别执行完COMPUTE-SUFFIX-ARRAY的第11行,第12-17行和第18行的结果如下三个表所示。
$\begin{array}{ccl}
i&rank&\text{substring}\\hline
1&1&\texttt{hipp}\
2&3&\texttt{ippi}\
3&8&\texttt{ppit}\
4&7&\texttt{pity}\
5&5&\texttt{ityh}\
6&10&\texttt{tyho}\
7&12&\texttt{yhop}\
8&2&\texttt{hopp}\
9&6&\texttt{oppi}\
10&8&\texttt{ppit}\
11&7&\texttt{pity}\
12&4&\texttt{ity}\
13&9&\texttt{ty}\
14&11&\texttt{y}\
\end{array}\qquad
\begin{array}{ccccl}
i&left\text{-}rank&right\text{-}rank&index&\text{substring}\\hline
1&1&5&1&\texttt{hippityh}\
2&3&10&2&\texttt{ippityho}\
3&8&12&3&\texttt{ppityhop}\
4&7&2&4&\texttt{pityhopp}\
5&5&6&5&\texttt{ityhoppi}\
6&10&8&6&\texttt{tyhoppit}\
7&12&7&7&\texttt{yhoppity}\
8&2&4&8&\texttt{hoppity}\
9&6&9&9&\texttt{oppity}\
10&8&11&10&\texttt{ppity}\
11&7&0&11&\texttt{pity}\
12&4&0&12&\texttt{ity}\
13&9&0&13&\texttt{ty}\
14&11&0&14&\texttt{y}\
\end{array}\qquad
\begin{array}{ccccl}
i&left\text{-}rank&right\text{-}rank&index&\text{substring}\\hline
1&1&5&1&\texttt{hippityh}\
2&2&4&8&\texttt{hoppity}\
3&3&10&2&\texttt{ippityho}\
4&4&0&12&\texttt{ity}\
5&5&6&5&\texttt{ityhoppi}\
6&6&9&9&\texttt{oppity}\
7&7&0&11&\texttt{pity}\
8&7&2&4&\texttt{pityhopp}\
9&8&11&10&\texttt{ppity}\
10&8&12&3&\texttt{ppityhop}\
11&9&0&13&\texttt{ty}\
12&10&8&6&\texttt{tyhoppit}\
13&11&0&14&\texttt{y}\
14&12&7&7&\texttt{yhoppity}\
\end{array}$
当$l=8$时,分别执行完COMPUTE-SUFFIX-ARRAY的第11行,第12-17行和第18行的结果如下三个表所示。
$\begin{array}{ccl}
i&rank&\text{substring}\\hline
1&1&\texttt{hippityh}\
2&3&\texttt{ippityho}\
3&10&\texttt{ppityhop}\
4&8&\texttt{pityhopp}\
5&5&\texttt{ityhoppi}\
6&12&\texttt{tyhoppit}\
7&14&\texttt{yhoppity}\
8&2&\texttt{hoppity}\
9&6&\texttt{oppity}\
10&9&\texttt{ppity}\
11&7&\texttt{pity}\
12&4&\texttt{ity}\
13&11&\texttt{ty}\
14&13&\texttt{y}\
\end{array}\qquad\begin{array}{ccccl}
i&left\text{-}rank&right\text{-}rank&index&\text{substring}\\hline
1&1&6&1&\texttt{hippityhoppity}\
2&3&9&2&\texttt{ippityhoppity}\
3&10&7&3&\texttt{ppityhoppity}\
4&8&4&4&\texttt{pityhoppity}\
5&5&11&5&\texttt{ityhoppity}\
6&12&13&6&\texttt{tyhoppity}\
7&14&0&7&\texttt{yhoppity}\
8&2&0&8&\texttt{hoppity}\
9&6&0&9&\texttt{oppity}\
10&9&0&10&\texttt{ppity}\
11&7&0&11&\texttt{pity}\
12&4&0&12&\texttt{ity}\
13&11&0&13&\texttt{ty}\
14&13&0&14&\texttt{y}\
\end{array}\qquad\begin{array}{ccccl}
i&left\text{-}rank&right\text{-}rank&index&\text{substring}\\hline
1&1&6&1&\texttt{hippityhoppity}\
2&2&0&8&\texttt{hoppity}\
3&3&9&2&\texttt{ippityhoppity}\
4&4&0&12&\texttt{ity}\
5&5&11&5&\texttt{ityhoppity}\
6&6&0&9&\texttt{oppity}\
7&7&0&11&\texttt{pity}\
8&8&4&4&\texttt{pityhoppity}\
9&9&0&10&\texttt{ppity}\
10&10&7&3&\texttt{ppityhoppity}\
11&11&0&13&\texttt{ty}\
12&12&13&6&\texttt{tyhoppity}\
13&13&0&14&\texttt{y}\
14&14&0&7&\texttt{yhoppity}\
\end{array}$
因此,得到的$SA$数组为$[1,8,2,12,5,9,11,4,10,3,13,6,14,7]$。
最终,按序填入$LCP$数组的顺序如下表所示:
$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|c|}\hline
i&1&2&3&4&5&6&7&8&9&10&11&12&13&14\\hline
s&1&3&10&8&5&12&14&2&6&9&7&4&11&13\\hline
LCP[i]&0&0&5&4&3&2&1&1&0&1&0&1&0&0\\hline
\end{array}$
因此,得到的$LCP$数组为$[0,1,0,1,3,0,0,4,1,5,0,2,0,1]$。
32.5-2
基本思想是,如果不需要$\lfloor\lg n\rfloor-1$次while循环就能够区分出所有后缀的排名,那么while循环可以提前终止。
具体细节是,在MAKE-RANKS的过程中,如果最后一名的后缀的排名已经恰好达到了$n$,那么可以终止while循环。修改狗的算法由COMPUTE-SUFFIX-ARRAY'给出。
1 | COMPUTE-SUFFIX-ARRAY'(T, n) |
32.5-3
使用一个不曾出现在$T_1,T_2$中的字符$\texttt{@}$,并将其和$T_1,T_2$拼接起来,得到$T=T_1\texttt{@}T_2$,其长度为$n=n_1+n_2+1$。对字符串$T$求出它的SA数组和LCP数组后,枚举$T$中每对排名相邻的后缀$SA[i],SA[i-1]$。如果这两个后缀来自$T$的不同部分(也就是其中一个来自$T_1$,另一个来自$T_2$)那么说明这两个后缀的最长公共前缀为$T_1,T_2$这两个字符串的子串之一,由于$\texttt{@}$不在$T_1$中,因此可以确保$LCP$不会恰好经过$\texttt{@}$。具体过程由程序LONGEST-COMMON-SUBSTRINGS给出。
1 | LONGEST-COMMON-SUBSTRINGS(T1, T2, n1, n2) |
32.5-4
这里首先需要解释一下这个约束$SA[i-1]=n’-SA[i]-LCP[i]+2$是怎么来的:
首先,$T$中有一个子串$[l,r]$是回文串,那么在$T’$这个字符串中,其在后半段对应的位置中是$[l’,r’]$。这意味着$r-l+1=r’-l’+1=\text{len}$,由于$T’$的最后$n$个字符是由$T$反转而来,并且字符$\texttt{@}$处在下标$\dfrac{n’+1}{2}$中,因此有$\dfrac{n’+1}{2}-r=l’-\dfrac{n’+1}{2}$。最终联立上面两个式子可以得到$n’+1=l’+l+\text{len}-1$,更进一步可以得到$l’=n’-l-\text{len}+2$,和上面给定的约束很像。
因此,这个求解最长回文子串的算法的错误在于,它认为后缀$T[l:]$和$T[l’:]$在$SA$数组中是相邻的,然而并非如此。
考虑字符串$T=\texttt{abbcabb}$,那么有$T’=\texttt{abbcabb@bbacbba}$。等算法结束后,我们可以得到关于这个字符串的表格:
$\begin{array}{cclc}
i&SA[i]&\text{substring}&LCP[i]&SA[i-1]=n’-SA[i]-LCP[i]+2\\hline
1&8&\texttt{@bbacbba}&0&\text{No}\
2&15&\texttt{a}&0&\text{No}\
3&5&\texttt{abb@bbacbba}&1&\text{No}\
4&1&\texttt{abbcabb@bbacbba}&3&\text{No}\
5&11&\texttt{acbba}&1&\text{No}\
6&7&\texttt{b@bbacbba}&0&\text{No}\
7&14&\texttt{ba}&1&\text{No}\
8&10&\texttt{bacbba}&2&\text{No}\
9&6&\texttt{bb@bbacbba}&1&\text{Yes}\
10&13&\texttt{bba}&2&\text{No}\
11&9&\texttt{bbacbba}&3&\text{No}\
12&2&\texttt{bbcabb@bbacbba}&2&\text{No}\
13&3&\texttt{bcabb@bbacbba}&1&\text{No}\
14&4&\texttt{cabb@bbacbba}&0&\text{No}\
15&12&\texttt{cbba}&1&\text{Yes}\
\end{array}$
由表格可知这个算法的输出结果为$1$,但是实际上答案为$2$。$T$一共有两个子串$\texttt{bb}$满足答案。其中一对$\texttt{bb}$及其在后半段的反转分别是$T’[2:3],T’[13:14]$,但是在$SA$数组中,$2$和$13$不相邻。同样的,另一对$\texttt{bb}$则及其反转分别是在$T’[6:7],T’[9:10]$,但是在$SA$数组中,$6$和$10$不相邻。
最终是因为约束$SA[i-1]=n’-SA[i]-LCP[i]+2$认为后缀$T[l:]$和$T[l’:]$在$SA$数组中是相邻的,导致了错误。因此这个约束不正确。