Project Euler 419

Project Euler 419

题目

Look and say sequence

The look and say sequence goes \(1, 11, 21, 1211, 111221, 312211, 13112221, 1113213211, \dots\)

The sequence starts with \(1\) and all other members are obtained by describing the previous member in terms of consecutive digits.

It helps to do this out loud:

\(1\) is ‘one one’ → \(11\)

\(11\) is ‘two ones’ → \(21\)

\(21\) is ‘one two and one one’ → \(1211\)

\(1211\) is ‘one one, one two and two ones’ → \(111221\)

\(111221\) is ‘three ones, two twos and one one’ → \(312211\)

\(\dots\)

Define \(A(n), B(n)\) and \(C(n)\) as the number of ones, twos and threes in the \(n\)’th element of the sequence respectively.

One can verify that \(A(40) = 31254, B(40) = 20259\) and \(C(40) = 11625\).

Find \(A(n), B(n)\) and \(C(n)\) for \(n = 10^{12}\).

Give your answer modulo \(2^{30}\) and separate your values for \(A, B\) and \(C\) by a comma.

E.g. for \(n = 40\) the answer would be \(31254,20259,11625\)

解决方案

Cosmological 定理 说明:当外观数列发展到足够大时,序列会“衰变”为若干个互不干扰的 原子串(atomic elements) 的拼接。论文介绍了这些原子串集合;这些原子来自一个有限集合(在只含数字 \(1,2,3\) 的情形下共有 \(92\) 个)。

换句话说,对足够大的 \(n\),存在唯一的分解:\(a_n = E_{i_1}E_{i_2}\cdots E_{i_t},\)其中每个 \(E_i\) 都是固定的原子串,并且下一步外观变换满足:\(a_{n+1}=s(a_n)=s(E_{i_1})s(E_{i_2})\cdots s(E_{i_t}),\)也就是说原子之间不会在边界发生混合。

因此从某一项开始,整个问题转化为:追踪各原子出现次数的线性演化。

\(92\) 个原子编号为:\(E_0,E_1,\dots,E_{91}.\)

定义第 \(n\) 项的原子出现次数向量(列向量)\(\mathbf{v}_n\in\mathbb Z_{\ge 0}^{92}\),其中\((\mathbf{v}_n)_i\)表示在\(a_n\)的分解中\(E_i\)出现次数.

对每个原子 \(E_j\),它在一次外观变换后会衰变为若干原子拼接:\(s(E_j)=E_{t_1}E_{t_2}\cdots E_{t_r}.\)

于是存在一个固定的 \(92\times 92\) 转移矩阵 \(T\),满足\(\mathbf{v}_{n+1}=T\mathbf{v}_n,\)并且矩阵元素\(T_{i,j}\)含义为:在\(s(E_j)\)中原子\(E_i\)的出现次数。从而得到闭式:\(\mathbf{v}_n=T^{n-n_0}\mathbf{v}_{n_0}.\)

对每个原子 \(E_i\),预处理它内部数字 \(1,2,3\) 的出现次数:\(\mathbf{d}_i=(d_{i,1},d_{i,2},d_{i,3}),\)其中\(d_{i,j},j\in\{1,2,3\}\)表示\(j\)在原子\(E_i\)出现的次数。

则第 \(n\) 项中数字统计为

\[ \begin{pmatrix} A(n)\\B(n)\\C(n) \end{pmatrix} =\sum_{i=0}^{91}(\mathbf{v}_n)_i \begin{pmatrix} d_{i,1}\\d_{i,2}\\d_{i,3} \end{pmatrix}. \]

外观数列前几项为:\(1,11,21,1211,111221,312211,13112221,1113213211,\dots\),从已知的原子分解表可得到第 \(8\) 项(即 \(a_8\))的原子分解为两种原子之和:\(a_8=E_{\text{Hf}},E_{\text{Sn}}.\)

在编号体系下对应两个下标(用 \(0\) 开始计数):\(\text{Hf}=23,\text{Sn}=38.\)因此初始向量为:\((\mathbf{v}_8)_{23}=1,(\mathbf{v}_8)_{38}=1\)\(\mathbf{v}_8\)的其余位置是\(0\)。对于目标 \(n=10^{12}\),于是\(\mathbf{v}_{n}=T^{n-8}\mathbf{v}_8.\)

综上即可计算出\(A(n),B(n),C(n)\)的值。

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
import numpy as np

M = 10 ** 12
MOD = 1 << 30
MASK = MOD - 1

strings = [
"1112",
"1112133",
"111213322112",
"111213322113",
"1113",
"11131",
"111311222112",
"111312",
"11131221",
"1113122112",
"1113122113",
"11131221131112",
"111312211312",
"11131221131211",
"111312211312113211",
"111312211312113221133211322112211213322112",
"111312211312113221133211322112211213322113",
"11131221131211322113322112",
"11131221133112",
"1113122113322113111221131221",
"11131221222112",
"111312212221121123222112",
"111312212221121123222113",
"11132",
"1113222",
"1113222112",
"1113222113",
"11133112",
"12",
"123222112",
"123222113",
"12322211331222113112211",
"13",
"131112",
"13112221133211322112211213322112",
"13112221133211322112211213322113",
"13122112",
"132",
"13211",
"132112",
"1321122112",
"132112211213322112",
"132112211213322113",
"132113",
"1321131112",
"13211312",
"1321132",
"13211321",
"132113212221",
"13211321222113222112",
"1321132122211322212221121123222112",
"1321132122211322212221121123222113",
"13211322211312113211",
"1321133112",
"1322112",
"1322113",
"13221133112",
"1322113312211",
"132211331222113112211",
"13221133122211332",
"22",
"3",
"3112",
"3112112",
"31121123222112",
"31121123222113",
"3112221",
"3113",
"311311",
"31131112",
"3113112211",
"3113112211322112",
"3113112211322112211213322112",
"3113112211322112211213322113",
"311311222",
"311311222112",
"311311222113",
"3113112221131112",
"311311222113111221",
"311311222113111221131221",
"31131122211311122113222",
"3113112221133112",
"311312",
"31132",
"311322113212221",
"311332",
"3113322112",
"3113322113",
"312",
"312211322212221121123222113",
"312211322212221121123222112",
"32112",
]

trans = [
[63],
[64, 62],
[65],
[66],
[68],
[69],
[84, 55],
[70],
[71],
[76],
[77],
[82],
[78],
[79],
[80],
[81, 29, 91],
[81, 29, 90],
[81, 30],
[75, 29, 92],
[75, 32],
[72],
[73],
[74],
[83],
[86],
[87],
[88],
[89, 92],
[1],
[3],
[4],
[2, 61, 29, 85],
[5],
[28],
[24, 33, 61, 29, 91],
[24, 33, 61, 29, 90],
[7],
[8],
[9],
[10],
[21],
[22],
[23],
[11],
[19],
[12],
[13],
[14],
[15],
[18],
[16],
[17],
[20],
[6, 61, 29, 92],
[26],
[27],
[25, 29, 92],
[25, 29, 67],
[25, 29, 85],
[25, 29, 68, 61, 29, 89],
[61],
[33],
[40],
[41],
[42],
[43],
[38, 39],
[44],
[48],
[54],
[49],
[50],
[51],
[52],
[47, 38],
[47, 55],
[47, 56],
[47, 57],
[47, 58],
[47, 59],
[47, 60],
[47, 33, 61, 29, 92],
[45],
[46],
[53],
[38, 29, 89],
[38, 30],
[38, 31],
[34],
[36],
[35],
[37],
]

N = 92
mat = np.zeros((N, N), dtype=np.uint64)
for i, outs in enumerate(trans):
for d in outs:
mat[i, d - 1] += 1

cnt = np.zeros((N, 3), dtype=np.uint64)
for i, s in enumerate(strings):
cnt[i, 0] = s.count("1")
cnt[i, 1] = s.count("2")
cnt[i, 2] = s.count("3")

def mat_mul(a, b):
return (a @ b) & MASK

def mat_pow(a, e):
r = np.identity(N, dtype=np.uint64)
b = a.copy()
while e:
if e & 1:
r = mat_mul(r, b)
e >>= 1
if e:
b = mat_mul(b, b)
return r

def solve(M):
init = np.zeros((1, N), dtype=np.uint64)
init[0, 23] = 1
init[0, 38] = 1
p = M - 8
mp = mat_pow(mat, p)
out = (init @ mp @ cnt) & MASK
return int(out[0, 0]), int(out[0, 1]), int(out[0, 2])

print(",".join(map(str, solve(M))))