Project Euler 829

Project Euler 829

题目

Integral Fusion

Given any integer \(n>1\) a binary factor tree \(T(n)\) is defined to be:

  • A tree with the single node \(n\) when \(n\) is prime.
  • A binary tree that has root node \(n\), left subtree \(T(a)\) and right subtree \(T(b)\), when \(n\) is not prime. Here \(a\) and \(b\) are positive integers such that \(n = ab\), \(a\le b\) and \(b-a\) is the smallest.

For example \(T(20)\):

We define \(M(n)\) to be the smallest number that has a factor tree identical in shape to the factor tree for \(n!!\), the double factorial of \(n\).

For example, consider \(9!! = 9\times 7\times 5\times 3\times 1 = 945\). The factor tree for \(945\) is shown below together with the factor tree for \(72\) which is the smallest number that has a factor tree of the same shape. Hence \(M(9) = 72\).

Find \(\displaystyle\sum_{n=2}^{31} M(n)\).

解决方案

\(N=31\)。对任意合数 \(x\),题目要求在所有 \(x=ab\)\(a\le b\))的分解中选择使 \(b-a\) 最小的一组。记\(f(x)=\max\{d: d\mid x,d\le \sqrt{x}\}.\)也就是 \(x\)\(\sqrt{x}\) 以下最大的因子。

下面证明:题目的分解必然取 \((a,b)=(f(x),x/f(x))\)

对每个 \(d\mid x\)\(d\le \sqrt{x}\),对应的配对因子是 \(x/d\ge \sqrt{x}\),差值为\(\Delta(d)=\frac{x}{d}-d.\)\(d>0\) 上,\(\Delta'(d)=-\frac{x}{d^2}-1<0,\)因此 \(\Delta(d)\)\(d\) 的增大严格下降。所以在所有可行的 \(d\le \sqrt{x}\) 中,差值最小当且仅当 \(d\) 最大,即 \(d=f(x)\)

结论:对任意合数 \(x\),因子树根节点的划分唯一确定为\(x = f(x)\cdot \dfrac{x}{f(x)}.\)

由于我们只关心树形,不关心每个结点的数值。最方便的做法是把形状递归编码为一个类型:叶子(素数结点)形状记为 \(0\);非叶子结点形状记为有序对 \((L,R)\),其中 \(L\)\(R\) 分别是左右子树形状。这样,两个数的因子树形状相同,当且仅当它们得到的形状编码完全一致。

设某个形状为 \(S\),那么:

  • \(S\) 是叶子,则所有满足该形状的数只能是素数;
  • \(S=(S_L,S_R)\),那么某个整数 \(X\) 的根要拆成 \(X=uv\),并且必须满足:\(T(u)\) 的形状是 \(S_L\)\(T(v)\) 的形状是 \(S_R\)\(u\le v\);根的拆分必须由规则选中,即\(u=f(X).\)

因此,对形状 \(S=(S_L,S_R)\) 的所有候选根值集合可以写成 \(\{uv \mid u\in \mathcal{A}(S_L),v\in \mathcal{A}(S_R),u\le v,f(uv)=u\},\) 其中 \(\mathcal{A}(S)\) 表示形状为 \(S\) 的所有可能取值集合。

我们要求的是每个 \(n\) 的根形状 \(S(n!!)\) 的最小可行值,即\(M(n) = \min\{\mathcal{A}(S(n!!))\}.\)

直接生成 \(\mathcal{A}(S)\) 的全部元素不可行,因此只生成从小到大的前若干项,直到拿到最小值即可。基本思想是用优先队列按从小到大“流式生成”候选值。

因此,对每个形状 \(S\),维护一个生成器流按升序输出合法的候选数。叶子流会按升序输出素数 \(2,3,5,\dots\);那么对于内部形状 \(S=(S_L,S_R)\): * 左右子形状分别有升序流 \(\mathcal{S}(S_L),\mathcal{S}(S_R)\); * 候选根值来自所有乘积 \(u_i v_j\)(类似一个乘法矩阵),用最小堆做多路归并; * 每弹出最小乘积就测试是否满足 \(f(u_iv_j)=u_i\),满足则作为该形状的下一个输出。

这相当于按值域最小优先地搜索每个形状下可行的根值。

设当前试图用 \((u,v)\) 作为根的左右因子,其中 \(p\)\(v\) 的最小质因子。若\(v\ge u p^2,\)\(up \le \sqrt{uv}\),并且 \(up\mid uv\)\(up>u\),这说明 \(u\) 不可能是 \(\sqrt{uv}\) 以下最大的因子,从而不可能有 \(f(uv)=u\),可以直接跳过。这个剪枝能显著减少真正需要计算 \(f(uv)\) 的次数。

可见,\(N!!\)的因子个数不大,于是可以递归枚举全部约数,并取其中满足 \(d\le\sqrt{x}\) 的最大者,即得到 \(f(x)\)。再配合缓存即可非常快。

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
from functools import lru_cache
from heapq import heappop, heappush

from tools import int_sqrt, Factorizer

N = 31
def sieve(n: int):
bs = [True] * (n + 1)
bs[0] = bs[1] = False
for i in range(2, int(n**0.5) + 1):
if bs[i]:
for j in range(i * i, n + 1, i):
bs[j] = False
return [i for i, v in enumerate(bs) if v]


factorizer = Factorizer(N)
leaf_primes = factorizer.primes


def double_factorial_fac(n: int):
fac = {}
for k in range(n,1,-2):
for p,e in factorizer.factor_small(k):
fac[p] = fac.get(p,0)+e
return fac


def fac_to_tuple(fac: dict):
return tuple(sorted(fac.items()))


def compute_val_from_fac(fac: dict):
v = 1
for p, e in fac.items():
v *= p**e
return v


@lru_cache(maxsize=None)
def psr_from_fac(val: int, fac_tup: tuple):
fac = dict(fac_tup)
lim = int(int_sqrt(val))
items = list(fac.items())
best = 1

def dfs(i: int, cur: int):
nonlocal best
if cur > lim:
return
if i == len(items):
if cur > best:
best = cur
return
p, e = items[i]
v = 1
for _ in range(e + 1):
dfs(i + 1, cur * v)
v *= p

dfs(0, 1)
return best


shape_map = {None: 0}
shape_children = {0: None}


@lru_cache(maxsize=None)
def shape_id_from_fac(val: int, fac_tup: tuple):
fac = dict(fac_tup)
if len(fac) == 1:
(p, e), = fac.items()
if e == 1:
return 0

a = psr_from_fac(val, fac_tup)
b = val // a

fa = {}
x = a
for p in sorted(fac.keys()):
if x == 1:
break
ee = 0
while ee < fac[p] and x % p == 0:
x //= p
ee += 1
if ee:
fa[p] = ee

fb = {}
for p, e in fac.items():
r = e - fa.get(p, 0)
if r:
fb[p] = r

ida = shape_id_from_fac(a, fac_to_tuple(fa))
idb = shape_id_from_fac(b, fac_to_tuple(fb))
key = (ida, idb)

if key not in shape_map:
shape_map[key] = len(shape_map)
shape_children[shape_map[key]] = key
return shape_map[key]


def traverse(val: int, fac: dict, limits: dict):
sid = shape_id_from_fac(val, fac_to_tuple(fac))
limits[sid] = max(limits.get(sid, 0), val)
ch = shape_children[sid]
if ch is None:
return

a = psr_from_fac(val, fac_to_tuple(fac))
b = val // a

fa = {}
x = a
for p in sorted(fac.keys()):
if x == 1:
break
ee = 0
while ee < fac[p] and x % p == 0:
x //= p
ee += 1
if ee:
fa[p] = ee

fb = {}
for p, e in fac.items():
r = e - fa.get(p, 0)
if r:
fb[p] = r

traverse(a, fa, limits)
traverse(b, fb, limits)


@lru_cache(maxsize=None)
def merge_fac_tup(a: tuple, b: tuple):
da = dict(a)
for p, e in b:
da[p] = da.get(p, 0) + e
return tuple(sorted(da.items()))


class Stream:
def __init__(self, sid: int, streams: dict, limits: dict):
self.sid = sid
self.streams = streams
self.ch = shape_children[sid]
self.limit = limits.get(sid, 0)
self.vals = []
self.facs = []
self.smpfs = []
self.heap = []
self.seen = set()
self.row_start = {}
self.inited = False

if self.ch is None:
for p in leaf_primes:
if self.limit and p > self.limit:
break
self.vals.append(p)
self.facs.append(((p, 1),))
self.smpfs.append(p)

def at(self, idx: int):
if self.ch is None:
if idx < len(self.vals):
return self.vals[idx], self.facs[idx], self.smpfs[idx]
return None
while idx >= len(self.vals):
if not self._gen_next():
return None
return self.vals[idx], self.facs[idx], self.smpfs[idx]

def _push(self, lidx: int, ridx: int):
key = (lidx, ridx)
if key in self.seen:
return
L, R = self.ch
l = self.streams[L].at(lidx)
r = self.streams[R].at(ridx)
if l is None or r is None:
return
lv, lf, ls = l
rv, rf, rs = r
if lv > rv:
return
prod = lv * rv
if self.limit and prod > self.limit:
return
self.seen.add(key)
heappush(self.heap, (prod, lidx, ridx))

def _start_row(self, l_idx: int):
L, R = self.ch
l = self.streams[L].at(l_idx)
if l is None:
return
lv, _, _ = l
r_idx = 0
while True:
r = self.streams[R].at(r_idx)
if r is None:
return
rv, _, _ = r
if rv >= lv:
break
r_idx += 1
if L == R and r_idx < l_idx:
r_idx = l_idx
self.row_start[l_idx] = r_idx
self._push(l_idx, r_idx)

def _ensure_init(self):
if self.inited:
return
self.inited = True
if self.streams[self.ch[0]].at(0) is None or self.streams[self.ch[1]].at(0) is None:
return
self._start_row(0)

def _gen_next(self):
self._ensure_init()
if not self.heap:
return False

prod, lidx, ridx = heappop(self.heap)
L, R = self.ch
l = self.streams[L].at(lidx)
r = self.streams[R].at(ridx)
if l is None or r is None:
return False
lv, lf, ls = l
rv, rf, rs = r

self._push(lidx, ridx + 1)
if ridx == self.row_start.get(lidx, None):
self._start_row(lidx + 1)

if rv >= lv * (rs * rs):
return True

ft = merge_fac_tup(lf, rf)
if psr_from_fac(prod, ft) != lv:
return True

self.vals.append(prod)
self.facs.append(ft)
self.smpfs.append(ls if ls < rs else rs)
return True


limits = {}
root_shape = {}

for n in range(2, N + 1):
fac = double_factorial_fac(n)
val = compute_val_from_fac(fac)
traverse(val, fac, limits)
root_shape[n] = shape_id_from_fac(val, fac_to_tuple(fac))

streams = {}
for sid in shape_children:
streams[sid] = Stream(sid, streams, limits)

ans = 0
for n in range(2, N + 1):
sid = root_shape[n]
ans += streams[sid].at(0)[0]
print(ans)