Project Euler 415

Project Euler 415

题目

Titanic sets

A set of lattice points \(S\) is called a titanic set if there exists a line passing through exactly two points in \(S\).

An example of a titanic set is \(S = \{(0, 0), (0, 1), (0, 2), (1, 1), (2, 0), (1, 0)\}\), where the line passing through \((0, 1)\) and \((2, 0)\) does not pass through any other point in \(S\).

On the other hand, the set \(\{(0, 0), (1, 1), (2, 2), (4, 4)\}\) is not a titanic set since the line passing through any two points in the set also passes through the other two.

For any positive integer \(N\), let \(T(N)\) be the number of titanic sets \(S\) whose every point \((x, y)\) satisfies \(0 \le x, y \le N\).

It can be verified that \(T(1) = 11, T(2) = 494, T(4) = 33554178, T(111) \bmod 10^8 = 13500401\) and \(T(10^5)\bmod10^8 = 63259062\).

Find \(T(10^{11})\bmod10^8\).

解决方案

在整数格点集合\(\mathcal{G}_N=\{(x,y)\mid 0\le x,y\le N\},\)称点集 \(S\subseteq\mathcal{G}_N\)titanic set,若存在一条直线 恰好穿过 \(S\) 中两个点

若一个点集 \(S\) 不是 titanic,则对任意两点所确定的直线都至少包含第三个点,即:

  • \(|S|\le 1\),显然不是 titanic;
  • \(|S|\ge 2\),则“任意两点直线都包含第三点”意味着所有点必须 共线(这可以视为 Sylvester–Gallai 定理在整数格点上的一个直接推论:只要不全共线,就一定存在一条普通线只过两点)。

因此,非 titanic 集合 只有两类:空集或单点集;至少 \(3\) 个点且全部共线的集合。

于是我们只需数出“非 titanic”的数量 \(U(N)\),则\(T(N)=2^{(N+1)^2}-U(N).\)

可见,空集、单点集的数量一共有\(1+(N+1)^2\)个。

每一条水平线含 \(N+1\) 点,总共有 \(N+1\) 条;垂直线同理,所以共有 \(2(N+1)\) 条。

在一条含 \(m=N+1\) 点的直线上,选取至少 \(3\) 个点的子集数为\(g(m)=2^m-\dbinom m0-\dbinom m1-\dbinom m2=2^m-1-m-\dfrac{m(m-1)}2.\)因此贡献:\(2(N+1)g(N+1).\)

斜率 \(+1\)(主对角)有 \(1\) 条长为 \(N+1\),其余平行线长度从 \(1\)\(N\) 各出现 \(2\) 次。斜率 \(-1\) 同理。 因此把 \(\pm 1\) 的所有对角线(两种方向)合计的“长度 \(\ge 3\) 子集数”可以写成:\(\displaystyle{2g(N+1)+4\sum_{k=3}^{N}g(k).}\)

我们令\(\displaystyle{G(N)=\sum_{k=3}^{N}g(k)=2^{N+1}-\frac{N(N+1)(N+2)}6-N-2(N\ge 3).}\)那么综合起来,上面这块简单部分可记为

现在只剩斜率既非 \(0\)\(\infty\)\(\pm1\) 的直线。它们可用 原始方向向量 \((a,b)\) 描述:设 \(a>b\ge 1,\gcd(a,b)=1\),方向向量为 \((a,b)\);若再乘以整数 \(k\ge 2\),得到位移 \((ka,kb)\),对应线段端点差。

当端点差为 \((ka,kb)\)\(\gcd(a,b)=1\) 时,线段上整点数量为\(k+1\)(每走一步 \((a,b)\) 到下一格点,共走 \(k\) 步)。

我们需要统计:在一条包含 \(k+1\) 个格点的直线上,能够形成“至少 \(3\) 点共线集合”的选择数,并且 要避免重复计数(同一集合可能对应多个端点差)。

为处理重复,考虑一条直线上 连续\(k+1\) 个格点(记为 \(0,1,\dots,k\)),我们定义 \(p(k)\) 为其中所有非法点集 \(S\) 的数量,要求:\(|S|\ge 3\)\(S\) 的最左点恰为 \(0\),最右点恰为 \(k\)(即 \(S\) 的“最小覆盖线段”正好是整段 \([0,k]\))。换句话说,\(p(k)\) 只计数那些 “最短覆盖长度恰为 \(k\) 的共线点集,从而使每个非法集合都能被唯一归类到某一个最短线段上,避免在更长线段的统计中重复出现。

对固定的 \(a,k\),我们希望把所有满足 \(\gcd(a,b)=1\)\(b\) 的贡献累加起来。注意到摆放次数中,\((N+1-ka)\)\(b\) 无关,因此只需处理\(\displaystyle{\sum_{\substack{1\le b\le a-1\\gcd(a,b)=1}} (N+1-kb).}\)

将其展开为两部分:\(\displaystyle{\sum_{\substack{1\le b\le a-1\\gcd(a,b)=1}} (N+1-kb)=\sum_{\substack{1\le b\le a-1\\gcd(a,b)=1}} (N+1)-k\sum_{\substack{1\le b\le a-1\\gcd(a,b)=1}} b.}\)

第一项显然是\(\displaystyle{\sum_{\substack{1\le b\le a-1\\gcd(a,b)=1}} (N+1)=\varphi(a)(N+1).}\) 根据欧拉函数\(\varphi(a)\)的性质,可知\(\displaystyle{\sum_{\substack{1\le b\le a-1\\gcd(a,b)=1}}b=\dfrac{a\varphi(a)}{2}.}\)因此第二项就是\(\dfrac{ka\varphi(a)}{2}\)

代回原式,便得到\(\displaystyle{\sum_{\substack{1\le b\le a-1\\gcd(a,b)=1}} (N+1-kb)=\varphi(a)\left(N+1-\frac{ka}{2}\right).}\)

于是一般部分总贡献为:

\[ X(N)=4\sum_{k\ge2}p(k)\sum_{a\ge2,ka\le N}\varphi(a)(N+1-ka)\left(N+1-\frac{ka}{2}\right). \]

\(n=N+1\)。展开括号得到:\((N+1-ka)\left(N+1-\dfrac{ka}{2}\right)=n^2-\dfrac{3n}{2}(ka)+\dfrac12 (ka)^2.\)

因此\(\displaystyle{X(N)=4\left(n^2S_0-\frac{3n}{2}S_1+\frac12S_2\right),}\)其中\(\displaystyle{S_0=\sum_{\substack{k\ge 2,a\ge2\\ka\le N}}p(k)\varphi(a),S_1=\sum_{\substack{k\ge 2,a\ge2\\ka\le N}}kap(k)\varphi(a),S_2=\sum_{\substack{k\ge 2,a\ge2\\ka\le N}}k^2a^2p(k)\varphi(a).}\)

由于这里只统计斜率 \(0<\dfrac ba<1\) 的“斜线段族”,因此枚举变量满足 \(k\ge2,a\ge2\);等价地,\(p(1)=0\)是成立的,并对所有 \(\Phi_i\) 统一减去 \(\varphi(1)\cdot 1^i=1\),从而尝试直接使用标准的Dirichlet 双曲线方法

\(f_i(k)=k^i p(k),g_i(a)=a^i\varphi(a)\),则记\(\displaystyle{F_i(x)=\sum_{k\le x}f_i(k),\Phi_i(x)=\sum_{a\le x}g_i(a)}\)这些是 欧拉函数的加权前缀和,因此只要能快速得到:\(\Phi_0,\Phi_1,\Phi_2\)的函数值,就能计算 \(S_0,S_1,S_2\)。那么由Dirichlet 双曲线方法可以得到:

\[ \sum_{ka\le N}f_i(k)g_i(a) =\sum_{k\le \sqrt N}f_i(k)\Phi_i\left(\left\lfloor\frac Nk\right\rfloor\right) +\sum_{a\le \sqrt N}g_i(a)\Phi_i\left(\left\lfloor\frac Na\right\rfloor\right) -F_i(\sqrt N)\Phi_i(\sqrt N), \]

随后我们在后面令\(G_i(n)=\Phi_i(n)-1\),并在计算时去除\(a=1\)的情况即可。

我们令幂和\(\displaystyle{T_i(n)=\sum_{k=1}^n k^i}\)。那么有\(T_0(n)=n,T_1(n)=\dfrac{n(n+1)}{2},T_2(n)=\dfrac{n(n+1)(2n+1)}{6},T_3(n)=\left(\dfrac{n(n+1)}{2}\right)^2\)。可见,对于\(F_i(i=0,1,2)\)的计算结果如下:

\[ \begin{aligned} F_0(n)&=\sum_{k=1}^np(k)\\ &=2^n-n-1,\\ F_1(n)&=\sum_{k=1}^nkp(k)\\ &=\sum_{k=1}^{x}k2^{k-1}-T_1(n)\\&=2^n(n-1)-\dfrac{n(n+1)}{2}+1.\\ F_2(n)&=\sum_{k=1}^nk^2p(k)\\ &=\sum_{k=1}^{x}k^22^{k-1}-T_2(n)\\ &=(n^2-2n+3)2^n-\frac{n(n+1)(2n+1)}6-3 \end{aligned} \]

接下来我们求解\(\Phi_i(n)\)。欧拉函数有性质:\(\displaystyle\sum_{d\mid n}\varphi(d)=n.\),那么两侧同乘\(n^i\),并且级数每项分子分母各乘上一个\(d^i\),可得\(\displaystyle{\sum_{d\mid n} d^i\left(\frac{n}{d}\right)^i\varphi\left(\frac{n}{d}\right)=n^{i+1}.}\)这等价于一个 Dirichlet 卷积恒等式:

\[ (n^i) \ast (n^i\varphi(n)) = n^{i+1}. \]

对上式在 \(1\le n\le x\) 求和,那么有

\[T_{i+1}(x)=\sum_{n\le x}n^{i+1}=\sum_{n\le x}\sum_{d\mid n} d^i\left(\frac{n}{d}\right)^i\varphi\left(\frac{n}{d}\right)=\sum_{d\le x} d^i \sum_{m\le x/d} m^i\varphi(m)=\sum_{d\le x} d^i\Phi_i\left(\left\lfloor\frac{x}{d}\right\rfloor\right).\]

上式中 \(d=1\) 的项就是 \(\Phi_i(x)\),因此

\[ \Phi_i(x)=T_{i+1}(x)-\sum_{d=2}^{x} d^i\Phi_i\left(\left\lfloor\frac{x}{d}\right\rfloor\right). \]

注意到 \(\left\lfloor\dfrac{x}{d}\right\rfloor < x\)(当 \(d\ge2\)),这就是一个“严格变小”的递推,配合记忆化就能算所有需要的 \(\Phi_i(\cdot)\)。 可见,递推中出现的关键结构是\(\displaystyle{\sum_{d=2}^{x} d^i\Phi_i\left(\left\lfloor\frac{x}{d}\right\rfloor\right).}\)也就是说\(\Phi_i\)可以使用数论分块求解。设\(q=\left\lfloor\dfrac{x}{d}\right\rfloor,\)那么当 \(q\) 固定时,\(d\) 会落在一个连续区间\(L_q=\left\lfloor\dfrac{x}{q+1}\right\rfloor+1 \le d \le \left\lfloor\dfrac{x}{q}\right\rfloor=R_q.\)

因此可以把递推写成按 \(q\) 分块的形式:

\[ \sum_{d=2}^{x} d^i\Phi_i\left(\left\lfloor\frac{x}{d}\right\rfloor\right) =\sum_{q=1}\Phi_i(q)\sum_{d= L_q}^{R_q} d^i =\sum_{q=1}\Phi_i(q)(T_i(R_q)-T_i(L_q-1)), \]

其中对于\(T_i\)我们回代上文所提到的\(T_0,T_1,T_2,T_3\)即可。令\(U(N)=C(N)+X(N),\)则答案为

\[ T(N)= 2^{(N+1)^2}-U(N) \]

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
#include <bits/stdc++.h>
#include "tools.h"
using namespace std;

typedef long long ll;

ll N = 100000000000LL;
int LIM = 10000000;
ll MOD = 100000000LL;
ll MOD2 = MOD * 2;
ll MOD6 = MOD * 6;

ll mod_norm(ll x, ll mod)
{
x %= mod;
if (x < 0)
x += mod;
return x;
}

ll mod_mul(ll a, ll b, ll mod)
{
return a * b % mod;
}

ll tri_mod(ll k)
{
k %= MOD2;
return k * (k + 1) / 2 % MOD;
}

ll comb3_mod(ll k)
{
k %= MOD6;
return k * (k + 1) % MOD6 * (k + 2) / 6 % MOD;
}

ll Tsum(ll p, ll n)
{
if (n <= 0)
return 0;
if (p == 0)
return n % MOD;
if (p == 1)
{
ll a = n % MOD2;
ll b = (n + 1) % MOD2;
ll x = a * b % MOD2;
return (x / 2) % MOD;
}
if (p == 2)
{
ll a = n % MOD6;
ll b = (n + 1) % MOD6;
ll c = (n * 2 + 1) % MOD6;
return a * b % MOD6 * c / 6 % MOD6;
}
if (p == 3)
{
ll s1 = Tsum(1, n);
return s1 * s1 % MOD;
}
return 0;
}

ll f_mod(ll k, ll mod)
{
ll r = mod_norm(qpow(2, k, mod) - 1 - tri_mod(k), mod);
return r;
}

ll F_mod(ll k, ll mod)
{
ll r = mod_norm(qpow(2, k + 1, mod) - 2 - (k % mod) - comb3_mod(k), mod);
return r;
}

ll df_mod(ll k, ll mod)
{
ll r = mod_norm(qpow(2, k, mod) - (k % mod) - 1, mod);
return r;
}

vector<int> ph;
vector<int> pref0, pref1, pref2;
array<unordered_map<ll, ll>, 3> memo;

void init(ll n)
{
ph.assign(LIM + 1, 0);
for (int i = 0; i <= LIM; ++i)
ph[i] = i;
ph[0] = 0;
ph[1] = 1;
for (int i = 2; i <= LIM; ++i)
{
if (ph[i] == i)
{
for (int j = i; j <= LIM; j += i)
{
ph[j] -= ph[j] / i;
}
}
}

pref0.assign(LIM + 1, 0);
pref1.assign(LIM + 1, 0);
pref2.assign(LIM + 1, 0);
for (int i = 1; i <= LIM; ++i)
{
ll p0 = pref0[i - 1];
ll p1 = pref1[i - 1];
ll p2 = pref2[i - 1];
ll p = ph[i];
ll im = i % MOD2;
ll i2 = 1ll * im * im % MOD2;
p0 = (p0 + p) % MOD2;
p1 = (p1 + 1ll * im * p % MOD2) % MOD2;
p2 = (p2 + 1ll * i2 * p % MOD2) % MOD2;
pref0[i] = (int)p0;
pref1[i] = (int)p1;
pref2[i] = (int)p2;
}
}

ll pref_get(int k, int n)
{
if (k == 0)
return pref0[n];
if (k == 1)
return pref1[n];
return pref2[n];
}

ll pow_small(ll a, int k, ll mod)
{
a %= mod;
if (k == 0)
return 1;
if (k == 1)
return a;
return a * a % mod;
}

ll phSum(ll n, int k)
{
if (n <= LIM)
return pref_get(k, (int)n);
auto &mp = memo[k];
auto it = mp.find(n);
if (it != mp.end())
return it->second;

ll res = Tsum(k + 1, n);
ll sq = int_square(n);
ll lim = n / sq;

for (ll i = 2; i <= lim; ++i)
{
ll t = phSum(n / i, k);
res = mod_norm(res - mod_mul(pow_small(i, k, MOD2), t, MOD2), MOD2);
}

ll prev_sum = Tsum(k, n);
for (ll j = 1; j < sq; ++j)
{
ll next_sum = Tsum(k, n / (j + 1));
ll delta = mod_norm(prev_sum - next_sum, MOD2);
res = mod_norm(res - mod_mul(delta, pref_get(k, (int)j), MOD2), MOD2);
prev_sum = next_sum;
}

mp.emplace(n, res);
return res;
}

ll exprA(ll q)
{
ll a = mod_norm(2 * (ll)F_mod(q, MOD2), MOD2);
ll b = f_mod(q, MOD2);
ll c = df_mod(q, MOD2);
ll t1 = mod_mul(b, mod_norm(1 + 2 * (q % MOD2), MOD2), MOD2);
ll t2 = mod_mul(c, mod_mul(q % MOD2, q % MOD2, MOD2), MOD2);
return mod_norm(a - t1 + t2, MOD2);
}

ll compute_line_res()
{
ll sqrtN = int_square(N);
ll lim = N / sqrtN;

ll line_res = 0;

{
for (ll p = 2; p <= lim; ++p)
{
ll q = N / p;
ll num = mod_mul(mod_mul(mod_mul(p % MOD2, p % MOD2, MOD2), ph[(int)p] % MOD2, MOD2), exprA(q), MOD2);
line_res += (num / 2) % MOD;
if (line_res >= MOD)
line_res -= MOD;
}

ll prev_ts = phSum(N, 2);
for (ll j = 1; j < sqrtN; ++j)
{
ll next_ts = phSum(N / (j + 1), 2);
ll diff = mod_norm(prev_ts - next_ts, MOD2);
ll num = mod_mul(exprA(j), diff, MOD2);
line_res += (num / 2) % MOD;
line_res %= MOD;
prev_ts = next_ts;
}
}

{
ll N1 = (N + 1) % MOD2;

for (ll p = 2; p <= lim; ++p)
{
ll q = N / p;
ll f = f_mod(q, MOD2);
ll df = df_mod(q, MOD2);
ll diff2 = mod_norm(f - mod_mul(q % MOD2, df, MOD2), MOD2);
ll num = 3 % MOD2 * p % MOD2 * ph[(int)p] % MOD2 * N1 % MOD2 * diff2 % MOD2;
line_res += (num / 2) % MOD;
line_res %= MOD;
}

ll prev_ts = phSum(N, 1);
for (ll j = 1; j < sqrtN; ++j)
{
ll next_ts = phSum(N / (j + 1), 1);
ll diff = mod_norm(prev_ts - next_ts, MOD2);

ll fj = f_mod(j, MOD2);
ll dfj = df_mod(j, MOD2);
ll inner = mod_norm(fj - mod_mul(j % MOD2, dfj, MOD2), MOD2);

ll num = mod_mul(mod_mul(3 % MOD2, N1, MOD2), inner, MOD2);
num = mod_mul(num, diff, MOD2);

line_res += (num / 2) % MOD;
line_res %= MOD;
prev_ts = next_ts;
}
}

{
ll coef = mod_mul((N + 1) % MOD, (N + 1) % MOD, MOD);

for (ll p = 2; p <= lim; ++p)
{
ll q = N / p;
ll term = mod_mul(coef, df_mod(q, MOD) % MOD, MOD);
term = mod_mul(term, ph[(int)p] % MOD, MOD);
line_res += term;
line_res %= MOD;
}

ll prev_ts = phSum(N, 0) % MOD;
for (ll j = 1; j < sqrtN; ++j)
{
ll next_ts = phSum(N / (j + 1), 0) % MOD;
ll diff = mod_norm(prev_ts - next_ts, MOD);

ll term = mod_mul(coef, df_mod(j, MOD), MOD);
line_res += mod_mul(term, diff, MOD);
line_res %= MOD;

prev_ts = next_ts;
}
}

return line_res % MOD;
}

ll solve(ll N)
{
init(N);
ll P = N + 1;
ll res = 1;
res = (res + mod_mul(P % MOD, P % MOD, MOD)) % MOD;
res = (res + mod_mul(mod_norm(2 * (P % MOD), MOD), f_mod(P, MOD), MOD)) % MOD;
res = (res + f_mod(P, MOD) * 2 + F_mod(N, MOD) * 4) % MOD;
ll line_res = compute_line_res();
res = (res + mod_norm(4 * line_res, MOD)) % MOD;

ll tmp = qpow(2, P, MOD);
ll total = qpow(tmp, P, MOD);
return mod_norm(total - res, MOD);
}

int main()
{
cout << solve(N) << "\n";
return 0;
}