Project Euler 439

Project Euler 439

题目

Sum of sum of divisors

Let \(d(k)\) be the sum of all divisors of \(k\).

We define the function \(S(N) = \sum_{i=1}^N \sum_{j=1}^Nd(i \cdot j)\).

For example, \(S(3) = d(1) + d(2) + d(3) + d(2) + d(4) + d(6) + d(3) + d(6) + d(9) = 59\).

You are given that \(S(10^3) = 563576517282\) and \(S(10^5) \bmod 10^9 = 215766508\).

Find \(S(10^{11}) \bmod 10^9\).

解决方案

对质数幂 \(p^k\),约数和为等比数列:\(d(p^k)=1+p+p^2+\cdots+p^k=\dfrac{p^{k+1}-1}{p-1}.\)

\(a,b\ge 0\),直接相乘:\(\displaystyle{d(p^a)d(p^b)=\left(\sum_{x=0}^{a}p^x\right)\left(\sum_{y=0}^{b}p^y\right)=\sum_{t=0}^{a+b} c_t p^t,}\)其中系数 \(c_t\) 是满足 \(x+y=t\) 的解的个数。

注意到\(\displaystyle{pd(p^{a-1})d(p^{b-1})=p\left(\sum_{x=0}^{a-1}p^x\right)\left(\sum_{y=0}^{b-1}p^y\right)=\sum_{t=1}^{a+b-1} c'_t p^t,}\) 其中 \(c'_t\) 正好对应“既能从左取至少 1 次,又能从右取至少 1 次”的那部分。因此相减会把中间重复的项抵消,只留下

\[ d(p^{a+b}) = d(p^a)d(p^b) - pd(p^{a-1})d(p^{b-1}). \]

这就是后续所有化简的核心。

\(\mu(n)\) 为 Möbius 函数。我们希望得到一个把 \(d(ij)\)\(\gcd(i,j)\) 分解的式子。

\(i,j\) 的某个质因子 \(p\) 的指数分别为 \(a=v_p(i),b=v_p(j)\)。则\(v_p(ij)=a+b, v_p(\gcd(i,j))=\min(a,b).\)

考虑下面这个局部求和:\(\displaystyle{\sum_{t=0}^{\min(a,b)} p^t\mu(p^t)d(p^{a-t})d(p^{b-t}).}\) 由于 当 \(t\ge 2\)时,\(\mu(p^t)=0\),所以式子只有两项:\(d(p^a)d(p^b) - pd(p^{a-1})d(p^{b-1}) = d(p^{a+b}).\)这正好等于 \(d(p^{v_p(ij)})\)

因为 \(d(\cdot)\)\(\mu(\cdot)\) 都是乘法函数,这个对每个质数成立的分解可以乘起来,得到对任意 \(i,j\) 的恒等式:

\[ d(ij)=\sum_{k\mid \gcd(i,j)} k\mu(k)d\left(\frac{i}{k}\right)d\left(\frac{j}{k}\right). \]

把上式代回 \(S(N)\)\(\displaystyle{S(N)=\sum_{i\le N}\sum_{j\le N}\sum_{k\mid \gcd(i,j)} k\mu(k)d(i/k)d(j/k).}\)交换求和顺序。对固定 \(k\),令 \(i=kx,j=ky\),则 \(x,y\le N/k\)\[ S(N)=\sum_{k\le N} k\mu(k)\left(\sum_{x\le N/k}d(x)\right)\left(\sum_{y\le N/k}d(y)\right). \]

定义\(\displaystyle{D(Q)=\sum_{n\le Q} d(n),}\)

\[ S(N)=\sum_{k=1}^{N} k\mu(k)D\left(\left\lfloor\frac{N}{k}\right\rfloor\right)^2. \]

可见,对于小于等于\(Q\)的整数中,有\(\left\lfloor\dfrac{D}{t}\right\rfloor\)个因子为\(r\)。因此\(\displaystyle{D(Q)=\sum_{t=1}^{Q} t\Big\lfloor\frac{Q}{t}\Big\rfloor.}\)这正是一个典型可用数论分块双曲线方法加速的求和,后面我们介绍一个关于\(Q\)的一个数论分块递推来减少算多次\(Q(\cdot)\)的时间。


5. 定义并计算 \(F(Q)=\sum_{k\le Q} k\mu(k)\)

\(\displaystyle{F(Q)=\sum_{k\le Q} k\mu(k).}\)\(i(n)=n\)\(\mu(n)\)的卷积写出来:\(\displaystyle{\sum_{k\le n} kF\left(\left\lfloor\frac{n}{k}\right\rfloor\right)= \sum_{k\le n} k\sum_{d\le n/k} d\mu(d)= \sum_{kd\le n} kd,\mu(d)= \sum_{m\le n} m\sum_{d\mid m}\mu(d).}\)

\(\displaystyle{\sum_{d\mid m}\mu(d)=\begin{cases}1,&m=1\\0,&m>1\end{cases}}\),所以整个卷积和恒等于 1:

\[ \sum_{k=1}^{n} kF\left(\left\lfloor\frac{n}{k}\right\rfloor\right)=1. \]

移项即可得到递推,可以用数论分块来完成:

\[ F(n)=1-\sum_{k=2}^{n} kF\left(\left\lfloor\frac{n}{k}\right\rfloor\right). \]

为了满足\(D(\cdot)\)能够被高效计算的需求,我们接下来证明下面的Dirichlet卷积是成立的:

\[ \sum_{k=1}^{n} k\mu(k)D\left(\left\lfloor\frac{n}{k}\right\rfloor\right)=n. \]

把上面的 \(\displaystyle{D(Q)=\sum_{t\le Q} t\left\lfloor\dfrac{Q}{r}\right\rfloor}\) 代进去,其中 \(Q=\left\lfloor\dfrac{n}{k}\right\rfloor\)

\[ \sum_{k\le n} k\mu(k)D\left(\left\lfloor\frac{n}{k}\right\rfloor\right) =\sum_{k\le n} k\mu(k)\sum_{t\le \lfloor n/k\rfloor} t\left\lfloor\frac{\lfloor n/k\rfloor}{t}\right\rfloor. \]

注意一个关键:对整数 \(t\)\(\left\lfloor\dfrac{\lfloor n/k\rfloor}{t}\right\rfloor=\left\lfloor\dfrac{n}{kt}\right\rfloor.\)所以右边变成:\(\displaystyle{\sum_{k\le n} k\mu(k)\sum_{t\le n/k} t\left\lfloor\frac{n}{kt}\right\rfloor.}\)

我们现在把双重求和合并成对所有满足 \(kt\le n\) 的配对求和,那么左边得到\(\displaystyle{\sum_{kt\le n} k\mu(k)t\left\lfloor\frac{n}{kt}\right\rfloor.}\)

现在令\(m=kt,\)\(m\le n\),并且对固定的 \(m\),所有 \((k,t)\) 满足 \(kt=m\) 等价于枚举 \(k\mid m\),且 \(t=m/k\)。代入左边得到:\(\displaystyle{\sum_{m\le n}\sum_{k\mid m} k\mu(k)\cdot \frac{m}{k}\cdot \left\lfloor\frac{n}{m}\right\rfloor=\sum_{m\le n} m\left\lfloor\frac{n}{m}\right\rfloor \sum_{k\mid m}\mu(k).}\)

又因为\(\mu\)的性质满足: \[ \sum_{k\mid m}\mu(k)= \begin{cases} 1,&m=1,\\ 0,&m>1. \end{cases} \]

因此上式只有 \(m=1\) 会留下:\(\displaystyle{\sum_{m\le n} m\left\lfloor\frac{n}{m}\right\rfloor \sum_{k\mid m}\mu(k)=1\cdot \left\lfloor\frac{n}{1}\right\rfloor\cdot 1n.}\)于是证得:

\[ \sum_{k=1}^{n} k\mu(k)D\left(\left\lfloor\frac{n}{k}\right\rfloor\right)=n. \]

把上式单独拎出 \(k=1\) 那一项,那么得到递推:

\[ D(n)=n-\sum_{k=2}^{n} k\mu(k)D\left(\left\lfloor\frac{n}{k}\right\rfloor\right). \]

其中区间求和的系数 \(\sum k\mu(k)\) 正好是 \(F(\cdot)\) 的差分。

不过,直接用\(\displaystyle{S(N)=\sum_{k=1}^{N} k\mu(k)D(\lfloor N/k\rfloor)^2}\)仍然计算量太大。设\(q=\lfloor\sqrt N\rfloor, w=\left\lfloor\dfrac{N}{q+1}\right\rfloor.\)

\(k\le w\) 时,\(\lfloor N/k\rfloor>q\),此时 \(k\) 数量只有 \(O(\sqrt N)\),可以直接算。当 \(k>w\) 时,\(\lfloor N/k\rfloor\le q\) 只有 \(O(\sqrt N)\) 种值。把所有 \(k\)\(t=\lfloor N/k\rfloor\) 分组,则

\[ \sum_{\substack{k>w\\\lfloor N/k\rfloor=t}} k\mu(k)=F\left(\left\lfloor\frac{N}{t}\right\rfloor\right)-F\left(\left\lfloor\frac{N}{t+1}\right\rfloor\right). \]

最终得到经典分解:

\[ S(N)=\sum_{k=1}^{w} k\mu(k)D\left(\left\lfloor\frac{N}{k}\right\rfloor\right)^2 +\sum_{t=1}^{q} D(t)^2\left( F\left(\left\lfloor\frac{N}{t}\right\rfloor\right)- F\left(\left\lfloor\frac{N}{t+1}\right\rfloor\right) \right). \]

两段求和规模都是 \(O(\sqrt N)\)

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
#include <bits/stdc++.h>
#include "tools.h"
using namespace std;

typedef long long ll;
const ll N = 100000000000LL;
const ll MOD = 1000000000LL;

inline ll norm(ll x)
{
x %= MOD;
if (x < 0)
x += MOD;
return x;
}

inline ll tri(ll n)
{
if (n <= 0)
return 0;
if (n & 1)
return (n % MOD) * (((n + 1) / 2) % MOD) % MOD;
return ((n / 2) % MOD) * ((n + 1) % MOD) % MOD;
}

int pivot;
vector<int> mu;
vector<int> F_small;
vector<int> D_small;
unordered_map<ll, int> F_cache;
unordered_map<ll, int> D_cache;

void precompute(int P)
{
pivot = P;

vector<int> lp(pivot + 1, 0);
vector<int> primes;
primes.reserve(pivot / 10);

mu.assign(pivot + 1, 0);
vector<int> sig(pivot + 1, 0);
vector<int> rem(pivot + 1, 0);
vector<int> pw(pivot + 1, 0);
vector<int> sum_p(pivot + 1, 0);

mu[1] = 1;
sig[1] = 1;
rem[1] = 1;
pw[1] = 1;
sum_p[1] = 1;

for (int i = 2; i <= pivot; i++)
{
if (lp[i] == 0)
{
lp[i] = i;
primes.push_back(i);
mu[i] = -1;
pw[i] = (i % MOD);
sum_p[i] = (1 + (i % MOD)) % MOD;
rem[i] = 1;
sig[i] = sum_p[i];
}
for (int p : primes)
{
ll x = 1LL * i * p;
if (x > pivot)
break;
lp[x] = p;
if (i % p == 0)
{
mu[x] = 0;
rem[x] = rem[i];
pw[x] = (ll)pw[i] * p % MOD;
sum_p[x] = sum_p[i] + pw[x];
if (sum_p[x] >= MOD)
sum_p[x] -= MOD;
sig[x] = (ll)rem[x] * sum_p[x] % MOD;
break;
}
else
{
mu[x] = -mu[i];
rem[x] = sig[i];
pw[x] = p % MOD;
sum_p[x] = 1 + (p % MOD);
if (sum_p[x] >= MOD)
sum_p[x] -= MOD;
sig[x] = (ll)sig[i] * sum_p[x] % MOD;
}
}
}

F_small.assign(pivot + 1, 0);
D_small.assign(pivot + 1, 0);

ll f = 0, d = 0;
for (int i = 1; i <= pivot; i++)
{
f = (f + 1LL * i * mu[i]) % MOD;
d = (d + sig[i]) % MOD;
F_small[i] = norm(f);
D_small[i] = norm(d);
}
}

int get_F(ll n)
{
if (n <= pivot)
return F_small[n];
auto it = F_cache.find(n);
if (it != F_cache.end())
return it->second;

ll qn = int_square(n);
ll w = n / (qn + 1);

ll res = 1 % MOD;

for (ll k = 2; k <= w;)
{
ll t = n / k;
ll k2 = n / t;
ll coef = norm(tri(k2) - tri(k - 1));
res = norm(res - coef * (ll)get_F(t));
k = k2 + 1;
}

for (ll t = 1; t <= qn; t++)
{
ll a = n / t;
ll b = n / (t + 1);
ll coef = norm(tri(a) - tri(b));
res = norm(res - 1LL * F_small[t] * coef);
}

return F_cache[n] = res;
}

int get_D(ll n)
{
if (n <= pivot)
return D_small[n];
auto it = D_cache.find(n);
if (it != D_cache.end())
return it->second;

ll qn = int_square(n);
ll w = n / (qn + 1);

ll res = n % MOD;

for (ll k = 2; k <= w;)
{
ll t = n / k;
ll k2 = n / t;
ll coef = norm((ll)F_small[k2] - (ll)F_small[(k - 1)]);
res = norm(res - coef * (ll)get_D(t));
k = k2 + 1;
}

for (ll t = 1; t <= qn; t++)
{
ll diff = norm((ll)get_F(n / t) - (ll)get_F(n / (t + 1)));
res = norm(res - 1LL * D_small[t] * diff);
}

return D_cache[n] = res;
}

ll solve(ll n)
{
ll q = int_square(N);
int P = (q * 10);
precompute(P);

ll w = N / (q + 1);
ll ans = 0;

for (ll k = 1; k <= w;)
{
ll t = N / k;
ll k2 = N / t;
if (k2 > w)
k2 = w;

ll coef = norm((ll)F_small[k2] - (ll)F_small[(k - 1)]);
ll dv = get_D(t);
ll add = coef * ((dv * dv) % MOD) % MOD;
ans = norm(ans + add);

k = k2 + 1;
}

for (ll t = 1; t <= q; t++)
{
ll dt = D_small[t];
ll diff = norm((ll)get_F(N / t) - (ll)get_F(N / (t + 1)));
ll add = (dt * dt) % MOD * diff % MOD;
ans = norm(ans + add);
}

return ans % MOD;
}

int main()
{
cout << solve(N) << "\n";
return 0;
}