Project Euler 433

Project Euler 433

题目

Steps in Euclid’s algorithm

Let \(E(x_0, y_0)\) be the number of steps it takes to determine the greatest common divisor of \(x_0\) and \(y_0\) with Euclid’s algorithm. More formally:

\(x_1 = y_0, y_1 = x_0 \bmod y_0\)

\(x_n = y_{n-1}, y_n = x_{n-1} \bmod y_{n-1}\)

\(E(x_0, y_0)\) is the smallest \(n\) such that \(y_n = 0\).

We have \(E(1,1) = 1, E(10,6) = 3\) and \(E(6,10) = 4\).

Define \(S(N)\) as the sum of \(E(x,y)\) for \(1 \le x,y \le N\).

We have \(S(1) = 1, S(10) = 221\) and \(S(100) = 39826\).

Find \(S(5\cdot 10^6)\).

解决方案

注意到:\(E(x,x)=1\),并且若 \(x\ne y\),则 \(E(x,y)\)\(E(y,x)\) 并不相同,但在全体 \((x,y)\in[1,N]^2\) 上求和时,可以按“主对角线 + 上三角 + 下三角”拆分。不过关键的是:下三角 \((x<y)\) 可以通过一次 Euclid 变换和上三角联系起来,从而最终只需处理上三角的总贡献。因此结果可以整理为经典形式:

\[ S(N)=\frac{N(N+1)}2 + 2\sum_{1\le y<x\le N}E(x,y) \]

其中第一项是对角线 \((x=y)\) 的贡献以及下三角多出来的1次贡献,第二项是严格上三角贡献的两倍。因此核心变成计算:

\[ A(N)=\sum_{1\le y<x\le N}E(x,y) \]

\(x>y>0\),欧几里得算法的商序列正是连分数展开:\(\dfrac{x}{y}=[a_0;a_1,a_2,\dots,a_{k-1}]\)。每做一步取模,就相当于“剥掉一个商”,直到余数为 0 为止,因此:\(E(x,y)=k\)。也就是说,\(E(x,y)\) 就是 \(\dfrac{x}{y}\) 的连分数长度。

Stern–Brocot 树把所有既约分数按照连分数前缀组织成一棵树,深度正好对应连分数长度。因此,固定分母/固定范围内的步数总和可以转化为“树上节点深度求和”的计数问题。

\(\displaystyle{F(x)=\sum_{1\le y<x}E(x,y)}\),通过 Stern–Brocot 树计数(等价于连分数深度计数),论文给出

\[ F(x)=\left\lfloor\frac{3(x-1)}2\right\rfloor+2\widetilde r(x) \]

其中 \(\widetilde r(k)\) 是一个纯计数项:它统计如下方程的解的数量\(xx'+yy'=k\):在正整数约束下\(x>y>0,x'>y'>0,\gcd(x,y)=1\)。并且 \((x',y')\) 不再强制互素

\(x\le N\) 的所有行加起来,就得到:

\[ A(N)=\sum_{x=2}^{N}\left\lfloor\frac{3(x-1)}2\right\rfloor + 2R(N) \]

其中\(R(N)=\#\{(x,y,x',y'):x>y>0,x'>y'>0,\gcd(x,y)=1,xx'+yy'\le N\}\)。于是原问题变成:

\[ S(N)=\frac{N(N+1)}2 + 2\sum_{x=2}^{N}\left\lfloor\frac{3(x-1)}2\right\rfloor + 4R(N) \]

其中第二项级数的闭式为

\[ \sum_{k=1}^{N-1}\left\lfloor\frac{3k}{2}\right\rfloor =\begin{cases} \dfrac{3N^2-4N}{4} & N\equiv 0\pmod 2,\\ \dfrac{3N^2-4N+1}{4} & N\equiv 1\pmod 2. \end{cases} \]

接下来我们尝试计算\(R(N)\)。定义不带互素限制的计数:\(Q(n)=\#\{(x,y,x',y'):x>y>0,x'>y'>0,xx'+yy'\le n\}\),那么要计算的 \(R(N)\) 只对第一对 \((x,y)\) 要求互素,可以用 Möbius 反演:

\(x=d\bar x,y=d\bar y\),其中 \(\gcd(\bar x,\bar y)=1,d=\gcd(x,y)\)。于是\(xx'+yy' = d(\bar x x' + \bar y y')\le N\iff\bar x x' + \bar y y' \le \left\lfloor \dfrac{N}{d}\right\rfloor.\)因此

\[ Q\left(\left\lfloor\frac{N}{d}\right\rfloor\right) =\sum_{\substack{x>y>0\\d\mid \gcd(x,y)}} \#\{x'>y'>0:xx'+yy'\le N\} \]

对“\(d\mid\gcd(x,y)\)”做 Möbius 反演,得到:\(\displaystyle{R(N)=\sum_{d\ge 1}\mu(d)Q\left(\left\lfloor\frac{N}{d}\right\rfloor\right)}\),并且注意到最小可行解是 \(x=x'=2,y=y'=1\Rightarrow xx'+yy'=5\),所以当 \(\lfloor N/d\rfloor<5\)\(Q=0\),求和实际上只需要到 \(d\le \lfloor N/5\rfloor\)

同时,\(\left\lfloor\dfrac{N}{d}\right\rfloor\) 只有 \(O(\sqrt N)\) 个不同值,可以用整除分块:设 \(t=\left\lfloor \dfrac{N}{l}\right\rfloor\),则当 \(l\in[l,r]\) 时该值不变,且\(r=\left\lfloor\dfrac{N}{t}\right\rfloor.\)于是\(\displaystyle{R(N)=\sum_{\text{blocks}} Q(t)\cdot\left(M(r)-M(l-1)\right)}\),其中 \(\displaystyle{M(n)=\sum_{i=1}^n\mu(i)}\) 是 Möbius 前缀和。

我们接下来计算\(Q(n)\)。一个关键不等式是:由于 \(x\ge y+1,x'\ge y'+1\),有\(xx'\ge (y+1)(y'+1)=yy'+y+y'+1\ge yy'+3\),于是\(xx'+yy'\ge 2yy'+3\le n\Rightarrow yy'\le \left\lfloor\dfrac{n-3}{2}\right\rfloor\)

因此 \(yy'\) 一定不超过 \(\left\lfloor\dfrac{n-1}{2}\right\rfloor\),这让很多求和边界变得紧凑可控。

接下来采用对称 + 截断的计数套路:对称性来自交换 \((x,y)\leftrightarrow(x',y')\) 不改变不等式。

\(m=\lfloor\sqrt n\rfloor\)。我们先数满足 \(x\le m\) 的所有解数,记为 \(C\)。则总数满足:\(Q(n)=2C - C_b\),其中 \(C_b\) 是同时满足 \(x\le m,x'\le m\) 的解数(避免重复计数)。

对固定的 \((x,y)\)(满足 \(2\le x\le m,1\le y<x\)),我们需要数:\(\#\{(x',y'):x'>y'>0,xx'+yy'\le n\}\)。对给定 \(x'\),允许的 \(y'\) 数为:

\[ \min\left(x'-1,\left\lfloor\frac{n-xx'}{y}\right\rfloor\right) \]

这里出现“最小值”,用一个分界点处理:令\(k=\left\lfloor\dfrac{n}{x+y}\right\rfloor\)

  • \(2\le x'\le k\) 时,必有 \(\left\lfloor\dfrac{n-xx'}{y}\right\rfloor\ge x'-1\),因此贡献 \(x'-1\)
  • \(x'>k\) 时,贡献为 \(\left\lfloor\dfrac{n-xx'}{y}\right\rfloor\)

于是该 \((x,y)\) 的贡献为:\(\displaystyle{\sum_{x'=2}^{k}(x'-1)+\sum_{x'=k+1}^{\lfloor n/x\rfloor}\left\lfloor\frac{n-xx'}{y}\right\rfloor}\),那么第一段是等差数列:\(\dfrac{k(k-1)}2\),第二段是形如\(\displaystyle{\sum_{u=1}^{s}\left\lfloor\frac{m+pu}{q}\right\rfloor}\)的整除下取整和,可以用“广义欧几里得法”在 \(O(\log)\) 时间计算。

接下来我们利用广义欧几里得法计算整除下取整和。定义\(\displaystyle{F(p,q,m,s)=\sum_{u=1}^{s}\left\lfloor\frac{m+pu}{q}\right\rfloor, q>0,s\ge 0}\),允许 \(p,m\) 为负数。

基本化简:

  1. \(m\) 分解为 \(m=q\cdot t+m'\)\(\left\lfloor\dfrac{m+pu}{q}\right\rfloor=t+\left\lfloor\dfrac{m'+pu}{q}\right\rfloor\),这时\(F\)增加\(t\cdot s\)
  2. \(p\) 分解为 \(p=q\cdot t+p'\)\(\left\lfloor\frac{m+p'u+qtu}{q}\right\rfloor=\left\lfloor\dfrac{m+p'u}{q}\right\rfloor+tu\),这时\(F\)增加\(t\cdot \dfrac{s(s+1)}{2}\)

经过这一步可保证 \(0\le p'<q,0\le m'<q\)。接着使用几何计数对偶得到递推:

\[ F(p,q,m,s) =s\left\lfloor\frac{m+ps}{q}\right\rfloor- F\left(q,p,-m-1,\left\lfloor\frac{m+ps}{q}\right\rfloor\right) \]

这一步会把参数规模降到类似 \(\gcd\) 的复杂度,因此整体是 \(O(\log(q+|p|))\)

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
#include <bits/stdc++.h>
using namespace std;

using ll = long long;
int N = 5000000;

static inline ll floordiv(ll a, ll b)
{
if (a >= 0)
return a / b;
return -(((-a) + b - 1) / b);
}

vector<int> mobius_prefix(int n)
{
vector<int> mu(n + 1, 0), pref(n + 1, 0), primes;
vector<unsigned char> is_comp(n + 1, 0);
mu[1] = 1;
for (int i = 2; i <= n; i++)
{
if (!is_comp[i])
{
primes.push_back(i);
mu[i] = -1;
}
for (int p : primes)
{
ll v = 1LL * i * p;
if (v > n)
break;
is_comp[v] = 1;
if (i % p == 0)
{
mu[v] = 0;
break;
}
mu[v] = -mu[i];
}
}
int s = 0;
for (int i = 1; i <= n; i++)
{
s += mu[i];
pref[i] = s;
}
return pref;
}

pair<ll, ll> floor_sum_signed2(ll p, ll q, ll m, ll s1, ll s2)
{
ll res1 = 0, res2 = 0;
ll sign = 1;
while (true)
{
ll t = floordiv(m, q);
m -= t * q;
res1 += t * s1 * sign;
res2 += t * s2 * sign;

t = floordiv(p, q);
p -= t * q;
res1 += t * (s1 + 1) * s1 / 2 * sign;
res2 += t * (s2 + 1) * s2 / 2 * sign;

if (p == 0)
return {res1, res2};

t = floordiv(p * s1 + m, q);
res1 += s1 * t * sign;
s1 = t;

t = floordiv(p * s2 + m, q);
res2 += s2 * t * sign;
s2 = t;

swap(p, q);
m = -m - 1;
sign = -sign;
}
}

ll calc_Q(int n)
{
if (n < 5)
return 0;
int mx = floor(sqrt((long double)n));
ll res1 = 0, res2 = 0;
for (int x = 2; x <= mx; x++)
{
int vx = n / x;
for (int y = 1; y < x; y++)
{
int k = n / (x + y);
ll a = 1LL * k * (k - 1) / 2;
ll s1 = (ll)vx - k;
ll s2 = (k < mx) ? (ll)mx - k : 0;
auto [b1, b2] = floor_sum_signed2(-x, y, (ll)n - 1LL * k * x, s1, s2);
res1 += a + b1;
if (k >= mx)
res2 += 1LL * mx * (mx - 1) / 2;
else
res2 += a + b2;
}
}
return 2 * res1 - res2;
}

ll sum_floor_3k2(ll N)
{
ll n = N - 1;
ll m = n / 2;
if (n % 2 == 0)
return 3 * m * m + m;
return 3 * m * m + 4 * m + 1;
}

ll solve(int N)
{
int lim = N / 5;
auto mu_pref = mobius_prefix(lim);

unordered_map<int, ll> q_cache;
q_cache.reserve(1 << 15);

auto Q = [&](int x) -> ll
{
auto it = q_cache.find(x);
if (it != q_cache.end())
return it->second;
ll v = calc_Q(x);
q_cache.emplace(x, v);
return v;
};

ll R = 0;
int l = 1;
while (l <= lim)
{
int t = N / l;
int r = N / t;
if (r > lim)
r = lim;
R += Q(t) * (ll)(mu_pref[r] - mu_pref[l - 1]);
l = r + 1;
}

ll ans = 1LL * N * (N + 1) / 2;
ans += 2 * sum_floor_3k2(N);
ans += 4 * R;
return ans;
}

int main()
{
cout << solve(N) << "\n";
return 0;
}