Project Euler 810

发表于 2026-01-21 分类于 Project Euler 阅读次数：本文字数： 4.5k 阅读时长 ≈ 4 分钟

Project Euler 810

题目

XOR-Primes

We use \(x\oplus y\) for the bitwise XOR of \(x\) and \(y\).

Define the XOR-product of \(x\) and \(y\), denoted by \(x \otimes y\), similar to a long multiplication in base \(2\), except that the intermediate results are XORed instead of the usual integer addition.

For example, \(7 \otimes 3 = 9\), or in base \(2\), \(111_2 \otimes 11_2 = 1001_2\):

\[\begin{align*} \phantom{\otimes 111} 111_2 \\ \otimes \phantom{1111} 11_2 \\ \hline \phantom{\otimes 111} 111_2 \\ \oplus \phantom{11} 111_2 \phantom{9} \\ \hline \phantom{\otimes 11} 1001_2 \\ \end{align*}\]

An XOR-prime is an integer \(n\) greater than \(1\) that is not an XOR-product of two integers greater than \(1\). The above example shows that \(9\) is not an XOR-prime. Similarly, \(5 = 3 \otimes 3\) is not an XOR-prime. The first few XOR-primes are \(2, 3, 7, 11, 13, \dots\) and the \(10\text{th}\) XOR-prime is \(41\).

Find the \(5\,000\,000\text{th}\) XOR-prime.

解决方案

令\(M=5000000\)。把一个非负整数 \(n\) 的二进制展开写成\(\displaystyle{n=\sum_{i\ge 0} a_i 2^i, a_i\in\{0,1\}.}\)定义与之对应的多项式（系数在 \(\mathbb{F}_2=\{0,1\}\) 上）：\(\displaystyle{\Phi(n)=\sum_{i\ge 0} a_i x^i\in \mathbb{F}_2[x].}\)可见：

\(\Phi\)在加法上有良好的定义。 \(\mathbb{F}_2\) 中系数加法就是模 \(2\)，因此\(\Phi(u\oplus v)=\Phi(u)+\Phi(v).\)
\(\Phi\)在乘法上也有良好的定义。对两个整数 \(u,v\)，设\(\displaystyle{\Phi(u)=\sum_i a_i x^i, \Phi(v)=\sum_j b_j x^j.}\)则在 \(\mathbb{F}_2[x]\) 中\(\displaystyle{\Phi(u)\Phi(v)=\sum_{k\ge 0}\left(\sum_{i+j=k} a_i b_j\right)x^k,}\)括号内的系数是在 \(\mathbb{F}_2\) 中求和，即对所有满足 \(i+j=k\) 的位对进行“与”并对结果做异或，这与二进制长乘法“移位后异或累加”完全一致，因此\(\Phi(u\otimes v)=\Phi(u)\Phi(v).\)

换句话说：整数集合在运算 \((\oplus,\otimes)\) 下，与多项式环 \(\mathbb{F}_2[x]\) 在运算 \((+,\cdot)\) 下是同构的。

在环 \(\mathbb{F}_2[x]\) 中，唯一的单位元是常数多项式 \(1\)。因此一个整数 \(n>1\) 可以写成\(n=a\otimes b, a>1,b>1\)

当且仅当 \(\Phi(n)\) 在 \(\mathbb{F}_2[x]\) 中可分解为两个次数都至少为 \(1\) 的多项式之积。于是：\(n\) 是 XOR-prime 当且仅当 \(\Phi(n)\) 是 \(\mathbb{F}_2[x]\) 中的不可约多项式。

设 \(u>0\) 的最高位是第 \(d\) 位（从 \(0\) 开始），即\(2^d\le u<2^{d+1},\)则\(\deg \Phi(u)=d.\)若 \(a,b>1\)，设\(\deg\Phi(a)=d_a, \deg\Phi(b)=d_b,\)

因为在 \(\mathbb{F}_2[x]\) 中最高次项相乘产生 \(x^{d_a+d_b}\) 且系数为 \(1\)，不会被抵消，所以\(\deg\Phi(a\otimes b)=\deg(\Phi(a)\Phi(b))=d_a+d_b.\)因此\(d_a+d_b\ge d_a+1\Rightarrow a\otimes b \ge 2^{d_a+1} > a.\)同理 \(a\otimes b>b\)。这得到非常重要的结论：对任意 \(a,b>1\)，都有 \(a\otimes b>\max(a,b)\)。

这保证了可以像普通素数一样做筛：XOR-composite一定在其最小因子的后面出现。

我们接下来估计我们的搜索上限。令 \(I(m)\) 表示 \(\mathbb{F}_2[x]\) 中次数恰为 \(m\) 的不可约首一多项式个数。这里：

“首一”表示最高次项系数为 \(1\)；
“不可约”表示不能写成两个次数都至少为 \(1\) 的多项式之积。

我们可以直接给出一个结论：

\[ I(m)=\frac{1}{m}\sum_{d\mid m}\mu(d)2^{m/d}, \]

其中 \(\mu\) 是 Möbius 函数。为了推导它，关键是下面这个恒等式：

\[ x^{2^m}-x=\prod_{d\mid m}\prod_{f\in\mathcal{I}_d} f(x), \]

其中 \(\mathcal{I}_d\) 表示所有次数恰为 \(d\) 的首一不可约多项式的集合，因此 \(|\mathcal{I}_d|=I(d)\)。这条等式可以这样理解：在有限域 \(\mathbb{F}_{2^m}\) 中，对任意元素 \(a\) 都有 \(a^{2^m}=a\)，因此 \(\mathbb{F}_{2^m}\) 的全部 \(2^m\) 个元素恰好都是多项式 \(x^{2^m}-x\) 的根。另一方面，若 \(f(x)\in\mathbb{F}_2[x]\) 是次数为 \(d\) 的首一不可约多项式，则它的任意一个根 \(\alpha\) 都会生成一个大小为 \(2^d\) 的最小扩域 \(\mathbb{F}_2(\alpha)\cong\mathbb{F}_{2^d}\)；因此 \(\alpha\) 会落在 \(\mathbb{F}_{2^m}\) 中，当且仅当这个最小域 \(\mathbb{F}_{2^d}\) 能嵌入到 \(\mathbb{F}_{2^m}\)，而有限域的基本结构告诉我们这等价于 \(d\mid m\)。

于是，恰好所有次数 \(d\) 能整除 \(m\) 的不可约多项式 \(f(x)\) 都会在 \(\mathbb{F}_{2^m}\) 中拥有根，从而必整除 \(x^{2^m}-x\)；并且由于在特征 \(2\) 下\((x^{2^m}-x)'=1,\)该多项式没有重根，所以这些不可约因子在分解中都只出现一次，从而得到上面的乘积分解。

最终 \(x^{2^m}-x\) 在 \(\mathbb{F}_2[x]\) 中的不可约分解，正好就是右侧对所有 \(d\mid m\) 的“全体不可约多项式”的乘积，从而得到上述恒等式。

接下来对该恒等式两侧取次数。左侧显然有\(\deg(x^{2^m}-x)=2^m.\)

右侧中，对每个固定的 \(d\mid m\)，集合 \(\mathcal{I}_d\) 里有 \(I(d)\) 个多项式，每个次数都是 \(d\)，因此这一层总次数贡献为 \(d\cdot I(d)\)。把所有约数层加起来得到\(\displaystyle{2^m=\sum_{d\mid m} dI(d).}\)

最后一步是对这个约数和关系作 Möbius 反演，最终得到上面的结论。

用这个公式累加 \(\displaystyle{s(m)=\sum_{k=1}^m I(k)}\) 可以得到，我们需要找到最小的\(m\)使得\(s(m)\ge M\)。那么我们就知道了我们的搜索上界就是\(2^m\)。（题目中给出的\(M\)对应着\(m=27\)）。

设上界\(L=2^{27}.\)我们维护一个布尔数组，标记每个奇数是否为 XOR-composite（偶数除 \(2\) 外必含因子 \(2\)，无需参与扫描）。

当扫描到一个尚未被标记的数 \(p\)（且 \(p>1\)），它就是 XOR-prime；随后要标记所有形如\(p\otimes q, q>1\)且结果 \(p\otimes q<L\) 的数为XOR-composite。

用次数控制范围最方便。若\(\deg\Phi(p)=d,\)为了保证\(\deg\Phi(p\otimes q)\le m-1,\)必须有\(\deg\Phi(q)\le m-1-d.\)

也就是说 \(q\) 的位长至多为 \(m-d\)，因此 \(q\) 的取值范围可以写为\(0\le q<2^{m-d}.\) 于是对每个 XOR-prime \(p\)，我们只需枚举所有 \(q\) 满足 \(2\le q<2^{27-d}\) 来标记 \(p\otimes q\)。其整体复杂度与普通筛同阶，为\(O(m\cdot 2^m)\)。（常数取决于计算 XOR-product 的方式）。

我们希望把\(\otimes\)运算降低到均摊\(O(1)\)的时间复杂度。

关键观察：在 \(\mathbb{F}_2[x]\) 里乘法对第二个因子是线性的。

若把 \(q\) 看作一个二进制向量，每翻转 \(q\) 的某一位 \(2^k\)，对应多项式就是把 \(x^k\) 这一项从 \(0\) 改成 \(1\) 或从 \(1\) 改回 \(0\)。此时\(p\otimes(q\oplus 2^k)=(p\otimes q)\oplus (p\ll k).\)

因此如果我们用 Gray code 顺序遍历所有 \(q\)，使得相邻 \(q\) 恰好只翻转一位，那么每一步都可以用一次异或更新乘积值。

标准 Gray code 为\(g(t)=t\oplus (t\gg 1), t=0,1,\dots,2^m-1,\) 且 \(g(t)\) 与 \(g(t-1)\) 的差异位正好是 \(t\) 的最低位 \(1\)。令\(\ell(t)=2^{v_2(t)}=1\ll v_2(t)\)表示\(t\)的最低位 \(1\)。

那么从 \(q=g(t-1)\) 到 \(q'=g(t)\) 的更新只需\(q\leftarrow q\oplus \ell(t),(p\otimes q)\leftarrow (p\otimes q)\oplus (p\cdot \ell(t)).\)这样对固定 \(p\)，枚举全部 \(q\) 的每一步更新就是常数操作。

代码

#include <bits/stdc++.h>
using namespace std;

const int M = 27;
const int LIM = 1 << M;
const int HALF = LIM >> 1;
const int N = 5000000;

int find_xor_prime(int N)
{

    vector<uint8_t> comp(HALF, 0);

    int cnt = 1;
    if (N == 1)
        return 2;

    for (int i = 1; i < HALF; i++)
    {
        int p = (i << 1) | 1;
        if (comp[i])
            continue;

        cnt++;
        if (cnt == N)
            return p;

        int d = 31 - __builtin_clz((unsigned)p);
        int m = M - d;
        int lim_q = 1 << m;

        int q = 0;
        int r = 0;

        for (int t = 1; t < lim_q; t++)
        {
            int lsb = t & -t;
            q ^= lsb;
            r ^= p * lsb;
            if (q > 1 && (q & 1))
                comp[r >> 1] = 1;
        }
    }
    return -1;
}

int main()
{
    cout << find_xor_prime(5000000) << "\n";
    return 0;
}