Project Euler 384

发表于 2026-01-13 分类于 Project Euler 阅读次数：本文字数： 3.8k 阅读时长 ≈ 3 分钟

Project Euler 384

题目

Rudin-Shapiro sequence

Define the sequence \(a(n)\) as the number of adjacent pairs of ones in the binary expansion of n (possibly overlapping).

E.g.: \(a(5) = a(101_2) = 0, a(6) = a(110_2) = 1, a(7) = a(111_2) = 2\)

Define the sequence \(b(n) = (-1)^{a(n)}\).

This sequence is called the Rudin-Shapiro sequence.

Also consider the summatory sequence of \(b(n)\): \(s(n) = \sum \limits_{i = 0}^{n} {b(i)}\).

The first couple of values of these sequences are:

\(n\)	\(0\)	\(1\)	\(2\)	\(3\)	\(4\)	\(5\)	\(6\)	\(7\)
\(a(n)\)	\(0\)	\(0\)	\(0\)	\(1\)	\(0\)	\(0\)	\(1\)	\(2\)
\(b(n)\)	\(1\)	\(1\)	\(1\)	\(-1\)	\(1\)	\(1\)	\(-1\)	\(1\)
\(s(n)\)	\(1\)	\(2\)	\(3\)	\(2\)	\(3\)	\(4\)	\(3\)	\(4\)

The sequence \(s(n)\) has the remarkable property that all elements are positive and every positive integer \(k\) occurs exactly \(k\) times.

Define \(g(t,c)\), with \(1 \le c \le t\), as the index in \(s(n)\) for which \(t\) occurs for the \(c\)’th time in \(s(n)\).

E.g.: \(g(3,3) = 6, g(4,2) = 7\) and \(g(54321,12345) = 1220847710\).

Let \(F(n)\) be the fibonacci sequence defined by:

\(F(0)=F(1)=1\) and

\(F(n)=F(n-1)+F(n-2)\) for \(n>1\).

Define \(GF(t)=g(F(t),F(t-1))\).

Find \(\sum GF(t)\) for \(2\le t\le45\).

解决方案

为了方便处理前缀和，约定\(s(-1)=0,s(n)=s(n-1)+b(n)(n\ge 0).\)

现在我们从 \(a(n)\) 推出 \(b(n)\) 的二倍递推

把 \(n\) 左移一位得到 \(2n\)，末尾补 0，不会新增相邻 “11”，因此\(a(2n)=a(n)\Rightarrow b(2n)=b(n).\)

把 \(n\) 左移末尾补 1 得到 \(2n+1\)。只有当 \(n\) 的最低位为 1（即 \(n\) 为奇数）时，新末尾会和原末尾形成一个新的 “11”。于是\(a(2n+1)=a(n)+(n\bmod 2)\)。从而有\(b(2n+1)=(-1)^{a(2n+1)}=(-1)^{a(n)}\cdot(-1)^{n\bmod 2}=b(n)\cdot(-1)^{n\bmod 2}.\)总结如下：

\[ \begin{aligned} b(0)&=1,\\ b(2n)&=b(n),\\ b(2n+1)&= \begin{cases} b(n),&n\equiv 0\pmod 2,\\ -b(n),&n\equiv 1\pmod 2. \end{cases} \end{aligned} \]

接下来关键在于：\(s(n)\) 的图像（把 \(n\) 当横轴，\(s(n)\) 当纵轴）是一个步长为 \(\pm1\) 的折线，而其“斜率序列”正是 \(b(n)\)。由上面的递推，斜率序列具有强烈的自相似，于是 \(s(n)\) 也会出现自相似结构。

注意到\(4^k=2^{2k}.\)当我们把下标范围扩展到长度为 \(4^k\) 的区间时，会自然出现 \(2^k\) 这样的高度尺度；这就是后面递推里出现\(h^2\)的根源。

令\(h=2^k.\)我们关注长度为 \(2h^2=2\cdot 4^k\) 的前缀区间 \(n\in[0,2h^2-1]\)。在这个区间内，\(s(n)\) 的值域恰好覆盖到 \(1,2,\dots,2h\)，并且出现次数呈现非常规整的分布：

在前 \(h^2\) 项（\(n\in[0,h^2-1]\)）里：\(1,2,\dots,h\)分别出现\(1,2,\dots,h\)次，\(h+1,h+2,\dots,2h-1\)分别出现\(h-1,h-2,\dots,1\)次，也就是“先递增再递减”的三角形分布，且最大值不会到 \(2h\)。
在前 \(2h^2\) 项（\(n\in[0,2h^2-1]\)）里：\(1,2,\dots,2h-1\)分别出现\(1,2,\dots,2h-1\)次，\(2h\)出现\(h\)次。

这直接推出题面给的全局性质：每个正整数 \(k\) 恰好出现 \(k\) 次，因为当 \(h\) 翻倍，这个“完成到 \(2h-1\)”的范围也持续向上推进。

直观上，可以把长度 \(2h^2\) 的前缀看作由 4 个长度为 \(h^2/2\) 的块拼接而成；每一块要么是小一号图像的平移（加常数），要么是关于某条水平线的镜像（\(x\mapsto \text{const}-x\)）。这会让“某个高度 \(t\) 的出现位置集合”能够递归映射到更小的高度集合上。这正是构造 \(g(t,c)\) 递推的基础。

对任意 \(t\ge 1\)，我们令\(h=2^{\lfloor \log_2 t\rfloor},d=t-h,\)即 \(h\le t<2h\)，并写成\(t=h+d,0\le d<h.\)

核心思想是在尺度为 \(h\) 的自相似分块下，值 \(t\) 的出现位置会落在若干个大块里，每个大块内部又对应到一个更小的“目标值”去找第若干次出现，因此得到递归。

下面给出 \(g(t,c)\) 的完整递推。当 \(t=1\) 时，\(s(0)=1\)，所以\(g(1,1)=0.\)当 \(d=0\)（即 \(t\) 为 2 的幂）时，值 \(t\) 的出现位置在自相似结构中分成两段：前一段等价于找 \(t/2\)，后一段等价于继续找 \(t\) 的剩余次数。于是：

\[ g(t,c)= \left\{ \begin{aligned} &0 &&\text{if}\quad t=1,\\ &\dfrac{t^2}{4}+g\left(\dfrac{t}{2},c\right) &&\text{if}\quad d=0\land c\le \dfrac{t}{2},\\ &\dfrac{t^2}{2}+g\left(t,c-\dfrac{t}{2}\right) &&\text{if}\quad d=0\land c>\dfrac{t}{2},\\ &h^2+g(2h-d,c-h) &&\text{else if}\quad d>0\land c>h,\\ &h^2+g(d,c+d-h) &&\text{else if}\quad d>0\land c>h-d,\\ &\dfrac{h^2}{2}+g(d,c) &&\text{else if}\quad d>0\land c\le d,\\ &\dfrac{h^2}{2}+g(2h-t,c) &&\text{else}. \end{aligned} \right. \]

这里最后一行的 \(2h-t=h-d\)，表示镜像对应的“互补高度”。把前缀区间按尺度 \(h\) 切成四段，每段长度为 \(\dfrac{h^2}{2}\)，分界点恰好是\(\dfrac{h^2}{2},h^2,\dfrac{3h^2}{2}.\)

对 \(t=h+d\)（\(d>0\)）而言：

在最前的 \([0,h^2/2)\) 中，\(s\) 的取值不超过 \(h\)，因此不会出现 \(t>h\)。所以 \(t\) 的首次出现一定在 \(\ge h^2/2\) 处，这解释了递推里频繁出现的位移 \(h^2/2\)。
在 \([h^2/2,h^2)\) 这块里，图像由一个“平移块”和一个“镜像块”拼接：平移块把小值 \(d\) 映射成 \(h+d\)，镜像块把小值 \(h-d\) 映射成 \(h+d\)。因此：
- 若 \(c\) 很小（落在平移块覆盖的那部分），就对应到 \(g(d,c)\)；
- 否则就对应到 \(g(h-d,c)\)。
在 \([h^2,2h^2)\) 的后半部分里，\(t\) 的剩余出现次数会继续被拆成两类：一类等价于找 \(d\)（次数要扣掉前半段已经出现的 \(h-d\) 次，所以出现 \(c+d-h=c-(h-d)\)）；另一类等价于找镜像高度 \(2h-d\)，对应最后的 \(c-h\)。

这样四种情况就对应上了递推的四个分支（再加上 \(t\) 为 2 的幂时的特殊拆分）。

代码

from functools import lru_cache
M = 45

@lru_cache(None)
def g(t: int, c: int) -> int:
    """
    g(t,c): s(n) 中值 t 第 c 次出现时的下标 n（0-based）
    约束：t>=1, 1<=c<=t
    """
    if t == 1:
        return 0

    h = 1 << (t.bit_length() - 1)
    d = t - h

    # t 为 2 的幂
    if d == 0:
        if c <= t // 2:
            return t * t // 4 + g(t // 2, c)
        else:
            return t * t // 2 + g(t, c - t // 2)

    # t = h + d, 0<d<h
    if c > h:
        return h * h + g(2 * h - d, c - h)
    if c > h - d:
        return h * h + g(d, c + d - h)
    if c <= d:
        return h * h // 2 + g(d, c)
    return h * h // 2 + g(2 * h - t, c)  # 2h-t = h-d

def solve() -> int:
    F = [1, 1]
    ans = 0
    for t in range(2, M + 1):
        F.append(F[-1] + F[-2])
        ans += g(F[t], F[t - 1])
    return ans

print(solve())