Advent of Code 2025 - Day 2
A mathematical approach
Hi guys, today was Day 2 of AoC 2025, and boy it was fun one
Puzzle 1
Deduced problem: Given a set of ranges in the form [a, b], we had to find sum of all possible invalid IDs in these ranges.
Invalid ID is an ID which is made only of some sequence of digits repeated twice. For example, 55(5 twice), 6464 (64 twice), and 123123 (123 twice) would all be invalid IDs.
Intuition and Thoughts
Hmm, so the invalid numbers(say $N$) would be of the form \(N \rightarrow xx\), where $x$ is a say $k$ digits number, so $N$ will have $k$ digits and the value of $N$ is given by
$$N = x + 10^k \cdot x = (10^k + 1) \cdot x$$
So, for $N$ must be a multiple of \(10^k + 1\)
Approach
For a fixed $k$, $N$ is a $2k$ digit number which is a multiple of \(m = 10^k + 1\)
So if we find all $x$ such that \(x \cdot m\) is in the range, then we found the invalid ids which can be deduced from the following conditions
$$a \leq N \leq b \implies a \leq x \cdot m \leq b$$
which is from the problem only and,
$$10^{k-1} \leq x \leq 10^k - 1$$
since $x$ is $k$ digit number
Solving the problem
Combining the two inequalities we get,
$$\max(10^{k-1}, \frac{a}{m}) \leq x \leq \min (10^k - 1, \frac{b}{m})$$
Since this we are dealing with integer calculation, I will make it more precise
$$\max(10^{k-1}, \lceil \frac{a}{m} \rceil) \leq x \leq \min (10^k - 1, \lfloor\frac{b}{m}\rfloor)$$
Implementation
I used __int128 data type for storing the sums, the values and everything, just because I wanted to be paranoid — definitely not needed
I used a precomputed table to compute the masks value
To avoid data loss in extreme cases due to conversion between integers and doubles and ceiling them I instead do
(a + b - 1) / bto compute \(\lceil\frac{a}{b}\rceil\)
Steps:
Early Exit on Useless Cases:
If (a > b), exit early as there are no valid ranges.
If there are no even-length numbers within the range, exit early.
Consider Each Valid (k): Identify each
ksuch that a2kdigit number lies within the range[a, b].Compute
min_xandmax_x: Calculatemin_xandmax_xusing the inequalitiesCheck Existence of Numbers: Ensure numbers exist by checking
min_x <= max_x.Generate (N): N can be generated by
xin[min_x, min_x + 1, ..., max_x]).Sum the Values: Sum all
xusing basic arithmetic progression results and multiply bymaskto get the final value.
__int128 find_invalid(__int128 a, __int128 b) {
if (a > b)
return 0;
int digits_1 = count_digits(a); // We count the number of digits in a & b
int digits_2 = count_digits(b);
if (digits_1 == digits_2 && (digits_1 & 1))
return 0;
__int128 ans = 0;
for (int k = (digits_1 + 1) / 2; 2 * k <= digits_2; ++k) {
int twice_k = 2 * k;
__int128 mask = 1 + pow10_128[k]; // mask = 10^k + 1
__int128 loN = max(a, pow10_128[2 * k - 1]);
__int128 hiN = min(b, pow10_128[2 * k] - 1);
if (loN > hiN)
continue;
__int128 min_x = (loN + mask - 1) / mask;
__int128 max_x = hiN / mask;
if (min_x > max_x) continue;
__int128 sum_x = (max_x - min_x + 1) * (min_x + max_x) / 2;
ans += sum_x * mask;
}
return ans;
}
Complexity Analysis
The time complexity is \(\Theta(d_b - d_a)\) or \(\Theta(\log \frac{b}{a})\) as the loop runs that many times
Puzzle 2
Deduced Problem: Same as last problem but definition of Invalid ID changes.
An ID is invalid if it is made only of some sequence of digits repeated at least twice.
Intuition and Thoughts
Using the same notations as above, here \(N \rightarrow xxxx(\dots n\text{ times})\), so digits of $N$, \(d_N = k \cdot n\), where $N$ is given by
$$N = x \cdot (1 + 10^k + 10^{2k} \dots 10^{(n-1)k}) = x \cdot M(n, k)$$
where $M(n, k) $ is a repunit-like number, which is used to repeat a $k$ digit number $n$ times in order to get a \(d= k \cdot n\) digit number.
By concept of geometric progressions,
$$M(n, k) = \dfrac{10^{nk} - 1}{10^k - 1}$$
So to generate a \(d_N\) digit $N$, the only thing we need is few $k$ digit numbers $x$, where \(k \mid n\)($k$ divides $n$) and \(k \ne n\) and they will generate $N$
So for a fixed \(d_N\) and $k$, we calculate the possible values of $x$(just like we did last time), but this time mask is replaced by a more general term repunit (mask is repunit for n = 2)
So this is simply solved like the previous one only the mask value is swapped.
Gotcha!!!
This time there is an issue, the issue of duplicates, consider the following example
\(d_N = 6\), \(k = 1\), \(n = 6\), generates $111111$
which is also generated by \(k = 2, n = 3\) and \(k = 3 , n = 2\)
So there are solutions to this problem, I will share 3 and discuss only one and the simplest one and leave the other two as extensions for you to solve
Solution
Implementation quirks:
I stored the proper factors of numbers upto 20 in a table named
DIVCreated a hash set for
__int128type(easier than it sounds)
Steps:
Use the same procedure as the last puzzle
Instead of summing all the numbers at once, we will store each of the generated number in a hash set, so that we can deduplicate the duplicates
After going through all possible digits, we sum at last
__int128_t find_invalid(__int128 a, __int128 b) {
if (a > b)
return 0;
int digits_1 = count_digits(a); // We count the number of digits in a & b
int digits_2 = count_digits(b);
unordered_set<__int128, Hash128, Eq128> invalid_nums;
for (int digits = digits_1; digits <= digits_2; ++digits) {
for (const int k : DIV[digits]) {
__int128 repunit = (pow10_128[digits] - 1) / (pow10_128[k] - 1);
__int128 loN = max(a, pow10_128[digits - 1]);
__int128 hiN = min(b, pow10_128[digits] - 1);
__int128 min_x = (loN + repunit - 1) / repunit;
__int128 max_x = (hiN) / repunit;
if (min_x > max_x)
continue;
for (int i = min_x; i <= max_x; ++i) {
invalid_nums.insert(i * repunit);
}
}
}
__int128_t ans = 0;
for (const __int128_t invalid_num : invalid_nums) {
ans += invalid_num;
}
return ans;
}
Performance Analysis
Outer loop runs \(\Theta(d_b - d_a)\) just like last puzzle
First Inner loop runs \(\Theta(f)\)times where $f$ is the sum of number of proper factors of all \(d \in [d_a, d_b]\)
Second Inner loop runs \(\Theta(s)\) in total where $s$ is the number of solutions(Invalid IDs) and so does the sum computing loop
Total time complexity: \(\Theta(d_b - d_a +f +s)\)
Improvement 1 — primitive block
My initial idea for improvement is phrased by an LLM as:
Define a k-digit block x as primitive if it is not itself a repetition of a smaller block. For a given (k,n), you should only accept x that are primitive; then every final N has a unique representation (with k minimal).
Improvement 2 - counting primitives without enumerating x (Möbius)
I found the following from ChatGPT and haven’t explored much, so I would love to hear more about this from anyone of you
If you need to count how many primitive k-digit x lie in the interval [A, B] (where A and B are arbitrary, e.g. x_low..x_high), you can use inclusion-exclusion / Möbius inversion on divisors of k. Rough sketch:
- Let
F(t)be number ofk-digit numbers that are repetition of a block of lengthtwheret | k(andt<k). Thosexare of the formyrepeatedk/ttimes. Counting those ≤ X reduces to countingy ≤ something. Use Möbius to invert and get count of primitive numbers. Implementation is trickier because of the arbitrary interval endpoints but doable if you need asymptotic speed andkcan be large.
For those of you who want the solution with improvements: https://gist.github.com/rounakkumarsingh/d8ed1af840c7ad995329e647672cc3e9. (I neither have written this nor have read this, this is generated by ChatGPT)
