Skip to main content

Command Palette

Search for a command to run...

Advent of Code 2025 - Day 2

A mathematical approach

Updated
7 min read

Hi guys, today was Day 2 of AoC 2025, and boy it was fun one

Puzzle 1

Deduced problem: Given a set of ranges in the form [a, b], we had to find sum of all possible invalid IDs in these ranges.

Invalid ID is an ID which is made only of some sequence of digits repeated twice. For example, 55(5 twice), 6464 (64 twice), and 123123 (123 twice) would all be invalid IDs.

Intuition and Thoughts

Hmm, so the invalid numbers(say $N$) would be of the form \(N \rightarrow xx\), where $x$ is a say $k$ digits number, so $N$ will have $k$ digits and the value of $N$ is given by

$$N = x + 10^k \cdot x = (10^k + 1) \cdot x$$

So, for $N$ must be a multiple of \(10^k + 1\)

Approach

For a fixed $k$, $N$ is a $2k$ digit number which is a multiple of \(m = 10^k + 1\)

So if we find all $x$ such that \(x \cdot m\) is in the range, then we found the invalid ids which can be deduced from the following conditions

$$a \leq N \leq b \implies a \leq x \cdot m \leq b$$

which is from the problem only and,

$$10^{k-1} \leq x \leq 10^k - 1$$

since $x$ is $k$ digit number

Solving the problem

Combining the two inequalities we get,

$$\max(10^{k-1}, \frac{a}{m}) \leq x \leq \min (10^k - 1, \frac{b}{m})$$

Since this we are dealing with integer calculation, I will make it more precise

$$\max(10^{k-1}, \lceil \frac{a}{m} \rceil) \leq x \leq \min (10^k - 1, \lfloor\frac{b}{m}\rfloor)$$

Implementation

I used __int128 data type for storing the sums, the values and everything, just because I wanted to be paranoid — definitely not needed

  1. I used a precomputed table to compute the masks value

  2. To avoid data loss in extreme cases due to conversion between integers and doubles and ceiling them I instead do (a + b - 1) / b to compute \(\lceil\frac{a}{b}\rceil\)

Steps:

  1. Early Exit on Useless Cases:

    • If (a > b), exit early as there are no valid ranges.

    • If there are no even-length numbers within the range, exit early.

  2. Consider Each Valid (k): Identify each k such that a 2k digit number lies within the range [a, b].

  3. Compute min_x and max_x: Calculate min_x and max_x using the inequalities

  4. Check Existence of Numbers: Ensure numbers exist by checking min_x <= max_x.

  5. Generate (N): N can be generated by x in [min_x, min_x + 1, ..., max_x]).

  6. Sum the Values: Sum all x using basic arithmetic progression results and multiply by mask to get the final value.

__int128 find_invalid(__int128 a, __int128 b) {
  if (a > b)
    return 0;
  int digits_1 = count_digits(a); // We count the number of digits in a & b
  int digits_2 = count_digits(b);
  if (digits_1 == digits_2 && (digits_1 & 1))
    return 0;
  __int128 ans = 0;
  for (int k = (digits_1 + 1) / 2; 2 * k <= digits_2; ++k) {
    int twice_k = 2 * k;
    __int128 mask = 1 + pow10_128[k]; // mask = 10^k + 1
    __int128 loN = max(a, pow10_128[2 * k - 1]);
    __int128 hiN = min(b, pow10_128[2 * k] - 1);
    if (loN > hiN)
      continue;
    __int128 min_x = (loN + mask - 1) / mask;
    __int128 max_x = hiN / mask;
    if (min_x > max_x) continue;
    __int128 sum_x = (max_x - min_x + 1) * (min_x + max_x) / 2;
    ans += sum_x * mask;
  }
  return ans;
}

Complexity Analysis

The time complexity is \(\Theta(d_b - d_a)\) or \(\Theta(\log \frac{b}{a})\) as the loop runs that many times

Puzzle 2

Deduced Problem: Same as last problem but definition of Invalid ID changes.

An ID is invalid if it is made only of some sequence of digits repeated at least twice.

Intuition and Thoughts

Using the same notations as above, here \(N \rightarrow xxxx(\dots n\text{ times})\), so digits of $N$, \(d_N = k \cdot n\), where $N$ is given by

$$N = x \cdot (1 + 10^k + 10^{2k} \dots 10^{(n-1)k}) = x \cdot M(n, k)$$

where $M(n, k) $ is a repunit-like number, which is used to repeat a $k$ digit number $n$ times in order to get a \(d= k \cdot n\) digit number.

By concept of geometric progressions,

$$M(n, k) = \dfrac{10^{nk} - 1}{10^k - 1}$$

So to generate a \(d_N\) digit $N$, the only thing we need is few $k$ digit numbers $x$, where \(k \mid n\)($k$ divides $n$) and \(k \ne n\) and they will generate $N$

So for a fixed \(d_N\) and $k$, we calculate the possible values of $x$(just like we did last time), but this time mask is replaced by a more general term repunit (mask is repunit for n = 2)

So this is simply solved like the previous one only the mask value is swapped.

Gotcha!!!

This time there is an issue, the issue of duplicates, consider the following example
\(d_N = 6\), \(k = 1\), \(n = 6\), generates $111111$
which is also generated by \(k = 2, n = 3\) and \(k = 3 , n = 2\)

So there are solutions to this problem, I will share 3 and discuss only one and the simplest one and leave the other two as extensions for you to solve

Solution

Implementation quirks:

  1. I stored the proper factors of numbers upto 20 in a table named DIV

  2. Created a hash set for __int128 type(easier than it sounds)

Steps:

  1. Use the same procedure as the last puzzle

  2. Instead of summing all the numbers at once, we will store each of the generated number in a hash set, so that we can deduplicate the duplicates

  3. After going through all possible digits, we sum at last

__int128_t find_invalid(__int128 a, __int128 b) {
  if (a > b)
    return 0;
  int digits_1 = count_digits(a); // We count the number of digits in a & b
  int digits_2 = count_digits(b);
  unordered_set<__int128, Hash128, Eq128> invalid_nums;
  for (int digits = digits_1; digits <= digits_2; ++digits) {

    for (const int k : DIV[digits]) {
      __int128 repunit = (pow10_128[digits] - 1) / (pow10_128[k] - 1);
      __int128 loN = max(a, pow10_128[digits - 1]);
      __int128 hiN = min(b, pow10_128[digits] - 1);
      __int128 min_x = (loN + repunit - 1) / repunit;
      __int128 max_x = (hiN) / repunit;
      if (min_x > max_x)
        continue;
      for (int i = min_x; i <= max_x; ++i) {
        invalid_nums.insert(i * repunit);
      }
    }
  }
  __int128_t ans = 0;
  for (const __int128_t invalid_num : invalid_nums) {
    ans += invalid_num;
  }
  return ans;
}

Performance Analysis

Outer loop runs \(\Theta(d_b - d_a)\) just like last puzzle
First Inner loop runs \(\Theta(f)\)times where $f$ is the sum of number of proper factors of all \(d \in [d_a, d_b]\)
Second Inner loop runs \(\Theta(s)\) in total where $s$ is the number of solutions(Invalid IDs) and so does the sum computing loop

Total time complexity: \(\Theta(d_b - d_a +f +s)\)

Improvement 1 — primitive block

My initial idea for improvement is phrased by an LLM as:

Define a k-digit block x as primitive if it is not itself a repetition of a smaller block. For a given (k,n), you should only accept x that are primitive; then every final N has a unique representation (with k minimal).

Improvement 2 - counting primitives without enumerating x (Möbius)

I found the following from ChatGPT and haven’t explored much, so I would love to hear more about this from anyone of you

If you need to count how many primitive k-digit x lie in the interval [A, B] (where A and B are arbitrary, e.g. x_low..x_high), you can use inclusion-exclusion / Möbius inversion on divisors of k. Rough sketch:

  • Let F(t) be number of k-digit numbers that are repetition of a block of length t where t | k (and t<k). Those x are of the form y repeated k/t times. Counting those ≤ X reduces to counting y ≤ something. Use Möbius to invert and get count of primitive numbers. Implementation is trickier because of the arbitrary interval endpoints but doable if you need asymptotic speed and k can be large.

For those of you who want the solution with improvements: https://gist.github.com/rounakkumarsingh/d8ed1af840c7ad995329e647672cc3e9. (I neither have written this nor have read this, this is generated by ChatGPT)

Advent Of Code 2025

Part 4 of 4

In this series, I will be sharing my approach and the logic behind those along with some C++(or maybe something else sometimes) code

Start from the beginning

Advent of Code 2025 — Day 5: Fresh Ingredients & Interval Logic

Turning messy ingredient ranges into clean answers with set checks and interval merging.