The Dan Numbers

The Dan Numbers

(code for this post can be downloaded from GitHub)

Background #

Bucharest is an interesting city with quirks that you may never find in other places. Among them, it has a mayor, Mr. Nicusor Dan, who has a PhD in mathematics and was a bona-fide mathematician before being elected mayor. During an appearance on a TV call-in show, a listener gave him a math puzzle to solve saying it is a grade 4 problem. Mr Dan struggled for a bit, gave up and promised a solution shortly. Here is the grade 4 problem:

Find the smallest number with the property that, when you take the last digit and put it in front (like changing from “2315” to “5231”), the resulting number is twice as big as the original.

The next day he posted a YouTube video (in Romanian) with two different solutions and a variant to one of the solutions:

I’ll get back to one of the solutions latter on in this writeup but, before going any further, let me remark that seeing the mayor of a large city in front of a blackboard solving a math problem is not an everyday sight. It turns out the solution is an 18-digit number: 105,263,157,894,736,842

When someone told me the story, my reaction was to say that I’m no mathematician, but sure I’m a programmer, so I’ll write a program to find the solution. Let the competition begin between brute force and smarts.

Testing the water #

I started with a simple solution: an array of digits (each one fits nicely in a byte) and doing the required transformation. The function that checks if we found the number is:

// Parameters:
//  num - array of 0 to 9 digits
//  digits - number of digits in the array
bool check(char* num, int digits)
{
  long long sum1 = 0;
  for (int i = 0; i < digits ; i++)
    sum1 = sum1*10 + num[i];

  long long sum2 = num[digits-1];
  for (int i = 0; i < digits-1; i++)
    sum2 = sum2 * 10 + num[i];

  return sum2 == 2*sum1;
}

Our array of digits can be viewed as the coefficients of a polynomial that has to be evaluated for x=10. To evaluate it we use Horner’s scheme just as I was showing in another article.

The only other function needed is a function to increment our array of digits:

void inc(char* num, int digits)
{
  digits--;
  do {
    num[digits] = (num[digits] + 1) % 10;
  } while (num[digits] == 0 && --digits >= 0);
}

Translated in English: starting from the least significant digit, increment the digit and, if there is a rollover, move to the more significant digit and do the same.

The main program checks successive numbers and, if it a solution, it prints it. Otherwise it increments the number and keeps going. There is also a timer borrowed from my UnitTest framework:

int main(int argc, char** argv)
{
  char num[20];
  int digits = 8;
  while (digits < 20)
  {
    memset (num, 0, sizeof(num));
    UnitTest::Timer t;
    t.Start();

    num[0] = 1;
    while (num[0] != 0)
    {
      if (check(num, digits))
      {
        std::cout << "stop" << std::endl;
        print(num, digits);
        exit(0);
      }
      inc(num, digits);
    }
    double tsec = t.GetTimeInMs() / 1000.;
    std::cout << "Finished " << digits << " digits in " 
      << std::setprecision(3) << tsec << " sec" << std::endl;
    ++digits;
  }
  return 0;
}

When the inner loop rolls over (num[0] == 0), the outer loop shows the timing result and increments the number of digits. Here are some results:

Finished 8 digits in 0.701 sec
Finished 9 digits in 7.57 sec
Finished 10 digits in 85.5 sec

Wow! Each additional digit takes 10 times as long as the previous one. This is to be expected as there are 10 times more numbers to check, but if 9 digits can be checked in 8 seconds, for 18 digits we will need $8\times10^9$ seconds. This is more than 250 years!

Going parallel #

If I look at my CPU utilization: it barely touches 25% because my program is single-threaded and all those cores and logical processors aren’t used at all. Luckily I have all the multi-threading and synchronization tools I need.

We will have a number of worker threads receive a range of numbers to check from a central dispatcher. When a thread finishes its assigned work, it picks a new one. A producer-consumer queue will be fed by the dispatcher and emptied by worker threads.

//unit of work
struct task {
  int start_digit;
  int digits;
};

mlib::async_queue<task> work(NUMTHREADS);

int main(int argc, char** argv)
{
  mlib::thread* workers[NUMTHREADS];
  chrono.Start();

  for (int i = 0; i < NUMTHREADS; i++)
  {
    workers[i] = new mlib::thread(worker);
    workers[i]->start();
  }

  int digits = 8;
  while (digits < 20)
  {
    for (int i = 1; i <= 9; ++i)
      work.produce({i, digits});
    ++digits;
  }
  return 0;
}

A unit of work (task) is something like “all 8-digits numbers starting with 1”. The central dispatcher, in the main program, places all these work units in the produce-consumer queue where worker threads pick them up:

int worker()
{
  char num[20];
  while (1)
  {
    task todo;
    work.consume(todo);
    memset(num, 0, sizeof(num));

    num[0] = todo.start_digit;
    while (num[0] == todo.start_digit)
    {
      if (check(num, todo.digits))
      {
        std::cout << "stop" << std::endl;
        print(num, todo.digits);
        exit(0);
      }
      inc(num, todo.digits);
    }
    double tsec = chrono.GetTimeInMs() / 1000.;
    std::cout << "Finished job " << todo.start_digit << '-' << todo.digits 
      << " at " << std::setprecision(3) << tsec << " sec" << std::endl;
  }
}

The worker gets a unit of work and starts verifying the numbers in the range assigned. If it finds a solution, it prints it and abruptly stops (no cleanup – damn the torpedoes!). Otherwise, when finished, picks a new range of numbers after printing a progress message.

With 9 threads the CPU utilization is perfect:

Finished job 7-8 at 0.142 sec
Finished job 1-8 at 0.144 sec
Finished job 5-8 at 0.145 sec
Finished job 4-8 at 0.147 sec
Finished job 2-8 at 0.149 sec
Finished job 8-8 at 0.151 sec
Finished job 6-8 at 0.154 sec
Finished job 9-8 at 0.156 sec
Finished job 3-8 at 0.159 sec
Finished job 5-9 at 1.4 sec
Finished job 1-9 at 1.48 sec
Finished job 7-9 at 1.49 sec
Finished job 4-9 at 1.49 sec
Finished job 3-9 at 1.5 sec
Finished job 8-9 at 1.51 sec
Finished job 2-9 at 1.51 sec
Finished job 6-9 at 1.54 sec
Finished job 9-9 at 1.56 sec
Finished job 5-10 at 16.1 sec
Finished job 4-10 at 16.2 sec
Finished job 9-10 at 16.5 sec
Finished job 3-10 at 16.7 sec
Finished job 7-10 at 16.8 sec
Finished job 6-10 at 16.9 sec
Finished job 2-10 at 17.1 sec
Finished job 1-10 at 17.1 sec
Finished job 8-10 at 17.1 sec

We need only 2 seconds for 9 digits so estimated time went down from $8\times10^9$ to $2\times10^9$ (only 63 years!).

Incremental Improvements #

Time to be a bit smarter: maybe we don’t need to evaluate the number polynomial every time. If our number is “4351”, the modified one will be “1435”. It cannot be the double of the original because “1” is smaller than “4”. The same goes for “4356” and “6435” because “6” is smaller than 2*4.

The modified verification function is:

bool check(char* num, int digits)
{
  //eliminate impossible cases
  if (num[digits - 1] < 2*num[0])
    return false;

  long long sum1 = 0;
  long long sum2 = num[digits - 1];
  for (int i = 0; i < digits; i++)
  {
    sum1 = sum1 * 10 + num[i];
    if (i < digits - 1)
      sum2 = sum2 * 10 + num[i];
  }
  return sum2 ==  2*sum1;
}

Results:

Finished job 5-8 at 0.054 sec
Finished job 8-8 at 0.056 sec
Finished job 7-8 at 0.058 sec
Finished job 6-8 at 0.06 sec
Finished job 9-8 at 0.064 sec
Finished job 4-8 at 0.076 sec
Finished job 3-8 at 0.088 sec
Finished job 2-8 at 0.104 sec
Finished job 1-8 at 0.128 sec
Finished job 5-9 at 0.572 sec
Finished job 6-9 at 0.585 sec
Finished job 7-9 at 0.597 sec
Finished job 8-9 at 0.608 sec
Finished job 9-9 at 0.631 sec
Finished job 4-9 at 0.701 sec
Finished job 3-9 at 0.857 sec
Finished job 2-9 at 1.01 sec
Finished job 1-9 at 1.29 sec
Finished job 5-10 at 5.67 sec
Finished job 6-10 at 5.75 sec
Finished job 7-10 at 5.9 sec
Finished job 8-10 at 6.05 sec
Finished job 9-10 at 6.34 sec
Finished job 4-10 at 7.17 sec
Finished job 3-10 at 8.76 sec
Finished job 2-10 at 10.5 sec
Finished job 1-10 at 13.1 sec

We are now down to an estimated time of $1.3\times10^9$ seconds (about 41 years). Some further tweaks to the increment function reduced it to 38 years.

Using integer math #

Until now I’ve used a byte array to hold the digits and assembled those digits when I needed to do math. Maybe it was the wrong approach; maybe we should keep integer numbers and separate digits when we need them. With this idea the verification function becomes:

bool inline check(long long num, long long exp)
{
  int msd = (int)(num / exp);
  int lsd = num % 10;

  if (lsd < 2*msd)
    return false;

  long long swapped = num - (lsd - msd) + (lsd - msd) *exp;
  return swapped ==  2*num;
}

The increment function disappears completely because incrementing an integer is a primitive operation.

The worker() thread function is also simpler:

int worker()
{
  while (1)
  {
    task todo;
    work.consume(todo);
    long long exp = (long long) pow(10, todo.digits-1);
    long long num = todo.start_digit * exp;
    long long lim = (todo.start_digit + 1) * exp;
    while (num < lim)
    {
      if (check(num, exp))
      {
        std::cout << "stop " << num << std::endl;
        exit(0);
      }
      ++num;
    }
    double tsec = chrono.GetTimeInMs() / 1000.;
    std::cout << "Finished job " << todo.start_digit << '-' << todo.digits 
      << " at " << std::setprecision(3) << std::setiosflags (std::ios::fixed) 
      << tsec << " sec" << std::endl;
  }
}

The resulting times are:

Finished job 2-8 at 0.038 sec
Finished job 3-8 at 0.041 sec
Finished job 6-9 at 0.273 sec
Finished job 9-9 at 0.288 sec
Finished job 7-9 at 0.296 sec
Finished job 5-9 at 0.304 sec
Finished job 4-9 at 0.310 sec
Finished job 2-9 at 0.316 sec
Finished job 3-9 at 0.326 sec
Finished job 8-9 at 0.332 sec
Finished job 1-9 at 0.350 sec
Finished job 6-10 at 2.851 sec
Finished job 9-10 at 2.859 sec
Finished job 8-10 at 2.871 sec
Finished job 1-10 at 2.885 sec
Finished job 7-10 at 2.904 sec
Finished job 5-10 at 2.956 sec
Finished job 4-10 at 2.977 sec
Finished job 3-10 at 3.031 sec
Finished job 2-10 at 3.121 sec

We have reduced our waiting time to $3\times10^8$ seconds, or “only” 9.5 years. Looking at the whole road I traveled, from 250 years to 9.5 years, it is a significant reduction and probably a beefier computer would be able to solve the problem by brute force. However, I feel it is now time to stop and look at the intelligent solution.

A mathematician’s solution #

This is the last solution presented by Mr. Dan, and to me it seems the easiest to explain and understand. Let us write the addition like 4-th grade students:

$$ \begin{align} \overline{a_n a_{n-1}\dots a_1 a_0} &+& \cr \overline{a_n a_{n-1}\dots a_1 a_0} &=& \cr \overline{a_0\;a_n\;\;\dots a_2 a_1} \end{align} $$
where $a_0\dots a_n$ are digits 0 to 9. $a_0$ cannot be 1 because it is the sum of 2 numbers $a_n+a_n$. We can now try successive values for values for $a_0$. Taking $a_0=2$ our addition becomes:
$$ \begin{align} \overline{a_n a_{n-1}\dots a_1 2} &+& \cr \overline{a_n a_{n-1}\dots a_1 2} &=& \cr \overline{2\;a_n\;\;\dots a_2 4} \end{align} $$
hence $a_1 = 4$. Replacing $a_1$ we further obtain:
$$ \begin{align} \overline{a_n a_{n-1}\dots 4 2} &+& \cr \overline{a_n a_{n-1}\dots 4 2} &=& \cr \overline{2\;a_n\;\;\dots a_2 4} \end{align} $$
hence $a_2=8$.

The process has to be continued until we obtain the digit 2 (our choice for $a_0$). In the end that gives

$$ \begin{align} 1 0 5 2 6 3 1 5 7 8 9 4 7 3 6 8 4 2 &+& \cr 1 0 5 2 6 3 1 5 7 8 9 4 7 3 6 8 4 2 &=& \cr 2 1 0 5 2 6 3 1 5 7 8 9 4 7 3 6 8 4 \end{align} $$

We would have to repeat the process for other choices for $a_0$ and verify that this is indeed the smallest number. That leads to an interesting next question: find all the numbers that have this property (I’m going to call them “Dan numbers”). Moreover, we can extend a bit the concept and define $DAN(n)$ as numbers that, if the least significant digit is moved in the most significant position, the resulting number is $n$ times the initial number.

I wrote a short program to do that, and here are the results:

$$ \eqalign { DAN(2) &= \{ 105263157894736842, 210526315789473684, 315789473684210526, 421052631578947368\} \cr DAN(3) &= \{ 1034482758620689655172413793, 2068965517241379310344827586, 3103448275862068965517241379 \} \cr DAN(4) &= \{ 102564, 205128\} \cr DAN(5) &= \{ 102040816326530612244897959183673469387755 \} \cr DAN(6) &= \{ 1016949152542372881355932203389830508474576271186440677966 \} \cr DAN(7) &= \{ 1014492753623188405797 \} \cr DAN(8) &= \{ 1012658227848 \}\cr DAN(9) &= \{ 10112359550561797752808988764044943820224719 \} } $$