Evaluation of Polynomials and Powers #

The Problem #

I saw somewhere the following fragment that caught my eye:

for (j = 0; j <= Grado; j++)
  zy[j] += Val[i][0] * pow(Val[i][1], j);

If we call Val[i][1] a and Val[i][0] b, we can see that the author evaluates b *(1+a+a²+a³+…) in a remarkably inefficient way.

It reminded me of a quote from Numerical Recipes in C where the authors, in their opinionated style, were saying: ¹

We assume that you know enough never to evaluate a polynomial this way:
p = c[0] + c[1] * x + c[2] * x * x + c[3] * x * x * x + c[4] * x * x * x * x;
or (even worse!),
p = c[0] + c[1] * x + c[2] * pow(x,2.0) + c[3] * pow(x,3.0) + c[4] * pow(x,4.0);
Come the (computer) revolution, all persons found guilty of such criminal behavior will be summarily executed, and their programs won’t be!

The Code #

We still don’t know when the (computer) revolution envisaged by Dr. Press et al. will come but, in the meantime, you can save yourself by using the following template functions:

/*!
  Evaluate a polynomial using Horner's scheme
  \param x Evaluation point
  \param coeff polynomial coefficients in order from lowest power (coeff[0]) to
               highest power (coeff[N-1])
  \param n size of coefficient's array.

  \return Polynomial value in point x
          coeff[n-1]*x^(n-1) + coeff[n-2]*x^(n-2) + ... + coeff[1]*x + coeff[0]
*/
template <typename T>
T poly (T x, const T* coeff, int n)
{
  T val = coeff[n - 1];
  for (int i = n - 2; i >= 0; i--)
  {
    val *= x;
    val += coeff[i];
  }
  return val;
}

///  Evaluate a polynomial using Horner's scheme
template <typename T, size_t N>
T poly (T x, std::array<T, N> coeff)
{
  return poly (x, coeff.data (), (int)N);
}

///  Evaluate a polynomial using Horner's scheme
template <typename T>
T poly (T x, std::vector<T> coeff)
{
  return poly (x, coeff.data (), (int)coeff.size());
}

There are three variants of the poly function: the first one accepts a “classical” C array while the second and the third accept a C++ array or, respectively, a vector.

Below are some examples of how to use them:

//coeffs for (X+1)^3
int cube[] { 1,3,3,1 };

int v = poly (2, cube, _countof(cube));
// v is 27
// ...

// f(x) = x^4 + 2x^3 + 3x^2 + 4x + 5
// f(2) = 57
v = poly (2, std::array<int, 5>{ 5, 4, 3, 2, 1 });

// Coefficients for sine approximation using Taylor series
// sin(x) = x - x^3/3! + x^5/5! - x^7/7! + ...
std::vector<double> a(8);
double fact = -1;
for (int i = 1; i <= 7; i++)
{
  if (i % 2)
  {
    fact *= -i;
    a[i] = 1 / fact;
  }
  else
    fact *= i;
}

double vv = poly (M_PI/2, a);
// vv is very close to 1.

Integer Powers #

Now that I, hopefully, convinced you not to use the pow function to compute polynomials, let me show you a function that can be more efficient for computing integer powers:

//integer exponentiation function
template <typename T>
T ipow (T base, int exp)
{
  T result = 1;
  while (exp)
  {
    if (exp & 1)
      result *= base;
    exp >>= 1;
    base *= base;
  }
  return result;
}

Do not use this function for polynomial evaluation. You can use it in other cases where you have to compute some integer powers and you want to show off to friends and family.

In case they ask you if this function is more efficient than classical pow function, here are some numbers from my computer (no speed monster this one). The results are for 1000000 calls to pow and ipow functions with exponent 2 and 32 respectively.

Pow 2 results (usec):
 ipow - integer base, integer power - 1355
 ipow - double base, integer power - 1685
 pow  - double base, integer power - 21623
 pow  - double base, double power - 24605
 pow  - integer base, integer power - 27731

Pow 32 results (usec):
 ipow - integer base, integer power - 4666
 ipow - double base, integer power - 5014
 pow  - double base, integer power - 26776
 pow  - double base, double power - 24592
 pow  - integer base, integer power - 32638

In C++, the pow function is highly overloaded and, for small exponents, it is quite efficient. For larger exponents however, our ipow function wins the battle.²

Even faster (small) integer powers #

In some particular cases, we can do even better by unrolling the loop for small exponents. Due to their prevalence in mathematical formulas, such “optimization” seems justified. Below are the two functions added to the mix:

/// Return squared value of argument: base²
template <typename T>
inline T squared (T base)
{
  return base * base;
}

/// Return the cubed value of argument: base³
template <typename T>
inline T cubed (T base)
{
  return base * base * base;
}

Timing results show that, indeed, squared and cubed functions perform better that both ipow and pow functions:

Squared results (usec):
 squared - integer base - 336
 squared - double base - 544
 pow  - double base, integer power - 21630
 pow  - double base, double power - 22306
 pow  - integer base, integer power - 26777

Cubed results (usec):
 cubed - integer base - 756
 cubed - double base - 567
 pow  - double base, integer power - 29267
 pow  - double base, double power - 28058
 pow  - integer base, integer power - 28008

The following table summarizes these results:

Base	Exponent	squared	cubed	ipow	pow
integer	2	0.3	-	1.4	27.7
	3	-	0.7	2.9	28.6
	32	-	-	4.6	32.6
double	2	0.5	-	1.7	24.6
	3	-	0.6	3.2	30.2
	32	-	-	5.0	24.6

Table 1 — Timing results - time in milliseconds for one million exponentiations

Show me the code #

All the code is part of MLIB library.

polynomial evaluation function is in poly.h.
integer exponentiation functions are in ipow.h
timing test code is part of the test suite and can be found in tests_ipow.cpp

History #

June 19, 2020: Revised timing results; added timing code
October 25, 2024: Added squared and cubed functions

William H. Press … [and others]. Numerical Recipes in C : the Art of Scientific Computing. Cambridge [Cambridgeshire] ; New York :Cambridge University Press, 1992. page 173 ↩︎
These results are from 2024 using MSVC 2022. It is interesting to compare those with the original values from 2020 using MSVC 2019 (and a significantly slower processor):
```
Pow 2 results (usec):
ipow - integer base, integer power - 1219
ipow - double base, integer power - 1267
pow  - double base, integer power - 1010
pow  - double base, double power - 26184

One second has 1002864 usec
Pow 32 results (usec):
ipow - integer base, integer power - 9178
ipow - double base, integer power - 2614
pow  - double base, integer power - 24876
pow  - double base, double power - 24718
```
Note that there isn’t much difference between the speed of ipow and pow if exponent is integer 2. Actually pow is a bit faster (1.01 ms) than ipow (1.27 ms). In the results set from 2024 this is no longer the case: pow time is 21.6 ms while ipow time is 1.7 ms. One possible explanation might be that MSVC 2019 runtime library was optimized for this particular case. ↩︎