Euler-Maclaurin summation via the Basel problem

I recently started working through the book 📖 “Riemann’s Zeta Function” by H.M. Edwards. In chapter 6 he introduces Euler-Maclaurin summation, the technique that allowed the Danish mathematician Jørgen Pedersen Gram to compute the first 15 non-trivial zeros of the Riemann Zeta function in 1903. I will repeat parts of Edwards’ arguments here and use this blog post to introduce Euler-Maclaurin summation in my own words (in fact I will quote heavily from the book). I will not yet connect this with calculating zeros of the Zeta function; that might be the topic of a future post. I will, however, add some pictures and reference the Python code that I used to generate them (https://mybinder.org/v2/gh/ahartel/rzf.git/HEAD).

As the initial example, Edwards chooses the sum S = (\frac{1}{10})^2 + (\frac{1}{11})^2 + \ldots + (\frac{1}{100})^2, which is a piece of the series \zeta(2). Evaluating that series was known as the Basel problem until Euler solved it in 1735, discovering Euler-Maclaurin summation along the way.
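To have a reference value to compare our approximations against, we can first compute S by brute force (a quick Python sketch, independent of the linked notebook):

```python
# Brute-force evaluation of S = (1/10)^2 + (1/11)^2 + ... + (1/100)^2
S = sum((1 / n) ** 2 for n in range(10, 101))
print(S)  # approximately 0.0952162
```

Keep this value in mind; everything below is about reproducing it without adding up 91 terms.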

A first, crude approximation

Efficiently (and approximately) evaluating certain sums, such as partial sums of the Riemann Zeta function, is exactly what Euler-Maclaurin summation is useful for. For our concrete example S as defined above, we might first observe that the sum can be approximated by the area under the curve of f(x)=(1/x)^2. To get a feeling for this, let’s start by plotting it.

Figure 1: The sum S vs. the integral of f(x) = 1/x^2

The bars in Figure 1 clearly overestimate the area under the red curve, i.e., the sum overestimates the integral of f(x). Or, to put it the other way around, the integral underestimates our sum. As a next step, we can observe that \int_a^b f(x) dx \approx \sum_{i=1}^{n} \frac{f(x_i) + f(x_{i-1})}{2} (x_i - x_{i-1}), which goes by the name “trapezoidal rule“. Here’s what it looks like:

Figure 2: The trapezoidal rule vs. the integral of f(x) = 1/x^2

The new way to choose the height of the bars is certainly a better approximation to the integral. But remember that our goal is not to approximate the integral but rather to have the integral approximate our sum. To connect the integral, the trapezoidal rule and our initial sum S, we can observe the following:

\displaystyle S = \sum_{n=10}^{100} \left(\frac{1}{n}\right)^2 = \frac{1}{2} (\frac{1}{10})^2 + \frac{1}{2} ((\frac{1}{10})^2 + (\frac{1}{11})^2) + \frac{1}{2} ((\frac{1}{11})^2 + (\frac{1}{12})^2) + \ldots + \frac{1}{2} ((\frac{1}{99})^2 + (\frac{1}{100})^2) + \frac{1}{2} (\frac{1}{100})^2 \approx \frac{1}{2} (\frac{1}{10})^2 + \int_{10}^{100} (\frac{1}{x})^2 dx + \frac{1}{2} (\frac{1}{100})^2

We obtain this by applying the trapezoidal rule and then adding the “fringe” terms. Let’s write this down in a more general form:

\displaystyle \sum_{n=M}^{N} f(n) \approx \int_M^N f(x) dx + \frac{1}{2}(f(M) + f(N)) \ \ \ \ \ \ \ \ (1)

This gives a first approximation of

\displaystyle S \approx \int_{10}^{100} (\frac{1}{x})^2 dx + 0.5 \times (0.01 + 0.0001) = \left[-\frac{1}{x}\right]_{10}^{100} + 0.005 + 0.00005 = 0.09505.

The real value is slightly higher, but we’ll get to that later.
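Approximation (1) is easy to check numerically for our example (a small sketch):

```python
# Approximation (1) for f(x) = 1/x^2 on [10, 100]:
# the exact integral plus the two "fringe" terms.
def f(x):
    return 1 / x ** 2

integral = (-1 / 100) - (-1 / 10)            # [-1/x] from 10 to 100, i.e. 0.09
approx = integral + 0.5 * (f(10) + f(100))   # adds 0.005 + 0.00005
print(approx)                                 # approximately 0.09505

exact = sum(f(n) for n in range(10, 101))
print(exact - approx)                         # positive: the sum is slightly higher
```

The difference comes out to roughly 1.7 \cdot 10^{-4}; this is exactly the gap that the error term of the next section accounts for.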

Defining the error term

Now, looking again at Figure 2, we might try to find a formula for the shark-fin-shaped differences between the 1/x^2 curve and the bars. These differences alternate in sign: within each bar, the red curve starts above the bar’s top edge, then crosses it and runs below it, and this pattern repeats with decreasing amplitude from bar to bar.

It turns out that the missing term in equation (1) is the last term in the following equation:

\displaystyle \sum_{n=M}^{N} f(n) = \int_M^N f(x)dx + \frac{1}{2} \left[f(M) + f(N)\right] + \int_M^N (x - \left\lfloor x \right\rfloor - \frac{1}{2}) f'(x)dx\ \ \ \ \ \ (2)

I must admit that I find Edwards’ derivation of this term slightly hard to follow, but here’s how we can convince ourselves that it makes sense once we know it:

\displaystyle \int_M^N (x - \left\lfloor x \right\rfloor - \frac{1}{2}) f'(x)dx = \sum_{n=M}^{N-1} \int_{n}^{n+1} (x - \left\lfloor x \right\rfloor - \frac{1}{2}) f'(x)dx = \sum_{n=M}^{N-1} \int_{0}^{1} (t-\frac{1}{2}) f'(n+t)dt = \sum_{n=M}^{N-1} \left[(t-\frac{1}{2}) f(n+t)|_{0}^{1} - \int_{0}^{1} f(n+t)dt  \right] = \sum_{n=M}^{N-1} \left[\frac{1}{2}f(n+1) + \frac{1}{2} f(n)\right] - \sum_{n=M}^{N-1} \int_{0}^{1} f(n+t)dt = \frac{1}{2}f(M) + f(M+1) + f(M+2) + \ldots + f(N-1) + \frac{1}{2} f(N) - \int_M^N f(x)dx
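Since equation (2) is an exact identity, we can also verify it numerically for our f (a sketch using a crude midpoint rule, applied one unit interval at a time so that the jumps of x - \lfloor x \rfloor cause no trouble):

```python
# Numerical check of identity (2) for f(x) = 1/x^2 on [10, 100].
def f(x):
    return 1 / x ** 2

def fprime(x):
    return -2 / x ** 3

M, N = 10, 100
steps = 1000  # midpoint-rule subintervals per unit interval

error_term = 0.0
for n in range(M, N):
    for k in range(steps):
        t = (k + 0.5) / steps  # midpoint within (0, 1)
        error_term += (t - 0.5) * fprime(n + t) / steps

integral = (-1 / N) - (-1 / M)   # exact value of the integral of 1/x^2
rhs = integral + 0.5 * (f(M) + f(N)) + error_term
lhs = sum(f(n) for n in range(M, N + 1))
print(abs(lhs - rhs))  # tiny, limited only by the midpoint rule's accuracy
```

The two sides agree to within the accuracy of the numerical integration, as the identity promises.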

Now we can apply equation (2) to our series S. Visually this looks like the following picture:

Figure 3: The shark-fin-shaped error term of equation (2) for f(x) = 1/x^2

Now, to make use of this new term for our estimate of the value of S, we must observe the following: as equation (2) suggests and as can (barely) be seen in Figure 3, the fins alternate in sign and decrease in magnitude. The first (positive) fin has a slightly larger area than the second (negative) fin, which in turn has a larger magnitude than the third, and so on. The sum of all fins except the first is therefore negative, so the value of the integral in the last term of (2) is positive but certainly smaller than the area of the first positive fin:

\displaystyle \int_{10}^{10.5} (x - \left\lfloor x \right\rfloor - \frac{1}{2}) \frac{-2}{x^3} dx = \int_0^{1/2} \frac{1-2t}{(10+t)^3} dt \leq \frac{1}{10^3} \int_0^{1/2} (1-2t)dt = 0.00025

Thus, S lies between our previous estimate of 0.09505 and 0.09530.
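The first fin’s actual area can be checked numerically against this bound (again a midpoint-rule sketch):

```python
# Area of the first positive fin:
# integral over t in [0, 1/2] of (1 - 2t) / (10 + t)^3 dt.
steps = 100_000
h = 0.5 / steps
fin = 0.0
for k in range(steps):
    t = (k + 0.5) * h          # midpoint of the k-th subinterval of [0, 1/2]
    fin += (1 - 2 * t) / (10 + t) ** 3 * h
print(fin)  # approximately 0.000238, comfortably below the bound 0.00025
```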

Expanding the error term

In this last part of the article, we will derive a series expansion of the error term that we introduced in the last section. Our starting point is an intermediate result from last section’s derivation:

\displaystyle \int_M^N (x - \left\lfloor x \right\rfloor - \frac{1}{2}) f'(x)dx = \sum_{n=M}^{N-1} \int_{n}^{n+1} (x - \left\lfloor x \right\rfloor - \frac{1}{2}) f'(x)dx = \sum_{n=M}^{N-1} \int_{0}^{1} (t - \frac{1}{2}) f'(n+t)dt\ \ \ \ \ \ (3).

And from there, repeated integration by parts will yield the final Euler-Maclaurin summation formula.

The final formula will contain the Bernoulli numbers, which I described in a previous post. To see how these arise, we first need to introduce the Bernoulli polynomials. These are defined via \int_x^{x+1} B_n(t)dt = x^n, i.e., they are polynomials that, when integrated from x to x+1, give x raised to the n-th power. If we assume, for example, that B_3(x) is a general polynomial in x of degree 3, then this definition lets us solve for its coefficients, giving B_3(x) = x^3 - \frac{3}{2} x^2 + \frac{1}{2} x.

By differentiating the definition, we can find the following relations which then allow us to find B_0 to B_2 from B_3. These relations are:

B_n(x+1) - B_n(x) = n x^{n-1} and \int_x^{x+1} \frac{1}{n} B'_n(t)dt = x^{n-1} which shows that B'_n(x)/n satisfies the definition of B_{n-1}(x) and hence that B'_n(x) = nB_{n-1}(x).

These allow us to find B_2(x) = x^2 - x + \frac{1}{6}, B_1(x) = x - \frac{1}{2} and B_0(x) = 1.
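These polynomials and relations can be checked exactly with rational arithmetic (a small sketch using Python’s fractions module; P3 is simply an antiderivative of B_3, introduced here only for the check):

```python
from fractions import Fraction

# The Bernoulli polynomials derived above, with exact rational coefficients.
def B1(x): return x - Fraction(1, 2)
def B2(x): return x ** 2 - x + Fraction(1, 6)
def B3(x): return x ** 3 - Fraction(3, 2) * x ** 2 + Fraction(1, 2) * x

# An antiderivative of B3; the defining property says
# P3(x + 1) - P3(x) = integral from x to x+1 of B3(t) dt = x^3.
def P3(t): return t ** 4 / 4 - t ** 3 / 2 + t ** 2 / 4

for x in [Fraction(0), Fraction(1, 3), Fraction(7, 2), Fraction(-2)]:
    assert P3(x + 1) - P3(x) == x ** 3       # defining property of B_3
    assert B3(x + 1) - B3(x) == 3 * x ** 2   # B_n(x+1) - B_n(x) = n x^(n-1)
    assert B2(x + 1) - B2(x) == 2 * x
    assert B1(x + 1) - B1(x) == 1
print("all identities hold exactly")
```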

Now that we have a basic definition of the Bernoulli polynomials, we can recognize B_1(x) in equation (3) and repeatedly integrate by parts. Thanks to the relations we found for the Bernoulli polynomials, this yields integrals of higher-order Bernoulli polynomials times higher-order derivatives of f(x). Here is one step as an example:

\displaystyle \int_M^N (x - \left\lfloor x \right\rfloor - \frac{1}{2}) f'(x)dx = \sum_{n=M}^{N-1} \int_{n}^{n+1} (x - \left\lfloor x \right\rfloor - \frac{1}{2}) f'(x)dx = \sum_{n=M}^{N-1} \int_{0}^{1} (t - \frac{1}{2}) f'(n+t)dt = \sum_{n=M}^{N-1} \int_0^1 B_1(t) f'(n+t) dt = \sum_{n=M}^{N-1} \left[\frac{B_2(t)}{2} f'(n+t)|_0^1 - \frac{1}{2} \int_0^1 B_2(t) f''(n+t) dt \right] = -\frac{B_2(0)}{2} f'(M) + \frac{B_2(1)}{2} f'(M+1) - \frac{B_2(0)}{2}f'(M+1) + \frac{B_2(1)}{2} f'(M+2) - \frac{B_2(0)}{2}f'(M+2) + \ldots + \frac{B_2(1)}{2} f'(N) - \frac{1}{2} \int_M^N B_2(x-\left\lfloor x \right\rfloor) f''(x)dx \ \ \ \ \ \ (4)

The sum telescopes (because B_2(0) = B_2(1)) and what remains is the following: \displaystyle \frac{B_2(0)}{2} f'(x)|_M^N - \frac{1}{2} \int_M^N B_2(x-\left\lfloor x \right\rfloor)f''(x)dx.

Stopping here gives us a term equal to \displaystyle \frac{B_2(0)}{2}f'(x)|_M^N = \frac{1/6}{2}\cdot\frac{-2}{x^3}\Big|_{10}^{100} = \frac{1}{6} \left[10^{-3} - 10^{-6}\right] = 1.665 \cdot 10^{-4}.

The remaining term \displaystyle -\frac{1}{2} \int_M^N B_2(x-\left\lfloor x \right\rfloor) f''(x)dx is at most \displaystyle 6.25\cdot 10^{-7} in absolute value because of the following estimate:

\displaystyle \left| -\frac{1}{2} \int_M^N B_2(x-\left\lfloor x \right\rfloor) f''(x)dx \right| = \left| -\frac{B_3(0)}{2\cdot 3} f''(x)|_M^N + \frac{1}{2\cdot 3}\int_M^N B_3(x-\left\lfloor x \right\rfloor) f'''(x)dx \right| = \left| \frac{(-2)(-3)(-4)}{2\cdot 3}\int_{10}^{100} \frac{B_3(x - \left\lfloor x \right\rfloor)}{x^5}dx \right| \leq \frac{4}{10^5} \int_{10}^{10.5} B_3(x - \left\lfloor x \right\rfloor) dx = \frac{4}{10^5} \int_0^{1/2} B_3(t) dt = \frac{4}{10^5} \left[\frac{1}{4}t^4 - \frac{1}{2} t^3 + \frac{1}{4} t^2\right]_0^{1/2} = \frac{4}{10^5}\left[\frac{1}{64} -\frac{1}{16}+ \frac{1}{16} \right] = \frac{1}{16 \cdot 10^5} = 6.25 \cdot 10^{-7}

Here, we have again used the declining-alternating-shark-fin-area argument from the last section and estimated the integral by its first positive bump; note also that the boundary term vanishes because B_3(0) = B_3(1) = 0. See the next figure for a plot of the integrand:

Figure 4: B_3(x-\lfloor x \rfloor)/x^5

With this last estimate, we can now say that S = 0.09505 + 1.665\times 10^{-4} - 4\int_{10}^{100}\frac{B_3(x-\lfloor x\rfloor)}{x^5}dx and therefore 0.095215875 \leq S \leq 0.0952165, which is correct to 6 places.
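Numerically, adding the B_2 correction to our first approximation indeed lands within the promised 6.25 \cdot 10^{-7} of the true value (a quick sketch):

```python
# Refined estimate of S: approximation (1) plus the B_2/2 correction term.
def f(x):
    return 1 / x ** 2

integral = 0.09                          # exact value of the integral from 10 to 100
fringe = 0.5 * (f(10) + f(100))          # 0.005 + 0.00005
b2_term = (1 / 6) / 2 * (-2 / 100 ** 3 - (-2 / 10 ** 3))  # approx 1.665e-4
estimate = integral + fringe + b2_term
print(estimate)                           # approximately 0.0952165

exact = sum(f(n) for n in range(10, 101))
print(estimate - exact)                   # positive and below 6.25e-7, as promised
```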

Repeating this process (and adding in some insights about the Bernoulli polynomials and the closely related Bernoulli numbers which I will not repeat here) we obtain the following Euler-Maclaurin summation formula:

\displaystyle \sum_{n=M}^{N} f(n) = \int_M^N f(x)dx + \frac{1}{2}\left[f(M) + f(N)\right] + \frac{B_2}{2!} f'(x)|_M^N + \frac{B_4}{4!} f'''(x)|_M^N + \ldots + \frac{B_{2\nu}}{(2\nu)!} f^{(2\nu-1)}(x)|_M^N + R_{2\nu} \ \ \ \ \ \ (5)

For the error term R_{2\nu}, it is worth noting that it is no greater than about twice the size of the first term omitted. So as long as the terms in the summation formula are decreasing, the error term decreases as well.
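To see formula (5) in action, here is a sketch for f(x) = 1/x^2 (my own illustration, not code from Edwards or the linked notebook; the Bernoulli numbers B_2 through B_8 are hard-coded and the closed form f^{(k)}(x) = (-1)^k (k+1)!/x^{k+2} is used for the derivatives):

```python
import math

# Euler-Maclaurin summation (formula (5)) for f(x) = 1/x^2.
BERNOULLI = {2: 1 / 6, 4: -1 / 30, 6: 1 / 42, 8: -1 / 30}

def f_deriv(k, x):
    """k-th derivative of 1/x^2, i.e. (-1)^k (k+1)! / x^(k+2)."""
    return (-1) ** k * math.factorial(k + 1) / x ** (k + 2)

def euler_maclaurin(M, N, nu):
    total = (-1 / N) - (-1 / M)                     # exact integral of 1/x^2
    total += 0.5 * (f_deriv(0, M) + f_deriv(0, N))  # the fringe terms
    for k in range(2, 2 * nu + 1, 2):               # B_2k/(2k)! * f^(2k-1) terms
        total += BERNOULLI[k] / math.factorial(k) * (
            f_deriv(k - 1, N) - f_deriv(k - 1, M))
    return total

exact = sum(1 / n ** 2 for n in range(10, 101))
for nu in range(1, 5):
    print(nu, abs(euler_maclaurin(10, 100, nu) - exact))
```

For \nu = 2 the result already agrees with the direct sum to roughly eight decimal places, and the error keeps shrinking as long as the added terms keep decreasing.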

As a last step, I want to point out that we can’t simply estimate \zeta(2) by adding the first 9 terms that we have omitted so far. These are \frac{1}{1^2} + \frac{1}{2^2} + \cdots + \frac{1}{9^2} and they add up to \approx 1.53976773. Our estimate would then be 1.634983605 \leq \zeta(2) \leq 1.63498423. But, comparing this to the known value \zeta(2) = \frac{\pi^2}{6} \approx 1.644934067, we see that the real value does not fall within the bounds that our naive estimate gives. The reason why this fails might be the topic of a future blog post. But let me finish by noting that it does not fail because Euler-Maclaurin summation cannot be applied to the zeta function, but because we have not chosen our parameters correctly.

