Chebyshev’s Almost Prime Number Theorem

To the right, you can see a picture of the Prime Number Theorem. It states that the number of primes up to a real number x is asymptotically equal to \frac{x}{\ln x}.

And this was Pafnuty Lvovich Chebyshev who almost managed to prove it around the year 1850. His almost-proof resulted in a theorem named after him.

I was recently trying to understand the proof of Chebyshev’s theorem:

Theorem 1. There are constants 0 < c_1 < 1 < c_2 such that, for all sufficiently large real numbers X, c_1 \frac{X}{\log X} \leq \pi(X) \leq c_2 \frac{X}{logX}.

Chebyshev’s Theorem

In this post, I will reproduce this proof from [Cheb] together with some comments and tables.

Ben Green’s introductory text on the Prime Number Theorem (PNT) [Green2016] introduces this theorem as a first estimate of the correctness of the PNT with elementary methods. The actual PNT states that c_1 and c_2 in Theorem 1 can be taken arbitrarily close to 1. For some reason, Green does not tell us the name/author of this theorem but Green tells us why this theorem is interesting:

  • the upper bound is saying that \pi(X) = \mathcal{O}\left(X/ \log X\right)
  • the lower bound immediately implies the infinitude of primes (and with a much better bound than Euclid’s proof)
  • the upper bound implies, in particular, that the density of the primes up to X, that is to say \pi(X)/X, tends to zero as X \rightarrow \infty.

But, before one can understand the proof of Chebyshev’s theorem, one will definitely have to understand

Theorem 2: n! contains a prime factor p exactly \nu_p(n!) = \sum_{k\geq 1} \lfloor{\frac{n}{p^k}}\rfloor times. Or, in other words: \nu_p(n!) is the largest power of p that divides n and is called the p-adic valuation of n!.

Legendre’s Theorem

Now, before talking about a proof of Theorem 2, I would like to state the following

Fun Fact 3: Every prime factor of n! is less than or equal to n.

This follows from the fact that if a prime p divides a product, it must divide at least one of its factors, p\vert ab \Rightarrow p\vert a \vee p\vert b. In the case of Fun Fact 3 it implies that if a prime p divides n! then it must divide one of the factors 1\cdot 2 \cdot \ldots \cdot n.

Now, let’s get back to the proof of Theorem 2: What we want to understand is “how often” a certain prime number divides n!. And this quantity will be called \nu_p(n!). Let’s look at an example first: The prime number 2 divides 4!=24=3\cdot 2^3 exactly 3 times. According to the formula that we want to prove \nu_2(4!) = \lfloor\frac{4}{2}\rfloor + \lfloor\frac{4}{2^2}\rfloor + \lfloor\frac{4}{2^3}\rfloor + \ldots = 2 + 1 + 0 + \ldots = 3. The reason why this works is because p divides \lfloor\frac{n}{p}\rfloor factors (2 divides 2 and 4 in the example). p^2 divides \lfloor\frac{n}{p^2}\rfloor factors (2^2 divides only 4 in the example). And so on. Thus, since all numbers that are divisible by p^2 are also divisble by p, they contribute via both terms of the sum. In our example, the number 4 contributes once via the first term and once via the second term. In total it contributes 2 to \nu_2(4!) which makes sense since it is equal to 2^2. This concludes our proof of Theorem 2. \Box

Now that we know Legendre’s Theorem, we can prove the lower bound of Theorem 1: I will reproduce this proof from [Cheb] and add some comments and tables. We will first define a number N := {2n \choose n} = \frac{2n!}{n!(2n-n)!} = \frac{2n!}{(n!)^2}. We can apply Theorem 2 to obtain \nu_p(N) = \sum_{m\geq 1} \left(\lfloor\frac{2n}{p^m}\rfloor - 2\lfloor\frac{n}{p^m}\rfloor\right). All the terms in this sum are either 0 or 1. This becomes evident by

Lemma 4: For all x \in \mathbb{R} we have \lfloor 2x \rfloor - 2\lfloor x \rfloor \in {0, 1}.

Proof: Write x = \lfloor x\rfloor + \{x\}. If the fractional part \{x\} < \frac{1}{2} , then 2x = 2\lfloor x\rfloor + \{2x\}, hence \lfloor 2x \rfloor - 2\lfloor x\rfloor = 0. If \{x\} \geq \frac{1}{2}, then we get \lfloor 2x\rfloor - 2\lfloor x\rfloor = 1.

It is also clear that if m > \frac{\log 2n}{\log p} this means that p^m > 2n and therefore \lfloor\frac{2n}{p^m}\rfloor - 2\lfloor\frac{n}{p^m}\rfloor = 0 because both fractions (inside the floor brackets) are smaller than one. Now that we know the maximum value of m that still contributes something to the sum in \nu_p(N) we can use this together with Lemma 4 (the fact that all terms in the sum are either zero or one) to obtain the bound \nu_p(N) \leq \lfloor \frac{\log 2n}{\log p}\rfloor.

\displaystyle 2n \log 2 - \log 2n \underbrace{\leq}_{1.} \log {2n \choose n} \underbrace{\leq}_{2.}  \sum_{p\leq2n} \lfloor \frac{\log 2n}{\log p}\rfloor \log p \underbrace{\leq}_{3.}  \sum_{p\leq 2n} \log 2n = \pi(2n) \log 2n

At the less-or-equal signs we have used:

  1. \frac{2^{2n}}{2n} \leq {2n \choose n}.
  2. N = \prod p^{\nu_p(N)}
  3. \lfloor x\rfloor \leq x

This yields the lower bound \displaystyle \pi(2n) \geq \log 2 \frac{2n}{\log 2n} - 1. We claim that this implies \displaystyle \pi(x) \geq \frac{\log 2}{2}\frac{x}{\log x} for all x > 2. This inequality can be checked directly for x \leq 16:

x \pi(x)
\frac{\log 2}{2}\frac{x}{\log x}
211
3
2
\approx 0.9464
421
53 \approx 1.0767
63 \approx 1.1606
74 \approx  1.2467
84 \approx  1.3333
94 \approx  1.4196
104 \approx  1.5051
115 \approx  1.5899
125 \approx  1.6737
136 \approx  1.7565
146 \approx  1.8385
156 \approx  1.9197
1662

Hence it is sufficient to prove it for x > 16. Pick an integer n with 16 \leq 2n < x \leq 2n + 2.
Then \displaystyle \frac{2n}{\log 2n} - \frac{n + 1}{\log 2n} = \frac{n - 1}{\log 2n} \geq \frac{7}{4 \log 2} \geq \frac{1}{\log 2}, hence \displaystyle \pi(x) \geq \pi(2n) \geq \log 2 \frac{2n}{\log 2n} - 1 \geq \frac{(n + 1) \log 2}{\log(2n + 2)} \geq \frac{\log 2}{2}\frac{x}{\log x}. This concludes the proof of the lower bound of Theorem 1.

For the upper bound: Let’s first observe 2 facts about the binomial coefficient N={2n \choose n}:

  1. \frac{2^{2n}}{2n} \leq {2n \choose n} \leq 2^{2n}: this comes from the binomial formula (x+y)^n = \sum_{k=0}^n {n \choose k} x^{n-k} y^k where we set x=y=1: (1+1)^{2n} = \sum_{k=0}^{2n} {2n \choose k}.
  2. N is not divisible by any primes p > 2n. See Fun Fact 3.

These properties imply that n^{\pi(2n) - \pi(n)} \leq \prod_{n<p\leq 2n} p \leq {2n \choose n} \leq 2^{2n}.

Taking the log gives \pi(2n)-\pi(n) \leq 2 \log 2 \frac{n}{\log n}. Using induction we now easily see that \pi(2^k) \leq 3\cdot \frac{2^k}{k}.

In fact, this is checked directly for k \leq 5:

k\pi(2^k)3\cdot \frac{2^k}{k}
106
22
6
348
4612
51196/5\approx 19

For k > 5 we find \pi(2^{k+1}) \leq \pi(2^k) + \frac{2^{k+1}}{k} \leq \frac{3 \cdot 2^k}{k} + \frac{2 \cdot 2^k}{k} = \frac{5 \cdot 2^k}{k} \leq \frac{3 \cdot 2^{k+1}}{k + 1}.

Now we exploit the fact that f(x) = \frac{x}{\log x} is monotonely increasing for x \geq e. Thus if 4 \leq 2^k < x \leq 2^{k+1}, then \pi(x) \leq \pi(2^{k+1}) \leq 6\frac{2^k}{k + 1} \leq 6 \log 2 \frac{2^k}{\log 2^k} \leq 6 \log 2 \frac{x}{\log x}.

In the last step we have only onsidered values of x>4. But since \pi(x) \leq 6 \log 2 \frac{x}{\log x} for x \leq  4, the proof is now complete. \Box

Here are some links that I’ve collected on this topic (the first two of which are referenced in this post):


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s