# Chebyshev’s Almost Prime Number Theorem

To the right, you can see a picture of the Prime Number Theorem. It states that the number of primes up to a real number $x$ is asymptotically equal to $\frac{x}{\ln x}$.

And this was Pafnuty Lvovich Chebyshev who almost managed to prove it around the year 1850. His almost-proof resulted in a theorem named after him.

I was recently trying to understand the proof of Chebyshev’s theorem:

Theorem 1. There are constants $0 < c_1 < 1 < c_2$ such that, for all sufficiently large real numbers X, $c_1 \frac{X}{\log X} \leq \pi(X) \leq c_2 \frac{X}{logX}$.

Chebyshev’s Theorem

In this post, I will reproduce this proof from [Cheb] together with some comments and tables.

Ben Green’s introductory text on the Prime Number Theorem (PNT) [Green2016] introduces this theorem as a first estimate of the correctness of the PNT with elementary methods. The actual PNT states that $c_1$ and $c_2$ in Theorem 1 can be taken arbitrarily close to 1. For some reason, Green does not tell us the name/author of this theorem but Green tells us why this theorem is interesting:

• the upper bound is saying that $\pi(X) = \mathcal{O}\left(X/ \log X\right)$
• the lower bound immediately implies the infinitude of primes (and with a much better bound than Euclid’s proof)
• the upper bound implies, in particular, that the density of the primes up to $X$, that is to say $\pi(X)/X$, tends to zero as $X \rightarrow \infty$.

But, before one can understand the proof of Chebyshev’s theorem, one will definitely have to understand

Theorem 2: $n!$ contains a prime factor $p$ exactly $\nu_p(n!) = \sum_{k\geq 1} \lfloor{\frac{n}{p^k}}\rfloor$ times. Or, in other words: $\nu_p(n!)$ is the largest power of $p$ that divides $n$ and is called the p-adic valuation of $n!$.

Legendre’s Theorem

Now, before talking about a proof of Theorem 2, I would like to state the following

Fun Fact 3: Every prime factor of n! is less than or equal to n.

This follows from the fact that if a prime p divides a product, it must divide at least one of its factors, $p\vert ab \Rightarrow p\vert a \vee p\vert b$. In the case of Fun Fact 3 it implies that if a prime $p$ divides $n!$ then it must divide one of the factors $1\cdot 2 \cdot \ldots \cdot n$.

Now, let’s get back to the proof of Theorem 2: What we want to understand is “how often” a certain prime number divides $n!$. And this quantity will be called $\nu_p(n!)$. Let’s look at an example first: The prime number 2 divides $4!=24=3\cdot 2^3$ exactly 3 times. According to the formula that we want to prove $\nu_2(4!) = \lfloor\frac{4}{2}\rfloor + \lfloor\frac{4}{2^2}\rfloor + \lfloor\frac{4}{2^3}\rfloor + \ldots = 2 + 1 + 0 + \ldots = 3$. The reason why this works is because $p$ divides $\lfloor\frac{n}{p}\rfloor$ factors (2 divides 2 and 4 in the example). $p^2$ divides $\lfloor\frac{n}{p^2}\rfloor$ factors ($2^2$ divides only 4 in the example). And so on. Thus, since all numbers that are divisible by $p^2$ are also divisble by $p$, they contribute via both terms of the sum. In our example, the number 4 contributes once via the first term and once via the second term. In total it contributes 2 to $\nu_2(4!)$ which makes sense since it is equal to $2^2$. This concludes our proof of Theorem 2. $\Box$

Now that we know Legendre’s Theorem, we can prove the lower bound of Theorem 1: I will reproduce this proof from [Cheb] and add some comments and tables. We will first define a number $N := {2n \choose n} = \frac{2n!}{n!(2n-n)!} = \frac{2n!}{(n!)^2}$. We can apply Theorem 2 to obtain $\nu_p(N) = \sum_{m\geq 1} \left(\lfloor\frac{2n}{p^m}\rfloor - 2\lfloor\frac{n}{p^m}\rfloor\right)$. All the terms in this sum are either 0 or 1. This becomes evident by

Lemma 4: For all $x \in \mathbb{R}$ we have $\lfloor 2x \rfloor - 2\lfloor x \rfloor \in {0, 1}$.

Proof: Write $x = \lfloor x\rfloor + \{x\}$. If the fractional part $\{x\} < \frac{1}{2}$ , then $2x = 2\lfloor x\rfloor + \{2x\}$, hence $\lfloor 2x \rfloor - 2\lfloor x\rfloor = 0$. If $\{x\} \geq \frac{1}{2}$, then we get $\lfloor 2x\rfloor - 2\lfloor x\rfloor = 1$.

It is also clear that if $m > \frac{\log 2n}{\log p}$ this means that $p^m > 2n$ and therefore $\lfloor\frac{2n}{p^m}\rfloor - 2\lfloor\frac{n}{p^m}\rfloor = 0$ because both fractions (inside the floor brackets) are smaller than one. Now that we know the maximum value of $m$ that still contributes something to the sum in $\nu_p(N)$ we can use this together with Lemma 4 (the fact that all terms in the sum are either zero or one) to obtain the bound $\nu_p(N) \leq \lfloor \frac{\log 2n}{\log p}\rfloor$.

$\displaystyle 2n \log 2 - \log 2n \underbrace{\leq}_{1.} \log {2n \choose n} \underbrace{\leq}_{2.} \sum_{p\leq2n} \lfloor \frac{\log 2n}{\log p}\rfloor \log p \underbrace{\leq}_{3.} \sum_{p\leq 2n} \log 2n = \pi(2n) \log 2n$

At the less-or-equal signs we have used:

1. $\frac{2^{2n}}{2n} \leq {2n \choose n}$.
2. $N = \prod p^{\nu_p(N)}$
3. $\lfloor x\rfloor \leq x$

This yields the lower bound $\displaystyle \pi(2n) \geq \log 2 \frac{2n}{\log 2n} - 1$. We claim that this implies $\displaystyle \pi(x) \geq \frac{\log 2}{2}\frac{x}{\log x}$ for all $x > 2$. This inequality can be checked directly for $x \leq 16$:

 x $\pi(x)$ $\frac{\log 2}{2}\frac{x}{\log x}$ 2 1 1 3 2 $\approx 0.9464$ 4 2 1 5 3 $\approx 1.0767$ 6 3 $\approx 1.1606$ 7 4 $\approx 1.2467$ 8 4 $\approx 1.3333$ 9 4 $\approx 1.4196$ 10 4 $\approx 1.5051$ 11 5 $\approx 1.5899$ 12 5 $\approx 1.6737$ 13 6 $\approx 1.7565$ 14 6 $\approx 1.8385$ 15 6 $\approx 1.9197$ 16 6 2

Hence it is sufficient to prove it for $x > 16$. Pick an integer $n$ with $16 \leq 2n < x \leq 2n + 2$.
Then $\displaystyle \frac{2n}{\log 2n} - \frac{n + 1}{\log 2n} = \frac{n - 1}{\log 2n} \geq \frac{7}{4 \log 2} \geq \frac{1}{\log 2}$, hence $\displaystyle \pi(x) \geq \pi(2n) \geq \log 2 \frac{2n}{\log 2n} - 1 \geq \frac{(n + 1) \log 2}{\log(2n + 2)} \geq \frac{\log 2}{2}\frac{x}{\log x}$. This concludes the proof of the lower bound of Theorem 1.

For the upper bound: Let’s first observe 2 facts about the binomial coefficient $N={2n \choose n}$:

1. $\frac{2^{2n}}{2n} \leq {2n \choose n} \leq 2^{2n}$: this comes from the binomial formula $(x+y)^n = \sum_{k=0}^n {n \choose k} x^{n-k} y^k$ where we set $x=y=1$: $(1+1)^{2n} = \sum_{k=0}^{2n} {2n \choose k}$.
2. N is not divisible by any primes $p > 2n$. See Fun Fact 3.

These properties imply that $n^{\pi(2n) - \pi(n)} \leq \prod_{n.

Taking the log gives $\pi(2n)-\pi(n) \leq 2 \log 2 \frac{n}{\log n}$. Using induction we now easily see that $\pi(2^k) \leq 3\cdot \frac{2^k}{k}$.

In fact, this is checked directly for $k \leq 5$:

 k $\pi(2^k)$ $3\cdot \frac{2^k}{k}$ 1 0 6 2 2 6 3 4 8 4 6 12 5 11 $96/5\approx 19$

For $k > 5$ we find $\pi(2^{k+1}) \leq \pi(2^k) + \frac{2^{k+1}}{k} \leq \frac{3 \cdot 2^k}{k} + \frac{2 \cdot 2^k}{k} = \frac{5 \cdot 2^k}{k} \leq \frac{3 \cdot 2^{k+1}}{k + 1}$.

Now we exploit the fact that $f(x) = \frac{x}{\log x}$ is monotonely increasing for $x \geq e$. Thus if $4 \leq 2^k < x \leq 2^{k+1}$, then $\pi(x) \leq \pi(2^{k+1}) \leq 6\frac{2^k}{k + 1} \leq 6 \log 2 \frac{2^k}{\log 2^k} \leq 6 \log 2 \frac{x}{\log x}$.

In the last step we have only onsidered values of $x>4$. But since $\pi(x) \leq 6 \log 2 \frac{x}{\log x}$ for $x \leq 4$, the proof is now complete. $\Box$

Here are some links that I’ve collected on this topic (the first two of which are referenced in this post):