mostly math?

correctness not guaranteed

proving terrible lower bounds for the prime-counting function

Even if you don’t know it by its name, you’re probably familiar with Euclid’s theorem (and you’ve maybe heard its proof too at some point):

There are infinitely many prime numbers.

Now, less people are familiar with the remarkable prime number theorem:

Let \(\pi(N)\) be the number of primes \(p\) with \(p \leq N\). Then \(\pi(N)\) approaches \(\frac{N}{\log{N}}\) as \(N \rightarrow \infty\)¹².

This tells us a lot more about how many primes there are: not only are there infinitely many primes, they are actually not too rare.³ While this is significantly harder to prove (it took until 1896), we can however get some (much worse) lower bounds for the number of primes up to \(N\) (in other words: upper bounds on the \(n\)th prime) without too much hassle.⁴

baby steps with euclid

But first, let’s recap Euclid’s proof of his theorem: If the primes were finite, you could list them all as \(2, 3, 5, \ldots, p\), where \(p\) is the greatest prime. Now, let \(q = 2 \cdot 3 \cdot 5 \cdots p + 1\). We can see that \(q\) is not divisible by any prime number in our list, so either \(q\) is prime, or there’s another prime between \(p\) and \(q\): a contradiction. In terms of \(\pi(N)\), this shows that \(\pi(N) \rightarrow \infty\).

There’s actually a lower bound hidden in Euclid’s argument: if \(p_n\) is the \(n\)th prime, then the next prime is less than or equal to \(q\). Without an explicit list of primes, we can tell that \[p_{n+1} \leq p_n^n + 1,\] and that equality only holds for \(n = 1\) (so that \(p_2 = p_1^1 + 1 = 2^1 + 1 = 3\)). So, we get

\[\begin{align} p_{n+1} &\leq p_n^n \leq (p_{n-1}^{n-1})^n \leq p_{n-1}^{(n-1)n} \leq p_{n-2}^{(n-2)(n-1)n} \\\\ &\leq \ldots \leq p_1^{n!} = 2^{n!} \end{align}\]

for \(n > 1\) and thus \[p_n \leq 2^{(n-1)!} \quad (n > 2).\] As you might be able to tell, this lower bound is pretty terrible: while \(p_7 = 17\), we have \(2^{6!} \approx 5.52 \cdot 10^{216}\) (and it only gets worse…).

We can actually do a little better yet: we can prove \(p_n < 2^{2^n}\) by induction: For \(n = 1\), we have \(p_n = p_1 = 2 < 2^{2^1} = 2^{2^n}\). Now, let’s assume \(p_n < 2^{2^n}\) for all \(n \leq N\). By Euclid’s argument, we get

\[\begin{align} p_{N+1} &\leq p_1 p_2 p_3 \cdots p_N + 1 \\\\ &< 2^2 \cdot 2^4 \cdot 2^8 \cdots 2^{2^N} + 1 = 2^{2 + 4 + \cdots + 2^N} + 1 \\\\ p_{N+1} &< 2^{2^{N+1}}. \end{align}\]

So, we can conclude that \[p_n < 2^{2^n}.\] While this bound is still pretty terrible, it is miles better: while before we had \(p_7 \leq 2^{6!} \approx 5.52 \cdot 10^{216}\), we now get \(p_7 < 2^{2^7} = 2^{128} \approx 3.40 \cdot 10^{38}\).

If we wanna think in terms of \(\pi(N)\), we have \(\pi(2^{2^n}) = n\), which after solving \(N = 2^{2^n}\) for \(n\) gets us \[\pi(N) > \log_2\log_2{N}.\]⁵ To compare again: while \(\pi(10^{10}) = 455,052,511\), this gets us \(\pi(10^{10}) > \log_2\log_2{10^{10}} \approx 5.05\). Getting there!

one-(1)-upping ourselves: using fermat numbers

Since we’re spending time with terrible bounds anyway, we might as well just try to add one (\(1\)) to our last lower bound for \(\pi(n)\).⁶

For that, we need Fermat numbers: the \(n\)th Fermat number is defined as \(F_n = 2^{2^n} + 1\). Now, we need a small theorem which Wikipedia assures me is called Goldbach’s theorem:

No two Fermat number have a common divisor greater than 1.

The proof isn’t too hard: let’s say we have two Fermat numbers, \(F_n\) and \(F_{n+k}\), who have a common divisor in \(m\). Let’s first check check if \(F_n\) divides \(F_{n+k} - 2\) (this helps us, I swear!):

\[\begin{align} \frac{F_{n+k} - 2}{F_n} &= \frac{2^{2^{n+k}} + 1 - 2}{2^{2^n} + 1} = \frac{(2^{2^n})^{2^k} - 1}{2^{2^n} + 1} \\\\ &= \frac{x^{2^k} - 1}{x + 1} \quad (\mathrm{with}\ x = 2^{2^n}) \\\\ &= x^{2^k - 1} - x^{2^k - 2} + \cdots - 1. \end{align}\]

Now, since \(m \mid F_n\) and \(F_n \mid F_{n-k} - 2\), we also get \(m \mid F_{n+k} - 2\), and since we also have \(m \mid F_{n+k}\), we get \(m \mid 2\)! But from their definition, we can see that all Fermat numbers are odd, so \(m\) must be odd, too: so, \(m\) must be 1, and we have proved what we set out to prove.

Now, since no two Fermat numbers share a common factor, there must be at least \(n\) odd prime numbers less than or equal to \(F_n\). With our one even prime number two, we get \[p_n \leq F_{n-1} = 2^{2^{n-1}} + 1,\] which is a square root better than our previous upper bound:⁷ when we had \(p_7 \lessapprox 3.40 \cdot 10^{38}\) before, we now have \(p_7 \lessapprox 1.84 \cdot 10^{19}\). Still pretty terrible!

In terms of \(\pi(N)\), this didn’t help much, though: solving \(N = 2^{2^{n-1}} + 1\) for \(n\) gets us \[\pi(N) \gtrapprox \log_2\log_2{N} + 1,\] which is the added one (1) I promised earlier⁸.

from loglog to log: erdős to the rescue!

As the heading suggests, we’re able to drop from \(\log\log\) to \(\log\) with an argument due to Erdős.⁹

We let \(N_j(x)\) be the function counting all \(n \leq x\) that aren’t divisible by a prime greater than \(p_j\). We can express any such \(n\) as \(n_1^2 m\) where \(m\) is a square-free number, which means that it’s not divisible by any perfect square other than 1, so that we can write \[m = 2^{b_1} \cdot 3^{b_2} \cdots p_j^{b_j}\] where all the \(b\)s are either 0 or 1.¹⁰ Because each number only has one prime factorization, there are only \(2^j\) possible \(m\)! Now, since \(n_1 \leq \sqrt{n}\) and \(n \leq x\), we have \(n_1 \leq \sqrt{n} \leq \sqrt{x}\), so there are only up to \(\sqrt{x}\) possible values of \(n_1\)!

We can thus conclude that there can only be up to \(2^j \sqrt{x}\) such \(n\), so that \[N_j(x) \leq 2^j \sqrt{x}.\] Now comes the trick: we let \(j = \pi(x)\), so that \(p_{j+1} > x\) and thus \[N_j(x) = x\] by definition. This gets us \[x = N_j(x) \leq 2^{\pi(x)} \sqrt{x}.\] Now we solve for \(\pi(x)\):

\[\begin{align} x &\leq 2^{\pi(x)} \sqrt{x} \\\\ \sqrt{x} &\leq 2^{\pi(x)} \\\\ \frac{\log{x}}{2} &\leq \pi(x)\log{2} \\\\ \pi(x) &\geq \frac{\log{x}}{2\log{2}}. \end{align}\]

When we were at \(\pi(10^{10}) > \log_2\log_2{10^{10}} + 1 \approx 6.05\), we now have a whopping bound of \(\pi(10^{10}) \gtrapprox 16.61\)! Still ways to go to \(455,052,511\) though…

Substituting \(x = p_n\) so that \(\pi(x) = n\), we get \[p_n \leq 4^n\] (I’ll spare you the details), which improves our previous upper bound \(p_7 \leq 1.84 \cdot 10{19}\) to \(p_7 \leq 16384\), which is a number you can actually hold in your head!

conclusions

Well, while our bounds are still quite terrible, they at least have improved significantly from the ones we got from Euclid’s argument. Technology!

Jokes aside, I hope you learned something, and even if you didn’t, I hope you still had fun! Until next time~~

Unless indicated otherwise, all \(\log\)s are always natural logarithms. ↩
This is also often written as \(\pi(N) \sim \frac{N}{\log{N}}\). ↩
There is a much better approximation of the number of primes up til \(N\): the so-called logrithmic integral \[\operatorname{Li}(x) = \int_2^x \frac{dt}{\log(t)}.\] As far as I understand and/or remember, assuming the Riemann hypothesis, it has been proven that the error of this approximation is as if the primes were distributed truly pseudorandomly! (I could be wrong though!) ↩
So what we’re basically doing is improving Euclid’s theorem over and over! ↩
As far as I can tell, number theorists don’t really like having non-natural logarithms, but we can fix that: for all \(N \geq 2\), we have \(\log_2\log_2{N} > \log\log{N}\), and these two terms only differ by a constant factor, so the bound \(\pi(N) > \log\log{N}\) is almost as good. ↩
This argument is due to Pólya, as far as I know. ↩
Since \(\sqrt{2^{2^n}} = 2^{\frac{2^n}{2}} = 2^{2^{n-1}}\). ↩
The \(\gtrapprox\) is due to me not wanting to be bothered with the \(1\) in \(2^{2^{n-1}} + 1\) ↩
What’s a math post without Erdős, anyway? ↩
There are no prime factors bigger than \(p_j\) because we only look at \(n\) that are not divisible by any larger primes. ↩

categories: math tags: math number-theory