Abstract
This paper explores the depth of the Prime Number Theorem and proves the theorem using an array of arithmetical functions such as the Möbius function, Von Mangoldt function, Chebyshev Functions and Dirichlet Multiplication. This account attempts to simplify and smoothen rather temperamental (or fluctuating) functions, by replacing them with their integrals or using big-oh approximation. Inspired by previous works in the field, this is an adaptation of various elementary accounts that culminates in a proof accessible to even high school students. The proof walks readers through each of the arithmetical functions and their natures and gradually begins proving crucial theorems and lemmas, namely Selberg’s Identity, Mertens Theorem as well as several inequalities that describe some of these functions even further, until these can be used to describe the nature of prime number distribution itself. In its essence, the paper captures intriguing relationships between functions that count divisors, prime factors and their logarithms from their behaviours and limits in order to display the growth behaviour of prime numbers on the number line.
Introduction
In the vast scope of number theory, prime numbers have long been the focus of mathematical intrigue. Their unpredictable distribution along the number line has challenged mathematicians for centuries. As mathematicians dealt with the puzzling patterns of the prime numbers as we move along the number line, a rich tapestry of mathematical inquisition unfolded, setting the stage for the Prime Number Theorem. From the early findings of reverent mathematicians Euler and Legendre to the significant contributions of Riemann and Dirichlet in making their own functions towards proving this theorem, the narrative unfolds with focus on the collaborative efforts that brought the PNT to fruition. The Prime Number Theorem states that the number of prime numbers not exceeding is asymptotic to , as
Several articles and papers, notably Newman’s Short Proof1, seek to demonstrate this theorem through the use of the Riemann’s infinite series, , and finding its real roots for Re. Its rigorous use of complex analysis and logarithmic integrals demands a strong background. While this adaptation of findings from Norman Levinson2 and other elementary proofs appears long and technical, its thorough and exhaustive explanation of every function and notation used makes this an extremely comprehensive account of the proof.
Defining Functions
Prime Number Counter
The concept of prime numbers, such as 2, 3, 5, 7, 11, 13, dates back to ancient times, and Euclid provided a proof demonstrating their infinitude. The count of prime numbers less than or equal to a given value is denoted as . This function can be represented by:
(1)
where runs over all prime numbers less than .
Theorem 2.1: The Fundamental Theorem of Arithmetic states that every can be uniquely expressed in the form , where {} are distinct primes and are powers of primes.
Mobius Function
The Möbius function, denoted as , has an alternating behavior, taking the value of 1 if is a square-free positive integer with an even number of distinct prime factors, -1 if it has an odd number of distinct prime factors, and 0 if has a squared prime factor. In essence:
(2)
An important theorem relating to the Möbius function involves its divisor sum, the partial sum of the function over the divisors of . Divisor sums will be expressed as a sum with over throughout the paper.
Theorem 2.2:
(3)
Von Mangoldt Function
The von Mangoldt function, denoted as , is defined such that equals the natural logarithm of a prime power when is a power of a prime and is zero otherwise. This can be shown as:
(4)
Theorem 2.3: An important formula related to the partial sum of the function states that:
(5)
Due to the Fundamental Theorem of Arithmetic, that states that every can be expressed in the form , the partial sum of the von Mangoldt function can be expressed as (5).
Dirichlet Multiplication
The Dirichlet product is a mathematical operation defined for two arithmetic functions, returning another arithmetic function, typically denoted as , where is a positive integer3. It is computed by:
(6)
Theorem 2.4: Some important identities that will be used throughout this paper in relation to Dirichlet multiplication, or Dirichlet convolution are:
1. For an identity function, , n , where
it can be stated that:
(7)
2. Dirichlet multiplication is commutative as well as associative:
(8)
(9)
3. For every arithmetical function, , for which , there exists a function , the Dirichlet inverse of , where
(10)
Theorem 2.5: Mobius Inversion
If and are arithmetical functions, such that
Then
(11)
By (6), we have that for a unit function, for all ,
Taking the Dirichlet product of both sides with , we obtain
The convolution of and can be written as:
Now substituting this result back into the original ,
which proves the inversion formula.
Asymptotic Relationships and Elementary Results
Big-Oh Notation and Asymptotic Equality
Definition 3.1: For an arithmetical function, , if there exists a constant, , for which another function, can be expressed as
for , any constant, then the relationship between and is said to be:
(12)
This essentially states that the function grows faster than or as fast as . In this paper, we use big-oh to approximate one function as equal to another, to obtain smoother relationships and functions. We establish these from the idea that as we approach on the number line, these functions are virtually identical. This paper deals with functions that share these relations as asymptotics.
Definition 3.2: If
Then the relationship between and can be written as
(13)
Identities and Properties of Big-Oh
Some properties of the big-oh function that will be useful looking through this paper are listed below:
1. For arithmetical functions , where ,
(14)
2. Since the function gives us an idea of the growth rate of , any slower growing function, than can also take the form , or in summary, if
Then:
(15)
Although this is a worse approximation of the function, , we may need to use such relationships in order to obtain “nicer” functions to handle in our proof.
3. Adding or subtracting two big-oh functions just yields the dominating function as follows:
Then:
(16)
Order of complexity of functions
The below table represents the order of complexity of different functions, from lowest to highest, or fastest to slowest-growing functions. Knowledge of this sequence will help in replacing more complex functions with less complex ones for the sake of approximation.
for .
Euler’s Summation Formula
Euler summation is a technique used to approximate the partial sum of a function by connecting it to its integral. It is going to be a very useful tool in our proof, where we deal extensively with partial sums.
Lemma 3.1: (Abel’s Summation Formula) Let have a continuous derivative, , for . Let , be constants and let . Then
(17)
and since is a step function, as . Thus if ,
Since is constant for ,
Adding this to the previous equation and transposing the integral on the left side to the right proves (17).
Theorem 3.1: (Euler’s Summation Formula). Let be a function with Riemann integrable derivative defined on the interval . Then:
(18)
This result follows from Abel’s Summation Formula. If , , and so, (17) leads to:
Integrating the first integral on the right by parts proves (18).
We will use these findings to deduce information about the asymptotic tendencies of specific arithmetic functions.
Theorem 3.2:
(19)
where is a constant (Euler’s constant).
Applying Euler Summation to
If
(20)
then , where
since 1 is bounded, which proves (19).
Chebyshev’s Functions
In this subsection, we introduce 3 closely-related functions and analyse their asymptotic behaviours. These functions are similar to the prime-number counter and will be used extensively in this proof of the Prime Number Theorem.
Definition 3.3: The Chebyshev-Theta function, , is defined as:
(21)
where runs over all primes not exceeding .
Definition 3.4: The Chebyshev-Psi function, , is defined as:
(22)
Since unless is a prime power we can write the definition of as follows:
where runs over all positive integers.
The sum on is actually a finite sum. In fact, the sum on is empty if , that is, if , or if
Therefore, we have:
The above formula for can be restated using the function as:
(23)
Lemma 3.2: For we have
(24)
From (23) we find
From the definition of , we obtain the following inequality:
This implies that:
Now divide by to obtain the theorem.
Note: This inequality implies that
In other words, if one of or tends to a limit then so does the other, and the two limits are equal.
Definition 3.5: Rewriting (5) from Chapter 2, we can get:
We define a third Chebyshev function, , that is expressed as:
(25)
or, using (5),
(26)
If the double sum (26) is treated as a repeated sum, summing first on j,
This identity was discovered by Chebyshev and will be rewritten as
(27)
Chebyshev Functions and Asymptotics
Theorem 3.3: The above identity is a transform relationship that takes the form similar to in the below definition:
According to Möbius inversion (11), the above relationship can be inverted as:
Replacing with and with , we can obtain the inversion formula:
(28)
Applying Euler’s summation formula to :
(29)
Lemma 3.3: For large ,
(30)
Using (27),
because since is a nondecreasing function.
From (29), , for a constant . Applying the above with replaced by ,
so long as which implies . Recalling that , and adding the above for ,
Since is a constant this proves (29), showing that is bounded by a function that is equivalent to .
Lemma 3.4:
(31)
We rearrange the double sum , summing over first and then , we obtain:
This can be rewritten as:
(32)
Moreover, we can infer
(33)
from (30). Now using (29),
The terms can be contained in , which proves the lemma.
Theorem 3.3: The following 4 statements are equivalent:
(34)
(35)
(36)
(37)
Note: Throughout the rest of this paper, the function is denoted by and will be known as remainder function.
We prove the four statements of the theorem later on in chapter 5.
Definition 3.6:
(38)
Lemma 3.5:
(39)
so that (36) is equivalent to the prime number theorem, since grows slower than . The remainder of this paper seeks to obtain a relationship between and so that we can tie back to this statement.
Using the and function definitions,
(40)
where the sums for above are zero when and when .
Hence, we approximate using the inequality:
(41)
From the definition of , this gives
Since for large x, the above statement yields
By (40),
Since the partial sum is greater than its lower limit,
Since
and
this gives:
which can then be rewritten as
(42)
Using Lemma 3.3, we know that , for a constant, . Replacing this inequality in (42),
The second and third terms on the RHS can be contained in . This proves the Lemma.
Selberg’s Identity
Theorem 4.1: (Selberg’s Asymptotic Formula). For all positive we have the following:
(43)
This entire section will be a proof of this asymptotic formula, extracted from one of Selberg’s lecture series4
However, in this application, it is modified to use big-oh approximation instead of the more complex little-oh function.
Lemma 4.2: Let be a function defined on and let be given by:
For this relationship, the following statement is true:
(44)
We first rewrite as a sum using (7). We have:
(45)
Now, we can use the Möbius inversion formula of (5) in Chapter 2 to say that:
Using both forms of (5) in (43):
(46)
Adding (45) and (46) we have:
This is equal to:
Plugging this back into (45):
In the last sum, we write and obtain:
And by our initial definition of , we can write the RHS of the above as:
as needed.
We now have to prove Theorem 4.1 using this Lemma.
Taking and , the respective would be:
which, according to (29) and, gives
(47)
and the corresponding can be expressed as:
(48)
Using Theorem 3.2:
by containing all and constant terms in .
Now that we have and in similar terms, we use a weak estimate of their difference
(49)
Using Lemma 4.2, we can break this down to:
(50)
and for :
Subtracting RHS of (48) and (49),
Using our estimate for the difference as our argument, we have:
Using Lemma 4.2 for , we have:
(51)
Rearranging terms and using Lemma 3.4 we find that
proving Selberg’s identity.
With , Abel’s Summation and Lemma 3.3 yield:
(52)
Also
(53)
Thus if
(54)
then (50) and (51) in (41) yield as an equal to Selberg’s formula. By (28), . Combining the above two inequalities,
(55)
and
Proving the Prime Number Theorem
We use Selberg’s Identity to prove the prime number theorem. We will mainly be working with the function as it is more convenient to obtain inequalities that demonstrate its decreasing nature.
Lemma 5.1: Selberg’s identity can be restated as:
Definition 5.1: Unfortunately, due to the nature of tends to fluctuate, and does is the quotient, . For convenience, we will use the smoother function:
(56)
Lemma 5.2: There exists a constant such that
(57)
and
(58)
For this condition, is considered Lipschitz.
Moreover a consequence of (42) is
(59)
From (29), for large . Hence
suggesting that is bounded for finite . Hence, there must exist a constant such that
From the definition of , except at where is discontinuous. By the above statement then:
Hence, first for the case where the interval contains no , (58) is true.
However since is continuous, it allows (58) to be extended for all and . The result (57) follows from (58) with .
Since , (58) yields
(60)
To prove (59), divide the LHS of Lemma 5.1 by and integrate to get
If we use integration by parts, we get
which proves (59).
Lemma 5.3: With (54) and a constant
(61)
Replace in (59) by , multiply by and sum to get:
Setting in the second sum and summing on , and setting in the first sum and replacing this latter by ,
where (30) is used to get the RHS. The first sum above is now replaced by use of (59) to give
Replacing all terms in the sum on the right by their absolute values from the Lipschitz inequalities gives (61).4
Lemma 5.4: There exists a constant such that:
(62)
We define a second remainder function , as given by:
(63)
Now, recall the function defined by:
We see that:
Substituting this in (62), we have:
(64)
We can ignore all terms with as for those terms. Through examination, we can re-index the RHS as:
Applying the inequality , we have:
Because is Lipschitz, we have a constant , such that:
(65)
Moreover, we can bound by some linear function of as by (55). So, making large enough and substituting (64) we have:
Factoring the and simplifying on RHS, we have:
We now obtain:
Combining the above with our expression for proves the lemma.
Lemma 5.5: There is a constant such that:
(66)
Given that is increasing:
On the RHS, use the inequality
(67)
for
Due to being Lipschitz, we can bound this by:
Simplifying and bounding the final integral on the LHS we have:
Because we have:
Now, returning to our original expression, we see:
(68)
Using and applying (68) to Lemma 5.4, we have the desired result.
The inequality of (66) assumes a simpler form with a change of variables. Letting and letting , we can rewrite (66) as:
(69)
Definition 5.2: We now introduce another function, , for which:
(70)
From this, we obtain
Getting the statement:
(71)
The transformations above were done to yield a function which is essentially dominated by a weighted average of . We now prove two lemmas about .
Lemma 5.6:
returns the value for which is bounded by as approaches
We also define another constant, (NOT Euler’s constant), as
By the statement we know that:
By definition of , it follows that:
From this, it is clear that
Lemma 5.7: Let
Then
(72)
which can be verified by reversing the order of integration. Note that:
Thus we have the integral on the RHS of the form:
And this is bounded by , the Lipschitz constant from lemma 5.2, hence:
Thus, similar to the approach taken by Tom M. Apostol5, if we fix , and take any , we have:
as we separate the integrals at . Given the constant , if we take a sufficiently large , we have by the definition of that:
for all . Thus, substituting into our inequality for we have:
Thus, for large , we have by (71) that:
For , we have , and the inequality holds as it is true for a constant .
Now, we attempt to show
Lemma 5.8: For ,
(73)
Since
Hence, . This leads to (72)
Lemma 5.9: If for then there exists a constant such that:
(74)
We see through an application of Abel Summation formula (Lemma 3.1) that:
Or, using the remainder function:
(75)
Taking , we do another double integral decomposition:
Using and (74) , it follows that . By changing variables with we have:
Writing the same for and and subtracting the integrals with these as endpoints, we have that the result integral is bounded, and thus there exists an such that:
Lemma 5.10: A function must in fact have .
Take . Then from the definition of there exists an such that
If for all large it follows from (74) that and hence that . Suppose then that has arbitrarily large zeros. Let and be successive zeros of for .
Case 1: . By (74), since
(Hence the average of on is less than .)
Case 2: . In this case it follows from (72) that if the graph of rises as rapidly as possible going right from and left from , it cannot lie above a triangle with altitude and hence
Case 3: . Reasoning as in Case 2 for a distance from each endpoint and using (76) otherwise,
Since and since , the above is valid in Cases 1 and 2 also. If is the first zero of to the right of and the largest zero to the left of , then (73) and the above imply that
Dividing by and noting that ,
Letting , and since ,
(76)
Since this holds for all it must hold for . Hence , and since , this implies . Since , this implies that as . Hence if given , if is large enough,
Thus , or
Since and is nondecreasing,
Hence . Similarly for large enough leads to .
End of the Proof: Since is an arbitrary constant, by integrating and eliminating , we can prove that (37) is true. As established in Theorem 3.3, (37) is equivalent to (36). Now that the asymptotic relationship of has been obtained, we can directly state that (34) is true by (39). By definition, (34) is indeed the prime number theorem and thus, it can be concluded that the prime number theorem does hold true.
Acknowledgements
I express my sincere gratitude to everyone who has contributed to the completion of this project. Special thanks to my mentor, Lukas Kofler, for his guidance, support, and invaluable insights throughout the research process. I am also thankful to Lumiere Education, for providing this platform to me, and my program manager, Srishti Bansal, for organising and managing this research experience.
References
- D. Zaiger, ”Newman’s Short Proof of the Prime Number Theorem”, The American Mathematical Monthly, vol. 104, no. 8, pp. 705-708, 1997. [↩]
- N. Levinson, “A Motivated Account of an Elementary Proof of the Prime Number Theorem,” The American Mathematical Monthly, vol. 76, no. 3, pp. 225–245, 1969. [↩]
- J. Binder, “Analytic Number Theory and Dirichlet’s Theorem” REUPapers, 2008. [↩]
- A. Selberg, Selberg’s Lecture Series on the Analytic Theory of the Prime Numbers, Hong Kong, 1998. [↩] [↩]
- T. M. Apostol, Introduction to Analytic Number Theory. Undergraduate Texts in Mathematics, Springer-Verlag, 1976. [↩]
This is a very thoroughly written research paper. I learnt a lot by reading this. Well written Nishant!