Goldston & Yildirim's Result

Navigate up

Notes
Notes:  Goldston & Yildirim's Result
April 4th, 2003
 

The world of prime number studies was shaken up considerably in March 2003 by a major new result, obtained by Daniel A. Goldston of San Jose State University (in California) and Cem Y. Yildirim of Bogaziçi University, Istanbul, Turkey.  Several readers of my National Review Online columns have asked me to explain the result, so here goes.


I imagine any educated person knows what a prime number is.  It is a positive whole number with no non-trivial factors.  The number 28 has six factors:  1, 2, 4, 7, 14, and 28.  Since 1 is a factor of every whole number, and every whole number is a factor of itself, the first and last of these factors are not interesting.  They are, in a word mathematicians are very fond of, “trivial” factors.  The number 28 has six factors, four of them non-trivial.  The number 29, on the other hand, has only two factors, both trivial.  It has no non-trivial factors.  It is therefore a prime number, while 28 is the opposite thing:  a “composite” number.

For as long as numbers have been written about in a systematically abstract way at all — which is to say, for about 26 centuries — mathematicians have found great fascination in the primes.  There is now a vast body of mathematical literature on the topic.  A good impression of the sheer scope of inquiries into primes can be got from Paulo Ribenboim’s 1996 production, The New Book of Prime Number Records, which has over 500 pages.  There are also countless web sites devoted to particular topics in prime number theory.

Here is a list of all the primes up to one thousand.

     2     3     5     7    11    13    17    19    23    29    31    37    41    43

    47    53    59    61    67    71    73    79    83    89    97   101   103   107

   109   113   127   131   137   139   149   151   157   163   167   173   179   181

   191   193   197   199   211   223   227   229   233   239   241   251   257   263         

   269   271   277   281   283   293   307   311   313   317   331   337   347   349         

   353   359   367   373   379   383   389   397   401   409   419   421   431   433

   439   443   449   457   461   463   467   479   487   491   499   503   509   521

   523   541   547   557   563   569   571   577   587   593   599   601   607   613

   617   619   631   641   643   647   653   659   661   673   677   683   691   701

   709   719   727   733   739   743   751   757   761   769   773   787   797   809

   811   821   823   827   829   839   853   857   859   863   877   881   883   887

   907   911   919   929   937   941   947   953   967   971   977   983   991   997

As you can see, there are 168 of them.  If you want to be argumentative, you can insist on including 1 at the beginning of the list, since it satisfies the technical definition of a prime.  Including 1 in the primes, however, is a major nuisance, and modern mathematicians just don’t, by common agreement.  (The last mathematician of any importance who did seems to have been Henri Lebesgue, in 1899.)  Even including 2 is a nuisance, actually.  Countless theorems begin with:  “Let p be any odd prime...”  However, 2 pays its way on balance, but 1 doesn’t, so we just leave it out.

There are a couple of things to notice about these numbers, from which most of the interesting issues about primes arise.  The first is their irregularity; the second is their tendency to thin out.

Irregularity of the primes.  There is no apparent pattern to the primes.  Sometimes they are clumped together; sometimes they are spread far apart.  The four consecutive primes from 821 to 829 are separated, first to last, by only 8.  (Because 829 minus 821 is 8.)  That is an average spacing — remember that there are three spaces separating four numbers — of 2.67.  The four consecutive primes from 773 to 809, by contrast, are separated by 36, an average spacing of 12. 

These patterns of clumping and spreading continue for as far as the eye can see, at all scales.  The ten consecutive primes from 90,279,457 to 90,279,493 have an average spacing of 4;  the ten consecutive primes from 98,674,973 to 98,675,461 have an average spacing of more than 54.

The smallest possible difference between two consecutive primes (once you get past the 2, 3 pair) is 2, as with 17 and 19, or 881 and 883, or 100,710,977 and 100,710,979.  It is believed, though no-one has been able to prove it, that this minimal “prime pair” situation occurs infinitely often.  This is the Prime Pairs ConjectureSome tremendously large cases have been found:  the current record is a prime pair with 51,090 digits each. 

There is no largest possible distance between two consecutive primes.  That is to say, no matter how big a number you name — a million, a trillion — theory guarantees that there are two consecutive primes further apart than that.

The thinning-out of the primes.  Between 1 and 100 there are 25 primes.  Between 401 and 500 there are 17.  Between 901 and 1,000 there are only 14.  The number of primes in any block of 100 numbers seems to decline as you go higher up in the number system.  This is indeed the case.  In the last hundred-block below a million — that is, the block from 999,901 to 1,000,000 — there are only eight primes.  In the last hundred-block below a trillion there are only four.

The question naturally arises:  Do the primes eventually thin out to nothing?  If I count up to a sufficiently high level, up into the trillions of trillions of trillions and beyond, will I eventually rach a point after which there are no more primes at all?  The answer to that one was found by Euclid around 300 BCE.  No, the primes never thin out to nothing.  There are always more.

Having had that settled so early in the history of math, the next question to occur was:  How, exactly, do the primes thin out?  Is there a rule, a formula, to tell us how many primes we can expect to find at various levels? 

The first person to arrive at any conclusions about this was Carl Friedrich Gauss, one of the three or four greatest mathematicians that ever lived.  When still in his teens, around 1792 or 1793, Gauss noticed that the frequency of primes was inversely proportional to the logarithm.  To put it mathematically:  In the neighborhood of a large number N, the frequency of primes is about one in log N.  (This is the “natural log,” commonly shown on pocket calculators with the symbol “ln.”)  The log of a trillion, for example, is about 27.631021; so in the neighborhood of a trillion, roughly one number in 28 is prime.  You would therefore expect about 3.6 primes per hundred in this neck of the woods, which agrees nicely with the four that actually occur in that last hundred-block below a trillion.

This rule Gauss discovered was so striking to mathematicians it became known as “the Prime Number Theorem,” or PNT for short.  Gauss could not prove the PNT, and proving it became one of the great mathematical challenges of the 19th century.  The proof was finally accomplished at the very end of that century, in 1896, by two mathematicians working independently:  Jacques Hadamard of France and Charles de la Vallée Poussin of Belgium.  Their proofs used techniques first set out in one of the greatest of all mathematical papers, Bernhard Riemann’s epochal “On the Number of Prime Numbers Less Than a Given Quantity,” presented to the Berlin Academy in August 1859.

Average distance between two primes.  The PNT can be stated in a number of different ways, all equiponderant (that is, having the same weight).  If one of the following statements is true, all the others can be deduced from it.

  • In the neighborhood of a large number N, the frequency of primes is about one in log N.

  • The number of primes up to some large number N is about N / log N.

  • The N-th prime is about N times log N.

  • The average distance between two consecutive primes is log N.

(The word "about" in these statements can be given a precise mathematical meaning, by the way.  It means that the proportional difference between the actual reality and the result I have stated dwindles to zero as N goes to infinity.  To put it another way, the ratio of theory to reality gets indefinitely close to 1 as N goes to infinity.  When N is a trillion, for example, N times log N is 27,631,021,115,929.  The trillionth prime is actually 30,019,171,804,121.  This is an 8 per cent error in my third bullet point above.  For a thousand, a million, and a billion, the corresponding percentage errors are 13, 10, and 9.  "About" means that this percentage difference gets indefinitely close to zero as N gets indefinitely large.) 

Now concentrate on that last bullet point.  Because, as I said, the "clumping" of primes continues for ever, and we strongly suspect that there are "prime pairs" — primes separated by just 2, like 17 and 19 — for ever (the Prime Pairs Conjecture), the following question arises.  Let's call the K-th prime number pK, and the difference between the K-th prime and the next prime (that is, the (K+1)-th prime), dK.  So dK = pK+1  — pK.

  • For some large number N, consider all the primes that are bigger than N.  What is the smallest value of dK for all pairs of consecutive primes bigger than N?  What can we say about that smallest value as N gets indefinitely large?

Well, if the Prime Pairs Conjecture is true — if, that is, there are infinitely many prime pairs like 17, 19 — then that smallest value will be 2.  This, as I said, is not known.  In fact, we don't know what happens to that smallest value as N goes to infinity.  It might, for all we currently know, go to infinity itself!

In mathematical jargon, that smallest value — please remember, it's not the overall smallest value, it's the smallest value for consecutive pairs of primes bigger than N — is called "lim inf dK."  The word "lim" stands for "limit," that is, the limit as N goes to infinity.  The word "inf" stands for "infimum," a mathematical term of art defined to mean "the biggest lower bound."

If you understood that question, try this one.

  • What is lim inf (dK / log pK) ?

Translation:

  • Choose some humongous number N.  For all the primes pK that are bigger than N, calculate the difference between the prime and the next prime,  dK = pK+1  — pK.  Divide that by log pK, recalling from one of my earlier bullet points that this is the average distance between two consecutive primes in this range.  Make a note of the minimum result from this calculation, for all primes pK bigger than N.  Now repeat this whole process for yet bigger N, then again for yet bigger, and so on.  What happens to these minimums, as N goes to infinity?

If, for example, the Prime Pairs Conjecture is true, you are always going to have a minimum dK equal to 2.  Since pK, and therefore log pK, increase without limit, 2 / log pK dwindles down indefinitely close to zero, and lim inf (dK / log pK) =0.  

Note, however, that the same thing would be true if the minimum dK was equal to 4, or 6, or a trillion.  So the Prime Pairs Conjecture implies lim inf (dK / log pK) =0, but not vice versa.  The Prime Pairs Conjecture is, as mathematicians say, the stronger result. 

[In fact, it is even possible to imagine the minimum dK increasing steadily for ever, yet still  lim inf (dK / log pK) = 0.  This would be the case, for example, if the minimum dK increased at a rate log (log pK), since the function (log (log x)) / (log x) goes to zero as x goes to infinity.]   

The eccentric Hungarian mathematician Paul Erdös proved in 1940 that lim inf (dK / log pK) < 1.  This started off a competition among mathematicians to try to find smaller values.  In 1965 three Chinese mathematicians proved that lim inf (dK / log pK) <= 29/32, i.e. 0.90625.  A year later, Harold Davenport and Enrico Bombieri proved lim inf (dK / log pK) <= 0.467.  Helmut Maier got that down to 0.248 in 1988.  In other words, at this point we knew that there are infinitely many pairs of consecutive primes that are closer together than one-quarter the average spacing in their neighborhood.  There we got stuck for 15 years.

Goldston and Yildirim's result.  Now you can appreciate the amazing result published by Goldston and Yildirim in March.  It would have been sufficiently wonderful if they had whittled the limit for inf (dK / log pK) down to 0.1, or 0.01.  They did much better than that:  they proved that lim inf (dK / log pK) = 0!

So Goldston and Yildirim proved the following thing:  For any number you name, no matter how tiny — a millionth, a billionth, a trillionth — there are infinitely many pairs of consecutive primes that are closer together than that fraction of the average in their neighborhood.

Another way to say this is:  The shortest gap between two consecutive primes, as compared to the average gap in their neighborhood, shrinks closer and closer to zero as the primes get bigger.

Yes, I agree, it's not a snappy result.  It is, however, terrifically impressive to mathematicians, the more so for being entirely unexpected.

Oops.  Soon after I posted this, Granville and Soundarajaran found an error in Goldston & Yildirim's proof.  For fuller details, see here.  Well, that's why we have peer review!

Top of this page