Slashdot Mirror


New Pattern Found In Prime Numbers

stephen.schaubach writes "Spanish mathematicians have discovered a new pattern in primes that surprisingly has gone unnoticed until now. 'They found that the distribution of the leading digit in the prime number sequence can be described by a generalization of Benford's law. ... Besides providing insight into the nature of primes, the finding could also have applications in areas such as fraud detection and stock market analysis. ... Benford's law (BL), named after physicist Frank Benford in 1938, describes the distribution of the leading digits of the numbers in a wide variety of data sets and mathematical sequences. Somewhat unexpectedly, the leading digits aren't randomly or uniformly distributed, but instead their distribution is logarithmic. That is, 1 as a first digit appears about 30% of the time, and the following digits appear with lower and lower frequency, with 9 appearing the least often.'"
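
For reference, the classic base-10 form of the law quoted in the summary is P(d) = log10(1 + 1/d). A minimal sketch of those probabilities follows; the function name is made up for illustration, and this is the plain law, not the generalization the article describes.

```python
import math

def benford_probabilities():
    """Return P(d) = log10(1 + 1/d) for leading digits d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

if __name__ == "__main__":
    # Prints one line per digit; 1 comes out near 30%, 9 below 5%.
    for digit, p in benford_probabilities().items():
        print(f"{digit}: {p:.3f}")
```

The nine values sum to 1 because the logarithms telescope to log10(10).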

11 of 509 comments

  1. Stock market analysis? by MSTCrow5429 · · Score: 4, Interesting

    I am admittedly not a mathematician, but I do have a good understanding of economics and finance, and I am not seeing how a pattern found in prime numbers could have any application to stock market analysis. Where is the interaction between prime numbers and the praxeology of buying and selling securities? Even if you're only focusing on automated buying and selling, those algorithms were still programmed by humans with their own subjective approaches and underlying premises.

    --
    Slashdot: Playing Favorites Since 1997
    1. Re:Stock market analysis? by arth1 · · Score: 4, Interesting

      I've always wondered how I could figure out when someone was trying to pass off a list of fraudulent primes. Glad to see that this problem is finally solved!

      You're jesting, but I imagine that many fields of encryption would benefit from this, like dual key encryption, where the security lies in the ability to trust that the product really is of two primes, and that factoring this would be extremely time consuming.

      Sets with a backdoor inserted may indeed have a different signature, and to be able to quickly see that one set differs would be invaluable. It wouldn't prove anything, but if, say, keys received from a certain company's key generator stood out like a sore thumb in a Benford distribution check, you would have reason to suspect foul play, incompetence or both.

  2. Re:Other bases? by Anonymous Coward · · Score: 4, Interesting

    Benford's Law is actually independent of the number base used. It wouldn't be much of a mathematical property if it wasn't.

    Err, what? The study of representations of numbers is a valid field of mathematics itself.
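
    On the base question: the usual statement of the law in base b is P(d) = log_b(1 + 1/d) for d = 1..b-1. A rough way to see it in action is to count leading digits of Fibonacci numbers, a standard example of a Benford-following sequence, in more than one base. This is only a sketch; the function names, the choice of 2000 terms, and the bases tried are all arbitrary.

```python
import math

def leading_digit(n, base):
    """Leading digit of the positive integer n when written in `base`."""
    while n >= base:
        n //= base
    return n

def fibonacci_leading_digit_freqs(count, base):
    """Observed leading-digit frequencies of the first `count` Fibonacci numbers."""
    tally = [0] * base  # index 0 stays unused: a leading digit is never 0
    a, b = 1, 1
    for _ in range(count):
        tally[leading_digit(a, base)] += 1
        a, b = b, a + b
    return [t / count for t in tally]

def benford_in_base(d, base):
    """Benford probability of leading digit d in the given base."""
    return math.log(1 + 1 / d, base)

def largest_gap(count, base):
    """Largest absolute difference between observed and predicted frequency."""
    observed = fibonacci_leading_digit_freqs(count, base)
    return max(abs(observed[d] - benford_in_base(d, base)) for d in range(1, base))

if __name__ == "__main__":
    for base in (10, 16):
        print(f"base {base}: largest gap = {largest_gap(2000, base):.4f}")
```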

  3. Re:Other bases? by stonewallred · · Score: 4, Interesting

    Could this have cryptographic uses? IANAMOG, but I know primes play a role in many crypto schemes.

  4. If you're dealing with phone numbers by Ralph+Spoilsport · · Score: 5, Interesting
    It has less to do with math and more to do with physics: as in how to use an old-school phone. Phone numbers, until comparatively recently, would "prefer" lower numbers because they are EASIER TO DIAL. If a company had the phone number (909)999-9009 you would HATE dialing that thing. It would take about half a minute just to dial the damn number.

    Ssssshhhhhhik!
    diggadiggadiggadiggadiggadiggadiggadiggadigga!

    Total pain in the finger.

    1 as a first number was reserved for "other stuff" like international calls, so the lowest possible area codes (first numbers) went to places like New York City (212 - very quick to dial) or LA (213) because millions of people would be dialing that number, so it made for an overall faster dialing experience for (on average) more people.

    This is compared to the relatively few people who lived in more obscure parts of the country, like Saginaw MI (989) or Bryan TX (979).

    So, you have millions of phones in 212, thousands in 979. The result: saved effort in dialing.
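
    The dialing-effort argument can be put in numbers. Assuming the North American pulse scheme, where digit d sends d pulses and 0 sends 10, a small sketch (the function name is made up, and pulse count is only a proxy for dialing time) gives:

```python
def dial_pulses(number):
    """Total rotary pulses for a phone number: digit d costs d pulses, 0 costs 10.

    Non-digit characters such as parentheses and dashes are ignored.
    """
    return sum(10 if ch == "0" else int(ch) for ch in number if ch.isdigit())

if __name__ == "__main__":
    for number in ("212", "213", "989", "979", "(909)999-9009"):
        print(number, dial_pulses(number))
```

    By this count 212 costs 5 pulses and 989 costs 26, while the full (909)999-9009 costs 93.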

    Also, to this end there was a preference for exchanges to have lower numbers as well to save on dialing effort, and phone numbers with lower (but NON-ZERO) values were sought after. You'd see advertisements like "Call RotoRooter - 213 464 1111 !" or "Call us NOW for a free analysis! 201 738 1122 !" and so on.

    So, lower numbers in phone numbers have been a product of primitive dialing technology. Now with touchtone - all that is out the window - but the historic trend is still there and quite powerful - people will pay good money for a 212 area code for the distinction of being in the "real" New York Area code...

    RS

    --
    Shoes for Industry. Shoes for the Dead.
  5. Independent Verification by eldavojohn · · Score: 5, Interesting
    Here's what I got on my own counts using the first million primes:

    1: 415441
    2: 77025
    3: 75290
    4: 74114
    5: 72951
    6: 72257
    7: 71564
    8: 71038
    9: 70320

    Which puts the probabilities at:

    1: 0.415441
    2: 0.077025
    3: 0.07529
    4: 0.074114
    5: 0.072951
    6: 0.072257
    7: 0.071564
    8: 0.071038
    9: 0.07032

    My computer is currently crunching the first fifty million primes and I will post those as a reply to this post later today when it is done.

    These ratios on numbers 2-9 seem far too close in range for this to be a true log scale. Hopefully with more data it will be more logarithmic.

    --
    My work here is dung.
  6. Enron by Anna+Merikin · · Score: 4, Interesting

    was busted by auditors who found the books were "cooked" by applying the law of first numbers described in the /. blurb and TFA. The independent auditors found the first figures were randomly distributed instead of following Benford's law with the number 1 the most plentiful and nine the least -- therefore, the entries were fraudulent.

    Benford's law knocked me out at the time; I thought of how many bogus figures I had entered in my expense accounts over the years....
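
    A first-digit check of the kind described here is often done as a chi-square comparison against Benford's expected counts. The sketch below is a generic version of that idea, not the auditors' actual procedure; the function names and the two toy data sets are made up for illustration. The threshold 15.507 is the standard chi-square critical value for 8 degrees of freedom at the 5% level.

```python
import math

CHI2_CRITICAL_DF8_P05 = 15.507  # chi-square critical value, 8 d.o.f., alpha = 0.05

def first_digit(value):
    """First significant digit of a non-zero int or float."""
    return int(str(abs(value)).lstrip("0.")[0])

def benford_chi_square(values):
    """Pearson chi-square statistic of leading digits against Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]  # zero has no leading digit
    total = len(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected = total * math.log10(1 + 1 / d)
        observed = digits.count(d)
        statistic += (observed - expected) ** 2 / expected
    return statistic

if __name__ == "__main__":
    benford_like = [2 ** n for n in range(1000)]  # powers of 2 follow the law closely
    uniform_like = list(range(100, 1000))         # every leading digit equally common
    for name, data in (("powers of 2", benford_like), ("100..999", uniform_like)):
        stat = benford_chi_square(data)
        verdict = "suspicious" if stat > CHI2_CRITICAL_DF8_P05 else "consistent"
        print(f"{name}: chi2 = {stat:.1f} ({verdict})")
```

    A large statistic only says the digits do not look Benford-distributed; plenty of honest data sets fail the test too, so it is a reason to look closer rather than proof of anything.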

  7. Re:Other bases? by Anonymous Coward · · Score: 4, Interesting

    But how many would contain all 1s? Answer that, and provide a proof for your answer, and you'll make math history.

  8. Some More Information by eldavojohn · · Score: 5, Interesting

    So I read the comments and see that I need to do this in ranges of 1 to 100, 1 to 1000, etc. Which is fine, I've added another R method and would post the code here if it didn't yell at me for junk characters. So here are your Benford lists:

    All Primes 1-100
    Counted Occurrences:
    4, 3, 3, 3, 3, 2, 4, 2, 1
    Frequencies:
    0.160, 0.120, 0.120, 0.120, 0.120, 0.080, 0.160, 0.080, 0.040

    All Primes 1-1,000
    Counted Occurrences:
    25, 19, 19, 20, 17, 18, 18, 17, 15
    Frequencies:
    0.149, 0.113, 0.113, 0.119, 0.101, 0.107, 0.107, 0.101, 0.089

    All Primes 1-10,000
    Counted Occurrences:
    160, 146, 139, 139, 131, 135, 125, 127, 127
    Frequencies:
    0.130, 0.119, 0.113, 0.113, 0.107, 0.110, 0.102, 0.103, 0.103

    All Primes 1-100,000
    Counted Occurrences:
    1193, 1129, 1097, 1069, 1055, 1013, 1027, 1003, 1006
    Frequencies:
    0.124, 0.118, 0.114, 0.111, 0.110, 0.106, 0.107, 0.105, 0.105

    All Primes 1-1,000,000
    Counted Occurrences:
    9585, 9142, 8960, 8747, 8615, 8458, 8435, 8326, 8230
    Frequencies:
    0.122, 0.116, 0.114, 0.111, 0.110, 0.108, 0.107, 0.106, 0.105

    All Primes 1-10,000,000
    Counted Occurrences:
    80020, 77025, 75290, 74114, 72951, 72257, 71564, 71038, 70320
    Frequencies:
    0.120, 0.116, 0.113, 0.112, 0.110, 0.109, 0.108, 0.107, 0.106
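
    The poster's R code is not shown, so as a stand-in here is a minimal Python sketch that reproduces counts of this kind with a sieve of Eratosthenes. The function names are made up, and the largest limit in the demo loop is kept at one million so it runs quickly.

```python
def primes_below(limit):
    """Sieve of Eratosthenes: all primes p with p < limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * limit
    is_prime[0] = is_prime[1] = 0
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Cross out n*n, n*n + n, ... in one slice assignment.
            is_prime[n * n :: n] = bytearray(len(range(n * n, limit, n)))
    return [n for n in range(limit) if is_prime[n]]

def leading_digit_counts(numbers):
    """Counts of leading decimal digits 1..9, returned as a 9-element list."""
    counts = [0] * 9
    for n in numbers:
        counts[int(str(n)[0]) - 1] += 1
    return counts

if __name__ == "__main__":
    for limit in (100, 1_000, 10_000, 100_000, 1_000_000):
        counts = leading_digit_counts(primes_below(limit))
        total = sum(counts)
        print(f"All primes below {limit:,}")
        print("  counts:     ", ", ".join(str(c) for c in counts))
        print("  frequencies:", ", ".join(f"{c / total:.3f}" for c in counts))
```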

    This is the raw data so to turn that into something visual, I dumped it into a Google spreadsheet and made it public (note the scale on the y axis). Enjoy!

    It seems that the curve is flattening out the more data I collect, but the logarithmic curve may be valid. I have the data for 100,000,000 and will add that to the spreadsheet once it completes.

    --
    My work here is dung.
  9. I Found a Fit! by eldavojohn · · Score: 5, Interesting
    The results for all primes between one and one hundred million:

    Counted Occurrences:
    686048, 664277, 651085, 641594, 633932, 628206, 622882, 618610, 614821
    Frequencies:
    0.119, 0.115, 0.113, 0.111, 0.110, 0.109, 0.108, 0.107, 0.107

    So there's some more data for you. I added that to this spreadsheet.

    So I hope that satisfies everyone who replied to my thread first of all. I hope 5,761,455 primes between one and one hundred million satisfies you.

    I used a very simple nonlinear least squares (NLS) model to solve for a single constant on a log of these values. I think I have a fit. Using Benford's model and the nls function in R, I found:

    f(x) = 0.020814 * log(161.147689 * ((x+1)/x))

    To fit quite nicely, here's the summary:

    Formula: y ~ Const1 * log(Const2 * ((x + 1)/x))

    Parameters:
              Estimate  Std. Error  t value   Pr(>|t|)
    Const1    0.020814    0.001940  10.7292  1.343e-05 ***
    Const2  161.147689   80.222081   2.0088    0.08452 .
    ---

    Residual standard error: 0.0010413 on 7 degrees of freedom

    Number of iterations to convergence: 8
    Achieved convergence tolerance: 1.8104e-07

    Here is the list of frequencies next to what my model produced:

    Digit  Benford Prime Rates  NLS Model Results
    1      0.11907548           0.1202106
    2      0.11529674           0.11422279
    3      0.11300704           0.11177125
    4      0.11135972           0.11042794
    5      0.11002984           0.10957828
    6      0.10903600           0.10899193
    7      0.10811193           0.10856276
    8      0.10737045           0.10823497
    9      0.10671280           0.10797641

    I would wager that they are correct. Neat discovery!
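
    The fit can be cross-checked without R. Since Const1 * log(Const2 * (x+1)/x) equals Const1 * log(Const2) + Const1 * log((x+1)/x), the model is a straight line in u = log((x+1)/x), so ordinary least squares on u should land on the same constants. The sketch below uses that reparameterization, which is a swapped-in technique and not the poster's nls call; the function names are made up, and log is taken as the natural logarithm, which is what makes the listed model values come out.

```python
import math

# Leading-digit counts for primes below 100,000,000, copied from the post.
COUNTS = [686048, 664277, 651085, 641594, 633932, 628206, 622882, 618610, 614821]

def fit_log_model(counts):
    """Least-squares fit of y = c1 * ln(c2 * (x + 1) / x) for digits x = 1..9.

    Uses the linear form y = intercept + c1 * u with u = ln((x + 1) / x)
    and intercept = c1 * ln(c2).
    """
    total = sum(counts)
    ys = [c / total for c in counts]
    us = [math.log((x + 1) / x) for x in range(1, 10)]
    u_mean = sum(us) / len(us)
    y_mean = sum(ys) / len(ys)
    sxy = sum((u - u_mean) * (y - y_mean) for u, y in zip(us, ys))
    sxx = sum((u - u_mean) ** 2 for u in us)
    c1 = sxy / sxx
    c2 = math.exp((y_mean - c1 * u_mean) / c1)
    return c1, c2

def model(x, c1, c2):
    """Value of the fitted curve at leading digit x."""
    return c1 * math.log(c2 * (x + 1) / x)

if __name__ == "__main__":
    c1, c2 = fit_log_model(COUNTS)
    print(f"Const1 = {c1:.6f}, Const2 = {c2:.6f}")
    total = sum(COUNTS)
    for x, count in enumerate(COUNTS, start=1):
        print(f"{x}: observed {count / total:.8f}  model {model(x, c1, c2):.8f}")
```

    Const2 is very sensitive to small changes in the data, which is consistent with the large standard error R reports for it.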

    --
    My work here is dung.
  10. Re:Other bases? by Blakey+Rat · · Score: 4, Interesting

    The whole of mathematics is really just a language of form and structure, a system to systematize and describe structure and forms (relationships are a type of form).

    So... mathematics is the vaguest thing possible?