Gaussian Distribution being questioned

Re:Sceptic in Slashdotia by puppet10 · 1999-09-02 05:54 · Score: 2

I admit that popular science journalism is lacking (from the articles given here it is really impossible to form any informed opinion about the scientific validity or merit of the work reported on, and no references to that work are given). I quickly checked the first name in the article (Donald Turcotte), and indeed he has been published in a number of peer-reviewed journals, including Science, on self-organized critical behavior. What I don't understand is how you can judge whether the scientists involved in this research are doing good work or quackery based on one popular press article, without trying to examine the facts before jumping to conclusions. I'm not saying that what is described in the articles is right or groundbreaking; what I am saying is that these articles alone don't give nearly enough information to form a reasonable opinion (although I don't have enough time to go and do a full literature review to form an informed opinion either).

Don't blame the scientists for poor reporting.

--
-------- This space intentionally left blank --------

Extremely Interesting, looking at God by Wah · 1999-09-02 06:44 · Score: 2

Not to be too crazy, but if this holds up, and others find this curve, it is exceptional. The basic curves of life, and chaos. This is the stuff that explains why a seashell and a universe have the same design. Chaos theory and quantum mechanics both show a certain unpredictability to reality. Science like this shows there is some underlying pattern. At the very least this is extremely interesting, at least for all of us that want our own universe some day.

--
+&x

Ooh, i'm a believer (sung to music) by Wah · 1999-09-02 07:06 · Score: 2

note: this is all dependent on whether this is real or just some disillusioned scientists. I tend to believe it, mainly because these scientists would most likely not be the type to publish normally, but until I see it from another source I won't totally believe it. That being said, let me argue like I do.

Let's say one night you watch the results of the lottery on TV, and the numbers '1-2-3-4-5-6' come up. Is that a rare occurrence? No. That sequence is as likely to occur as your birthday and your girlfriend's birthday combined into esoteric equations.

Example number 2: I'm with this girl one night. I say my astrological sign is Scorpio. "Really!" she exclaims, "I'm Scorpio too!" What are the probabilities of that happening? 1/144? No, just 1/12. At one point (and cryptos will be familiar with this) if you add people, it becomes a rare event that you do not find people with the same sign.

Both of the examples you give here are actually rare occurrences, not the number series themselves, but the fact that you recognize them as special series. You note their occurrence as extremely rare (the water cooler talk if the lotto was 1-2-3-4-5-6!!) thus in fact making them rare.

These guys were both looking at special curves, in fact random, that turned out to be the same. That is significant in the number of other patterns that can, or cannot, be explained. At the very least this will cause your insurance rates to go up :)

We're 6 billion on this Earth. It's bound to happen to someone. Same thing with winning the grand prize lottery once or twice.

That's what the story said, very rare occurrences are more likely. Check out the Drake Equation if you think that couldn't be significant.

cold fusion
this is different (so far) in that it was two totally separate areas of study that found the same thing, not some freaks in the desert.

Cool stuff regardless.
Slashdotia
pronounced Slash-dosh-ya? :)

--
+&x

Oh, another point on Chaos Theory... by Enoch+Root · 1999-09-02 04:26 · Score: 2

Since when is Chaos Theory ridiculed by the scientific community because it's a little wild? And how in hell could the popularity of Jurassic Park ruin the work of Chaos Theorists?

They're trying to sweeten up the deal by placing the guys behind this as innovators who took on a controversial path. That's just downright silly. I took Chaos Theory grad. courses in college, and let me tell you it's so widely-used that it's like saying electricity is a controversial theory. Let me also tell you that what they're trying to say has absolutely nothing to do with Chaos Theory.

I mean! I hope they never make a movie starring Jeff Goldblum about Newton's life, because we might end up refuting Classical Mechanics (even at non-relativistic speeds) tomorrow, wouldn't we? And those movies 'IQ' and 'Young Einstein' really ruined Relativity for me. Drat.

"There is no surer way to ruin a good discussion than to contaminate it with the facts."

Nothing new about this by jerrytcow · 1999-09-02 04:27 · Score: 2

This is exactly what one sees when plotting exponential distributions on a log scale. If you have a reaction where A -> B at some rate, then plotting, for example, event durations you would get this distribution as long as the x-axis is log. When working with any system where there is a delta G of reaction(s) the distribution is not Gaussian and you can see this graphically.
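A quick way to check this claim numerically: a minimal sketch, assuming event durations drawn from a single exponential distribution (the rate, sample size and seed are made-up illustration values). Taking the log of each duration, which is what a log x-axis does, should give a lopsided hump with its long tail toward short durations rather than a symmetric bell.

```python
import math
import random
import statistics

def log_duration_skewness(rate=1.0, n=50_000, seed=1):
    """Skewness of ln(duration) for exponentially distributed durations.

    rate, n and seed are illustrative assumptions, not values from the articles.
    """
    rng = random.Random(seed)
    logs = [math.log(rng.expovariate(rate)) for _ in range(n)]
    mean = statistics.fmean(logs)
    sd = statistics.pstdev(logs)
    return sum(((v - mean) / sd) ** 3 for v in logs) / n

# A Gaussian would give a skewness near 0; the log of an exponential comes out
# clearly negative (long tail on the short-duration side).
print(round(log_duration_skewness(), 2))
```

Whether this is the same curve as the one in the article cannot be settled from the article alone, since it gives no formula.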

Re:Sceptic in Slashdotia by Wah · 1999-09-02 07:32 · Score: 2

also go here halfway down the page
jump to "turcotte"

--
+&x

I knew it by ch-chuck · 1999-09-02 03:56 · Score: 2

this is just another ruse for the insurance company to raise my flood insurance premiums again!

Chuck
Conspiracy theorist

--
try { do() || do_not(); } catch (JediException err) { yoda(err); }

It is distribution-distribution plot by craw · 1999-09-02 07:39 · Score: 2

The graph is rather confusing. This is my interpretation of it. Go out in the field and count the number of critters and categorize them by their species (id). Then normalize this count by some factor (perhaps total number of critters that were counted). For instance, I counted 1K monkeys, 500 cats, 500 dogs, 480 turnips, 200 rats, 50 snakes, 10 roaches, 5 hippos, 3 programmers, and 2 script kiddies. Now plot this distro.

The monkeys were less rare and therefore plot to the right, while the programmers, and script kiddies are rare and plot to the left. The "mean" value is the dogs and cats; this plots more to the right.

So what they are saying is that there are more species that have a smaller (rarer) number of critters that they could find. The "most common" value corresponds to the "average" number of critters per species.

I'm guessing now, but if one did a similar survey of the world's population using nationality instead of species, one may get a similar type of distribution.
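The "distribution of a distribution" idea can be sketched with the counts from this post. The binning into doubling classes (2-3, 4-7, 8-15, ... individuals) is an assumption made here for illustration; the article does not say how its x-axis was binned.

```python
from collections import Counter

# Counts taken from the post above; the octave binning is an assumed choice.
counts = {
    "monkeys": 1000, "cats": 500, "dogs": 500, "turnips": 480, "rats": 200,
    "snakes": 50, "roaches": 10, "hippos": 5, "programmers": 3,
    "script kiddies": 2,
}

def abundance_octaves(species_counts):
    """Map octave k to the number of species with 2**k <= individuals < 2**(k+1)."""
    return Counter(n.bit_length() - 1 for n in species_counts.values())

# One row per abundance class, one '#' per species in that class.
for octave, n_species in sorted(abundance_octaves(counts).items()):
    print(f"{2 ** octave:>4}+ individuals: {'#' * n_species}")
```

The x-axis is abundance (rare on the left, common on the right) and the bar length is how many species fall in each class, which is one reading of what the graph is meant to show.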

As we all know... by mattdm · 1999-09-02 03:57 · Score: 3

One-in-a-million chances happen nine times out of ten.

--

Plagiarism by jafac · 1999-09-02 03:58 · Score: 2

Isn't this just Murphy's Law?

"The number of suckers born each minute doubles every 18 months."

--

These are my friends, See how they glisten. See this one shine, how he smiles in the light.

Log Normal Linux by craw · 1999-09-02 08:24 · Score: 2

How can one get a long-tailed statistical distribution as opposed to a symmetrical Gaussian distribution? There is one simple model that will generate this.

Suppose that ppl's programming skills are statistically Gaussian distributed. These ppl then decide to produce a "new" OS called linux. The contributions of these ppl are then plotted up. One would find that the majority of ppl produced a lot of "minor" improvements, smaller programs, scripts, and responses on mailing lists. There would be a smaller group of ppl that contributed a lot of important stuff.

This is the lognormal statistical distribution, IIRC. A bunch of ppl are capable of writing good code in support of this new OS. Unfortunately, only a smaller subset of these ppl have the time to work on the project for a long period of time. Then only a smaller subset of these ppl have the inclination to volunteer their services for this long period of time. Additionally, only a smaller portion of these ppl have the overall skills to do this. The result is that there are only a few ppl that have all of these attributes.

Sorry for this simplistic explanation (it is late and I should really be sleeping now). A log normal is really a summation of normal distributions in log space (multiplication in regular space). Another way to view this is to ask yourself a bunch of statistical what if questions (the questions should really generate a set of answers that are Gaussian distributed). When you answer no then you are out of the game. More ppl are eliminated early.
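The multiplication-in-regular-space idea can be sketched as follows. This is a toy model under stated assumptions: each person's output is the product of a dozen independent factors (skill, free time, inclination, ...), each drawn uniformly from [0.5, 1.5]. The factor range, the number of factors and the population size are all invented for illustration.

```python
import math
import random
import statistics

def skewness(values):
    """Standardised third moment: about 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def simulate_contributions(n_people=20_000, n_factors=12, seed=7):
    """Each person's output is the product of independent random factors."""
    rng = random.Random(seed)
    return [
        math.prod(rng.uniform(0.5, 1.5) for _ in range(n_factors))
        for _ in range(n_people)
    ]

output = simulate_contributions()
# The raw product is strongly right-skewed (a few big contributors), while its
# log is a sum of independent terms and is much closer to symmetric.
print("skewness of output:     ", round(skewness(output), 2))
print("skewness of log(output):", round(skewness([math.log(v) for v in output]), 2))
```

The log of a product is a sum of logs, so the usual sum-of-many-terms argument applies in log space, which is what makes the product roughly lognormal.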

In other words by rde · 1999-09-02 03:59 · Score: 3

The new curve is broader and more gently sloping, suggesting that the rarest events occur more often than predicted by the bell-shaped curve.
Or, as wizzards have known for years, million to one chances happen nine times out of ten.

But seriously, folks. In terms of its applicability to pretty much everything, this reminds me a lot of an article in New Scientist that I also found darn interesting.

This says *absolutely* nothing by hawk · 1999-09-02 04:55 · Score: 2

The authors have found things that were mismodeled as Gaussian and instead follow another distribution. So what? There are plenty of distributions besides the normal that are asymmetric and have fatter tails.

It *may* be that they've found another distribution that appears in multiple fields, but there's not enough here to judge this as a statistician. If it has any parameters beyond mean and variance, I'm not likely to be impressed--I can probably produce a three parameter beta distribution that's close.

hawk,wearing his Ph.D. statistician hat for the moment

You're Surprised?? by BadlandZ · 1999-09-02 05:05 · Score: 2

Come on now people... Forgive me, but this is hardly shocking.

I looked over the articles, and all I can say is "So what?" The Gaussian distribution is based on pure randomness. Did you expect everything to be a completely random event?

Neither article seems to go into great detail about how the new curve was calculated, but it's simply a _FACT_ that applying the Gaussian distribution to most events is considered a "simplification" of the problem, assuming it's random. Take away some randomness, and of course the Gaussian distribution won't fit.

Intelligence (however measured) will not be purely random, nor will floods, grade distributions, tornadoes, or anything else.

What's missing from both of these pieces is an explanation of how the new curve was built, and on what foundation. The Poisson (spelling is way off there) distribution is frequently used in place of the Gaussian because it "fits better," but again, that doesn't prove that the events have much to do with the math.

This is a case of "curve fitting gone wild" here, and unless I can see someone spell out in scientific detail the relationship between the events and the distribution, I don't buy it. So, they have a new equation, and a new curve; it doesn't mean that the events are related to the math directly. If you look for anything hard enough, you will start to find it everywhere.

I do award them credit for a new curve that better fits some models, if the equation for their curve is manageable. If it's a complex equation, it's worthless, because the whole point is to make some equation fit a distribution of events. If theirs fits, and it's easy to calculate, it's beneficial. But it does not imply a direct correlation between the functions and the variables in the distribution. How do I explain this in SlashDot terms... (/me gets frustrated).

Ok, take Moore's Law, you all know that, right? Processor power doubles every 18 months? Or, more accurately, I believe he stated something to the effect that the number of circuits would double every 18 months. Well, a loosely fit exponential function will almost match this trend (roughly). But then you have to "adjust" the month scale between 12 and 24 until the curve fits well. Now, that's a "model" but it does not prove scientifically that circuits and design engineers are behaving exactly as can be predicted. At some point in the future, everyone has predicted, Moore's Law will fail. See... It's a model! Curve fitting.... Doesn't PROVE anything about what's going on in developers' minds, or much tangible other than the "estimation" that things will get more powerful in the computing world.

Now, take it a step further: say Moore's Law fails right as people develop a new method of increasing computing performance, like say 3D circuits, or something not yet conceived, and with fewer "countable circuits" you get more performance. Suddenly, new devices start to have a few less circuits, and more power. Now the Moore's Law curve goes down, slowly at first, leveling off, and maybe dropping just a tad, and it starts to look like a "bell shaped curve" only half drawn. You could go "Curve fitting crazy" and say "Hey, it's Gaussian, it's going to go down now, and within another 15 years, we will all be back to 8 bit processors!" That's just idiotic.

In short, curve fitting is useful to predict many things, but it cannot be assumed that the curve implies natural phenomena. Any curve that fits data is useful. A curve that fits data does not directly imply complete correlation of events, or definitive proof that God does or doesn't play dice (hope he does personally, has to have fun sometime!). And furthermore:

For those who continue to doubt that it could all be so simple, Prof Turcotte has a suitably direct response. "People say: 'You can't do it because it's too complicated a problem'," he says. "We say: 'Just look at the data'."

So his data fit, so what? Any reasonable math whiz should be able to come up with a few dozen equations that fit a line. Doesn't prove a thing.

Forgive my typos, bad grammar, and spelling, I got pretty pissed at tabloid junk science, and I had to vent. Feel free to prove me wrong, I would like to see how you can prove the new equation and chaos theory is the best "insight into the universe" we have... BTW, if you can prove it, you'll probably be up for a Nobel Prize too.

Re:What the curve looks like. by smale · 1999-09-02 06:03 · Score: 2

from the second article:

"The reason the systems behaved in the same fashion, they agreed, was that they shared a feature known as self-similarity. If an object is self-similar, it means it looks the same when viewed from far away or nearby. One example is the cauliflower: just as it is made up of individual florets, so each floret is made up of still smaller florets. If you were given a picture with no sense of scale, you could not tell if you were looking at a whole cauliflower or just one floret."

I grepped the article for "fractal" and not once was it mentioned. Gee, I'm pretty sure that's the term used for what the author describes, or is the target audience so simplistic that the proper terms have to be dumbed down?

Fear the popular press's interpretation of mathematical research data, especially when they need to mention Jurassic Park in the body of the story.

Hurst processes, anyone? by Kaa · 1999-09-02 17:02 · Score: 2

In my own field, the distribution of stock market returns is often taken to be distributed log-normal

You can also start with log returns (instead of "normal" returns). This will give you an approximation to a Gaussian (as opposed to a lognormal distribution), plus they are summable across time. I work almost exclusively with log returns -- they are a pain when you need to calculate portfolios, but nice otherwise.
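The "summable across time" property is easy to check. A minimal sketch with hypothetical closing prices (the numbers are invented for illustration):

```python
import math

# Hypothetical closing prices, invented for illustration.
prices = [100.0, 102.0, 99.0, 105.0, 103.0]

simple_returns = [b / a - 1.0 for a, b in zip(prices, prices[1:])]
log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]

total_log_return = math.log(prices[-1] / prices[0])
total_simple_return = prices[-1] / prices[0] - 1.0

# Log returns add up across periods; simple returns compound instead, so
# their plain sum does not equal the whole-period simple return.
print(math.isclose(sum(log_returns), total_log_return))
print(math.isclose(sum(simple_returns), total_simple_return))
```

The sum of the per-period logs telescopes to the log of the last price over the first, which is why multi-period log returns are a simple sum.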

A new distribution that gives increased weight to rare events would be very useful

There are several (e.g. Cauchy), but the problem is that they are much harder to deal with (analytically) than the Gaussian. And if you don't like any, you can always work with the empirical distribution -- no need to pollute the facts with assumptions about what they should be. However, not much of statistics will be useful to you -- the Bayesians offer some good tools.

Getting back to the original point, I wonder if these guys heard of Hurst and Hurst processes. A persistent Hurst process (sometimes called black noise) will generate something like what they found, and Hurst himself developed his theory on the basis of natural phenomena (he started with the frequency of floods on the Nile which occurred, surprise, more often than should have been expected). Skim through Peters "Fractal market analysis" for more information.

I bet these guys rediscovered Hurst processes.

Kaa

--

Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.

Fat tails by Anonymous Coward · 1999-09-02 06:15 · Score: 2

This is a well-known fact in statistics and finance, so-called fat tails.

Baloney by ruff · 1999-09-02 03:59 · Score: 3

Just because your data doesn't precisely fit the distribution, it does not mean the distribution is "wrong." What it means is your data doesn't match your distribution.

This appears to be another case where journalists have missed the point.

The Gaussian distribution is not "wrong" in any shape or form.

How does this relate to standard deviations? by Mr+Z · 1999-09-02 04:04 · Score: 4

First, let me say that the graph in the article is poorly labeled (or at least their example poorly chosen), IMHO, since "rarity" is related to the number of standard deviations you are from the mean (whether or not the distribution is symmetrical), whereas their graph has rarity monotonically decreasing from left to right. I guess in this sense ("rarity of a species"), rarity != probability.

This new graph strikes me as a bit odd, since it's not symmetrical. With the bell curve, you only need to know how many standard deviations you are from the mean. With this curve, "above the mean" and "below the mean" are vastly different territories.

This curve brings up two questions for me:

Are there processes/events for which the mirror-image of this curve is the more appropriate distribution?
Whatever happened to the other distributions we know and love, like the Poisson distribution? Not all random events are evenly distributed, and we've known this for a long time.

I guess this new curve is just another way of saying that "Hey, there's a class of 'random' events out there that share a common non-uniform distribution!" While that's useful to know, I don't see it as the ultimate refutation of the Gaussian distribution.

--Joe
--

--
Program Intellivision!

Interesting not exceptional by PG13 · 1999-09-02 04:04 · Score: 5

The use of the Gaussian curve is based on the assumption that the random variable we are considering is actually generated as an average of many, many independent random variables. It has been shown that, for all 'reasonable' independent random variables, their average tends in the limit to a Gaussian distribution. This is straightforward mathematics; there's no arguing with this.
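The limit statement can be watched numerically. A minimal sketch, assuming the underlying variables are Exp(1) draws (a deliberately lopsided choice; the sample sizes and seed are arbitrary): as more terms are averaged, the share of averages falling within one standard deviation of the mean should move toward the Gaussian value of about 68.3%.

```python
import math
import random

def fraction_within_one_sd(n_terms, n_samples=5_000, seed=3):
    """Share of averages of n_terms Exp(1) draws within one sd of the mean.

    Exp(1) has mean 1 and standard deviation 1, so the average of n_terms
    draws has mean 1 and standard deviation 1/sqrt(n_terms).
    """
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(n_terms)
    hits = 0
    for _ in range(n_samples):
        avg = sum(rng.expovariate(1.0) for _ in range(n_terms)) / n_terms
        if abs(avg - 1.0) <= sd:
            hits += 1
    return hits / n_samples

# A Gaussian puts about 68.3% of its mass within one standard deviation.
for n_terms in (1, 5, 50):
    print(n_terms, round(fraction_within_one_sd(n_terms), 3))
```

This only illustrates convergence for averages of independent terms; it says nothing about quantities that are not built that way, which is the point made in the following paragraphs.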

As such, from a mathematical point of view this has nothing to do with replacing the Gaussian curve... it is still clearly the most 'natural' mathematical curve. However, what I understand the authors to be claiming is that certain types of real-world events are not actually Gaussian and are described better by this model. This shouldn't be that surprising, as often the 'extreme' cases are not caused by a mere sum of the independent random variables mentioned earlier.

For instance, intelligence might be regarded as the influence of a great many small random variables (how some genes got arranged, upbringing, etc.), but the truly tail-end cases such as mental retardation do not occur because all of these factors go bad (someone who is retarded is usually the result of some genetic defect, not a combination of bad upbringing, poor nutrition, etc.). This is probably not the kind of thing the distribution describes, but it shows that the Gaussian really never has been the be-all and end-all.

So while this is undoubtedly a very interesting subject, it really isn't that exciting. Oh, and the claim that the greater incidence of natural disasters disproves the Gaussian was really BS; while they may not be Gaussian, this doesn't appear to be a large enough sample size to make such definitive claims.

--
Marriage is the "pseudo-ethics" that cloaks the messy truth of sexuality in the raiment of propriety -- it's "Don't Ask,

Another thought: When one side is near saturation. by Mr+Z · 1999-09-02 04:11 · Score: 2

I happened to think of one possible reason why so many phenomena might fit a lopsided curve better: The bell curve implies the possibility of infinite extension in both directions. If the mean of the distribution is near one physical extreme (for instance, looking at average rainfall levels -- you can't have negative rainfall), then the curve must become lopsided.

Perhaps that's what they've stumbled onto?
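The saturation idea can be tried out directly. A rough sketch, assuming "rainfall" that would be Gaussian with mean 1 and standard deviation 1 if negative values were allowed; those parameters and the sample size are made up so that the mean sits close to the physical floor at zero.

```python
import random
import statistics

def rainfall_sample(mean=1.0, sd=1.0, n=20_000, seed=11):
    """Draw Gaussian values but discard anything below zero (no negative rainfall)."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        value = rng.gauss(mean, sd)
        if value >= 0.0:
            kept.append(value)
    return kept

sample = rainfall_sample()
# In a symmetric distribution the mean and median coincide; here the surviving
# upper tail pulls the mean above the median.
print(round(statistics.fmean(sample), 3), round(statistics.median(sample), 3))
```

A hard floor is only one way to get a lopsided curve, so this does not show that it is what the researchers found.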

--Joe
--

--
Program Intellivision!

Re:Lots of Pratchett fans here? by dillon_rinker · 1999-09-02 18:11 · Score: 2

And if you want the technical exposition (rather than the narrative one he provides in the novels), then pick up Pratchett's Discworld RPG from Steve Jackson Games. :) Hmmm...if Bill Gates lived on the Discworld, who would come for him when he died? And since rare events are much more common under this newly discovered distribution than under a Gaussian distribution, and since the Discworld is said to reside under the far tails of the probability curve, does that mean there are more Discworlds than were previously believed to exist?

Re:Nor is it particularly right... by dillon_rinker · 1999-09-02 18:23 · Score: 2

YEAH! RIGHT ON!

I feel the same way about the "least squares" technique for determining the line of best fit. It is popular precisely because it is easy to do calculus on x^2.
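The "easy to do calculus on x^2" remark can be made concrete. Setting the derivatives of the summed squared residuals to zero gives two linear equations with a closed-form solution, sketched below; the data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Slope and intercept minimising the sum of squared vertical residuals.

    Setting the derivatives of sum((y - a*x - b)**2) with respect to a and b
    to zero gives two linear equations, which is why a closed form exists.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points lying exactly on y = 2x + 1.
print(least_squares_line([0, 1, 2, 3], [1, 3, 5, 7]))
```

Minimising absolute residuals instead has no such closed form in general, which is part of why the squared version became the default.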

Mmmmm, curves by Kismet · 1999-09-02 04:15 · Score: 2

I personally prefer the more voluptuous curves.

One of the "authors" replies by Anonymous Coward · 1999-09-02 22:09 · Score: 2

September 3, 1999

Wow!

It is interesting to see the response that this "research" article in the Financial Times generated. I'm a research associate (Bruce Malamud) working closely with Donald Turcotte. A student wrote me about the discussion your web site was having. Donald Turcotte was one of the scientists "quoted" in the Financial Times article. My research has been in the areas of "time-series analysis" and also applying ideas of fractals and self-organized criticality to natural hazards. I did my Ph.D. with Donald Turcotte and am now doing a brief stint as a postdoc while I look for a "real" job in the world.

First of all, this Financial Times article was a "quickly" researched article on the part of the person who wrote it. Donald Turcotte was contacted and interviewed by phone on Tuesday/Wednesday, with no contact afterwards from the Financial Times to see how correct they got the overall picture. This is how things are and he and I both gulped when we saw how the article appeared. We quickly prepared a short "response" from him (below) to the deluge of e-mails and telephone calls that he received yesterday.

Bottom line, he was a bit misquoted, but the general idea holds. We are talking about applying the ideas of power-law frequency-size distributions (i.e., fractals) to extreme events, including floods, forest-fires, earthquakes, landslides, etc. Donald Turcotte has been active for many years in the area of applying fractals, self-organized criticality, and chaos theory to the earth sciences, and yes, he knows very well that he did not "invent" the idea, just made many applications (well, a bit more than that, but read his book).

On the most basic level (and no, I'm not trying to be insulting, I'm sure many people on this site know what I'm talking about already as this is basic statistics), the idea is a very simple one. Plot the frequency-size distribution of a set of data and see which curve best fits the data, i.e. what might be the underlying distribution. For some sets of data (such as forest-fire burn areas, earthquakes, and many other "natural" data sets) the frequency-size distribution follows a nice straight line in log-log space, i.e. it follows a power-law (fractal or self-similar) distribution. Although one cannot say for SURE what an underlying distribution is, one can make certain (statistical) guesses as to whether a distribution more closely follows a Gaussian, log-normal, power-law, etc.
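The straight-line-in-log-log-space check can be sketched as follows. This is an illustration on synthetic data, not the procedure from any of the cited papers: the "event sizes" are drawn so that the fraction of events larger than s is s**-1.5, and a least-squares line is fitted to the cumulative frequency-size points. The exponent, sample size and seed are assumptions.

```python
import math
import random

def loglog_slope(sizes):
    """Least-squares slope of log10(fraction of events >= size) against log10(size)."""
    ordered = sorted(sizes)
    n = len(ordered)
    xs = [math.log10(s) for s in ordered]
    # The i-th smallest event (0-based) is equalled or exceeded by n - i events.
    ys = [math.log10((n - i) / n) for i in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic "event sizes" with P(size > s) = s**-1.5 for s >= 1.
rng = random.Random(42)
sizes = [rng.paretovariate(1.5) for _ in range(5_000)]
print(round(loglog_slope(sizes), 2))
```

The fitted slope should come out near the negative of the exponent. A straight line on such a plot is suggestive rather than conclusive, since other heavy-tailed distributions can look nearly straight over a limited range.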

Once one "believes" that a set of data follows a certain distribution, one can then begin to make some guesses as to what an "extension" of that curve might bring in time. If one has 30 years of flood-discharge data, one might then be able to make certain predictions as to the "size" of what the 100-year flood might be. Same with earthquakes. One has a better idea of the probability of having a certain size or greater earthquake, flood, forest-fire, etc. each year. It just happens that many of these events appear to follow power-law distributions, and these are not as "accepted" in the statistical community.

Don just came in and is looking over my shoulder. He adds (to my above comments) that statisticians do not in general recognize power-law distributions because one cannot define a pdf for them. (Although one can define pdf's for certain distributions that are similar distributions to the power-law distributions, such as the Pareto distribution).

So... in terms of the insurance community, they are of course very interested in whether a given "natural hazard" appears to follow more a power-law distribution vs. log-normal or Gaussian, as the resulting recurrence intervals will be very different. Power-law distributions tend to be very conservative for extreme events, i.e. one would expect more large events in a given period of time than with, say, a Gaussian distribution. Others of course interested in this underlying distribution would be engineers trying to decide how big a flood one might expect in a given area in a given amount of time (and yes, we're dealing with extreme events, so the statistics are small and unsure), so as to know where people can build houses, how deep to make the bridge supports, etc. Bottom line is the statistics are unsure because the data sets are small, but people need some sort of a starting point as a lot of money rides on the answers of what the "underlying" distribution might be.
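How different the recurrence intervals can be is easy to see with toy numbers. A sketch under loud assumptions: annual peak "sizes" in arbitrary units, a Gaussian with mean 1 and standard deviation 1 on one side, and a Pareto-type tail with minimum size 1 and exponent 2 on the other. None of these parameters come from real flood records.

```python
import math

def gaussian_exceedance(x, mean, sd):
    """P(X >= x) for a Gaussian with the given mean and standard deviation."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))

def power_law_exceedance(x, x_min, alpha):
    """P(X >= x) for a Pareto-type tail: (x_min / x)**alpha for x >= x_min."""
    return (x_min / x) ** alpha

def recurrence_interval(annual_exceedance_probability):
    """Average number of years between events at least this large."""
    return 1.0 / annual_exceedance_probability

# Hypothetical annual peak sizes; all parameters are invented for illustration.
for size in (2.0, 4.0, 6.0):
    t_gauss = recurrence_interval(gaussian_exceedance(size, mean=1.0, sd=1.0))
    t_power = recurrence_interval(power_law_exceedance(size, x_min=1.0, alpha=2.0))
    print(f"size {size}: Gaussian {t_gauss:,.0f} yr, power law {t_power:,.0f} yr")
```

For the largest sizes the Gaussian return period runs to geological timescales while the power-law one stays within decades, which is the sense in which the power law is the conservative choice for planning.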

There are also many scientific implications, ranging from the simple "describing" what distribution a data set best follows, to understanding better (or in a different way) the underlying basic physics or equations that describe a given natural phenomenon due to a better understanding of the statistics resulting from the equations vs. the actual data. In addition, many scientists are now beginning to think that the pervasive power-law distribution in nature is a general indication of self-organized critical behavior. One definition of self-organized behavior is when one has a complex system with a small steady input, and a power-law distribution of the "avalanches" (the events). Donald Turcotte and I wrote a paper (in Science, see below) applying this general idea of self-organized criticality to computer models and forest fires. Of the references listed below, this is probably the easiest for people to get.

OK, before I start babbling. Below is the "reply" that Donald Turcotte wrote to many of the e-mails that came in during the last day.

Bruce Malamud

_________________________________________
Wednesday September 2, 1999
Ithaca, NY, USA

Dear Interested Reader:

Due to the large number of e-mails and telephone calls I have received with respect to the articles by Michael Peel, "New Curve Makes Life Predictable" and "Redrawing the Curve Reveals New Pattern of Events", that appeared in the Financial Times, September 2, 1999, I have prepared a short general reply. If you have further questions or comments after reading the below "comment" to the article, please do not hesitate to contact me for further information.

These Financial Times articles emphasize the importance of power-law (also called fractal or fat-tail) distributions in estimating the probability of occurrence of extreme events. It is unfortunate the article implies that I invented the idea of power-law distributions, which have been recognized now for many decades. For instance, earthquake hazard assessment is based mainly on the Gutenberg-Richter relation; which is a power-law distribution of the number of earthquakes as a function of their magnitude [for some papers where I discuss this, see DLT, Annual Review of Earth and Planetary Sciences, Vol. 19, p. 263-281, 1991; DLT, Physics of Earth and Planetary Interiors, Vol. 111, 275-293, 1999].

My work in power-law distributions is based on the concept of fractals, which is due to the pioneering work of Benoit Mandelbrot [for instance, see his book, The Fractal Geometry of Nature, Freeman, San Francisco, 1982]. Mandelbrot, along with many other researchers, has applied the concept of fractals to many phenomena in the natural and "man-made" world, including to financial time series. Other distributions, similar to the power-law, such as the Pareto distributions, have also been used for a long time. A good web page which discusses fractals and has many links is The Spanky Fractal Database (http://spanky.triumf.ca/).

My own contributions have concerned applications to natural hazards and related phenomena. These are set forward in detail in my book [DLT, Fractals and Chaos in Geology and Geophysics, 2nd ed., Cambridge University Press, Cambridge, 1997] and in a major review paper on self-organized criticality [DLT, Reports on Progress in Physics, Vol. 62, 1999, available as a pdf document (preprint) which can be sent upon request].

The principal contributions of my group have been the applications of fractal distributions to:

(1) Fragmentation (by explosions in asteroids, etc.). [DLT, Journal of Geophysical Research, Vol. 91, p. 1921-1926, 1986]

(2) Mineral deposits. [DLT, Economic Geology, Vol. 81, p. 1528-1532, 1986]

(3) Floods. [DLT and L. Greene, Stochastic Hydrology and Hydraulics, Vol. 7, p. 33-40, 1993; DLT Journal of Research NIST, Vol. 99, p. 377-389 1994; B.D. Malamud, DLT, and CC Barton, Environmental and Engineering Geosciences, Vol. 2, p. 479-486, 1996. The last paper is available as a pdf document at http://coastal.er.usgs.gov/barton/pubs_online.html ]

(4) Landslides. [J.D. Pelletier, B.D. Malamud, T. Blodgett, and DLT. Engineering Geology, Vol. 48., p. 255-268, 1997; available as a postscript file at http://www.gps.caltech.edu/~jon/]

(5) Forest Fires. [B.D. Malamud, G. Morein, and DLT. Science, Vol. 281, p. 1840-1842, 1998; available as a pdf document for subscribers of Science, web site: http://www.sciencemag.org/]

Many extreme-value events are directly related to time series that exhibit persistence or memory (for instance, time series of temperature, river discharge, the stock market, etc.). A good reference to applying persistent techniques (and a discussion of how to apply the techniques) is Advances in Geophysics, Vol. 40, B.D. Malamud, J.D. Pelletier, and DLT.

Two other colleagues that have used power-law techniques applied to natural hazards include Dr. Bruce D. Malamud (Cornell University, e-mail: Bruce@Malamud.Com) and Dr. Christopher C. Barton (USGS, e-mail: barton@usgs.gov, home page: http://coastal.er.usgs.gov/barton/).

Again, please do not hesitate to contact me for further questions.

Donald L. Turcotte
Maxwell Upson Professor of Engineering

:::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::
:: Donald L. Turcotte
:: Department of Geological Sciences
:: Cornell University, Snee Hall
:: Ithaca, NY 14853-1504, USA
:: Office: 607-255-7282; Fax: 607-254-4780
:: e-mail: turcotte@geology.cornell.edu
:::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::

Sceptic in Slashdotia by Enoch+Root · 1999-09-02 04:19 · Score: 4

Alright. I don't buy it.

The problem here is how you define and measure a rare occurrence. Let me give you an example.

Let's say one night you watch the results of the lottery on TV, and the numbers '1-2-3-4-5-6' come up. Is that a rare occurence? No. That sequence is as likely to occur than your birthday and your girlfriend's birthday combined into esoteric equations.

Example number 2: I'm with this girl one night. I say my astrological sign is Scorpio. "Really!" she exclaims, "I'm Scorpio too!" What are the probabilities of that happening? 1/144? No, just 1/12. At some point (and cryptos will be familiar with this), if you keep adding people, it becomes a rare event not to find two people with the same sign.
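For anyone who wants to check the arithmetic, here is a minimal sketch of the sign-matching odds. It assumes all twelve signs are equally likely (only roughly true in practice), and the function names are made up for illustration.

```python
from fractions import Fraction

SIGNS = 12  # assumption: every sign is equally likely


def p_pair_matches(signs: int = SIGNS) -> Fraction:
    """Chance that a second person has the same sign as the first."""
    return Fraction(1, signs)


def p_some_shared_sign(people: int, signs: int = SIGNS) -> Fraction:
    """Chance that at least two of `people` share a sign (birthday-problem style)."""
    if people > signs:
        return Fraction(1)  # pigeonhole: a repeat is certain
    p_all_distinct = Fraction(1)
    for k in range(people):
        # The (k+1)-th person must avoid the k signs already taken.
        p_all_distinct *= Fraction(signs - k, signs)
    return 1 - p_all_distinct


print("two people:", p_pair_matches())
for n in range(2, 8):
    print(n, "people:", round(float(p_some_shared_sign(n)), 3))
```

Under that assumption, a shared sign is already more likely than not in a group of five (about 0.62), which is the same effect as the birthday problem with only 12 "days".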

All that graph is showing me is that the guys (I'm hesitating to call them scientists - I mean, they published in "serious papers"? Come on. Names, please) looked purposefully for freak occurrences, discarding other "rare" occurrences that were perfectly normal. That's why the left side of the graph is wider.

Thing is, the Gaussian curve doesn't come out of nowhere; it's not arbitrary. For instance, in statistical mechanics and quantum mechanics, you get bell curve distributions precisely because of the distribution of particle states.

All these guys are saying is, "rare events are not as rare as we think they are". That's not because the bell curve is wrong, it's because we seem to forget how huge a sample the Earth provides.

What are the odds of being struck by lightning twice? One in a billion? We're 6 billion on this Earth. It's bound to happen to someone. Same thing with winning the grand prize lottery once or twice.
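The "bound to happen to someone" claim can be made concrete with a short sketch. The one-in-a-billion figure is just the rhetorical number from the post, not a measured rate, and the calculation assumes every person is an independent trial with the same odds.

```python
import math


def expected_hits(p: float, population: int) -> float:
    """Expected number of people the event happens to."""
    return p * population


def p_at_least_one(p: float, population: int) -> float:
    """1 - (1 - p)**population, computed in a numerically stable way."""
    return -math.expm1(population * math.log1p(-p))


p = 1e-9                    # assumed per-person odds (the post's figure)
population = 6_000_000_000  # world population as stated in the post

print("expected number of people:", expected_hits(p, population))
print("chance it happens to at least one:", p_at_least_one(p, population))
```

With these assumed numbers the expected count is about 6 people, and the chance that it happens to at least one person is roughly 0.9975.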

And, again, same thing with floods or tornadoes. Yes, in themselves they're rare. When taken alone they seem improbable. But on the scale of the planet, that's the kind of thing that happens.

Alright, anyone got another article on cold fusion lying around?

"There is no surer way to ruin a good discussion than to contaminate it with the facts."

Re:Having trouble understanding the graph... by Squid · 1999-09-02 04:19 · Score: 2

The graph makes more sense if you relabel its axes: x=number of individuals of a species, y=number of species with exactly that number of individuals.

In other words: we aren't talking about the likelihood that you will encounter an individual of the species, we're talking about counting the species itself. A few really common species, a good spread of "average" species, and a few species represented by few individuals.
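A tiny sketch of that relabelling, using an invented survey (the species and counts below are hypothetical, not data from the article): x is the number of individuals, y is the number of species with exactly that many.

```python
from collections import Counter

# Hypothetical survey: species name -> number of individuals counted.
survey = {
    "sparrow": 120,
    "finch": 40,
    "wren": 40,
    "owl": 3,
    "hawk": 3,
    "kite": 3,
    "eagle": 1,
}


def species_per_abundance(counts: dict) -> Counter:
    """Map abundance (x) -> number of species with exactly that abundance (y)."""
    return Counter(counts.values())


hist = species_per_abundance(survey)
for individuals in sorted(hist):
    print(individuals, "individuals:", hist[individuals], "species")
```

The point is only that the graph counts species, not individuals: each species contributes exactly one unit to the bar for its abundance.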

'Course I could just be full of it. Wouldn't be the first time...

--
~ radiographite: art by john shepard

Re:Having trouble understanding the graph... by Anonymous Coward · 1999-09-02 04:23 · Score: 2

>In the 1st article, there is a graph about midway that appears to illustrate the notion that, with the new curve, you are more likely to find the rarest creature than the least-rarest creature. I must not be interpreting right, and I tried reading that part a few more times.

It took me a minute, too - I'll try to distill my understanding into English. Assume that the rarity of a species is related to the number of times it is found (duh). The x-axis can be thought of as the number of findings of a given species. The y-axis can be thought of as the number of species that were found X number of times. Using the Gaussian distribution, you would expect a symmetric tail-off in both the more-rare and the less-rare directions from the peak value. {Yes, I know you can have asymmetric Gaussian distributions.} What this new curve is showing is that the tail-off is much less in the more-rare direction. In other words, assume the peak of the curve is at 100 sightings of a species, with a standard deviation of 10 sightings. You would expect some number of species to have 130 sightings (3-sigma). Under the Gaussian distribution, you would expect to see the same number of species that have only 70 sightings. This new distribution says that the number of species with only 70 sightings would be much higher than the number of species with 130 sightings.
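The symmetric half of that example is easy to verify. The sketch below evaluates a plain Gaussian density with the mean of 100 and standard deviation of 10 used above; it does not attempt to model the new, lopsided curve, since no formula for it is given in this thread.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


mean, sd = 100.0, 10.0
low = normal_pdf(70.0, mean, sd)    # 3 sigma below the peak
high = normal_pdf(130.0, mean, sd)  # 3 sigma above the peak
peak = normal_pdf(mean, mean, sd)

print("density at 70: ", low)
print("density at 130:", high)
print("each as a fraction of the peak:", low / peak)
```

Both tails come out identical, which is the Gaussian expectation that the reported curve is said to violate on the rare side.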

Fascinating - I will certainly have to explore this further.
