Distributed Computing World Climate Simulation
Burnt Offerings writes: "The BBC reports that scientists at climateprediction.com are nearing the completion and public release in late summer of a distributed computing project that simulates the world's climate from 1950-2050 AD. It seems that each user's simulation will have different initial conditions built into their runtime simulation and a single completed simulation from 1950-2050 AD takes on average eight months (Doh!), assuming average household computing power. The results will be sent back to the project's team, where they will select the models that resulted in the 'real' climate patterns that occurred from 1950-2000. I presume they will then use these validated models to help extrapolate the world's climate from 2000-2050. Pretty cool (or should I say warm? or hot?)."
It's generally regarded as a Bayesian technique. Actually, there's far more to Bayesian statistics than bootstrapping, but it's the part I spend a lot of time working with. In fact, I suppose that bootstrapping isn't fundamentally a Bayesian process, but it is highly empirical so it appeals to the same "crowd" as more decidedly Bayesian techniques. By the by, "Bayesian" statistics are statistics that make heavy use of Bayes' Rule to incorporate prior knowledge not included in your measured data.
My background - you develop a program to predict something biological. Let us say, to pick a problem on the same order of difficulty as predicting the weather, that you're trying to predict the three-dimensional conformation that proteins assume, based on their sequence.
Now, okay, you have a bunch of known sequences, which other people (personally, I do both the data mining and some crystallography) have attached to known structures. So, what do you do?
Well, you could fiddle with your program until it predicts really well on those sequences, and announce that it was good. This is "Bad Science", as the parent-poster points out, since the criteria are arbitrary - you have a tendency to "discover" random noise in the data, and you have no way of validating your results.
So, second option. Instead, you split the data in half at random (actually into more than 2 pieces, but conceptually in half.) You take one half, and you make the model predict as well as you can on that data. Then, you VALIDATE ON THE OTHER HALF OF THE DATA. You *never* change the model on the basis of the second half of the data - that is arbitrary/bad/cheating. This is called "bootstrapping". It has nothing to do with compiler installation.
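A minimal Python sketch of the split described above, on made-up data. The threshold "model", the function names, and the synthetic records are all assumptions for illustration, not anything from the protein work. Worth noting: this procedure is more commonly called hold-out validation (or cross-validation when there are more than two pieces); in the statistics literature "bootstrapping" usually refers to resampling with replacement.

```python
import random


def split_holdout(records, train_fraction=0.5, seed=0):
    """Shuffle a copy of the records and split it into (train, holdout)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(threshold, data):
    """Fraction of (x, label) pairs where 'x >= threshold' matches the label."""
    hits = sum((x >= threshold) == label for x, label in data)
    return hits / len(data)


def fit_threshold(train):
    """Hypothetical 'model': the cutoff that scores best on the training half."""
    candidates = sorted(x for x, _ in train)
    return max(candidates, key=lambda t: accuracy(t, train))


# Synthetic records: a noisy relationship between x and a boolean label.
data_rng = random.Random(1)
records = []
for _ in range(200):
    x = data_rng.gauss(0, 1)
    records.append((x, x + data_rng.gauss(0, 0.5) > 0))

train, holdout = split_holdout(records)

# All tuning happens on the training half only.
threshold = fit_threshold(train)
train_acc = accuracy(threshold, train)

# The holdout half is scored once and never fed back into the model.
holdout_acc = accuracy(threshold, holdout)

print(f"threshold={threshold:.3f} train={train_acc:.2f} holdout={holdout_acc:.2f}")
```

The number to report is the holdout accuracy. If you go back and adjust `fit_threshold` after seeing it, the holdout half has become training data and you need fresh data to validate on.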
So, as far as most scientists (as opposed to mathematicians) are concerned, the important question is - does this work? In the biological sciences, I can say categorically, yes, this bootstrapping technique has a proven track record. It does work. Obviously, you can screw up (using non-representative data is a good start) but the technique, when properly applied, is sound.
So, I assume it would work for predicting the weather, as well. By work I mean - you would know how well your software predicted the weather. Bootstrapping is not a means of predicting the weather in and of itself, merely of honestly evaluating the effectiveness of a weather prediction mechanism you already have.
The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
Perhaps it's not impossible, but no-one has been able to do it yet. That's why they're resorting to this...
Can anybody read between the lines here? They're essentially saying, "Every climate model we have (that predicts global warming) wasn't able to accurately predict the global warming of 1900-2000. We're fresh out of ideas so let's run a couple of million models with varying random values. When one of them (inevitably) comes pretty close we can cling to that as 'proving' it to be a working model and use its results as convincing evidence that we must cut CO2 production or we will all die."
I'm not giving these jokers a minute of my CPU time. They are guessing. They don't have a workable model, so instead of continuing to think it through they're in a rush to get a "verified" (by past events) model within a year so they can try to use the results to push their political agenda. The fact that a few of the millions of models they run correctly guess the last 50 years of climate change is no indication they will predict future climate change unless there is a reasonable belief that the models were based on some logic. These models are based on random guesses at chaotic values.
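The selection worry can be shown with a toy Python experiment. Everything here is an assumption for illustration: the "observed" record and every "model" are independent random series with no physics in them at all, which is the worst case this comment describes and not a claim about how the real project builds its models.

```python
import random
import statistics

YEARS = 100  # stand-in for 1950-2050
HALF = 50    # stand-in for the 1950-2000 "known" period

rng = random.Random(42)


def random_series(length):
    """A series of independent random values, with no structure at all."""
    return [rng.gauss(0, 1) for _ in range(length)]


def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


observed = random_series(YEARS)
models = [random_series(YEARS) for _ in range(5000)]


def hindcast_error(model):
    """Error against the 'known' first half."""
    return mean_abs_error(model[:HALF], observed[:HALF])


def forecast_error(model):
    """Error against the 'future' second half."""
    return mean_abs_error(model[HALF:], observed[HALF:])


# Pick the model that best matches the past.
best = min(models, key=hindcast_error)

print("typical hindcast error:", statistics.median(map(hindcast_error, models)))
print("best hindcast error:   ", hindcast_error(best))
print("typical forecast error:", statistics.median(map(forecast_error, models)))
print("best model's forecast: ", forecast_error(best))
```

Because the series are independent by construction, matching the first half carries no information about the second, so the selected model's forecast error is just another draw from the same distribution as everyone else's. Whether that applies to the real experiment depends entirely on whether the models share real structure with the climate, which this toy cannot tell you.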
Trust me, the results are already known. It will show global warming for 2000-2050. Can you imagine the coup if the random model that happened to guess 1950-2000 also showed global cooling of 5 degrees in the next 5 decades? How much you wanna bet that that result would NEVER see the light of day...
Spend your CPU cycles on SETI...
That sounds a little better. I did go to their website, and saw that they were going to use one of their four models, but I didn't dig farther to see that the journalists (as per usual) didn't understand what they were copying into their notebooks.
But what the researchers should be doing first is back-testing by using the first 25 years as calibration and the second 25 as a check on the extrapolation, then doing it the other way around. Or maybe the distributed software does that, and all the permutations in between.
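A rough Python sketch of that swap back-test. The straight-line trend fit, the function names, and the made-up temperature series are all assumptions standing in for a real climate model and real observations.

```python
import math


def fit_line(points):
    """Ordinary least-squares line through (x, y) points: (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(line, points):
    """Root-mean-square error of a fitted line on a set of points."""
    slope, intercept = line
    total = sum((slope * x + intercept - y) ** 2 for x, y in points)
    return math.sqrt(total / len(points))


def swap_backtest(points, split):
    """Calibrate on one period, check on the other, then swap the roles."""
    early, late = points[:split], points[split:]
    return {
        "calibrate_early_check_late": rmse(fit_line(early), late),
        "calibrate_late_check_early": rmse(fit_line(late), early),
    }


# Made-up yearly series for 1950-1999: a slow trend plus a deterministic wiggle.
series = [
    (year, 0.01 * (year - 1950) + 0.1 * math.sin(year))
    for year in range(1950, 2000)
]

results = swap_backtest(series, split=25)
for name, error in results.items():
    print(f"{name}: {error:.4f}")
```

If the two errors are wildly different, the calibration is sensitive to which period it saw, which is a warning sign before trusting any extrapolation past the end of the record.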
At any rate, where it should fall on its ass is in the prediction of weather that actually makes a difference: hurricanes and tornadoes, which have crucial features that won't be well modeled, if at all, by the large differential boxes they selected. It will also run afoul of interference from random volcanic eruptions on a Pinatubo-Mount St. Helens ashfall scale, which happen on a decade or so time scale, the timing and location of which would be critical to the rest of the test run.
So I'm going to stick with my attitude that this is a tragic waste of CPU cycles that might actually go towards developing a drug that might actually save a life.
--Blair
P.S. SETI is likewise a waste; if we do hear a beep in the darkness, our only logical reaction will be to band together 6 billion of us as one to build the biggest, nastiest zero-time-of-flight weapon we can create, then hunker down in the sweaty dark to wait to fire it. Anyone coming that far is going to be wanting to make a buck off of it, taking chunks of the planet or slaves, and they're going to be ready for casual resistance.