Distributed Computing World Climate Simulation
Burnt Offerings writes: "The BBC reports that scientists at climateprediction.com are nearing the completion and public release in late summer of a distributed computing project that simulates the world's climate from 1950-2050 AD. It seems that each user's simulation will have different initial conditions built into their runtime simulation and a single completed simulation from 1950-2050 AD takes on average eight months (Doh!), assuming average household computing power. The results will be sent back to the project's team, where they will select the models that resulted in the 'real' climate patterns that have occurred from 1950 to 2000. I presume they will then use these validated models to help extrapolate the world's climate from 2000-2050. Pretty cool (or should I say warm? or hot?)."
The end result of the project:
"On 1st January, 2050, it will start rather cloudy with outbreaks of rain, mainly in the north. These will clear up by late afternoon, leaving it warm with mild breezes in most of the country."
graspee
On this day in 1950, it was raining. The rain was as pure as Evian.
On this day in 1980, it was raining. The rain was as pure as the innards of a Duracell battery.
I bet I can take just 48 years without even using a PC and have an even more correct answer than everyone using the latest PC.
If it wasn't, we'd have accurate forecasts up to a few months in advance. As it is, I find forecasts are routinely wrong about even tomorrow's weather. What happened to the whole "butterfly flapping its wings in Singapore affects the weather in Kansas" thing? I don't see how initial conditions would tell them much; I bet even random quantum events have a very strong influence on weather models over 50 years. I'd put the odds of success for this distributed computing project around the same as SETI.
Websurfing done right! StumbleUpon
Blame the climate changes from 1950 to 2000 on the expanded use of the automobile and unregulated industrial waste. Do you think any scientist in 1950 could have known about our current situation? How can we in 2000 know about the new problems that'll creep up between now and 2050?
Spend your extra CPU cycles computing the cure for cancer or finding ET. I doubt this will prove useful.
I'm not going to reprint the page unless it gets slashdotted, but none of the models (HadAM3, HadSM3, HadCM3, HadCM3L) in the simulation take into account the biological factor.
It has been said that termites, cars, factories, cows, and Taco Bell all produce huge amounts of greenhouse gas, which do contribute to global warming. How can this lead to an accurate prediction model if these factors aren't accounted for?
So, let me get this straight: they're going to pick generated results that most closely resemble real, measured results, and then adjust their model to compensate.
Those models wouldn't be "validated" as the poster claims, or would they? It seems to me that without identifying the reasons the computed models differed from the measured results, the selection is damn near arbitrary -- the difference may be something the scientists never considered.
I've been wrong before.... once.
It seems like there is a bit of professional dueling going on between this project and Seti@home, judging by their FAQ and the quote by Dr Myles Allen about their project: "It's not a stripped down 'toy' version, so the runs take time."
My favorite quote from their FAQ was in response to the possible effect the computers running the client might have on the environment:
"To travel the paths of human imagination you have to be willing to unlearn all you know"
If this thing takes eight months to complete, I sure hope they plan on storing periodic checkpoints of progress for each test in a central location. What happens if my machine gets hosed at four months? Is all that data lost?
The information on their website says the time step is 30 minutes and that their box is 3.75 degrees longitude by 2.25 degrees latitude (or vice versa: BIG, in any event).
Therefore, how do they expect this to work -- additionally absent any outside changes in the environment?
What I mean is, how do they know if they did a good job? Perhaps if the results are all very close to the current day climate, I'd buy that they got it right, but if they have a reasonable distribution of results, how do you decide? I mean, we've been clear-cutting the hell out of forests left and right for years: do they somehow take this into account? Heck, how do they represent the geographic information about the Earth: this bit has forest, this bit is desert. I would think that this would make quite a bit of difference in results (changes in albedo, for instance).
I certainly wish them luck, but they're not getting my PC for that long without something more detailed, information-wise.
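For a sense of scale, the figures quoted above (30-minute time step, 3.75 by 2.25 degree boxes) can be turned into rough counts. This is only back-of-the-envelope arithmetic: the 100-year run length comes from the story, while the 365-day year and the roughly 111 km per degree of longitude at the equator are assumptions added here, not numbers from the project.

```python
# Figures quoted in the comment above
LON_STEP_DEG = 3.75
LAT_STEP_DEG = 2.25
TIME_STEP_MIN = 30

# Assumptions for illustration only (not taken from the project site)
YEARS = 100               # 1950-2050, per the story
DAYS_PER_YEAR = 365       # ignores leap days
KM_PER_DEG_EQUATOR = 111  # approximate length of one degree of longitude at the equator

# Number of boxes needed to tile the surface once
boxes_east_west = int(360 / LON_STEP_DEG)
boxes_north_south = int(180 / LAT_STEP_DEG)
surface_boxes = boxes_east_west * boxes_north_south

# Number of time steps in one full run
steps_per_day = (24 * 60) // TIME_STEP_MIN
total_steps = YEARS * DAYS_PER_YEAR * steps_per_day

# East-west width of one box at the equator, in km
box_width_km = LON_STEP_DEG * KM_PER_DEG_EQUATOR

print(f"{boxes_east_west} x {boxes_north_south} = {surface_boxes} surface boxes")
print(f"{total_steps} time steps per run")
print(f"about {box_width_km:.0f} km per box at the equator")
```

So a single box is indeed big, several hundred kilometres across, which is why things like individual forests have to be represented as averaged properties of a box rather than resolved directly.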
Weather is chaotic, but climate is ... well, ok, climate might be chaotic, but we really don't know -- and if it is chaotic, it is still only chaotic on timescales of more than 50 years.
Predicting climate 50 years in the future is a computationally difficult task, but it isn't impossible the way that predicting weather would be.
Tarsnap: Online backups for the truly paranoid
Global warming accelerated by CPU heat as weather enthusiasts simulate climate with computer. Temperature for the next 2 years will rise by 2 degrees
This is cool. Beyond being used to understand the current climate change that is happening, obscure weather phenomena could be modeled on a larger scale for a longer time.
Perfect example would be an article out of the latest AMS Bulletin of the American Meteorological Society Earth Interactions that discusses plane contrails. It seems that the lack of air traffic after 9/11 allowed meteorologists to work on a long-held theory that plane contrails affect weather. Only problem was that the dataset covered only three days, which is just a small time sample.
Using a system such as this, those weather conditions could be recreated over a longer period of time and the results could be realized. Too cool.
Bryan R.
The price of freedom is eternal vigilance, or $12.50 as seen on eBay.....
On their FAQ (dated 5 Oct 2000!), they state they will support Linux initially and are looking for sponsorship to port the client to Windows. Considering the "What's New" page was last updated on 17 Aug 2001, the actual status of ports for different clients is unclear.
"To travel the paths of human imagination you have to be willing to unlearn all you know"
Pretty cool (or should I say warm? or hot?)
You should wait until the results come in.
Got friends?
In simulation A we set the Funding Amount variable to $0 and the Donating Corporation to NULL. The result was intense global warming in 2050.
In simulation B we set the Funding Amount variable to $200,000 and the Donating Corporation to Exxon Mobil. The result was no global warming at all in 2050.
In simulation C we set the Funding Amount variable to $300,000 and the Donating Corporation to Amazon Lumber Harvesters. The result was an actual decrease in greenhouse gases by the year 2050 due to deforestation.
In simulation D...
Outdoor digital photography, mostly in New Engl
Do you have a big RS/6000 or two sitting around, or a sizeable Linux cluster(s) connected via fiber to the National Climatic Data Center in Silver Spring, MD, that you can crunch a few dozen gigabytes of data a couple of times a day to help out with?
Speaking as someone who builds clusters to run mesoscale atmospheric models, the amount of data that's required to be passed back and forth between the compute nodes of a cluster requires gigabit bandwidth to keep decent processors happy. I don't see how a WAN-based distributed computing project without massive bandwidth and nearly isochronous data transmissions is going to be of any use in producing a working forecast. Most atmospheric models I've seen require frequent communication between the nodes to keep the processors busy. In an average run for an area the size of a couple of average states for a 36 hour forecast, the traffic on the network in a five node cluster approaches a terabyte.
What is climate but (basically dumbing it down) taking the average of the last x number of years of weather to define the norm. So, to define what the climate is fifty years into the future, one would have to take a look at the weather for each of those years. I agree that is no small task.
I must take issue with the parent post, though. I agree that weather is a chaotic system, very much so. But all aspects of weather can be parameterized, even the most chaotic ones. The key here is a matter of scale. The mesoscale type systems are extremely hard to model, but take a global system (long wave patterns), and you will have a much better time of modeling it. How? You throw out the small scale stuff like your butterfly and such. On a global scale, something like that would quickly disappear into the larger scale. That is why global models (like the MRF, NOGAPS, and such) work better out farther (those models run out to 384 hours as opposed to smaller scale models that run out 84). Verification rates are acceptable for those models out that far (numbers I cannot quote off the top of my head). They could do better, but they would require more time to process and would not be useful to the operational meteorologist.
This distributed system will run for over eight months and on such a large scale that the results will be useful.
Bryan R.
The price of freedom is eternal vigilance, or $12.50 as seen on eBay.....
I run NO Distributed Computing (DC) project unless it follows these rules:
1. Must be non-profit. If it is for profit, I must get a cut.
A. Example: Seti@Home is run by the University of California, Berkeley.
B. United Devices is for profit (think about it, drug companies will make money). However, Easynews.com gives me 2 free Gigs of access a month for running it. Hey, all I want is a piece, and I am getting it.
2. A DC project must be bug free. This may seem like a bloody obvious sort of thing. But considering the state of software releases nowadays one might think I am asking for a miracle! Seriously, I understand the point of Version 2 releases and stuff like that. As long as it is handled competently and professionally I probably will forgive them. But I will have zero patience for a DC project that crashes my machine or keeps me from running ANY app. And that leads me to rule 3...
3. A DC project must take a back seat to... everything. It must also be maintenance free.
Does this require any explanation?
4. Finally, it must be controversy free.
I have yet to come across a /. article accusing a DC app of loading in spyware, or a trojan of any sort. But I have faith that it will come.
It's generally regarded as a Bayesian technique. Actually, there's far more to Bayesian statistics than bootstrapping, but it's the part I spend a lot of time working with. In fact, I suppose that bootstrapping isn't fundamentally a Bayesian process, but it is highly empirical so it appeals to the same "crowd" as more decidedly Bayesian techniques. By the by, "Bayesian" statistics are statistics that make heavy use of Bayes' Rule to incorporate prior knowledge not included in your measured data.
My background - you develop a program to predict something biological. Let us say, to pick a problem on the same order of difficulty as predicting the weather, that you're trying to predict the three dimensional conformation that proteins assume, based on their sequence.
Now, okay, you have a bunch of known sequences, which other people (personally, I do both the data mining and some crystallography) have attached to known structures. So, what do you do?
Well, you could fiddle with your program until it predicts really well on those sequences, and announce that it was good. This is "Bad Science", as the parent-poster points out, since the criteria are arbitrary - you have a tendency to "discover" random noise in the data, and you have no way of validating your results.
So, second option. Instead, you split the data in half at random (actually into more than 2 pieces, but conceptually in half.) You take one half, and you make the model predict as well as you can on that data. Then, you VALIDATE ON THE OTHER HALF OF THE DATA. You *never* change the model on the basis of the second half of the data - that is arbitrary/bad/cheating. This is called "bootstrapping". It has nothing to do with compiler installation.
So, as far as most scientists (as opposed to mathematicians) are concerned, the important question is - does this work? In the biological sciences, I can say categorically, yes, this bootstrapping technique has a proven track record. It does work. Obviously, you can screw up (using non-representative data is a good start) but the technique, when properly applied, is sound.
So, I assume it would work for predicting the weather, as well. By work I mean - you would know how well your software predicted the weather. Bootstrapping is not a means of predicting the weather in and of itself, merely of honestly evaluating the effectiveness of a weather prediction mechanism you already have.
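The split-then-validate procedure described above can be sketched in a few lines. Note that what the post calls "bootstrapping" is more commonly called holdout validation (or cross-validation when repeated over several splits); the code below implements the procedure as described, not resampling with replacement. The data, the toy "model" and every function name here are hypothetical, chosen only to make the sketch runnable.

```python
import random


def holdout_split(samples, train_fraction=0.5, seed=0):
    """Shuffle a copy of samples and split it into (train, validation)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_mean(train):
    """Toy 'model': always predict the mean target seen in training."""
    return sum(target for _, target in train) / len(train)


def mean_abs_error(prediction, data):
    """Average absolute difference between the prediction and the targets."""
    return sum(abs(target - prediction) for _, target in data) / len(data)


# Hypothetical (input, target) pairs standing in for real measurements
data = [(i, 2.0 * i + 1.0) for i in range(20)]

train, validation = holdout_split(data)

# The model is fitted on the training half ONLY
model = fit_mean(train)

# The validation half is used once, to report an honest error estimate;
# it must never feed back into the model
train_error = mean_abs_error(model, train)
validation_error = mean_abs_error(model, validation)

print(f"training error:   {train_error:.2f}")
print(f"validation error: {validation_error:.2f}")
```

The validation error is the number to report. If you go back and tweak the model after looking at it, the held-out half has effectively become training data and you need fresh data to validate against.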
The good and new comes from no quarter where it is looked for, and is always something different from what is expected.
Climate - The condition of a place in relation to various phenomena of the atmosphere, as temperature, moisture, etc., especially as they affect animal or vegetable life.
Sorry to fall back to dictionary definitions, but this sure sounds like weather to me. Maybe averaged on a longer time scale, but it's still quite obviously a chaotic system. We've found loose correlations with sunspots, deforestation, etc.. but even very large trends like the "little ice age" of 1500AD are unexplained and most likely chaotic. If we can't explain hundreds of years of pronounced trends, I don't see how we can do anything with the relatively uneventful last 50 years.
Websurfing: The Next Generation - StumbleUpon
Microsoft has developed a very similar distributed simulation software package. Last I heard, it would only take 3-5 months to receive the results, too. A savings of 3 or more months. Rumor is, they plan on using it so that a person can run Office XP. Finally enough CPU power to run it quickly... I'm sorta jealous.
That's what makes weather chaotic. If you start with the exact same conditions, you still don't get the same answer at the end. That's why the weather predictions for tomorrow are so often wrong. (I remember when they said 'there's an 80% chance of rain tomorrow', but now they just tell us it'll rain tomorrow).
Now this is a climate model and not a weather model, but I fail to see how the hell that's anything more than a labeling difference.
Coincidence? I think not!
...
Seriously, this sort of modeling will take less time as processors scale bigger and Internet connectivity proliferates. I would like to participate, but it would be nice if I didn't have to run an MS OS to do so. I can, do and probably will, but if they would just release the source
They're starting with different initial conditions and hoping that some subset results in 50 years of weather?
Shouldn't they use the last 50 years of weather as initial conditions and vary parameters of the model instead?
What they're doing is like flipping an imaginary coin 500 times hoping to match the first 250 flips of a real coin to predict the last 250 flips (albeit in a system with non-independent trials). But then they're taking those 500 flips and matching the first 250 to weather reports (might as well be coin flips) and then imagining the next 250 flips will approximate the future weather reports. What they need to do is fix the initial conditions and modify the model (coin flips vs. rolls of the die vs. LCRNG, etc.) to find a model that approximates the dynamics of the system.
Am I making sense here? How are these bozos not just going to apply their effective innumeracy to waste a few trillion CPU hours that could otherwise have been used to do protein folding or cancer-killing molecule matching?
--Blair
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i
Also these people are entirely too green and liberal for my tastes. At first it is a very thought-provoking idea. But these people already have preconceived conclusions... and that isn't very good science.
On the contrary, scientists first formulate a hypothesis (in other words, a preconceived notion; human activity has led to global warming, for instance) and then perform an experiment to test it. And like it or not, global warming is occurring. The average temperature of the planet is rising, which is all that is meant by global warming. Whether or not this is the result of human action is still being contested. <OPINION>But personally, I would be very shocked if human activity has had NO effect whatsoever on the climate of the planet.</OPINION>
My CPU hours will remain dedicated to searching for a cure for diseases. If you would like to help check out the Folding@Home project that uses distributed computing to model protein folding to find possible cures.
http://www.kubuntu.org/
Why don't we quit wasting time trying to predict major climate change and start taking action to clean up our act?
Have you ever thought of how much garbage the world population puts out, trees we cut down, pollutants we flush, and general mayhem we induce?
Maybe we should be using our excess computing time into working on projects that actually might affect our environment in a positive way, rather than saying we should see what it is going to be like down the road...we all know what is going on here, and I'm not talking about global warming.
It's not the effect of global warming that is our problem right now, but the effect of our blatant misuse of resources and obvious disregard for the earth. Do we not live on this planet with the environment we are destroying...I don't think you need to be a very good scientist to realize that when the environment is decimated, we will be hard pressed to survive...
I guess everyone has some idea that God is going to come and fix everything for us, so we don't have to worry about cleaning up...hey, why don't we all call our mommys and see if they will do our work for us...why don't we own up and say, "Holy shit, I don't want to take the chance that my children are not going to grow up because I ruined their world for them." What is our general purpose in life besides taking up space, making money, and destroying the environment?
The world is a big place, but eventually our actions are going to reach around to spank us, just like our moms did when we were bad...except it won't be a spanking we live through:/
I invite everyone to spend their 8 months attempting to exact reform in our environmental policies and personal resource use, rather than hoping your computer will somehow figure it out for you.
--"It's Bradford Company, slash your last name, dot your first name"
From the FAQ:
"Many people have complained about the screensaver aspect of the Casino-21 client, and rightfully so. Screensavers only run when a computer has been idle for a period of time, are resource-hungry and place a limit on the platforms that can be supported. A background client will run whenever there is spare processing power, can be made more efficient than a screensaver and will support many more platforms. Following all of your suggestions, the Casino-21 client will be designed to run in the background. An additional client will be provided to view the progress of your climate simulation, and will be able to be run in screensaver mode when applicable."
So...Running the screen saver is not necessary.
As below, so above and beyond, I imagine drawn beyond the lines of reason. Push the envelope. Watch it bend.
It seems that this could make for a real headache, splitting the workload up onto all of these different computers. It's not data like Seti@home where you can distribute out data pieces, is it? All of the information needs to be there to simulate the planet. It sounds like it would be more effective to just get the fastest supercomputer they can get their hands on and start work on a more thorough level, like Japan is doing. Otherwise...
;)
"How's the global climate simulation going?"
"We're still waiting on the data from Australia. We sent it out to 5 people but we haven't gotten anything back yet."
In the meantime, the Earth's atmosphere bursts into flames and makes the whole point moot.
Remember "Bring 'em on"? *sigh
As one poster has pointed out, weather is a chaotic system (and climate is also chaotic by definition).
Chaos is gravely misunderstood, though, so let me quickly throw in my explanation for why this experiment will just generate FUD.
Chaotic equations are chaotic not because of the number of variables involved but because of their dependency on themselves (each iteration requires the former iteration). This leads to extreme sensitive dependence on initial conditions (a.k.a. the Butterfly Effect). I should probably have emphasized the word extreme, because even the slightest deviation will produce dramatically different results.
Even the best climate prediction algorithm would be crap if the initial condition was off by 10^(-20). The fact that we cannot measure temperatures exactly means that we could never feed a perfect initial condition.
Chaotic equations do have a given period before divergence gets extreme when initial conditions are altered. The original equations that Lorenz used (the pioneer of weather forecasting and the father of Chaos theory) showed divergence after about three days (which is why five-day forecasts still suck to this day).
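The "period before divergence" is easy to see numerically. The sketch below uses the logistic map with r = 4, a standard one-line chaotic iteration; it is a stand-in chosen for brevity, not Lorenz's equations and not anything from the climate model. Two runs start 10^(-12) apart, and the code reports the first step at which they differ by more than 0.1. The starting value, the offset and the threshold are arbitrary choices.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x); each value depends only on the previous one."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


STEPS = 100
run_a = logistic_orbit(0.2, STEPS)
run_b = logistic_orbit(0.2 + 1e-12, STEPS)  # tiny error in the initial condition

# Distance between the two runs at every step
gaps = [abs(a - b) for a, b in zip(run_a, run_b)]

# First step at which the runs have visibly separated (None if they never do)
first_big_gap = next((step for step, gap in enumerate(gaps) if gap > 0.1), None)

print(f"initial gap: {gaps[0]:.3e}")
print(f"largest gap: {max(gaps):.3f}")
print(f"first step with gap > 0.1: {first_big_gap}")
```

For this map the gap roughly doubles per step on average, so a smaller initial error buys only a few extra steps of agreement, not an indefinitely long forecast.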
I find it very hard to believe that these folks have developed an equation that doesn't show divergence for 100 years. Not to mention the fact that the number of initial conditions is much larger than the project makes it out to be.
Summary: Some PhD is looking for research money and figures that mixing "scientific" proof for global warming, chaos, and SETI-style distributed computing has to be good for a couple million at least.
int func(int a);
func((b += 3, b));
... or at least the best science has come up with so far, are downloadable from the Intergovernmental Panel on Climate Change (IPCC).
I'd start with the Summaries for Policy Makers, as a way of becoming very well informed in just ~20pp.
AFAIK: It's a UN organization that is the center of research. Their reports are a consensus of almost all the leading scientists from every country on the globe, and their policy statements are approved line-by-line by governments. Even with all that, there are pretty strong statements. Here's better background.
Yet another attempt to model a multi-billion year old climate based on a short data stream.
Let's estimate the average income of everyone in the US over time by looking at people in Rhode Island for the last three days. Same sampling scale, or close.
Useless experiment to hype up the global warming debate again. Gee, I wonder if they'll pick any of the initial conditions that say "things aren't so bad after all". Nope, the only starting conditions that will ever see the light of day are the ones that back up their theory.
Not that the science on the other side is any better. I'm getting tired of the entire debate because, guess what kids, this is supposed to be SCIENCE. Not prognostication. There is a difference. Come up with a theory, build a series of experiments to prove it, and see if it sticks to the fridge or not. All I'm seeing here is "come up with a theory, pick the data points that will support it, and then publish it in the NY Times".
Thanks for the explanation!
I'm currently working on an application that monitors seemingly random data -- the stock market. I never stopped to consider that there may be statistical techniques above and beyond the standard technical indicators.
Food for thought!
"...and by 'country' we mean Antarctica."
Snarkiness is inversely proportional to wisdom because it emphasizes feeling right rather than being right.
This silly experiment is a waste of time. Everyone with a time machine already knows that my massive Weather Altering Device (WAD) will come online in 2008 with the sole purpose of ruining the results of this trial...
I'm glad they're running a lot of different models.
It will be interesting to see how divergent the predictions for the next 50 years are from the best fits to the past 50 years.
It will also be interesting to see how badly the best fits for the next 50 years fit the past 50 years. (There's gotta be a better way to phrase that)
There's also the long term effects that we have no good means to capture, like what turns off and on the various ocean currents.
...especially considering the nature of distributed computing, where participants might sign up on a whim and then drop out a little bit later because they have to reinstall everything or upgrade their system or change jobs, or simply get tired of the project, or it conflicts with some other program or degrades their system's performance.
I don't know how much intermediate data needs to be stored, but there definitely should still be a mechanism for periodically sending up progress-dumps so that somebody else can take over from wherever you were. This could at least shorten the time needed to get all the data run, since you would notice participant drop-outs earlier and could hand over the rest of the calculations to another participant.
It could also be used to sort out really bad seeds at an earlier stage, where the system, for example already after 10 or 20 years, discovers that you are way off and could hand you another seed instead.
Well, the problem is that they are actually using non-representative data. 1950-2000 is far too small a sample to even begin forming a model for climate variation, something which varies over periods of centuries or millennia.
They will probably get some form of result. It won't be valid, but it will nonetheless be a result which matches the earlier period.
Of course, this will start breaking down as soon as natural climate variation changes cycle. Likely it would be invalidated even faster if they try to apply the model to known data from the last 20k years (although if they could get the model to account for the earlier climate variations that far back, I'd tend to accept it as more valid).
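The progress-dump idea above might look something like the following. This is a local-file sketch only: the file format, the function names and the toy "model step" are all made up for illustration, and a real client would also upload the dump to the project's servers so another participant could pick it up. Nothing here reflects how the actual client works.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then atomically rename it over path,
    so a crash mid-write never leaves a half-written checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    if not os.path.exists(path):
        return default
    with open(path) as handle:
        return json.load(handle)


def run_simulation(path, total_years, checkpoint_every=10, stop_after=None):
    """Advance a toy simulation one model year at a time, resuming from path."""
    state = load_checkpoint(path, {"year": 0, "value": 0.0})
    while state["year"] < total_years:
        if stop_after is not None and state["year"] >= stop_after:
            break  # stands in for the participant dropping out
        state["value"] += 0.5  # placeholder for the real model step
        state["year"] += 1
        if state["year"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state


# Usage: one participant quits at model year 45, another resumes from the dump
workdir = tempfile.mkdtemp()
checkpoint_path = os.path.join(workdir, "run.json")
first = run_simulation(checkpoint_path, total_years=100, stop_after=45)
resumed = run_simulation(checkpoint_path, total_years=100)
print(first["year"], resumed["year"])
```

Only the work since the last dump is lost when someone drops out, which is the trade-off behind picking the checkpoint interval.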
And you can get fairly accurate data even further back by studying earlier vegetation, etc.
Maybe they're not interested in data that far back simply because it would be harder to really match it to a model that includes the effects of things like CO2 emissions.
Using the last 50 years makes it easy to get a model that points to human interference. Using or verifying against several centuries' or millennia's worth of data could indeed make it more accurate, but it would rather point to natural variation from causes like solar radiation output, vegetation changes, etc.
Climate is not necessarily chaotic if it is considered to be a moving average of weather. It is entirely possible and indeed quite likely that the non-linear fluctuations which make weather so difficult to predict are in fact damped out over longer time periods. Or to put it in chaos terms, that the fractal dimension of the attractor for weather varies inversely with the sampling frequency.
-- the most controversial site on the Web
I wish I could remember the exact details, but this was the basic idea:
Some branch of the US military was trying to train a neural network to look at a photograph and recognize whether or not there was a tank there.
The people designing the system had pictures of scenes without tanks, and pictures of scenes with tanks. Half of the pictures were sealed away in a safe for later testing. Then, a neural net was trained on the first half of the pictures until it could, with 100% accuracy, correctly identify if there was a tank, or not, in the picture. Finally, the second half of the pictures were presented to the algorithm, and it also correctly identified those pictures as tank/not-tank.
However, when it was tried on another series of pictures, the neural net could only accurately identify about 50% - no better than chance. The engineers who trained the net were dumbfounded, so they went back and started studying exactly what the neural net was trying to use to recognize a tank.
Finally, they found the answer - all the pictures with tanks were taken on an overcast day, and all the pictures without tanks were taken on a sunny day. The million dollar neural net had been trained to differentiate between blue and grey skies! Back to the drawing board...
"I have never let my schooling interfere with my education." - Mark Twain
Climate prediction is not about dynamics; it is about statistics. In other words, it is about identifying the shape of the butterfly, not where the dot happens to be on the butterfly.
So you have just made a much more sophisticated version of the same error that everyone who wants to believe that climate principles are somehow unknowable (ooh, "chaos", so let me keep my SUV) are making.
Tuning the model to generate appropriate statistics is very different than tuning the model to generate very specific dynamics. In the latter case you are limited by chaotic nonlinear dynamics to a few weeks. In the former, you are trying to identify processes that are interacting in complex ways but are fundamentally dissipative and hence predictable in principle.
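The statistics-versus-dynamics distinction can be illustrated with a toy chaotic system. The sketch below uses the logistic map with r = 4 as a stand-in (an assumption made for brevity; it is not a climate model): two runs from different starting points follow completely different trajectories, yet their long-run averages agree. The starting values, run length and tolerances are arbitrary.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the chaotic logistic map x -> r * x * (1 - x)."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


STEPS = 20000
run_a = logistic_orbit(0.2, STEPS)
run_b = logistic_orbit(0.7, STEPS)  # a completely different initial condition

# "Weather": the step-by-step values of the two runs do not track each other
largest_pointwise_gap = max(abs(a - b) for a, b in zip(run_a, run_b))

# "Climate": the long-run averages of the two runs are nearly identical
mean_a = sum(run_a) / len(run_a)
mean_b = sum(run_b) / len(run_b)

print(f"largest pointwise gap: {largest_pointwise_gap:.3f}")
print(f"mean of run A: {mean_a:.4f}")
print(f"mean of run B: {mean_b:.4f}")
```

Whether the real climate system behaves this nicely is exactly what is being argued in this thread; the sketch only shows that chaos in the trajectory does not by itself rule out predictable statistics.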
mt
Given the arithmetic errata most desktop processors have and the cross-platform nature of distributed computing, I'm wondering how anyone can possibly hope to gain accurate results - especially if there's any floating point math involved.
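Even without processor errata, floating point results depend on the order of operations, which is one way different compilers or platforms can end up disagreeing in the last bits. A minimal illustration with ordinary double-precision numbers (the values are arbitrary and have nothing to do with the climate model):

```python
import math

values = [0.1, 0.2, 0.3]

# Same three numbers, summed in two different groupings
left_to_right = (values[0] + values[1]) + values[2]
right_to_left = values[0] + (values[1] + values[2])

# The two results are not bit-for-bit identical...
same_bits = left_to_right == right_to_left

# ...but they agree to within a tiny relative tolerance
close_enough = math.isclose(left_to_right, right_to_left, rel_tol=1e-12)

print(repr(left_to_right), repr(right_to_left))
print("identical:", same_bits, "| close:", close_enough)
```

In a chaotic simulation a last-bit difference like this grows over time, so runs on different hardware would not be expected to match step for step even if each one is individually fine.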
And with this specific project - isn't the earth's climate largely dependent on the amount of solar output, and isn't that amount relatively variable? How are they gonna know the slight variations in solar output over the next 50 years?
These are my friends, See how they glisten. See this one shine, how he smiles in the light.
Chaos is a relatively complex behavior which is strictly governed by a mathematical algorithm but is nonetheless unpredictable due to sensitivity to initial conditions.
I'll give you that, but in weather it still doesn't matter. Given the uncertainty principle it's impossible to know the initial condition for a system such as the weather. So even if they have the right mathematical model (which I doubt) this is still all futile.
Have you ever heard this phrase?
The world as you know it changes every three months.
It's a reflection of the fact that each human's understanding of the universe depends on back-testing his current understanding with his understanding of the history, and that it will be invalidated by events that could not have been predicted that add up to a gross revision of the model fairly regularly.
And human brains are uniquely designed to recognize and compare these patterns in gross.
Human societies are as malleable as they are varied. (Because that's how they got to be so varied, see?)
You might think you're creating a predictive model, but it only works to predict within those facets of society for which reality has not yet invalidated the model.
A similar problem exists in using back-testing to tune models to predict the stock markets. It's succinctly summed up by the old brokerage saw:
Past results are no guarantee of future performance.
Which is to say, all "technical trading" is as good as voodoo.
The climate may be more tractable, as it hasn't as yet involved control by something as truly random as a human. But the Global Warming argument indicates that the more paranoid among us at least are finding evidence that weakly correlates human activity with climatic change.
But I still don't think the people doing this particular modelling are using a fine-grained enough model, and are likely rushing to steal cycles from projects that are producing viable results.
--Blair