Slashdot Mirror


Neglect Causes Massive Loss of 'Irreplaceable' Research Data

Nerval's Lobster writes "Research scientists could learn an important thing or two from computer scientists, according to a new study (abstract) showing that data underpinning even groundbreaking research tends to disappear over time. Researchers also disappear, though more slowly and only in terms of the email addresses and the other public contact methods that other scientists would normally use to contact them. Almost all the data supporting studies published during the past two years is still available, as are at least some of the researchers, according to a study published Dec. 19 in the journal Current Biology. The odds that supporting data is still available for studies published between 2 years and 22 years ago drop 17 percent every year after the first two. The odds of finding a working email address for the first, last or corresponding author of a paper also dropped 7 percent per year, according to the study, which examined the state of data from 516 studies between 2 years and 22 years old. Having data available from an original study is critical for other scientists wanting to confirm, replicate or build on previous research – goals that are core parts of the evolutionary, usually self-correcting dynamic of the scientific method on which nearly all modern research is based. No matter how invested in their own work, scientists appear to be 'poor stewards' of their own work, the study concluded."
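
To get a feel for what a 17 percent yearly drop in the odds adds up to over two decades, here is a minimal Python sketch. It assumes the decline compounds multiplicatively and only begins after the two-year mark, which is one reading of the summary and not a model taken from the study. The function names and the starting odds are hypothetical, chosen purely for illustration.

```python
# Sketch only: treats the summary's "drop 17 percent every year" as a
# compounding decline in the odds, starting after a two-year grace period.

ANNUAL_DROP = 0.17   # per-year decline in the odds, from the summary
GRACE_YEARS = 2.0    # years before the decline starts, from the summary


def remaining_odds_fraction(age_years: float) -> float:
    """Fraction of the starting odds left for a study of the given age."""
    declining_years = max(0.0, age_years - GRACE_YEARS)
    return (1.0 - ANNUAL_DROP) ** declining_years


def odds_to_probability(odds: float) -> float:
    """Convert odds, defined as p / (1 - p), back to a probability p."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical starting point, NOT a figure from the study: 19-to-1 odds.
    starting_odds = 19.0
    for age in (2, 5, 10, 22):
        odds = starting_odds * remaining_odds_fraction(age)
        prob = odds_to_probability(odds)
        print(f"{age:>2} years: odds {odds:.2f}, probability {prob:.2f}")
```

Working in odds and converting to a probability only at the end matters here, because a fixed percentage drop in odds is not the same thing as a fixed percentage drop in probability.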

108 comments

  1. This was understood in Engineering projects too by Anonymous Coward · · Score: 1

    Just ask somebody to figure out how to build a Battleship, or even the guns off one, heck, you'd have trouble finding people who know the process of firing them.

    Or if you prefer, Greek Fire.

    1. Re:This was understood in Engineering projects too by xmundt · · Score: 3, Insightful

      Or as a slight step up....there is NO chance that America could build a Saturn V rocket these days. It was a great workhorse, but so complicated that the loss of a few percent of the drawings, and the number of engineers that worked on it that have retired or died means that reproducing it is impossible now.
      In any case, as for the loss of data...that IS a problem. Back in the Olden Days, before someone decided that the computer, with its amazingly fluid and ever-changing methods of storage, was the answer to saving data, much of it was printed on paper and tucked away in libraries. Is that still a workable solution? I do not know, but I do know that when one is trying to store information for a long time, it HAS to be in the simplest and most durable medium and format available.

      --
      YAB - http://blog.beemandave.com/
    2. Re:This was understood in Engineering projects too by TubeSteak · · Score: 2

      Or as a slight step up....there is NO chance that America could build a Saturn V rocket these days.

      We have at least a couple complete Saturn V rockets lying around if we wanted to reverse engineer 'em.
      I've personally seen the ones in Alabama and Washington D.C.
      http://en.wikipedia.org/wiki/Saturn_V#Saturn_V_displays

      The hardest part of rebuilding old hardware is the metallurgy.
      As long as we can get that right (or use a better quality substitute)
      reverse engineering from existing parts isn't anything we couldn't farm out to China.

      --
      [Fuck Beta]
      o0t!
    3. Re:This was understood in Engineering projects too by countach · · Score: 1

      Are they really complete rockets? Is the documentation available to verify that they are 100% complete?

    4. Re:This was understood in Engineering projects too by volvox_voxel · · Score: 1

      To what extent can metallurgical analysis techniques, esp. non-destructive ones, be used? E.g. x-ray crystallography, microscopy to study crystal grains, electron microscopy, spectroscopy, etc.?

    5. Re:This was understood in Engineering projects too by beckett · · Score: 3, Funny

      not 100% complete; they got this small bag of leftover screws.

    6. Re:This was understood in Engineering projects too by HiThere · · Score: 1

      I really doubt that they are complete. The internal electronics are probably broken, plastics deteriorate with age, etc. But the real loss is the skill sets needed to build it. There were a LOT of failures before we got a working version.

      FWIW, I believe that NASA has officially said that they couldn't build another Saturn.

      --

      I think we've pushed this "anyone can grow up to be president" thing too far.
    7. Re:This was understood in Engineering projects too by Anonymous Coward · · Score: 0

      It's not necessary to rebuild the Saturn V exactly as designed. We have complete blueprints for the F1B engine and have recently tested the gas generator for powering the turbopumps. Every other part is reproducible but we would never make something that required that much hand shaping and welding today.

  2. No shit sherlock by Anonymous Coward · · Score: 0

    Is the vernacular.

  3. Scientific Data Disappears At Alarming Rate too! by Anonymous Coward · · Score: 5, Informative
  4. Drinking from the firehose. by Anonymous Coward · · Score: 4, Insightful

    My wife is a wildlife biologist. Her office collects raw field data all year, compiles data, runs stats, writes reports, reads reports, creates a pretty large volume of "product" every year.

    I ask her who exactly reads all the required papers and reports they produce. The federal Fish and Wildlife Service demands product. State demands product. Various agencies with funding ties that would confuse anyone all demand product. The real ass-kicker? Almost none of it is actually READ by those who asked for it. The papers that are read, are rarely read by more than one person.

    In the end, thousands and thousands of offices like hers, producing real scientific data, it is just too much.

    The number of people consuming the product is DWARFED by those producing it. The number of people tasked to archive, organize, store, catalog, and index this torrent of information are even FEWER than those who consume it.

    These are "real life" scientists out there every day. Now throw in academia, including "research academia".

    The bottom line? A true first-world problem. We produce WAY more research than we are prepared to do ANYTHING with.

    1. Re:Drinking from the firehose. by blue+trane · · Score: 1

      Put it on the web. Who knows who may find it useful? The value of the research might not reveal itself for some time, but if google or someone has archived it, it might sit there waiting to unveil secrets, like the Pillars of Ashoka.

    2. Re:Drinking from the firehose. by Obfuscant · · Score: 1

      Put it on the web.

      Who pays for that? Disks and servers and networks cost money. Academics rarely have that just sitting unused.

    3. Re:Drinking from the firehose. by Oligonicella · · Score: 1

      There's nothing in the world preventing you from donating time and hardware to help them do so.

      That's the point. There's really too much raw data that's not really needed, just produced.

    4. Re:Drinking from the firehose. by Anonymous Coward · · Score: 0

      Disks are not exactly expensive. If your project produces 4 KB of data per second, which seems like an awful lot for a project in wildlife biology, it would take you 30 years to fill up a 4 TB disk.

      Networks and servers are more expensive but if the demand for this data is as low as claimed, it's not like you're going to need too much bandwidth....
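
The disk-fill estimate in the comment above is easy to check. The sketch below is a back-of-the-envelope calculation, assuming a constant data rate and ignoring filesystem overhead, redundancy, and backups. The helper name `years_to_fill` is made up for illustration.

```python
# Back-of-the-envelope check of "4 KB/s takes about 30 years to fill 4 TB".
# Assumes a constant write rate and no overhead; names are illustrative.

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_fill(disk_bytes: float, bytes_per_second: float) -> float:
    """Years needed to fill a disk at a steady data rate."""
    return disk_bytes / (bytes_per_second * SECONDS_PER_YEAR)


# Decimal units, as disk vendors count them: 4 TB = 4e12 B, 4 KB = 4e3 B.
decimal_years = years_to_fill(4e12, 4e3)

# Binary units: 4 TiB = 4 * 2**40 B, 4 KiB = 4 * 2**10 B.
binary_years = years_to_fill(4 * 2**40, 4 * 2**10)

if __name__ == "__main__":
    print(f"decimal units: {decimal_years:.1f} years")
    print(f"binary units:  {binary_years:.1f} years")
```

Either way the answer lands a little above 30 years, so the commenter's round figure holds up.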

    5. Re:Drinking from the firehose. by krlynch · · Score: 1

      It ain't the hardware ... it's the people. Who maintains the hardware? Programs it all? Who maintains the networks? Who is charged with tracking what's working and what isn't? Who backs it up, and updates formats, and catalogs it, and indexes it, and tracks the changing methodologies used in the collection? Who translates the old code, and operating systems, and storage formats, and hardware, and whatever else?

      Hardware's cheap ... people (especially those of the knowledgeable and reliable variety) aren't. Digital data archiving and warehousing for the long haul are well understood to be huge problems for which we haven't found a very good set of solutions.

    6. Re:Drinking from the firehose. by blue+trane · · Score: 1

      Who knows what's really needed? The market is too short-sighted to be a reliable judge. Mendel's research was not needed, until after his death. The research that went into the internet was thought to be unneeded by AT&T. The library of Alexandria was thought to be unneeded and burned. Kafka wanted all his unpublished manuscripts burned after his death.

      If you or I can't afford to help researchers publish their data on the internet, the government can and should.

    7. Re:Drinking from the firehose. by blue+trane · · Score: 1

      Let the Fed expand its balance sheet to buy govt bonds that allow for academics to publish data, and keep the loans rolling over forever while returning the interest to the Treasury. Making the research and data available is in the General Welfare. Like libraries...

    8. Re:Drinking from the firehose. by Obfuscant · · Score: 1

      An interesting interpretation of the Constitution, where the general welfare clause is part of the preamble and not a prescriptive statement. And an interesting interpretation of how research grants are awarded, and even the general usefulness of vast quantities of research data.

      Academics already write "publish" into the grants they get, or they ought to do so. "Publish" is not the same as "put all the data up in an organized manner for everyone to come use", however. And even being able to put it all up for casual use can be a costly venture. I'd estimate, ballpark, that it would take 12 months of full-time staff to digitize the video tapes I have that are raw data just on the off chance that someone would want to use that data for something. Add several thousand dollars for a multi-TB disk array to hold it, and another six months to catalog it.

      And then should someone come along to use it, staff time to explain and help the person.

      Printing money is one solution. Not a very good one.

    9. Re:Drinking from the firehose. by Anonymous Coward · · Score: 0

      A lot of it is "just in case" information. If the ecosystem were to be compromised, or a species exhibited erratic behavior or numbers, etc., then having previous data is extremely helpful to try and diagnose the problem and help restore equilibrium.

    10. Re:Drinking from the firehose. by blue+trane · · Score: 1

      General Welfare is mentioned twice, in the Preamble and in Article 1, Section 8.

      It isn't printing money, since no physical greenbacks need be involved, just figures in a virtual ledger book. Banks of course use this trick to expand their balance sheets, by issuing loans or otherwise creating assets. UBS for example booked future expected profits right away on AAA mortgage-backed securities, and paid bonuses on those profits. So these types of accounting practices go on all the time in the private sector.

      Perhaps by the time someone comes across your data, they will be smart enough (or have an AI that's smart enough) to figure it out. Or they could become architectural relics, providing valuable information to future societies. I think you discount your own research unfairly.

    11. Re:Drinking from the firehose. by Anonymous Coward · · Score: 0

      A lot of the data is covered under all kinds of laws that prohibit it.

      Coordinates of specific populations or individuals of endangered species for example. Now you are talking about taking time to review data for public release, and even more time to redact portions.

      I suggested they post their stuff. Especially raw data, like population counts for various species. Aint. Happening.

    12. Re:Drinking from the firehose. by blue+trane · · Score: 1
    13. Re:Drinking from the firehose. by khallow · · Score: 1

      It isn't printing money, since no physical greenbacks need be involved, just figures in a virtual ledger book.

      There's so much fail in this sentence. "Printing money" is a saying not a literal description of the act. It means that you create currency without creating value. Inflation takes care of that hubris.

      And how can anyone think that "figures in a virtual ledger book" is an adequate solution for anything productive or vital?

      Perhaps by the time someone comes across your data, they will be smart enough (or have an AI that's smart enough) to figure it out. Or they could become architectural relics, providing valuable information to future societies. I think you discount your own research unfairly.

      Like a room with a thousand Madonna portraits. Someone will be interested.

    14. Re:Drinking from the firehose. by khallow · · Score: 1

      If you or I can't afford to help researchers publish their data on the internet, the government can and should.

      This is the outcome of government intervention in the scientific process - the generation of scientific activity which can't have long term value merely because it won't be saved. Maybe if we apply more of the poison, we'll save the victim.

    15. Re:Drinking from the firehose. by nbauman · · Score: 1

      I dunno. There may just be a half a dozen people who are interested in your wife's penguin or whatever, but to them it's really interesting. They might have to make a decision about penguin habitat or whatever.

      And then there's the scientific paper lottery. A few papers turn out to be really important, everybody cites them, and they change the world -- but you can't know in advance which one is going to be important. There were people doing studies of the hearing of fish, and suddenly, when porpoises start going deaf, everybody looks them up.

      I admit it can be frustrating. I admit people have been known to say, "What am I doing this for?"

    16. Re:Drinking from the firehose. by Anonymous Coward · · Score: 0

      It's like photography. Somebody takes a stock photograph of a crowd in a public space today, and it's not more than $10. Look for a stock photograph from the 1970's, and the value is $100. Go back a hundred years and it's even more expensive.

      But then you can do all sorts of interesting comparisons. With aerial photographs and even hand drawn maps, you can instantly see urban growth patterns.

    17. Re:Drinking from the firehose. by Mr.+Slippery · · Score: 1

      government intervention in the scientific process...

      Involvement is not the same as intervention.

      Scientific research is a public good.

      --
      Tom Swiss | the infamous tms | my blog
      You cannot wash away blood with blood
    18. Re:Drinking from the firehose. by khallow · · Score: 1

      Scientific research is a public good.

      Except when it's not.

      But by forcing so much research to be a public good, you also create the usual tragedy of the commons situations of overconsumption of the good, such as researchers who research all sorts of things to consume the available public funds, but have no incentive to actually save their work.

    19. Re:Drinking from the firehose. by blue+trane · · Score: 1

      "figures in a ledger book" is what the financial sector busies itself with. I agree, there's no value added. We'd be better off bypassing the financial sector and simply providing liquidity, from the government or a central bank, when it's needed. The financial sector is mired in all sorts of perverse incentives and moral hazards that cause lots of friction and push prices away from their efficient levels. That's why asset prices bubble and crash, because dealers push them away from their efficient levels.

      As for "printing" vs. "creating" money, the difference is: creating a virtual currency means we don't ever have "wheelbarrows full of paper". Simply add another zero to a bank card, if you must; but there is no physical currency necessary.

      Inflation is mostly psychological. There is no physical necessity for prices to rise if the money supply increases. It's a psychological choice, and can be dealt with by a) exposing the psychology of the choice to raise prices (even when your production costs have not increased) just because there is more money out there, b) indexing everything to inflation in an automatic, seamless manner so that we can ignore inflation and carry on as if it doesn't exist.

      The real focus should be on production capacity, innovation, the advance of knowledge. Economics with its scarcity fetish retards the advance of knowledge by artificially imposing constraints on the money supply. There is a huge demand for liquidity and the financial sector has arisen to meet that demand by creating money out of thin air financial "innovations". When those "innovations" fail, the financial sector is backstopped by central banks, while the consequences of the mistakes are visited upon billions who weren't involved in the mistakes. Instead of repeating that cycle, let the central bank and/or govt simply provide liquidity where it's needed, without the intermediary of the financial sector with its ledger books.

    20. Re:Drinking from the firehose. by blue+trane · · Score: 1

      Why not save their work whether they have the incentive, or not? That way it can be checked by anyone who wants to. It's in the public interest to check research, like Rogoff and Reinhart's:

      Thomas Herndon, Michael Ash, and Robert Pollin of the University of Massachusetts, Amherst, have found serious problems with Reinhart and Rogoff's austerity-justifying work.

    21. Re:Drinking from the firehose. by as.kdjrfh+sxcjvs · · Score: 1

      There are groups working on this -- the University of California is trying to do it in a consistent way, with its wealth of historical data -- but it's harder than you'd think. It's not very useful if you don't get the metadata reasonable, and that's skilled work and not something we reward. Institutional support (libraries, machine shops, etc) gets pinched because it's constant overhead and hard to point to single high-status payoffs. It takes one year to kill a library (Canada's superb fisheries and lake science just lost one).

      Even worse, a lot of scientific data is realia -- *stuff* -- and that's a worse metadata problem, and expensive and fragile.

    22. Re:Drinking from the firehose. by khallow · · Score: 1

      Why not save their work whether they have the incentive, or not?

      Because they don't get funded to do that.

    23. Re:Drinking from the firehose. by khallow · · Score: 1

      "figures in a ledger book" is what the financial sector busies itself with. I agree, there's no value added.

      Sometimes you're right. And sometimes those figures represent things of value. A loan to you would be "figures in a ledger book" to the bank. But it'd be a home, a business, or an education to you.

      We'd be better off bypassing the financial sector and simply providing liquidity, from the government or a central bank, when it's needed.

      That's what creates these huge bubbles in recent time. Easy money from the Fed gets dumped into dubious investments by the finance sector.

      Inflation is mostly psychological.

      Then you don't know what inflation is. For example, if the US government were to secretly "print money", that is, buy things with currency that they don't have the backing for, and not tell anyone, we would still see inflationary effects merely because there is more money out there. In fact, the outcome on the currency markets is often how covert inflationary government policies are discovered.

      Economics with its scarcity fetish retards the advance of knowledge by artificially imposing constraints on the money supply.

      Money that isn't scarce can't serve as a store of value. It has a half-life near zero. In comparison, the US dollar has a half-life of about twenty years (at least by official reckoning of inflation). If I lose a dollar in the attic and come across it twenty years later, it still retains about half its value. If I did the same for an "abundant" currency, it would have no value. Hyperinflation would have destroyed its value long ago.

      Instead of repeating that cycle, let the central bank and/or govt simply provide liquidity where it's needed, without the intermediary of the financial sector with its ledger books.

      Nonsense. I already noted the big asset bubbles of the past few decades as examples where central bank/government provided liquidity went wrong. But there's also the huge conflict of interest that governments and central banks have. Higher economic activity and growth of the price of assets results in more tax revenue and a less disgruntled voting population.

      As a result, there's huge incentives to sex up a growing economy (that is, obtain more and larger economic activity) and to spend vast amounts of public funds to attempt to restore economies that have stopped growing in that sense. The financial sector just doesn't have that incentive.

      Add in that the financial sector has the knowledge and experience, then there's no reason to throw this on a group that simply doesn't have the knowledge, skills, or experience to run such things.
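
The "half-life" figure for the dollar in the comment above corresponds to a particular average inflation rate. The following Python sketch works that out, assuming a constant annual rate compounded yearly. That is a simplification for illustration, not an official measure, and the function names are hypothetical.

```python
import math

# Sketch: relate a currency's purchasing-power "half-life" to a constant
# annual inflation rate. Real inflation varies year to year.


def implied_annual_inflation(half_life_years: float) -> float:
    """Constant yearly inflation rate that halves purchasing power in the given time."""
    return 2.0 ** (1.0 / half_life_years) - 1.0


def half_life_years(annual_inflation: float) -> float:
    """Years for purchasing power to halve at a constant yearly inflation rate."""
    return math.log(2.0) / math.log(1.0 + annual_inflation)


def remaining_value(annual_inflation: float, years: float) -> float:
    """Fraction of original purchasing power left after the given number of years."""
    return (1.0 + annual_inflation) ** (-years)


if __name__ == "__main__":
    rate = implied_annual_inflation(20.0)
    print(f"implied inflation: {rate:.2%} per year")
    print(f"value left after 20 years: {remaining_value(rate, 20.0):.2f}")
```

Under these assumptions a twenty-year half-life works out to roughly 3.5 percent inflation per year.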

    24. Re:Drinking from the firehose. by blue+trane · · Score: 1

      Create the money to fund them. It's in the public interest, the General Welfare. We, and our grandchildren, will be better off if we can check research by having access to the data it used.

    25. Re:Drinking from the firehose. by blue+trane · · Score: 1

      The central banks didn't provide the liquidity for the most recent bubble, or for the tech bubble. The Fed was increasing interest rates (which killed dot-com). The credit expansion took place in the private sector, not from the Fed. Your model is deeply flawed, based on an ideology that history doesn't support.

      Financial "innovations" preceding the most recent crash created what private banks thought of as "risk-free" assets. The banks booked future profits from these riskless, AAA-rated, mortgage-backed securities immediately, and paid huge bonuses on them. The central bank was not at all involved in creating these assets. This was a balance sheet expansion of the private banks.

      The Fed served as a backstop for the banks after the crisis, providing needed elasticity at a time when the banks were shrinking the money supply they had expanded, causing problems for billions that were never involved in the banks' innovations.

      If inflation is tied to the money supply, why didn't we see hyperinflation when the Fed expanded its balance sheet by a factor of at least 2 in a week? Why didn't we see hyperinflation when the private sector was expanding its balance sheet by much larger than a factor of two in the run-up to the crash?

      The answer is that inflation is psychological. Why else is hyperinflation generally ended in a very short period, a day or a week? Money supply has nothing to do with it; it's psychology.

    26. Re:Drinking from the firehose. by khallow · · Score: 1

      You still have the problem that nobody reads most of the research and that even if you did pay people to read research, you'd still be nowhere near use of that research. All this blather about "General Welfare" ignores that it's not really in the public interest to pay brilliant people to spin their wheels.

    27. Re:Drinking from the firehose. by khallow · · Score: 1

      The central banks didn't provide the liquidity for the most recent bubble, or for the tech bubble.

      You are wrong here. The US Federal Reserve had low interest rates going into both bubbles and Fed officials did link money policies to the asset bubbles (for example, Greenspan's "irrational exuberance" speech in 1996).

      If inflation is tied to the money supply, why didn't we see hyperinflation when the Fed expanded its balance sheet by a factor of at least 2 in a week? Why didn't we see hyperinflation when the private sector was expanding its balance sheet by much larger than a factor of two in the run-up to the crash?

      Because a mere factor of two isn't hyperinflation. If they were doubling money supply every week for many weeks, then that would result in hyperinflation. And the Fed's "balance sheet" isn't a full measure of inflation since one also has to consider velocity of money which slows greatly during recessions.

    28. Re:Drinking from the firehose. by blue+trane · · Score: 1

      Let's look at some data, shall we?

      http://research.stlouisfed.org/fred2/graph/?g=qip shows that the Fed was in disciplinary mode, raising interest rates, before both the dot-com and the real-estate crash. Greenspan's "irrational exuberance" attitude was what killed dot-com, because, I think, he's an old fool who didn't understand the potential of technology to make obsolete his feudal economic models.

      Regarding velocity of money: http://research.stlouisfed.org/fred2/graph/?g=qiq
      If velocity of money leads to inflation, why wasn't there high inflation during the 1960s, 1990s, and 2000s?

      Regarding the money supply: http://research.stlouisfed.org/fred2/graph/?g=qir
      (Taking M2 because, as investopedia says, "economists like to include the more broadly defined definition for M2 when discussing the money supply, because modern economies often involve transfers between different account types. For example, a business may transfer $10,000 from a money market account to its checking account. This transfer would increase M1, which doesn’t include money market funds, while keeping M2 stable, since M2 contains money market accounts.")

      Why does the money supply increase at an almost constant, exponential rate, with no devastating inflation?

    29. Re:Drinking from the firehose. by blue+trane · · Score: 1

      You don't know what research will be read. Maybe it will become valuable after you're dead. Its value to you in the present may be nothing, but to another it might be great. For example, I like to listen to old jazz tunes on youtube that may have one or two other views. But there's value in them, because value is not a popularity contest. In the same way, research that is not valuable to you, or not popular at this time, can have immense value to the future. Example: piles of trash that are invaluable to archaeologists in reconstructing ancient Troy, say.

    30. Re:Drinking from the firehose. by khallow · · Score: 1

      You don't know what research will be read.

      No, but I have a pretty good idea.

      Maybe it will become valuable after you're dead.

      But it probably won't. As a general rule of thumb, if something doesn't prove itself in the first few decades to have value, then it probably never will.

      For example, I like to listen to old jazz tunes on youtube that may have one or two other views. But there's value in them, because value is not a popularity contest.

      Sure, because you listen to them, those youtube videos have some value.

      Example: piles of trash that are invaluable to archaeologists in reconstructing ancient Troy, say.

      It doesn't have to have universal value to everyone to have value. This is completely irrelevant to my point. Recall we're speaking of research data that will probably never be examined by anyone other than the author and perhaps a few reviewers. You then propose to extend the time that this research is publicly available. All I see here is that you extend the period of time over which this research gets ignored.

      And any research whose sole value is to future archeologists studying the current research culture is a waste of resources in my view.

    31. Re:Drinking from the firehose. by khallow · · Score: 1

      http://research.stlouisfed.org/fred2/graph/?g=qip shows that the Fed was in disciplinary mode, raising interest rates, before both the dot-com and the real-estate crash.

      Interest rates were rather low just the same. Also, your observation is additional support for my argument since the rates were raised just before the asset bubbles crashed. That timing is an important correlation for claiming cause and effect.

      When I look earlier, I see a 2.75% rate in early 90s (lowest since the 60s) and sub 2% rates after the 9/11 attacks (lowest since after the 1957-1958 recession).

    32. Re:Drinking from the firehose. by sjames · · Score: 1

      Since you're just serving up static data, throw it on a Debian server, put the security updates in a cron job and it should run trouble free. Every three years, someone can oversee the dist-upgrade. It doesn't have to have singing and dancing animations with a background just exactly the right shade of mauve or anything.

      If it gets more popular than expected, torrent it.

    33. Re:Drinking from the firehose. by blue+trane · · Score: 1

      I think you're ignoring far more obvious causes for inflation. In the 1970s, it was oil supply shocks. OPEC raised prices not because of economics of supply and demand, but for purely political, or psychological, reasons.

      The interest rate profile for the 1960s is similar to that for the 2000s. But inflation consequences were quite different, because there are much more important psychological factors involved.

      Rates were being raised years before the "bubble" burst. That's a very strange cause theory you have. What was really happening is that the private sector was creating money out of supposedly risk-free mortgage-backed assets. Then a hiccup occurred when UBS announced it was writing off over $10 billion in MBSes, and groupthink took over and the traders started an emotion-based sell-off. Interest rates and the money supply had little to do with it. Psychology and emotional overreaction were the main causes.

    34. Re:Drinking from the firehose. by blue+trane · · Score: 1

      I think your approach is like uniformly reporting "negative" on cancer tests, because the incidence of cancer is so low. You can have a very high successful prediction rate (99%, say) by simply saying "no" on every test. But that doesn't help the patients who have cancer. You can boast "I have a great prediction success rate!" but you're not helping anyone.

      In the same way, saying categorically that no research is valuable because a lot of it isn't valuable is silly. It's precisely the cases that are "thrown out with the bathwater", which you can't predict with your "universal no" attitude, that can matter most. Mendel's research was ignored for decades, until after he died. What if it had been publicly available (in an easily-accessible way such as we have now with online data storage) and Darwin had had access to it? Why wouldn't we want to make present-day Mendels' research easily available?
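
The cancer-test analogy in the comment above can be made concrete. This Python sketch scores a predictor that always answers "negative" on a made-up population of 1,000 patients with a 1 percent incidence. The counts are hypothetical, picked only to match the commenter's 99% figure, and the function name is invented for illustration.

```python
# Sketch of the point above: an "always negative" test scores high on
# accuracy but finds none of the real cases. All counts are hypothetical.


def evaluate_always_negative(labels: list[bool]) -> tuple[float, float]:
    """Return (accuracy, recall) for a predictor that always answers False."""
    predictions = [False] * len(labels)
    correct = sum(p == y for p, y in zip(predictions, labels))
    true_positives = sum(p and y for p, y in zip(predictions, labels))
    actual_positives = sum(labels)
    accuracy = correct / len(labels)
    recall = true_positives / actual_positives if actual_positives else 0.0
    return accuracy, recall


# 1,000 hypothetical patients, 10 of whom actually have the disease.
patients = [True] * 10 + [False] * 990
accuracy, recall = evaluate_always_negative(patients)

if __name__ == "__main__":
    print(f"accuracy: {accuracy:.2%}, recall: {recall:.2%}")
```

Accuracy rewards the predictor for the common case, while recall counts only the rare cases it was supposed to catch, which is where the blanket "no" fails completely.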

    35. Re:Drinking from the firehose. by blue+trane · · Score: 1

      Also the obvious point in the interest-rate graph is that 8 of 9 recessions immediately followed a rise in interest rates. Discipline caused the recessions, not too much money.

      In the dot-com crash, investors started pulling back because they couldn't keep their loans rolling over at the low interest rates. In the real-estate crash, mortgage rates went up because money was becoming tighter. What if interest rates had not gone up? Let's run a simulation to see if people would have been better off.

    36. Re:Drinking from the firehose. by khallow · · Score: 1

      I think your approach is like uniformly reporting "negative" on cancer tests, because the incidence of cancer is so low.

      That is not the case. My approach would be picking up most, if not all cases of useful research in question. Recall that scientific research which results in useful progress over the long term invariably has some usefulness and value even in the short term. This is a universal feature, not a quirk of market-oriented research.

      In the same way, saying categorically that no research is valuable because a lot of it isn't valuable is silly.

      Then don't say it. I don't say it either.

      I do think that this sort of claim indicates that you don't understand my argument. I'm not arguing that publicly funded research can't have value, but rather that a lot of it doesn't generate research with enough value that someone can be bothered to save the data or read the research at a later date.

      It's precisely the cases that are "thrown out with the bathwater", which you can't predict with your "universal no" attitude, that can matter most.

      Note that you haven't been able to come up with an example that furthers your claim. It's "think of the Trojan trash".

      As I see it, we have a huge misdirection of public funding into activities that actually harm scientific progress on the vague and unsubstantiated theory that we're somehow missing the most important research by only considering research that has value to us in the next few decades ("near future") or by claiming that most research shouldn't have value to us in the near future.

      But if one looks at actual research of the past which resulted in something useful, it resulted in something useful in the near future. It isn't a fluke, but a standard outcome of useful research.

      There's no point to these arguments aside from evading responsibility and accountability for the use of public funds. Keep in mind that research funding isn't unlimited. Even if you go with publicly funded research, you have to make choices and sacrifice some research in favor of other research. Deliberately blinding yourself to near future ways of determining the value of research means you are much more likely to invest in stuff that goes nowhere.

      Even if you fund means to address the current storage of data problem mentioned in the story, you will still continue to run into the other flaws of low value research such as no one actually bothering to look at the research. It's throwing good money after bad and doesn't solve the underlying problem.

    37. Re:Drinking from the firehose. by khallow · · Score: 1

      OPEC raised prices not because of economics of supply and demand, but for purely political, or psychological, reasons.

      As you noted yourself, 70s recessions were triggered by oil shocks, not by OPEC psychology.

      The interest rate profile for the 1960s is similar to that for the 2000s. But inflation consequences were quite different, because there are much more important psychological factors involved.

      No, they weren't. For example, the 60s interest rates didn't stick around the lowest interest rate for any length of time while the lowest points of the 2000s interest rates were maintained for more than a year.

      Then a hiccup occurred when UBS announced it was writing off over $10 billion in MBSes, and groupthink took over and the traders started an emotion-based sell-off. Interest rates and the money supply had little to do with it. Psychology and emotional overreaction were the main causes.

      Why did UBS write off anything in the first place? They had a margin call (well, the equivalent for banks which are required to maintain a level of reserves). Higher interest rates provided a lot of pressure for that. Similarly, all the other "emotional" traders felt that same pressure and many of them were forced to sell off (no matter their psychological make up) because their assets became too heavily leveraged and something had to be sold to maintain reserves.

      High interest rates (especially combined with increased reserve requirements which often coincide) result in universal pressure on the entire economy. When the economy is in good shape, that doesn't really cause much trouble. But when it's heavily invested in an overvalued asset bubble, then we see the sort of "psychology" you noticed.

      Just as easy credit increased the availability of funds with which to pursue and maintain reserve for risky financial actions, drying up that credit has the opposite effect.

    38. Re:Drinking from the firehose. by blue+trane · · Score: 1

      OPEC psychology created the oil shocks. There was no production capacity problem. There was a psychological issue.

      Regarding UBS, here's a quote from the Economics of Money and Banking class, Lecture 20 Notes:

      UBS was doing something it called a Negative Basis Trade in which it paid AIG 11 bp for 100% credit protection on a supersenior CDO tranche, and financed its holding of that tranche in the wholesale money market. In its report to shareholders, to explain why it lost so much of their money, it states that this trade netted an apparently riskfree arbitrage profit of 20 bp. Because it was apparently riskfree, they did massive amounts of it. The risk turned out to be liquidity risk, when money market funding dried up and they could not sell their AAA tranche. Their CDS hedge did them no good since they could not use it to raise funding. (To make matters worse, the CDS hedge was typically only against the first 2% loss, leaving UBS exposed for everything more than that.)

      So it wasn't a margin call. The money-market funds just stopped rolling over the loans. UBS was left holding CDOs that they couldn't sell, and that they hadn't insured enough. So they wrote down those assets, even though they were AAA and hadn't experienced enough defaults to make them drop in value; but the negative psychology of the market took over and no one would buy the CDOs anymore, because there was a perception they weren't worth anything. Cahan describes the mood that took over Wall Street in the book "House of Cards". Clearly, it was an emotional, psychological phenomenon. It had very little to do with actual defaults since they weren't enough to affect the AAA-rated top-tranche CDOs that UBS was holding.

      As for interest rates being held low for a year, are you then predicting an imminent crash, since interest rates have been low for a number of years? When will this crash occur, are you able to commit to a time period, since your "theory" is so fully supported?

    39. Re:Drinking from the firehose. by khallow · · Score: 1

      OPEC psychology created the oil shocks. There was no production capacity problem. There was a psychological issue.

      You don't get it. The existence of an effective cartel demonstrates in the first place that there were production capacity problems - namely that production capacity was highly concentrated in the hands of the cartel. And the oil shocks were profitable (in addition to increasing the political power of the OPEC members) - providing a straightforward market advantage for that choice.

      The risk turned out to be liquidity risk, when money market funding dried up and they could not sell their AAA tranche.

      Here we go. Liquidity risk that originated from the easy Fed money no longer being in the market.

      So it wasn't a margin call.

      Then why did they need to "raise funding"? Because they needed to have enough reserves to cover their investments.

      As for interest rates being held low for a year, are you then predicting an imminent crash, since interest rates have been low for a number of years?

      No, you get bubbles with easy credit. The crashes happen once they run out of borrowed funds with which to cover their leveraged bets. I think most of the current easy credit is being used to cover bad debt from the last recession and there are serious uncertainty issues (such as the anti-business environment in the US and the flaky EU nation-level bailouts) that are obstructing long term business activities like investment and hiring.

      I'll note that there are some minor bubbles in green technologies, IT, and bitcoins, for example.

      When will this crash occur, are you able to commit to a time period, since your "theory" is so fully supported?

      No, because timing depends on things that haven't happened yet. First, I don't see a large enough outlet for the easy credit being provided by the central banks. For example, it took a couple of years for real estate loans to emerge as the primary credit sink in the post-9/11 period. Similarly, the high tech industry didn't become a significant credit sink after the Japanese and US recessions of 1990-1992 until about the middle of the decade.

      Second, there needs to be substantial leverage. In the real estate crisis, this was provided primarily by easing reserve requirements to astoundingly low levels (like 1 part in 50 or worse).
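
      As a rough illustration of why that kind of leverage matters, here is a back-of-the-envelope sketch. The function names and the $1 of equity are hypothetical, and it ignores interest, fees and margin rules; it only shows how small a price drop wipes out the borrower's own money.

```python
def equity_after_drop(equity, leverage, drop):
    """Equity left after asset prices fall by `drop` (a fraction, e.g. 0.02)."""
    assets = equity * leverage   # total position bought with own plus borrowed money
    debt = assets - equity       # the borrowed part, which does not shrink
    return assets * (1 - drop) - debt


def wipeout_drop(leverage):
    """Fractional price fall that reduces equity to zero."""
    return 1 / leverage


# Hypothetical example: $1 of equity at 50 to 1 versus 5 to 1.
for lev in (50, 5):
    left = equity_after_drop(1.0, lev, 0.02)
    print(f"{lev}:1 leverage, 2% price drop -> equity left: {left:.2f}")
    print(f"{lev}:1 leverage is wiped out by a {wipeout_drop(lev):.0%} drop")
```

      At 50 to 1 a 2% fall in the asset is enough to erase the equity, while at 5 to 1 the same fall costs only a tenth of it.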

      In the dotcom bubble, there was a certain accounting method for expensing stock options (namely, that stock options were not listed as a liability until they were exercised). This method coincidentally was removed from the stock market in the same month that the stock market peaked. There was also leverage from outright fraud. Many dotcoms "stuffed the channel", that is, they reported goods shipped to retailers (the "retail channel") as "sold", which incidentally was illegal because those goods could be and often were returned without generating a profit for the business. But they could pretend that they were generating higher revenue and profits than they actually had under more honest accounting systems.

      There are classic ways to grow businesses using easy credit. For example, borrow against your assets to buy more assets which you then use as collateral in yet more loans. Then sell the business (either as a unit or via an issuing of stock shares) based on the revenue and asset numbers while downplaying the liabilities. As a pyramid scheme, this can last for a number of years before it comes down. Worldcom was one such example. I'd say that Abengoa S.A., a renewable energy provider is a current example of this approach.

      Some amount of superexponential growth is common. When an industry experiences growth rates that actually increase over time, that's a sign that the market is disengaged from any real basis and trading more on short term speculators' ever increasing expectations.
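
      One crude way to check for that pattern is sketched below. The two price series are invented for illustration and the tolerance is an arbitrary assumption; it just tests whether each period's growth rate is higher than the one before.

```python
def growth_rates(series):
    """Period-over-period growth rates, e.g. 0.10 for a 10% rise."""
    return [b / a - 1 for a, b in zip(series, series[1:])]


def is_superexponential(series, tol=1e-9):
    """True if every growth rate exceeds the previous one by more than tol."""
    rates = growth_rates(series)
    return all(later > earlier + tol for earlier, later in zip(rates, rates[1:]))


# Hypothetical index levels, not real market data.
steady = [100, 110, 121, 133.1]   # constant 10% growth: ordinary exponential
frothy = [100, 110, 132, 171.6]   # 10%, then 20%, then 30%: rates keep rising

print(is_superexponential(steady))
print(is_superexponential(frothy))
```

      A steady 10% a year is still exponential growth but fails this test; only the series whose growth rate itself keeps climbing passes it.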

      A significant symptom of the later stag

    40. Re:Drinking from the firehose. by khallow · · Score: 1

      For another model, I view recessions as large corrections of market perception. Recent recessions have been asset bubble driven, but there are other kinds of recessions such as the oil crises of the 70s (where suddenly the developed world realized that OPEC could manipulate oil supply and prices a huge amount and that resulted in all sorts of costly economic adjustments from changes in individual behavior up to national investments in alternative energy approaches).

      In this light, when the central bank sets interest rates, it is actually paying the markets to see interest rates as being in a certain range. This primes the pump for putting money in any available high leverage investments since suddenly there's no low risk investments with good interest payments out there. And once money starts flowing into such a bubble, it develops an attractive short term trend which brings in more money.

      My view is that it is better to just let the recession happen rather than go through all this effort to short circuit it and return to economic growth conditions. Recessions reduce the extent of bubbles (in large part by rewarding parties who didn't participate) and they cull poorly run businesses. If the recession is bad enough that it's going to destroy an entire industry, then limited intervention might be warranted just so a few survivors can pull through. This manipulating of the perception of future risk is IMHO a large part of the reason we currently have these large boom-bust cycles in developed world economies.

    41. Re:Drinking from the firehose. by blue+trane · · Score: 1

      Regarding OPEC: there was no supply and demand problem with oil. In economic terms, the price should not have risen because there was no production capacity problem. The reason prices rose was purely a matter of politics, of psychology, of policy. Not physical necessity. The proof is that prices later dropped to $10/barrel. So there was no production capacity problem. There was only a psychological problem.

      Regarding UBS's liquidity risk: according to Prof. Mehrling's story, UBS was getting funding from money-market funds, which are purely private organizations with private investors. The Fed had nothing to do with it (also this was in Europe so it would be the Eurodollar market, which the Fed has no control over). It was not easy money made possible by the Fed that fueled the "bubble", it was easy money created by the private sector. The Fed's interest rates, in non-crisis times such as for several years before 2008, are set higher than the private rates. That's why the Fed is a lender of last resort, because the banks go to each other first for better rates; only if they can't get anything do they resort to the Fed's discount window, which is dangerous because it can give them a bad reputation if they go to it too often.

      So the Fed was not providing easy money that fueled the "bubble". The private sector was providing the easy money, creating it out of thin air.

      UBS's problem was that they believed they had a risk-free asset, and booked years and years of future profits immediately (and paid huge bonuses on those expected, risk-free profits). Since they believed the CDOs were riskfree, they didn't hedge fully by taking out enough insurance (CDSes). When market emotion and groupthink started to take hold and the riskfree CDO AAA "supersenior" tranches lost market trust (even though defaults were not reducing their value), the trade value of the "riskfree" asset suddenly dropped and no one wanted them. The money-market funds (private entities) would no longer accept them as collateral. None of this had anything to do with the Fed. This was purely private sector activity. UBS was using CDOs as a replacement for T-bills in fact, maintaining they were as safe with better yields. The banks were creating money without any need for the Fed or govt bonds.

      After the crisis almost took down AIG, the Fed stepped in and backstopped the losses. It wasn't the Fed's easy money that caused the crisis; it was the Fed's response of increasing elasticity during the crisis that prevented it from getting worse.

      "At some point, someone finds a leak in the dike and can apply low interest rate credit in some way to investment in a highly leveraged sector (that is, most of the investment is via borrowed money, and it can be quite extreme such as the 50 to 1 leverage I mentioned earlier). "

      The point is that the "leak" was not a leak to Fed money. It was privately created money. The Fed's monetary policies had nothing to do with it. The regulators didn't even know what was going on, the Fed wasn't keeping track of the shadow banks' activities. The shadow banks were getting their funding from private money-market funds and creating money based on that, not based on Fed money. As I said, the CDOs and MBSes were sold as being better than T-bills because they were riskfree and had high yields. The shadow banks didn't need Fed money. Until market groupthink soured on them and the crisis was well underway.

    42. Re:Drinking from the firehose. by blue+trane · · Score: 1

      "In this light, when the central bank sets interest rates, it is actually paying the markets to see interest rates as being in a certain range. This primes the pump for putting money in any available high leverage investments since suddenly there's no low risk investments with good interest payments out there. And once money starts flowing into such a bubble, it develops an attractive short term trend which brings in more money."

      Your story doesn't take into account that the Fed doesn't set interest rates (except the Discount Rate which is set a fixed amount above the natural private rate). It can try to target rates, but the rates are ultimately negotiated by the private institutions themselves.

      Also, looking carefully at the Fed Funds rate graph, it's clear that the Fed started raising interest rates in 2004, long before the crash in 2008. Your claim that the rates were lower for years than in the 1960s isn't really accurate; the Fed kept the lowest rate in the 2000s for about a year only, not "years". And then the rate of increase was steeper than in the 1960s. But there were very different consequences, supporting my claim that the Fed didn't cause the asset "bubble". It was market psychology, not interest rates, that was the primary cause.

      "My view is that it is better to just let the recession happen rather than go through all this effort to short circuit it and return to economic growth conditions. Recessions reduce the extent of bubbles (in large part by rewarding parties who didn't participate) and they cull poorly run businesses. If the recession is bad enough that it's going to destroy an entire industry, then limited intervention might be warranted just so a few survivors can pull through. This manipulating of the perception of future risk is IMHO a large part of the reason we currently have these large boom-bust cycles in developed world economies."

      I agree, as long as there's a robust safety net. Let the market play, but don't force all of us to play in its game. Provide individuals with the choice of a Basic Income guarantee, so we can pursue our own ideas of how to advance knowledge and technology without taking part in the perverse incentives and moral hazards that capitalism promotes. Encourage individuals to maximize their native-born instinct for wonder and creativity with challenges. When an individual comes up with a great disruptive idea, it can then be turned over to biz to do what it does best: incrementally innovate.

    43. Re:Drinking from the firehose. by khallow · · Score: 1

      Your story doesn't take into account that the Fed doesn't set interest rates (except the Discount Rate which is set a fixed amount above the natural private rate). It can try to target rates, but the rates are ultimately negotiated by the private institutions themselves.

      The Fed has very effective tools for targeting rates. A control system doesn't need to be perfect to be an effective control system.

    44. Re:Drinking from the firehose. by blue+trane · · Score: 1

      The private banking system evolved of its own accord towards a centralized system where clearinghouses played a role similar to the Fed today. There was a need for a central bank that could provide elasticity in times of crises, and it was convenient for all the banks to settle payments once a day at a clearinghouse, instead of many times with each bank someone had written a check on or cashed a check at. A centralized system made sense.

      The problem with the centralized system was that it didn't provide enough elasticity because it was still controlled by a private profit-motivated person, such as J P Morgan for example. Morgan stepped up to help the country in the Panic of 1907, expanding the money supply on his own by issuing clearinghouse certificates. But leaving such control in the hands of a private citizen was recognized by all concerned as a bad idea, because Morgan could choose only to help those he liked for example. That was the motivation for the Fed, to create a more equitable central bank which wouldn't play favorites as a private individual could. Also the Fed is not profit-motivated; it operates in the public interest. It is required to return all interest on Treasury bonds to the Treasury, for example.

      The problem with not having an interest rate target is that, in times of crisis, the banks will raise interest rates and contract credit. And people, who had no part in the making of the crisis, will suffer. The government is mandated by the Constitution to "provide for the General Welfare". I think direct payments to individuals (fiscal policy) is a better way, but the Fed is trying to do what it can. It is a learning process; in the 1929 crash it didn't act fast enough to provide elasticity, for example.

      Once again, I would provide a basic income guarantee for all, and then let institutions fail. I would backstop individuals rather than institutions. Until we get there though, the Fed is doing what it can. It's better than letting the private banks alone control the money supply and interest rates. We tried that and it led to so many panics and so much inconvenience that the private system on its own evolved a centralized bank system.

  5. This article looks familiar by harvestsun · · Score: 1
    1. Re: This article looks familiar by Anonymous Coward · · Score: 1

      Scientific data would not be lost if it was posted on Slashdot... you could just retrieve the next day's dupe.

    2. Re:This article looks familiar by gandhi_2 · · Score: 1

      Because without constant refreshing, this article would disappear!

    3. Re:This article looks familiar by bob_super · · Score: 1

      Don't blame them, the editors really care, given their apparent short-term memory loss and/or schizophrenia.

      (yes I know about varying medical definitions of schizo)

    4. Re:This article looks familiar by Mathinker · · Score: 1

      Gawd, how I wish Slashdot would go back to using SRAM...

  6. Make it publicly available. by Anonymous Coward · · Score: 0

    Make it publicly available instead of DRM controlled publications or services.

    1. Re:Make it publicly available. by tqk · · Score: 1

      Make it publicly available instead of DRM controlled publications or services.

      I suspect those publications and services are among the few things pushing this in the other direction. Multiple reviewers, each with their own copy of the data and, as they'd be in the same field of research as the author(s), more likely to be personally familiar with the authors' current work and location.

      Is that irony? I never have managed to figure that one out.

      --
      "Tongue tied and twisted, just an Earth bound misfit ..." -- Pink Floyd.
    2. Re:Make it publicly available. by blue+trane · · Score: 1

      Do the reviewers have the actual data? Or just the papers?

    3. Re:Make it publicly available. by riverat1 · · Score: 1

      Just the papers. The purpose of peer review is sort of like a spelling and grammar check. Reviewers make sure the paper doesn't have any silly scientific mistakes and that the information is presented clearly enough for other scientists to be able to follow the work. Whether the paper ultimately passes muster comes after it is published when the general community in the field can read it and make their comments.

  7. Another fucking dupe by Anonymous Coward · · Score: 0

    Seriously, Slashdot editors... is it too fucking hard to look at the news you posted *yesterday* before adding an article? Turn in your keys and ID badge.

    1. Re:Another fucking dupe by sumdumass · · Score: 1

      I think you are forgetting about the FireHose. Chances are one if not both of the dupes was promoted via the fire hose where enough positive votes promoted it as designed.

  8. Obvious solution. by Anonymous Coward · · Score: 1

    They should post their data to Slashdot, which will duplicate that shit so many times it will never vanish.

  9. Slashdot can solve that by DMiax · · Score: 1

    That's why Slashdot is keen on posting all new studies at least twice, thus increasing the chances they are still available for future generations!

  10. From personal experience by Anonymous Coward · · Score: 1

    I've found dead links to data in peer reviewed papers published just a week or less prior to reading them, sometimes these links were never valid to begin with.

    1. Re:From personal experience by Obfuscant · · Score: 1

      I've found dead links to data in peer reviewed papers published just a week or less prior to reading them, sometimes these links were never valid to begin with.

      Maybe the peer-review process should be shorter, or you should keep up with current journals and not depend on ten year old articles?

      Seriously though, maintenance of data requires money. I have 22 years' worth of data here. Much of it is raw video on VHS tapes. Much of it is on old floppies. Much of it is on TK70 tapes. Much of it is on early versions of magneto-optical disks. I don't have anything that reads any of those formats anymore.

      Who pays to keep copying old data onto new media as new media are developed and old media readers break? Not the funding agencies. They don't even pay for upkeep of the data that I do have online.

  11. Re:Scientific Data Disappears At Alarming Rate too by badboy_tw2002 · · Score: 4, Funny

    Don't worry, Slashdot stories won't suffer the same fate as each one is duplicated later on!

  12. Entropy most common scientific subject to lose data by JoeyRox · · Score: 1

    Couldn't resist.

  13. Options by jklovanc · · Score: 4, Interesting

    Maybe there should be an option to "ignore" an article or "report as duplicate". The second option would require someone to react to it so it may not work.

  14. Re:Scientific Data Disappears At Alarming Rate too by msobkow · · Score: 2

    Gee, three hours to a dupe.

    That has to be some kind of new record.

    --
    I do not fail; I succeed at finding out what does not work.
  15. At least by Nemyst · · Score: 3, Interesting

    Slashdot is doing its part by posting the same data multiple times. Perhaps one copy will survive the test of time!

  16. Re:Scientific Data Disappears At Alarming Rate too by msobkow · · Score: 1

    Oh. Wait. 15 hours. Maybe it's not a record after all. :P

    Forgot about the 24 hour clock. :)

    --
    I do not fail; I succeed at finding out what does not work.
  17. Neglect causes dupes. by Anonymous Coward · · Score: 1

    Dupity dupe dupe!

  18. Re:expensive (whole) cloth by allcoolnameswheretak · · Score: 1

    What data? I just need to walk outside. It's end of December in Germany and we have 6C outside. Tomorrow 12C are forecast. I doubt I will see any snow at all this year. When I was a kid, we used to build snowmen and do battle with snowballs at this time.

  19. Re:expensive (whole) cloth by composer777 · · Score: 1

    Right, all those almanac records have just up and disappeared. It's a "conspiracy".

  20. primary data archival by Anonymous Coward · · Score: 1

    Working in the field, I can pretty much state that far from enough care is taken with data archival and/or transfer to newer storage media when older ones approach obsolescence.

    There's:
    A: not enough staff to take care of it properly or keep a proper archival environment for the various media
    B: not enough money & time to modernize the records/transfer to new mediums
    C: sometimes not enough money to even properly maintain obsolete, long-unsupported and obscure data recording equipment
    (I've seen 'rubber' pinch rollers that had turned to tar-like sludge pretty often. Still have no idea what could have caused that. It's a nightmare to clean.)
    D: data recording equipment that hasn't been so much as looked at in such a long time that the last guy who knew how to even run the thing died of old age
    E: recording medium with true 'shelf' lifespan far shorter than originally stated. (Cue reel-to-reel tapes delaminating, thermal graphic records bleaching/blacking out, etc. -- related to point A)
    F: esoteric and variable recording methods and configurations that were not written down at the time the data was initially collected and recorded
    G: outright loss and/or disposal of unique equipment due to inattentive staff or inventory management personnel / procurement personnel deciding something is useless and worthless.

    Let's not even talk about accidentally overwriting data without ever realizing it (say, 'flipping' tapes in a situation where they shouldn't) because no one would actually check that the data was adequately recorded until years/decades after the fact.

    1. Re:primary data archival by cusco · · Score: 1

      Sometimes there is also deliberate and/or malicious destruction to take into account as well, like the Bush mAdministration ordering the destruction of the Mariner and Pioneer data.

      --
      "Think about how stupid the average person is. Now, realise that half of them are dumber than that." - George Carlin
  21. Re:expensive (whole) cloth by harvey+the+nerd · · Score: 1

    Lonnie Thompson's missing ice core data, unarchived for 20+ yrs comes to mind, among many. Catastrophic anthropogenic global warming, it's a religion. The smart money is thinking more about the probable cold years after 2018.

  22. Could learn a bit from CSIRO here... by Anonymous Coward · · Score: 0

    They're pretty good at preserving their research data these days...

    https://data.csiro.au

  23. Re:expensive (whole) cloth by Anonymous Coward · · Score: 1

    What data? I just need to walk outside. It's end of December in Germany and we have 6C outside. Tomorrow 12C are forecast. I doubt I will see any snow at all this year. When I was a kid, we used to build snowmen and do battle with snowballs at this time.

    That's very interesting! We had record low temperatures here a couple weeks ago. Colder than I've ever experienced in my life and I've been living here for 30 years. Exciting times! But, unfortunately two data points is not enough to make any kind of conclusion about changes in the global climate. I think you may be confusing meteorology and climatology. We need lots of data to examine climate change, which has been collected for that very reason. It'd be a shame to lose it.

  24. Re:expensive (whole) cloth by Obfuscant · · Score: 1

    That's very interesting! We had record low temperatures here a couple weeks ago. Colder than I've ever experienced in my life and I've been living here for 30 years. Exciting times! But, unfortunately two data points is not enough to make any kind of conclusion about changes in the global climate.

    Record cold here, too. Now we have three points, and it's two to one in favor of global cooling. Woot woot!

    We need lots of data to examine climate change, which has been collected for that very reason. It'd be a shame to lose it.

    Don't fear. If we lose any real data, the atmospheric modelers will happily create new old data.

  25. Re:Scientific Data Disappears At Alarming Rate too by riverat1 · · Score: 1

    Maybe they do it so you can use your mod points on one of the posts and make comments on the dupe.

  26. Re:expensive (whole) cloth by riverat1 · · Score: 1

    It's sad the misunderstanding of climate science that your post demonstrates. Modelers don't create data (at least not the data you're thinking about), they compare their model output to real world data to understand how well they model the real world.

  27. Re:expensive (whole) cloth by Obfuscant · · Score: 1

    Whoosh.

    And look up the word "hindcast" if you think modelers don't create "old" data.

  28. Digital Data by koan · · Score: 1

    Think of all the family photos that will get deleted or destroyed by hardware failure, and to think I have family photos (on film) from over 100 years ago.

    --
    "If any question why we died, Tell them because our fathers lied."
  29. research institution IT is the problem by Anonymous Coward · · Score: 0

    When a researcher (a postdoc, say) leaves the typical university, her web page gets shut down and her email account deleted. Researchers tend to keep links to their papers, data, and open-source code on these web pages. But university IT departments tend to be super conservative. If the person is gone, so's the data.

    It frustrates me a lot, actually. I think there needs to be a new role in IT: the librarian-archivist: someone who is dedicated to keeping data alive. Exiting researchers could apply to have their web pages frozen (optionally with a forwarding URL), and the IT librarian's job would be to review these applications and do the work necessary to keep these pages alive indefinitely. It's all static content, so it's not that hard aside from the storage problem.

    Additionally, even if the institution doesn't want the exiting researcher to send from her institutional email address any longer, the institution could still forward the email to a new account.

  30. Re:expensive (whole) cloth by riverat1 · · Score: 2

    I'm perfectly aware of what hindcasting is. The results of a hindcast are never presented as real world data.

  31. Library of Congress? by riverat1 · · Score: 1

    Maybe designating the Library of Congress as a repository for scientific data would work. They're pretty good at archiving stuff.

  32. There are no unique identifiers for authors by damn_registrars · · Score: 1

    Part of the problem with corresponding with authors of papers more than 2 years old is that there is no good way to uniquely identify an author. If you know that you are interested in a "John Smith" who wrote a Nature paper in 1989, good luck figuring out which "John Smith" is the same one today (if he is still alive). Another good example is how many papers are by "Z Huang": currently over 6,000 in PubMed.

    Considering how we expect researchers to change institutions multiple times in their careers in order to advance, this problem only becomes more difficult over time.

    --
    Damn_registrars has no butt-hole. Damn_registrars has no use for a butt-hole.
  33. Re:expensive (whole) cloth by Obfuscant · · Score: 1

    The results of a hindcast are never presented as real world data.

    A. That you know of.

    B. There was a conditional clause involved that included a complete loss of valuable real data. If the data was valuable and there is a model that can recreate it, it can be done.

    C. If you know about hindcasting, then you know that modelers, as a regular course of business, create "new old data" which they then compare to the real old data. Saying "modelers don't create data" is wrong; they don't routinely or honestly create what they will call real data.

    D. Even that last statement isn't totally true. In ocean modeling it is not unheard of for a model output (created data) to be used to correct some real-world measurements for parameters that cannot easily be measured. For example, if tide level data is required from a place that doesn't have a tide gauge, the modeled tide level from a validated model may be used. This leads to second generation "real" data that is directly dependent upon model output.

    E. And again, "whoosh". It was a joke. "Woot woot!"

  34. Scientific Data condensed as papers by volvox_voxel · · Score: 1

    One thing that I lament about scientific publications is that the results are boiled down to a few pages. You rarely see raw data, and generally only the statistical analysis. I would like to see web links in journals that include more of the raw data, the programs that generated that data, etc. We live in a day and age when gigabytes are cheap. It would be a lot easier to duplicate someone's work for peer review if the underlying data and analysis programs were more accessible. That said, a fair number of organizations have no interest in making their data easier to understand because of commercialization and patent issues.

    I for one see a lot of EE/CS papers that are devoid of source code. Source code is cumbersome to print, which is why I think it's rarely included, as it would take up too much paper. I do think the inclusion of source code facilitates a better understanding of the author's intent. I would love to see CS papers include hyperlinks to a database hosted by the journal publisher as a new standard in the "information age".

  35. A Thing or Two, Within a Factor of Fifty by darenw · · Score: 1

    "Research scientists could learn an important thing or two from computer scientists,..."

    What is the error bar on "a thing or two"?

    As someone with a foot in each camp, I believe it's more like fifty or a hundred. The methods of scientists regarding computing are often built of slow evolutionary changes upon old familiar methods, while incorporating selected cutting-edge hardware or algorithms. It is partly the nature of some science projects to carry out observations over many years, ideally with the same instruments, processing and management. In academic computer science, as well as real-world IT, all layers and all aspects of any large system are always changing over time. ("All" = 100% give or take a few %) (And yes, some days, it does seem like over 100%)

  36. Re:expensive (whole) cloth by jedrek · · Score: 1

    It's 6C in Warsaw right now... and last year we'd had snow for two months by this time.

    A handful of data points does not a trend make.

  37. Maybe the dog really did eat John Lott's homework by nbauman · · Score: 1

    https://en.wikipedia.org/wiki/John_Lott#Disputed_survey

    Disputed survey

    In the course of a dispute with Otis Dudley Duncan in 1999–2000,[55][56] Lott claimed to have undertaken a national survey of 2,424 respondents in 1997, the results of which were the source for claims he had made beginning in 1997.[57] However, in 2000 Lott was unable to produce the data, or any records showing that the survey had been undertaken. He said the 1997 hard drive crash that had affected several projects with co-authors had destroyed his survey data set,[58] the original tally sheets had been abandoned with other personal property in his move from Chicago to Yale, and he could not recall the names of any of the students who he said had worked on it. Critics alleged that the survey had never taken place,[59] but Lott defends the survey's existence and accuracy, quoting on his website colleagues who lost data in the hard drive crash.[60]

  38. Journals by belg4mit · · Score: 1

    Perhaps this is an opportunity for journals to update their business models?
    Warehouse and convert data, as well as curate contact lists for papers.

    --
    Were that I say, pancakes?
  39. Re:expensive (whole) cloth by riverat1 · · Score: 1

    Oops, I guess I fell victim to Poe's law.

    But when you get down to it pretty much everything in science is a model of the real world in one way or another.

  40. Those who do not remember science... by __aaltlg1547 · · Score: 1

    ... are condemned to repeat it.

  41. Re:expensive (whole) cloth by Anonymous Coward · · Score: 0

    Tell that to the warmists during a heat wave or a hurricane. Both are considered indisputable proof of global warming. There is indeed a double standard.

  42. Re:expensive (whole) cloth by Anonymous Coward · · Score: 0

    Some people think that global warming is related to solar flares, not human activity.

  43. Do research data even exist? by Anonymous Coward · · Score: 0

    I think it is really convenient not to give direct access to raw data: that way your claims are harder to verify. Just in case, I'm sure they have a subset ready that will work perfectly with their results.

  44. Purdue University Research Repository by Mark+Leighton+Fisher · · Score: 1

    I'm late to the party here, but I thought it was worth mentioning that the Purdue University Research Repository (https://purr.purdue.edu) is designed as a Trusted Digital Repository for research data. The default lifetime is 10 years, but the Purdue Libraries will add noteworthy datasets to its permanent digital collection after their default lifetime expires. (And yes, I am a programmer on the project.)

    --
    "Display some adaptability" -- Doug Shaftoe, _Cryptonomicon_