Slashdot Mirror


Copyright Status of Thermodynamic Properties?

orzetto writes "I work at a research institute, and programming models of physical systems is what I do most of the time. One significant problem when modeling physical processes is finding thermodynamic data. There are some commercial solutions, but these can be quite expensive, and to the best of my knowledge there are no open source efforts in this direction. In my previous job, my company used NIST's Supertrapp, which is not really that expensive, but is written in Fortran, and an old-fashioned dialect at that. As a result, it is a bit difficult to integrate into other projects (praised be f2c), and the programming interface is simply horrible; worse, there are some Fortran-induced limitations such as a maximum of 20 species in a mixture. I was wondering whether it would be legal to buy a copy of such a database (they usually sell with source code, no one can read Fortran anyway); take the data, possibly reformatting it as XML; implement a new programming interface from scratch; and publish the package as free software. Thermodynamic data is not an intellectual creation but a mere measurement, which was most likely done not by the programmers but by scientists funded with our tax money. What are your experiences and opinions on the matter? For the record, I am based in Germany, so the EU database directive applies."
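
A minimal sketch of the "reformat it as XML" step, assuming a hypothetical record layout. The element and attribute names (`thermodata`, `species`, `property`, `unit`) are invented here for illustration and do not come from Supertrapp or any NIST format; the two water values are only sample content.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory records; the field names are assumptions for illustration.
RECORDS = [
    {
        "name": "water",
        "formula": "H2O",
        "properties": {
            "molar_mass": (18.015, "g/mol"),
            "critical_temperature": (647.096, "K"),
        },
    },
]


def records_to_xml(records):
    """Serialize species records to an XML string (invented schema)."""
    root = ET.Element("thermodata")
    for rec in records:
        species = ET.SubElement(root, "species", name=rec["name"], formula=rec["formula"])
        for prop_name in sorted(rec["properties"]):
            value, unit = rec["properties"][prop_name]
            prop = ET.SubElement(species, "property", name=prop_name, unit=unit)
            prop.text = repr(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_records(xml_text):
    """Parse the invented schema back into the same record structure."""
    records = []
    for species in ET.fromstring(xml_text).findall("species"):
        properties = {
            prop.get("name"): (float(prop.text), prop.get("unit"))
            for prop in species.findall("property")
        }
        records.append(
            {
                "name": species.get("name"),
                "formula": species.get("formula"),
                "properties": properties,
            }
        )
    return records
```

Because the records live in ordinary lists and dictionaries, nothing in this layout imposes a fixed cap such as 20 species per mixture.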

33 of 154 comments

  1. FORTRAN by pilsner.urquell · · Score: 4, Funny

    FORTRAN awful? Give me a break.

    </sarcasm>

    1. Re:FORTRAN by Nefarious+Wheel · · Score: 4, Funny

      Integer*16 I

      Real*4 Still

      Real*4 Think

      Integer*16 In

      Real*4 Fortran

      C you insensitive clod!

      --
      Do not mock my vision of impractical footwear
    2. Re:FORTRAN by Anonymous Coward · · Score: 4, Informative

      Huh? Recent versions (ie, in the past couple decades) of Fortran are really very decent for scientific calculation, in many respects better than C. There's a ton of computational chemistry software, for example, written primarily in modern Fortran.

    3. Re:FORTRAN by orkybash · · Score: 2, Interesting

      Most stuff written *today* is written in modern fortran where you can actually have variable names of a decent length. Most legacy code that you have to rely on (e.g. linear algebra routines) is written in the cruddy old fortran. But it's solid code, works as a black box, and I would venture to guess that it's not a *whole* lot less readable than your average implementation of printf. Plus, if you want to update it to modern fortran, be my guest - hope you have a lot of time, patience, money, and a good set of unit tests....

    4. Re:FORTRAN by MaskedSlacker · · Score: 4, Informative

      I used to work solely in FORTRAN for simulation work. The big advantage of Fortran90/95 over C is that the compilers are heavily optimized for doing iterated operations over every value of an array. So for, say, fluid dynamics, it really is the best. I suspect you might be able to achieve a similar speed in C, but that you would have to hand optimise instead (ugh).

  2. NIST - Public Domain by John+Hasler · · Score: 4, Informative

    If the NIST program is the product of the work of US Government employees it is in the public domain. I would not be surprised if many of the commercial closed-source programs for the same purpose are based on it. In any case, tabulated data is not protected by US copyright so someone in the US could certainly do as you suggest.

    --
    Warning: this article may contain humor, sarcasm, parody, and perhaps even irony. Read at your own risk.
    1. Re:NIST - Public Domain by morgan_greywolf · · Score: 2, Informative

      OTOH, Maybe they just ripped off the Koreans.

      Looks like the same info to me.

      (But IANAP)

    2. Re:NIST - Public Domain by Anonymous Coward · · Score: 2, Interesting

      It appears that there is an exemption to the public domain status which applies here:

      15 U.S.C. § 290e authorizes U.S. Secretary of Commerce to secure copyright for works produced by the Department of Commerce under the Standard Reference Data Act.[8]

  3. IRTTALIYJ by Anonymous Coward · · Score: 3, Insightful

    I Recommend Talking To A Lawyer In Your Jurisdiction.

    HTH

  4. Wrote code in ForTran 77 for six years by Anonymous Coward · · Score: 4, Funny

    Let me tell you something: God speaks ForTran, and the guys who translated the bible from ForTran to Hebrew did a really really bad job.

    1. Re:Wrote code in ForTran 77 for six years by morgan_greywolf · · Score: 3, Funny

      Let me tell you something: God speaks ForTran, and the guys who translated the bible from ForTran to Hebrew did a really really bad job.

      Indeed. For example, here's the FORTRAN source code to Genesis.

    2. Re:Wrote code in ForTran 77 for six years by Lawrence_Bird · · Score: 2, Insightful

      If this guy can't handle reading the FORTRAN code I seriously doubt he is capable of re-inventing it in a 'new' language. FORTRAN is not that hard to understand even "old" dialects.

  5. FAQ claims copyright by Mathinker · · Score: 2, Informative

    The FAQ claims that the US government has a copyright on the material. This could be possible if the material was not directly generated by the NIST itself --- for example, they paid a contractor to generate it and it is considered a "work for hire".

    The facts themselves probably can't be considered to be under copyright.

    OTOH, I agree with a previous poster that you should consult a lawyer if you want to actually do anything which isn't sheeple-ish with the data.

    1. Re:FAQ claims copyright by Alsee · · Score: 5, Informative

      The EU database law specifically does not protect foreign databases unless that foreign country also creates a database law and establishes mutual protection. The US has no such protection, in fact it seems no country outside the EU has established reciprocal database protection. It should be possible to do this open source project based on data from the US or from anywhere outside the EU.

      The FAQ [nist.gov] claims that the US government has a copyright on the material.

      The factual data in that database cannot be protected by copyright, it is not protected as a database in the US, and is not covered under EU law. The only copyright they could claim on it is either if it contains creative images or creative text or the like, then those particular elements could be protected, or they could perhaps claim a copyright on the creative arrangement and formatting of the data in the database. Both of those issues can be avoided.

      What can be done is use this database and read out the needed factual data elements and then re-write it into the database for the open source project. Purely factual text-fields such as the name of an element or compound or whatever can be copied, just be careful not to copy any images or free-form text fields such as descriptive text or explanatory text. Then write the data out in your own arrangement. The best thing to do there is to arrange the data in some strict alphabetical or numerical order - there is no creativity and no copyrightability in that sort of unique ordering. That means not only storing the records in alphabetical order, but also ordering the data elements within each record in name-of-field alphabetical order. It might even be a good idea to rename any fields that are reasonably open to custom naming. There is no need to rename a field like "name" or "address" or "phone number", but a field like "work contact number" could easily be called "work phone".

      The best way to go about it would be to create a mostly-empty, but functioning, database before even looking at your intended source material, that way by definition there is no copying of the formatting of the database. Once there is a functioning database design then the factual data elements can be copied from the source to fill the already-designed database.
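
      The extraction procedure described above could be sketched roughly as follows. This is an illustration only, not legal advice and not a real tool: the field names, the `FACTUAL_FIELDS` whitelist and the sample records are all hypothetical.

```python
# Fields assumed to be purely factual; this whitelist is a hypothetical design choice.
FACTUAL_FIELDS = {"name", "formula", "molar_mass_g_per_mol"}

# Hypothetical source records, including a free-form text field that must not be copied.
SOURCE = [
    {
        "name": "water",
        "formula": "H2O",
        "molar_mass_g_per_mol": 18.015,
        "description": "free-form vendor text",
    },
    {
        "name": "ammonia",
        "formula": "NH3",
        "molar_mass_g_per_mol": 17.031,
        "description": "more free-form vendor text",
    },
]


def extract_facts(source_records):
    """Keep whitelisted factual fields, then impose a purely mechanical order."""
    cleaned = []
    for record in source_records:
        # Drop free-form text and anything else that is not on the whitelist.
        facts = {key: record[key] for key in record if key in FACTUAL_FIELDS}
        # Order the fields within each record alphabetically by field name.
        cleaned.append(dict(sorted(facts.items())))
    # Order the records themselves alphabetically by name.
    return sorted(cleaned, key=lambda rec: rec["name"])
```

      The target layout (here just the whitelist) is fixed before any source record is read, which mirrors the design-first, populate-later order suggested above.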

      -

      --
      - - You can't take something off the Internet! That's like trying to take pee out of a swimming pool.
    2. Re:FAQ claims copyright by Throtex · · Score: 2, Insightful

      I wasn't expecting to find the correct answer to a legal question here on Slashdot, but, there it is. /thread. Too bad I don't have any mod points.

      One nit, though, just be careful with "renaming a field" as a solution ... that could still get you nailed as a derivative work. I do like the idea of building the framework from scratch, and only then populating it with the data.

    3. Re:FAQ claims copyright by Anonymous Coward · · Score: 3, Informative

      I am a federal worker and I oversee some contracts that involve writing Fortran codes for simulating nuclear reactors. That is not quite right. You need to consult the Federal Acquisition Regulation (FAR), Chapter 27. Specifically, see

      27.404-2 Limited rights data and restricted computer software.
      and
      27.404-3 Copyrighted works.

      http://www.acquisition.gov/far/current/html/Subpart%2027_4.html#wp1041836

      If you read those sections, and take the time to really understand the definitions they use, and read the appropriate appendices, etc, you will find that the legalese seems to indicate that the contractor IS allowed to copyright data generated in performance of the contract (with the government's permission), and that the government maintains an exclusive, irrevocable license to use such data for its purposes, but the government does not necessarily maintain an exclusive right to "redistribute" such data.

      It is my belief that the law is written this way so as to give potential contractors an incentive to do business with the government. If a company can't build a portfolio of intellectual property, then it has no means of distinguishing itself from the competition. In the long run, the government would not get the best value for its $.

  6. Re:Department by morgan_greywolf · · Score: 3, Informative

    Anything produced by the United States Federal Government (which the National Institute of Standards and Technology certainly qualifies as), is in the public domain.

    That's what he meant.

  7. Where did commercial solutions get data from? by Gnavpot · · Score: 2, Insightful

    I would assume that it would be difficult to sell a commercial solution for scientific purposes unless it is based on already documented and accepted data. Basing your scientific work on calculations made by a commercial solution with homegrown data would make it difficult to openly document your method to other scientists. So why not find the published version of those data instead of lifting them out of software?

    But what do I know? I am an engineer, not a scientist.

    In my work I do a lot of calculations of water and steam properties, and the available software I know of is strictly using the calculation methods published by IAPWS. So if I wanted to, I could buy the IAPWS documentation and make my own software.

  8. It Probably Wouldn't Be Legal by SwashbucklingCowboy · · Score: 4, Informative

    A database is copyrightable. See http://www.bitlaw.com/copyright/database.html

    1. Re:It Probably Wouldn't Be Legal by Wdi · · Score: 4, Interesting

      I do not know about this exact database, but many scientific databases are hand-curated and extensively reviewed. Many do not include every measurement published in the literature, but carefully and judiciously select those data points deemed, by expert opinion, most reliable. Thermodynamic databases do not contain "facts" per se, but measured data points which may or may not be close to the facts. The editing and review process, which is quite an investment, does often create a solid foundation for copyright. These databases are not just a routine business, like a reformatted dump of the data from a telephone company.

    2. Re:It Probably Wouldn't Be Legal by pearl298 · · Score: 2, Informative

      A database is copyrightable, but the applicable case law from when I practised (YEARS AGO!) was the phone directory - it was held to be sufficient that the copier rearranged and reorganised the information to provide a "mere spark of creativity".

  9. It will be a very difficult project by Anonymous Coward · · Score: 5, Interesting

    I can't find my copy of Supertrapp at the moment, but as I recall there is some strange wording in the license. It's definitely NOT public domain as asserted by the uninformed.

    It's also not tabulated data. It's a collection of equations and empirical constants embedded in what may be the worst code I've ever seen.

    It may be easier to track down the original papers and work from those, though that too is difficult as lots of the original work was published in obscure journals.

    FWIW I am very comfortable w/ FORTRAN and prefer it for serious numerical work (default choice is C). I'm also quite skilled at interfacing FORTRAN to other languages.

    I'm interested in working on such a project and have quite a bit of experience w/ the problem, though only limited experience w/ Supertrapp because it is so bad I tended to avoid using it unless I absolutely had to. Please send me an email so we can discuss more. rhb acm.org

    Reg Beardsley

    1. Re:It will be a very difficult project by GPSguy · · Score: 3, Informative

      I tend to work in the atmospheric sciences, where, as one might guess we work with microphysical processes and thermodynamic data. I would second the recommendation that working from the original, authoritative publications would be a good approach. If you're well-versed in the field already, you're familiar with the seminal works. If you're not, your job is bigger than you realize, as programming for a scientific project is rarely just finding equations and re-coding them, or finding a database of physical constants and calling them. You've got to understand where in the domain in question they come into play and use appropriate equations and parameterizations.

      Fortran, even Fortran-66, is rarely unreadable. However, it often is written like a short story in a local dialect. The author has a method and style, and you have to understand it, or at least become conversant with it, before reading and understanding the flow of the code occurs. I should point out that this is really not different from any other language. Fortran, however, has been maligned because its roots were not in object-basis. Fortran 90 and Fortran 95 both, however, comply with the OO paradigm. The inherent problem is, CS departments often don't teach Fortran, and their faculty will tell you how horrid it is. Why? Because their discipline is COMPUTER science, not, say, solid earth geophysics, and they're conversant with a number of languages and feel they can pick the best one for the job. The geophysicist, on the other hand, spent his time learning how and why those pesky tectonic plates move around, something the computer scientist never really studied unless, maybe, he took a rocks-for-jocks class and got really interested. Rather than mastering C, C++, Java and C#, the geophysicist learned just enough Fortran to get his work done, and proceeded down a different path. Since Fortran ("FORmula TRANslation") was developed to help discipline scientists transform their equations to operable code, this really makes sense.

      My first computer initiation was using Fortran (Fortran-II) on an IBM 1401 while I was still in junior high school. My first formal course in programming used SWIFT, BASIC and SNOBOL, over the course of a summer while in high school. Virtually every course in college I took (I was not a CS major, but could/should have been from my transcript) was in Fortran (plus a pair of assembly language courses) because the choices were Fortran, Cobol or assembly. Imagine, if you will, not having a "modern" language around, and having to code decent I/O or even decent APIs with that choice.

      --
      Never ascribe to malice that which can adequately be explained by tenure.
    2. Re:It will be a very difficult project by bwcbwc · · Score: 2, Interesting

      Items created/published by the U.S. government are in the public domain at least in the U.S. I'm not sure if the rights are granted abroad as well.

      However, items created under a contract to the U.S. government may or may not be in the public domain. There's a section of US law that companies have to invoke in their contract or the software license regarding U.S. government "limited rights" to keep their code or other work private. On the other hand, NIST DATA shouldn't be copyrightable in any case, although companies still like to test the theory that their databases are copyrighted fairly frequently.

      I am not a lawyer. Even if I were a lawyer, I'm not YOUR lawyer. The above discussion is mostly intended to point out that you're more likely to get a better deal for data out of the government than out of a private company. If you need pre-built software from a private firm, you're likely to be tied into a license agreement that affirms copyright over the data or its storage format (or both), which could be a source of disagreement with the software vendor if you decide to follow your plan of extracting the data for your own use.

      --
      We are the 198 proof..
  10. Free Software vs. Genuiness of Data by modrzej · · Score: 4, Informative

    People using this NIST data do it because it has the NIST name on it, so they don't risk depending on tabulated values from a source that has not been exhaustively verified. If you're rewriting the source code, you should take care to establish a means by which users can check that the data are unaltered with respect to what the NIST servers contain. If you work for a renowned institute, that should be easy: just store the database on your server and sync it with NIST, along with the sources of data cited on the NIST website.
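
    One possible way to let users check that a redistributed data file is unaltered is to publish a cryptographic digest alongside it. The sketch below uses SHA-256 from Python's standard library. The function names are invented for illustration, and it is an assumption that a trusted reference digest is available from somewhere; the thread does not say that NIST publishes one.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path, expected_hex_digest):
    """Compare a local file against a published reference digest."""
    return sha256_of_file(path) == expected_hex_digest.lower()
```

    A digest only shows that the bytes match the reference copy; it says nothing about whether the reference values themselves are correct.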

    When it comes to Fortran programming, it's an optimal language for scientific computing. Modern dialects have some of the power of C (allocatable arrays, long subroutine names, free format code, modules, interoperability with C), but, what is preferable in scientific computing, the programmer isn't encouraged to tinker with machine-specific stuff. Many existing codes are written in Fortran, e.g. the powerful LAPACK library and many computational chemistry packages, so for many physicists/chemists/engineers Fortran is the only language they know and care about. Moreover, Fortran in recent years has gained parallel-programming functionality thanks to OpenMP (it's provided with features equivalent to those in C/Cpp).

  11. Don't take anyone's advice here unless... by LunarStudio · · Score: 2, Insightful

    ...they specialize in international copyright law. While a US Citizen may be able to "copy" or "rewrite" code US taxpayers paid for, don't assume other, non-contributing (non tax-paying to the US) foreigners can openly copy and redistribute code that is technically the property of US citizens. Personally, I don't care as your topic is of little interest to me, but unless someone is an attorney/lawyer in these copyrights, I wouldn't listen to anyone here.

  12. Definitely copyrightable by j.+andrew+rogers · · Score: 2, Interesting

    Empirical models of thermodynamic properties are definitely protected by copyright. There is a high-value market for these models, and different models of the same thermodynamic process will evaluate differently so it is a valuable creative product rather than a mere description of reality. For fields where tiny improvements in efficiency generate big cost savings, you want to use the most accurate model available where "most accurate" will be a function of the use case.

    Thermodynamic property models are not measurements of reality, they are mathematical models of a physical process derived from empirical data. They are what you use to predict reality when it is not possible or practical to measure it. Turning the empirical data points into continuous functions is a creative step and the value of the creative step is in minimizing the divergence between the model and reality over as broad a range as possible. There are companies that specialize in producing and selling ultra-accurate thermodynamic property models.
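
    As a rough illustration of turning discrete data points into a continuous function, the sketch below fits the two-parameter form ln P = a + b/T by ordinary least squares. The data points are synthetic, generated from made-up coefficients, and are not measurements; commercial property models use far more elaborate functional forms than this.

```python
import math


def fit_ln_p_vs_inverse_t(temps_k, pressures):
    """Least-squares fit of ln(P) = a + b / T; returns (a, b)."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(p) for p in pressures]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_pressure(a, b, temp_k):
    """Evaluate the fitted model at an arbitrary temperature."""
    return math.exp(a + b / temp_k)


# Synthetic "measurements" generated from made-up coefficients (not real data).
TRUE_A, TRUE_B = 12.0, -4000.0
TEMPS = [300.0, 320.0, 340.0, 360.0, 380.0]
PRESSURES = [math.exp(TRUE_A + TRUE_B / t) for t in TEMPS]
```

    Once `a` and `b` are known, `predict_pressure` answers for temperatures between the data points, which is the step from a table of points to a model.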

  13. Re:Thermodynamics by vlm · · Score: 2, Interesting

    I think that the biggest problem isn't intellectual property, but the people who administer it. I don't think that the demand is particularly great. As such, there isn't a great incentive to release it freely. There are costs to administering such a large DB. Furthermore, nobody wants their name on a database of all the fundamental properties because in that data there are bound to be mistakes.

    You are looking at the liability issue for the creator/admin, the supply side. The bigger liability problem is on the engineer, demand side.

    Something that is missing from this discussion, is some Chemical Engineer specific knowledge that I can attempt to provide. The whole point of a "steam table" and similar products like discussed here, is there is no accurate formula for vapor pressure at various temps. The simplistic linear equations taught in high school don't work at the extremes, or don't give accurate enough results to design a safe and profitable plant. So, more than a century ago, physicist / chemist / engineers started making lab measurements, and selling graphs and tables of data. The modern version of that product is the expensive computer models discussed in the article, which optimistically try to answer any input conditions with correct and continuous answers based on a mixture of theory, optimism, and some distinct individual laboratory measurements.

    Because the data model is used to design multi-million dollar plants, and because the only way to verify the results is very expensive lab work, and is therefore often glossed over, a mistake in the data model could be a multi-million dollar mistake, assuming the losses are purely economic and no human victims. The creator/admin probably was intelligent enough to release under a license that removes all liability for data errors. The end user engineer will not be so lucky.

    On one far extreme of the provability / testability spectrum, you've got yet another word processor, where if the screen doesn't match what you typed in, literally a trained gorilla could figure out the word processor is broken, and act accordingly (throw poo at programmer? The more things change, the more they stay the same) Or maybe a crypto hash where a hundred programmers can write it in a hundred languages and all the outputs better match for a given input.

    At the other extreme of provability / testability, you've got a Chem-E basically having to take the program output on faith that it's correct. The program says the pressure of supercritical steam at 700 K is 230 atm, I know that is somewhat above critical temp and above critical pressure, so the best I can do is "sounds about right to me"? So specify plant components based on a 230 atm environment (adding appropriate safety factors, etc) Now steam pressure is old stuff, boring, and everyone knows about what to expect, but using really weird stuff under really weird conditions, who knows what crazy output from the data model might slip past, resulting in a disaster?

    Despite the dangers, it would be great for education, and cheap experimental research/simulation, even if it would be too legally dangerous to use in formal design work.

    --
    "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
  14. Re:FEED ME by causality · · Score: 4, Insightful
    Yes, I am deliberately replying to a top-level post to ask a question. A question about something that arguably doesn't belong in the summary (editors, anyone?).

    I was wondering whether it would be legal to buy a copy of such a database

    That's a legal question. The answer to that question might seriously complicate your life if you get it wrong. What would possess a person to ask this of Slashdot instead of contacting a lawyer? Better yet, why would a German expect a USA-based Web site to be familiar with the nuances of German (or EU) copyright law? I'm trying to picture a situation where I'd contact a German online forum to ask for legal advice pertaining to American law and I just can't come up with anything.

    I suppose next we'll see an Ask Slashdot which says "hi, I'm a diabetic and I forgot how much insulin I am supposed to inject myself with, please advise." And I'll have to scroll down significantly to see a partly-buried comment where someone finally suggests that perhaps he should be asking a doctor...

    --
    It is a miracle that curiosity survives formal education. - Einstein
  15. Be very very careful by Fallen+Andy · · Score: 2, Insightful
    Three words: Don't do it. Here's a *real life* story as to why. Once upon a time (ok, about 13-14 years ago) there was a large Greek software company that wanted to make a property tax program. The problem was that they didn't have the data. Yours truly got to reverse engineer a competitor's database. Yes, I extracted all of their pathetically encrypted DB (substitution cipher WTF?). Now, if you know anything about databases or mailing lists or even log tables, you know that there are often deliberately false entries so that it's easy to know your data is being ripped (a bit earlier in time I caught out a Cypriot company ripping off the English-Greek dictionary data I'd been involved in that way).

    I warned the project manager that sure go ahead and use the data as a basis for programming but not for the production program.

    A couple of months later, the competitor's lawyers appeared and (cough) out of court (cough) settlement.

    Never did find out how much it cost "my" software house...

    In the end they had to employ a gaggle of impoverished undergrads to build their own DB.

    So, be very very careful. It might be a good idea to *ask* if you can re-use the data - often it's possible for non commercial purposes...

    Andy

  16. Re:FEED ME by mabhatter654 · · Score: 2, Insightful

    slashdot knows more than most IP lawyers. You have to remember, lawyering is essentially about either telling you what case law precedent has established (like don't rob banks), or arguing for your side, not about "truth" or "right and wrong". Most lawyers are more the type that will take the money, then figure out how to argue the way you want... they don't do good with "advice" not tied to rulings in court. Ask the right question and you'll pay the lawyers a bunch less money... nobody wants THAT!

    In regards to the question, he's looking to pull government funded data out of a program. Considering most countries in Europe allow the state to charge for everything it can and that they have "database aggregation" "copyrights", his plan would probably get him sued.

    So now that most of slashdot would agree on that outcome.... what other resources are available to obtain his desired outcome. This is where the slashdot crowd helps because they're in different countries and chances are pretty good somebody will know what government or research office he should talk to... There are still huge chunks of government services that aren't documented anywhere on the internet. Corporations that know who has the info freely available love to keep their sources opaque so the industry has to go thru them to get free information, and in many cases that means "handshake" deals so key offices never have time to get their web pages posted properly. You'd be surprised how many government services are still known only by the posting outside the office door (in the basement just above the beware of leopard sign) or maybe the phone book if you're lucky.

  17. Not even close by pestie · · Score: 2, Insightful

    I don't know if you're trolling or just grossly misinformed, but that's not even close to correct. Lyrics are the copyrighted creative works of the person/people who wrote them. "Changing one word" does not allow someone else to then distribute the lyrics legally. That would be considered a "derivative work," the creation of which is a right provided to copyright holders under copyright law.

  18. What database? by ephraimhorse · · Score: 3, Informative

    All the formulations for the prediction of water properties are published by the International Association for the Properties of Water and Steam (IAPWS) so that they can be used. It is often important to use exactly the same correlations so that all the thermodynamic data are self-consistent. Therefore the formulations are standardized by an international body. I do not think that the use of these formulae is in any way restricted because such restriction would defeat the very purpose - standardization. See http://www.iapws.org/ for the collection of the current formulations. There may be a restriction for a particular implementation (computer program) or sets of tables ( "lookup tables" and interpolation are often used for performance). Not sure. Hope someone starts an extension to implement these kind of things in Gnumeric. Cheers. Ephraim the horse.
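
    The lookup-table-plus-interpolation idea mentioned above can be sketched as follows. The tabulated function here is an arbitrary stand-in (`math.exp`), not an IAPWS formulation, and the class name and grid are invented for illustration.

```python
import bisect
import math


class LookupTable:
    """Precompute a function on a uniform grid and answer queries by linear interpolation."""

    def __init__(self, func, x_min, x_max, n_points):
        step = (x_max - x_min) / (n_points - 1)
        self.xs = [x_min + i * step for i in range(n_points)]
        self.ys = [func(x) for x in self.xs]

    def __call__(self, x):
        if not self.xs[0] <= x <= self.xs[-1]:
            raise ValueError("x is outside the tabulated range")
        # Index of the right-hand grid point of the bracketing interval.
        hi = bisect.bisect_left(self.xs, x)
        if hi == 0:
            return self.ys[0]
        lo = hi - 1
        frac = (x - self.xs[lo]) / (self.xs[hi] - self.xs[lo])
        return self.ys[lo] + frac * (self.ys[hi] - self.ys[lo])


# Stand-in for an expensive correlation: tabulate exp(x) on [0, 1].
table = LookupTable(math.exp, 0.0, 1.0, 1001)
```

    The expensive function is evaluated only while building the table; each later query costs a binary search and one interpolation, with an error that shrinks as the grid gets finer.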