I agree with you: decentralized is fine; and decentralized + PKI would be even nicer security wise. And as a patient, I'd trust it over a central system for all the reasons mentioned elsewhere in this discussion.
My main point was that while PKI is optional for decentralized PHR, in order to develop a centralized PHR system like Google Health, you pretty much *have* to have PKI before the doctors will use your system. The lack of trust is a design flaw that, somehow, I don't think any of the centralized PHR developers have even realized they have, much less that PKI would fix it... otherwise they'd be hawking it at the forefront of their advertisements to doctors. I'm not really sure how they missed the trust issue, because it's the first thing the doctors I work with mentioned after they heard about Google Health.
BTW, those are some nice links regarding PKI, thanks for them! Going to have to look into how I can put that stuff to use.
It occurs to me I used a bunch of industry specific acronyms in the above post; let me define 'em...
PHR - patient health records
PHI - protected health information - mostly equivalent to PHR, but sometimes including private doctor-to-doctor discussions (such as a patient's drug seeking habits)
EMR - electronic medical records - "EMR" software as a class is basically the electronic equivalent of the wall of paper charts in your doctor's office. Most PHR exchange will happen between these types of systems, or be printed out, edited, and faxed (sometimes to another EMR).
credentialling / credentials management - tracking of doctor licenses, certifications, etc... this stuff is personal information about the doctors (ssn, etc) that's flying around between their office, the govt, and insurance companies.
NPI - National Provider Identifier - the ID, and the government registry behind it, covering the public parts of a doctor's credentials; it's trying to unify and replace all the others that are out there (UPIN, Medicaid, Medicare, DEA). It's in use, but the information is frequently years out of date, even with the best intent of all involved.
I work for a company that produces various types of medical records management software (credentials management, PHI document exchange, EMR); and I've spent a lot of time talking to a number of doctors, both tech-savvy and not so much. That disclaimed...
Let me tell you what the key problem is with electronic medical records: they are legally the property of the patient, but no doctor can (or will) trust the important details of such records unless they come from another doctor, and have a verifiable history leading back to that doctor. Not that they don't believe the part that lists a patient's allergies, but when the medical record says the patient has a debilitating disease which *requires* they be given morphine and lots of it, the doctor has to be able to verify the patient didn't just fake a record for a quick drug fix.
This leads to an interesting state electronically: if data records are to be centralized, a public key system must be set up, tied to each doctor, allowing them to both contribute & authenticate records, and allowing the patient to do the same (but the patient contributions will have to remain "untrusted" medically). You can have centralization without a public key system, but then you're just trusting the gatekeeper to never mess up, get hacked, or get paid off. And even if you set up such a system, which you know (as a programmer/cryptographer) can be made to work... you have to get the doctors to trust it as well; and given how seriously most of them take the responsibility to safeguard their patients' records, that's a hard sell even to a tech-savvy doctor.
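To make the "contribute & authenticate" idea concrete, here's a rough sketch of a doctor signing a record so anyone holding the doctor's public key can check it. It is only an illustration: it uses a Lamport one-time signature because that can be built from the standard library alone, whereas a real deployment would presumably use RSA/ECDSA keys tied to certificates. All the names (generate_keypair, sign, verify, the sample record) are made up for the example, and each Lamport key can safely sign only ONE record.

```python
import hashlib
import os


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(os.urandom(32), os.urandom(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _bits(record: bytes):
    """The 256 bits of SHA-256(record), most significant bit first."""
    digest = _h(record)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, record: bytes):
    """Reveal one secret per digest bit. One-time use only (hypothetical helper)."""
    return [private[i][bit] for i, bit in enumerate(_bits(record))]


def verify(public, record: bytes, signature) -> bool:
    """Check each revealed secret hashes to the matching public value."""
    if len(signature) != 256:
        return False
    return all(
        _h(secret) == public[i][bit]
        for i, (secret, bit) in enumerate(zip(signature, _bits(record)))
    )


doctor_private, doctor_public = generate_keypair()
patient_private, _patient_public = generate_keypair()

record = b"patient 1234: prescribed morphine, 10mg"  # made-up sample record
signature = sign(doctor_private, record)

print(verify(doctor_public, record, signature))                          # doctor-signed record
print(verify(doctor_public, b"patient 1234: morphine, 100mg", signature))  # edited afterwards
print(verify(doctor_public, record, sign(patient_private, record)))      # signed by the patient
```

Only the first check should pass: an edited record or a record signed with the patient's own key fails against the doctor's public key, which is the "untrusted medically" distinction in a nutshell.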
Which is why the only major movement we've had in adoption of electronic records has been a decentralized one... doctors are converting their offices to use electronic systems internally, and to exchange information electronically; but records are always transmitted in a p2p fashion (whether by email, fax, courier, etc), allowing the receiving doctor to trust the veracity of the information (at least as far as they trust the originating doctor) without requiring them to trust the patient.
Google Health is merely one of the most prominent "my PHR online" projects out there, but the problem they are faced with solving is not merely legal or luddite based, but an issue of cryptographic trust in its truest sense.
And that's not to mention that centralization of medical records creates a much more attractive point of failure for all kinds of things (such as identity theft, if merely for the purposes of using someone else's insurance), and even if a public key system is implemented, the doctor (and staff) are handing off part of their trust to a central database... and given the mess of outdated information the NPI registry contains, they are loath to believe in such a system.
disclaimer: my company has a number of ongoing projects in this field, but as far as I know my assessment here is pretty well unbiased, architecture and adoption-wise: we have a number of pokers in the fire fitting most of the above scenarios.
Or, failing that, measure time in 0.125 (1/8th) second increments instead of 0.1, and then it will align with the binary floating point representation. Voila, no error.
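A quick way to see the difference, as a small sketch (the accumulate helper is just a made-up stand-in for whatever tick counter is being used):

```python
def accumulate(step: float, count: int) -> float:
    """Add `step` to a running total `count` times, like a tick counter."""
    total = 0.0
    for _ in range(count):
        total += step
    return total


tenths = accumulate(0.1, 10)     # ten ticks of 0.1 s
eighths = accumulate(0.125, 8)   # eight ticks of 1/8 s

print(tenths == 1.0)    # False: 0.1 has no exact binary representation
print(eighths == 1.0)   # True: 0.125 is exactly 2**-3
```

Any step that is a power of two (1/2, 1/4, 1/8, ...) behaves like the second case, as long as the total stays within the precision of a double.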
I'm well aware that OAEP was designed for asymmetric ciphers... but I try not to be straitjacketed by the on-the-box labelling of an algorithm, and keep my mind on what the algorithm actually does. OAEP is basically just a general principle for armoring a block of data to foil partial recovery unless you can decrypt a large % of it.
I agree, OAEP doesn't do _anything_ to fix the weakness in the key-schedule, but I was under the impression this attack required not just related keys, but related plaintext, and needed the similarity of plaintext in order to finesse the original key's bits out of the schedule. Because of that, doing something like OAEP to scramble the plaintext would seriously hamper this exploit, since the attacker effectively wouldn't know anything about the masked bits passed into AES. It wasn't meant to be a fix, but more of a suggestion of a stop-gap for those of us who will continue to work with legacy AES systems, and need any extra security we can throw in.
[OTOH, if I mis-read, and the exploit doesn't rely on any knowledge of the plaintext, or even similar plaintext, OAEP wouldn't be applicable]
Re:There is no such thing as ten-round AES-256 (on "Another New AES Attack" · Score: 4, Informative)
Another (somewhat less-well known) thing that can be done is to use OAEP+ (http://en.wikipedia.org/wiki/Optimal_Asymmetric_Encryption_Padding) to encrypt the datablocks that you're transmitting. The link is to OAEP but OAEP+ is probably what you'd want to use with AES... I don't have a link handy, and the basic principle of the two is the same...
The OAEP algorithm scrambles your data chunks by XORing your plaintext with randomly generated bits, but done in a way that's recoverable IF and ONLY IF you have the entire ciphertext decoded (designed for RSA, but can apply to AES). This means that the same key+plaintext will always result in different ciphertext, and also means that in order to get any useful bits of key/plaintext information, the attacker must get them all, or they're just guessing as to which set of random bits OAEP used (and it generally puts 128 bits worth in).
While the actual OAEP protocol is a block-level action, and the safe version adds 128 bits of randomness (and thus size), the general idea can be modified to be as cheap or expensive as you want... the idea in general makes many asymmetric ciphers MUCH more secure.
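For the curious, the masking trick looks roughly like this. It's a stripped-down sketch of the OAEP idea, NOT the real OAEP or OAEP+ encoding (no padding, no integrity check); the function names and the SHA-256 based mask generator are my own choices for illustration. The output of oaep_like_wrap is what you'd then hand to the cipher.

```python
import hashlib
import os

SEED_LEN = 16  # 128 bits of fresh randomness per message


def _mask(label: bytes, seed: bytes, length: int) -> bytes:
    """Stretch `seed` into `length` pseudo-random bytes (counter-mode SHA-256)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(label + seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def oaep_like_wrap(message: bytes) -> bytes:
    """Scramble `message` so it can only be recovered from the WHOLE output."""
    seed = os.urandom(SEED_LEN)
    masked_msg = _xor(message, _mask(b"G", seed, len(message)))
    masked_seed = _xor(seed, _mask(b"H", masked_msg, SEED_LEN))
    return masked_msg + masked_seed


def oaep_like_unwrap(blob: bytes) -> bytes:
    """Undo oaep_like_wrap: recover the seed first, then the message."""
    masked_msg, masked_seed = blob[:-SEED_LEN], blob[-SEED_LEN:]
    seed = _xor(masked_seed, _mask(b"H", masked_msg, SEED_LEN))
    return _xor(masked_msg, _mask(b"G", seed, len(masked_msg)))


first = oaep_like_wrap(b"attack at dawn")
second = oaep_like_wrap(b"attack at dawn")
print(first != second)                               # same plaintext, different bytes
print(oaep_like_unwrap(first) == b"attack at dawn")  # round trip works
```

Recovering the seed needs every byte of the masked message, and recovering the message needs the seed, which is where the all-or-nothing property comes from.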
Those brute-forcing apps are great for whittling down the number of possibilities.
But there's more than just simple login delays in your way... there's the password hashing algorithm being used to encrypt the password for storage. The two leading algorithms right now are BCrypt and SHA512-Crypt. Both of these algorithms have the facility to increase the number of "rounds" of encryption that's applied to your password when generating the hash. What does this mean? As computers get more powerful (and/or as you need more security), you can up the number of rounds required to encrypt your password, so that it reliably takes a constant amount of time to verify it.
If you pick enough rounds that it takes 1 second for the system to encrypt/verify your password, you won't notice much of a delay when logging in. But consider the worst-case scenario where the attacker has a copy of your /etc/shadow file: Barring parallelization, he's limited to trying 1 password per second, simply because of the complexity of the calculation you're requiring him to perform. At that rate, trying all 3 letter (lowercase) combinations would take him nearly 5 hours, and all 6 letter combinations nearly 10 years. Mind you, those numbers are before any whittling away of known subsets is performed. But given that, you can always up the number of rounds even more to re-balance things. Some high security systems I've set up take around 5 seconds on a quad-core system just to verify the password!
Parallelization will help, of course, but even if your attacker has 128 cores to work with, those 10 years will still take him about a month. And if you have something worth an attacker spending _that_ much time and resources on, let's hope a password is not the only thing standing in his way.
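The back-of-the-envelope math, for anyone who wants to plug in their own numbers. This is a sketch that assumes a 26-letter lowercase alphabet, passwords of exactly the given length, and perfect scaling across cores; the function name is made up.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def brute_force_seconds(alphabet_size, length, seconds_per_guess=1.0, cores=1):
    """Worst-case time to try every password of exactly `length` characters."""
    return alphabet_size ** length * seconds_per_guess / cores


three_letters = brute_force_seconds(26, 3)
six_letters = brute_force_seconds(26, 6)
six_letters_parallel = brute_force_seconds(26, 6, cores=128)

print(f"{three_letters / 3600:.1f} hours")                  # 4.9 hours
print(f"{six_letters / SECONDS_PER_YEAR:.1f} years")         # 9.8 years
print(f"{six_letters_parallel / SECONDS_PER_DAY:.1f} days")  # 27.9 days
```

Bump seconds_per_guess to 5 (the high security setup mentioned above) and every figure scales by 5.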
[re: windows, I don't know windows password hash algorithms at all. I'd love a pointer to some resources though]
I agree... it just plain scares me that so many large systems don't even bother with such trivial precautions as hashing. It's even more trivial than sql injections. Up until it happened, I would have _never_ guessed myspace & phpbb stored plaintext. It seems borderline incompetent.
I've implemented tons of little one-off account systems, for websites small enough they'll probably never even see a hacker. But before I even implemented the first one, I went through the trouble of finding the best password hash algorithm I could (http://people.redhat.com/drepper/SHA-crypt.txt)
Sure, I've had customers ask "why can't it just email me my password when I forget?" But you know what? Just a few minutes of quick explanation, and even people with NO math or cs background can understand why it's important.
So for the love of the gods, people, please take an hour out of your time to put in a hash alg (even md5-crypt is better than nothing)... it's just not that hard.
---
Just to go off on a rant here... I've also noticed in some web applications there is the tendency to just pick a hash alg at random. Be warned: not all hash algorithms are created equal.
"Checksum" algorithms such as CRC32 are woefully insufficient: easy to reverse (for small strings), easy to find collisions. They're basically just one guessable step away from plaintext.
"Integrity" algorithms such as MD5 & SHA are a little better, since they're very hard to reverse, and difficult to find collisions. The problem with using these types of hashes directly is that they will always hash a password to the _same_ string. While that's desirable for their purposes (file integrity, etc), that's not good at all for passwords: you can pre-build a table of known mappings beforehand, and use it to quickly guess many passwords in parallel (aka a rainbow table): Given a table of 10k user passwords hashed like this, and a pre-built table, the odds are very good you'll get a significant number of the passwords in a very short amount of time.
This is why a proper "Password" hash (eg bcrypt, md5-crypt, sha-crypt) includes a "salt" which is randomly generated each time the password is set (and not just the first time). This prevents the rainbow attacks which are possible on plain integrity hashes. But prepending (or appending) the salt is not enough, because its effect can be undone mathematically, at least enough so that it presents no real additional barrier.
Genuine password hashes, while using an integrity hash as their basis, mix & blend the password and the salt in so many variable ways as to make this reversal impossible. And there are so many nuances here that _you should not roll your own_ (unless you're Bruce Schneier). Read bcrypt, sha-crypt or md5-crypt's specs for some details.
Note: don't use the old unix-crypt. While it is a password hash in the strict sense, it's so old and simple that it's barely stronger than crc32.
Note: sha-crypt adds additional flexibility via its "rounds" system, allowing it to easily grow more complicated as computers grow more powerful. This is why I prefer it above all the others.
End rant: all this is why you should use sha-crypt or md5-crypt, and nothing lesser.
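To show the shape of it (per-password random salt, tunable rounds, constant-time compare), here's a minimal sketch. Big caveat: it uses PBKDF2 from Python's standard hashlib as a stand-in, NOT sha-crypt or bcrypt themselves, and the storage format and function names are invented for the example. In production, use a maintained library for whichever scheme you pick.

```python
import hashlib
import hmac
import os


def hash_password(password: str, rounds: int = 100_000) -> str:
    """Hash with a fresh random salt; returns 'scheme$rounds$salt$digest'."""
    salt = os.urandom(16)  # regenerated every time the password is set
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)
    return f"pbkdf2-sha256${rounds}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash using the stored salt and rounds, then compare."""
    _scheme, rounds, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(rounds)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


stored_a = hash_password("hunter2")
stored_b = hash_password("hunter2")

print(stored_a != stored_b)                   # same password, different salt and hash
print(verify_password("hunter2", stored_a))   # correct password
print(verify_password("hunter3", stored_a))   # wrong password
```

Because the rounds count is stored alongside the hash, you can raise the default later without breaking existing accounts; old hashes still verify with their own rounds value.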
I can't recommend Eric enough... Since it's PyQt based, as of Eric v4 it's seamlessly cross-platform... and integrates really nicely w/ a number of version control systems... I've tried a number of IDEs, and it's the best IMHO.
Mind you, a coworker of mine swears by SPE, so take that for what you will :)
Just to run down the three I'm familiar with:
Eric4: cross platform; qt4+qscintilla based; great editor; ok class browser; good vcs & project management; good debugger; poor command completion; handles lots of filetypes (c++, js, ruby, python, etc). Command completion & class browser are my main complaints w/ this program.
SPE: cross platform; wxwindows based; great editor; excellent class browser; no vcs or project management; debugger?; good command completion; ONLY DOES PYTHON... uses os.startfile() for most other filetypes. Not supporting any other filetypes, or project files, are my main complaints w/ this program.
WingIDE: cross platform; gtk based; great editor; great class browser; not quite as familiar w/ the rest of it because it's semi costly, and I'm a cheap bastard. Main complaint here: just the money :) Oh... and its command completion (while probably the best I've seen) uses introspection to a large degree, so get ready to have your modules imported all the time.
All 3 are under very active development, and written in python directly, so you can hack them to your needs... Eric4 even has a plugin system.
I'd add that correlation usually implies that there is some common cause which is a necessary condition of all the correlated events, even if it is not sufficient to cause all of them by itself.
People frequently lose sight of the fact that all "correlation != causation" is meant to indicate is that the common cause of correlated events is not required to be one of the events themselves, but can be some other external event.
Whether the cause is bias in the measurement, direct/indirect causation, some remotely connected common causation, or whatever... Correlation hardly _ever_ is simply coincidence.
My company currently maintains two large python applications (40-50k LOC, not including custom support libraries). One of them is 6 years old, the other 2 years old. On our development team, we have familiarity with Java, C, C++, Perl, and the consensus is that we've had less maintenance work under python than the other languages.
If you add in the time spent on prototyping and testing, python has saved us way more time and effort.
Regarding type checking and reliability... You need to read up on the idea of "duck typing". Python's philosophy is that actual types get in the way; it's protocols and interfaces that matter. And after 7 years of python programming, I'd have to agree with that philosophy.
All of the critical parts of our apps have had unittests written (which catch semantic glitches at a level typechecking never will)... We've actually spent some time _removing_ type checking from the system, and replacing those lines with hasattr(obj, "append") calls, or creating synthetic protocol tests, allowing our implementations of various inputs to be widely varying: Jython implementations, CPython objects, who cares, they all LOOK the same.
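A toy version of what that replacement looks like (collect and AuditLog are made-up names for illustration, not anything from our codebase):

```python
from collections import deque


def collect(sink, items):
    """Append every item to `sink`; any object with an append() method will do."""
    # Protocol check instead of isinstance(sink, list)
    if not hasattr(sink, "append"):
        raise TypeError("sink must provide an append() method")
    for item in items:
        sink.append(item)
    return sink


class AuditLog:
    """Not a list at all, but it quacks like one where it matters."""

    def __init__(self):
        self.lines = []

    def append(self, item):
        self.lines.append(f"received {item!r}")


print(collect([], [1, 2]))             # [1, 2]
print(collect(deque(), [1, 2]))        # deque([1, 2])
print(collect(AuditLog(), [1]).lines)  # ['received 1']
```

An isinstance(sink, list) check would have rejected the deque and the AuditLog, even though both work perfectly well.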
I'm sorry, what? Of course there's an objective view.
I agree with you re: the biasing tendencies/constraints inherently present in human physiology, as well as human (group) psychology.
But of _course_ there's an objective viewpoint. It's the viewpoint which accurately reflects all things which can be accurately measured. Most things which seem to be subjectively reported are mainly so because no metric exists or is being used which can measure the objectivity.
Math, most glaring of all, is objective. No one can have multiple correct opinions on what 2+2 equals... sure the symbols could be redefined, but semantically, there is but a single answer, unchanging.
But it doesn't stop there. Look up probabilistic primality tests on wikipedia: http://en.wikipedia.org/wiki/Primality_test Primality is an objective truth, which these tests are able to arrive at despite their innate bias & inaccuracy.
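For a concrete picture, here's a sketch of one such test, Miller-Rabin (my pick as an example; the function name and the default of 20 rounds are arbitrary). Each round uses a random witness and can be fooled by a composite at most a quarter of the time, so the chance of a wrong "prime" answer shrinks geometrically with the number of rounds.

```python
import random

SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin: False is always right, True is right with high probability."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random witness candidate
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # `a` proves n is composite
    return True


print(is_probable_prime(2**61 - 1))  # a known Mersenne prime
print(is_probable_prime(41 * 43))    # composite with no small factors
```

Every individual round is a biased, unreliable measurement, yet piling them up converges on the one objectively correct answer.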
So what does that have to do with people? Most people's opinions and beliefs (and we've all got lots) deal with subjects which are hard to measure, especially with regards to accuracy: Politics for one; but opinions on the worth of programming topics too, which is near and dear to everyone here.
It's not our neurological fallibility that does us in: some "objective" truths are _defined_ by human consensus: did I win the lottery this month? I can believe "yes" all I want, but unless my state's gaming commission believes as I do, it just isn't true.
But their belief defines the truth, and when looking for objective viewpoints, you have to identify all the subjectively-defined truths. Driving on the left vs the right is another... there's no universal truth there, but certain groups of people have to agree, as a matter of protocol (Native language is yet another).
The other statements of fact are just objective truths we aren't able or willing to measure. Programming is one such... but because it's such a young branch of human knowledge (100 years worth of work and counting), it's one of the "known unknowns": like other types of engineering, we know there _is_ a best way to do things, we just haven't developed very good guiding principles to measure it yet.
But the only way to get there is to seek out objectivity, since it could be present anywhere. So read the book over, and maybe this guy's wrong on a number of accounts, and you'll notice them. So spread the word, blog/comment/whatever about _why_ he's wrong.
If your metric is accurate, it will be borne out as it is applied to other situations, and by other people. We'll build up better and better guiding principles out of these meta-measurements. Eventually, we'll see more and more programming topics which _can_ be accurately and objectively measured, in much the same way probabilistic primality tests wade through their biases to arrive at the correct answer.
Not believing in objective viewpoints is giving up the search for the truth, and that would be a shame, because that's one of the grandest of all of humanity's endeavours.
Regarding programming, we do have one great truth already:
Given the nature of the human mind, GOTO Considered Harmful. Great for processors to think in, but bad for us:)
Much as I like SQLite, I agree with you completely... metakit is definitely what he wants.
For those not familiar with metakit, it straddles a very interesting (to me at least) line between "some csv files i threw together" and "full on sql db" in terms of its storage and query semantics, while at the same time being a flat file like SQLite.
And it really deserves to be better known from a theoretical standpoint, in how it structures the data... especially some of the more interesting applications it's been put to, such as a graph-theory style database written as a layer on top of metakit (can't remember the name).
Also, it's bound to a number of languages (python and tcl are the only ones i've played with).
Perhaps if there weren't SO MANY responses saying "SQLite" over and over, your response would have gotten the informative rating it needs.
Typical "air conditioner" situation: you want to make the inside of a room cooler than the outside temperature. Since the room starts out similar in temp to the outside, you have to spend energy pushing heat "uphill" to an increasingly warmer outside. Making heat flow against the direction it would normally flow, that's a cooler in the thermodynamic sense.
In the CPU situation, you want to make the inside of the cpu EQUAL to the outside temperature. Since the running CPU starts out way warmer than the outside temp, the heat will flow naturally on its own "downhill" to the outside. Any sort of cooling system merely hastens the flow.
In this situation, any device like a fan, etc is merely a more efficient radiator... as the temp of the cpu gets closer to the outside, this device loses efficiency... and in no case could it get the cpu any _colder_ than the outside.
Being able to do that is what makes something a "cooler" in the physics sense.
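A toy model makes the point. This is a sketch with invented numbers, assuming heat leaves the chip in proportion to the temperature difference (Newton's law of cooling) and that a better fan/heatsink just means a larger conductance.

```python
def simulate_cpu_temp(ambient_c, heat_watts, conductance_w_per_c,
                      start_c=None, heat_capacity_j_per_c=50.0,
                      seconds=600, dt=0.1):
    """Euler-step a lumped thermal model and return the final temperature in C."""
    temp = ambient_c if start_c is None else start_c
    for _ in range(int(seconds / dt)):
        # Heat generated minus heat flowing "downhill" to the room
        net_watts = heat_watts - conductance_w_per_c * (temp - ambient_c)
        temp += net_watts * dt / heat_capacity_j_per_c
    return temp


AMBIENT = 25.0
weak_fan = simulate_cpu_temp(AMBIENT, heat_watts=50.0, conductance_w_per_c=1.0)
strong_fan = simulate_cpu_temp(AMBIENT, heat_watts=50.0, conductance_w_per_c=5.0)
powered_off = simulate_cpu_temp(AMBIENT, heat_watts=0.0,
                                conductance_w_per_c=5.0, start_c=80.0)

print(weak_fan > strong_fan > AMBIENT)  # better fan = closer to ambient, never below
print(powered_off >= AMBIENT)           # even with no load, it only settles AT ambient
```

The running chip settles at ambient + watts / conductance, so no finite fan gets it down to room temperature, let alone below it. Getting below takes something that pumps heat uphill.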
Indeed. You should read up on the changes they're planning to make. Many of them are backwards-incompatible changes that will remove nasty warts, realign certain parts of python's object structure (str vs unicode, int vs long) in ways that are MUCH cleaner, MUCH harder to make mistakes with, etc.
This is not some Larry Wall day dream revision... they aren't rethinking python from the ground up, it's an evolutionary revision which is keeping the python 2.x codebase, but changing some things they've been wanting to do for a long long time. The 1.x -> 2.x transition is an apt comparison... the code will look almost the same, only cleaner :)
If that's not good enough for anyone, automated tools are being developed to convert python 2.6 -> 3.0. Yes, that's right, Python 2.6. It'll be the last version of 2.x released, and be carefully designed to be both compatible w/ 2.x, and still able to warn you about gotchas that would appear if you ran the code under 3.0.
Not only that, but 3.0 isn't projected to be in mainstream use for at least 2 more years.
There's absolutely no need for the sensationalist panic this slashdot headline announced itself with. Rethinking a complex system from the ground up is always a bad idea... but GVR knows that, and that's not what's happening here.
It's more akin to the linux kernel moving from 2.4 -> 2.6. Incompatible, yes, but a much better architecture.
Maybe you were being sarcastic. If not... I don't think you quite get what being in the "Information Age" is about. It's not just some business buzzword: With every advancing age (Stone, Bronze, Iron, Industrial, probably forgot one), a new basis for technology became dominant.
It wasn't because everyone thought the new bronze tools were "neat". It was that the people who used them were more efficient at what they did... bronze users could plow soil better, all kinds of things... they got more resources (energy, food, etc) than those who didn't. So they prospered.
The reason it's called the Information Age is that right now, information is the most powerful tool we have. Show me an oil company that can get oil w/o complex computer analysis of satellite imagery of radar scans of a potential oil field. Show me a farmer who's not at the very least doing cost/benefit analysis of fertilizers, crop markets, etc. Anyone who doesn't goes out of business, because everyone who does, does better because of their control of information.
If, in the long run, we're going to hit an energy crunch, everyone is royally screwed... thanks to modern techniques, we're supporting WAY more people than the industrial age could ever have supported... keeping america's food supply chain going at the speed it runs requires sophisticated inventory tracking... those IT jobs aren't going away.
Sure, maybe we'll have a recession, energy crisis, etc. If so, I know what I'll do to survive... walk into some Mom & Pop grocery store, and tell them that (for dirt cheap, recession and all), I'll sell 'em some custom accounting software, on spec, guaranteed to improve profits, or they don't have to pay me.
Some may turn me away... but the ones who don't, will know what their loss leaders are, be able to analyze the cyclic demands for various foods... and while those around them are guessing at these things, they'll have the information, and survive. And so will I.
Pixel pushers? Bah. We're mechanics, crafting the flow of information, to make other businesses run better. That'll never go out of style, for the same reason a guy with a gun will always win the knife fight.
It wasn't that they wouldn't spend $20 for a dvd drive. It's that they wouldn't spend the extra 5.25" drive bay space and cabling for something that's only needed once in a while for os-installation. And when you're trying to make a small low power device, that's at a premium.
For that once-in-a-while need to reinstall the os, there's certainly no need to go to the extreme of sending it to the factory. My company uses a lot of small linux appliances like these (esp for firewalls) and I keep an external usb-cdrom on hand... use it to (re)install the os, and that's the only time it's needed. Rest of the time it would be wasted space. And I only had to pay for 1 drive, to use on ALL the systems.
So after 100 of these, that $20 would add up for me.
Regarding Thunderbird's handling of HTML & images... If it's actual displaying of html you dislike, there's the menu option "View/Message As/Plain Text".
More importantly though, if it's security you're worried about, by default Thunderbird won't display anything but embedded images... you have to explicitly tell it each time you view an email if you want it to load any referenced images... so there's no security leak there.
as well, it has a similar (on by default) feature of disabling any scripting in the html.
the end effect of all of this is that html in thunderbird is about as dangerous tracking/security wise as the markup language of a slashdot post.
while there's a thread on the subject of opensource fps engines... QFusion is an open source quake3 client, written from scratch.
Just thought i'd post a link to it, cause it's an impressive accomplishment, and the source code is beautiful... and the engine's speed compares favorably to the real Q3 client.
the progression of debian branches... (on "Debian 3.0r2 Released" · Score: 2, Informative)
the 'unstable', 'stable' and 'testing' names are symlinks, each pointing to one of the named debian distributions.
woody is currently the stable version. the stable version will usually have slightly older software, but because it's been tested for a much longer time it's better to use on business servers.
sarge is currently the testing version. it's probably best for workstation/home use. the packages are newer, but not as bug-free. while it could be used in a production environment, stable will always be a safer bet.
as the stable version, woody gets mainly security updates. at some point, sarge will become well tested enough that woody will be retired (like 'potato' before it), and sarge will become the current stable branch.
a new fork will be created at that point, and become the new testing version.
'sid' will always be the unstable branch of debian. you don't want to use 'unstable'. it will almost always have the newest software versions, but they will probably break your system. if you see something you like, download it singly, don't install sid to get it.
in short... get sarge/testing to try out debian. if there's problems, or you want older more tested software, get woody/stable. if all you want is problems, for your own mind to solve, get sid/unstable.
To go from, say, a C language file to an exe, the compiler first loads the C file (ending in .c), and all the files it refers to, and then parses all of it into an internal structure.
this structure is then optimized: loops are unrolled, functions are inlined, and info that is mentioned but isn't needed is stripped out.
the resulting structure is then written out as a series of assembly instructions, which are then converted to the numeric codes the processor understands.
this is the exe.
to go backwards, it's (generally) trivial to take an exe and get a plaintext file containing the assembly instructions (this file usually ends in '.s' or '.asm')
it's the optimization step that causes issues: among the main things the computer doesn't need, and which get stripped out, are variable names, comments, etc. without them, there's no context. you can figure out the algorithm from the assembly, but you can't easily figure out what it's operating on. to make things worse, other optimizations may alter the code for faster execution, making it even harder to figure out.
Occasionally, mistakes are made... Microsoft slipped up a while back, and released a windows patch which had the 'debugging info' left in it. All this really amounts to is the variable names, function names, etc... which is bloody useful.
Making this process even worse is that some (rare) executables are self modifying, which makes them MUCH harder to predict.
in summary, it's not that hard to get back to C code, assuming the program was even written in C. You'd just have variable names like 'var0001', 'var0002' 'func0001', etc.
It's basically the difference between having a nice nested tree structure which you can compartmentalize and analyze, versus one long list of instructions, which the computer may start and stop execution of at any point.. sorta like DNA.
difference is... even if Bush honestly won, without falsifying election results, then he _still_ only won by a small margin. Instead of acting like a candidate who'd squeaked by on the barest of margins, he acts like his views are supported by all Americans, and does whatever-the-fuck he wants to. As a president elected with 50% of the populace behind him, he was not elected to serve only those 50%, but all 100%... to not even pay lip service to the other half is insulting not just to them, but to the process itself.
fine then, but how would you encode that difference in the general case, in abstract, and know that you'd gotten it right for all concrete cases? a person could go on a case by case basis for what they consider valid or not, and even if everyone else in the world agreed with them, it would make no difference...
you still couldn't codify the difference as law, without explicitly stating what precise differences make BO2k invalid... and even once that was done, what if BO2k added just those features, and no more? is it now "legit"?... etc.
and while we're revising _that_ law forever, what about every other class of software product? make/maintain a law for each of them? what about the ones that were missed? should they by default be made illegal, "just in case"?
this is somewhere the law should never go, for regulation of such things is tantamount to creating a thoughtcrime, because all we're talking about are ideas. the law has no place until the idea is coupled with intent; if that intent is to do harm, it will betray itself in the resultant actions, and those are _already_ illegal.
to stop already illegal acts, more laws are not the answer... to quote a crude saying, "it's like fucking for virginity".
even if it isn't soldered on, even if it's an eeprom sitting in a cosy ZIF socket, take a hint of the future from the xbox of the now... modders tried to change its bios, got sued under the DMCA.
the moral of the tale is that it doesn't matter what natural laws, mathematics, the cs industry, the constitution say... the DMCA overrides them all! it says so right on the bill :)
I agree with you: decentralized is fine; and decentralized + PKI would be even nicer security wise. And as a patient, I'd trust it over a central system for all the reasons mentioned elsewhere in this discussion.
My main point was that while PKI is optional for decentralized PHR, in order to develop a centralized PHR system like Google Health, you pretty much *have* to have PKI before the doctors will use your system. The lack of trust is a design flaw which, somehow, I don't think any of the centralized phr developers have even realized that they have, much less that PKI would fix it... otherwise they'd be hawking it at the forefront of their advertisements to doctors. I'm not really sure how they missed the trust issue, because it's the first thing the doctors I work with mentioned after they heard about Google Health.
BTW, those are some nice links regarding PKI, thanks for them! Going to have to look into how I can put that stuff to use.
It occurs to me I used a bunch of industry specific acronyms in the above post; let me define 'em...
PHR - patient health records
PHI - protected health information - mostly equivalent to PHR, but sometimes with private doctor-to-doctor discussions (such as a patient's drug seeking habits)
EMR - electronic medical records - "EMR" software as a class basically is the electronic equivalent of the wall of paper charts in your doctor's office. Most PHR exchange will happen between these types of systems, or be printed out, edited, and faxed (sometimes to another EMR).
credentialling / credentials management - tracking of doctor licenses, certifications, etc... this stuff is personal information about the doctors (ssn, etc) that's flying around between their office, the govt, and insurance companies.
NPI / NPIDB - National Provider Identifier (and its database) - government database of the public parts of a doctor's credentials, which is trying to unify and replace all the others that are out there (UPIN, Medicaid, Medicare, DEA). It's in use, but the information frequently is years out of date, even with the best intent of all involved.
I work for a company that produces various types of medical records management software (credentials management, PHI document exchange, EMR); and I've spent a lot of time talking to a number of doctors, both tech-savvy and not so much. That disclaimed...
Let me tell you what the key problem is with electronic medical records: they are legally the property of the patient, but no doctor can (or will) trust the important details of such records unless they come from another doctor, and have a verifiable history leading back to that doctor. Not that they don't believe the part that lists a patient's allergies, but when the medical record says the patient has a debilitating disease which *requires* they be given morphine and lots of it, the doctor has to be able to verify the patient didn't just fake a record for a quick drug fix.
This leads to an interesting state electronically: if data records are to be centralized, a public key system must be set up, tied to each doctor, allowing them to both contribute & authenticate records, and allowing the patient to do the same (but the patient contributions will have to remain "untrusted" medically). You can have centralization without a public key system, but then you're just trusting the gatekeeper to never mess up, get hacked, or get paid off. And even if you'd set up such a system which you know (as a programmer/cryptographer) can be made to work... you have to get the doctors to trust it as well; and given how seriously most of them take the responsibility to safeguard their patients' records, that's a hard sell even to a tech-savvy doctor.
Which is why the only major movement we've had in adoption of electronic records has been a decentralized one... doctors are converting their offices to use electronic systems internally, exchange information electronically; but always records are transmitted in a p2p fashion (whether by email, fax, courier, etc); allowing the receiving doctor to trust the veracity of the information (at least as far as they trust the originating doctor); without requiring them to trust the patient.
Google Health is merely one of the most prominent "my PHR online" projects out there, but the problem they are faced with solving is not merely legal or luddite based, but an issue of cryptographic trust in its truest sense.
And that's not to mention that centralization of medical records creates a much more attractive point of failure for all kinds of things (such as identity theft, if merely for the purposes of using someone else's insurance), and even if a public key system is implemented, the doctor (and staff) are handing off part of their trust to a central database... and given the mess of outdated information the NPI registry contains, they are loath to believe in such a system.
disclaimer: my company has a number of ongoing projects in this field, but as far as I know my assessment here is pretty well unbiased, architecture and adoption-wise: we have a number of pokers in the fire fitting most of the above scenarios.
Or, failing that, measure time in .125 (1/8th) second increments instead of 0.1, and then it will align with the binary floating point representation. Voila, no error.
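A quick way to see this, as a small Python sketch (the `accumulate` helper is just an illustrative name, not from any library): 0.125 is an exact power of two, so repeated addition stays exact, while 0.1 has no finite binary representation and the rounding error shows up after a few additions.

```python
from fractions import Fraction


def accumulate(step, count):
    """Add `step` to a running float total `count` times."""
    total = 0.0
    for _ in range(count):
        total += step
    return total


# Ten steps of 0.1 do not land exactly on 1.0 ...
print(accumulate(0.1, 10) == 1.0)    # False
# ... but eight steps of 0.125 (1/8) do.
print(accumulate(0.125, 8) == 1.0)   # True

# Fraction() exposes the exact value the float really stores.
print(Fraction(0.1) == Fraction(1, 10))    # False
print(Fraction(0.125) == Fraction(1, 8))   # True
```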
I'm well aware that OAEP was designed for asymmetric ciphers... but I try not to be straight-jacketed by the on-the-box labelling of an algorithm, and keep my mind on what the algorithm actually does. OAEP is basically just a general principle for armoring a block of data so as to foil partial recovery unless you can decrypt a large % of it.
I agree, OAEP doesn't do _anything_ to fix the weakness in the key-schedule, but I was under the impression this attack required not just related keys, but related plaintext, and needed the similarity of plaintext in order to finesse the original key's bits out of the schedule. Because of that, doing something like OAEP to scramble the plaintext would seriously hamper this exploit, since they effectively wouldn't know anything about the masked bits passed into AES. It wasn't meant to be a fix, but more of a suggestion of a stop-gap for those of us who will continue to work with legacy AES systems, and need any extra security we can throw in.
[OTOH, if I mis-read, and the exploit doesn't rely on any knowledge of the plaintext, or even similar plaintext, OAEP wouldn't be applicable]
Another (somewhat less-well known) thing that can be done is to use OAEP+ (http://en.wikipedia.org/wiki/Optimal_Asymmetric_Encryption_Padding) to encrypt the datablocks that you're transmitting. The link is to OAEP but OAEP+ is probably what you'd want to use with AES... I don't have a link handy, and the basic principle of the two is the same...
The OAEP algorithm scrambles your data chunks by XORing your plaintext with randomly generated bits, but done in a way that's recoverable IF and ONLY IF you have the entire ciphertext decoded (designed for RSA, but can apply to AES). This means that the same key+plaintext will always result in different ciphertext, and also means that in order to get any useful bits of key/plaintext information, the attacker must get them all, or they're just guessing as to which set of random bits OAEP used (and it generally puts 128 bits worth in).
While the actual OAEP protocol is a block-level action, and the safe version adds 128 bits of randomness (and thus size), the general idea can be modified to be as cheap or expensive as you want... the idea in general makes many asymmetric ciphers MUCH more secure.
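To make the scrambling idea concrete, here's a rough Python sketch of the two-step XOR structure. It is NOT the real OAEP spec: the names `mask`, `scramble` and `unscramble` are made up for illustration, using SHA-256 in counter mode as the mask function is my own assumption, and it leaves out the zero padding that real OAEP adds as an integrity check. The output of `scramble` is what you'd then hand to the cipher.

```python
import hashlib
import os

SEED_LEN = 16  # 128 bits of fresh randomness per block (assumed size)


def mask(seed: bytes, length: int) -> bytes:
    """Stretch `seed` into `length` pseudo-random bytes (counter-mode SHA-256)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def scramble(message: bytes) -> bytes:
    """Mask the message with random bits, then mask the random bits with the result."""
    r = os.urandom(SEED_LEN)
    x = xor(message, mask(r, len(message)))  # X = m XOR G(r)
    y = xor(r, mask(x, SEED_LEN))            # Y = r XOR H(X)
    return x + y


def unscramble(block: bytes) -> bytes:
    """Undo scramble(); needs ALL of X to recover r, and r to recover the message."""
    x, y = block[:-SEED_LEN], block[-SEED_LEN:]
    r = xor(y, mask(x, SEED_LEN))
    return xor(x, mask(r, len(x)))


msg = b"attack at dawn"
print(unscramble(scramble(msg)) == msg)   # True
print(scramble(msg) == scramble(msg))     # False: fresh randomness every time
```

Note how `unscramble` can't even start without every byte of X, which is the "all or nothing" property described above.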
Those brute-forcing apps are great for whittling down the number of possibilities.
But there's more than just simple login delays in your way... there's the password hashing algorithm being used
to encrypt the password for storage. The two leading algorithms right now are BCrypt and SHA512-Crypt. Both of these algorithms have the facility to increase the number of "rounds" of encryption that's applied to your password when generating the hash.
What does this mean? As computers get more powerful (and/or as you need more security), you can up
the number of rounds required to encrypt your password, so that it reliably takes a constant amount of time
to verify it.
If you pick enough rounds that it takes 1 second for the system to encrypt/verify your password,
you won't notice much of a delay when logging in. But consider the worst-case scenario where the attacker
has a copy of your /etc/shadow file: Barring parallelization, he's limited to trying 1 password per second,
simply because of the complexity of the calculation you're requiring him to perform. At that rate,
trying all 3 letter combinations would take him over 4 hours, all 6 letter combinations would take over 9 years.
Mind you, those numbers are before any whittling away known subsets is performed. But given that,
you can always up the number of rounds even more to re-balance things. Some high security
systems I've set up take around 5 seconds on a quad-core system just to verify the password!
Parallelization will help, of course, but if your attacker has 128 cores to work with, those 9 years
will still take him 1 month. And if you have something worth an attacker spending _that_ much time and resources,
let's hope a password is not the only thing standing in his way.
[re: windows, I don't know windows password hash algorithms at all. I'd love a pointer to some resources though]
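For anyone who wants to check the brute-force estimates above, here's the arithmetic as a small Python sketch. It assumes passwords drawn only from the 26 lowercase letters and an attacker who tries every combination of exactly that length; `seconds_to_exhaust` is just an illustrative helper name.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def seconds_to_exhaust(alphabet_size, length, guesses_per_second=1.0, cores=1):
    """Time to try every password of exactly `length` symbols."""
    return alphabet_size ** length / (guesses_per_second * cores)


three = seconds_to_exhaust(26, 3)
six = seconds_to_exhaust(26, 6)
six_parallel = seconds_to_exhaust(26, 6, cores=128)

print(f"3 letters, 1 core:    {three / SECONDS_PER_HOUR:.1f} hours")    # 4.9 hours
print(f"6 letters, 1 core:    {six / SECONDS_PER_YEAR:.1f} years")      # 9.8 years
print(f"6 letters, 128 cores: {six_parallel / SECONDS_PER_DAY:.1f} days")  # 27.9 days
```

Bump `guesses_per_second` down to 0.2 (a 5 second hash) and the same 6 letter search on 128 cores stretches to several months.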
I agree... it just plain scares me that so many large systems don't even bother with such trivial precautions as hashing. It's even more trivial than sql injections. Up until it happened, I would have _never_ guessed myspace & phpbb stored plaintext. It seems borderline incompetent.
I've implemented tons of little one-off account systems, for websites small enough they'll probably never even see a hacker. But before I even implemented the first one, I went through the trouble of finding the best password hash algorithm I could (http://people.redhat.com/drepper/SHA-crypt.txt)
Sure, I've had customers ask "why can't it just email me my password when I forget?" But you know what? Just a few minutes of quick explanation, and even people with NO math or cs background can understand why it's important.
So for the love of the gods, people, please take an hour out of your time to put in a hash alg (even md5-crypt is better than nothing)... it's just not that hard.
---
Just to go off on a rant here...
I've also noticed in some web applications there is the tendency to just pick a hash alg at random. Be warned: not all hash algorithms are created equal.
"Checksum" algorithms such as CRC32 are woefully insufficient: easy to reverse (for small strings), easy to find collisions. They're basically just one guessable step away from plaintext.
"Integrity" algorithms such as MD5 & SHA are a little better, since they're very hard to reverse, and difficult to find collisions.
The problem with using these types of hashes directly is that they will always hash a password to the _same_ string. While that's desirable for their purposes (file integrity, etc), that's not good at all for passwords: you can pre-build a table of known mappings beforehand, and use it to quickly guess many passwords in parallel (aka a rainbow table): Given a table of 10k user passwords hashed like this, and a pre-built table, the odds are very good you'll get a significant number of the passwords in a very short amount of time.
This is why a proper "Password" hash (eg bcrypt, md5-crypt, sha-crypt) includes a "salt" which is randomly generated each time the password is set (and not just the first time). This prevents the rainbow attacks which are possible on plain integrity hashes. But prepending (or appending) the salt is not enough, because its effect can be undone mathematically, at least enough so that it presents no real additional barrier.
Genuine password hashes, while using an integrity hash as their basis, mix & blend the password and the salt in so many variable ways as to make this reversal impossible. And there are so many nuances here that _you should not roll your own_ (unless you're Bruce Schneier). Read bcrypt, sha-crypt or md5-crypt's specs for some details.
Note: don't use the old unix-crypt. While it is a password hash in the strict sense, it's so old and simple, it's barely stronger than crc32.
Note: sha-crypt adds additional flexibility via its "rounds" system, allowing it to easily grow more complicated as computers grow more powerful. This is why I prefer it above all the others.
End rant: all this is why you should use sha-crypt or md5-crypt, and nothing lesser.
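To show what "salt + rounds" looks like in practice, here's a minimal Python sketch. Big caveat: it uses PBKDF2-HMAC-SHA512 from the standard library as a stand-in, NOT sha-crypt or bcrypt themselves; the storage format and the names `hash_password` / `verify_password` / `ROUNDS` are made up for this example. In a real system, use a vetted implementation of one of the algorithms named above.

```python
import hashlib
import hmac
import os

ROUNDS = 100_000  # assumed default; raise it as hardware gets faster


def hash_password(password: str, rounds: int = ROUNDS) -> str:
    """Hash with a fresh random salt; returns 'rounds$salt$digest' (hex fields)."""
    salt = os.urandom(16)  # regenerated every time the password is set
    digest = hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, rounds)
    return f"{rounds}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash using the stored salt and rounds, compare in constant time."""
    rounds_text, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), bytes.fromhex(salt_hex), int(rounds_text)
    )
    return hmac.compare_digest(digest, bytes.fromhex(digest_hex))


first = hash_password("correct horse")
second = hash_password("correct horse")
print(first == second)                          # False: different salts
print(verify_password("correct horse", first))  # True
print(verify_password("wrong guess", first))    # False
```

Because the salt differs on every call, the same password never hashes to the same string twice, which is what defeats the pre-built table attack.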
I can't recommend Eric enough...
Since it's PyQT based, as of Eric v4 it's seamlessly cross-platform... and integrates really nicely w/ a number of vcs systems... I've tried a number of them, and it's the best IMHO.
Mind you, a coworker of mine swears by SPE, so take that for what you will :)
Just to run down the three I'm familiar with:
Eric4: cross platform; qt4+qscintilla based; great editor; ok class browser; good vcs & project management; good debugger; poor command completion; handles lots of filetypes (c++, js, ruby, python, etc). Command completion & class browser are my main complaints w/ this program.
SPE: cross platform; wxwindows based; great editor; excellent class browser; no vcs or project management; debugger?; good command completion; ONLY DOES PYTHON... uses os.startfile() for most other filetypes.
Not supporting any other filetypes, or project files, are my main complaints w/ this program.
WingIDE: cross platform; gtk based; great editor; great class browser; not quite as familiar w/ the rest of it because it's semi costly, and I'm a cheap bastard. Main complaint here: just the money :) Oh... and its command completion (while probably the best I've seen) uses introspection to a large degree, so get ready to have your modules imported all the time.
All 3 are under very active development, and written in python directly, so you can hack them to your needs... Eric4 even has a plugin system.
Indeed!
I'd add that correlation usually implies that there is some common cause which is a necessary condition of all the correlated events, even if it is not sufficient to cause all of them by itself.
People frequently lose sight of the fact that all "correlation != causation" is meant to indicate is that the common cause of correlated events is not required to be one of the events themselves, but can be some other external event.
Whether the cause is bias in the measurement, direct/indirect causation, some remotely connected common causation, or whatever.. Correlation hardly _ever_ is simply coincidence.
My company currently maintains two large python applications (40-50k LOC, not including custom support libraries). One of them is 6 years old, the other 2 years old. On our development team, we have familiarity with Java, C, C++, Perl, and the consensus is that we've had less maintenance work under python than the other languages.
If you add in the time spent on prototyping and testing, python has saved us way more time and effort.
Regarding type checking and reliability... You need to read up on the idea of "duck typing". Python's philosophy is that actual types get in the way; it's protocols and interfaces that matter. And after 7 years of python programming, I'd have to agree with that philosophy.
All of the critical parts of our apps have had unittests written (which catch semantic glitches at a level typechecking never will)... We've actually spent some time _removing_ type checking from the system, and replacing those lines with hasattr(obj, "append") calls, or creating synthetic protocol tests, allowing our implementations of various inputs to be widely varying: Jython implementations, CPython objects, who cares, they all LOOK the same.
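A tiny sketch of what that hasattr-style check looks like (the `add_items` function and `Journal` class are invented for illustration, they're not from our codebase):

```python
from collections import deque


def add_items(container, items):
    """Accept anything that can append(), instead of demanding a list."""
    if not hasattr(container, "append"):
        raise TypeError("container must provide an append() method")
    for item in items:
        container.append(item)
    return container


class Journal:
    """Not a list subclass at all, but it quacks like one where it matters."""

    def __init__(self):
        self.entries = []

    def append(self, entry):
        self.entries.append(entry)


print(add_items([], [1, 2]))                 # [1, 2]
print(add_items(deque(), [1, 2]))            # deque([1, 2])
print(add_items(Journal(), [1, 2]).entries)  # [1, 2]
```

An isinstance(container, list) check would have rejected the last two, even though both work fine.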
OTOH, maybe you're trolling.
"Python looks like Basic" indeed.
Out of all the languages, I would have
never been reminded of that one.
Javascript maybe (especially considering what prototype.js has going on)
I'm sorry, what? Of course there's an objective view.
:)
I agree with you re: the biasing tendencies/constraints inherently present in human physiology, as well as human (group) psychology.
But of _course_ there's an objective viewpoint.
It's the viewpoint which accurately reflects all things which can be accurately measured.
Most things which seem to be subjectively reported are mainly so because no metric exists or is being used which can measure the objectivity.
Math, most glaring of all, is objective. No one can have multiple correct opinions on what 2+2 equals... sure the symbols could be redefined, but semantically, there is but a single answer, unchanging.
But it doesn't stop there. Look up probabilistic primality tests on wikipedia:
http://en.wikipedia.org/wiki/Primality_test
Primality is an objective truth, which these tests are able to arrive at despite their innate bias & inaccuracy.
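For the curious, here's a compact Python sketch of one such test (Miller-Rabin). The function name and the choice of 20 rounds are my own; each round picks a random base, and any single round can wrongly pass a composite number, but the chance of that surviving many independent rounds shrinks geometrically.

```python
import random


def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin: False means definitely composite, True means almost surely prime."""
    if n < 2:
        return False
    # Quick trial division by a few small primes.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base for this round
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue  # this base gives no evidence against n
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # this base is a witness: n is composite
    return True


print(is_probable_prime(2**61 - 1))  # True (a known Mersenne prime)
print(is_probable_prime(561))        # False (3 * 11 * 17)
```

Each individual round is "biased" in exactly the sense above, yet the combined answer converges on the objective fact.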
So what does that have to do with people? Most people's opinions and beliefs (and we've all got lots) deal with subjects which are hard to measure, especially with regards to accuracy: Politics for one; but opinions on the worth of programming topics too, which is near and dear to everyone here.
It's not our neurological fallibility that does us in: some "objective" truths are _defined_ by human consensus: did I win the lottery this month? I can believe "yes" all I want, but unless my state's gaming commission believes as I do, it just isn't true.
But their belief defines the truth, and when looking for objective viewpoints, you have to identify all the subjectively-defined truths.
Driving on the left vs the right is another...
there's no universal truth there, but certain groups of people have to agree, as a matter of protocol (Native language is yet another).
The other statements of fact are just objective truths we aren't able or willing to measure. Programming is one such... but because it's such a young branch of human knowledge (100 years worth of work and counting), it's one of the "known unknowns": like other types of engineering, we know there _is_ a best way to do things, we just haven't developed very good guiding principles to measure it yet.
But the only way to get there is to seek out objectivity, since it could be present anywhere.
So read the book over, and maybe this guy's wrong on a number of accounts, and you'll notice them.
So spread the word, blog/comment/whatever about _why_ he's wrong.
If your metric is accurate, it will be borne out as it is applied to other situations, and by other people. We'll build up better and better guiding principles out of these meta-measurements. Eventually, we'll see more and more programming topics which _can_ be accurately and objectively measured, in much the same way probabilistic primality tests wade through their biases to arrive at the correct answer.
Not believing in objective viewpoints is giving up the search for the truth, and that would be a shame, because that's one of the grandest of all of humanity's endeavours.
Regarding programming, we do have one great truth already:
Given the nature of the human mind, GOTO Considered Harmful. Great for processors to think in, but bad for us.
Much as I like sqlite, I agree with you completely... metakit is definitely what he wants.
For those not familiar with metakit,
it straddles a very interesting (to me at least)
line between "some csv files i threw together" and "full on sql db" in terms of its storage
and query semantics, while at the same time being
a flat file like sqlite.
And it really deserves to be better known from a
theoretical standpoint, in how it structures the
data... especially some of the more interesting
applications it's been put to, such as a
graph-theory style database written as a layer on
top of metakit (can't remember the name).
Also, it's bound to a number of languages (python
and tcl are the only ones i've played with).
Perhaps if there weren't SO MANY responses saying
"sqlite" over and over, your response would have
gotten the informative rating it needs.
In a physics sense, no, that's not a cooler.
Typical "air conditioner" situation: you want to make the inside of a room cooler than the outside temperature.
Since the room starts out similar in temp to the outside, you have to spend energy pushing heat "uphill" to
an increasingly warmer outside. Making heat flow against the direction it would normally flow,
that's a cooler in the thermodynamic sense.
In the CPU situation, you want to make the inside of the cpu EQUAL to the outside temperature.
Since the running CPU starts out way warmer than the outside temp, the heat will flow naturally on its
own "downhill" to the outside. Any sort of cooling system merely hastens the flow.
In this situation, any device like a fan, etc is merely a more efficient radiator...
as the temp of cpu gets closer to the outside, this device loses efficiency... and in no case
could it get the cpu any _colder_ than the outside.
Being able to do that is what makes something a "cooler" in the physics sense.
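The "more efficient radiator" point can be sketched with Newton's law of cooling. This is a toy Python model under loud assumptions: the chip has stopped generating heat, the temperatures and the rate constants `k` are invented numbers, and a better heatsink/fan is represented purely as a larger `k`.

```python
import math


def cpu_temperature(t, start, ambient, k):
    """Newton's law of cooling: the gap to ambient decays exponentially with time t."""
    return ambient + (start - ambient) * math.exp(-k * t)


AMBIENT = 25.0  # deg C, assumed room temperature
START = 80.0    # deg C, assumed chip temperature when the load stops

for label, k in (("bare chip", 0.01), ("heatsink + fan", 0.1)):
    temps = [cpu_temperature(t, START, AMBIENT, k) for t in (0, 30, 120, 600)]
    print(f"{label:15s}", [round(x, 1) for x in temps])
```

A bigger `k` gets you to ambient faster, but no value of `k` ever takes the result below `AMBIENT`; for that you need something that pumps heat uphill.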
Indeed. You should read up on the changes they're planning to make.
:)
Many of them are backwards-incompatible changes that will remove nasty warts,
realign certain parts of python's object structure (str vs unicode, int vs long)
in ways that are MUCH cleaner, MUCH harder to make mistakes with, etc.
This is not some Larry Wall day dream revision...
they aren't rethinking python from the ground up,
it's an evolutionary revision which is keeping the python 2.x codebase,
but changing some things they've been wanting to do for a long long time.
The 1.x -> 2.x transition is apt... the code will look almost the same,
only cleaner
If that's not good enough for anyone, automated tools are being developed
to convert python 2.6 -> 3.0. Yes, that's right, Python 2.6. It'll
be the last version of 2.x released, and be carefully designed to be
both compatible w/ 2.x, and still able to warn you about gotchas
that would appear if you ran the code under 3.0.
Not only that, but 3.0 isn't projected to be in mainstream use
for at least 2 more years.
There's absolutely no need for the sensationalist panic this slashdot headline
announced itself with. Rethinking a complex system from the ground up is always
a bad idea... but GVR knows that, and that's not what's happening here.
It's more akin to the linux kernel moving from 2.4 -> 2.6
Incompatible, yes, but a much better architecture.
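To give a flavor of the str vs unicode and int vs long cleanups mentioned above, here's a short sketch of how they behave when run under Python 3 (the `mix` helper is just an illustrative name):

```python
# int and long are one type: integers simply grow as needed.
big = 2 ** 100
print(type(big) is int)   # True

# Text (str) and raw bytes are separate types with an explicit conversion.
text = "na\u00efve"               # 5 characters of text
data = text.encode("utf-8")       # 6 bytes once encoded
print(type(text).__name__, type(data).__name__)   # str bytes
print(len(text), len(data))                       # 5 6


def mix(some_text, some_bytes):
    """Mixing text and bytes is an error instead of a silent guess at the encoding."""
    try:
        return some_text + some_bytes
    except TypeError:
        return None


print(mix("abc", b"def"))   # None
```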
Maybe you were being sarcastic.
If not... I don't think you quite get what being in the "Information Age" is about.
It's not just some business buzzword:
With every advancing age (Stone, Bronze, Iron, Industrial, probably forgot one),
a new basis for technology became dominant.
It wasn't because everyone thought the new bronze tools were "neat".
It was that the people who used them were more efficient at what they did...
bronze users could plow soil better, all kinds of things...
they got more resources (energy, food, etc) than those who didn't.
So they prospered.
The reason it's called the Information Age is that right now,
information is the most powerful tool we have. Show me an oil company
that can get oil w/o complex computer analysis of satellite imagery
of radar scans of a potential oil field. Show me a farmer who's
not at the very least doing cost/benefit analysis of fertilizers,
crop markets, etc. Anyone who doesn't goes out of business, because
everyone who does, does better because of their control of information.
If in the long run, we're going to hit an energy crunch,
everyone is royally screwed... thanks to modern techniques,
we're supporting WAY more people than the industrial age could even have supported...
keeping america's food supply chain going at the speed it runs
requires sophisticated inventory tracking... those IT jobs aren't going away.
Sure, maybe we'll have a recession, energy crisis, etc.
If so, I know what I'll do to survive... walk into some Mom & Pop grocery store,
and tell them that (for dirt cheap, recession and all), I'll sell 'em some custom
accounting software, on spec, guaranteed to improve profits, or they don't have to pay me.
Some may turn me away... but the ones who don't, will know what their loss leaders
are, be able to analyze the cyclic demands for various foods...
and while those around them are guessing at these things, they'll have the information,
and survive. And so will I.
Pixel pushers? Bah. We're mechanics, crafting the flow of information,
to make other businesses run better. That'll never go out of style,
for the same reason a guy with a gun will always win the knife fight.
It wasn't that they wouldn't spend $20 for a dvd drive.
It's that they wouldn't spend the extra 5.25" drive bay space
and cabling for something that's only needed once in a while for os-installation.
And when you're trying to make a small low power device, that's at a premium.
For that once-in-a-while need to reinstall the os,
there's certainly no need to go to the extreme of sending to the factory.
My company uses a lot of small linux appliances like these (esp for firewalls)
and I keep an external usb-cdrom on hand... use it to (re)install the os,
and that's the only time it's needed. Rest of the time it would be wasted space.
And I only had to pay for 1 drive, to use on ALL the systems.
So after 100 of these, that $20 would add up for me.
Regarding Thunderbird's handling of HTML & images...
If it's actual displaying of html you dislike, there's the menu option "View/Message As/Plain Text".
More importantly though, if it's security you're worried about,
by default Thunderbird won't display anything but embedded images...
you have to explicitly tell it each time you view an email if you want it to load
any referenced images... so there's no security leak there.
as well, it has a similar (on by default) feature of disabling any scripting in the html.
the end effect of all of this is that html in thunderbird is about as dangerous tracking/security wise
as the markup language of a slashdot post.
It's not a book per se, but I would highly recommend the OpenGL tutorials at NeHe Productions.
They've got a bunch of tutorials, starting with basic opengl stuff up to some very tricky effects.
Each tutorial goes step by step through downloadable C code examples, explaining everything.
I picked up most of what I know of OpenGL from it.
while there's a thread on the subject of opensource fps engines...
QFusion is an open source quake3 client, written from scratch.
Just thought i'd post a link to it,
cause it's an impressive accomplishment,
and the source code is beautiful...
and the engine's speed compares favorably to the real Q3 client.
the 'unstable' 'stable' and 'testing'
names are symlinks for one of the named
debian distributions.
woody is currently the stable version.
the stable version will usually have
slightly older software, but because it's been
tested for a much longer time
it's better to use on business servers.
sarge is currently the testing version.
it should probably be for workstation/home use.
the packages are newer, but not as bug-free.
while it could be used in a production environment,
stable will always be a safer bet.
as the stable version, woody gets mainly
security updates. at some point, sarge
will become well tested enough that
woody will be retired (like 'potato' before it),
and sarge will become the current stable branch.
a new fork will be created at that point,
and become the new testing version.
'sid' will always be the unstable branch of
debian. you don't want to use 'unstable'.
it will almost always have the newest
software versions, but they will probably
break your system. if you see something you
like, download it singly, don't install
sid to get it.
in short...
get sarge/testing to try out debian.
if there's problems, or you want older
more tested software, get woody/stable.
if all you want is problems,
for your own mind to solve,
get sid/unstable.
To go from, say, a C language file to an exe,
the compiler first loads the C file (ending in .c),
and all the files it refers to,
and then parses all of it into an internal
structure.
this structure is then optimized:
loops are unrolled, functions are inlined,
and info that is mentioned but isn't needed
is stripped out.
the resulting structure is then
written out as a series of assembly
instructions, which are then
converted to the numeric codes
the processor understands.
this is the exe.
to go backwards, it's (generally)
trivial to take an exe and get a
plaintext file containing the assembly
instructions (this file usually ends in '.s' or '.asm')
it's the optimization step that causes
issues: one of the main things the computer
doesn't need which is stripped out is
variable names, comments, etc.
without them, there's no context.
you can figure out the algorithm from the assembly,
but you can't easily figure out what
it's operating on.
to make things worse, other optimizations
may alter the code for faster execution,
making it even harder to figure out.
Occasionally, mistakes are made...
Microsoft slipped up a while back,
and released a windows patch which had
the 'debugging info' left in it.
All this really amounts to is the variable
names, function names, etc...
which is bloody useful.
Making this process even worse is that
some (rare) executables are self modifying,
which makes them MUCH harder to predict.
in summary, it's not that hard to get
back to C code, assuming the program
was even written in C. You'd just have
variable names like 'var0001', 'var0002'
'func0001', etc.
It's basically the difference between
having a nice nested tree structure
which you can compartmentalize and analyze,
versus one long list of instructions,
which the computer may start and stop
execution of at any point.. sorta like DNA.
difference is... even if Bush honestly won,
without falsifying election results,
then he _still_ only won by a small margin.
Instead of acting like a candidate who'd squeaked
by on the barest of margins, he acts like
his views are supported by all Americans,
and does whatever-the-fuck he wants to.
As a president elected with 50% of the populace
behind him, he was not elected to serve only
those 50%, but all 100%... to not even pay
lip service to the other half is insulting
not just to them, but to the process itself.
fine then, but how would you encode that
difference in the general case, in abstract,
and know that you'd gotten it right for all
concrete cases? a person could go on a case
by case basis for what they consider valid
or not, and even if everyone else in the
world agreed with them, it would make no
difference...
you still couldn't codify
the difference as law, without explicitly
stating what precise differences make
BO2k invalid... and even once that was done,
what if BO2k added just those features,
and no more? is it now "legit"?
... etc.
and while we're revising _that_ law forever,
what about every other class of software product?
make/maintain a law for each of them?
what about the ones that were missed?
should they by default be made illegal, "just in case"?
this is somewhere the law should never
go, for regulation of such things is tantamount
to creating a thoughtcrime, because all
we're talking about are ideas.
the law has no place until the idea is coupled
with intent; if that intent is to do harm,
it will betray itself in the resultant actions,
and those are _already_ illegal.
to stop already illegal acts,
more laws are not the answer...
to quote a crude saying,
"it's like fucking for virginity".
even if it isn't soldered on,
even if it's an eeprom sitting in a cosy ZIF socket,
take a hint of the future
from the xbox of the now...
modders tried to change its bios,
got sued under the DMCA.
the moral of the tale is that it doesn't
matter what natural laws, mathematics,
the cs industry, the constitution say...
the DMCA overrides them all!
it says so right on the bill
:)