be-fan wrote: Also, when stuff like DNS lookups are done through files, isn't it getting a little silly? Isn't it simpler to map closer to the way the software is actually working? Especially in the cases of stuff like DNS lookups which don't totally make sense being treated as a file. Also, doesn't going through the FS create some overhead?
I don't think it's silly to use files for DNS lookups, especially considering that it means I don't have to deal with those ugly gethostbyname()-related structs anymore when I'm programming. As far as this being inefficient "because it's going through the filesystem", this is not true. You are thinking of the old concept of filesystem, which refers to things on a big magnetic disk. The Inferno "filesystem" files are not necessarily files at all in the traditional sense. Many of them (including the DNS-related files) do not reside on the hard drive at all. They reside in memory, and they run code when people open/read/write them. There is no inherent inefficiency in this.
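To make the "files that run code" idea a bit more concrete, here is a rough Python sketch. It is not Inferno code and not the real Inferno interface; the class name, host name, and address are all made up for illustration. The point is only that an object living in memory can offer a read/write interface and do its work when it is written to, with no disk involved.

```python
# Toy model of a "synthetic file": nothing is stored on disk, and
# reads and writes run code. All names and addresses are hypothetical.

class SyntheticDnsFile:
    """In-memory object with a file-like write/read interface."""

    def __init__(self, table):
        self._table = table      # stand-in for a real resolver
        self._answer = ""

    def write(self, query):
        # Writing a query triggers the lookup immediately.
        name = query.strip()
        self._answer = self._table.get(name, "")
        return len(query)

    def read(self):
        # Reading returns the result of the most recent query.
        return self._answer


dns = SyntheticDnsFile({"example.test": "192.0.2.10"})
dns.write("example.test\n")
address = dns.read()
```

The caller never touches a resolver struct; it writes a name and reads back a string, and the only cost is a couple of method calls.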
AC is right in that I did botch a few of these points. I apologize for the inaccuracies; I was listing things off the top of my head without consulting the official sources.
Specifically, Inferno was indeed written mostly in C (not in Limbo). Limbo is the language for application developers, and is compiled to bytecode for running on the Dis VM. As to where the Inferno kernel runs in regard to the Dis VM and the actual hardware, I'm not really sure. Check the websites for more info.
My other previous comments are accurate to my knowledge (namespaces, files, Styx, etc.)
Last time I looked at my middleware it was using file descriptors for socket connections
Of course it was. But to get those descriptors, it had to call things like socket(), bind(), etc. Inferno attempts to get rid of some of that non-file baggage by letting you just open and read some files rather than having to use network-specific system calls.
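A small Python illustration of the Unix side of that point, assuming a Unix-like system. socketpair() is used here only as a convenient stand-in for the usual socket()/bind()/connect() sequence: the descriptor has to come from a socket-specific call, and only afterwards can it be treated like a file.

```python
import os
import socket

# Obtaining the descriptors needs a socket-specific call...
left, right = socket.socketpair()

# ...but once you have them, plain file-style read/write works.
os.write(left.fileno(), b"ping")
received = os.read(right.fileno(), 4)

left.close()
right.close()
```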
I wonder if their sockets only support "r" "r+" "w+" options
To my knowledge, their implementation supports any TCP socket options.
Just to inform people that may not know too much about Inferno, I thought I'd post a few unique things about it (and some ideas that the developers wanted to focus on). I've done work for Lucent related to it, so I know a few useful things about it.
Runs on a Virtual Machine
Inferno only runs directly on one architecture: the Dis virtual machine. This isolates in a very clean way the platform-specific parts that need to be changed to get Inferno to run on various architectures: just rewrite the Dis machine wherever you want Inferno to run.
Written in Limbo
Most of Inferno was written in a new language called Limbo. Its authors seem to like this language, although I've seen a good bit of criticism of it from others ("the infernal language"). I don't know much at all about Limbo first hand.
Namespaces
This concept is borrowed from Plan 9. The concept is that different processes have different views of the system called namespaces. Each namespace is essentially a hierarchical file system that presents available resources to the process. If a process does not have enough privilege to use something, it won't even see it. What are these resources? This gets into the next point...
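As a loose sketch of what "different views of the system" means, here is a toy model in Python. The Namespace class, the paths, and the resource strings are hypothetical and have nothing to do with how Inferno actually implements this; the sketch only shows that a resource which was never bound into a process's namespace is simply absent from its view.

```python
# Toy per-process namespaces: each process gets its own path -> resource map.
# Paths and resource names are hypothetical.

class Namespace:
    def __init__(self):
        self._bindings = {}

    def bind(self, path, resource):
        self._bindings[path] = resource

    def listing(self):
        return sorted(self._bindings)

    def open(self, path):
        # An unbound resource is not "permission denied"; it is not there.
        if path not in self._bindings:
            raise FileNotFoundError(path)
        return self._bindings[path]


trusted = Namespace()
trusted.bind("/net/dns", "dns service")
trusted.bind("/dev/audio", "audio device")

sandboxed = Namespace()
sandboxed.bind("/net/dns", "dns service")

visible = sandboxed.listing()
```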
Files, Files, Files
This concept is borrowed from Unix but is taken to new extremes. All resources in the namespace are presented as files. The idea is to unify the interfaces to a variety of things by having an open/close/read/write way to use a variety of resources. Unix presented devices this way (/dev). Inferno goes a step further by presenting other machines, network cards, and a variety of other resources as files. For example, DNS lookups and socket programming can be done simply by reading and writing files. This is not so under Unix.
Transparency of Networked Machines (or toasters)
By using this hierarchical filesystem view of everything, Inferno can essentially hide that parts of this file system are remote (like NFS mounts). The hope is that programmers can write programs that read like simple shell scripts (open, read, etc.) to manipulate local and remote data and machines.
Styx Protocol
Styx is the protocol that is spoken over the network in order to present this filesystem abstraction remotely. Note that these 'files' are not traditional files: they can have arbitrary semantics (they can make things happen on the server when read/written; the possibilities are endless). A Styx server exports some filesystem to a Styx client. Note that the server decides what the semantics are for this filesystem (writing file A causes some configuration to be rewritten, reading file B reads off the current configuration, etc.). The client can mount this filesystem so that it appears in the namespaces of processes on the client machine.
I think Inferno has a lot of potential, even if only for simple management of networks. Today, management of heterogeneous networks can be pretty complicated (managing routers with LDAP, SNMP, etc.). If routers implemented Styx servers to expose their configuration options as a filesystem, then some Inferno machine could mount all of the routers in a directory and run a simple script to configure them all. This would be much simpler than what goes on today.
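The "mount all of the routers and run a simple script" idea might look roughly like the following. This is a sketch under assumptions: ordinary directories in a temp dir stand in for mounted routers, and the `ctl` file name, the router names, and the `mtu 1500` setting are all invented for the example. With a real Styx mount, the write would be interpreted by the router's own server rather than stored as text.

```python
import tempfile
from pathlib import Path

def configure_all(mount_root, setting):
    """Write one setting to the ctl file of every router under mount_root."""
    configured = []
    for router in sorted(Path(mount_root).iterdir()):
        if not router.is_dir():
            continue
        # With a real Styx mount this write would be handled by the router.
        (router / "ctl").write_text(setting + "\n")
        configured.append(router.name)
    return configured


# Stand-in for mounted routers: ordinary directories in a temp dir.
with tempfile.TemporaryDirectory() as root:
    for name in ("router-a", "router-b"):
        (Path(root) / name).mkdir()
    done = configure_all(root, "mtu 1500")
    written = (Path(root) / "router-a" / "ctl").read_text()
```

The script itself knows nothing about LDAP or SNMP; it only walks a directory and writes files.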
You would encrypt swap to prevent the leaking of any sensitive data that is resident in a process's memory (cleartext passwords, private or secret keys, etc.).
Without encrypted swap, an application with sensitive data may be swapped out at some point to the disk. Even if the process zeros its own memory eventually, this disk copy may be left around for prying eyes (another process does a large malloc and scans this dirty memory for keys/passwords).
It seems to me that zeroing the swap before reuse would be a cheaper alternative to this. Here is the argument for why I think encryption doesn't buy you any security that zeroing doesn't:
My reasoning is that another process would never get your old "dirty" memory with your key after a malloc. They would have to resort to spying in your memory in realtime.
As for someone looking at your actual memory in realtime, encrypted swap isn't going to stop that. If they are sufficiently powerful to do this, they are sufficiently powerful to go into the kernel, extract the swap encryption key and read things anyway.
Could someone more in the know explain what encryption buys you that kernel-level zeroing doesn't?
This is not true. More mathematical cryptosystems can be proven to be as secure as some difficult math problem (factoring, discrete log, etc.). Why not use these provably secure cryptosystems (Rabin, Goldwasser/Micali, etc.)? Because most of them are significantly less efficient, and because many outside of the theoretical crypto community probably don't even know of their existence. (There seems to be a large lag between the powerful systems invented by theorists and what is implemented by software developers.)
The more efficient ciphers actually used in software tend to be of the "scramble" variety involving specific S-boxes full of random-looking numbers and lots of connecting XOR gates. It is hard to prove anything formally about these ciphers since they are not based on number theory or formal mathematics. In regard to these ciphers not based on pure math problems (*fish, Serpent, 3DES, et al.), public scrutiny is the best hardening factor.
I wouldn't necessarily say that Blowfish is more secure than Twofish or Serpent just because it is older though. The AES candidates have probably undergone a lot more scrutiny than the non-AES candidates since they would potentially replace DES as a standard. This makes the crypto community focus more intensely on them than on some arbitrary cipher.
As far as the original question on choosing encryption: I think the biggest issue is quality of existing implementation in libraries. I am a software developer who works almost exclusively in a crypto/security setting, and the first thing I look for in an algorithm is for it to be part of a tested, well-documented crypto library with a reasonable license (like OpenSSL perhaps). This is for two aspects of the same reason: crypto algorithms are not what people break when they break security systems.
They break bad cipher implementations
Without a good library, I am either using a broken library or rolling my own. Rolling my own is riskier than using one that has been extensively tested by others.
They break other pieces of the system
Whether I use 3DES, IDEA, RSA, etc. is probably less important than the other parts of my code when I am writing security-related tools. Better to focus more time/energy making sure that all of my bounds checks are in place, no dangerous races exist, etc. These are what attackers use to break into systems.
One final note on the importance of "well-documented" libraries. The reason I emphasize this is because inherently libraries, more than any other software, are code to be reused. Good documentation facilitates reuse, and it makes things safer. I had an unfortunate experience with a poorly documented C crypto library (an old CryptoLib) that called free on the pointers passed to it as arguments without any indication whatsoever of this in the doc. The result was a program that would crash suddenly/mysteriously because of later accesses to that freed memory. This kind of thing could have been prevented if the doc had included the salient fact that my arguments were being freed (since this is hardly an expected behavior).