Java will remain relevant because of the large number of languages being built for the JVM: Scala, Erjang, Clojure, Groovy, etc. Thus writing libraries in Java has significant appeal.
JJ
I second this motion. I use Firefox at work (Linux) and at home (on Mac and Windows), and on all three there are huge memory leaks and slowdowns after a day. It seems that this should be a much higher priority than anything else.
JJ
I consistently get 2ms ping times over my wireless link, even at low signal strength. I have a new Linksys G access point; maybe that helps.
Using 802.11g instead of b should allow a tradeoff between signal strength and distance, extending the range for the same bandwidth connection. As infrared gets cheaper you will definitely see long hauls being carried by these links. (40+ miles?)
All in all, I would cut your ideal time down by a factor of at least five to 500ms. At any rate, this technology is getting cheaper, faster, and more reliable; I think we are going to see networks like this happen.
JJ
Modern storage solutions (like EMC) use redundant battery-backed RAM to buffer writes, greatly reducing perceived write latency. This gives you a lot of the performance gain of a RAM-only database, and also scales very well to large loads. (In fact, when choosing RAID stripe size you take into account whether writes are buffered; if not, keep stripes small for log files.)
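To make the buffering point concrete, here is a toy sketch of a write-back controller. The class and method names are made up for illustration and do not model any real EMC product; the only point is that the write is acknowledged as soon as it lands in the (assumed battery-backed) buffer, and the slow disk write happens later.

```python
class WriteBackController:
    """Toy model: acknowledge writes from a RAM buffer, flush to disk later."""

    def __init__(self):
        self.buffer = []  # stands in for battery-backed RAM (assumption)
        self.disk = []    # stands in for the slow physical disks

    def write(self, block):
        # The caller gets its acknowledgement here, before any disk IO.
        self.buffer.append(block)
        return "ack"

    def flush(self):
        # Later, in the background, buffered blocks are written out in order.
        self.disk.extend(self.buffer)
        self.buffer.clear()


ctrl = WriteBackController()
ctrl.write("log record 1")
ctrl.write("log record 2")
ctrl.flush()
```

The latency the database sees is the cost of `write`, not the cost of `flush`, which is why the buffer hides most of the disk penalty.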
If you know that your data will always fit into available RAM then there are a number of performance optimizations that can be done. I'm not sure about ACID becoming "trivial"; you still need most of the same DB components: indexes, lock managers, operation journaling, etc. But many of these could be greatly simplified:
1. Page/Buffer Manager Eliminated. Since no disk IO will be required for the normal running of the DB, there will be no need for a page manager. This eliminates complexity such as block prefetch and marking and replacement strategies. In fact, the data will probably not be stored on pages at all. Details such as block checksum, flip flop, log position, page latches, etc. can all be removed. The values in the rows would be sitting in memory in native language formats rather than packed, making retrieval much faster. There would be no need for block chaining.
2. More flexible indexing. Since it is not necessary to store data in pages, traditional B-Trees are not absolutely required. Other index structures like AVL trees would be faster and might allow better concurrency. These trees would also be easier to keep balanced... most databases don't clean up indexes after deletes, forcing periodic rebuilding. Other index schemes not generally considered because of poor locality properties could be considered. Note that hash indexes would probably still use linear hashing.
3. Lock Manager Simplified. Row-level locking (and MVCC) are still desired features, but keeping the locks all in memory simplifies implementation. Oracle and InnoDB store lock information in the blocks (associated with the transaction) to allow update transactions larger than memory.
4. Log manager simplified. You will still need journaling capability for rollback, replication, recovery from backup, etc. But the implementation of the log need not be traditional. Any structure that maintains information about transactions and contains causal ordering will do. Techniques such as keeping old versions of rows adjacent to current versions, which are unacceptable for disk-based databases (ahem, Postgres), could be used.
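As a rough illustration of point 4, here is a minimal sketch of a row that keeps its old versions right next to the current one, so rollback needs no separate log lookup. The names (`VersionedRow`, `txn_id`) are hypothetical, and the sketch assumes a single writer per row and ignores locking, commit-time pruning, and durability.

```python
class VersionedRow:
    """A row whose older versions sit adjacent to the current version."""

    def __init__(self, value):
        # Each entry is (txn_id, value); txn 0 marks the initial load.
        # The last entry is always the current version.
        self.versions = [(0, value)]

    def write(self, txn_id, value):
        # An update appends a new version instead of overwriting in place.
        self.versions.append((txn_id, value))

    def read(self):
        return self.versions[-1][1]

    def rollback(self, txn_id):
        # Undo is just dropping every version the aborted transaction wrote.
        self.versions = [(t, v) for (t, v) in self.versions if t != txn_id]


row = VersionedRow("balance=100")
row.write(7, "balance=50")   # transaction 7 updates the row
row.rollback(7)              # transaction 7 aborts
current = row.read()         # back to the value from the initial load
```

In memory the extra versions cost only a few pointers; on disk the same layout bloats pages and hurts locality, which is the objection for disk-based systems.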
Although these may seem like small things, they can add up: less code to run is faster code. A company called TimesTen offered a product that they claimed was 10x faster than Oracle using an all-memory DB. Generally the corporate world doesn't care to split hairs. They want something that works, and they are willing to throw some money and iron at it. That's why battery-backed RAM in the disk controller to buffer writes is probably going to be fine for now.
A last note: modern databases already know to not bother with indexes when a table is sufficiently small.
JJ
There are many problems with this design, some of which have already been mentioned. There are serious issues with performing atomic updates. Modern databases use locking to allow high levels of concurrency. Foreign key constraint checking is one thing that would be very hard to implement in this design, as it is generally implemented in the indexes themselves. Likewise, to get all databases in a "RAIDb 0" group to reflect the same state, operations such as concurrent delete and insert must be completely serialized to ensure consistency... serialized across all clients, not just from one source.
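A tiny sketch of why the ordering has to be global: two clients issue a delete and an insert on the same key, and two replicas happen to apply them in different orders. The replica model here (a set of keys and an `apply_ops` helper) is purely illustrative and is not how C-JDBC represents anything.

```python
def apply_ops(ops):
    """Apply a sequence of (operation, key) pairs to an empty replica."""
    rows = set()
    for op, key in ops:
        if op == "insert":
            rows.add(key)
        elif op == "delete":
            rows.discard(key)
    return rows


client_a = ("delete", 42)
client_b = ("insert", 42)

# Without one agreed order, each replica may see the operations differently.
replica_1 = apply_ops([client_a, client_b])  # delete first, then insert
replica_2 = apply_ops([client_b, client_a])  # insert first, then delete
diverged = replica_1 != replica_2            # True: key 42 exists on only one

# With a single serialized order across all clients, the replicas agree.
global_order = [client_a, client_b]
agreed = apply_ops(global_order) == apply_ops(global_order)
```

Serializing each client's own stream is not enough; the interleaving between clients is what has to be identical everywhere.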
Furthermore, to scale up, systems generally take advantage of striping. At the IO level that means striping across multiple disks (modern convention is to stripe across all!). In a parallel database one usually stripes a single table across multiple nodes for parallel query processing. While it is possible with C-JDBC to put table X on node A and table Y on node B, I don't see any provision for striping the data. It will be very difficult to use your hardware efficiently in this scenario.
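For anyone unfamiliar with what striping a table buys you, here is a small sketch: rows of one table are spread across nodes by key, and an aggregate query runs on every node at once. The node count, the modulo placement rule, and the function names are assumptions made for the example; threads stand in for separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

NODES = 4  # hypothetical cluster size


def stripe(rows):
    """Spread (key, value) rows of a single table across NODES partitions."""
    partitions = [[] for _ in range(NODES)]
    for key, value in rows:
        partitions[key % NODES].append((key, value))
    return partitions


def partial_sum(partition):
    # Each node scans only its own stripe of the table.
    return sum(value for _, value in partition)


def parallel_sum(partitions):
    # Every node works at the same time; the coordinator merges the results.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        return sum(pool.map(partial_sum, partitions))


table = [(k, k * 10) for k in range(100)]
partitions = stripe(table)
total = parallel_sum(partitions)
```

With whole tables pinned to single nodes, a query on table X keeps node A busy while the others sit idle; with striping every node shares the scan.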
If you are going to go through the trouble of implementing a complete query processor (that can handle jobs larger than RAM), a full update/query scheduler (lock manager), and a journalling mechanism that can (somehow) even maintain atomic transactions (even in the face of multiple failures), then why not just build your own database? This system might be useful in certain rare cases but I wouldn't use it except possibly for replication.
JJ
If you read the article it is pretty clear that they meant to say 100 Megabits. 65 x a T1 (1.544 Mbps) is about 100 Mbps, and later down they say "With 100 Mbps of capacity...". Strangely, later still the article says "...the next generation of Ethernet, which will deliver 10 Gbps..."
I think someone is bad with numbers.
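For the record, the arithmetic, using the standard T1 line rate of 1.544 Mbps (the variable names are just for this snippet):

```python
T1_MBPS = 1.544          # standard T1 line rate in megabits per second
multiple = 65            # the "65 x a T1" figure from the article

capacity_mbps = multiple * T1_MBPS
print(f"{multiple} x T1 = {capacity_mbps:.1f} Mbps")  # prints "65 x T1 = 100.4 Mbps"

# The 10 Gbps figure quoted later is a different number entirely.
ratio_to_10g = 10_000 / capacity_mbps   # roughly 100x the 65 x T1 capacity
```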
JJ
Lots of porno with red and blue body paint!
JJ