What would really be news here would be database engines and/or filesystems that grokked SSD performance patterns well and could combine pools of spinning disks and SSDs in optimal ways for a given workload.
RAID by itself does only increase the number of in-flight IOs, but it almost always comes with that most magical of pixie dust, the battery-backed cache.
The other point that I'll make is that often the only writes your RDBMS is waiting on are log writes, which are sequential anyway.
In any case - I'll cede your point that spinning rust will likely NEVER scale as well as NAND.
I didn't get 250k IOPS. I _said_ 250k IOPS was 2-5x better. I used the same math you did, and specifically hedged about not knowing how poor the scaling was with these kinds of systems. I am _not_ a storage engineer, just a developer with a (professional) interest in high performance random IO systems.
Yeah. The relevant metric for databases really is $/IOPS, not $/GB.
So, off the cuff, I figure you need a 700-disk array of 146GB drives to do this much storage at RAID 10 (or 0+1 for you pedants). That's a lot of random IO capacity. I don't know how poorly IOPS scale for systems at this magnitude, but I'd be surprised if the SSD solution was 10x IOPS over 700 15k spindles. Maybe 2-5x?
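A back-of-the-envelope version of that estimate, as a rough sketch only. The per-spindle figure (about 180 random IOPS for a 15k drive) is an assumed rule of thumb, not a number from this thread, and the math assumes perfectly linear scaling, which a real array of this size would not achieve.

```python
# Rough estimate only: assumes random IOPS scale linearly with spindle count.
SPINDLES = 700
DRIVE_GB = 146
IOPS_PER_15K_DRIVE = 180   # assumed rule-of-thumb figure, not a measurement
SSD_IOPS = 250_000         # the SSD figure discussed in this thread


def usable_gb_raid10(spindles, drive_gb):
    """RAID 10 mirrors every drive, so half the raw capacity is usable."""
    return spindles * drive_gb // 2


def array_iops(spindles, iops_per_drive):
    """Best-case random read IOPS if every spindle is kept busy."""
    return spindles * iops_per_drive


def ssd_advantage(ssd_iops, disk_iops):
    """How many times more IOPS the SSD system delivers than the disk array."""
    return ssd_iops / disk_iops


disk_iops = array_iops(SPINDLES, IOPS_PER_15K_DRIVE)
print("usable GB:", usable_gb_raid10(SPINDLES, DRIVE_GB))
print("array IOPS (best case):", disk_iops)
print("SSD advantage: %.2fx" % ssd_advantage(SSD_IOPS, disk_iops))
```

Under those assumptions the SSD system comes out around 2x, the low end of the 2-5x guess. Any scaling loss in the array pushes the ratio higher.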
Asking about saving is the wrong question. Saving shouldn't be a question. A document's current state should be persisted at the drop of a hat and that means undo info as well.
A small faction at MS gets this.
Android, as a platform and as recommended dev practice, gets this. Many great iPhone apps get this.
An app should expect to be terminated rudely and abruptly at any time. You'll impress the hell out of your users if you follow this rule.
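One common way to survive a rude termination is to write the whole state (document plus undo info) to a temporary file and atomically rename it over the old one. This is a minimal sketch, not how any particular platform does it. The names `save_state` and `load_state` and the JSON layout are made up for illustration.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Persist state so a crash leaves either the old file or the new one, never half of each."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())      # push the bytes to stable storage before the rename
        os.replace(tmp_path, path)    # atomic swap of old state for new
    except BaseException:
        try:
            os.unlink(tmp_path)       # clean up the temp file if anything failed
        except OSError:
            pass
        raise


def load_state(path, default):
    """Return the last persisted state, or the default on first run."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return default
```

Call `save_state` after every meaningful edit, with the undo stack included in `state`, and the app can be killed at any moment and still come back exactly where the user left it.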
I bet you could find an efficient, massive diesel engine designed for being mounted in a large ship that's capable of delivering a few tens of megawatts. There's your power needs. Data is another story, of course.
...and now I find myself agreeing with pretty much everything you just said.
When performance counts, there is absolutely, positively zero substitute for understanding the workload at hand and the hardware it will run on.
Re:don't use swap, doofs on Knuth Got It Wrong · Score: 2, Informative
In no way whatsoever did he say 'remember swap is slow; try not to use it.'
That's as wrong as the idiotic summary.
Here's a relevant quote:
A 300-GB backing store, memory mapped on a machine with no more than 16 GB of RAM, is quite typical. The user paid for 64 bits of address space, and I am not afraid to use it.
The article is about redesigning binary heaps to account for non-linear access times between nodes due to swap. This point is critical. He's NOT avoiding swap, he's planning for it.
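To see why the layout matters, here is a small sketch that counts how many distinct VM pages a single leaf-to-root walk touches in a classic array-backed binary heap. This is not the article's data structure, only an illustration of the problem it addresses. The 8-byte item size and 4096-byte page size are assumptions.

```python
def pages_touched(leaf_index, item_size=8, page_size=4096):
    """Count distinct pages visited walking from a leaf to the root.

    Uses the classic 1-based array layout where the parent of node i is i // 2.
    item_size and page_size are assumed values, chosen for illustration.
    """
    pages = set()
    i = leaf_index
    while i >= 1:
        pages.add((i * item_size) // page_size)  # page holding node i
        i //= 2                                  # step up to the parent
    return len(pages)


# A heap with about a million items: the walk from the deepest level to the root.
print(pages_touched(2 ** 20))
```

With these numbers only the top nine levels share a page. Every level below that lands on a different page, so if the heap is mostly swapped out, one sift can mean a dozen page faults. Packing parents and children into the same page cuts that down.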
Above and beyond SQL-92/SQL-99, PostgreSQL does a good job of implementing the non-optional parts of SQL:2003 and SQL:2008 as well, and in that regard is competitive with or better than the commercial alternatives.
PL/SQL is unlikely ever to be available in the Open Source PostgreSQL product, but it is a feature of EnterpriseDB, which is a PostgreSQL superset.
INSERT..ON DUPLICATE KEY UPDATE is IIRC similar to the new SQL:2003 MERGE statement, which is on the TODO list for PostgreSQL.
CLUSTER is a subset of Oracle's index-organized tables / SQL Server's clustered index features.
One of the truly innovative features that is arriving is exclusion constraints. If you've ever had to implement a scheduling system that deals with concurrent updates, you'll recognize that PostgreSQL has an absolutely killer feature that makes it trivial to solve the concurrent range-excluded searched update problem without messy application code. This feature is pure gold.
Designing a performance-intensive application that is portable across multiple databases is a frustrating, difficult task. Starting with ANSI/ISO syntax is indeed a great way to base your design, but the devil is truly in the details. ORMs can hurt as much as they help.
[...] but he sure thought up some neat ideas for a universe that John Scalzi will never come close to.
That's pretty unfair.
Scalzi may not have had the mass-market success or big screen treatment of Lucas, but turning senior citizens into killing machines is one of many 'neat' ideas. Have you actually read him?
Which brings us to challenge #0: A time machine, enabling some benevolent organization to go back in time and supply your unfortunate parents with the birth control that they so richly deserve.
I don't remember any API doc in any garbage collected language that specified that a given reference was a strong (aka NORMAL) reference (as the general case). Weak references are specified because they are rather out of the norm, for most general-purpose programs. Yes, I know, all of you are geniuses and you use them all the time. Us normal idiots don't. Last time I had to use them I was foolishly rolling my own cache.
In any case, no, this isn't a persistent PEBKAC problem among competent practitioners. References are assumed to be strong unless documented otherwise.
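For anyone who hasn't bumped into the distinction, a minimal sketch using Python's standard `weakref` module. The `CachedImage` class is made up for illustration, and the prompt collection shown here relies on CPython's reference counting. Other runtimes may reclaim the object later.

```python
import gc
import weakref


class CachedImage:
    """Stand-in for something expensive you might keep in a home-rolled cache."""

    def __init__(self, name):
        self.name = name


strong = CachedImage("logo")     # an ordinary (strong) reference, the default everywhere
weak = weakref.ref(strong)       # a weak reference has to be asked for explicitly
second_strong = strong           # another ordinary reference to the same object

del strong
gc.collect()
alive_while_strong = weak() is not None   # second_strong still keeps the object alive

del second_strong
gc.collect()
alive_after = weak() is not None          # nothing strong is left, so the weak ref goes dead

print(alive_while_strong, alive_after)
```

Nothing in the first three assignments needed a doc note to say "this one is strong". Only the `weakref.ref` call is the special case.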
Something resolvable to the number does need to be stored, for dispute resolution. Otherwise the merchant cannot have an intelligent conversation with the bank regarding a disputed charge.
IIRC, the TitanTV bits involved an application-specific MIME type and a little XML file download which provided the info for the scheduling magic. So, no. It shouldn't be difficult for MythTV to interop with that.
Power and cooling are a big win here, no doubt.
Don't sell any of your equipment to anyone named O'Neill.
That's not a valid argument. It's an implied ad hominem at best. You can do better. Do so, or shut your FUD spreading mouth.
You're usually better off spending money on spindles and RAM and RAID controllers way before exotic multicore CPUs, on a database machine.
Those who would sacrifice latency for bandwidth deserve neither.
Holiday celebrates you!
Money. Why else? Private networks are more expensive than plugging into the ol' tubes.
Doesn't make it right. I'm not defending, just pointing out the obvious reason.
[...] but it has no place in embedded hardware.
cuz, lolz, it sure as heck wasn't designed with that use case in mind. At all. Shucks no.
omgplzmodfunnykthxbye
Imagine a Beowulf cluster of these, imagining you. In Soviet Russia. In Hot. Grits-Filled. PANTS. That is all.
Am I supposed to (group)hate @pple or not?!
For sale. Baby shoes. Never worn.