Slashdot Mirror


Oracle Promises 100x Faster DB Queries With New In-Memory Option

Hugh Pickens DOT Com writes "ZDNet reports that Oracle's Larry Ellison kicked off Oracle OpenWorld 2013 promising a 100x speed-up when querying OLTP databases or data warehouse batches by means of a 'dual format' that keeps both row and column in-memory formats for the same data and table. Using Oracle's 'dual-format in-memory database' option, every transaction is recorded in row format simultaneously with writing the same data into a columnar database. 'This is pure in-memory columnar technology,' said Ellison, explaining that means no logging and very little overhead on data changes while the CPU core scans local in-memory columns. Ellison followed up with the introduction of Oracle's new M6-32 'Big Memory Machine,' touted to be the fastest in-memory machine in the world, hosting 32 terabytes of DRAM memory and up to 384 processor cores with 8 threads per core."
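
As a rough illustration of the "dual format" idea in the summary, here is a minimal Python sketch in which every insert is written to a row layout and a column layout at the same time. The class name `DualFormatTable` and its methods are hypothetical and say nothing about how Oracle actually implements the option; the sketch only shows why a whole-record lookup favors rows while a single-column aggregate favors columns.

```python
class DualFormatTable:
    """Toy table that keeps the same data in row form and in column form."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = []  # row format: one tuple per record
        self.column_store = {name: [] for name in self.columns}  # column format

    def insert(self, record):
        """Write one record to both representations in the same call."""
        row = tuple(record[name] for name in self.columns)
        self.rows.append(row)
        for name, value in zip(self.columns, row):
            self.column_store[name].append(value)

    def get_row(self, index):
        """Transactional-style access: fetch a whole record by position."""
        return dict(zip(self.columns, self.rows[index]))

    def column_sum(self, name):
        """Analytic-style access: scan one column and nothing else."""
        return sum(self.column_store[name])


table = DualFormatTable(["order_id", "customer", "amount"])
table.insert({"order_id": 1, "customer": "alice", "amount": 30})
table.insert({"order_id": 2, "customer": "bob", "amount": 12})
print(table.get_row(1))
print(table.column_sum("amount"))
```

The cost of this design is visible in `insert`: every write touches both structures, which is the overhead the keynote claims to keep small.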

16 of 174 comments

  1. Oracle gains speed by rossdee · · Score: 4, Funny

    Especially upwind, but not 100x

    Still, Emirates Team NZ only need to win one more race to take back the America's Cup.

    1. Re:Oracle gains speed by Chrisq · · Score: 4, Insightful

      At first glance, one would think "Oracle" is a company devoted to catering high end golf outings and boat racing.

      They make software, right?

      Only as a means of raising the money for high end golf outings and boat racing.

    2. Re:Oracle gains speed by pinkstuff · · Score: 3, Interesting

      An estimated $100m USD. The winner sets the rules for the next one, so if New Zealand wins they will lower the cost, allowing more teams to compete.

    3. Re:Oracle gains speed by ae1294 · · Score: 4, Funny

      At first glance, one would think "Oracle" is a company devoted to catering high end golf outings and boat racing.

      They make software, right?

      Only as a means of raising the money for high end golf outings and boat racing.

      With blackjack and hookers....

    4. Re:Oracle gains speed by ooshna · · Score: 4, Funny

      In fact forget the boat racing and the blackjack.

  2. Great news for NSA by Anonymous Coward · · Score: 4, Funny

    With increasing surveillance on American citizens such database will provide security forces with instant profile of each person. Let's combine that with license plate scanning, cell phone tracking, sexual preferences and health records.

    Now we can sleep well at night, our children are safe.

  3. One Big Memory Machination by VortexCortex · · Score: 5, Funny

    "Big Memory Machine"... So, they finally built Deepthought?

    In-memory IO is grand, when that's your bottleneck. Mine tends to be at the network level, so I use a local daemon for query result caching at the application level as an "in-memory" speedup. The speedups are nice, but pricey. Color me unimpressed -- that's pink, BTW; I'm a Caucasoid, your colors may vary, but only up to VARCHAR(20);

    Ugh. Is "in memory" now just another buzzword? I guess we've come full circle back to Mainframe? Big memory banks are faster and better for a while, but then the bandwidth goes up and the price, reliability and scalability will favor distributed systems (as currently). I wonder which phase of the cycle quantum computing will favor: distributed / localized? You have to take into consideration your user distribution too...

    So, eventually you'll want a hybrid system where the memory is distributed and cloned at each query-able interface, but still maintaining the entire dataset "in memory"...
    SELECT * FROM earth WHERE answer LIKE "everything";
    ...
    42 rows returned

    1. Re:One Big Memory Machination by Anonymous Coward · · Score: 3, Funny

      If you used ECC memory, the answer wouldn't have been 42.

    2. Re:One Big Memory Machination by doomsayerxero · · Score: 3, Funny

      That should be Varchar2(20).

      --
      Don't screw up, don't throw up.
  4. This merely allows poor code to suck less. by pla · · Score: 4, Interesting

    First, let me say that I would love to have a table option to keep a particularly heavily-hit table always in memory.

    This ain't it.

    From TFA, "Maintaining those indexes is expensive and slows down transaction processing. Let's get rid of them," Ellison remarked. "Let's throw all of those analytic indexes away and replace the indexes with in-memory column sort."

    This merely minimizes the penalties of poor indexing and RBAR by making complete table scans on arbitrary columns faster. Apparently Mr. Ellison has forgotten his algorithmics and combinatorics - Oh, wait, no he didn't, he dropped out as a sophomore. Pity, because had he stayed, he would have learned that even with a 1000x slower storage medium, an O(log N) algorithm (index seek) will eventually beat an O(N log N) algorithm (column sort).

    Thanks, Larry, but you want to make Oracle faster? Remove cursors from the core language, and although that alone won't "fix" it, you'll see all the hacks who can't think in set-based logic drop out overnight.

    1. Re:This merely allows poor code to suck less. by homb · · Score: 3, Informative

      From TFA, "Maintaining those indexes is expensive and slows down transaction processing. Let's get rid of them," Ellison remarked. "Let's throw all of those analytic indexes away and replace the indexes with in-memory column sort."

      This merely minimizes the penalties of poor indexing and RBAR by making complete table scans on arbitrary columns faster. Apparently Mr. Ellison has forgotten his algorithmics and combinatorics - Oh, wait, no he didn't, he dropped out as a sophomore. Pity, because had he stayed, he would have learned that even with a 1000x slower storage medium, an O(log N) algorithm (index seek) will eventually beat an O(N log N) algorithm (column sort).

      I think you misunderstand the way columnar databases work. They are not doing a column sort the way you think. The column itself is an index.
      Of course the inanities coming out of Ellison's mouth don't help explain things correctly. No Larry, you don't do away with indexes. You mostly store indexes on everything, automatically.

      Thanks, Larry, but you want to make Oracle faster? Remove cursors from the core language, and although that alone won't "fix" it, you'll see all the hacks who can't think in set-based logic drop out overnight.

      Can't argue there!

    2. Re:This merely allows poor code to suck less. by oranGoo · · Score: 3, Insightful

      From TFA, "Maintaining those indexes is expensive and slows down transaction processing. Let's get rid of them," Ellison remarked. "Let's throw all of those analytic indexes away and replace the indexes with in-memory column sort." This merely minimizes the penalties of poor indexing and RBAR by making complete table scans on arbitrary columns faster. Apparently Mr. Ellison has forgotten his algorithmics and combinatorics - Oh, wait, no he didn't, he dropped out as a sophomore. Pity, because had he stayed, he would have learned that even with a 1000x slower storage medium, an O(log N) algorithm (index seek) will eventually beat an O(N log N) algorithm (column sort).

      RTA - the improvement is there specifically for real-time analytic workloads. In these kinds of workloads the optimal algorithm is O(n) in the general case and indexes are useless (the query optimizer will always choose scans, as you need to visit a lot of data). You might know a thing or two about algorithms, but you should brush up on problem analysis 101.

      Other mistakes in logic: index seek and column sort are not different algorithms for the same task, so comparing them brings little insight (without considering some other details of the query optimizer). This leads you to the nonsensical claim that O(log N) will eventually beat or be equal to O(N log N). It is not eventually; the first will always be faster or equal.

    3. Re:This merely allows poor code to suck less. by pla · · Score: 3, Interesting

      Okay, you'll have to 'splain it to me as well then, because I don't see the joke (and only refrained from posting substantially the same response because an AC beat me to it).

      Memory runs roughly 1000x faster than disk (it can get down to around 50x on an array of SSDs, but up to 100,000x for random seeks across physical platters). Holding all else equal, 1000*O(log N) will take longer than 1*O(N log N) until N=1000, despite the lower time complexity. Additionally, the AC made a good point about the relative constant factor of the algorithms themselves, in that a binary search of a sorted list has virtually no overhead, while a good general purpose sorting algorithm does.


      And all that said, yes, I see now that I made an error because this change applies to columnar rather than row-oriented data; O(log N) will still eventually beat O(N), however.
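
      The crossover argument in this comment can be checked with a few lines of Python. This is only a cost-model sketch: the 1000x penalty is the figure quoted above, the function names are made up for illustration, and real query costs depend on far more than these formulas. Under the model, the penalized O(log N) seek starts beating the in-memory O(N log N) sort just past N = 1000, and it overtakes a plain O(N) scan somewhat later, on the order of ten thousand rows.

```python
import math

DISK_PENALTY = 1000  # assumed constant factor: slower medium costs 1000x per step


def index_seek_cost(n, penalty=DISK_PENALTY):
    """Cost model for an O(log N) index seek on the slower medium."""
    return penalty * math.log2(n)


def column_sort_cost(n):
    """Cost model for an O(N log N) sort in memory."""
    return n * math.log2(n)


def column_scan_cost(n):
    """Cost model for an O(N) scan in memory."""
    return n


def first_n_where_seek_wins(other_cost, start=2):
    """Smallest N >= start at which the penalized seek is strictly cheaper."""
    n = start
    while index_seek_cost(n) >= other_cost(n):
        n += 1
    return n


print(first_n_where_seek_wins(column_sort_cost))
print(first_n_where_seek_wins(column_scan_cost))
```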

  5. Re:Beowulf by Anonymous Coward · · Score: 5, Funny

    But does it run on Solaris?

    Sadly, no. In fact it won't even function as a Minecraft server without some patches and a Java update. I hear that Oracle is still waiting on the vendor for Java update.

  6. Re:Like Microsoft SQL Server by SDrag0n · · Score: 3, Interesting
    --
    I don't have time to make a sig
  7. Re:Oracle-friendly site(s) by bluefoxlucid · · Score: 3, Informative

    No, I mean MongoDB will take a 3 database cluster and let you "Replica Acknowledge" a transaction with "Majority" count. Once it hits 50%+1 servers, it's 100% guaranteed solid unless you lose both servers. If both servers suffer a power drop at that point, the last server refuses to accept writes; when those servers come back, they will replay their oplog back to the last server to synchronize it. There's one flaw here: there's no "Replica Journal Acknowledge", so it's theoretically possible to lose that transaction anyway; both servers have to suffer a system failure (power drop, kernel panic) within 100 ms of receiving the operation, since they write out their data to disk every 100 ms. In practice this is extremely unlikely.

    That means once you've sent it and gotten back that it's written, it is written. You'd have to lose both (or more--3 servers in a cluster of 5, etc.) servers' power or hard drives (corruption, failure) before the data is propagated further.
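
    The "50%+1" rule described above can be sketched as a small Python model. This is a toy under stated assumptions, not MongoDB's API: `majority`, `write_is_acknowledged`, and `survives` are hypothetical helpers, and the model ignores journaling and the flush window mentioned earlier.

```python
def majority(cluster_size):
    """Number of acknowledgements that counts as a majority (50% + 1)."""
    return cluster_size // 2 + 1


def write_is_acknowledged(acks, cluster_size):
    """True once enough members have confirmed the write."""
    return acks >= majority(cluster_size)


def survives(members_with_write, failed_members):
    """The write survives while at least one member holding it is still up."""
    return bool(set(members_with_write) - set(failed_members))


print(majority(3), majority(5))
print(write_is_acknowledged(acks=1, cluster_size=3))
print(write_is_acknowledged(acks=2, cluster_size=3))
print(survives({"a", "b"}, failed_members={"a"}))
print(survives({"a", "b"}, failed_members={"a", "b"}))
```

    For a 3-member cluster the write needs 2 acknowledgements, and it is lost in this model only if both members that hold it fail, which matches the "lose both servers" case in the comment.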

    By contrast, Percona and MariaDB have XtraDB. XtraDB does optimistic locking: in normal autocommit, the transaction might get rolled back silently--it will write successfully to one server and return success, but if another server simultaneously gets a write that conflicts and starts propagating it then the transaction will be silently rolled back (i.e. undone, removed, lost, failed). With BEGIN-COMMIT transactions, you may get a Deadlock on "COMMIT" and then you're informed that it did in fact roll back the transaction and you must re-submit (i.e. do this if you actually care about durability of the data). With autocommit, as well as with any transactions (even explicit COMMIT) on MySQL master-slave replication or PostgreSQL WAL replication, you may in fact be informed that the transaction is 100% committed and then have that server FAIL and the slave comes up without that transaction--unavoidable silent data loss.

    The failure mode expressed by MySQL master-slave replication and PostgreSQL WAL replication in the default asynchronous streaming replication mode is the same failure mode as with "Journaled" write concern in MongoDB. When running "Journaled" rather than "Replica Acknowledged," you write to exactly one server and are told it's committed when it's written to disk--it's durable on that server, but not necessarily replicated. If that server power drops and comes back up, it may find new operations have made its non-replicated operations invalid; it will then silently roll those back.
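
    The asynchronous failure mode described above fits in a short Python simulation. `AsyncPair` is a made-up toy, not any database's replication code: it assumes the primary acknowledges a commit as soon as it is in its own log, and that the standby only learns about it when `ship` runs.

```python
class AsyncPair:
    """Toy primary/standby pair with asynchronous replication."""

    def __init__(self):
        self.primary_log = []
        self.standby_log = []

    def commit(self, txn):
        """Record the transaction on the primary and acknowledge immediately."""
        self.primary_log.append(txn)
        return True  # the client sees success before the standby has the data

    def ship(self):
        """Copy everything the standby is missing (closes the replication lag)."""
        self.standby_log.extend(self.primary_log[len(self.standby_log):])

    def fail_over(self):
        """Lose the primary; return acknowledged transactions the standby never got."""
        lost = self.primary_log[len(self.standby_log):]
        self.primary_log = list(self.standby_log)  # standby becomes the new primary
        return lost


pair = AsyncPair()
pair.commit("t1")
pair.ship()
pair.commit("t2")  # acknowledged, but not yet shipped
lost = pair.fail_over()
print(lost)
```

    The client was told that `t2` committed, yet it is gone after failover. The amount lost is whatever was committed since the last `ship`, which is the replication delay.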

    Therefore, in cluster layouts, it is possible for MongoDB to have a negligible reliability advantage over PostgreSQL's most common replication methods. PostgreSQL has settings that make up that last bit of reliability, putting it roughly on par with MongoDB. MongoDB has a guaranteed "It has reached enough servers that it is valid on the cluster unless God hates you" write concern by which the data is likely to actually be there if it tells you it's there, unless a very specific subset of servers experience a catastrophic failure in an extremely small (tenths of a second) window--a subset large enough to take down your entire cluster.

    Short version: MongoDB allows you to, on a per-query basis, write data into the database at any level of reliability that MySQL and PostgreSQL provide. Single-server Journaled (WAL log shipping, WAL asynchronous streaming, MySQL master-slave replication), multi-server Replica Acknowledged (PostgreSQL WAL synchronous streaming), and a single-server "Acknowledged" mode that is faster but gives a weaker data durability guarantee (transaction is valid, not yet to disk, and not replicated).

    *"PostgreSQL streaming replication is asynchronous by default. If the primary server crashes then some transactions that were committed may not have been replicated to the standby server, causing data loss. The amount of data loss is proportional to the replication delay at the time of failover."