Slashdot Mirror


Facebook Trapped In MySQL a 'Fate Worse Than Death'

wasimkadak writes with this excerpt from GigaOM: "According to database pioneer Michael Stonebraker, Facebook is operating a huge, complex MySQL implementation equivalent to 'a fate worse than death,' and the only way out is 'bite the bullet and rewrite everything.' Not that it's necessarily Facebook's fault, though. Stonebraker says the social network's predicament is all too common among web startups that start small and grow to epic proportions."

61 of 509 comments

  1. Commercial databases by drolli · · Score: 2, Interesting

    Well, then they convert from one DB to another. So what? It's not like that would be a completely new thing to happen, and I am sure that Oracle or any other big DB provider will send experts to help with the task.

    1. Re:Commercial databases by svick · · Score: 2

      SQL is a standard, but every provider implements it differently, with their own additions. So, for any non-trivial uses of SQL, you need to do at least some changes.

      In some cases, the changes could be really big. Especially when using some of the more complex features, like the support for recursive queries.

    2. Re:Commercial databases by jjohnson · · Score: 4, Insightful

      A minor difference that exists in 4,000 instances and who knows how many places in the code that's also distributed across multiple servers, isn't minor, especially when there are hundreds or even thousands of minor differences.

      And no, the differences in SQL between Oracle and MySQL aren't minor. It's not just syntax, and it's not MySQL-can-Oracle-can't. It's the performance characteristics of various queries, the logic of how they're implemented, and the incredible investment in configuring a large cluster to work smoothly (which MySQL and Oracle do extremely differently). Large scale systems add a layer of complexity all their own that's a totally separate engineering challenge.

      Short version: Switching from MySQL to anything else would be the equivalent of a ground-up rewrite, though this is largely true of any database system. MySQL hasn't somehow uniquely trapped them here.

      --
      Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.
    3. Re:Commercial databases by Dr.Dubious+DDQ · · Score: 2

      I would guess that instead of using PDO or similar abstraction layer, their PHP code is littered with "mysql_*" function calls, so they'll necessarily need to modify everything to handle any other database.

      Or just wait for enough people to move to Google+ instead so that their database load is reduced...

    4. Re:Commercial databases by NeoMorphy · · Score: 4, Insightful

      If you want to start a fun/interesting project that you don't expect any revenue from, it makes more sense to use free software. MySQL is a popular choice for web applications and there is a lot of freely available documentation and examples. Many people have been successful doing it, so it's a proven path that works.

      Oracle is expensive. It would have cost a fortune to start Facebook with Oracle, and I can't imagine what it would cost them now. But even if they have to hire a ton of experts to convert to Oracle (assuming that is the best thing to do...), they can probably fund it with the money saved by not using Oracle over the past couple of years.

      Maybe Oracle would have been a mistake; there are companies migrating from Oracle to DB2/DB2 to Oracle/Oracle to Sybase/Sybase to MySQL/Mainframe to AIX/AIX to Solaris/Solaris to Linux/etc. It seems like nobody can agree on the best hardware/OS/database solution, but there are plenty of people who swear that the solution they know is the best one.

    5. Re:Commercial databases by Billly+Gates · · Score: 3, Interesting

      I went for a job interview a few years ago which was very SQL-intensive. I looked at some SQL code in C# for both ODBC as well as direct SQL Server code, and it was the most complex thing I have ever seen and frankly hair-pullingly ugly. It was no simple UPDATE INTO TABLE like simple MySQL with PHP.

      Rather, it was weird ASYNC VSYNC Data.adaptor,x and weird esoteric lines consisting of 35 to 40 lines of code for each insert doing God knows what! Maybe a SQL programmer can explain what a Vsync was and what a data adaptor is and why that code was so evil? The question was how to fix it. I realized I was obviously not qualified for the job.

      I googled the code and it seemed it was operating optimally with all that stuff. Sure you can use a simple insert statement, but that is frowned upon as not optimized by SQL programmers. Most of them use very complicated steps and layers of abstraction where KISS is frowned upon, because if the database doesn't perform well you do not want to look like an *ss.

      PHP is really nice for its simplicity, but as soon as you move to Java or C# it gets very ugly and each database requires different code and optimization techniques and a rewrite if you change your schema. Again, I am not a SQL programmer so hopefully I won't get bashed too much here by the real ones, but it is just what I observe as I want to learn this myself. I have a feeling this is why Hibernate is becoming popular: to avoid these things, as it is a framework that does some of the nasty details in a separate layer ... at least from what I read.

      But with Java or .NET you can get caching, transaction control, and other neat things you can't get with PHP, but it comes at a cost. Same is true with a real database like PostgreSQL, SQL Server, DB2, or Oracle. SQL statements are a small part of the code and the rest is proprietary code for working with the RDBMS. My hunch is the vendors love this as it encourages lock-in with expensive licensing fees.

    6. Re:Commercial databases by kimvette · · Score: 2

      SQL is a standard, but no, "SQL" isn't standard. There are syntax differences between databases, and if you get into stored procedures (or equivalent) and triggers (or equivalent), or rely on referential integrity (which is implemented on some RDBMS systems, but not others, and doesn't always work the same), it won't be a matter of dumping the database from one RDBMS and then importing it to another RDBMS. Things are going to break.

      I'd hate to have to deal with a(-) Facebook dump file(s); I'm sure everything isn't crammed into a single table, or even a single database, but I'd imagine it must be a horrible, fragile, scary mess even if the architecture is sound.

      That's where the beauty of MySQL, Postgres, etc. comes into play - not only are they easily scalable, but they are open source, so if you are a large organization, you can cook your own fork and address the shortcomings, unlike smaller organizations which lack the resources to even consider the "it's open source, fix it yourself" mantra. In fact, it'd be neat if for once Facebook did something less than evil and contributed significant enhancements to MySQL.

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    7. Re:Commercial databases by laffer1 · · Score: 4, Informative

      This isn't true. I just migrated an application from MySQL 4.1 to PostgreSQL 9.0 at work. It took me about two weeks, but it was certainly not a complete rewrite from scratch. It varies greatly depending on the application, the language it's written in, the frameworks in use, and the number of product-specific features in use. This was a Perl / Mason app.

      If an application was making extensive use of stored procedures, then it would require a lot of effort to rewrite those, but not the whole application. If the application were written in C, it would be a lot of work to change. I think Facebook uses PHP and that's not too hard to change out, especially if they were sane and used an abstraction layer like PDO.

      If the app were written in Java or .NET and using an ORM, it would be TRIVIAL to change to another database.

      With my experience, the biggest problems were date functions and the fact that MySQL embeds index creation in the create table syntax whereas postgres requires it be separate and the names of indexes are global. This meant that I had some work cut out for me changing index names. There were also a few quirks with some join queries as MySQL is not picky about ordering in the from clause.

      You are correct that they'll have to tune queries and things, but it's not a total rewrite if they wrote their app in a reasonable way.

      For the record, PostgreSQL 9 is faster for many of our queries but seems slower doing INSERT. YMMV

    8. Re:Commercial databases by Surt · · Score: 2

      This is exactly why anyone in their right minds puts some sort of ORM/query layer in front of their database so that their mid-tier/front-end code has no knowledge of what the sql looks like.
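
      A minimal sketch of what such a layer might look like, in Python. Everything here (the `UserStore` interface, both implementations, and `signup_and_list`) is hypothetical and only illustrates the idea; it says nothing about how Facebook's code is actually organized.

```python
import sqlite3
from abc import ABC, abstractmethod


class UserStore(ABC):
    """Hypothetical access-layer interface; callers never see any SQL."""

    @abstractmethod
    def add_user(self, name):
        ...

    @abstractmethod
    def newest_users(self, limit):
        ...


class SqliteUserStore(UserStore):
    """One backend. All SQL dialect details stay inside this class."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE users ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " name TEXT NOT NULL)"
        )

    def add_user(self, name):
        self._conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

    def newest_users(self, limit):
        cur = self._conn.execute(
            "SELECT name FROM users ORDER BY id DESC LIMIT ?", (limit,)
        )
        return [row[0] for row in cur]


class InMemoryUserStore(UserStore):
    """A second backend with no SQL at all, to show the swap."""

    def __init__(self):
        self._names = []

    def add_user(self, name):
        self._names.append(name)

    def newest_users(self, limit):
        return self._names[::-1][:limit]


def signup_and_list(store, names, limit):
    """Mid-tier code: depends only on the UserStore interface."""
    for name in names:
        store.add_user(name)
    return store.newest_users(limit)
```

      Swapping backends then touches one class rather than every call site. It does not make the per-database tuning work go away; it only confines that work to one place.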

      --
      "Who is the Journal of Quantum Physics going to believe?" --Stephen Hawking
    9. Re:Commercial databases by Loconut1389 · · Score: 2

      If more hosts offered PostgreSQL, I would use it, but otherwise my clients' options get limited.

    10. Re:Commercial databases by kimvette · · Score: 2

      And, to add to that, Facebook is insane if they didn't implement what is commonly called an "access layer" for abstraction, so that the system can be rapidly ported from one RDBMS to another. However, even if they did implement that in their architecture, some issues come up: is it implemented throughout the project, or did some developers bypass it for performance, and is it intermingled with presentation code? Can they re-implement the access layer without performance suffering? Does the new RDBMS provide similar performance under their circumstances (a faster DBMS isn't always faster if it's not highly optimized for a corner case that another RDBMS by pure chance happens to excel at).

      So no, it's not a matter of export/import and forget about it. Unless they were smart about it from the very beginning (doubtful), it could be painful - and even if they did have the foresight to make it modular, it doesn't mean that Oracle would actually perform better for them.

      But, I think few outside of Facebook would know the answer to those questions, and I think given the size Facebook has grown to, I'm sure that they have the staff on hand to keep it under control. I'm far more interested in learning what Google uses on their back end for a database that rarely if ever breaks under the immense load Google faces. THAT is more impressive than Facebook, IMHO because as far as I am concerned Facebook doesn't really matter since it's not essential; it's just a toy, but Google is essential.

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    11. Re:Commercial databases by curunir · · Score: 2

      In my experience, you don't really have to worry about the types of SQL that MySQL won't accept or the specialized syntax that MySQL will accept. The biggest pain is worrying about the SQL that MySQL will accept and then run with the stupidest execution plan possible. This leads to unintuitive queries designed around MySQL's shortcomings.

      For example, one hack we had to employ regularly was to select only the columns from a table that were part of the WHERE or ORDER BY clauses and then join back against the original table once the results had been filtered and sorted. When we just selected normally from the table, MySQL would retrieve the entire rows and then determine that the amount of data it had to sort through exceeded the amount of RAM we had on the machine so it must use a file sort. This technique would commonly bring queries that were taking days to complete down to under 1 minute.
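
      The shape of that hack can be sketched as follows, using Python's built-in sqlite3 purely so the example runs anywhere. The `articles` table, its columns, and the query names are made up for illustration, and SQLite will not reproduce the MySQL file-sort behaviour described above; the sketch only shows that the two query forms return the same rows.

```python
import sqlite3


def build_demo_db():
    """Create a small table where `body` stands in for the wide columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        " id INTEGER PRIMARY KEY,"
        " author_id INTEGER,"
        " created_at INTEGER,"
        " body TEXT)"
    )
    rows = [(i, i % 5, 1000 - i, "x" * 200) for i in range(1, 101)]
    conn.executemany("INSERT INTO articles VALUES (?, ?, ?, ?)", rows)
    return conn


# Straightforward form: filter, sort and fetch the wide rows in one go.
NAIVE = """
SELECT id, body
FROM articles
WHERE author_id = ?
ORDER BY created_at
LIMIT 10
"""

# Workaround form: filter and sort on the narrow columns only,
# then join back to the table to pick up the wide columns.
DEFERRED = """
SELECT a.id, a.body
FROM (SELECT id, created_at
      FROM articles
      WHERE author_id = ?
      ORDER BY created_at
      LIMIT 10) AS narrow
JOIN articles AS a ON a.id = narrow.id
ORDER BY narrow.created_at
"""


def run(conn, sql, author_id):
    return conn.execute(sql, (author_id,)).fetchall()
```

      Both forms are logically equivalent here, which is the point: the second one exists only to steer the planner, and a reader has to work out that it means the same thing as the first.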

      In the end, our application was full of this kind of "keep MySQL from doing the stupidest thing possible" queries that would probably run decently under Oracle or PostgreSQL but would be better written in a way that allows a real SQL database to perform its own optimization and allows the developer to easily grok what the query does. It's that kind of SQL that's the most insidious. The kind of SQL that MySQL forces on you not through syntax errors but through painful lessons.

      --
      "Don't blame me, I voted for Kodos!"
    12. Re:Commercial databases by qwijibo · · Score: 3, Interesting

      Ability to convert depends completely on the application. If the MySQL app is written using simple or at least standard SQL, it will be easy to migrate. However, MySQL has some very problematic areas (i.e.: select foo from table1 where id in (select id from table2 where criteria='something')) that make people do some very nonstandard and MySQL-only style fixes to address performance. The query shown, with 5000 rows in table1, 50000 in table2, and only 50 rows in table2 meeting the criteria, took ~10ms on PostgreSQL 8.3, and 52 minutes on MySQL 5.1 on the same hardware. The only way I could find to get the ~10ms performance on MySQL was so goofy that MySQL itself refused to allow me to create a view from that select statement.

      Converting from PostgreSQL to Oracle has always seemed much easier and smoother, but PostgreSQL isn't as popular as MySQL because it hasn't been as easy to throw hardware at problems with scaling PostgreSQL, whereas MySQL has always made that option easier.

      Each database has its own pros and cons, but most times you don't discover how hard it is to migrate until it's too late.

    13. Re:Commercial databases by LurkerXXX · · Score: 2

      Oh give MySQL a break. They finally fixed it so it no longer recognizes February 31st as a real date, so they are making progress.

    14. Re:Commercial databases by sphealey · · Score: 2

      > You write the code that actually does the queries as stored procedures
      > in the database, then write a DAL that essentially works as a database
      > driver. Your code does nothing to the DB other than requesting that it
      > execute an SP, and the SPs can be tuned for the specific database server.

      True, but that is not writing "database independent code" - it is writing separate versions of the code for each database and building a good UI that can be configured to be compatible with all of the versions. That's exactly what Tom Kyte (of Oracle) recommends as a development strategy, and there is a funny case in _Tales of the Oak Table_ (a book I recommend to everyone involved with databases) where Oracle consultants called to a customer site ended up writing a second version of their customer's code for SQL Server, cleaning up the original version for Oracle, and having both be 100x faster than the original "independent" code with essentially the same total number of source code lines.

      > Note that I've explicitly excluded Oracle from that list, as I've never once seen a
      > production Oracle database ever reach the performance of... well, any other
      > database server, really. I don't doubt that Oracle can be made fast; I just doubt
      > that getting the personnel who know how to do that and paying them to do so is
      > worth the cost compared to easier to use, less expensive, and faster out of the
      > box systems like, oh, DB2 and SQL Server.

      We'll have to agree to disagree here. I agree there is some truth in what you say about needing people who actually understand Oracle; if it is installed and used as if it were Access then it doesn't work well. My personal experience however is more with an application running with decent performance on a single Oracle instance being replaced with a "cluster" of SQL Server that starts out with 10 machines (5 databases, 5 of whatever application server Microsoft is pushing this year) and ends up as a cluster(-f***) of 300 or more machines and a constant cry for "more RAM", "more budget", "more servers" - and is never as fast as the system it replaced.

      sPh

    15. Re:Commercial databases by mooingyak · · Score: 2

      ...how the heck do you write performant code that works against both databases?

      Erm... writing _two_ code baselines that provide the same high level interface, perhaps. It's not as if that same problem hasn't been dealt with several times before (compiling to different architectures, for example).

      Sometimes. I once dealt with a database called UNIFY which had a piss-poor query planner. It tended to overwhelmingly favor certain types of indexes (which were built implicitly for you whenever you had a foreign key relationship) over any other kind. We had a frequently used query against a sku database on style, color, and size. There were indexes on any combination of those fields, but color was also a foreign key to the color table. Which meant that it ALWAYS used the color index. Problems rose up when the conditions were something like color = black, style = blah, size = medium. There were maybe 30 skus for any given style. There were around 650k skus with color = black. There were a number of ways we could solve this, but what ended up working out for us was to only query against style/size and then run the output through a filter that kicked out all the rows with the wrong color.
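
      A rough sketch of that workaround in Python, with sqlite3 standing in for UNIFY (which I can't reproduce here). The `skus` table layout, the index name, and `find_skus` are all assumptions made up for the example.

```python
import sqlite3


def build_sku_db():
    """Tiny stand-in for the sku database: 2 styles x 3 colors x 3 sizes."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE skus ("
        " sku INTEGER PRIMARY KEY,"
        " style TEXT, color TEXT, size TEXT)"
    )
    conn.execute("CREATE INDEX skus_style_size ON skus (style, size)")
    rows = []
    sku = 0
    for style in ("blah", "other"):
        for color in ("black", "red", "blue"):
            for size in ("small", "medium", "large"):
                sku += 1
                rows.append((sku, style, color, size))
    conn.executemany("INSERT INTO skus VALUES (?, ?, ?, ?)", rows)
    return conn


def find_skus(conn, style, size, color):
    """Query on style/size only, then drop wrong-color rows in the app."""
    cur = conn.execute(
        "SELECT sku, style, color, size FROM skus"
        " WHERE style = ? AND size = ?"
        " ORDER BY sku",
        (style, size),
    )
    # The color predicate is deliberately kept out of the SQL so the
    # planner never gets the chance to pick the color index.
    return [row for row in cur if row[2] == color]
```

      This only pays off when the style/size result is small (around 30 rows per style in the case above), since every candidate row is shipped to the application before being filtered.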

      While you can still abstract something like that, it gets to be a bitchy problem and makes the overall work much more complicated.

      --
      William of Ockham had no beard. The most likely explanation is that it was chewed off by squirrels every morning.
    16. Re:Commercial databases by LordThyGod · · Score: 2

      If someone wanted/needed transactions, why would they ever opt for an engine that would not work for them? Ignorance? Laziness? Secondly, the "default engine" does indeed support transactions. And referential integrity.

    17. Re:Commercial databases by sphealey · · Score: 2

      > If you're actually suggesting writing X versions of stored procedures
      > just to be able to run on X different DBs that's not solving the
      > migration problem. That's continuously living in the migration problem.

      Yes, that is exactly what the parent is suggesting, since experience shows it is the only way to get correct, performant, scalable systems. Organizations that actually have a need to support multiple databases (few truly do) generally find that this method is in the end less labor-intensive than attempting to recreate the features and services that Oracle, IBM, Microsoft/Sybase have spent 10s of millions of manhours developing over 30 years; instead they use those features that they (and their customers, if they are software vendors) paid the big bucks for.

      I would ask that you think deeply about this and do some research before replying; keep in mind that "common practice since 2000" is not the same thing as most efficient practice.

      sPh

    18. Re:Commercial databases by LordThyGod · · Score: 2

      I'm sorry, but unless you're positive your project is a toy project and will always remain so, it is extremely unlikely MySQL can be justified for a project.

      Unless you are looking to do something like Facebook. The proof is in the pudding. It works. It stays up. It rarely seems to suffer performance issues. It's not been hacked (that I know of). Where's the real world problem? All the chest thumping over "I could have done it better" is just hooey.

    19. Re:Commercial databases by drolli · · Score: 2

      The fact is: there is no single best solution. Specific bottlenecks will require specific solutions.

    20. Re:Commercial databases by JAlexoi · · Score: 2

      Maybe because Oracle forbids any public statements that could tarnish their image? When you license Oracle DBMS, you sign away a lot of your rights as a consumer, including the right to complain about Oracle DBMS to anyone other than Oracle.

    21. Re:Commercial databases by WaffleMonster · · Score: 2

      This is exactly why anyone in their right minds puts some sort of ORM/query layer in front of their database so that their mid-tier/front-end code has no knowledge of what the sql looks like.

      LOL because we all know that's how you improve performance.

    22. Re:Commercial databases by Rayonic · · Score: 2

      (i.e.: select foo from table1 where id in (select id from table2 where criteria='something'))

      Have you tried using a join and if so how well does it work?

      e.g. select table1.foo from table1 inner join table2 on table1.id=table2.id where table2.criteria='something'

      Or this might be better, depending on the quality of the query optimizer:
      select table1.foo from table1 inner join table2 on table1.id=table2.id AND table2.criteria='something'
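
      A quick way to check that the subquery and the two join forms agree, sketched with Python's sqlite3 and made-up sample data. One assumption is baked in: table2.id is unique (a primary key here). If it were not, the joins could return duplicate rows that the IN form would not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, foo TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, criteria TEXT);
    """
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(i, f"foo{i}") for i in range(1, 11)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(i, "something" if i % 3 == 0 else "other") for i in range(1, 21)],
)

# The original IN-subquery form.
SUBQUERY = (
    "SELECT foo FROM table1 WHERE id IN"
    " (SELECT id FROM table2 WHERE criteria = 'something')"
    " ORDER BY foo"
)

# Join with the filter in the WHERE clause.
JOIN_WHERE = (
    "SELECT table1.foo FROM table1"
    " INNER JOIN table2 ON table1.id = table2.id"
    " WHERE table2.criteria = 'something'"
    " ORDER BY table1.foo"
)

# Join with the filter folded into the ON clause.
JOIN_ON = (
    "SELECT table1.foo FROM table1"
    " INNER JOIN table2 ON table1.id = table2.id"
    " AND table2.criteria = 'something'"
    " ORDER BY table1.foo"
)


def foos(sql):
    """Run a query and return the foo column as a plain list."""
    return [row[0] for row in conn.execute(sql)]
```

      For an inner join the WHERE and ON placements are logically the same; whether either one is faster than the IN form depends entirely on the optimizer, which this sketch does not measure.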

  2. Still their fault by nurb432 · · Score: 3, Informative

    Once they started the trend to grow beyond being a toy, they should have redone things right then.

    Waiting until you are painted in a corner is irresponsible.

    --
    ---- Booth was a patriot ----
    1. Re:Still their fault by Tridus · · Score: 2

      PostgreSQL can be had at a similar price point to MySQL, only it is better at pretty much everything else.

      --
      -- "So they told me that using the download page to download something was not something they anticipated." - Bill Gates
  3. Subject line should read... by arglebargle_xiv · · Score: 2

    ... "Michael 'Ingres' 'Postgres' 'VoltDB' Stonebraker says 'MySQL doesn't scale'".

  4. "We're so new" by michaelmalak · · Score: 5, Insightful

    I love the snippets "After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed" from the article and "We’ve been using stone age technology to solve problems that didn’t exist 30 years ago." Yes, the problems existed 30 years ago, such as (land-line) telephone billing. I don't know how those problems were solved -- probably with a mainframe and a custom non-SQL database and not a PC running a SQL-based server -- but they were solved.

  5. Successful Troll is Successful by tyler_larson · · Score: 5, Insightful

    Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.

    Presumably he's tired of Facebook being used as a counter-example to everything he's been preaching.

    --
    "With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
    RFC 1925
    1. Re:Successful Troll is Successful by fermion · · Score: 4, Interesting

      And note that two stories down it is reported that SAP is once again over budget and over schedule on a major implementation. So I suppose that now everyone will stop using SAP as it is unreliable.

      --
      "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
    2. Re:Successful Troll is Successful by rgmoore · · Score: 5, Insightful

      No, he's not an academic purist; he's a businessman who's selling a product that competes with MySQL. So he's trying to convince web startups to pay a bunch of money for his product rather than rely on free MySQL because he claims it will help them scale better than Facebook. IOW, businessman trashes competitor's product, claims you should buy from him instead. Nothing to see here.

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

    3. Re:Successful Troll is Successful by sco08y · · Score: 4, Interesting

      Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.

      Presumably he's tired of Facebook being used as a counter-example to everything he's been preaching.

      He's no academic purist. He's pushing his product, and he's either an outright liar or, worse, doesn't know what he's talking about:

      Stonebraker said the problem with MySQL and other SQL databases is that they consume too many resources for overhead tasks (e.g., maintaining ACID compliance and handling multithreading)

      Is that so? MySQL, as with virtually all SQL DBMSs, defaults to "repeatable read" transactional guarantees, and it doesn't even spend time guaranteeing foreign key relationships by default. About the only thing MySQL really guarantees out of the box is durability.

      It's just nonsense to talk about all the "wasted resources" when, if they don't need them, it's a few lines in a config file to turn them off.

  6. PostGreSQL is far better than MySQL by darthium · · Score: 2

    MySQL is 'fast' mainly because of its lack of features and robustness. Implying that market share means quality is like implying that current crappy pop music is better than classical music because of the market share it gets.

    1. Re:PostGreSQL is far better than MySQL by repetty · · Score: 2

      MySQL is 'fast' mainly because of its lack of features and robustness.

      I've read the same thing about C.

  7. Keep on backpedalling, you silly NoSQLers. by Anonymous Coward · · Score: 5, Interesting

    It's hilarious how the NoSQL fools are now constantly backpedalling these days.

    It turns out that writing database queries in JavaScript is a stupid idea! Imagine that! All of their attempts to invent a better query language end up being almost identical to, guess what, SQL!

    Then they realize that trying to maintain data consistency using logic written in JavaScript, Ruby or PHP doesn't work so well. Values go unconstrained, and the referential integrity gets fucked up. Soon the data is nearly worthless.

    The smarter/less-ignorant ones then think that they'll just use transactions. But wait, their NoSQL database of choice doesn't support that, or doesn't support it properly. So they tell themselves that their data will become "eventually consistent", or worse, they try to implement some shitty ass "transaction" support using Ruby. Regardless of the path chosen, failure is the result.

    Now they're realizing that it's mandatory to use a real relational database when working on anything remotely serious. So we see this bullshit about "no" now meaning "not only". That's funny, last month it meant "no", as in, "we will never write a SQL query again, and we will never use a relational database again."

    I'm going to make a prediction: Next month, we'll get to read articles and comments from them about these amazing new database systems that they've just discovered. These new systems avoid all of the problems associated with NoSQL databases! What are their names? Oracle, DB2, SQL Server, PostgreSQL and SQLite.

    1. Re:Keep on backpedalling, you silly NoSQLers. by Joe+U · · Score: 2

      Meh,

      If NOSQL really means Not Only SQL, then it's a smart idea.

      If it means re-writing relational database code to behave like SQL, then you should have been using SQL in the first place.

      If it means that your live object database that doesn't follow normal relationships has a much more efficient system, then you should not have been using SQL in the first place.

      Simply, use the right tool for the job.

      [I prefer a well defined stored procedure interface to my data, the slight amount of extra design time makes up for the fact that I not only reduce round trips to the DB server, I can scale it out easily without code changes on the web server. ]

  8. looks like facebook is doing just fine... by tommeke100 · · Score: 5, Insightful

    If anything, it's a success story for MySQL.

    1. Re:looks like facebook is doing just fine... by Dilaudid · · Score: 2

      I accept your point that it is impressive that such a mountain of suck runs so effectively on MySQL. However it is still a mountain of suck. Or to put it another way, I don't think computers were invented so some mini bill gates can try to cajole me into "poking" my friends for the purposes of selling advertising.

    2. Re:looks like facebook is doing just fine... by Doc+Ruby · · Score: 2

      You're right. Computers were invented to steer bombs to people and kill them. Through WWII and the 1960s missiles that brought us the IC and beyond.

      As mountainous as advertising suck is, bomb suck is worse. I like where computers have gone from their purely murderous beginnings.

      --
      make install -not war

  9. Re:Still their fault ;) by jjetson · · Score: 2

    I'm sure this opinion is based on your experiences with 750,000,000 user sites. Thanks for the input Miss Cleo.

  10. Michael Stonebraker & VoltDB by solferino · · Score: 4, Informative

    The guy in the article does have some cred. He was a professor at UC Berkeley for 29 years where he was project leader on Ingres and led the creation of its follow up, Postgres.

    His new database, VoltDB, based on the 'NewSQL' ideas touched on in the article, is Free Software licensed under the GPLv3.

  11. Re:Delusional editorialism! by kimvette · · Score: 3, Insightful

    So hurray for MySQL. They saved 45 minutes during their installation on day one and now they'll spend a year or two plus millions of dollars to move away from their extremely dumb and uneducated decision. That's got to be one of the most expensive 45 minutes on earth - and yet it's one of the single biggest decisions which MySQL users defend on a daily basis.

    IMHO it was a bargain - MySQL has worked up until now, it is still working, so as far as I am concerned that's a big success story for such a low-end free/free database - and it was a choice they made based on what they already had skills in and it enabled them to earn billions, so it was a very smart, inexpensive way for them to get started. Now for Facebook, spending a few million to get on to big iron is cheap money, whereas back in the day spending a week or two to really learn the ins and outs of Postgres or spending thousands on Oracle could have prevented them from surviving in the first place.

    --
    The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
  12. MySQL is facebook's issue? by sgt+scrub · · Score: 2

    I don't think so. Facebook isn't known for having a lot of down time. It is known for opening up information to the public. If anything, that would be considered too much up time. I've used MySQL and PostgreSQL. I found MySQL to be limited but most limitations were easily worked around in code. PostgreSQL wasn't as limited. However, the options that it provided forced the need to vacuum the database. I would rather write code but to each his own.

    --
    Having to work for a living is the root of all evil.
  13. That's not Facebook's problem by Animats · · Score: 5, Informative

    Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.

    Right.

    Some of the key architects of Facebook have spoken at Stanford about how the system is put together, and I went to that presentation and had a chance to talk to them. They didn't consider MySQL to be a bottleneck. Their big problem was PHP performance. They were writing a PHP compiler to fix that.

    Internally, the user-facing side of Facebook is in PHP. But the front end machines don't talk directly to the databases. They use an RPC system to talk to other machines that do the "business logic" parts of the system. Building a Facebook reply page may involve a hundred machines. There's heavy caching all over the system, of course, so the databases aren't hit for most read requests.
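
    A minimal sketch of the "cache first, database only on a miss" read path described above. This is an illustration under assumptions, not Facebook's code: the class name, the dict standing in for a memcached tier, and the fake database are all hypothetical.

```python
from typing import Callable, Dict, Optional


class LookAsideCache:
    """Check the cache first; fall back to the database only on a miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._load_from_db = load_from_db
        self._cache: Dict[str, str] = {}  # hypothetical stand-in for a memcached tier
        self.db_hits = 0                  # counts how often the database was consulted

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]       # cache hit: the database is never touched
        self.db_hits += 1
        value = self._load_from_db(key)   # cache miss: read from the database
        if value is not None:
            self._cache[key] = value      # populate the cache for later readers
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after a write so the next read goes back to the database."""
        self._cache.pop(key, None)


# Hypothetical data: a dict plays the role of the database.
fake_db = {"user:1": "alice"}
cache = LookAsideCache(fake_db.get)
first = cache.get("user:1")   # miss, goes to the database
second = cache.get("user:1")  # hit, served from the cache
```

    If the cache tier empties out, every read falls through to the database at once, which is why the caching layer matters so much in a design like this.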

    The RPC system isn't HTML, JSON, or SOAP. It's a binary system that doesn't require text parsing. Otherwise, RPC would be the bottleneck.
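
    To make the binary-versus-text point concrete, here is a rough sketch using only the Python standard library. The record layout and field names are assumptions chosen for illustration; this is not the wire format of any real RPC system.

```python
import json
import struct

# Hypothetical fixed layout: 64-bit unsigned user id, then 32-bit unsigned count,
# in network byte order. The receiver reads fields at known offsets; no parsing.
RECORD = struct.Struct("!QI")


def encode_binary(user_id: int, friend_count: int) -> bytes:
    return RECORD.pack(user_id, friend_count)


def decode_binary(payload: bytes) -> tuple:
    return RECORD.unpack(payload)


def encode_json(user_id: int, friend_count: int) -> bytes:
    # Text encoding: the receiver has to tokenize and parse this string.
    return json.dumps({"user_id": user_id, "friend_count": friend_count}).encode("utf-8")


binary_msg = encode_binary(123456789, 42)
json_msg = encode_json(123456789, 42)
```

    The binary message is a fixed 12 bytes and decodes with a single unpack; the JSON message is longer and has to be parsed character by character.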

    This makes for a flexible, easy to enhance system. New services go in new machines, which talk to existing machines.

    1. Re:That's not Facebook's problem by rekoil · · Score: 3, Informative

      The RPC system they're using is Thrift (http://thrift.apache.org/), which they developed because JSON was becoming a bottleneck. And yeah, there's a metric crapload of memcached in their data centers as well. The multi-hour outage Facebook had late last year was due to a near-complete failure of the memcached layer, resulting in an overload of requests to the main MySQL farms.

  14. Total rewrite is always bad... mkay? by spectro · · Score: 2

    Over and over we hear about this "scrap and start over" concept. It sounds like a great idea but you are assuming you can do a better job than the guys before and more often than not you will be wrong.

    I used to suggest it but now I know better. I have seen new devs with little experience passionately suggest so called "total refactoring". It has never ended well.

    --
    HTML is obsolete. It's time for a new, simpler and richer markup language.
  15. Or you could.... by MAXOMENOS · · Score: 2

    The underlying problem according to Stonebraker:

    During an interview this week, Stonebraker explained to me that Facebook has split its MySQL database into 4,000 shards in order to handle the site’s massive data volume, and is running 9,000 instances of memcached in order to keep up with the number of transactions the database must serve.

    Or you could put MySQL on an IBM Power Systems LPAR and use a commercial MySQL plug-in to store the data in a DB2 database. Then you can get away with maybe a dozen database machines instead of thousands. I have to imagine, btw, that Oracle has a similar offering in the works.

    Lesson: academic credentials are no match for real world experience.
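
    For anyone wondering what "split into 4,000 shards" means in practice, here is a minimal sketch of hash-based shard routing. The shard count comes from the quote above; the hash-modulo scheme and the function name are assumptions for illustration, not a description of how Facebook actually routes queries.

```python
import hashlib

NUM_SHARDS = 4000  # figure from the quoted interview; the mapping below is assumed


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user id to a shard index with a stable hash.

    A stable hash (not Python's per-process randomized hash()) keeps the
    mapping identical across machines and restarts.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = shard_for(42)  # every query for user 42 goes to this one shard
```

    The catch with plain modulo routing is that changing the shard count remaps most keys, which is part of why a heavily sharded setup is painful to restructure later.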

  16. Re:I thought about going cheap for my startup by 16K+Ram+Pack · · Score: 2

    The 2 key words about a startup are "fail cheap". Spend as little as you can to deliver the business functionality that you need on day 1. Because if the customers don't come, you haven't lost much. If they do come, you'll have plenty of money to rewrite it, or be able to get the funding for it.

    Personally, I'm doing a startup in .net because I know .net, and can code faster in it. So, that's a smaller cost. The main downside is hosting charges, but even that has a tiny per-transaction cost difference.

  17. Re:And this opinion has nothing to do with the fac by dzfoo · · Score: 3, Informative

    And this opinion has nothing to do with the fact that this is the guy who wrote PostgreSQL and he has been bitching about how MySQL has too big a market share, for years??

    Not at all. But it does have something to do with the fact that he is plugging his new product, which implements something he calls "NewSQL."

          dZ.

    --
    Carol vs. Ghost
    ...Can you save Christmas?
  18. Oracle vs Facebook? by AliasMarlowe · · Score: 2

    It may not be a hardware problem, it may be a problem that actually has more to do with the fact that Oracle owns MySQL.

    It's not unreasonable to suppose Oracle might "nudge" Facebook into the deeper end of Oracle's trough of slimy swill. But who to root for? This is a bit of a conundrum. Seeing Facebook's delicate bits getting squeezed is not an unattractive proposition, but seeing Oracle benefit therefrom would be appalling.

    --
    Those who can make you believe absurdities can make you commit atrocities. - Voltaire
    1. Re:Oracle vs Facebook? by h4rr4r · · Score: 3, Informative

      Use Postgres.
      It costs the same as MySQL ($0) and is 100 times the DB.
      It offers far better data integrity, it supports transactions out of the box, it will handle DBs in the TB range, and is about as standards compliant as DBs get.

      The company I work for uses it for our service that we sell.

    2. Re:Oracle vs Facebook? by davester666 · · Score: 3, Funny

      Maybe Facebook should just put all our data in the cloud...it's not like security or privacy is a big concern for Facebook...

      --
      Sleep your way to a whiter smile...date a dentist!
    3. Re:Oracle vs Facebook? by marcosdumay · · Score: 2

      I'll second h4rr4r. Use Postgres. Until you are very big, a DBMS will only make any difference if you choose one of those that trade reliability and compatibility for speed (like MySQL). Your best strategy at the beginning is to start with a cheap one, and there is none cheaper than 0.

      When you get big enough that the DBMS will make any difference, you'll discover it is cheaper to add hardware than going with a proprietary one anyway. The best strategy here is to go with a cheap one, and there is none cheaper than 0.

      Once you are so big that adding hardware won't help, you'll discover that your only chance is to customize your data layer. No out-of-the-box DBMS will be useful here, but if you are on Postgres you'll discover that it is easy to customize, and there are some companies that will even do that for you. The best strategy here is to go with a free (speech) DBMS, and the worst one is to go with an incompatible proprietary one (like Oracle).

      So, going with Postgres you'll save some bucks in the short term at the cost of saving a few more bucks in the middle and long term.

    4. Re:Oracle vs Facebook? by Skal+Tura · · Score: 2

      If you are worth your weight as a developer, you've already built a model isolation layer where all your queries live, so it's not that hard to rewrite the queries. If this was to be expected, you've made it far simpler already.
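
      A minimal sketch of what such an isolation layer can look like, assuming a tiny users table. All names here are hypothetical, and sqlite3 is used only because it ships with Python; the point is that application code never sees SQL.

```python
import sqlite3
from typing import Optional, Protocol


class UserStore(Protocol):
    """The only interface application code is allowed to depend on."""

    def add_user(self, name: str) -> int: ...
    def get_name(self, user_id: int) -> Optional[str]: ...


class SqliteUserStore:
    """One backend. Every SQL statement for users lives in this class."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add_user(self, name: str) -> int:
        cur = self._conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        self._conn.commit()
        return cur.lastrowid

    def get_name(self, user_id: int) -> Optional[str]:
        row = self._conn.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


def greet(store: UserStore, user_id: int) -> str:
    # Application logic: no SQL here, so swapping the backend leaves it untouched.
    name = store.get_name(user_id)
    return f"Hello, {name}!" if name else "Hello, stranger!"


store = SqliteUserStore()
uid = store.add_user("alice")
```

      Moving to another database then means writing one more class with the same two methods. That covers query syntax only; it does nothing about differences in performance characteristics or cluster configuration.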

      In any case, i don't see the anti-MySQL points. I've tried Postgres once - that was enough, i'm not going back to it. It was weird as shit, required some weird contortions for permissions and DB changes, didn't seem to be properly isolating but more like hacked together to support more than 1 DB, with 1 set of perms per server - no, i don't mean it didn't support that, that's just the way it felt.

      If you got 2 choices, otherwise equal for what you need now - always choose the simpler one. Postgres definitely is not the simpler one.

  19. Stonebraker trapped in Stonebraker by midom · · Score: 5, Informative

    (reposting as a logged in user) I wrote a bit longer response to this:
    stonebraker trapped in stonebraker 'fate worse than death'

    I think I know a bit more about the database situation inside FB than Mr. Stonebraker. Go figure.

  20. Success sometimes makes fools of us and our plans by handy_vandal · · Score: 2

    It's not a poor decision up front that got them here; it's impossible-to-predict growth. Success sometimes makes fools of us and our plans ....

    Very true: mod parent +Insightful.

    We see the same principle when some individual acquires Sudden Wealth, as for example by winning the lottery. Sudden Wealth -- it's every man's dream, right?

    On closer inspection, Sudden Wealth is not a miracle cure for unhappiness or any other problem. Quite the contrary: Sudden Wealth brings new problems, new diseases of the soul.

    Example: there is, I'm told, a self-help group (somewhere in America) whose members are Sudden Wealth lottery winners, who meet to share and discuss the problems brought on by Sudden Wealth, ranging from vague and inexplicable dissatisfaction, through family crises and grasping relatives and bitter divorces, all the way to abject misery and blatant death wish.

    So too with corporations and other collective enterprises. Growth without preparedness can elevate a Mom 'n' Pop storefront operation to the skyscraper heights of corporate power ... but I would keep a watchful eye for embittered alcoholics and starry-eyed madmen among the board members and executives.

    --
    -kgj
  21. Maybe... by sycodon · · Score: 2

    Maybe this guy's problem is that Facebook HAS created such a large and successful business without paying Oracle millions of dollars or his company millions of dollars.

    Kind of sounds like that commercial for Scottrade where the Fat Cat broker is trying to keep his clients so he gets his fat commissions.

    --
    When Fascism comes to America, it will call itself Anti-Fascism, and tell you to give up your guns.
    1. Re:Maybe... by shutdown+-p+now · · Score: 3, Insightful

      Um, RTFA? It's not a pitch for Oracle. In fact, it's a rant against SQL in general. Quote:

      In Stonebraker’s opinion, “old SQL (as he calls it) is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed.

      Sounds like the usual NoSQL FUD, right? But wait, there's more here:

      Stonebraker thinks sacrificing ACID is a “terrible idea,” and, he noted, NoSQL databases end up only being marginally faster because they require writing certain consistency and other functions into the application’s business logic.

      Right... so what then? More magic buzzwords to the rescue!

      But Stonebraker — an entrepreneur as much as a computer scientist — has an answer for the shortcoming of both “old SQL” and NoSQL. It’s called NewSQL or scalable SQL ... Pushed by companies such as Xeround, Clustrix, NimbusDB, GenieDB and Stonebraker’s own VoltDB, NewSQL products maintain ACID properties while eliminating most of the other functions that slow legacy SQL performance. VoltDB, an online-transaction processing (OLTP) database, utilizes a number of methods to improve speed, including by running entirely in-memory instead of on disk.

      Now the article is pretty light on details regarding what that "new" SQL actually is, and Googling around doesn't really help. So far, to be honest, it sounds more like a bunch of DB makers have ganged together and come up with a nifty word to market their products against Oracle, DB2, MSSQL, Postgres etc - if it's "new" it must be good, right?

  22. Reply from a former FB engineer by prostoalex · · Score: 3, Interesting
  23. And MySQL hates babies and kittens! by DragonHawk · · Score: 3, Insightful

    Geez, GooberToo, did a MySQL developer kill your father or something? You've posted two giant rants about how MySQL is so unsuitable for anything that it can't possibly work for any serious project. You make it sound like simply installing MySQL causes a server to immediately explode.

    You *are* aware that Facebook, Slashdot, Wikipedia, and many other sites use MySQL, yes? Maybe there are better choices (more likely, there are different tradeoffs, but whatever), but MySQL works well enough to power some of the most popular websites in the world. Proof by existence that what you claim is inaccurate.

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  24. Re:Oh dear by tqk · · Score: 2

    Jealous much, faggot cunt?

    Do you have no idea how silly that makes you look? Think anatomically, assuming you're at least capable of some thought. Twit.

    Trolls: The Town Drunks of the Internet

    --
    "Tongue tied and twisted, just an Earth bound misfit ..." -- Pink Floyd.
  25. Re:And this opinion has nothing to do with the fac by Rufty · · Score: 2

    MySQL has been faster than PostgreSQL for years, it doesn't have as many features, but it is **fast** !!

    /dev/null is even faster, but I wouldn't use that for data storage, either.

    --
    Red to red, black to black. Switch it on, but stand well back.