Slashdot Mirror


Facebook Trapped In MySQL a 'Fate Worse Than Death'

wasimkadak writes with this excerpt from GigaOM: "According to database pioneer Michael Stonebraker, Facebook is operating a huge, complex MySQL implementation equivalent to 'a fate worse than death,' and the only way out is 'bite the bullet and rewrite everything.' Not that it's necessarily Facebook's fault, though. Stonebraker says the social network's predicament is all too common among web startups that start small and grow to epic proportions."

24 of 509 comments

  1. Still their fault by nurb432 · · Score: 3, Informative

    Once they started the trend to grow beyond being a toy, they should have redone things right then.

    Waiting until you are painted into a corner is irresponsible.

    --
    ---- Booth was a patriot ----
  2. "We're so new" by michaelmalak · · Score: 5, Insightful

    I love the snippets "After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed" from the article and "We’ve been using stone age technology to solve problems that didn’t exist 30 years ago." Yes, the problems existed 30 years ago, such as (land-line) telephone billing. I don't know how those problems were solved -- probably with a mainframe and a custom non-SQL database and not a PC running a SQL-based server -- but they were solved.

  3. Successful Troll is Successful by tyler_larson · · Score: 5, Insightful

    Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.

    Presumably he's tired of Facebook being used as a counter-example to everything he's been preaching.

    --
    "With sufficient thrust, pigs fly just fine. However, this is not necessarily a good idea...."
    RFC 1925
    1. Re:Successful Troll is Successful by fermion · · Score: 4, Interesting

      And note that two stories down it is reported that SAP is once again over budget and over schedule on a major implementation. So I suppose that now everyone will stop using SAP as it is unreliable.

      --
      "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
    2. Re:Successful Troll is Successful by rgmoore · · Score: 5, Insightful

      No, he's not an academic purist; he's a businessman who's selling a product that competes with MySQL. So he's trying to convince web startups to pay a bunch of money for his product rather than rely on free MySQL because he claims it will help them scale better than Facebook. IOW, businessman trashes competitor's product, claims you should buy from him instead. Nothing to see here.

      --

      There's no point in questioning authority if you aren't going to listen to the answers.

    3. Re:Successful Troll is Successful by sco08y · · Score: 4, Interesting

      Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.

      Presumably he's tired of Facebook being used as a counter-example to everything he's been preaching.

      He's no academic purist. He's pushing his product, and he's either an outright liar or, worse, doesn't know what he's talking about:

      Stonebraker said the problem with MySQL and other SQL databases is that they consume too many resources for overhead tasks (e.g., maintaining ACID compliance and handling multithreading)

      Is that so? MySQL, as with virtually all SQL DBMSs, defaults to "repeatable read" transactional guarantees, and it doesn't even spend time guaranteeing foreign key relationships by default. About the only thing MySQL really guarantees out of the box is durability.

      It's just nonsense to talk about all the "wasted resources" when, if they don't need them, it's a few lines in a config file to turn them off.

  4. Re:Commercial databases by jjohnson · · Score: 4, Insightful

    A minor difference that exists in 4,000 instances and who knows how many places in the code that's also distributed across multiple servers, isn't minor, especially when there are hundreds or even thousands of minor differences.

    And no, the differences in SQL between Oracle and MySQL aren't minor. It's not just syntax, and it's not MySQL-can-Oracle-can't. It's the performance characteristics of various queries, the logic of how they're implemented, and the incredible investment in configuring a large cluster to work smoothly (which MySQL and Oracle do extremely differently). Large-scale systems add a layer of complexity all their own that's a totally separate engineering challenge.

    Short version: Switching from MySQL to anything else would be the equivalent of a ground-up rewrite, though this is largely true of any database system. MySQL hasn't somehow uniquely trapped them here.

    --
    Anyone who loves or hates any language, platform, or manufacturer, doesn't know what they're talking about.
  5. Keep on backpedalling, you silly NoSQLers. by Anonymous Coward · · Score: 5, Interesting

    It's hilarious how the NoSQL fools are constantly backpedalling these days.

    It turns out that writing database queries in JavaScript is a stupid idea! Imagine that! All of their attempts to invent a better query language end up being almost identical to, guess what, SQL!

    Then they realize that trying to maintain data consistency using logic written in JavaScript, Ruby or PHP doesn't work so well. Values go unconstrained, and the referential integrity gets fucked up. Soon the data is nearly worthless.

    The smarter/less-ignorant ones then think that they'll just use transactions. But wait, their NoSQL database of choice doesn't support that, or doesn't support it properly. So they tell themselves that their data will become "eventually consistent", or worse, they try to implement some shitty ass "transaction" support using Ruby. Regardless of the path chosen, failure is the result.

    Now they're realizing that it's mandatory to use a real relational database when working on anything remotely serious. So we see this bullshit about "no" now meaning "not only". That's funny, last month it meant "no", as in, "we will never write a SQL query again, and we will never use a relational database again."

    I'm going to make a prediction: Next month, we'll get to read articles and comments from them about these amazing new database systems that they've just discovered. These new systems avoid all of the problems associated with NoSQL databases! What are their names? Oracle, DB/2, SQL Server, PostgreSQL and SQLite.
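
    The point about constraints and transactions can be made concrete with a minimal sketch. It uses Python's built-in sqlite3 module only because SQLite is on the list above and needs no server; the users/posts schema and the try_write helper are hypothetical, made up for illustration. The database itself rejects an orphaned row and an out-of-range value, and a failed transaction leaves nothing half-written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        score INTEGER NOT NULL CHECK (score >= 0)
    );
""")


def try_write(sql, params):
    """Run one write in its own transaction; return True if the database accepted it."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


ok_user = try_write("INSERT INTO users (id, name) VALUES (?, ?)", (1, "alice"))
ok_post = try_write("INSERT INTO posts (user_id, score) VALUES (?, ?)", (1, 5))
# No user 99 exists, so the foreign key rejects this row.
orphan = try_write("INSERT INTO posts (user_id, score) VALUES (?, ?)", (99, 5))
# The CHECK constraint rejects a negative score.
negative = try_write("INSERT INTO posts (user_id, score) VALUES (?, ?)", (1, -1))

# Two writes in one transaction: the second fails, so the first is rolled back too.
try:
    with conn:
        conn.execute("INSERT INTO posts (user_id, score) VALUES (1, 7)")
        conn.execute("INSERT INTO posts (user_id, score) VALUES (99, 7)")
except sqlite3.IntegrityError:
    pass

post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

    Only the one valid post survives. Doing the same checks in application code means every writer has to repeat them correctly, which is the failure mode the comment describes.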

  6. Re:Commercial databases by NeoMorphy · · Score: 4, Insightful

    If you want to start a fun/interesting project that you don't expect any revenue from, it makes more sense to use free software. MySQL is a popular choice for web applications, and there is a lot of freely available documentation and examples. Many people have been successful doing it, so it's a proven path that works.

    Oracle is expensive. It would have cost a fortune to start Facebook with Oracle, and I can't imagine what it would cost them now. But even if they have to hire a ton of experts to convert to Oracle (assuming that is the best thing to do...), they can probably fund it with the money saved by not using Oracle over the past couple of years.

    Maybe Oracle would have been a mistake. There are companies migrating from Oracle to DB2, DB2 to Oracle, Oracle to Sybase, Sybase to MySQL, mainframe to AIX, AIX to Solaris, Solaris to Linux, etc. It seems like nobody can agree on the best hardware/OS/database solution, but there are plenty of people who swear that the solution they know is the best one.

  7. looks like facebook is doing just fine... by tommeke100 · · Score: 5, Insightful

    If anything, it's a success story for MySQL.

  8. Re:Commercial databases by Billly+Gates · · Score: 3, Interesting

    I went for a job interview a few years ago for a position that was very SQL-intensive. I looked at some SQL code in C# for both ODBC and direct SQL Server access, and it was the most complex thing I have ever seen and, frankly, hair-pullingly ugly. It was no simple INSERT INTO table, as with MySQL and PHP.

    Rather, it was weird ASYNC VSYNC Data.adaptor,x and weird esoteric code, 35 to 40 lines for each insert, doing God knows what! Maybe a SQL programmer can explain what a Vsync was, what a data,adaptor is, and why that code was so evil? The question was how to fix it. I realized I was obviously not qualified for the job.

    I googled the code and it seemed it was operating optimally with all that stuff. Sure, you can use a simple insert statement, but that is frowned upon as not optimized by SQL programmers. Most of them use very complicated steps and layers of abstraction where KISS is frowned upon, because if the database doesn't perform well you do not want to look like an *ss.

    PHP is really nice for its simplicity, but as soon as you move to Java or C# it gets very ugly, and each database requires different code and optimization techniques and a rewrite if you change your schema. Again, I am not a SQL programmer so hopefully I won't get bashed too much here by the real ones, but it is just what I observe as I want to learn this myself. I have a feeling this is why Hibernate is becoming popular to avoid these things, as it is a framework that handles some of the nasty details in a separate layer ... at least from what I read.

    But with Java or .NET you can get caching, transaction control, and other neat things you can't get with PHP, but it comes at a cost. The same is true with a real database like PostgreSQL, SQL Server, DB2, or Oracle. SQL statements are a small part of the code, and the rest is proprietary code for working with the RDBMS. My hunch is the vendors love this, as it encourages lock-in with expensive licensing fees.

  9. Re:Commercial databases by laffer1 · · Score: 4, Informative

    This isn't true. I just migrated an application from MySQL 4.1 to PostgreSQL 9.0 at work. It took me about two weeks, and it was certainly not a complete rewrite from scratch. How much work it takes varies greatly with the application, the language it's written in, the frameworks in use, and the number of product-specific features in use. This was a Perl / Mason app.

    If an application was making extensive use of stored procedures, then it would require a lot of effort to rewrite those, but not the whole application. If the application were written in C, it would be a lot of work to change. I think Facebook uses PHP and that's not too hard to change out, especially if they were sane and used an abstraction layer like PDO.

    If the app were written in Java or .NET and using an ORM, it would be TRIVIAL to change to another database.

    In my experience, the biggest problems were date functions and the fact that MySQL embeds index creation in the create table syntax, whereas Postgres requires it to be separate and the names of indexes are global. This meant that I had my work cut out for me changing index names. There were also a few quirks with some join queries, as MySQL is not picky about ordering in the from clause.
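
    The index part of that migration can be sketched roughly as follows. This is not the script used in the migration described above; split_inline_indexes and the orders table are hypothetical, and the sketch assumes mysqldump-style DDL with one definition per line. It pulls KEY / UNIQUE KEY lines out of a MySQL CREATE TABLE and emits separate CREATE INDEX statements, prefixing each index name with the table name, since PostgreSQL needs index names to be unique within a schema rather than within a table.

```python
import re

# Matches lines such as:  KEY `idx_customer` (`customer_id`),
INDEX_LINE = re.compile(
    r"^\s*(?P<unique>UNIQUE\s+)?(?:KEY|INDEX)\s+`?(?P<name>\w+)`?"
    r"\s*\((?P<cols>[^)]*)\)\s*,?\s*$",
    re.IGNORECASE,
)
TABLE_NAME = re.compile(r"CREATE\s+TABLE\s+`?(\w+)`?", re.IGNORECASE)


def split_inline_indexes(mysql_ddl):
    """Return (table DDL without inline indexes, list of CREATE INDEX statements)."""
    table = TABLE_NAME.search(mysql_ddl).group(1)
    kept, indexes = [], []
    for line in mysql_ddl.splitlines():
        match = INDEX_LINE.match(line)
        if match is None:
            kept.append(line.replace("`", ""))  # drop MySQL backtick quoting
            continue
        unique = "UNIQUE " if match.group("unique") else ""
        cols = match.group("cols").replace("`", "")
        # Prefix with the table name so two tables can both have an "idx_customer".
        name = f"{table}_{match.group('name')}"
        indexes.append(f"CREATE {unique}INDEX {name} ON {table} ({cols});")
    # Removing index lines can leave a dangling comma before the closing paren.
    table_sql = re.sub(r",\s*\)", "\n)", "\n".join(kept))
    return table_sql, indexes


MYSQL_DDL = """CREATE TABLE `orders` (
  `id` INTEGER NOT NULL,
  `customer_id` INTEGER NOT NULL,
  `created` TIMESTAMP NOT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_customer` (`customer_id`),
  UNIQUE KEY `idx_created` (`customer_id`, `created`)
);"""

table_sql, index_sql = split_inline_indexes(MYSQL_DDL)
```

    A real schema would also need type and date-function changes, which this sketch does not attempt.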

    You are correct that they'll have to tune queries and things, but it's not a total rewrite if they wrote their app in a reasonable way.

    For the record, PostgreSQL 9 is faster for many of our queries but seems slower doing INSERT. YMMV.

  10. Michael Stonebraker & VoltDB by solferino · · Score: 4, Informative

    The guy in the article does have some cred. He was a professor at UC Berkeley for 29 years, where he was project leader on Ingres and led the creation of its follow-up, Postgres.

    His new database, VoltDB, based on the 'NewSQL' ideas touched on in the article, is Free Software licensed under the GPLv3.

  11. Re:Delusional editorialism! by kimvette · · Score: 3, Insightful

    So hurray for MySQL. They saved 45 minutes during their installation on day one and now they'll spend a year or two plus millions of dollars to move away from their extremely dumb and uneducated decision. That's got to be one of the most expensive 45 minutes on earth - and yet it's one of the single biggest decisions which MySQL users defend on a daily basis.

    IMHO it was a bargain - MySQL has worked up until now, it is still working, so as far as I am concerned that's a big success story for such a low-end free/free database - and it was a choice they made based on what they already had skills in and it enabled them to earn billions, so it was a very smart, inexpensive way for them to get started. Now for Facebook, spending a few million to get on to big iron is cheap money, whereas back in the day spending a week or two to really learn the ins and outs of Postgres or spending thousands on Oracle could have prevented them from surviving in the first place.

    --
    The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
  12. That's not Facebook's problem by Animats · · Score: 5, Informative

    Academic purist discovers that one of the most prolific and successful database users in the world is using a system he doesn't approve of. He decides, with no insider knowledge at all, and despite all evidence to the contrary, that they should throw everything away and start over from scratch using a system that he thinks would allow them to see the performance and scalability that they've already achieved.

    Right.

    Some of the key architects of Facebook have spoken at Stanford about how the system is put together, and I went to that presentation and had a chance to talk to them. They didn't consider MySQL to be a bottleneck. Their big problem was PHP performance. They were writing a PHP compiler to fix that.

    Internally, the user-facing side of Facebook is in PHP. But the front end machines don't talk directly to the databases. They use an RPC system to talk to other machines that do the "business logic" parts of the system. Building a Facebook reply page may involve a hundred machines. There's heavy caching all over the system, of course, so the databases aren't hit for most read requests.

    The RPC system isn't HTML, JSON, or SOAP. It's a binary system that doesn't require text parsing. Otherwise, RPC would be the bottleneck.
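
    To illustrate the difference between a fixed binary layout and a text format, here is a small sketch using Python's struct and json modules. The message layout (method id, user id, item limit) is hypothetical and is not the wire format of Facebook's RPC system; it only shows that a binary message can be read back at known offsets, while the JSON version has to be scanned and parsed as text.

```python
import json
import struct

# Hypothetical request layout, big-endian with no padding:
# method id (uint16), user id (uint64), item limit (uint32) = 14 bytes.
REQUEST = struct.Struct(">HQI")


def encode_binary(method_id, user_id, limit):
    return REQUEST.pack(method_id, user_id, limit)


def decode_binary(payload):
    # Fields are read at fixed offsets; no tokenizing or number parsing.
    return REQUEST.unpack(payload)


def encode_json(method_id, user_id, limit):
    message = {"method": method_id, "user": user_id, "limit": limit}
    return json.dumps(message).encode("utf-8")


def decode_json(payload):
    # The whole payload is parsed as text before any field is usable.
    message = json.loads(payload.decode("utf-8"))
    return message["method"], message["user"], message["limit"]


binary_msg = encode_binary(7, 123456789012, 50)
json_msg = encode_json(7, 123456789012, 50)
binary_fields = decode_binary(binary_msg)
json_fields = decode_json(json_msg)
```

    Both round-trip to the same three values; the binary form is smaller and needs no parser, at the cost of both sides having to agree on the layout ahead of time.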

    This makes for a flexible, easy to enhance system. New services go in new machines, which talk to existing machines.

    1. Re:That's not Facebook's problem by rekoil · · Score: 3, Informative

      The RPC system they're using is Thrift (http://thrift.apache.org/), which they developed because JSON was becoming a bottleneck. And yeah, there's a metric crapload of memcached in their data centers as well. The multi-hour outage Facebook had late last year was due to a near-complete failure of the memcached layer, resulting in an overload of requests to the main MySQL farms.
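
      Why a cache-layer failure turns into a database overload can be shown with a toy cache-aside sketch. The class names are hypothetical, a plain dict stands in for memcached, and a read counter stands in for the MySQL tier; this is not how Facebook's caching code is written.

```python
class CountingDatabase:
    """Stand-in for the database tier; counts how many reads reach it."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)


class CacheAside:
    """Check the cache first; fall back to the database and fill the cache."""

    def __init__(self, db):
        self.db = db
        self.cache = {}        # stand-in for memcached
        self.available = True  # set to False to simulate a cache outage

    def get(self, key):
        if self.available and key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if self.available and value is not None:
            self.cache[key] = value
        return value


db = CountingDatabase({"user:1": "alice"})
store = CacheAside(db)

# Cache healthy: 1000 reads of the same key cost one database read.
for _ in range(1000):
    store.get("user:1")
warm_reads = db.reads

# Cache down: every one of the next 1000 reads lands on the database.
store.available = False
for _ in range(1000):
    store.get("user:1")
outage_reads = db.reads - warm_reads
```

      In this toy run the database goes from one read per thousand requests to a thousand, which is the shape of the overload described above.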

  13. Re:And this opinion has nothing to do with the fac by dzfoo · · Score: 3, Informative

    And this opinion has nothing to do with the fact that this is the guy who wrote PostgreSQL and he has been bitching about how MySQL has too big a market share, for years??

    Not at all. But it does have something to do with the fact that he is plugging his new product, which implements something he calls "NewSQL."

          dZ.

    --
    Carol vs. Ghost
    ...Can you save Christmas?
  14. Re:Commercial databases by qwijibo · · Score: 3, Interesting

    Ability to convert depends completely on the application. If the MySQL app is written using simple or at least standard SQL, it will be easy to migrate. However, MySQL has some very problematic areas (e.g.: select foo from table1 where id in (select id from table2 where criteria='something')) that make people do some very nonstandard, MySQL-only fixes to address performance. The query shown, with 5000 rows in table1, 50000 in table2, and only 50 rows in table2 meeting the criteria, took ~10ms on PostgreSQL 8.3 and 52 minutes on MySQL 5.1 on the same hardware. The only way I could find to get the ~10ms performance on MySQL was so goofy that MySQL itself refused to allow me to create a view from that select statement.
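
    For reference, one commonly cited rewrite of that kind of query replaces the IN subquery with a join against a derived table. The comment does not say which rewrite was actually used, so treat this as an assumption. The sketch below runs on Python's built-in sqlite3 with the same row counts, only to check that both forms return the same rows; it says nothing about MySQL or PostgreSQL timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, foo TEXT);
    CREATE TABLE table2 (id INTEGER, criteria TEXT);
""")

# 5000 rows in table1; 50000 in table2, of which only the first 50 match.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(i, f"foo{i}") for i in range(5000)],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(i % 5000, "something" if i < 50 else "other") for i in range(50000)],
)

# The original form: a subquery inside IN.
IN_QUERY = """
    SELECT foo FROM table1
    WHERE id IN (SELECT id FROM table2 WHERE criteria = 'something')
    ORDER BY id
"""

# The rewrite: join against a derived table. DISTINCT keeps the result
# identical to IN even if table2 repeats an id.
JOIN_QUERY = """
    SELECT t1.foo FROM table1 AS t1
    JOIN (SELECT DISTINCT id FROM table2 WHERE criteria = 'something') AS t2
      ON t2.id = t1.id
    ORDER BY t1.id
"""

in_rows = conn.execute(IN_QUERY).fetchall()
join_rows = conn.execute(JOIN_QUERY).fetchall()
```

    Both queries return the same 50 rows here. Whether the join form is faster depends entirely on the database and version, so it is worth measuring on the real system before rewriting anything.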

    Converting from PostgreSQL to Oracle has always seemed much easier and smoother, but PostgreSQL isn't as popular as MySQL because it hasn't been as easy to throw hardware at problems with scaling PostgreSQL, whereas MySQL has always made that option easier.

    Each database has its own pros and cons, but most times you don't discover how hard it is to migrate until it's too late.

  15. Stonebraker trapped in Stonebraker by midom · · Score: 5, Informative

    (reposting as a logged in user) I wrote a bit longer response to this:
    stonebraker trapped in stonebraker 'fate worse than death'

    I think I know a bit more about the database situation inside FB than Mr. Stonebraker. Go figure.

  16. Re:Oracle vs Facebook? by h4rr4r · · Score: 3, Informative

    Use Postgres.
    It costs the same as MySQL ($0) and is 100 times the DB.
    It offers far better data integrity, it supports transactions out of the box, it will handle DBs in the TB range, and is about as standards compliant as DBs get.

    The company I work for uses it for our service that we sell.

  17. Re:Oracle vs Facebook? by davester666 · · Score: 3, Funny

    Maybe Facebook should just put all our data in the cloud...it's not like security or privacy is a big concern for Facebook...

    --
    Sleep your way to a whiter smile...date a dentist!
  18. Reply from a former FB engineer by prostoalex · · Score: 3, Interesting
  19. And MySQL hates babies and kittens! by DragonHawk · · Score: 3, Insightful

    Geez, GooberToo, did a MySQL developer kill your father or something? You've posted two giant rants about how MySQL is so unsuitable for anything that it can't possibly work for any serious project. You make it sound like simply installing MySQL causes a server to immediately explode.

    You *are* aware that Facebook, Slashdot, Wikipedia, and many other sites use MySQL, yes? Maybe there are better choices (more likely, there are different tradeoffs, but whatever), but MySQL works well enough to power some of the most popular websites in the world. Proof by existence that what you claim is inaccurate.

    --

    dragonhawk@iname.microsoft.com
    I do not like Microsoft. Remove them from my email address.
  20. Re:Maybe... by shutdown+-p+now · · Score: 3, Insightful

    Um, RTFA? It's not a pitch for Oracle. In fact, it's a rant against SQL in general. Quote:

    In Stonebraker’s opinion, “old SQL (as he calls it) is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed.

    Sounds like the usual NoSQL FUD, right? But wait, there's more here:

    Stonebraker thinks sacrificing ACID is a “terrible idea,” and, he noted, NoSQL databases end up only being marginally faster because they require writing certain consistency and other functions into the application’s business logic.

    Right... so what then? More magic buzzwords to the rescue!

    But Stonebraker — an entrepreneur as much as a computer scientist — has an answer for the shortcoming of both “old SQL” and NoSQL. It’s called NewSQL or scalable SQL ... Pushed by companies such as Xeround, Clustrix, NimbusDB, GenieDB and Stonebraker’s own VoltDB, NewSQL products maintain ACID properties while eliminating most of the other functions that slow legacy SQL performance. VoltDB, an online-transaction processing (OLTP) database, utilizes a number of methods to improve speed, including by running entirely in-memory instead of on disk.

    Now the article is pretty light on details regarding what that "new" SQL is, and Googling around doesn't really help. So far, to be honest, it sounds more like a bunch of DB makers have ganged together and come up with a nifty word to market their products against Oracle, DB2, MSSQL, Postgres etc - if it's "new" it must be good, right?