Slashdot Mirror


Slashdot Posting Bug Infuriates Haggard Admins

Last night we crossed over 16,777,216 comments in the database. The wise amongst you might note that this number is 2^24, or in MySQLese an unsigned mediumint. Unfortunately, like 5 years ago we changed our primary keys in the comment table to unsigned int (32 bits, or 4.1 billion) but neglected to change the index that handles parents. We're awesome! Fixing is a simple ALTER TABLE statement... but on a table that is 16 million rows long, our system will take 3+ hours to do it, during which time there can be no posting. So today, we're disabling threading and will enable it again later tonight. Sorry for the inconvenience. We shall flog ourselves appropriately. Update: 11/10 12:52 GMT by J : It's fixed.

13 of 262 comments (clear)

  1. Point 2.4 of Resign Patterns: Detonator by Smoking · · Score: 2, Informative

    Taken from http://franksworld.com/blog/archive/2005/01/04/600 .aspx

    Chapter 2: Destructional Patterns

    2.4 Detonator

    The Detonator is extremely common, but often undetected. A common
    example is the calculations based on a 2 digit year field. This bomb
    is out there, and waiting to explode!

  2. Comment 16,777,216 does not exist by jamie · · Score: 5, Informative

    Some of you are asking which comment it was that got the cid 16,777,216. The answer is that none did. For redundancy, Slashdot is now running multiple-master replication which skips values for auto-increment. Our db-1 assigns odd-numbered primary key IDs, and db-2 assigns even-numbered. Right now writes are going to db-1 so newly created rows will have only odd IDs.

    The comment that got 2**24-1 was this one, if anyone cares :)

    Sorry about the inconvenience, everyone.

  3. Sounds Familiar by old_skul · · Score: 5, Informative

    I used to work at Comair. Remember, that airline that stranded about 10,000 people in the airport a couple of Christmases ago? Same deal. Program was capable of handling only a certain number of changes. Hopefully your president won't have to resign.

  4. EldavoJohn cid=16786251 reply by DeadCatX2 · · Score: 4, Informative

    Actually, comment 16,777,217 couldn't break it, because that comment's parent cid could have only been 16,777,215. Up until then, there wouldn't have been an overflow value put into the db.

    There's no telling which comment it is, because (16,777,217 + 2n) might not have been a reply, meaning it would come up correctly.

    --
    :(){ :|:& };:
  5. Reply to 16786251 by RAMMS+EIN · · Score: 4, Informative

    Reply to comment number 16786251:

    ``You claim that the 16,777,216th comment would have broke it but I contest that actually the 16,777,217th comment poster would be the culprit. Since it should be able to handle that many comments if it is zero referenced'' ...but it's probably not zero-referenced. Typically, ids in SQL start at 1.

    --
    Please correct me if I got my facts wrong.
  6. This was fortold a few months ago... by Anonymous Coward · · Score: 4, Informative
  7. Re: Why are all 16 million+ comments in a single t by jamie · · Score: 5, Informative

    poot_rootbeer asks why all the comments are in one table, when the data access pattern is such that 90% of our hits are on only the most recent entries in that table.

    The answer is that we used to do it this way but it's a huge pain. In 2000 we converted from having two tables for 'stories', recent and archived, and merged them together. The performance hit was not big, and it made the code so much simpler it was a no-brainer.

    It's the database's job to cache properly whether we split the table or not, and the database does that just fine. The only performance problem could be when there is a rush of inserts, or updates to the same sets of rows, spanning both newer and older portions of the table, and that just doesn't happen.

    If we did want to do this we wouldn't split the tables manually; the code complexity is too high a price to pay. In MySQL 5.0 we would use a MERGE engine, which has issues of its own but would involve smaller changes to our code. That's still not worth it for us. What we're probably going to do is wait for MySQL 5.1 to get out of beta and then do some performance testing on tables partitioned by date and see if that gains us anything. For example, a SELECT on our comments table could be limited with a WHERE clause to only retrieve rows with a date >= the discussion object's date, which for 90% of our queries MySQL 5.1 could optimize to only look at the most recent partition. If the gains turn out to be significant, then since partitioning involves very limited code changes, we'll probably do that. Generally speaking, though, database performance is not a problem for us. So far our main bottlenecks have been CPU and RAM on the webheads. As long as we don't do anything stupid our database performance has been fine, though, as today proves, we are quite capable of being stupid.

    [ Parent ]

  8. Who needs data integrity? by _xeno_ · · Score: 2, Informative

    Ah, MySQL. Where trying to insert a row with a column value larger than the column can actually store results in MySQL clipping it to the max value.

    mysql> create table test (field mediumint unsigned);
    Query OK, 0 rows affected (0.05 sec)

    mysql> insert into test values (20000000000);
    Query OK, 1 row affected, 1 warning (0.02 sec)

    mysql> select * from test;
    +----------+
    | field |
    +----------+
    | 16777215 |
    +----------+
    1 row in set (0.00 sec)

    Fortunately, as of MySQL 5, you can fix this problem.

    mysql> set session sql_mode='TRADITIONAL';
    Query OK, 0 rows affected (0.00 sec)

    mysql> insert into test values (20000000000);
    ERROR 1264 (22003): Out of range value adjusted for column 'field' at row 1

    So, yes, early versions of MySQL had a brain-dead default SQL mode that simply "corrected" invalid column values, but MySQL 5 fixes this.

    Now if only they would add column constraints...

    --
    You are in a maze of twisty little relative jumps, all alike.
  9. So how do I get to post #1? by filesiteguy · · Score: 3, Informative

    I'd really like to see it. I bet it goes something like, "what's this stupid web thingy anyway? I bet it'll never make it to version 2.0..."

  10. Re:Old discussions by DeadCatX2 · · Score: 2, Informative

    Sorry, I think index was a bad choice of words.

    Whenever a post is made and it has a parent cid, that number must be stored in the table.

    If MySQL saturates instead of rolling over (see this comment), then all replies after comment 16,777,215 will have the wrong parent cid, and I don't think there's any way to fix it.

    --
    :(){ :|:& };:
  11. Use MySQL and eventually it'll bite your arse by LizardKing · · Score: 3, Informative

    I guess this is another thing to add to the MySQL gotchas page. Of course, in a decent database engine, like PostgreSQL, if you alter a column data type then the indexes are updated to reflect this.

  12. Re:Congrats taco by The_Wilschon · · Score: 2, Informative
    Does this mean that comment id#16777215 has the longest thread in history?
    Not likely. The parent pointer doesn't just max out (I would think, but I don't know how SQL handles things like this), but rather roll over. So many comments would point at random other comments as their parent. Except that nearly all of these would in fact be in a different article, and would therefore not form a thread.
    --
    SIGSEGV caught, terminating

    wait... not that kind of sig.
  13. Re:Congrats taco by LiquidCoooled · · Score: 2, Informative

    Before they did the fix the parents were pointing at 16777215 (which is why clicking parent displayed a dodgy template.

    The reason why the thread for comment 16777215 is not massive is because the comments now use a double key, ParentID(cid) and StoryID (sid) - displaying a thread is now "select ..... where cid=yourparent and sid=yourstory"

    --
    liqbase :: faster than paper