Jamesday · Slashdot Mirror

Delayed edit visibility on Google Goes to Answers.com · 2005-03-13 23:25 · Score: 2, Insightful

The key problem with that simple version of the proposed feature is a fundamental design flaw: it can't achieve its design objective because new accounts are easy to create and can do the same thing.

Expected reaction: use of new throwaway accounts and loss of the useful anon editor indicator which currently makes it easier to handle vandalism.

Likely consequence: it'll probably make it harder to identify and deal with problems because more of them will be concealed behind throwaway accounts.

Lots of solutions look easy at first, until you think how people will react to them. Then you have to design for that expected reaction. And ideally for a few levels of counter-reaction and reaction beyond that. Some form of delayed visibility is likely to be useful but confining it to only anonymous edits is likely not to be a good approach.

Thanks for your report. Your image removed. on Google Goes to Answers.com · 2005-03-13 23:08 · Score: 2, Interesting

Thanks for your report of a copyright infringement. I've removed the image you uploaded from the en.wikipedia.org Lemonade article, so it'll gradually be removed from all reusers. For the benefit of readers here, here is a copy of the note I just placed on the talk page of the uploader:

A person claiming to be you [http://slashdot.org/comments.pl?sid=142358&cid=11929566] mentioned on Slashdot.org that you believed that the image [[:Image:Limonadedmg.jpg]] was not licensed with the GFDL but only with the CC-SA license. As the Slashdot post illustrated, that makes life difficult for reusers, who can't expect that the GFDL license will be sufficient. To avoid this, the upload agreement makes all uploads by the creator GFDL licensed ''in addition to'' any other licenses the uploader may wish to grant.

Please either confirm that you are willing to grant a GFDL license or, at your option, either list it for deletion (wrong license) or let me know so that I can do so. We've no interest at all in compelling you to license it in a way contrary to your wishes but are trying to maintain some consistency for reusers.

The Commons project does accept a broader range of images and you may wish to consider placing it there instead if you don't wish to grant a GFDL license but do still want to make it available for others to use.

You should also consider that your work is arguably a derivative work of the tent design, the logo on the tent and the design of the lemonade squeezer. For that reason, while you may be releasing your portion of the work under one license, you may be making fair use of the work of others, making the combined work fair use. Fair use is not accepted at Commons. It is accepted at en.wikipedia.org but that would require the GFDL license in addition to any others.

Thanks for your assistance in resolving the licensing misunderstanding.

A comparable response should be expected for any similar situations.

It's an interesting precedent on Music Piracy Unit Raids ISP in BitTorrent Assault · 2005-03-10 22:49 · Score: 1

So, we have this precedent. Next step, consumer and artist groups using the same approach on music labels, publishers and copyright associations to investigate possible price-fixing deals, some of which have already been found through other means. The law is a tool. It can serve more than one party. Perhaps time for some others to use it?

I have plenty of other things to be doing but I expect there are others around who might want to organise and get to work on this. No time like the present.

Well, we do get spikes, they just don't hurt on The Wikipedians Who Make it Happen · 2005-03-08 18:17 · Score: 4, Informative

Really obvious spikes are caused by Yahoo Japan. Extremely fast onset, 300-500 hits per second in less than a minute, then fast decay time over a few hours. One page so the Squids do an excellent job of caching it. The apache web servers/page builders don't normally show a spike at all from that. Slashdot has obviously slower onset, though it's still quite fast. TV also seems to cause fast spikes but we haven't seen enough while we've been able to chart it - previously had the caps set too low for a good measure. Newspapers are far more gentle in their load properties. The Tsunami coverage caused a general rise throughout the day for several weeks.

On the Slashdot/RSS thing, RSS is getting quite a reputation for really unpleasant surge loads. Something we're factoring in to anything we're doing in relation to RSS, designing for caching. Not really a surprise if Slashdot has had to do some tweaking.

We were suffering a bit today from the combination of Slashdot, Wired News (Wikipedia Becomes a Way of Life) and Spiegel Online with an overloaded image server. Image server was bouncing around 100% utilization, kept some pages in the queue too long and that hurt overall apache capacity. We've seen far worse and we're getting rid of that bottleneck. As a temporary measure we've asked people to remove some pretty but not content images from a few places. Won't last long, though.

On the fund-raising side, the drive ended early after exceeding its $75,000 target. It's currently at around $95,000 probably with some data still to arrive, close to reaching $100,000, my initial thought of a target. Really good news for those of us doing the capacity and reliability work but it'll take a few months for it to be visible. Thanks to everyone here who helped!

Anyone who wants to spend a bit of money on another useful project might consider sending a bit to Freenode.net, the IRC host. Among other things they host our channels, including our offsite 24/7 IRC NOC and a superb MySQL channel, regularly inhabited by MySQL employees. Providing good service to lots of other open source projects.

Don't worry, Slashdotting is insignificant... on The Wikipedians Who Make it Happen · 2005-03-08 03:59 · Score: 5, Insightful

Don't worry about it. Slashdotting is insignificant to us. Typically adds only 150-300 hits per second. Apache web server CPU use (we're about to buy 10 more), one of our Squid cache servers.

Now, how many places can honestly say that a Slashdotting is insignificant (ducking from CmdrTaco)?:-)

MySQL moving up on Power Outage Takes Wikimedia Down · 2005-02-25 14:02 · Score: 1

Big iron (and expected corporate features) is still an area where MySQL is rapidly evolving. I doubt it'll take two years. Likely less with stimulus from high profile incidents.

Restore from backup for MySQL really means "restore from your backup and replay your binary log until you get back to the point of failure". Or ask MySQL for assistance - they will look at such cases. Neither is as good as I'd like of course - either involves more extended unavailability of data when the site needs to be up, if with incomplete data, within minutes or a small number of hours.
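
As a rough illustration of the "restore, then replay the binary log" sequence, here is a minimal Python sketch that only builds the command lines and runs nothing. The file names, the start position and the helper name are made up for the example. `mysqlbinlog --start-position` piped into the `mysql` client is the usual replay route, but check the options against the manual for your version.

```python
import shlex

def replay_commands(dump_file, binlogs, start_position):
    """Return shell command strings for a restore followed by binlog replay.

    dump_file, binlogs and start_position are hypothetical inputs: the backup
    to load, the binary logs written since that backup (oldest first), and
    the log position recorded when the backup was taken.
    """
    restore = "mysql < " + shlex.quote(dump_file)
    # --start-position applies to the first log named on the command line;
    # later logs are read from their beginning.
    replay = "mysqlbinlog --start-position={} {} | mysql".format(
        int(start_position), " ".join(shlex.quote(b) for b in binlogs)
    )
    return [restore, replay]

for cmd in replay_commands("backup.sql", ["binlog.000042", "binlog.000043"], 4711):
    print(cmd)
```

The replay step is the slow part: every statement since the backup has to be applied again, which is where the extended unavailability comes from.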

On the followup side, additional power lines are being run to our racks and discussion with one RAID controller vendor indicates that a maximum of 20 minutes of battery backup can be expected. That's not long enough for a colo situation, so more followup with their engineers is needed, to see if they can produce something more realistic.

Re:Another indictment of MySql on Power Outage Takes Wikimedia Down · 2005-02-24 17:29 · Score: 1

Here's a portion of a report on the tests LiveJournal did. The chance isn't that small, in part because we have similar equipment, much from the same supplier. There's also a known OS glitch which is a possible factor, though this test doesn't cover that.

----

The client picks random 16kB-aligned offsets on the partition and picks a random 32-bit number which it writes in hex (%08x) over a 16kB range. it reports to the spewserver both BEFORE and AFTER the disk write.

-- the server notes what the client said it was about to do and what it reported doing.

-- let it run for awhile....

-- Pull the power...

-- server notices client hasn't sent anything in 3 seconds, quits, writing out a map of what 32-bit number pattern should be at each sector.

-- power on server

-- copy map file laptop (spewserver) to the server, run spewclient in verify mode. it dumps a histogram of errors per seconds-before-powerloss:

Histogram of seconds before end:
3 31
4 7
5 1
65 1

Well, the 3 seconds is really because the "end" is considered time AFTER the 3 second timeout, so that's kinda a bug. That should read 0,1,2 seconds before, not 3,4,5. But see how there are 31 regions that are bogus at t=0, 7 at t=-1, and 1 at t=-2?

That means something was lying, and we don't buy that hardware until we get it configured so it doesn't lie.

----
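
For anyone who wants to picture the test, here is a small Python sketch of the idea in the report above: write a recognisable pattern to random 16kB-aligned offsets, then check after the power loss which acknowledged writes are missing. It is not LiveJournal's tool. The function names are invented, and an in-memory `bytearray` stands in for the partition.

```python
import random
from collections import Counter

BLOCK = 16 * 1024  # the report's 16kB write size

def pattern(value):
    """Fill one 16kB block with a 32-bit number written as 8 hex digits."""
    return (b"%08x" % value) * (BLOCK // 8)

def pick_write(partition_size, rng):
    """Choose a random 16kB-aligned offset and a random 32-bit value."""
    offset = rng.randrange(partition_size // BLOCK) * BLOCK
    return offset, rng.getrandbits(32)

def verify(disk, expected, acked_at, power_loss_at):
    """Histogram of bad blocks by whole seconds before the power loss.

    expected maps offset -> value the client reported as written;
    acked_at maps offset -> time of that AFTER report.
    """
    errors = Counter()
    for offset, value in expected.items():
        if disk[offset:offset + BLOCK] != pattern(value):
            errors[int(power_loss_at - acked_at[offset])] += 1
    return dict(errors)

disk = bytearray(4 * BLOCK)            # stand-in for the partition
offset, value = pick_write(len(disk), random.Random())
expected = {offset: value}             # client reported this write as done
acked_at = {offset: 99.5}
# The write is never applied to `disk`, mimicking a cache that lied.
print(verify(disk, expected, acked_at, power_loss_at=100.0))
```

This sketch measures from the moment of power loss, so it doesn't carry the 3 second offset that the report describes as a bug in its own histogram.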

As you're probably aware, a system starting up doesn't tell you whether the disk drives are or aren't caching writes. You're probably also aware that some drives and/or controllers and/or drivers have been known to ignore flush requests and cache anyway.

Now, it is possible to design a database so that it can handle such failures in the rest of the system. I discussed that with MySQL while we were in the early stages of recovery to ensure that they were aware of this issue.

It's not only MySQL who are going to get such an approach from me. Two RAID controller vendors are going to as well, since the RAID controllers are supposed to be ensuring that the data is safely written.

The approach I take when asking a vendor to add a feature doesn't include pitching one of their competitors to anyone reading about the incident.

Re:mysql bad at disaster recovery? on Power Outage Takes Wikimedia Down · 2005-02-23 11:07 · Score: 1

You're not the only person wondering about the breaker issue. Last year we had an issue with an overloaded circuit which killed power to a rack so it's an issue we'll be discussing further. This and the things like fire and hurricane are part of why we're heading for more sites - we don't want problems at one place taking us down.

Four of the five RAID controllers have battery backed up write cache. Two controller brands.

There's an issue where Linux won't flush. There's the possibility that the controller didn't remember or didn't flush. There's the possibility that the drives were write caching. Or all of those at once may have happened. Your expectation matches mine. It still didn't happen.

Since there have been three recent prominent cases where it didn't happen on multiple systems from multiple vendors it's likely to be very clear to MySQL that real installations do have this as a problem and they should be addressing it. Will be interesting to see what they come up with.

So far two of the RAID systems have continuing problems which are keeping them from going back into service. One is rebuilding (which suggests that there was drive caching without flushing on command as an issue there). One was giving odd read results, so we're testing it and doing some planned changing of the RAID setup from 4 in 10 and 2 in 1 to 6 in 10 before we try putting it back into service.

Databases aren't the current performance issue though - half of the Squid cache servers we were using in Florida have continuing problems. Hadn't quite finished setting up 6 new ones before the power problem.

Excellent news for me is that we're at 61% of the $75,000 fundraising target already. If that continues and we go well over the target that makes it less necessary for me to be conservative in spending, so I can suggest more redundancy and such. My initial thought on a target was $100,000 and it's not a problem to spend all we get on speed and reliability things.

Hopefully the promising fundraising results so far and our intent to continue operating at least one site from donations will reassure anyone concerned about us becoming too dependent on any single donor. Thanks to those who have donated!

not quite on Power Outage Takes Wikimedia Down · 2005-02-22 15:41 · Score: 1

There are 8 people who have decided to call themselves that and are doing something. There's no broad community action on it and it's not in any way any sort of official editorial team with any official role.

Editing articles to a fixed state seems very unlikely to happen, since it's pretty thoroughly contrary to the method by which the project works and the complete and comprehensive objectives of the project. The general result of people trying to do it is them being barred from the project for uncooperative editing.

Paper and CD are risky targets because they lose the CDA and OCILLA protections which keep wikipedia.org, the Wikimedia Foundation and other contributors very safe from legal action based on content.

Re:Another indictment of MySql on Power Outage Takes Wikimedia Down · 2005-02-22 14:40 · Score: 1

The one which we used was "offline" only in the sense that we never use it for end user requests because it's used for bulk reporting, backup and apache web server work. It was applying transactions at a rate limited by its disk speed.

I expect the mod points will be used as deemed appropriate by those using them, on the basis of their understanding the merits of the posts concerned.

Re:Easy, brain-dead sql db recovery (if possible) on Power Outage Takes Wikimedia Down · 2005-02-22 12:43 · Score: 1

Agreed.

Oracle is still not going to happen. Not going to get into a potential huge recurring license and support fee situation. The cost/benefit trade for it just isn't right for this job - license and support fees can buy other solutions instead.

Re:mysql bad at disaster recovery? on Power Outage Takes Wikimedia Down · 2005-02-22 12:13 · Score: 1

At Wikimedia we're dealing with north of 180GB of data, though we're cutting that with better compression. Rsync from slave to slave on gigabit ethernet runs at anything from 20-40 megabytes per second while the network's servers are pushing out to the internet 80 or so megabits/s and doing about twice that on the internal connections. I expect it to take me 90-120 minutes to copy across all 180GB. Recovery of this sort was one of the reasons we switched from 100 megabit to gigabit for our internal network. We've benefitted in routine operations but this is the first time the investment has paid off in a major problem situation.
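
A quick check of the copy-time arithmetic, as a Python sketch. It assumes 1 GB = 1000 MB and a sustained rate for the whole copy, which real rsync runs won't quite manage.

```python
def copy_minutes(size_gb, rate_mb_per_s):
    """Minutes to copy size_gb gigabytes at a sustained rate in MB/s (1 GB = 1000 MB)."""
    return size_gb * 1000 / rate_mb_per_s / 60

for rate in (20, 25, 33, 40):
    print(f"{rate} MB/s -> {copy_minutes(180, rate):.0f} minutes")
```

The 20-40 megabytes per second range gives 150 down to 75 minutes, so the 90-120 minute estimate corresponds to averaging roughly 25-33 megabytes per second.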

I'm curious about the nature of your problem - what's happening or not happening when replication stops, what error messages, if any, what happens when you do stop slave, start slave or a reset slave. Whether it's all slaves at the same time or only one of them.

Wikimedia does use bots for quite a few things. One box is suffering from an FC2-related replication glitch and a script corrects that automatically. Same script automatically kills unacceptably slow queries but since we control everything we can set up rules for that which we know are sufficiently safe for the situation. Harder for your case, perhaps impossible without getting really upset customers.
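
To give an idea of what a slow query killer looks like, here is a minimal Python sketch. It is not the actual Wikimedia script. It takes rows shaped like `SHOW PROCESSLIST` output and returns the `KILL` statements to issue. The 60 second limit, the protected user list and the function name are illustrative assumptions.

```python
def kill_statements(processlist, max_seconds=60, protected_users=("system user",)):
    """Return KILL statements for queries running longer than max_seconds.

    processlist is a list of dicts shaped like SHOW PROCESSLIST rows
    (Id, User, Command, Time). The 60-second limit and the protected user
    list are illustrative choices, not Wikimedia's actual rules.
    """
    statements = []
    for row in processlist:
        if row["Command"] != "Query":
            continue  # skip sleeping connections, binlog dumps, etc.
        if row["User"] in protected_users:
            continue  # never kill the replication threads
        if row["Time"] > max_seconds:
            statements.append("KILL {}".format(int(row["Id"])))
    return statements

rows = [
    {"Id": 7, "User": "system user", "Command": "Connect", "Time": 5000},
    {"Id": 12, "User": "wikiuser", "Command": "Query", "Time": 340},
    {"Id": 13, "User": "wikiuser", "Command": "Query", "Time": 2},
    {"Id": 14, "User": "wikiuser", "Command": "Sleep", "Time": 900},
]
print(kill_statements(rows))
```

The hard part is choosing the rules, not the code. On a shared host a per-user exemption list would probably be needed before anything like this is safe.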

Re:mysql bad at disaster recovery? on Power Outage Takes Wikimedia Down · 2005-02-22 11:47 · Score: 1

The only one not using a fancy caching controller. Comparatively plain Linux software RAID 0 in a box used for last resort recovery and report generation, which never sees normal end user queries. Usual pattern for this box is to be running some heavy reports for 6-12 hours then catching up in replication. Repeat daily. Since it was current, it was online and replicating at the time of the power loss. If it hadn't been actively replicating/writing we'd have had some logs from the master to replay into it to get it current. It's typically doing 25-60 queries per second when it's catching up in replication, limited by disk speed, and it was probably doing that at the time of the power loss.

Wikipedia now read-write on Power Outage Takes Wikimedia Down · 2005-02-22 10:58 · Score: 1

Wikipedia is now read-write on a limited number of servers. Enough for most things but we still have some features disabled as the rest of the database servers catch up. Any data loss was limited, so far as we can tell at present, to the last few seconds at most.

Re:ACID on Power Outage Takes Wikimedia Down · 2005-02-22 10:55 · Score: 1

Since I'm mostly looking after MySQL servers I know I'm not current on all the latest PostgreSQL developments, so I thought it wise to leave an opening for those who know it better to correct me if I'd missed something new in the last few months.:)

Re:mysql bad at disaster recovery? on Power Outage Takes Wikimedia Down · 2005-02-22 08:01 · Score: 1

The most recent case was a replacement for search where the prototype was in use and doing the same work on a server costing about $2,000 that two and a bit database servers costing $12-14,000 were doing. Java tool. Compatible with the latest java standard. Not allowed to run it apparently because the GNU JVM isn't compatible with the latest standards and at least one board member wants to use the GNU one, instead of using the Sun JVM until it catches up or the search engine gets further along in the prototype stage and gets to the point of doing compatibility work.

ACID on Power Outage Takes Wikimedia Down · 2005-02-22 07:24 · Score: 2, Informative

Except it's now been a few years since MySQL incorporated InnoDB, so maybe it's time to move on and rejoice that it's now one of the free database servers with ACID support? This one happens to come with standard replication and fulltext search. Also with a range of other engines to choose. PostgreSQL, last I knew, doesn't have built in replication, fulltext search and alternative storage engines but has its own particular strengths. In the end, every end user gets to benefit from the competition between excellent tools. Good for us all to be happy about that.

Re:Coincidence... ;) on Power Outage Takes Wikimedia Down · 2005-02-22 07:15 · Score: 1

Slashdot is pretty popular. Wikipedia really only does 7-10 times its traffic according to Alexa.com. Hard to say which is most undercounted though. Wikipedia probably gets more AOL traffic, Slashdot more people who don't do more than curse at the Alexa toolbar used for collecting the statistics. Neither counted.

Re:Another indictment of MySql on Power Outage Takes Wikimedia Down · 2005-02-22 07:01 · Score: 1

The problem is working out why it didn't do what it is capable of doing and did on one system. Did the grandparent really expect the database to survive OS, controller and/or drives all lying about what they have committed to disk? That's the sort of issue we appear to have.

It is something which is worth trying to protect against.

If you haven't seen it already, take a look at the results from LiveJournal's testing.

Re:Another indictment of MySql on Power Outage Takes Wikimedia Down · 2005-02-22 06:51 · Score: 2, Insightful

Depends on the cause. If the database server software was being lied to by the OS, controller or drives I'm not sure just how much I'm inclined to blame the database server software.

I am inclined to ask the database server vendor to see if they can find ways to protect against it and I've briefly discussed that already.

Re:ETA for read only service is now 2-4 hours. on Power Outage Takes Wikimedia Down · 2005-02-22 06:31 · Score: 1

Thanks. Don't thank too much though: remember that the technical team is supposed to be preventing this, not recovering from it. But recovering beats not.:)

Looks as though we're still 4-6 hours away from being read-write again. Catching up with the lag on the one we're restoring from is going fine, just takes a while. Seems very unlikely that we've lost any significant amount of data - of the order of fractions of a second to second's worth just before the failure is most likely.

Re:Write-ahead logging on Power Outage Takes Wikimedia Down · 2005-02-22 04:47 · Score: 1

Yes, it was 2.6.9 on Opterons (2.6.9-1.6_FC2smp and 2.6.9-1.681_FC3smp). Don't know without checking about O_DIRECT. The one which survived completely intact is a P4 on 2.6.9-1.11_FC2.

Re:Integrity? on Power Outage Takes Wikimedia Down · 2005-02-22 03:35 · Score: 1

The best protection possible is having the Foundation have no more rights than any other GFDL licensee. And to be absolutely sure that it's not acting as if it's some sort of association of all contributors or something, in which case someone could argue that it had the rights to do that sort of unfortunate assignment of rights and has no Communications Decency Act or DMCA protection from the acts of those who contribute content. At which point it's conceivable for it to lose a copyright infringement case and all rights to the content except those granted by the GFDL.

One of the unfortunate things it's doing now is registering as a trademark things which are already common law trademarks of the authors of the work Wikipedia. It'll be interesting to see if someone objects to the trademark registration on that basis. I'm certainly considering it, because a successful registration will increase the vulnerability of the name of the work, changing it from an effectively impossible to lose trademark to one the Foundation can lose. Trademark registration for the Foundation being refused because of an existing common law trademark of the contributors would be a very positive result, protecting it from others without adding risk of loss.

There are some who don't understand this sort of risk reduction and suggest silliness like having the Foundation have copyright assigned to it.

This sort of thing is one of the reasons why I try to make it very clear that I am not a "member" of the Foundation and it has no power to act on my behalf in any matter regarding IP rights to what I've contributed. It's the best course to protecting the works.

Re:Write-ahead logging on Power Outage Takes Wikimedia Down · 2005-02-22 03:12 · Score: 1

Happy to oblige, since I really do understand what's happening with the systems I'm one of those looking after. And have been tracking the various investigations LiveJournal did.

I will be chatting with MySQL about this though. Two big sites with nice controllers having the same problem (and in our case, two different vendors) means there's something they should really look into handling better, somehow. Certain that we aren't the only people affected, just the prominent ones who are willing to write about it in public.

Re:170 gigs? on Power Outage Takes Wikimedia Down · 2005-02-22 03:03 · Score: 1

About this time last year, 15,000 RPM SCSI drives had 73GB as their largest available size and cost about $700 each. We have 6 in the master database server, set up in RAID10 with about 210GB of usable space available. 7200 RPM and 10,000 RPM drives were available in larger sizes.
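
The capacity figure follows from how RAID 10 works: half the drives hold mirror copies. A small Python sketch of the arithmetic, with a made-up helper name:

```python
def raid10_usable_gb(drives, drive_gb):
    """Nominal usable space of a RAID 10 array: half the drives hold mirror copies."""
    if drives < 4 or drives % 2:
        raise ValueError("RAID 10 needs an even number of drives, at least 4")
    return drives // 2 * drive_gb

print(raid10_usable_gb(6, 73))  # 219 nominal; the post quotes about 210 usable
print(6 * 700)                  # 4200, the rough drive cost in dollars
```

The gap between 219 and about 210 is presumably formatting overhead and differing definitions of a gigabyte. That explanation is a guess, not something measured.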

We're definitely purchasing greater capacity for the next batch of master server systems. Thinking in terms of a terabyte or so and 12-16 drives.

The traffic volume is also interesting. Think of about 200 million selects and 1.2 million inserts/updates per day and so far up to about 1300 hits per second. Slashdotting of us is only about 200-400 hits per second or so and stopped, usually, being a problem around April 2004.
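
Converted to per-second averages, as a Python sketch. These are whole-day averages only, so peak rates will be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def per_second(per_day):
    """Average rate per second for a daily total."""
    return per_day / SECONDS_PER_DAY

selects = per_second(200_000_000)
writes = per_second(1_200_000)
print(f"selects: {selects:.0f}/s")
print(f"writes:  {writes:.1f}/s")
print(f"reads per write: {selects / writes:.0f}")
```

That works out to roughly 2,300 selects and 14 inserts/updates per second on average, about 167 reads for every write.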
