Too Perfect a Mirror
Carewolf writes "Jeff Mitchell writes on his blog about what almost became 'The Great KDE Disaster Of 2013.' It all started as a simple update of the root git server and ended up with a corrupt git repository automatically mirrored to every mirror, deleting every copy of most KDE repositories. He ends by discussing what the problem is with git --mirror and how you can avoid similar problems in the future."
Preferably, before using them? This sounds very much like plain old incompetence, possibly coupled with plain old arrogance. Thinking that using a version control system absolves one from making backups is just plain old stupid. Then again, from what I have seen of the KDE project, that would be consistent.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
This is not a problem with git --mirror: rsync or any other mirroring tool would end up in the same situation.
It's up to the master to deliver the goods and upgrading a master should include performing a test run as well as making a backup prior to the real upgrade. This was a procedural failure, not a software failure. But good to hear disaster was averted.
Good grief!
After all of that, not a single proposed solution is a proper, rotational backup.
This is what rotational backups are FOR. They let you go back months in time, and even do post-corruption, or post-cracking examination of the machine that went down!
Backups do *not* need to be done to tape, but a mirror or a raid card is NOT a backup. This is actually simple, simple stuff, and it seems like the admins at KDE are a bit wet behind the ears, in terms of backups.
They probably think that because backups used to mean tape, that's old tech, and no one does that.
Not so! Many organizations I admin, and many others I know of, simply do off-site rotational backups using rsync + rotation scripts. This is the key part: copies of the data as it changes over time. You *never* overwrite your backups, EVER.
And with proper rotational backups, only the changed data is backed up, so the daily backup size is not as large as you might think. I doubt the entire KDE git tree changes by even 0.1% every day.
Rotational backups -- they work like a charm, would completely prevent any concern or issue with a problem like this, and ARE WHAT YOU NEED TO BE DOING, ALWAYS!
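For anyone wondering what the "rotation" half of rsync + rotation scripts might look like, here is a minimal sketch of the retention decision only. The function name snapshots_to_keep and the 7 daily / 4 weekly / 12 monthly defaults are made up for illustration, not taken from the post above or from any particular tool. It only decides which dated snapshots to retain; expiring the rest is left to whatever script calls it.

```python
from datetime import date, timedelta


def _newest_per_bucket(newest_first, bucket_of, limit):
    """Pick the newest snapshot from each of the `limit` most recent buckets."""
    chosen = {}
    for day in newest_first:
        bucket = bucket_of(day)
        if bucket not in chosen:
            if len(chosen) == limit:
                break  # older buckets are beyond the retention window
            chosen[bucket] = day
    return set(chosen.values())


def snapshots_to_keep(snapshot_dates, today, daily=7, weekly=4, monthly=12):
    """Return the snapshot dates to retain (hypothetical policy, see lead-in)."""
    newest_first = sorted(snapshot_dates, reverse=True)
    # Every snapshot taken within the last `daily` days.
    keep = {d for d in newest_first if 0 <= (today - d).days < daily}
    # The newest snapshot in each recent ISO week, then in each recent month.
    keep |= _newest_per_bucket(
        newest_first, lambda d: tuple(d.isocalendar())[:2], weekly)
    keep |= _newest_per_bucket(
        newest_first, lambda d: (d.year, d.month), monthly)
    return keep


if __name__ == "__main__":
    today = date(2013, 3, 25)
    nightly = [today - timedelta(days=n) for n in range(60)]
    for kept in sorted(snapshots_to_keep(nightly, today, monthly=3)):
        print(kept.isoformat())
```

The point of keeping weekly and monthly copies alongside the dailies is the one made above: corruption that goes unnoticed for a few weeks can still be rolled back past.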
No. Backup is out of scope for version control. Anybody with actual common sense would not expect it to make backups "magically" by itself and check to make sure. Then they would implement backups. But that does actually require said common sense.
Well, so this was _not_ a git failure, as there was an explicit warning that it does not cover this case. Not the fault of git but those that did not bother to find out. That a "mirror" operation does not check the repository is also no surprise at all.
Incidentally, even if git had failed, that is why you have independent and verified backups. A competently designed and managed system can survive the failure of any one component.
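Since a "mirror" operation does not check the repository, one option is to check it yourself before anything downstream is allowed to sync. Below is a rough sketch, assuming the git command-line client is installed and that a clean exit from git fsck --full is an acceptable health signal. The function names and the injectable run parameter are hypothetical, and this would not catch every failure mode, so it complements backups rather than replacing them.

```python
import subprocess


def repo_passes_fsck(repo_path, run=subprocess.run):
    """Return True only if `git fsck --full` exits cleanly for the repository.

    `run` defaults to subprocess.run and can be swapped for a stub in tests.
    """
    result = run(
        ["git", "-C", str(repo_path), "fsck", "--full"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def partition_repos(repo_paths, run=subprocess.run):
    """Split repositories into (healthy, broken) before any mirroring happens."""
    healthy, broken = [], []
    for path in repo_paths:
        if repo_passes_fsck(path, run=run):
            healthy.append(path)
        else:
            broken.append(path)
    return healthy, broken
```

A master that finds anything in the broken list would stop and alert a human instead of publishing, which keeps the failure on one machine rather than on every mirror.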
"Not the fault of git but those that did not bother to find out"
No, Git has an integrity check, and the integrity check didn't work. If the integrity check had worked as claimed, their backups would have been solid.
I know people are saying "keep backups", but they're really missing the point. A backup is a copy of something, the more up to date the better, better still if it keeps a historic set of backups. Perhaps with some sort of software to minimize the size, perhaps only keep changes..... you can see where I'm going with this.
Git sync to a lot of drives IS A BACKUP. It is exactly what an ideal backup should be: historic, up to date, and minimal in storage. What is that system if it isn't an automatic backup?
Except for this bug, which needs to be fixed, and a little less faith in git too would also be a good thing.
It's really no different than if you use the backup software, and it made careful backups and kept historic copies, and then one day your disk got corrupted, you promptly went to your backups only to find the backup software had been chomping those because it didn't notice the integrity was corrupt and had happily been corrupting the backups it was keeping.
So I see comments saying they didn't have backups OMG! But no, their problem was they only used ONE TYPE OF BACKUP SOFTWARE: Git sync. I bet all of you use only ONE type of backup software and are equally vulnerable to this failure.
May I respectfully disagree? I've often seen such focus on what is "out of scope" used to limit cost and to limit the "turf" on which an employer or contractor needs access. But backup is _certainly_ a critical part of source control, just as security is. The ability to replicate a working source control system to other hardware or environments after failure or corruption of the primary server is essential to any critical source tree. Calling it "out of scope" is like calling security "out of scope". When the consequences are ignored at the design stage of a source control system, very real risks are taken without anyone thinking them through, and the resources necessary to provide such critical features later can, and often do, multiply the cost of a project in unexpected ways.
A nightly mirror on low-cost hardware with snapshot capability, for example, can provide very useful fallback capability. Even hardlink-based software snapshots can work well. It requires thought to configure correctly, to schedule the mirrors so they don't conflict with other high-bandwidth operations such as tape backup, and to handle "churn" disk space requirements. And I've had some very good success with partners and clients who took such modest backup tools and saved enormous cost on high-speed tape backup systems and high-bandwidth connections for remote mirroring facilities, or who had difficulties meeting very short backup windows and solved them by using the mirror, or the snapshots, to do the tape backups for archival. It does inject a phase delay into the tape backups, and recovery from tape has to be tested, but it's been extremely effective.
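To make "hardlink-based software snapshots" concrete, here is a minimal sketch of the idea: each new snapshot is a full directory tree, but any file unchanged since the previous snapshot is a hardlink to the old copy, so only the "churn" costs disk space. The function name hardlink_snapshot and the size-plus-mtime change test are assumptions for illustration. The sketch skips symlinks, ownership, and special files, and the snapshots have to live on one filesystem that supports hardlinks.

```python
import os
import shutil


def _unchanged(src, old):
    """Treat a file as unchanged if size and whole-second mtime both match."""
    a, b = os.stat(src), os.stat(old)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def hardlink_snapshot(source, previous, target):
    """Snapshot `source` into `target`, linking unchanged files to `previous`.

    `previous` may be None for the very first snapshot.
    Returns (linked, copied) file counts.
    """
    linked = copied = 0
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            new = os.path.normpath(os.path.join(target, rel, name))
            old = None
            if previous is not None:
                old = os.path.normpath(os.path.join(previous, rel, name))
            if old and os.path.isfile(old) and _unchanged(src, old):
                os.link(old, new)  # no extra data blocks are used
                linked += 1
            else:
                shutil.copy2(src, new)  # copy2 preserves the mtime we compare
                copied += 1
    return linked, copied
```

Because older snapshots are never written to, a file that gets corrupted or deleted at the source shows up as a changed or missing file in the newest snapshot only, and the earlier trees still hold the good copy.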
Several times, I've found that the problem is a political one. The backup system is often a very expensive, high performance capital cost, or some kind of proprietary "turf" of a manager who is very comfortable with and enamored of it, and they're concerned that adding this layer will make them look foolish for spending the money, or cost them their job as a proprietary owner of critical infrastructure. They already had the political battle purchasing the hardware in the first place and don't care to rehash their previous work. But it's often amazing what staging the backups this way can do for performance and user access to their backed up data. Most restoration cases are due to accidental file deletion or editing, and the users no longer need access to the tape backup system or off-site archival, and only to the snapshots which have read-only access with the same privileges as the original source material.