MIT's New File System Won't Lose Data During Crashes

Define "crash". by Anonymous Coward · 2015-08-24 00:58 · Score: 0

Hardware failure?

Re:Define "crash". by Nerrd · 2015-08-24 07:05 · Score: 1

rtfa - this is clearly a discussion about software and/or power failure, not hardware.
Re:Define "crash". by ChumpusRex2003 · 2015-08-24 07:47 · Score: 3, Insightful

However, considering how many firmware bugs there are in modern SSDs and HDs which cause out-of-order writes, shorn writes, and various other random corruptions and bad behaviour during a power failure or system reset, what would be more interesting is how the file system could cope with one of those.
Re:Define "crash". by gweihir · 2015-08-24 08:44 · Score: 2

As the Linux FS developers found some time ago, many modern disks also blatantly lie about having flushed data to disk. With software, you can then only do heuristic things to reduce the impact, not more. This "magic" MIT FS cannot do any better, and hence is entirely superfluous.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Define "crash". by Darinbob · 2015-08-24 14:50 · Score: 1

That's a snag I've found in the past. So much hardware is consumer grade so that the datasheets are rubbish. You think that your write has succeeded but it's still got data stuck in the cache, and the "flush, dammit" command is actually a no-op. So a never-fails file system I used in the past was getting corrupted, and the developers of it were baffled because they never saw a failure like that before.
So let's say your super reliable file system is mathematically proven, but the axioms are false, what then? Consider a flash drive. Inside it has all of its own logic; when to copy from RAM into the actual flash chips, when to copy and erase sectors behind the scenes invisibly to the user, and so forth. There's a lot of state machine activity going on that the mathematically proven file system knows nothing about. If the synchronize-cache command doesn't quite do what expected then things can break. Write operations can happen out of order also (the real reason for synchronize-cache, which is why you should specify the entire device and not just the most recent block). What if power is killed in the middle of an internal copy-erase-compact cycle, which leaves you at the mercy of whoever designed your device?
Re:Define "crash". by Darinbob · 2015-08-24 14:57 · Score: 3, Insightful

Many consumer devices generally conform to an unwritten standard, not a written one. That unwritten standard is "do whatever is needed to make it work on Windows". This means that mandatory commands might not be fully tested, or sometimes not even implemented. And because these are consumer devices, the manufacturers generally aren't concerned about rare end-user problems, since whoever bought the flash thumb drive on sale isn't going to complain, and the margins are way too small to waste time on reliability. All that matters is getting it to work for most people most of the time.
Re:Define "crash". by gweihir · 2015-08-24 15:45 · Score: 1

I completely agree.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Define "crash". by arglebargle_xiv · 2015-08-24 15:57 · Score: 1

Beware of bugs in the above code; I have only proved it correct, not tried it.

-- Donald Knuth.
This new filesystem sounds like another case of this...
Re:Define "crash". by doccus · 2015-08-25 16:37 · Score: 1

Same as if power is killed in any initial write. I can't accept any drive can be psychic and know what's not yet written I suppose heuristics could fill in some missing data but it surely can't complete all the data that never even hit the cache. I'm no programmer or expert so perhaps I'm missing something here...

But is it useful? by Junta · 2015-08-24 00:58 · Score: 4, Interesting

slow compared to other modern examples, but the researchers say their formal verification can also work with faster designs

If we can accept 'slow', it's not that difficult to build an always consistent filesystem. While they can formally verify a faster design should one exist, there remains the question of whether it's possible to be formally proven to be resilient to data loss while taking some of the utterly required performance shortcuts for acceptable performance. I suspect the answer is that some essential performance 'shortcuts' will fail that verification.

--
XML is like violence. If it doesn't solve the problem, use more.

Re:But is it useful? by FranTaylor · 2015-08-24 01:12 · Score: 1

If we can accept 'slow', it's not that difficult to build an always consistent filesystem.

citation required
Re:But is it useful? by Anonymous Coward · 2015-08-24 01:25 · Score: 5, Funny

A sufficiently slow writing filesystem is indistinguishable from a read-only filesystem. Read-only filesystems are consistent. Therefore, a slow enough writing filesystem is consistent.
For more serious analysis of crash-resistant write sequences, look elsewhere in this discussion.
Re:But is it useful? by FranTaylor · 2015-08-24 01:31 · Score: 1

what does this phrase even mean? why does speed or slowness matter? you can play the video of a disk disaster at any speed you want
Re:But is it useful? by Anonymous Coward · 2015-08-24 01:39 · Score: 0

That's the worst 'proof' I've ever seen on /.
Re:But is it useful? by Anonymous Coward · 2015-08-24 02:11 · Score: 0

You are talking out your ( | )
Re: But is it useful? by Anonymous Coward · 2015-08-24 02:20 · Score: 0

This will be perfect for those poor politicians whose tweets get deleted.
Re:But is it useful? by Darinbob · 2015-08-24 14:59 · Score: 1

Write a block to the storage device.
Apply all necessary flush and synchronization commands, and a few unnecessary ones.
Power off the storage device.
Power the device back on.
Read back the block to ensure it was actually written.
Repeat as necessary until block is confirmed as having been written.
Continue with block number two...
Re:But is it useful? by arglebargle_xiv · 2015-08-24 16:05 · Score: 1

If we can accept 'slow', it's not that difficult to build an always consistent filesystem.
citation required
Sheesh, do we have to do the Googling for you?
Re:But is it useful? by Anonymous Coward · 2015-08-24 19:14 · Score: 0

if it doesn't write before you die of complications of age, it is practically read only.

Well it won't lose data... by frank_adrian314159 · 2015-08-24 00:59 · Score: 1

... until they find a logical flaw in their proofs or the bugs in mechanical verifier(s) that helped them prove the driver correct.

--
That is all.

Re:Well it won't lose data... by Anonymous Coward · 2015-08-24 01:08 · Score: 1, Interesting

Or – more likely – until they find a flaw in their assumptions, like a lower-level software stack that swears "yes, this data has been committed to storage" slightly before that's actually true.
Re:Well it won't lose data... by Anonymous Coward · 2015-08-24 01:10 · Score: 0

Well, the mechanical verifier is very small - 400 lines of verifier code, that is read by many,
still is better than file-system code which span many more lines.
Re:Well it won't lose data... by Anonymous Coward · 2015-08-24 01:31 · Score: 0

How about this as a clue?
Four color theorem: formal proof using Coq was completed in September 2004.
Feit–Thompson theorem: formal proof using Coq was completed in September 2012.
source: https://en.wikipedia.org/wiki/Coq
Re:Well it won't lose data... by Anonymous Coward · 2015-08-24 01:57 · Score: 0

Or a higher level one in the app that's saved your data to a data structure in memory but not yet written it to the filesystem.
Re: Well it won't lose data... by Anonymous Coward · 2015-08-24 05:20 · Score: 0

And then sent all that data to the cloud for "processing and storage"
Re:Well it won't lose data... by Anonymous Coward · 2015-08-24 06:37 · Score: 0

I think this one gets listed under "not our problem".
Re:Well it won't lose data... by gweihir · 2015-08-24 08:47 · Score: 1

That is pretty unlikely, but the whole thing is a worthless stunt anyways. The problem is that they have to use some hardware model and that will have errors. Hence the assurances they claim are purely theoretical, and in practice their thing may well be less reliable than a well-tested file system with data-journal, like ext3.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Well it won't lose data... by Anonymous Coward · 2015-08-25 06:07 · Score: 0

I love these, another "theoreticians say they formally verified X but they probably made a mistake/are pathological liars/have an embarrassing sexual inadequacy" thread.
People say that formal verification is impossible, and indeed it is as long as no assertions can be made about the software underlying the software you're trying to verify. But it should be possible to make those assertions.
Have a look at Leslie Lamport's (highly condescending) "How to Tell a Program from an Automobile." http://research.microsoft.com/en-us/um/people/lamport/pubs/automobile.pdf

I take it... by Anonymous Coward · 2015-08-24 00:59 · Score: 0

... that "crashing" here refers to the program using the file system, not the disk?

Formally verified system fails by Anonymous Coward · 2015-08-24 01:00 · Score: 3, Funny

MIT's new "crash-proof file system" crashed today amid accusations of bugs in the formal proof verification software used to formally verify it.

MIT are now working on a formal verification of the formal verification system, in order to avoid similar formal-verification-related problems in the future.

Re:Formally verified system fails by Anonymous Coward · 2015-08-24 04:52 · Score: 1

It's formal verifications all the way down, until you meet the axiom of choice.
Re:Formally verified system fails by arglebargle_xiv · 2015-08-24 16:37 · Score: 1

MIT's new "crash-proof file system" crashed today amid accusations of bugs in the formal proof verification software used to formally verify it.
So the whole thing was a bit of a Coq-up?

Linux File Systems by Anonymous Coward · 2015-08-24 01:04 · Score: 5, Interesting

I find some of the current file systems to be adequately reliable. Even their performance is acceptable. But, the Linux systems are lacking.

Is there a reliable Linux file system such as EXT4 that has an easy to use copy on write(CoW) feature to allow instant recovery of any file changed at any time?

rm ./test
restore --last test ./

dd if=/dev/random of=./test bs=1M count=10
restore --datetime test ./

Novell Netware FS did all this and more in 1995 FFS! Fifteen years later and Linux doesn't seem to be able to do this. NTFS doesn't seem to be able to do this either. Yet Novell is dead?

Re:Linux File Systems by Anonymous Coward · 2015-08-24 01:11 · Score: 2, Informative

Well, ex-Googler Kent Overstreet recently announced a COW filesystem on lkml:
https://lkml.org/lkml/2015/8/21/22
Not ready for production I would say, but looks interesting.
Re:Linux File Systems by swb · 2015-08-24 01:22 · Score: 4, Interesting

I still think Netware's filesystem permission model was better than anything out there now, at least for filesharing.
The feature I miss the most is allowing traversal through a directory hierarchy a user has no explicit permissions for to get to a folder they do have permissions for. I find the workarounds for this in other filesystems to be extremely ugly.
I think NDS was better in a lot of ways than AD, although it would have been nice if there had been something better than bindery mode for eliminating the need for users to know their fully qualified NDS name.
I also kind of wish TCP/IP had used the network:mac numbering scheme that IPX used. The rest of IPX/SPX I don't need, but there'd be no talk of address exhaustion of IPv4 if that scheme had been adopted, little need for DHCP address assignment and the addressing scheme would scale to the larger broadcast domains enabled by modern switching (avoiding the need to renumber legacy segments completely when they exhausted a /24 space and expansion via mask reduction wasn't possible due to linear numbering on adjacent segments).
Re:Linux File Systems by Anonymous Coward · 2015-08-24 01:27 · Score: 0

ZFS, BTRFS could offer a similar feature.
You can snapshot any directory/subvolume you want, anytime, but they aren't true versioning file systems.
Re:Linux File Systems by Christian+Smith · 2015-08-24 01:32 · Score: 3, Interesting

I find some of the current file systems to be adequately reliable. Even their performance is acceptable. But, the Linux systems are lacking.
Is there a reliable Linux file system such as EXT4 that has an easy to use copy on write(CoW) feature to allow instant recovery of any file changed at any time?
NILFS2 provides continuous point in time snapshots, which can be selectively mounted and data recovered. Not quite as instant recovery as your use case examples, but it's only a few commands/wrapper scripts away.
Re:Linux File Systems by Anonymous Coward · 2015-08-24 01:37 · Score: 0

Windows servers have supported the traverse permission since at least 2008. https://technet.microsoft.com/en-us/library/Cc787794(v=WS.10).aspx .
And dear god no on IPX/SPX . That's what killed Novell, no one wanted to run a 2nd, seperate layer 3 networks to get to the internet.
Re:Linux File Systems by Anonymous Coward · 2015-08-24 01:38 · Score: 0

+11111111111111111111111111111111111...
NetWare's file permission inheritance and traversal is still way better than what we have today in that space; I still miss it regularly.
I don't miss NDS as often, but it was still a better directory service than AD is today.
Re:Linux File Systems by nbvb · 2015-08-24 02:47 · Score: 3, Interesting

Bah, forget NetWare, VINES and StreetTalk did everything you ask for and then some way before NDS was even a thought.
VINES' ACL's were beautifully granular ...
Re:Linux File Systems by allo · 2015-08-24 02:50 · Score: 1

nilfs2 does.
Re:Linux File Systems by swb · 2015-08-24 02:51 · Score: 2

It's a kludge, though, in NTFS, in Netware it just worked.
IPX/SPX as a layer 3 protocol isn't what I wanted, I wanted TCPv4 with a network prefix and the MAC as the node address. Clients could derive their address automatically from network traffic by picking up the network address from the wire and they already knew their MAC address.
Although to be honest, IPX/SPX even as a secondary protocol wasn't that bad to support in a mixed environment. We had no issues with TCP, IPX *and* CrappleTalk on 512k frame relay links. We retired internetworking on IPX when the last Netware servers went away but were stuck with AppleTalk (circa 2003).
Re:Linux File Systems by LDAPMAN · 2015-08-24 03:01 · Score: 2

It still is: https://dl.netiq.com/Download?...
I build multi-million user IAM systems on it for a living.
Re:Linux File Systems by Anonymous Coward · 2015-08-24 03:03 · Score: 0

NTFS does have something like this with Previous Versions, but it takes manually turning it on, scheduling it to snapshot volumes, and making sure there is enough space on each snapshotted volume to handle most oopses.
So far, there are two filesystems I've worked with which are excellent at handling snapshots and undeletes with few headaches. NetApp's WAFL, and EMC's Isilon. WAFL is nice because it creates hourly, daily, weekly, and monthly snapshots which are easy to access. Isilon's snapshot functionality (SnapshotIQ) also creates a .snapshot directory, but one needs to configure it beforehand of when to take snapshots and how long to retain them.
I would agree that the Novell filesystem had a lot of cool features. One that was nice is that if a user was not given access to a file or folder, it didn't show up. Windows sort of has this with file sharing, by turning on access based enumeration, but Novell was better at this since it was done locally.
As for filesystems, the one company that needs a kick in the tush to upgrade their antediluvian filesystem is Apple. HFS+ is one of the biggest kludges out there, and it is a miracle that it has held up this long. Ideally, Apple needs to bite the bullet and go ZFS, but if that can't be done, work on XSan or knock on Symantec's door, grab the code for Veritas's filesystem and use that (hell, Symantec sits on enough IP to create a second computing revolution, but they do little to nothing with it.) Posting AC... but flames or no, Apple is the window-licker when it comes to OS filesystems as of now.
Re:Linux File Systems by Anonymous Coward · 2015-08-24 03:07 · Score: 0

rm /bin/rm
alias rm mv ~/recycle
alias restore mv ~/recycle/
Re:Linux File Systems by LordLimecat · 2015-08-24 03:59 · Score: 1

How is it a kludge? You have the option in NTFS of having that everyone:traverse permission propogating through the directory heirarchy, but it isnt required. It sounds like Netware just assumes that traversal rights are implicit, where there may be scenarios where it is not desired.
Re:Linux File Systems by Anonymous Coward · 2015-08-24 04:01 · Score: 0

No. Snapshots offer lengthy windows of opportunity for loss. And especially in teh case of Microsoft Windows where the snapshots aren't even turned on or scheduled by default, Previous Versions is a primitive and pathetic attempt at what I'm talking about.And scheduling previous version snapshots every couple of minutes enjoys a crushing effect on system performance.
Copy on Write , al la Netware FS, means that each file is "snapshotted"(recycle binned or whatever you want to call it) anytime/every time that it is modified. If there is a file write/delete the file is copied. If it gets deleted, the previous copy is available. No file is ever missed, unless you configured it with the Purge Immediate attribute(tmp files), it is protected by default until volume space is required.
Re:Linux File Systems by Bengie · 2015-08-24 04:41 · Score: 2

A true versioning file system would crumble under an IO workload with many small updates. The problem with btrfs is you can't snapshot two subvolumes in sync with eachother. Snapshots are per subvolume. In ZFS, snapshots are at the pool level, allowing volumes to be in perfect sync.
Re:Linux File Systems by psmears · 2015-08-24 04:50 · Score: 1

I wanted TCPv4 with a network prefix and the MAC as the node address. Clients could derive their address automatically from network traffic by picking up the network address from the wire and they already knew their MAC address.
You mean more or less exactly what IPv6 does (in many cases)?

--
Need to type accents and special characters in Windows? Use FrKeys
Re:Linux File Systems by AvitarX · 2015-08-24 06:23 · Score: 1

That's how I read it, wishing that v4 had done that already.

--
Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
Re:Linux File Systems by vovin · 2015-08-24 06:28 · Score: 1

You are looking for NILFS2.
Of course you use more disk because you nothing gets deleted...
Re:Linux File Systems by Anonymous Coward · 2015-08-24 06:42 · Score: 0

Snapshots are filesystem level in ZFS, not pool level.
Re:Linux File Systems by Bengie · 2015-08-24 08:14 · Score: 1

I think you mean "volume" instead of "filesystem". All snapshots are relative to the pool. You can create a snapshot of several volumes at once in perfect sync because the snapshot is actually at the pool level, but only attached to the relative volume that you're looking at. When you have mounted volumes inside of volumes, not only will the data of the current volume be part of the snapshot, but all of the data in the mounted volumes.

BTRFS stores snapshots in the volume, ZFS stores them in the pool. If you snapshot a BRTFS sub-volume, you will only get that sub-volume and nothing else. You can make a script to loop through the mounted sub-volumes within a sub-volume, but there is no guarantee they will be all in sync. Even worse, if a sub-volume later gets unmounted and moved somewhere else, or the sub-volume is deleted in BTRFS, all data for that sub-volume is gone, including it's snapshots. In ZFS, if you have a snapshot of parent volume and there are mounted child volumes, and you snapshot at the parent level, everything is part of that snapshot, including the contents of the child volumes.
Re:Linux File Systems by mattack2 · 2015-08-24 09:12 · Score: 1

Novell Netware FS did all this and more in 1995 FFS! Fifteen years later ...
I think your Novell system needs a new clock battery or something...
Re:Linux File Systems by Anonymous Coward · 2015-08-24 10:45 · Score: 0

I think you mean "volume" instead of "filesystem". All snapshots are relative to the pool. You can create a snapshot of several volumes at once in perfect sync because the snapshot is actually at the pool level, but only attached to the relative volume that you're looking at. When you have mounted volumes inside of volumes, not only will the data of the current volume be part of the snapshot, but all of the data in the mounted volumes.
This is mostly incorrect, or at least poorly put. ZFS snapshots are applied to 'datasets', which are either filesystems or zvols (virtual block devices) of a pool. If I have many datasets underneath my pool's root (which is itself a filesystem dataset), I can snapshot the root without snapshotting descendants. You *can* snapshot all descendant datasets with the -r option, but this is optional.
Re:Linux File Systems by Anonymous Coward · 2015-08-24 12:36 · Score: 0

Nope, you should bust out the zdb and you'll see that the snapshot DMU object is very much attached to a dataset or zvol.
Re:Linux File Systems by arglebargle_xiv · 2015-08-24 16:50 · Score: 1

Bah, forget this newfangled crap, Multics was doing this half a century ago (which includes use of RAID tape drives, checkpointing, ACLs, and other recent innovations).
Re:Linux File Systems by Anonymous Coward · 2015-08-24 22:44 · Score: 0

Well Windows/NTFS has had "Bypass traverse checking" for ages which is actually set by default, so no workaround needed there (it's not Linux, but neither was Netwere ...)
Re:Linux File Systems by Bengie · 2015-08-25 02:51 · Score: 1

I already said that. The snapshot is "attached" to the dataset, but the snapshot is scoped to the pool.
Re: Linux File Systems by bill_mcgonigle · 2015-08-25 05:02 · Score: 1

what is the command to snapshot two zvols atomically?

--
My God, it's Full of Source!
OUTSIDE_IP=$(dig +short my.ip @outsideip.net)
Re:Linux File Systems by ebvwfbw · 2015-08-25 14:32 · Score: 1

Really AC? A simple un-delete trick and you're impressed? It can be done with EXT as well, however why would you even want to do that? This is why you deleted it in the first place. Don't delete stuff you want. You should have learned that back in 1995. You know, I have a little course you can take from me. $5,000 for one weeks worth of training. This is where I have you type in for about a day, and delete your file. You must then type it in again the second day, to delete it again. By the 5th day you should be over your unintentional deletion problem. We'll all go out to Ruby Tuesdays or something for dinner and laugh at you.
This isn't why Novell died. It was a marketing tsunami from Microsoft to sell a network services they couldn't give away before. Repackage it, call it "New Technology" and people bought their BS. I remember we were replacing a single Novell server with up to 5 much newer, much more memory, disk, etc machines to do barely the same job as the old reliable Novell server. Here we are all these years later and M$ still isn't much better. Unix still takes rules. Even in Government agencies, sure they have a few MS boxes, about as many or more Linux machines. The lightweight stuff is on the M$ boxes, the real computing is still on the Linux machines. Always will be.
Re: Linux File Systems by Bengie · 2015-08-28 05:18 · Score: 1

If you use a ZFS mount to mount a volume under a volume, and you snapshot the parent volume, you get the child volume snapshotted atomically. All datasets in ZFS share the same uberblock, so they can all be in perfect sync.

It's a matter of distance by aglider · 2015-08-24 01:06 · Score: 2

Not a matter of proof. The distance between perfect design and buggy implementation. IMHO.

--
Sent as ripples into the electromagnetic field. No single photon has been harmed in the process.

Journaled File System? by marciot · 2015-08-24 01:11 · Score: 1

i thought Journaled file systems already possessed this feature.

Re:Journaled File System? by FranTaylor · 2015-08-24 01:15 · Score: 1

i thought Journaled file systems already possessed this feature.
just like air bags and seat belts have eliminated all deaths on the road
Re:Journaled File System? by Anonymous Coward · 2015-08-24 01:23 · Score: 0

Along with Copy On Write I think that covers most of the possibilities (old block is valid and usable until the (hopefully replicated) metadata is updated after the write finishes successfully).
They probably figured that on the software side filesystem bugs cause the most data loss, and thus focused on making super tight code.
Re:Journaled File System? by Darinbob · 2015-08-24 15:06 · Score: 1

You need to ensure that blocks are written to the media in the correct order. Or at least that everying before a synchronization point was completely written to the media. But even that is not always true because devices will lie and claim to have flushed data when they have not. So you also need to ensure that your underlying block based device is operating correctly, and that can be tricky when it's a third party device.

Is it really THAT hard? by ledow · 2015-08-24 01:13 · Score: 4, Interesting

Write zero to a flag.
Write data to temporary area.
Calculate checksum and keep with temporary area.
When write is complete, signal application.
Copy data from temporary area when convenient.
Check checksum from temporary to permanent is the same.
Mark flag when finished.

If you crash before you write the zero, you don't have anything to write anyway.
If you crash mid-write, you've not signalled the application that you've done anything anyway. And you can checksum to see if you crashed JUST BEFORE the end, or half-way through.
If you crash mid-copy, your next restart should spot the temporary area being full with a zero-flag (meaning you haven't properly written it yet). Resume from the copy stage. Checksum will double-check this for you.
If you crash post-copy, pre-flagging, you end up doing the copy twice, big deal.
If you crash post-flagging, your filesystem is consistent.

I'm sure that things like error-handling are much more complex (what if you have space for the initial copy but not the full copy? What if the device goes read-only mid-way through?) but in terms of consistency is it really all that hard?

The problem is that somewhere, somehow, applications are waiting for you to confirm the write, and you can either delay (which slows everything down), or lie (which breaks consistency). Past that, it doesn't really matter. And if you get cut-off before you can confirm the write, data will be lost EVEN ON A PERFECT FILESYSTEM. You might be filesystem-consistent, but it won't reflect everything that was written.

Journalling doesn't need to be mathematically-proven, just logically thought through. But fast journalling filesystems are damn hard, as these guys have found out.

Re:Is it really THAT hard? by Anonymous Coward · 2015-08-24 01:21 · Score: 0

I'm glad you have it all sorted out. Where may I download your operating system good sir?
Re:Is it really THAT hard? by Anonymous Coward · 2015-08-24 01:32 · Score: 0

So you're saying All You Have To Do Is....?
Re:Is it really THAT hard? by Anonymous Coward · 2015-08-24 01:40 · Score: 0

Yes, it's THAT hard.
Your outline is a good start, but it's not a proof, nor anywhere close to one. Having a proof is a big deal. Hand-wavers on Slashdot may (and do) disagree, but that's to be expected.
Re:Is it really THAT hard? by Guspaz · 2015-08-24 02:37 · Score: 4, Insightful

> When write is complete, signal application.
How do you know the write was complete? Most storage hardware lies about completing the write. The ZFS folks found this out the hard way: their filesystem was supposed to survive arbitrary power failures, and on a limited set of hardware that was true. In reality, most drives/controllers say they've committed the write to disk when it's still in their cache.
Any filesystem that claims to survive crashes needs to take into account that any write confirmation could be a lie, and that any data it has written in the past may still be in a volatile cache.
Re:Is it really THAT hard? by Anonymous Coward · 2015-08-24 03:54 · Score: 0

What happens when your disk has a hardware errror and starts reporting incorrect values for your flags or your checksums? what happens when your operator plugs the computer into a EU electric plug instead of a US plug, and the disk starts doing random crazy things?
Not to say your idea isn't good. But the real world is CRAZY GOOD at finding ways to screw things up.
Re:Is it really THAT hard? by fnj · 2015-08-24 07:14 · Score: 2

You can't expect software to paper over BUSTED HARDWARE. If a disk drive flat out lies about status, expose the goddam manufacturer and sue him out of existence. If you think anyone can paper over the scenario you just outlined, then what about this - what about a disk drive that NEVER WRITES ANYTHING but lies and says everything is going hunky dory? Pretty damn sure there's nothing you can do in that scenario.
I've heard that story about the "drives that lie about about write-to-physical-media-complete" many times. As far as I am concerned it is apocryphal. Never once have I seen the manufacturers and model numbers specified. I think it probably came from a long long time ago (in technology time). Finally I sure as hell have no idea how "examples exist" turned into "most storage hardware" as the tale is spun and respun over the years.
Re:Is it really THAT hard? by ShooterNeo · 2015-08-24 07:38 · Score: 1

I think you're basically right. I read about an even simpler system. They wanted to prove if a robotic surgical arm, controlled by a multi-axis joystick, would always no matter what move only when the surgeon commanded it.
So basically, you read the joystick position sensor for an axis. You multiply by a coefficient, usually less than 1. You send that number to the arm controller. Arm controller tries to move to that position.
Smooth, linear, no discontinuities...you technically only need to check about 3 states (the edges and the middle) for a linear system without discontinuities. If it works, you're done. (for each axis so if it's 3 axis that's 27 states to check)
Somehow their system had bugs in it that they found with formal verification. Not sure how...
Re:Is it really THAT hard? by Anonymous Coward · 2015-08-24 08:02 · Score: 0

The MIT paper specifically is not about bad hardware. N
Re:Is it really THAT hard? by operagost · 2015-08-24 08:15 · Score: 1

The disk buffer is indeed a write cache, and modern disk driver software uses the ATA FLUSH CACHE or SAS/SCSI FUA commands to enforce physical writes.

--

Gamingmuseum.com: Give your 3D accelerator a rest.
Re:Is it really THAT hard? by Anonymous Coward · 2015-08-24 08:26 · Score: 0

...If you think anyone can paper over the scenario you just outlined, then what about this - what about a disk drive that NEVER WRITES ANYTHING but lies and says everything is going hunky dory? Pretty damn sure there's nothing you can do in that scenario....
You mean like those fake Chinese USB flashdrives? Example: https://www.youtube.com/watch?v=HaHokQ5EWv0
Pretty sure any filesystem is completely hosed when the underlying hardware lies.
Re:Is it really THAT hard? by Guspaz · 2015-08-24 09:56 · Score: 1

Then the MIT paper ignores reality, and is therefore useless.
You can prove your system is technically correct all you want, but if it doesn't work in the real world (where most hardware lies about write commits), then it's going to work in theory and fail in practice.
Re:Is it really THAT hard? by Anonymous Coward · 2015-08-24 17:51 · Score: 0

Easier to just never lose power. Ever.
Spend 200 bucks on a battery backed ups and a deep cycle marine battery.
Re:Is it really THAT hard? by ledow · 2015-08-24 19:19 · Score: 1

What on earth makes you think that any algorithm, proof or technique can account for hardware failure of any kind? That's what RAID, etc. are for and are still far from a guarantee.
Plus, kind of the point of a checksum is to ensure the integrity (to a certain probability) of data. If either the checksum or data change, they will no longer match up - short of a billions-to-one random chance that you can't do anything about anyway. Incorporate the flag into the data that you checksum and that's covered.
You cannot mathematically prove that any single bit will ever be written to disk whatsoever. All you can do is prove that you won't lose bits that you've confirmed as written **unless there is hardware failure**. That's it. Hell, we can't even prove that we're talking to a real device rather than an emulated one that just discards random bits.

Not exactly new or news... by bradgoodman · 2015-08-24 01:15 · Score: 2

No specifics. This has been done a million times with journaling filesystem (and block layers) - no idea why this is better or different. But what about disk failure? But what about data loss? But what about (undetected) data corruption (at the disk)? What about unexpected power hits that could drop a disk or tear a write? Not even going to get into snapshotting, disaster recovery, etc. There's a lot more to this than "surviving a crash".

Re:Not exactly new or news... by Anonymous Coward · 2015-08-24 01:31 · Score: 0

I'm wondering if their proof requires the disk to actually write the data in the same order as the filesystem sends write commands to the disk. This is not necessarily true for disks that support command queuing, and certainly not true for SSDs.
Re:Not exactly new or news... by FranTaylor · 2015-08-24 01:35 · Score: 1

But what about disk failure?

this is like expecting your seat belt to keep you safe during the apocalypse
Re:Not exactly new or news... by FranTaylor · 2015-08-24 01:44 · Score: 1

I'm wondering if their proof requires the disk to actually write the data in the same order as the filesystem sends write commands to the disk. This is not necessarily true for disks that support command queuing, and certainly not true for SSDs.
RTFA:
"You literally have to consider every instruction or every disk operation"
Re:Not exactly new or news... by bradgoodman · 2015-08-24 03:51 · Score: 1

No. It's not. You think [your favorite bank] puts all their financial data on a plain 'old off-the-shelf [Insert brand here] and assumes that it'll all be good? They use multi-million dollar systems which do mirroring, integrity checking, verification, etc. "High-end" storage and filesystems systems do things like verification and checking at multiple levels (end-to-end, drive, block, filesystem, array, etc) so a $100 disk drive doesn't corrupt data and take down a $100 billion dollar bank. As for the apocalypse scenario - yep - they need to account for that too. That's what "Disaster Recovery", snapshotting and [long distance] asynchronous replication are all about. Reality speaking - an errant nuke or natural disaster can't take down a $100b bank - it can't even loose track of a single [large] transaction.
Re:Not exactly new or news... by Anonymous Coward · 2015-08-24 04:09 · Score: 0

This has been done a million times
Uh, can you provide ONE other machine verifiable proof of the correctness a filesystem implementation? (Hint: no, you can't.) The hard part is writing the proof, not the code. Try writing some non-trivial proofs in Coq, then come back to this thread and apologize when you realize exactly how stupid you sound.
Re:Not exactly new or news... by Bengie · 2015-08-24 05:09 · Score: 1

You need to read in-between the lines.

"MIT's New File System Won't Lose Data During Crashes" can be read as "MIT's New File System Won't be at fault for lost data once committed during any interruption of writes"

ZFS does the same thing, minus the proofs. If you do a sync write and ZFS says it completed, then that data is not going to be lost due to any fault of ZFS. But what if someone threw all of your harddrives into lava? Again, not the fault of ZFS. Same idea.

Rule of thumb, if your FS needs FSCK, it can probably lose data given the right kind of interruption.
Re:Not exactly new or news... by bradgoodman · 2015-08-24 05:17 · Score: 1

That's not the point. Having a machine verifiable proof of a correct filesystem which would protect itself in a "crash", is still subject to problems beneath it, or problems other than "crashes". There are other filesystems that can do this (without the "machine verifiable proof" - and they are vulnerable in real-world scenarios- not because they "lack the proof" - but because there are a zillion other weak links in the chain).
Re:Not exactly new or news... by Anonymous Coward · 2015-08-24 06:07 · Score: 0

No, you're still missing the point. This project is about writing provably correct software. Read a little bit of Chlipala's work. http://adam.chlipala.net/
A correct filesystem does not require infallible hardware. If hardware was infallible, writing correct filesystems would be easy!

Yeah rigth by Anonymous Coward · 2015-08-24 01:18 · Score: 0

I don't believe this one bit.

So when something crashes and corrupts a bit in the block of memory containing where on the disk to write the data to just at the wrong moment so this *correct* code then faithfully does what it is asked and overwrites the file index with a picture of a cat, nothing will be broken?

Re:Yeah rigth by wonkey_monkey · 2015-08-24 01:21 · Score: 1

I don't believe this one bit.
That's why we have checksums.

--
systemd is Roko's Basilisk.
Re:Yeah rigth by FranTaylor · 2015-08-24 01:33 · Score: 0

That's why we have checksums.
checksums will only detect a single bit error, you've got to do better than that
Re:Yeah rigth by wonkey_monkey · 2015-08-24 02:44 · Score: 1

checksums will only detect a single bit error
Nope.

--
systemd is Roko's Basilisk.
Re: Yeah rigth by BronsCon · 2015-08-24 04:14 · Score: 1

Some checkums can only be guaranteed to catch a single bit error but, in practice, it is relatively rare that multiple-bit errors "line up" just right to create a correct checksum. There are better checksums available, as well, which will catch multiple-bit errors, as well as identify which segment of the data contains the error. Someone more familiar with the topic will have to name these, as I haven't studied them in years.

--
APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
Re:Yeah rigth by WaffleMonster · 2015-08-24 06:24 · Score: 1

That's why we have checksums.
There is no guarantee checksums will detect failure there is only a probability.
Re:Yeah rigth by wonkey_monkey · 2015-08-24 07:23 · Score: 1

There is no guarantee that a Slashdot reader will detect a joke; there is only a probability.
Did anyone get it?

--
systemd is Roko's Basilisk.

Tough environments by unencode200x · 2015-08-24 01:24 · Score: 1

I've seen RAID groups fail sort of violently (granted in some tough environments) where one disk crashed and so did the others next two it. Three out of five disks in a RAID 5 gone. Only option was backup. How would any filesystem survive that?

--

Chance favors the prepared mind.
Perfect is the enemy of good.

Re:Tough environments by FranTaylor · 2015-08-24 01:32 · Score: 1

the stars will implode and the universe will come to an end, how would any filesystem survive that?
Re:Tough environments by unencode200x · 2015-08-24 01:43 · Score: 1

Exactly! Seriously, I've seen this. Being in IT for almost 20 years you see edge cases, dumb things, and just plain bad luck.

--

Chance favors the prepared mind.
Perfect is the enemy of good.
Re:Tough environments by TheDarkMaster · 2015-08-24 01:56 · Score: 1

I've seen RAID groups fail sort of violently (granted in some tough environments)

Missile hit? :-D

--
Religion: The greatest weapon of mass destruction of all time
Re:Tough environments by Anonymous Coward · 2015-08-24 02:16 · Score: 0

I read the summary... You should try it.
It's not about surviving a hardware fault, it's about guaranteeing a consistent file system all every point in time, so no problems will occur in case of a power loss or hot unplug without unmounting the filesystem first.
The very same thing that journalling filesystems like ext3/4 have done for years, but with formal proof (something you usually only see at places like NASA).
Re:Tough environments by Guspaz · 2015-08-24 02:39 · Score: 1

You joke, but I've seen IOPS drop in a RAID array because somebody was talking loudly next to the server. It was kind of fun to shout at the server and watch the disk activity drop. For testing purposes, of course.
Re:Tough environments by fuzzyfuzzyfungus · 2015-08-24 02:52 · Score: 1

It sounds like a quip; but it's the truth: more redundancy. Nothing is going to allow a system(whether it is all handled by the filesystem or the work is divided between a RAID controller and a filesystem) to recover a chunk of data that the universe violently removes except another copy of that data or something from which it can be deterministically inferred(even if you accept the 'random number generator and a lot of patience' mode of data recovery, you still need to know when you've recovered the correct data).

If you have a full proof of the behavior of the system, you can at least know what sorts of failures, and where, will cause loss, and of what; but any real-world implementation is going to ultimately rest on a judgement about acceptable risk vs. acceptable cost of redundant hardware. Had the system in your post been running 5 disks in RAID1, it would have been totally fine with the loss of up to 4 disks; but not terribly efficient. Elegant math can make detecting errors more likely, allow for redundancy that is more efficient than naive copying, provide insight into what steps in the process are points of failure, how severe they are, and so on; but unless you have an ideal computer to run it on, it will be a matter of some degree of chance, though ideally a well understood degree knowingly chosen.
Re:Tough environments by mlts · 2015-08-24 03:35 · Score: 3, Interesting

I personally encountered a drive array driver cause an entire array to get overwritten by garbage. I was quite glad that I had tape backups of the computers and the shared array, so a recovery was fairly easy (especially with IBM sysback.)
Filesystems are one piece of a puzzle, but an important one. If that array decided to just write some garbage in a few sectors, almost no filesystem would notice that, allowing propagating of corrupted data to backups. The only two that might notice it would be a background ZFS task doing a scrub and noticing a 64 bit checksum is off, or ReFS doing something similar. Without RAID-Z2, the damage can't be repaired... but it can be found and relevant people notified.
Re:Tough environments by Bengie · 2015-08-24 05:11 · Score: 1

Their term of "Crash" is different than yours. You assume "crash" means the Universe self-destructed. They just assume the writes were interrupted, like power failure or your kernel locked up, not your harddrives dying.
Re:Tough environments by unencode200x · 2015-08-24 05:23 · Score: 1

Nothing quite that interesting. One was in a school. Their server room was near the gym. The other was in a factory. The server room was in the middle of the shop.

--

Chance favors the prepared mind.
Perfect is the enemy of good.
Re:Tough environments by 0dugo0 · 2015-08-24 06:49 · Score: 1

Ever had to deal with power losses and noticed how often power goes down a second time before really coming back up? Ever noticed how bad (journalling) filesystems deal with power failures during repair?
Re:Tough environments by Zak3056 · 2015-08-24 08:30 · Score: 1

I've seen RAID groups fail sort of violently (granted in some tough environments) where one disk crashed and so did the others next two it. Three out of five disks in a RAID 5 gone. Only option was backup. How would any filesystem survive that?
It is not the responsibility of the file system to maintain data integrity in the face of catastrophic failure of the underlying storage hardware.

--
What part of "shall not be infringed" is so hard to understand?
Re:Tough environments by Anonymous Coward · 2015-08-24 10:56 · Score: 0

Shouting in the Datacenter:
https://www.youtube.com/watch?v=tDacjrSCeq4
Re:Tough environments by Guspaz · 2015-08-24 14:08 · Score: 1

Pretty much that. Ours were RAID arrays on Linux, and we induced it without being quite as close as he was, but pretty much the same deal. This was more than ten years ago, so maybe modern disks would handle it better, but it looks like a few years later that guy in your video was still seeing the same behaviour.
Re:Tough environments by david_thornley · 2015-08-25 09:35 · Score: 1

My wife once worked on a system that was in the next room from the skate sharpener. She'd never seen flames coming out of a hard disk before.

--
"When you have eliminated the unacceptable, whatever is left, however improbable, must be the truthiness" - Holmes
Re:Tough environments by unencode200x · 2015-08-26 02:57 · Score: 1

Wow, that would be scary.

I remember one time about 10 years ago we got a handful of new HP servers in and were going through the burn-in process. Quite literally, apparently, as one of them had a RAID controller who's capacitors exploded quite violently setting off fire alarms and making us run for fire extinguishers when when we fired it up. (pun intended..).

--

Chance favors the prepared mind.
Perfect is the enemy of good.
Re:Tough environments by unencode200x · 2015-08-26 02:59 · Score: 1

Agreed. The article does talk about catastrophic hardware failures too. Just thought it seemed a bit misleading and scant on details is all.

--

Chance favors the prepared mind.
Perfect is the enemy of good.

Because by Anonymous Coward · 2015-08-24 01:24 · Score: 0

Because proofs don't have bugs. Riiiight.

Two General's Problem by Anonymous Coward · 2015-08-24 01:25 · Score: 1

I think any file system could be imagined as a simple case of a database system. You "commit" a file change and you must be sure that the change is written to disk before proceeding.

So, any database system has a well know logical limitation named "Two General's Problem"

https://en.wikipedia.org/wiki/Two_Generals%27_Problem

The implication of this is that in a database system you cannot guarantee a fully automated recovery; always there is a remaining possibility that some changes should be rolled back or commited by hand.
So what the database manager does is to minimize this lost work; what do you want to redo by hand: 1 week, 1 day, 1 minute, 1 second of work?

Re:Two General's Problem by Guspaz · 2015-08-24 02:40 · Score: 1

You can never be sure that a write is committed to disk, because most hardware lies about that.
Re:Two General's Problem by fnj · 2015-08-24 07:24 · Score: 1

You can never be sure that a write is committed to disk, because most hardware lies about that.
I suppose you can find authoritative references for that claim, complete with manufacturer names and drive model numbers?

Code not available, will it ever be? by badger.foo · 2015-08-24 01:31 · Score: 2

It's now August, the conference where they'll be presenting their work is in October, and the article is a tad short on specifics. They've done a formally verified formal verification of a filesystem. if it works, that's excellent news of course, but I'd wait until we have seen the thing work and with actual code to examine before making any comments or bets on how useful this is going to be. And this being an open source-oriented site, we should be asking whether the code will indeed be available under any kind of usable open source license.

--
-- That grumpy BSD guy - http://bsdly.blogspot.com/

Re:Code not available, will it ever be? by Anonymous Coward · 2015-08-24 01:40 · Score: 0

I'm gonna go out on a limb and say we'll see it released under the MIT license.

MIT researchers live in ivory towers by alexhs · 2015-08-24 01:42 · Score: 4, Insightful

Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald Knuth

If the hard drive firmware is not proven, this FS won't be any better than ZFS and others.
Writing safe file systems is the easy part (even trivial using synchronous writes, when you consider their design is "slow").
The impossible part is dealing with firmwares that are known to lie (including pretending than a write is synchronous): how could you not lose data if the drive didn't ever write the data to the platters in the first place ?

--
I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.

Re:MIT researchers live in ivory towers by Anonymous Coward · 2015-08-24 04:24 · Score: 0

On top of that is hardware bugs. I dealt one set of flash where if the flash was 'young' it would always work. But when old and you yanked the power at the right time it would half commit a page. Usually the page which told it where everything else was.
Re:MIT researchers live in ivory towers by KingMotley · 2015-08-24 09:48 · Score: 1

Not impossible. Tell it to write to disk, wait for it to say it has. Then cut power to the drive, wait 30 seconds, reestablish power, then ask for the data back. If it isn't the same, repeat until it is. It'll be slow, and likely kill your drives, but you can be reasonably sure the drive did indeed write the data.

Real-time by tepples · 2015-08-24 01:45 · Score: 1

what does this phrase even mean? why does speed or slowness matter?

Speed means the ability to finish reading and writing all data associated with a job before the job's soft real-time deadline has expired.

Re:Real-time by FranTaylor · 2015-08-24 01:49 · Score: 0

before the job's soft real-time deadline has expired.
what does this have to do with anything?
Re:Real-time by tepples · 2015-08-24 02:02 · Score: 1

If reading or writing files in a particular file system is slow enough that it makes applications painful to use, the file system won't pass into widespread use.
Re:Real-time by Darinbob · 2015-08-24 15:00 · Score: 1

And yet the FAT16 file system was popular for a very long time.

Proving Coq itself by tepples · 2015-08-24 01:48 · Score: 0

formal proof using Coq

How do you prove Coq is correct, that it doesn't have a Ken Thompson bug (see "Reflections on Trusting Trust") causing a false positive on proof of Coq's own correctness? Is there an independently written implementation of Coq's proof language suitable for David A. Wheeler's "diverse double-compiling" method? And how do you prove that the hardware on which Coq is run doesn't have a flaw that affects Coq's correctness?

Re:Proving Coq itself by TheRaven64 · 2015-08-24 02:37 · Score: 4, Interesting

If you actually want an answer, I suggest that you go and read the Coq papers. You might learn something.

--
I am TheRaven on Soylent News
Re:Proving Coq itself by Anonymous Coward · 2015-08-24 13:57 · Score: 0

> How do you prove Coq is correct
How do you prove you're really a person and not just a brain in a vat, being fed synthetic sensory data?
Re:Proving Coq itself by arglebargle_xiv · 2015-08-24 16:34 · Score: 1

How do you prove Coq is correct
You use the formal proof assistant Pusi to prove it correct. So what you do is feed Coq into Pusi, and then nine months later the proof drops out.

ZFS? Hello?!? by Anonymous Coward · 2015-08-24 01:49 · Score: 1

I guess that in the MIT village, the villagers have not yet discovered that something like that exists and is being used in production for the past ten years, and is known as "ZFS".

Good job guys... Glad to know why you are counted among the elite computer science institutions... *FACEPALM*

Re:ZFS? Hello?!? by Anonymous Coward · 2015-08-24 04:12 · Score: 0

Where's your proof that ZFS is crash tolerant? The proof is the hard part, not the code.

chmod 751 some_directory by tepples · 2015-08-24 01:57 · Score: 4, Informative

The feature I miss the most is allowing traversal through a directory hierarchy a user has no explicit permissions for to get to a folder they do have permissions for. I find the workarounds for this in other filesystems to be extremely ugly.

In POSIX, that's represented in a directory's mode bits with octal digit 1 (4: list files, 2: create or delete files, 1: traverse). What do you find ugly about mode 751 (owner create or delete, owner and group list, world traverse)?

I also kind of wish TCP/IP had used the network:mac numbering scheme that IPX used.

It does now. An IPv6 address is divided into a 48- or 56-bit network, a 16- or 8-bit subnet, and a 64-bit machine identifier commonly derived from the MAC.

Re:chmod 751 some_directory by Anonymous Coward · 2015-08-24 04:21 · Score: 0

In POSIX, that's represented in a directory's mode bits with octal digit 1 (4: list files, 2: create or delete files, 1: traverse). What do you find ugly about mode 751 (owner create or delete, owner and group list, world traverse)?
It is 'ugly' because it is using different bits for many items and depends on who you are and which posix call you are using. You even said it yourself, create OR delete files. What if i want to let people only create but not delete? I could see something like that in a an app creates a log file. But then the user is not allowed to remove the log file.
The netware file permissions (and by extension SMB) is amazingly robust. Where permissions are explicit and binary. In many ways the posix one is kinda simplistic. It works mostly in most cases. But if you want to do something odd where you are stuck with an 'or' state you are kinda hosed. It gets really odd when you try to map between systems.
It also lets you setup groupings where you can omit someone. Where as posix could perhaps allow someone you do not want in just by means of 'highest permission' and I belong to the right group.
I find it actually one of the few shortcomings of most linux/bsd filesystems. It is simple. But that does come at a interesting cost of lack of flexibility.
Re:chmod 751 some_directory by Anonymous Coward · 2015-08-24 04:45 · Score: 0

All common Linux FS (EXT2/3/4, XFS, JFS, BRTFS, etc) and probably all common BSD/OS X (though I'm not sure) support more fine-grained access control in the form of POSIX Access Control Lists (ACL). I'm not 100% sure if it supports all of your use cases, but it certainly supports the 'grant access to every member of this group except A, B and C', since the permission model is essentially 'most specific match wins'.
Here's the relevant end-user ACL documentation for FreeBSD
Re:chmod 751 some_directory by Anonymous Coward · 2015-08-24 07:38 · Score: 0

What if i want to let people only create but not delete?
I know this has nothing to do with you other points, but realistically what would be the point of that?
Unless you have some way to say append, but nothing else it would be entirely pointless.
The user could delete the contents of the log file and achieve the same effect as deleting it.
Re:chmod 751 some_directory by Actually,+I+do+RTFA · 2015-08-24 09:22 · Score: 1

Well, for example, I can think of a situation for: create, NOT delete, NOT modifiy, NOT read. If there is a shared area where people are putting resumes, or other submissions. You don't want htem to affect or compete with one already there. Nor read the others.
I can see a log file you can append-only to.
There's a lot of interesting cases if the permissions are cut fine enough.

--
Your ad here. Ask me how!

Now get rid of "save file" at user level. by Impy+the+Impiuos+Imp · 2015-08-24 02:07 · Score: 1

Stupidly written programs will still have problems, even if there is no corruption at the filesystem level -- a partially written save state, written in two parts with file closure (or to two separate files, more likely), will be corrupt from the program's point of view, if the computer resets between the two writes.

This also suggests a need for a transaction-like system at the OS level -- you open and write a ton of data, and then the crash occurs when almost done. The OS must have been writing to a temp file rather than the "real" one. Unless that is too high level in this context -- the article has no technical details at all.

--
(-1: Post disagrees with my already-settled worldview) is not a valid mod option.

This is standard for mission-critical S/W by sirwired · 2015-08-24 02:30 · Score: 1

The idea of evaluating each step of the program for "what happens if this fails" is standard software engineering technique for mission-critical software. (That't not to say it's always actually done, just that it is the standard.) This method is hardly revolutionary (or even evolutionary.)

Re:This is standard for mission-critical S/W by FranTaylor · 2015-08-24 02:38 · Score: 1

slashdot has truly degenerated into a cesspool when you consider that this drivel is one of the more insightful posts
Re:This is standard for mission-critical S/W by Anonymous Coward · 2015-08-24 04:46 · Score: 0

(That't not to say it's always actually done, just that it is the standard.)
Could you please a) name the standard, and b) indicate for which systems exactly it is intended, and c) according to what regulation or "typical" contracts it is mandatory to be applied?
Otherwise just being in some "standard", and "not being applied" isn't claiming much, except that some standard may not be applicable to everything.
Re:This is standard for mission-critical S/W by sirwired · 2015-08-26 01:47 · Score: 1

The standard doesn't have a name; it's not some useless crap in an ISO catalog, it's just a way truly mission-critical software is built.
I know that NASA has used it for the flight-control computers for the manned space-flight program, and IBM uses it for some of the more vital parts of their mainframe operating systems.
I didn't say it was applicable to everything, just that as a software-engineering technique, it's not at all new.

What's their point? by allo · 2015-08-24 02:48 · Score: 1

Any journalling FS should provide a consistent state, checksumming FS like BTRFS or ZFS even provable.
So they won't lose data, which is written and in a consistent state, and unwritten data cannot be saved by definition.

Yeah, right. by dargaud · 2015-08-24 02:58 · Score: 4, Interesting

In the words of Knuth the Great: "Beware of bugs in the above code; I have only proved it correct, not tried it."

It reminds me of a story from the late 80s (?) at a tech conference. The makers of a real-time OS with real-time snapshots would periodically pull the plug on their systems, plug it back in and it would resume exactly what it was doing, to the delight and amazement of all the techies in the assistance. In the much larger and much more expensive booth in front of them was a richer vendor. The techies started coaxing them to do the same. After much hand wringing, they did, and after a very long rebuild time the system came back as a mess. Conclusion: the 1st vendor went out of business, the 2nd one is still very big.

--
Non-Linux Penguins ?

Re:Yeah, right. by Anonymous Coward · 2015-08-24 07:24 · Score: 0

Huh?? Are you saying that the better solution didn't win out, or are you trying to make some other point?
Re:Yeah, right. by LWATCDR · 2015-08-24 07:45 · Score: 1

That is often true.
x86 vs the 68k, MS-DOS vs Amiga, MS-DOS vs GEMTOS. and so on.
marketing often beats better tech.

--
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.

If it was easy wouldn't it already be done? by sjbe · 2015-08-24 03:39 · Score: 2

Why am I hearing Jeremy Clarkson asking "how hard can it be?" just before utterly screwing something up?

Perhaps there is just a tad more difficulty to it than you are considering?

Re:If it was easy wouldn't it already be done? by operagost · 2015-08-24 08:24 · Score: 1

Mental image of Clarkson rolling that Reliant Robin... OHHH THERE GOES MY DATA!

--

Gamingmuseum.com: Give your 3D accelerator a rest.
Re:If it was easy wouldn't it already be done? by Darinbob · 2015-08-24 15:08 · Score: 1

I can never tell if Clarkson is a smart person pretending to be stupid, or a stupid person pretending to be smart. Because a stupid person could never pretend to be that stupid.

Formal proofs of software are useless by jgotts · 2015-08-24 03:40 · Score: 1

Hi, MIT guys, formal proofs of filesystems are useless because you cannot incorporate physical systems into formal proofs. Real filesystems exist on real hardware.

I guarantee that your file system will fail if I start ripping cables out. A suitably strong EMP will take it out. In fact, I bet I could nuke your filesystem if I used my ham radio transceiver too close to the device. Other things that would destroy your filesystem include floods, earthquakes, and a lightning strike.

I began writing this by stating that formal proofs of software are useless, but I don't really believe that to be true. I strongly believe that we should strive for software correctness. Any techniques that can we use to make software better are worth pursuing.

But it has to be remembered that software cannot be isolated. When we do develop a true AI, it will escape and destroy us, probably within milliseconds of an unexpected hardware event. No matter how rigorously the beast is programmed!

Re:Formal proofs of software are useless by Anonymous Coward · 2015-08-24 04:17 · Score: 0

It's a proof that the filesystem code performs correctly, not a proof that hard drives never fail.
Re:Formal proofs of software are useless by Anonymous Coward · 2015-08-24 12:38 · Score: 0

When we do develop a true AI, it will escape and destroy us, probably within milliseconds of an unexpected hardware event.
True AI is probably here because "true" could just be marketing. Probably you meant "hard" AI or "strong" AI, one that is sentient, conscious, somehow solves the hard problem of consciousness, and other difficulties standing in the way, like how biological consciousness works in order to model it. Sort of also has to be installed into self-mobile and environment sensing units, otherwise the mind is aways locked in its machine. I think that artificial mind in the first place, though a worthy goal for the merits of the discovery, is impossible to achieve without some biohybrid brain material, philosophically and mathematically impossible otherwise. No amount of clever code and lightspeed memory and processors will ever amount to what a jellyfish mind experiences let alone any higher life-form. But I don't want to discourage you or anyone. AI is very very very very cool. Really, it varies.

Matmatical proof MIT is a dunce factory by Anonymous Coward · 2015-08-24 06:16 · Score: 0

"Well, what if I crash now? What now? What now" ... Lookup the answer on Wikipedia?

https://en.wikipedia.org/wiki/...

So.... by Anonymous Coward · 2015-08-24 06:47 · Score: 0

Can it write to a full LUN and never lose data?

GPL Reliable File System For Embedded by Anonymous Coward · 2015-08-24 06:57 · Score: 0

Datalight Releases Open Source Version of Reliance Edge, a Tiny, Power Fail-safe File System for FreeRTOS

http://www.datalight.com/news/datalight-releases-open-source-version-of-reliance-edge,-a-tiny,-power-fail-safe-file-system-for-freertos

Reliance Edge Fail-Safe File System-A Transactional Embedded File System for FreeRTOS

http://www.freertos.org/FreeRTOS-Plus/Fail_Safe_File_System/Reliance_Edge_Fail_Safe_File_System.shtml

Key Logic's KeyKOS by Sits · 2015-08-24 07:42 · Score: 2

I've just spent the past few minutes trying to (re)find this story. Is it the KeyKOS recovery speed story from 1990?

Re:Key Logic's KeyKOS by dargaud · 2015-08-24 09:05 · Score: 1

Yes, that's the story, thank you.

--
Non-Linux Penguins ?

ZFS has had this since 2005 by Anonymous Coward · 2015-08-24 07:51 · Score: 0

Nice try though presenting this as "new".

AI by Anonymous Coward · 2015-08-24 11:27 · Score: 0

That's why we need to make sure that dozens of AIs come online at the same time: They'll be forced to become good citizens with each other, which has a chance of them including us in their concept of civilization.

Formal Proof? by Anonymous Coward · 2015-08-24 11:39 · Score: 0

I suspect that an architecture can be formally proved. But a specific implementation? No.

It is axiomatic that any non-trivial software system has bugs. Of what use is an architectural proof in the face of implementation bugs? This is the difference between design and implementation. They may have all the best intentions in their design but the implementation can sink the whole ship.

Slow and safe, beats fast and unsafe! by Anonymous Coward · 2015-08-25 05:09 · Score: 0

I would take slow filesystem any day, if it can guarantee my data never gets corrupt by bit rot, current spikes, etc and guarantee recovery of crashed disks, etc. A filesystem that guarantees data integrity is very very hard to make. Researchers have found flaws in all filesystems such as XFS, ext4, NTFS, etc - except for ZFS. That is the only filesystem that has passed the data integrity research by at least two research papers. No flaws in ZFS. No other filesystem has done that.

If the filesystem is fast, it is unsafe. For instance, you can do a chkdsk of a large XFS raid array, maybe 10 TB in just a couple of minutes. The ZFS scrub would take many hours on the same size array. The difference? XFS is fast because fsck only checks the metadata, the data is never checked. The data might be corrupt after a successful fsck. ZFS scrub takes many hours because it traverses and checks everything; data and metadata and what not. When ZFS scrub is successful, you know your data is safe. Now choose, fsck in a few minutes, or scrub in many hours.

It reminds me of a Linux driver for NTFS which was very fast, much faster than NTFS. It cut corners so it did not do safety checks.

He is smart but plays dumb by sjbe · 2015-08-25 23:28 · Score: 1

I can never tell if Clarkson is a smart person pretending to be stupid, or a stupid person pretending to be smart.

I've heard from numerous movie directors that people who play dumb characters actually have to be quite smart. Clarkson isn't stupid (generally) and remember that Top Gear is/was a scripted show and people watch it primarily because it is funny. I think I remember Conan OBrian saying that a lot of comedy is throwing out your dignity and hoping you'll get it back.

Hardware Reliability by herbierobinson · 2015-08-30 05:08 · Score: 1

As others have pointed out, the formal verification does make the software provably reliable, but does nothing to protect against hardware issues. Just as a datapoint, the Stratus VOS operating system has been checksumming at the block driver level since the OS was written in 1980. It has detected failures on every generation of hardware it has been used with since. Some of the failures we have seen: Undetected transient media errors (the error correction/checking isn't perfect); Flaky I/O busses; bugs in RAID firmware.

These real world failures will swamp the potential failures caused by the software not being verified.

--
An engineer who ran for Congress. http://herbrobinson.us

Slashdot Mirror

MIT's New File System Won't Lose Data During Crashes

168 comments