Developer Shares A Recoverable Container Format That's File System Agnostic (github.com)

Posted by EditorDavid on Saturday April 29, 2017 @10:40AM from the building-with-blocks dept.

Long-time Slashdot reader MarcoPon writes: I created a thing: SeqBox. It's an archive/container format (and corresponding suite of tools) with some interesting and unique features. Basically an SBX file is composed of a series of sector-sized blocks, each with a small header containing a recognizable signature, an integrity check, info about the file it belongs to, and a sequence number. The result of this encoding is the ability to recover an SBX container even if the file system is corrupted, completely lost or just unknown, no matter how fragmented the file is.
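
To make the block layout described above concrete, here is a minimal Python sketch of how such blocks could be built. The sizes (512-byte blocks, a 16-byte header, a 48-bit UID, a 32-bit sequence number, a 2-byte CRC, 0x1A padding) are taken from the comments below; the field order, the `SBx` signature bytes, the version byte and the CRC start value are assumptions made for illustration, and the metadata block is left out.

```python
import binascii
import struct

BLOCK_SIZE = 512                      # default block size mentioned in the thread
HEADER_SIZE = 16                      # per-block header size mentioned in the thread
PAYLOAD_SIZE = BLOCK_SIZE - HEADER_SIZE
SIGNATURE = b"SBx"                    # assumed 3-byte signature
VERSION = 1                           # assumed 1-byte version field
PAD_BYTE = b"\x1a"                    # padding byte mentioned in the thread


def make_block(uid: bytes, seq: int, payload: bytes) -> bytes:
    """Build one fixed-size block: signature, version, CRC, UID, sequence, data."""
    if len(uid) != 6:
        raise ValueError("uid must be 6 bytes (48 bits)")
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload too large for one block")
    # Everything after the CRC field: UID, big-endian sequence number, padded data.
    body = uid + struct.pack(">I", seq) + payload.ljust(PAYLOAD_SIZE, PAD_BYTE)
    crc = binascii.crc_hqx(body, 0)   # 16-bit CRC-CCITT; start value 0 is an assumption
    return SIGNATURE + bytes([VERSION]) + struct.pack(">H", crc) + body


def encode(data: bytes, uid: bytes) -> bytes:
    """Split data into payload-sized chunks and wrap each one in a block."""
    blocks = []
    for seq, start in enumerate(range(0, len(data), PAYLOAD_SIZE), start=1):
        blocks.append(make_block(uid, seq, data[start:start + PAYLOAD_SIZE]))
    return b"".join(blocks)
```

Because every block carries its own UID and sequence number, the order in which blocks end up on the medium does not matter for later reassembly.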

Re:Not to seems like a philistine... by MarcoPon · 2017-04-29 11:00 · Score: 2

It's a bit of a different thing. Think about a digital camera that could save to an SD card both a JPEG and a JPEG in an SBX container. If the SD card's file system gets corrupted (maybe the batteries gave up just while writing), your chances of getting back the JPEGs are so-so (it depends on whether and how much they are fragmented), but you could surely recover the SBX files.

--

SeqBox
Re:why? by MarcoPon · 2017-04-29 11:10 · Score: 4, Interesting

That's an interesting property, but what's the use case?
I can't say I know them all, or even the best/killer ones, but I listed some on the readme. Probably the most immediate/interesting application would be on a digital camera, for photos/video.

Can apps read files inside an sbx container?
Yes. The blocks are of a fixed size, so the format is seekable and reading from it is far simpler than, say, reading from a ZIP file.

--

SeqBox
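
The "seekable" point can be sketched in a few lines of Python: with fixed-size blocks, the position of any byte of the original file follows from simple arithmetic. This is a sketch under assumptions, not the SeqBox tools' code: it assumes an intact container with 512-byte blocks, a 16-byte header per block, one metadata block at position 0, and data blocks stored in order after it. The function names are hypothetical.

```python
BLOCK_SIZE = 512
HEADER_SIZE = 16
PAYLOAD_SIZE = BLOCK_SIZE - HEADER_SIZE   # 496 data bytes per block


def container_position(file_offset: int) -> int:
    """Return the byte position inside the container that holds file_offset."""
    data_block = file_offset // PAYLOAD_SIZE      # zero-based data block index
    within = file_offset % PAYLOAD_SIZE           # offset inside that block's payload
    block_number = data_block + 1                 # skip the assumed metadata block 0
    return block_number * BLOCK_SIZE + HEADER_SIZE + within


def read_at(container: bytes, file_offset: int, length: int) -> bytes:
    """Read length bytes of the original file starting at file_offset.

    The read is not clamped to the original file size; a real reader would
    take that from the metadata block so that trailing padding is not returned.
    """
    out = bytearray()
    while length > 0:
        pos = container_position(file_offset)
        room = PAYLOAD_SIZE - (file_offset % PAYLOAD_SIZE)   # bytes left in this block
        take = min(length, room)
        out += container[pos:pos + take]
        file_offset += take
        length -= take
    return bytes(out)
```

No directory or index needs to be parsed first, which is the contrast with ZIP drawn in the comment above.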
Re:Fail? by MarcoPon · 2017-04-29 11:17 · Score: 4, Informative

The default block size used is 512 bytes, which is a suitable sub-multiple of every sector size used by most systems after the CP/M days. One example of a system it doesn't play well with is the Amiga Old File System (which uses 488 bytes per block, IIRC). Actually it's the only FS/platform that I found not working among the ones I managed to test (a bit over 20; I listed them in the readme, just above the tech spec).

--

SeqBox
Re: If you can compact encrypted images... by KiloByte · 2017-04-29 11:28 · Score: 2

There's no way it can. LUKS is great but it wastes tons of disk space on vms.
It can! Just turn on discard (and have the system inside issue trim commands). This does have an impact on encryption, though, which might or might not be acceptable for you: it is possible to tell used from unused disk space, which leaks information about usage patterns inside the VM.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:What about files stored in MFT? by MarcoPon · 2017-04-29 11:29 · Score: 2

I think the limit for NTFS is something like 640 bytes. A 1 byte file encoded in SeqBox format would occupy at least 1 block for the data, plus 1 for the metadata (with attributes like file name, date, size, etc.), so 1024 bytes minimum.

--

SeqBox
Re:Encrypted signatures? by MarcoPon · 2017-04-29 11:54 · Score: 2

Yes. You could supply a password that is hashed, and then the hash is XOR-ed against each block, signature included. It's not really strong encryption by any means (that could be implemented in a later version; at the moment I just wanted to keep it simple), but it's probably obfuscated enough to avoid detection. Especially if one doesn't expect an SBX container to be there, and/or isn't prepared to go through a lot of data collection to find some blocks with the same first 4 bytes (different depending on the password used) sprayed around.

--

SeqBox
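
The password scheme described above (hash the password, XOR the hash over each block) could look roughly like the following Python sketch. The choice of SHA-256 and the way the hash is repeated across the block are assumptions; the comment does not say which hash is used. As the comment itself notes, this is obfuscation, not strong encryption.

```python
import hashlib
from itertools import cycle


def xor_with_password(block: bytes, password: str) -> bytes:
    """XOR a block with a repeating key derived from the password.

    Applying the function twice with the same password restores the block,
    because XOR is its own inverse.
    """
    key = hashlib.sha256(password.encode("utf-8")).digest()   # assumed hash function
    return bytes(b ^ k for b, k in zip(block, cycle(key)))
```

Since the same key bytes always line up with the start of every block, all blocks obfuscated with one password still share the same first 4 bytes, which is the residual pattern the comment mentions.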
Re:What about files stored in MFT? by JoeyRox · 2017-04-29 11:57 · Score: 2

The MFT limit is closer to 1K. But if your minimum size is 1K that should be fine.

Next question - for the encoding of the file, you're putting a 16-byte header in front of every blocksize-piece of data, correct? If that's the case, and if you're storing the entire block of original data after that pre-pended header, then how are you assuring that the spill-over piece of data will be on a contiguous block on the disk? For example, say you're encoding a single 4096 byte file using a 4K blocksize. The SBX-equivalent size would be 4,112 bytes; how are you assuring the final 16 bytes of data are on a contiguous disk block to the first 4K?
Re:why? by KiloByte · 2017-04-29 11:58 · Score: 3, Informative

So the only failure mode this protects from is corruption of metadata while every data block remains intact. On any sane filesystem, that sounds useless: the only cases this might happen are filesystems that can't handle unclean shutdown (FAT, ext2) or the disk lies about barriers. And those cameras that still use FAT have software you can't update, so you can't install that SBX thingy -- if you could, you'd be better off switching to a better filesystem.
In its present state, I'd suggest you scrap the whole project, it's a waste of time.
On the other hand, it would be an entirely different story if you added some form of erasure code that operates on amounts of data bigger than a single sector (most storage devices already have per-sector erasure codes).

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:What about files stored in MFT? by MarcoPon · 2017-04-29 12:04 · Score: 2

Assuming 4K blocks, a 4096-byte file would end up occupying 12288 bytes: 4096 for the metadata block, then 2 data blocks of 4096. The last one would contain just 16 useful bytes, and the rest would be just padding (with 0x1A bytes). Of course with very small files this isn't efficient, but it's not usually a problem. Overhead is just a bit over 3% with the default 512-byte blocks, and less than 1% if 4KB blocks are forced.

--

SeqBox
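
The size arithmetic in this exchange can be checked with a short Python sketch. It assumes what the comments state: a 16-byte header per block, one metadata block per container, and a last data block padded to full size. The function names are hypothetical.

```python
HEADER_SIZE = 16


def container_size(file_size: int, block_size: int = 512) -> int:
    """Size of the container: one metadata block plus the padded data blocks."""
    payload = block_size - HEADER_SIZE
    data_blocks = (file_size + payload - 1) // payload   # integer ceiling division
    return (1 + data_blocks) * block_size


def header_overhead(block_size: int = 512) -> float:
    """Fraction of each block spent on the header (large-file overhead)."""
    return HEADER_SIZE / block_size
```

For a 4096-byte file with 4K blocks this gives 3 x 4096 = 12288 bytes, and for a 1-byte file with the default blocks 2 x 512 = 1024 bytes. The per-block overhead is 16/512 = 3.125% at 512 bytes and 16/4096, about 0.39%, at 4K.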
Re: Not to seems like a philistine... by Anonymous Coward · 2017-04-29 12:25 · Score: 2, Interesting

Flash storage uses wear leveling, which can fail. If that happens, the flash chip can mostly still be read, but all erase blocks on the chip will be in a "random" order compared to the lost logical wear leveled position. Then you want a way to recover the logical order of the blocks, which this format allows you to do.
Re:Not to seems like a philistine... by slashrio · 2017-04-29 12:25 · Score: 2

What will be written first? If it's the SBX, then why wouldn't the battery give up while writing the SBX file? Your picture will be lost.

--
"Trump!!", the new Godwin.
This would be great for SSDs by Gravis+Zero · 2017-04-29 12:27 · Score: 4, Informative

Unlike HDD controllers, SSD controllers do wear-leveling, so there is no guarantee that your data will be written as a contiguous block of memory (regardless of what the filesystem says), only that it will be in 4096 byte blocks. Recovering deleted data from an SSD is no simple task because it means you need to know or guess the controller behavior for wear-leveling in order to go back and find the order of previously written data. With this you would be able to just read the raw memory even after the controller has been reset and still be able to recover the data. I think it would be a nice option to have a filesystem be able to encode user files in something like this highly recoverable format. The only real problem is that the file has to be completely rewritten even if you only modify part of it, in order to differentiate the new version from the old version.

--
Anons need not reply. Questions end with a question mark.
Re:Fail? by raftpeople · 2017-04-29 12:31 · Score: 2

AS400 has 520 byte sectors, 512 for user data and 8 bytes for system data.
Re:Not to seems like a philistine... by MarcoPon · 2017-04-29 12:39 · Score: 3, Informative

That one yes, but not the others already saved. A fragmented JPEG, instead, is pretty difficult to recover if the file system is inconsistent. The usual recovery tools would easily find the first fragment, and then proceed from there collecting sectors in sequence, which may or may not contain the right data.

--

SeqBox
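
By contrast, recovering self-describing blocks needs no guessing about what follows the first fragment. A minimal Python sketch of such a scan over a raw image is shown below. The header layout (3-byte `SBx` signature, version byte, 2-byte CRC, 6-byte UID, 4-byte big-endian sequence number) and the CRC start value are assumptions for illustration, and `scan_image` and `rebuild` are hypothetical names, not the actual SeqBox tools.

```python
import binascii
import struct
from collections import defaultdict

BLOCK_SIZE = 512
HEADER_SIZE = 16
SIGNATURE = b"SBx"            # assumed signature bytes


def scan_image(image: bytes) -> dict[bytes, dict[int, int]]:
    """Return {uid: {sequence_number: offset}} for every valid-looking block."""
    found = defaultdict(dict)
    for offset in range(0, len(image) - BLOCK_SIZE + 1, BLOCK_SIZE):
        block = image[offset:offset + BLOCK_SIZE]
        if block[:3] != SIGNATURE:
            continue
        stored_crc = struct.unpack(">H", block[4:6])[0]
        if binascii.crc_hqx(block[6:], 0) != stored_crc:
            continue          # signature matched by chance, or the block is damaged
        uid = block[6:12]
        seq = struct.unpack(">I", block[12:16])[0]
        found[uid].setdefault(seq, offset)   # keep the first copy of each block
    return dict(found)


def rebuild(image: bytes, positions: dict[int, int]) -> bytes:
    """Concatenate payloads in sequence order (trailing padding is not trimmed)."""
    return b"".join(
        image[positions[seq] + HEADER_SIZE:positions[seq] + BLOCK_SIZE]
        for seq in sorted(positions)
    )
```

The scan steps through the image at block-size granularity, so it works no matter how fragmented or reordered the blocks are, as long as they start on sector boundaries.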
What it does and why it's (partially) useful by Excelcia · 2017-04-29 12:49 · Score: 5, Insightful

There is some confusion as to what this is actually doing.
Most filesystems use special structures to store the name and location of your files on the drive: directories, cluster bitmaps, etc. The reason why it's difficult at best to recover files from a hard drive when parts of the filesystem have been damaged is that it's difficult to identify where on your hard drive the files are. Besides the special filesystem directories, nowhere else stores information on what is stored where. If you lose the directory it's hard to tell one file's data from another on your hard drive.
That is where SBX comes in. What it does is make sure that every physical sector that stores data for a particular file is labelled with a number that identifies that file, and a sequence number so you can reconstruct what order that piece is in the original file. Really, for the amount of overhead, something like that should be embedded into every filesystem. Basically a distributed backup of all the filesystem metadata.
Some people are criticizing this, saying that it solves non-problems. I disagree. While it isn't the solution to global warming, it is both simple and clever (and will thus suffer from a lot of people who will disparage it out of a "well anyone could have thought of that" attitude). It won't save you from a full hardware crash. It won't save you from physically bad sectors in that file. What it will save you from is accidental deletion and from loss of the filesystem's metadata structures. How often does this happen? Twice to me from failures of a whole-disc-encryption system driver.
I wouldn't use this for every file, but for critical ones, sure. Why not. The problem is, where it is most useful, for very volatile files that change a lot (databases etc) between backups, is where it can't really be used until/unless different applications start supporting it. So it unfortunately has limited use in the places where it would really help the most. Like I said above, this sort of thing really needs to get rolled into a filesystem. The amount of overhead it costs is meaningless in today's storage environment.
1. Re:What it does and why it's (partially) useful by MatthiasF · 2017-04-29 20:16 · Score: 2
  
  Don't ZFS, ReiserFS and Btrfs all already have something similar inherent in their file systems?
2. Re:What it does and why it's (partially) useful by thegarbz · 2017-04-29 21:41 · Score: 2
  
  It would seem like it but: a) this doesn't need to be applied at a filesystem level and b) it isn't encumbered by licensing issues, a dead project, or an experimental filesystem, in respective order.
  Okay, so it is actually experimental, but by not being filesystem-wide it is also much simpler and able to contain failures.
Re: If you can compact encrypted images... by Miamicanes · 2017-04-29 13:04 · Score: 5, Insightful

If you can meaningfully compact *anything* that's encrypted, the encryption was improperly implemented. You *always* want to compact files prior to encryption, and a well-encrypted compressed file should be statistically indistinguishable from random noise.
Re: Not to seems like a philistine... by Zontar+The+Mindless · 2017-04-29 13:12 · Score: 2

"A camera that writes to an SD card using a journaling filesystem?"
(To get the right effect, read aloud in the tone of voice one might use for saying, "A planet where apes evolved from men?")

--
Il n'y a pas de Planet B.
Re:Fail? by MarcoPon · 2017-04-29 13:18 · Score: 2

I see. It surely is a corner case, but an interesting one. My only experience with AS400 was seeing some at customer premises, and occasionally having to transfer some files from/to them, but nothing more. I remember an external IBM 5 1/4" floppy drive that was the length of an arm, and similar cost! :) Oh and BTW, SBXScan can scan and collect block positions from multiple images, even if it would be of no help in this situation. The idea was that if one keeps 2 or more copies of an SBX file on different media, and all copies become corrupted in some way (bit-rot, random blocks gone bad, something like that), it's still possible to collect all the good blocks from every available source, and perhaps still restore the complete container.

--

SeqBox
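
The idea of pooling good blocks from several damaged copies can be sketched in Python as follows. This is an illustration under assumptions, not how SBXScan is implemented: the header layout (3-byte `SBx` signature, version byte, 2-byte CRC, 6-byte UID, 4-byte big-endian sequence number), the CRC start value and the function names are all hypothetical.

```python
import binascii
import struct

BLOCK_SIZE = 512
SIGNATURE = b"SBx"            # assumed signature bytes


def is_valid(block: bytes) -> bool:
    """Check the signature and the 16-bit CRC stored in the block header."""
    if len(block) != BLOCK_SIZE or block[:3] != SIGNATURE:
        return False
    stored_crc = struct.unpack(">H", block[4:6])[0]
    return binascii.crc_hqx(block[6:], 0) == stored_crc


def merge_copies(copies: list[bytes], uid: bytes) -> dict[int, bytes]:
    """For each sequence number, keep the first intact block found in any copy."""
    good: dict[int, bytes] = {}
    for image in copies:
        for offset in range(0, len(image) - BLOCK_SIZE + 1, BLOCK_SIZE):
            block = image[offset:offset + BLOCK_SIZE]
            if not is_valid(block) or block[6:12] != uid:
                continue
            seq = struct.unpack(">I", block[12:16])[0]
            good.setdefault(seq, block)
    return good
```

A block that fails its CRC in one copy is simply skipped, so the container is complete as long as every sequence number survives intact in at least one source.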
Re:What about files stored in MFT? by MarcoPon · 2017-04-29 14:08 · Score: 2

I'd say for archival or read-only use. It's trivial to read from the file directly, and for some applications a simple plug-in may do the job (like there are audio players that can read audio files inside ZIP archives, for example).
About the separate file with hashes, instead, the main issue would be that if the file system is in an inconsistent / damaged state, that file too would be inaccessible. So it would need to be kept somewhere else, and that would complicate things a lot.

--

SeqBox
Re:why? by MarcoPon · 2017-04-29 14:20 · Score: 3, Informative

They are two entirely different things, but they could "help each other" in some ways.
PAR (or a RAR archive + recovery records, etc.) tries to address the problem of losing some small parts of a file (due for example to physical errors), using some amount of redundancy. SeqBox tries to address the issue of identifying and reassembling all parts of a file when they are all still on the physical media, but without the file system's indexes / structures to locate them (e.g. after a quick format, zero writes on the first sectors, etc.).
If you combine the two, creating an SBX container of a RAR + recovery records for example, you get both qualities.

--

SeqBox
Re:A couple of problems by MarcoPon · 2017-04-29 14:38 · Score: 2

You obviously tried to keep the per-block header small to minimize overhead. But that has caused questionable decisions that may make this format less useful than it could be.
It's surely a compromise, but I think it's pretty sensible for the present version (but some variations can surely be implemented as different versions, to better suit different scenarios).

Firstly, at 48 bits, the UID is a bit short. If UIDs are chosen randomly and with even distribution, there's a 1 in 1000 chance of a duplicate UID with just 750000 files.
That seems a bit off: 48 bits, assuming even distribution, would give 281,474,976,710,656 possible values. But again, 750,000 files would seem an enormous number for the practical uses I was thinking about at the moment.

Secondly, the block sequence number is a 32bit value, so 4 billion blocks in a file max. With this format, files are limited to 2TB.
Yes, 2TB with 512-byte blocks, or 16TB with 4K blocks. It's not good for everything but it's probably good for a lot of cases. But again, it can be easily upgraded if needed.

Thirdly, the 2-byte checksum is too small.
A 2-byte CRC seems to be plenty for 512 bytes, and even for 4KB it isn't bad. Here it's not about detecting tampering, but just about distinguishing a random block that just happens to have the right signature, with a specific UID & sequence number, from a real SBX block.
That said, again I agree that for applications that could envision a lot of files with far larger sizes, a different kind of header, with expanded fields (and some other things) would be better suited.

--

SeqBox
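
The two figures in the UID exchange measure different things: 281,474,976,710,656 is 2^48, the size of the UID space, while "1 in 1000 with 750,000 files" is a birthday-bound estimate of the chance that any two of those files share a UID. A short Python sketch of both that estimate and the size limits discussed above, assuming uniformly random UIDs and the field widths quoted in the comment (function names are hypothetical):

```python
import math


def collision_probability(n_files: int, uid_bits: int = 48) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** uid_bits
    return -math.expm1(-n_files * (n_files - 1) / (2 * space))


def max_container_bytes(block_size: int = 512, seq_bits: int = 32) -> int:
    """Upper bound on container size with a fixed-width block sequence number."""
    return (2 ** seq_bits) * block_size
```

With these assumptions, `collision_probability(750_000)` comes out very close to 0.001, and `max_container_bytes()` gives 2 TiB for 512-byte blocks and 16 TiB for 4K blocks (headers included, so the usable payload is slightly less).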
Beware padding oracle with compression& encryp by raymorris · 2017-04-29 16:17 · Score: 2

Compression before encryption often results in a padding oracle or other problems. If you're designing a system that is supposed to be secure, avoid compression until you fully understand the issues. Avoid compressing and encrypting chosen plaintext at all - you'll never be sure you understand all of the issues with that.
Re:why? by KiloByte · 2017-04-30 01:01 · Score: 2

A more interesting feature might be a firmware update to a spinning disk that cuts the drive capacity exactly in half. Basically a hard drive is going to have probably double the drive read heads as platters. Just store a copy of the data on a different head/platter surface, ideally, if possible, on a different platter. Add the firmware update to deal with the details. 1/2 the capacity, 1/2 the data rate, but some degree of added redundancy.
You're looking for btrfs -dDUP then. It does what you describe, and unlike any other filesystem except ZFS, the data is checksummed so it can be recovered even in case of a silent corruption (which happens way more often than people notice).
Obviously you want the same for metadata (-mDUP) which happens to be the default for rotational media. This pretty much renders the format described in the article pointless -- metadata corruption due to hardware failures that don't kill the entire disk is pretty much gone.
This doesn't quite work on typical flash media, though -- often the FTL will remap writes and put them into the same erase block, meaning you lost performance and capacity for no safety gain.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Re:why? by Immerman · 2017-04-30 01:43 · Score: 2

Indeed. Give it built in redundancy so that the data could be recovered reliably after almost any not-completely-terminal disk failure, and *then* you'd have something I'd be extremely interested in. Can't tell you how much archived data I've lost over the years due to "bit rot"
Yeah, I should have had it archived in three different locations, but who actually does that for personal data?

--
--- Most topics have many sides worth arguing, allow me to take one opposite you.
Re:why? by KiloByte · 2017-04-30 02:14 · Score: 2

Yeah, I should have had it archived in three different locations, but who actually does that for personal data?
From what I've seen, a typical intelligent person learns about the importance of backups after around 30 data loss events.

--
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.