Slashdot Mirror


User: MarcoPon

MarcoPon's activity in the archive.

Stories: 0
Comments: 115

Comments · 115

  1. Re:Also, that name, Seqbox on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    DOH! :)

  2. Re:Hash the block number with the password on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    Of course, as I wrote, the current password feature isn't strong encryption by any means; it's more a way to implement a sort of mild obfuscation, just to show that data hiding could also be a use case. In time I plan to implement a proper block encryption scheme.

  3. I don't miss paper at all, for books. on As Print Surges, Ebook Sales Plunge Nearly 20% (cnn.com) · · Score: 2

    As an avid reader, I like my front-illuminated ebook reader very much, thank you. And I don't miss one bit having to carry the latest big book (often not very well printed, or with a font too small or too large) on the train to/from work.
    Manuals & tech info are an entirely different thing, of course, at least until I can get a big, flexible (as in bendable and unbreakable) speedy ereader.

  4. Re: A couple of problems on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    There's no 48 bit checksum in the specs. The 48 bit field is a UID that's used to identify a file, and it could be anything "unique enough" for the job. For example, in the case of a digital camera that saves photos on an SD card, it could even be just the progressive number used in the filenames. Or it could be a hash, or just a random number. A giant file repository wasn't exactly the first thing on my mind when I developed the concept. 2TB/16TB could be a limit in some situations, undoubtedly, but again it's reasonable enough in a wide variety of others.
    But note that the specs include a version number and so can be flexible. It's simple to add a new version with far larger limits, extending the size of the header, if needed, and that was the plan depending on the interest and usage that could come up.
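The 2TB/16TB figures follow from the 32-bit block sequence number mentioned elsewhere in this thread. A minimal sketch of the arithmetic, assuming the limit is simply the number of addressable blocks times the block size (header overhead and the metadata block are ignored, and the function name is only illustrative):

```python
# Rough upper bound on container size from a 32-bit block sequence number.
# Assumption: limit = addressable blocks * block size; per-block header
# overhead and the metadata block are ignored for simplicity.

SEQ_BITS = 32  # width of the block sequence number field


def max_container_bytes(block_size: int, seq_bits: int = SEQ_BITS) -> int:
    """Largest container addressable with the given block size."""
    return (2 ** seq_bits) * block_size


TIB = 2 ** 40

for size in (512, 4096):
    print(f"{size}-byte blocks: {max_container_bytes(size) // TIB} TiB")
```

Widening the sequence number in a later format version would raise both limits by the same factor.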

  5. I was searching for what to use as a file system name for the AS400 one, to add a note to the readme, but then I read again what you wrote in your first reply (sorry, it was around 04:00am here): "520 bytes sectors, 512 for user data". If that's the case, then SeqBox should work just fine. The essential thing is that an SBX block, of 512 bytes, remains whole/integral, and that seems to be the case. The only thing to keep in mind would be to use an adequate step with SBXScan when searching for the blocks (520 or 522 in this case, or even just 1).
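Why the step matters can be shown with a toy scan. This is only a sketch under stated assumptions: the 3-byte signature `b"SBx"`, the placement of the 8 extra bytes after the user data, and the `scan_offsets` helper are all hypothetical and do not describe the real SBXScan interface.

```python
# Sketch of a step-based scan for SBX-like blocks in a raw image.
# Assumptions (not from the comment above): blocks start with the 3-byte
# signature b"SBx", and scan_offsets is a hypothetical helper, not the
# real SBXScan interface.

SIGNATURE = b"SBx"   # assumed block signature
BLOCK_SIZE = 512     # default SBX block size


def scan_offsets(image: bytes, step: int) -> list:
    """Return offsets (multiples of step) where a whole block could start."""
    hits = []
    for offset in range(0, len(image) - BLOCK_SIZE + 1, step):
        if image[offset:offset + len(SIGNATURE)] == SIGNATURE:
            hits.append(offset)
    return hits


def fake_sector(payload: bytes) -> bytes:
    """A 520-byte sector: 512 bytes of user data plus 8 assumed extra bytes."""
    return payload.ljust(512, b"\x00") + b"\xff" * 8


image = fake_sector(b"junk") + fake_sector(SIGNATURE) + fake_sector(SIGNATURE)

print(scan_offsets(image, 520))  # step matching the sector size
print(scan_offsets(image, 512))  # a 512-byte step drifts out of alignment
print(scan_offsets(image, 1))    # a step of 1 is slow but finds every block
```

With 520-byte sectors, a 512-byte step only lines up with a sector start once every 65 sectors, which is why a matching step (or a step of 1) is needed.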

  6. Re:Use-case of distributed pieces? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    Well, it's probably not the first use case I would think of, but it will surely work and I think I have mentioned splitting in the readme. Just as SBXScan can locate and collect all the good SBX blocks from different images/devices (imagine having a copy of the same SBX file on different media, to add physical redundancy), it can surely reassemble a container from different pieces. The only restriction is that the splitting needs to happen on block boundaries (512 bytes by default), which should not really be a problem.
    Then again, I'm not sure if/how this would be better than other splitting tools, but yes, it could be used like that.

  7. I'm probably familiar with most of those concepts, but it's nice to see them all in one well presented document which is also a sort of nice historic artifact. I think I stumbled upon it a long time ago and then lost track of it. Thanks.

  8. Re:What about files stored in MFT? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    I'll definitely experiment with creating a file with just the hashes + metadata, to be then encoded in a SBX, so that it will be possible to do both things (standalone SBX file, or normal file + just the hashes in an SBX). Thanks for the nice discussion (and sorry for the delay, was about 05:00am here :) ).

  9. Re:Apple Lisa/early-Mac "tags" on steroids on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    Yes, it was a fairly common thing on older systems, from the times when mass storage hardware was far from precise and reliable, to do things like check that the drive seek really landed the head on the requested track. At least the Mac implementation still kept the usual 512 bytes of useful data per sector (at the price of less common hardware), while for example Amiga OFS ends up with an odd 488 usable bytes per sector (but with common hardware).

  10. Re:What it does and why it's (partially) useful on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    It's somewhat similar, but not really the same thing. From what I gathered (but I'm surely not an expert in ReiserFS), ReiserFS can scan the disk to locate and recover its file structures / indexes, which in turn would make it possible to find the files again. SeqBox enables one to recover an SBX container without even considering any FS structure, just by scanning the raw bytes for the file itself, because each of its blocks is made recognizable. You can zero out all the file system info, partition table, etc. and still find an SBX file.

  11. Re:What about files stored in MFT? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    As for the SBX of hashes not being locatable due to metadata corruption, you can avoid that by applying a header to the SBX blocks themselves.

    OK, I see what you mean.

  12. Re:A couple of problems on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 2

    You obviously tried to keep the per-block header small to minimize overhead. But that has caused questionable decisions that may make this format less useful than it could be.

    It's surely a compromise, but I think it's pretty sensible for the present version (though some variations can surely be implemented as different versions, to better suit different scenarios).

    Firstly, at 48 bits, the UID is a bit short. If UIDs are chosen randomly and with even distribution, there's a 1 in 1000 chance of a duplicate UID with just 750000 files.

    That seems a bit off: 48 bits, assuming even distribution, would give 281,474,976,710,656 possible values. But again, 750,000 files would seem an enormous number for the practical uses I was thinking about at the moment.

    Secondly, the block sequence number is a 32bit value, so 4 billion blocks in a file max. With this format, files are limited to 2TB.

    Yes, 2TB with 512-byte blocks, or 16TB with 4K blocks. It's not good for everything but it's probably good for a lot of cases. But again, it can be easily upgraded if needed.

    Thirdly, the 2-byte checksum is too small.

    A 2 byte CRC seems to be plenty for 512 bytes, and even for 4KB it isn't bad. This isn't about detecting tampering, but just about distinguishing a random block that happens to have the right signature, with a specific UID & sequence number, from a real SBX block.
    That said, again I agree that for applications that could envision a lot of files with far larger sizes, a different kind of header, with expanded fields (and some other things) would be better suited.
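The two figures in the UID exchange above measure different things: 2^48 is the number of possible UIDs, while the quoted 1-in-1000 is a collision probability. A small sketch of the standard birthday approximation, assuming uniformly random UIDs as in the quoted objection, lets both numbers be checked:

```python
import math

# Birthday-bound estimate for random 48-bit UIDs.
# Assumption: UIDs are drawn uniformly at random, as in the quoted objection.

UID_BITS = 48


def collision_probability(n_files: int, uid_bits: int = UID_BITS) -> float:
    """Approximate P(at least two files share a UID): 1 - exp(-n(n-1)/(2N))."""
    space = 2 ** uid_bits
    return -math.expm1(-n_files * (n_files - 1) / (2 * space))


print(2 ** UID_BITS)                    # size of the UID space
print(collision_probability(750_000))   # roughly 0.001
```

Since the probability grows with the square of the number of files, collections of a few thousand files stay far below this figure.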

  13. They are two entirely different things, but they could "help each other" in some ways.
    PAR (or a RAR archive + recovery records, etc.) tries to address the problem of losing some small parts of a file (due for example to physical errors), using some amount of redundancy. SeqBox tries to address the issue of identifying and reassembling all the parts of a file when they are all still on the physical media, but without the file system's indexes / structures to locate them (e.g. after a quick format, zero writes on the first sectors, etc.).
    If you combine the two, creating an SBX container of a RAR + recovery records for example, you get both qualities.

  14. Re:What about files stored in MFT? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 2

    I'd say for archival or read only. It's trivial to read from the file directly, and for some applications a simple plug-in may do the job (like there are audio players that can read audio files inside ZIP archives, for example).
    As for the separate file with hashes, instead, the main issue would be that if the file system is in an inconsistent / damaged state, that file too would be inaccessible. So it would need to be kept somewhere else, and that would complicate things a lot.

  15. I see. It surely is a corner case, but an interesting one. My only experience with the AS400 was seeing some at customer premises, and occasionally having to transfer some files from/to them, but nothing more. I remember an external IBM 5 1/4" floppy drive that was the length of an arm, and a similar cost! :) Oh and BTW, SBXScan can scan and collect block positions from multiple images, even if it would be of no help in this situation. The idea was that if one keeps 2 or more copies of an SBX file on different media, and all copies become corrupted in some way (bit-rot, random blocks gone bad, something like that), it's still possible to collect all the good blocks from every available source, and perhaps still restore the complete container.
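The multi-copy idea can be sketched as a merge keyed on block sequence numbers. This is only an illustration under stated assumptions: each source is modeled as a dict mapping sequence number to block bytes (or `None` for a block that failed its checksum), and `merge_sources` is a hypothetical helper, not part of the real tools.

```python
# Sketch of merging SBX-like blocks recovered from several damaged copies.
# Assumptions: each source is a dict {sequence_number: block_bytes_or_None},
# where None marks a block that failed its checksum; merge_sources is a
# hypothetical helper, not part of the real tools.


def merge_sources(sources, total_blocks):
    """Pick the first good copy of every block; report the ones still missing."""
    merged, missing = {}, []
    for seq in range(total_blocks):
        good = next((s[seq] for s in sources if s.get(seq) is not None), None)
        if good is None:
            missing.append(seq)
        else:
            merged[seq] = good
    return merged, missing


copy_a = {0: b"meta", 1: b"AAAA", 2: None, 3: b"CCCC"}
copy_b = {0: b"meta", 1: None, 2: b"BBBB"}  # block 3 was never found here

merged, missing = merge_sources([copy_a, copy_b], total_blocks=4)
print(sorted(merged), missing)
```

Neither copy is complete on its own, but together they cover every block, so the container could still be rebuilt.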

  16. Re:What about files stored in MFT? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    It's certainly possible in some scenarios. In others it could be more practical to create an SBX file, and then keep just that one, instead of having the original file and another file to keep track of, for a larger total overhead.

  17. Re:This would be great for SSDs on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    Sure. In general, 512 bytes as a sector size is considered legacy nowadays. Even more so, I don't see much chance of a 512-byte block being broken down further.

  18. Re:What about files stored in MFT? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    I thought about something along those lines (I was considering blake2b/s as a faster hash), but I chose to do it this way considering that the SBX file could, at least in some cases like the digital camera one, be the only copy, as it's easily decoded on the fly and seekable. But keeping just a list of hashes and some metadata is surely interesting too.

  19. Interesting, didn't know that. Of course it's possible to create a new block version with a suitable block size (the 3 current ones differ just in the block size: 512, 128 or 4K, mostly for experimenting and verifying that the tools worked correctly with different versions), but that would be a bit like cheating. Will add the AS400 to the not working list, thanks.

  20. Re:forward error correction? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    I thought about it, but at least initially I chose to keep it simple. One can always process the file in some way before creating the SBX, for example creating a RAR archive with recovery records, and then encoding that one.

  21. Re:Not to seems like a philistine... on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 3, Informative

    That one yes, but not the others already saved. A fragmented JPEG, instead, is pretty difficult to recover if the file system is inconsistent. The usual recovery tools would easily find the first fragment, and then proceed from there collecting sectors in sequence, which may or may not contain the right data.

  22. Re:This would be great for SSDs on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    Note that the 4KB block has just come up in some examples, but the default block size is 512, and I think it's reasonable to assume that a block of that size will not be broken down into smaller parts.

  23. Re:What about files stored in MFT? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 1

    Yes. Just keep in mind that the default blocksize is 512 bytes, as a reasonable compromise between overhead and compatibility with most file systems / platforms, older ones included. You can check the readme.md for the complete file specs, near the end.

  24. Yes, exactly. And it's not just about data recovery. For example, I'm not sure how "generally useful" that would be, but I found myself using it for extracting/exchanging data from disk images formatted with a foreign file system.

  25. Re:What about files stored in MFT? on Developer Shares A Recoverable Container Format That's File System Agnostic (github.com) · · Score: 2

    Assuming 4K blocks, a 4096-byte file would end up occupying 12288 bytes: 4096 for the metadata block, then 2 data blocks of 4096 each. The last one would contain just 16 useful bytes, and the rest would be padding (with 0x1A bytes). Of course with very small files this isn't efficient, but it's not usually a problem. Overhead is just a bit over 3% with the default 512-byte blocks, and less than 1% if 4KB blocks are forced.
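The arithmetic above can be reproduced with a short sketch. It assumes a 16-byte header per block (implied by the last data block holding only 16 bytes of the file) and exactly one metadata block; the function names are only illustrative.

```python
import math

# Sketch of the container-size arithmetic described above.
# Assumptions: a 16-byte header per block (implied by the last data block
# holding only 16 bytes of the file) and exactly one metadata block.

HEADER_SIZE = 16


def container_size(file_size: int, block_size: int) -> int:
    """Total bytes on disk: one metadata block plus padded data blocks."""
    payload = block_size - HEADER_SIZE
    data_blocks = math.ceil(file_size / payload)
    return (1 + data_blocks) * block_size


def header_overhead(block_size: int) -> float:
    """Header bytes as a fraction of each block."""
    return HEADER_SIZE / block_size


print(container_size(4096, 4096))
print(f"{header_overhead(512):.2%}", f"{header_overhead(4096):.2%}")
```

For the 4096-byte example this gives three 4096-byte blocks in total, and the per-block header fraction matches the "a bit over 3%" and "less than 1%" figures for large files, where the metadata block and final padding become negligible.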