Slashdot Mirror


Rethinking the Nature of Files

An anonymous reader writes "Two recent papers, one from Microsoft Research and one from the University of Wisconsin (PDF), are providing a refreshing take on rethinking 'what a file is.' This could have major implications for next-gen file system design, and will probably cause a stir among Slashdotters, given that it will affect the programmatic interface. The first paper has some hints as to what went wrong with the previous WinFS approach. Quoting the first paper: 'For over 40 years the notion of the file, as devised by pioneers in the field of computing, has proved robust and has remained unchallenged. Yet this concept is not a given, but serves as a boundary object between users and engineers. In the current landscape, this boundary is showing signs of slippage, and we propose the boundary object be reconstituted. New abstractions of file are needed, which reflect what users seek to do with their digital data, and which allow engineers to solve the networking, storage and data management problems that ensue when files move from the PC on to the networked world of today. We suggest that one aspect of this adaptation is to encompass metadata within a file abstraction; another has to do with what such a shift would mean for enduring user actions such as "copy" and "delete" applicable to the deriving file types. We finish by arguing that there is an especial need to support the notion of "ownership" that adequately serves both users and engineers as they engage with the world of networked sociality.'"

369 comments

  1. There is no "issue." *I* own my files and data by elrous0 · · Score: 2, Insightful

    I'm sorry, but MS issuing a paper on the "issues of file ownership" and the cloud sends a little chill up my spine. Makes me think that engineering may not be the only impetus behind their paper. It also makes me wonder if someone isn't looking to take a little more "ownership" of what has traditionally been considered *my* data.

    It's bad enough I'm already forced into "buying" software and media that I can never resell. Now they want my fucking Word files too I guess.

    --
    SJW: Someone who has run out of real oppression, and has to fake it.
    1. Re:There is no "issue." *I* own my files and data by fuzzyfuzzyfungus · · Score: 4, Insightful

      Don't worry, user, of course you own those little files of yours.

      We just want to install some robust Technological Protection Measures to preserve your ownership of those files across all devices and platforms and legal systems aligned with international norms... Totally harmless, nothing to worry about.

    2. Re:There is no "issue." *I* own my files and data by Hartree · · Score: 1, Insightful

      Microsoft: All your files^h^h^h^h^hdata are belong to us!

    3. Re:There is no "issue." *I* own my files and data by imric · · Score: 2

      Well since the {xxAA} already owns most modern works of art and all performances forever (with the blessings of our government), and companies already own ideas (thoughts), it stands to reason that Microsoft would want to own the results of any actions facilitated by software written by them as well. I mean, how can they continue to expand their market if they don't? Be REASONABLE! I mean, this can get rid of any ambiguity about ownership and remove copyright and patent issues forever! It's simple - "All your files are belong to us"!

      --
      Paranoia is a Survival Trait!
    4. Re:There is no "issue." *I* own my files and data by Nick+Fel · · Score: 1

      Please. What about a file you're working collaboratively on in the cloud? Do you own that? That's obviously the kind of thing they're talking about.

    5. Re:There is no "issue." *I* own my files and data by CharlyFoxtrot · · Score: 3, Insightful

      You should read the article, you are illustrating their point. They talk about how users associate ownership with having a file on a known physical location and how in order for people to feel comfortable with cloud storage the definition of file needs to be redefined in a way that people feel they have ownership over data that exists "out there".

      "[...] ownership is what we are thinking of, when ownership stands as proxy for what used to be knowledge of location and responsibility for that location. What was once a relationship between a user and a physical thing now needs to stand as a relationship between a user and a digital thing. Just what this ownership might be and how it might function in terms of what is specified in this new entity we are thinking of, one that somehow has the properties we have described above and which also allows this new characteristic, we have begun to outline but a beginning is all it is."

      Part of this is the users' ability to delete their data even when it has been put out there in the wild.

      "A boundary object needs to be developed that can bridge the abstraction of the user and the one of the engineer, who needs to worry about where this thing that keeps growing and changing, and where the locale of storage changes too, such that when a user says ‘delete’, the thing whatever it is and wherever the entities constitutive of it are, are indeed, done away with."

      This is a paper talking about your concerns and how to address them.

      --
      If all else fails, immortality can always be assured by spectacular error.
    6. Re:There is no "issue." *I* own my files and data by i+kan+reed · · Score: 1

      My thoughts too. This sounded like Microsoft trying to justify the idea of embedding DRM directly into their next filesystem.

    7. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      In most instances when people propose changes to files or file systems, it goes nowhere. I don't see this going far. I could write a paper on vehicles, the definition of vehicles, point out that wouldn't it be nice if cars could fly, claim that the concept of roads is all wrong, but it wouldn't make any difference.

    8. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 2, Insightful

      A quote from the conclusion of the article:

      A boundary object needs to be developed that can bridge the abstraction of the user and the one of the engineer, who needs to worry about where this thing that keeps growing and changing, and where the locale of storage changes too, such that when a user says ‘delete’, the thing whatever it is and wherever the entities constitutive of it are, are indeed, done away with.

      I'm sorry, but that sounds a *lot* like DRMing every file to me, with a central service controlling every file (how else could you implement such a system?). The authors even admit as much a few sentences later:

      At first reading one might think they are alluding to digital rights management.

      Of course, they seem to deny that this is DRM. But that's sure what it sounds like to me. And DRM needs some sort of central service to work, which I'm sure MS will be happy to provide of course.

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    9. Re:There is no "issue." *I* own my files and data by Short+Circuit · · Score: 2

      I was amused when I discovered that the Xen hypervisor allows you to emulate a TPM in software. I didn't dig into it enough to find out if there were a way to extract stored data from within the dom0.

      What's that about a secure keystore again?

    10. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 5, Informative

      No, they're talking about DRM. They try to deny it a few sentences later, but how else would you implement a system where any given file downloaded off the web could be deleted by a central authority at any time?

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    11. Re:There is no "issue." *I* own my files and data by imric · · Score: 1, Insightful

      Of COURSE they are. They are trying to find a different way to market it - since DRM has no user benefits and users actively dislike it, they 'need' to redefine the issue so users have no choice.

      This is marketing.

      --
      Paranoia is a Survival Trait!
    12. Re:There is no "issue." *I* own my files and data by Nick+Fel · · Score: 1

      What it sounds like to me is separating the notions of files from the notion of storage. So only the engineer and the underlying system need to worry about whether your data is on your hard drive, or the cloud, or a pen drive. Instead, the user can just worry about their text/image/video, wherever it happens to be. Of course, it doesn't help that Richard Harper (a social scientist) writes such horrifically ponderous text.

    13. Re:There is no "issue." *I* own my files and data by CharlyFoxtrot · · Score: 2

      There's nothing wrong with DRM when it's used to protect my ownership of my files. Would you be opposed to a DRM scheme that would allow you to totally and irrevocably delete a picture you posted to Facebook because it allows you to retain total ownership ? The problem with DRM is when it's used to take away rights you traditionally hold, i.e. when DRM is used to reduce your ownership instead of increasing it.

      --
      If all else fails, immortality can always be assured by spectacular error.
    14. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 1

      Yep, sounds like a very elaborate way to justify DRM, while denying that it's DRM. It walks like a duck, quacks like a duck, and swims like a duck--but MS is issuing a paper to let us know it's *not* a duck. It's a new *file paradigm*, see.

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    15. Re:There is no "issue." *I* own my files and data by CharlyFoxtrot · · Score: 2

      But the ability of a user to delete his "cloud" files would be a benefit. DRM is only evil when it gives a third party control over your stuff, not when it gives you control over your own stuff.

      --
      If all else fails, immortality can always be assured by spectacular error.
    16. Re:There is no "issue." *I* own my files and data by SharkLaser · · Score: 1

      RIAA/MPAA doesn't own them. They represent some (not all) of those who do on certain matters, like piracy.

    17. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 1

      That's because it is Digital Rights Management.

      They're basically trying to find a DRM scheme that serves the needs of business. That is: doesn't get in the way of viewing files, tracks who is responsible for that file, and grants the file owner the ability to manage the file regardless of location.

    18. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 1

      There's nothing wrong with DRM when it's used to protect my ownership of my files.

      Yeah? And who do you think is going to run the central system that administers all this DRM? You, or MS? And if MS is running it (and it's on your system too), what makes you so sure it's still *your* data? Is there something stopping them from deleting it anytime they want on your system too?

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    19. Re:There is no "issue." *I* own my files and data by SharkLaser · · Score: 1

      Would you be opposed to a DRM scheme that would allow you to totally and irrevocably delete a picture you posted to Facebook because it allows you to retain total ownership ?

      But information wants to be freeeeee!!

    20. Re:There is no "issue." *I* own my files and data by Ultra64 · · Score: 1

      Jesus Christ, there should be a limit to paranoia. They are obviously talking about ownership in the context of the filesystem, i.e., file X is owned by user ID 1.
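
      For readers unfamiliar with ownership in this filesystem sense, here is a minimal sketch, assuming a POSIX system and Python's standard library. The helper name `owner_uid` is made up for illustration; it only reads the owner ID that the filesystem already records for a file.

```python
import os
import tempfile

def owner_uid(path: str) -> int:
    """Return the numeric user ID that the filesystem records as owner of `path`."""
    return os.stat(path).st_uid

# Create a scratch file and ask the filesystem who owns it.
with tempfile.NamedTemporaryFile() as scratch:
    uid = owner_uid(scratch.name)
    print(f"{scratch.name} is owned by user ID {uid}")
```

      On a typical system the printed ID matches the user running the script, which is all that "file X is owned by user ID 1" means at this level.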

    21. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 2

      DRM is only evil when it gives a third party control

      Who do you think is going to be running the central service that administers all this DRM?

      I'll give you a hint. It rhymes with Picrosoft.

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    22. Re:There is no "issue." *I* own my files and data by EdZ · · Score: 1

      To me, it sounds like they've looked at ZFS and thought "hey, that sounds like a good idea". Abstracted storage (bits of your files could end up split up and spread multiple times redundantly across physical volumes, but files will still respond to all the usual operators), lots of metadata (including a history of changes to files), built in error-checking, etc.

      The future is here, and unfortunately is currently owned by Oracle.

    23. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 1

      You need to read the paper (I know it's sacrilegious to say that on /.). They *start off* by talking about file systems, but by the end it moves very much into the cloud and the internet and advocates for a thinly-veiled DRM system for all files, under the guise of "this will allow users to delete and control their files anywhere, even in the cloud or on the internet."

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    24. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      There's plenty of software TPM emulators, but you won't have a signed endorsement key so it won't do you much good.

    25. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      Any files 'on the cloud' that you own, you can already delete. Any files on the cloud that are theirs, you can't. I don't see how this is supposed to change that.

    26. Re:There is no "issue." *I* own my files and data by Dog-Cow · · Score: 0

      This is DRM the exact same way that ACLs and permission bits are. I guess Linux is completely evil because it supports DRM!!!!!1111oneone!
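
      As a point of comparison, permission bits are plain local metadata that the file's owner can set and read. A minimal sketch, assuming a POSIX system and Python's standard library (`describe_mode` is a hypothetical helper name, not part of any API):

```python
import os
import stat
import tempfile

def describe_mode(path: str) -> str:
    """Render a file's type and permission bits in the style of `ls -l`."""
    return stat.filemode(os.stat(path).st_mode)

with tempfile.NamedTemporaryFile() as scratch:
    # Owner may read and write, group may read, everyone else gets nothing.
    os.chmod(scratch.name, 0o640)
    print(describe_mode(scratch.name))
```

      Nothing here contacts a remote service: the bits live with the file and are enforced by the local kernel, which is the distinction the replies below argue about.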

    27. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 2

      I am nominating your post for the Irony Awards. I think you're a shoo-in this year.

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    28. Re:There is no "issue." *I* own my files and data by digitig · · Score: 3, Informative

      And if instead of a picture it was a music track or a book? And if you charged the customer for access to it? And you could still delete it after they had "bought" it? And how does that look from the other side of the fence? How is your sort of DRM any different from the "bad" sort?

      --
      Quidnam Latine loqui modo coepi?
    29. Re:There is no "issue." *I* own my files and data by imric · · Score: 1

      Sure.

      --
      Paranoia is a Survival Trait!
    30. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 1

      Either way, you're an immature piece of shit for assuming that because MS is associated with the idea that it must be wrong, bad and/or evil.

      Quite funny, considering that I'm frequently accused of being an MS apologist.

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    31. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 1

      Looks like someone needs a hug.

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    32. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      Once someone else has downloaded it, it isn't your file any more.

    33. Re:There is no "issue." *I* own my files and data by marcosdumay · · Score: 1

      I'm certain Microsoft has only good intentions with this... I mean, they've never done anything wrong in the past, and all...

      But even so, it is still DRM. That means you'll still be prohibited from using your computer in any unauthorized way, you won't be able to transmit data to unauthorized devices or computers, you won't be able to communicate with anybody that isn't using Windows, and the government will still be able to force the central controller to delete any data you have that they don't want you to have (did you record that time a cop asked you for a bribe? Forget it, it's gone). That is the way DRM works; there is no way around it.

    34. Re:There is no "issue." *I* own my files and data by imric · · Score: 1

      *shrug* Well, that was a mature, well reasoned response to a post that attempted to put a humorous spin on a very real problem. The FACT that the 'C'loud is an attempt to own data that YOU produce and then rent it back to you is, of course, totally reasonable to you, right?

      But that's OK. I guess you are one of the ones who capitalize the word 'cloud' to make it 'New'.

      --
      Paranoia is a Survival Trait!
    35. Re:There is no "issue." *I* own my files and data by imric · · Score: 1

      Yes, because MS has earned the trust of their customers, and claiming that anyone that disagrees with them is "an immature piece of shit" is adult discourse.

      Go back down to the basement and get some sleep, your Mom doesn't like it when you are cranky like this. You shouldn't stay up playing World of Warcraft for so long on a school night.

      --
      Paranoia is a Survival Trait!
    36. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      And, conversely, your naivety seems disingenuous at best. Perhaps you're paid by the company which sounds like Picrosoft? Such hate you have, I enjoy poking you.

    37. Re:There is no "issue." *I* own my files and data by StuartHankins · · Score: 3, Insightful

      +1 Insightful. Allowing Microsoft to do this sort of thing would be a horrible mistake. They've shown they can't be trusted too many times. Maybe the kids weren't aware when this stuff started, but I still remember the tricks Microsoft played... and are still playing. Boo on them forever in my book.

      Poetic justice would have Apple purchase Microsoft and break it into divisions.

    38. Re:There is no "issue." *I* own my files and data by StuartHankins · · Score: 2

      Yes I would be opposed. Nothing is 100% secure and having all my files disappear would be unacceptable. My files, my ownership, on my machines. That's how I like it.

    39. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      Such a silly troll. Did you buy that UID? Or do you work for Microsoft? I think I will haunt you.

    40. Re:There is no "issue." *I* own my files and data by J'raxis · · Score: 1

      Would you be opposed to a DRM scheme that would allow you to totally and irrevocably delete a picture you posted to Facebook because it allows you to retain total ownership ?

      Yup, I would be, just like I'm opposed to any examples of a false sense of security. Once a piece of information is out there, it's out there. Any attempt to delete it should be viewed as merely that: An attempt, with an assumption that there still are, and always will be, copies out there. Anything more and you're just kidding yourself. Even if there be software controls in place preventing copying, you can't assume that they weren't bypassed by someone trying to make a copy. Even if some form of encryption be used, you can't be sure someone hasn't illegitimately acquired the keys.

      You want your data 100% private, don't distribute it on any "open" communication channels.

    41. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      And you're calling other people immature? Own a mirror?

    42. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      Would you be opposed to a DRM scheme that would allow you to totally and irrevocably delete a picture you posted to Facebook because it allows you to retain total ownership ?

      Begging the question fail. Yes, I would be opposed to a general scheme that would allow this (not that one could really ever exist in the fully general case... government and military users and all). Even your stupid little example has no benefit unless you can somehow delete every copy that existed after being downloaded from Facebook. If you could do that effectively, it means there must be something so deeply rooted in commodity computing devices that general purpose computers do not exist anymore. Sorry, but that is a cost for so little benefit (covering up your worthless life's drunken stupidity) that I would never support it.

      Go fuck yourself.

    43. Re:There is no "issue." *I* own my files and data by h4rr4r · · Score: 1

      Nope, neither of the collaborators do. The guy who owns the cloud does. For proof look at who can delete it permanently.

    44. Re:There is no "issue." *I* own my files and data by fuzzyfuzzyfungus · · Score: 1

      Incidentally, how does endorsement key signing get handled in field applications of TPMs?

      Obviously, an endorsement key that is simply unsigned would fail; but signing stuff is easy, it's just that most people won't care about your signature.

      Does the vendor of the application choose which TPM vendors (of which there are a decent number, and which change sometimes) to trust the signatures of? Do the SSL CAs, who've done such a fantastic job, get roped in to this as roots of trust?

      I recognize that it would be (barring a decapping and chip-level analysis of a genuine TPM) impossible for my soft-TPM or microcontroller-attached-to-the-LPC-bus to impersonate a, say, Infineon TPM; but it should be entirely possible for it to show as a genuine, signed, SporePoint Security LLC, TPM, signed with a set of keys generated by me.

      Who sets the 'trusted' signers?

    45. Re:There is no "issue." *I* own my files and data by i+kan+reed · · Score: 1

      How do you know it doesn't involve remote crypto schemes? It's plainly implied to be a network-supported system. If you think Linux has a chance in hell of being supported on this pony ride, you've missed the point. If file decryption can only be handled by operating systems connected to a central authorization database, Microsoft wins by denying Linux users any access at all to said system. I give a hefty probability that Linux won't support this DRM at all.

    46. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      Yeah? And who do you think is going to run the central system that administers all this DRM? You, or MS? And if MS is running it (and it's on your system too), what makes you so sure it's still *your* data? Is there something stopping them from deleting it anytime they want on your system too?

      Oh my God TERRISTS!!!! data retention laws passed by our benevolent older brother government will ensure that no data in the cloud ever, ever, ever gets deleted. If you delete something from the cloud you can never get it back, but the govt retains access in perpetuity.

    47. Re:There is no "issue." *I* own my files and data by biodata · · Score: 4, Insightful

      The cloud idea likes to project an illusion of it not mattering where the file is, but it is predicated on (more or less) limitless bandwidth with near-zero latency, and limitless local storage/cache. If the file you want is not on the local hard disk then it isn't. If your OS needs to fetch it behind the scenes then you need to wait until it arrives. Yes, you might think you don't want to know where the file is physically, but when it takes ten minutes to open a file that should take ten seconds, you will probably want to know why (oh, it's in another country and the network is busy because everyone is watching some new TV prog, I see now). Not knowing where the file is just means needing to ask all the time. Is it really better not to know, than just knowing in the first place, and making sure it is where you need it to be? Bandwidth will never be unlimited and latency will never be zero. We are routinely working on 10GB files now where I work, and you always need to know where they are, and to care, because however big the pipes are and however big the disk space and the RAM, the data streams grow even faster. The technologies underlying data capture devices obey their own version of Moore's law, frequently with higher multiplicities.
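
      The arithmetic behind this point can be sketched in a few lines. This is a rough lower-bound estimate only: the link speeds and the 80 ms latency are illustrative assumptions, not measurements, and `transfer_seconds` is a hypothetical helper.

```python
def transfer_seconds(size_bytes: float, bandwidth_bits_per_s: float,
                     latency_s: float = 0.0) -> float:
    """Lower bound on fetch time: one latency delay plus time to push the bits."""
    return latency_s + (size_bytes * 8) / bandwidth_bits_per_s

GB = 10**9
file_size = 10 * GB  # the 10GB working files mentioned above

# Hypothetical links, chosen only for illustration: (bits per second, latency in seconds).
links = {
    "LAN (assumed 1 Gbit/s)": (1e9, 0.0),
    "WAN (assumed 100 Mbit/s, 80 ms latency)": (100e6, 0.080),
}
for name, (bandwidth, latency) in links.items():
    print(f"{name}: {transfer_seconds(file_size, bandwidth, latency):.1f} s")
```

      Under these assumptions the same file takes roughly ten times longer over the slower link, before any congestion or protocol overhead is counted, so where the bytes physically sit still shows up in the user's wait.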

      --
      Korma: Good
    48. Re:There is no "issue." *I* own my files and data by grumbel · · Score: 1

      While MS might do evil things with it, the notion of ownership could actually be quite useful for files. For backup it would for example be great to be able to quickly tell which files on my system are actually written by me, which were automatically created by my computer and which are part of the distribution. Thus I would only need to back up those files that are actually unique, not all the stuff that is trivial to recover from public sources.

      It would also make things like Creative Commons easier, as the file itself could track who created it and modified it, so I wouldn't need to manually carry that meta-information with me, which is often easily lost or incomplete when the work of many different people is combined together.

      But of course the notion of ownership alone wouldn't be that big an improvement, in a perfect world each file should essentially be a Git-like repository that keeps track of all the changes done to it.
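
      To make the idea concrete, here is a toy sketch of a file that keeps its own revision history and authorship. Everything in it (the `VersionedFile` class, its methods, the author names) is hypothetical and for illustration only; it is not how Git or any real filesystem stores data.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class VersionedFile:
    """Hypothetical file object that remembers every revision written to it."""
    history: list = field(default_factory=list)  # (digest, author, content) tuples

    def write(self, content: bytes, author: str) -> str:
        """Record a new revision and return its content digest."""
        digest = hashlib.sha1(content).hexdigest()
        # Skip the commit when nothing changed since the last revision.
        if not self.history or self.history[-1][0] != digest:
            self.history.append((digest, author, content))
        return digest

    def read(self, revision: int = -1) -> bytes:
        """Return the content of a revision (latest by default)."""
        return self.history[revision][2]

    def authors(self) -> list:
        """Everyone who contributed a revision, in first-seen order."""
        return list(dict.fromkeys(author for _, author, _ in self.history))

doc = VersionedFile()
doc.write(b"first draft", "alice")
doc.write(b"second draft", "bob")
print(doc.authors())
```

      With attribution carried inside the file like this, the Creative Commons bookkeeping described above would no longer depend on someone remembering to copy it along.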

    49. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      Want a real treat? Read its posting history.

    50. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      I'm sorry, but MS issuing a paper on the "issues of file ownership" and the cloud sends a little chill up my spine.

      I'm sorry, but the issue in question occurred the first time you published anything on any medium (whether a website or a pamphlet protesting the Stamp Act) or when you first used storage that wasn't fully under your own control (whether a team's git repository or a railroad depot located on someone else's land near your farm). These are not 2011 issues; people had to face the consequences of doing these things hundreds of years ago. If Microsoft's paper chilled your spine, I think your spine was already pretty chilly. :-) There's nothing wrong with Microsoft discussing the fact that other people have the capacity to quote your pamphlet, or the complexities of answering "where is the grain?" when a portion of it happens to be in transit in between your farm and the depot.

      It also makes me wonder if someone isn't looking to take a little more "ownership" of what has traditionally been considered *my* data.

      Looking to take a little more?! Did you read their example of a photo on Facebook and the "likes" metadata? Hey, it's great that you view Microsoft with suspicion, that's totally justified. But you're reading way too much into this. We're already in a world where lots of people are trying to get you to host "your" data on their servers. Since lots of money-spending people have proven themselves incapable of Just Saying No to these offers, it is a fact of life whether you join them or not. If you're going to provide some related goods or services to those people, you might as well study what all they're doing. And if you think Microsoft doesn't want to sell something to Facebook's users...

    51. Re:There is no "issue." *I* own my files and data by Gazzonyx · · Score: 1

      You should ask TheRaven64 on Slashdot about this. He wrote the book on Xen, literally, IIRC.

      --

      If I mod you up, it doesn't necessarily mean I agree with what you've said, sorry.

    52. Re:There is no "issue." *I* own my files and data by Short+Circuit · · Score: 1

      Is he LSC or CT? I've got the book, and interact with LSC on IRC regularly; he's one of the principals behind prgmr.com, of which I'm a customer. (And I've got the book...)

    53. Re:There is no "issue." *I* own my files and data by Dripdry · · Score: 2

      Before you know it we'll have to send in Tron to stop the Microsoft Control Program
      (showing my age here)

      --
      -
    54. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      The issue is that once you've given out the information, their copy is no longer yours to assert control of. Your "right" asserted through DRM comes at the expense of their "right" of ownership. It's an inherent conflict created by copyright. It's in a fiscal entity's interest to limit your rights to whatever extent maximizes their profit. They could intentionally and perpetually create new distribution systems and stop supporting old systems you fully invested in.

      Copyright advocates would extract "their" content from your mind if they could. Heaven forbid they should ever find a way.

    55. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      Poetic justice would have Apple purchase Microsoft and break it into divisions.

      You can't be fucking serious. Apple has proved 10 times more draconian when it comes to controlling what the user can/can't do. Yes, let's please give control of the "cloud" to Apple so they can raise their "walled gardens for idiot users" all around it. Brilliant.

      Did your parents have any offspring that lived?

    56. Re:There is no "issue." *I* own my files and data by sexconker · · Score: 1

      Who sets the 'trusted' signers?

      You do. Or, you should.
      Trust that is automated or extendable is stupid.
      But it's easier and faster so that's what we have.

    57. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      No, they're talking about DRM.

      Let us not forget about the good sides to this, such as when we publish a file with a copyright disclaimer in <html> in a page in a link, but it still manages to get overlooked and sucked up by vampires, leeches and morons (or perhaps web crawlers?).

      Example: Artwork on a collab site gets inadvertently re-published by Google Images and ends up in a corporate slideshow background. Or how about the infamous (or hardly known?) Amen Break.

      The implications of cross-platform implied ownership on the file-system level are only MOSTLY bad, not ALL bad. And we'll always have "FILEFab Qt" to crack that nutshell too.

      -Tres

    58. Re:There is no "issue." *I* own my files and data by Curunir_wolf · · Score: 1

      What it sounds like to me is separating the notions of files from the notion of storage. So only the engineer and the underlying system need to worry about whether your data is on your hard drive, or the cloud, or a pen drive. Instead, the user can just worry about their text/image/video, wherever it happens to be.

      Yea, that's kind of what it sounded like to me, too, and it's an absolutely horrible idea. Creating symlinks or somehow abstracting the storage location for convenience is one thing, but discovering the location in those cases is trivial. Claiming the user doesn't need to know about the storage and hiding it so deep it requires multiple tools or deeply obscured properties is a nightmarish scenario.

      Microsoft already thinks this is a good idea, though, and it's already causing headaches. I've had to deal with issues from "volume shadow storage", those crazy library linkages (whatever they call them), and the user profile "virtualStore", that seems to want to redirect files sometimes for some unknown reason but constantly displays them as if they were somewhere else. More of this insanity is not a good thing.

      --
      "Somebody has to do something. It's just incredibly pathetic it has to be us."
      --- Jerry Garcia
    59. Re:There is no "issue." *I* own my files and data by elrous0 · · Score: 1

      Well, he does fight for the users.

      --
      SJW: Someone who has run out of real oppression, and has to fake it.
    60. Re:There is no "issue." *I* own my files and data by Gazzonyx · · Score: 1

      I'm really not sure.

      --

      If I mod you up, it doesn't necessarily mean I agree with what you've said, sorry.

    61. Re:There is no "issue." *I* own my files and data by hazah · · Score: 1

      I concur. Hilariously ironic. Everyone *else* is the hater, of course.

    62. Re:There is no "issue." *I* own my files and data by ET3D · · Score: 1

      That's exactly the ownership that Microsoft is writing about. Do you want Facebook to be able to publish your photos even after you've deleted them (and possibly deleted your account)? The article also mentions Amazon's Kindle lending. Buyers of e-books want to be able to lend and resell their books, they want ownership.

      Sure, all these things require DRM, but just because it can be used in annoying manners doesn't mean it's not useful for people in general.

    63. Re:There is no "issue." *I* own my files and data by jd · · Score: 1

      There's a full logging filesystem for Linux these days and there are countless patches for all the attribute and access schemes out there (plus the ones already in Linux, including those via the LSM), so it's just possible Microsoft is trying to catch up with Linux as well as Solaris.

      This is one of those times where we are actually ahead of the game, not just in Linux but in many other F/L/OSS OS'. It would be good if we can actually STAY ahead of the curve rather than be overtaken.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    64. Re:There is no "issue." *I* own my files and data by spitzak · · Score: 1

      Would you be opposed to a DRM scheme that would allow you to totally and irrevocably delete a picture you posted to Facebook

      You mean a scheme where I can affect every computer that looked at that picture (and thus made a copy to local storage)?

      Damn right I would be VERY VERY strongly opposed to such a scheme!

    65. Re:There is no "issue." *I* own my files and data by CharlyFoxtrot · · Score: 1

      So don't use it. Any such scheme would simply allow a file to be "free to copy" too. But people have to make their minds up: the same people who are very strongly opposed to any kind of DRM scheme are often also the ones lamenting the loss of privacy and malappropriation of photos through sites such as Facebook, yet one is a potential (partial) remedy for the other.

      --
      If all else fails, immortality can always be assured by spectacular error.
    66. Re:There is no "issue." *I* own my files and data by colinrichardday · · Score: 1

      Hey, this is slashdot. Would you mind taking your facts and logic elsewhere?

      Sorry, I don't have mod points.

    67. Re:There is no "issue." *I* own my files and data by colinrichardday · · Score: 1

      Of what value is ownership without access? To continue the car analogy of a previous poster, if someone tows my car 10 miles away, it's still my car, but I have to walk 10 miles to get it. Some people have gigabyte-sized files, of what use is ownership if bandwidth/latency constraints make it difficult to get those files?

    68. Re:There is no "issue." *I* own my files and data by CharlyFoxtrot · · Score: 1

      I don't see why the master key to your files couldn't be both managed on the local system(s) and on a cloud server belonging to MS, Apple, Microsoft, FSF, or any other entity. Besides they can already delete all your (remotely stored) data now, the problem is you can't reliably do so. That's the problem they are trying to solve.

      --
      If all else fails, immortality can always be assured by spectacular error.
    69. Re:There is no "issue." *I* own my files and data by CharlyFoxtrot · · Score: 1

      The problem with that isn't that you are being charged for access. People pay for access to a resource all the time. The problem there is that your ownership is being reduced: where you used to own the track or book you are now renting. That's not actually a problem with the technology but with the mindset of the content cartels.

      --
      If all else fails, immortality can always be assured by spectacular error.
    70. Re:There is no "issue." *I* own my files and data by Millennium · · Score: 1

      DRM is only evil when it gives a third party control over your stuff...

      ...which is the entire point of DRM.

      ...not when it gives you control over your own stuff.

      This has never happened. Furthermore, it never will. That would go against the point of the exercise. It's occasionally sold as a "potential" "future" benefit, but no one has any intention of ever implementing it.

    71. Re:There is no "issue." *I* own my files and data by gambino21 · · Score: 1

      I would expect to be able to delete a picture from my profile just as a good usability practice, without having anything to do with ownership of the picture. However, I would not expect that I would be able to delete all copies of the picture, regardless of who owns it. So if I understand your question correctly, then yes, I would oppose a DRM scheme that would try to force Facebook to comply with a request to delete all copies of the picture.

    72. Re:There is no "issue." *I* own my files and data by CharlyFoxtrot · · Score: 1

      There's no reason why it should be MS. There could be certification authorities run by anyone, you could run your own even. IF the technology was developed right.

      --
      If all else fails, immortality can always be assured by spectacular error.
    73. Re:There is no "issue." *I* own my files and data by Grishnakh · · Score: 1

      Exactly. If you want absolute control of critical data, it's simple: don't give it to anyone else. Keep it to yourself, on machines that you manage, and nowhere else.

    74. Re:There is no "issue." *I* own my files and data by Grishnakh · · Score: 1

      Would you be opposed to a DRM scheme that would allow you to totally and irrevocably delete a picture you posted to Facebook because it allows you to retain total ownership ?

      Yes, I would, because 1) it's a stupid idea, and 2) it removes freedom. Why should Facebook be required to set up their systems to honor this scheme in the first place? If they don't implement it voluntarily, then it wouldn't work, and you wouldn't be able to delete the picture anyway.

      It's very simple: if you don't want Facebook to have a copy of your embarrassing picture, then don't upload it to Facebook. If you do stupid things with your information and get burned, that's your own fault.

    75. Re:There is no "issue." *I* own my files and data by Grishnakh · · Score: 1

      There is no "malappropriation of photos" in Facebook. What the hell is wrong with you? If you don't want your embarrassing photos on Facebook, then don't upload them to Facebook.

    76. Re:There is no "issue." *I* own my files and data by digitig · · Score: 1

      But you recognise the point: it's the same technology. That's why I'd be opposed to it, because even though I know I am an information-sharing saint and am sure you are too, a lot of information is provided by jerkwads.

      --
      Quidnam Latine loqui modo coepi?
    77. Re:There is no "issue." *I* own my files and data by tepples · · Score: 1

      So don't use it.

      Even if I don't use it, I don't appreciate my computer being crippled so that other people can use it.

    78. Re:There is no "issue." *I* own my files and data by sjames · · Score: 1

      DRM is made to be broken. That is especially true when you expect the DRM vendor to protect your data from their prying eyes.

    79. Re:There is no "issue." *I* own my files and data by AlexReidy · · Score: 1

      I'm fairly sure that even if Microsoft decided to make some bogus statement that they owned your Word files, you would still have the upper hand in the case that they decided to be difficult and take them away from you. If you wrote something original down on your Word document, the copyright of it will naturally belong to you, so you can just turn around and call copyright infringement on them. Though I suppose they could *take* "their" file instead of copying it, but you could always get technical with them down to the byte code and say that technically moving any digital information is copying it to a new location thus infringing on the copyright. Yeah, we can play unnecessary games too, Microsoft. To completely prevent any problems, you could just print the document (unless the paper company decides that they own the paper that is sold to you). But really, I doubt Microsoft would do something like this.

    80. Re:There is no "issue." *I* own my files and data by rsborg · · Score: 1

      Don't worry, user, of course you own those little files of yours.

      s/user/license holder/

      If you think "you" are the intended interest party, you are sorely mistaken. Microsoft's real customers are the content companies, hardware manufacturers and large corporations/organizations.

      In fact, this rant applies equally to most political discussion... you think Obama/Romney/etc are really interested in your vote? They know your vote will be influenced by marketing dollars controlled by larger entities anyway. It's all a bunch of foxes deciding the seating plan for dinner at the chicken coop.

      --
      Make sure everyone's vote counts: Verified Voting
    81. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      If that is really what they are talking about, burn the file to DVD or whatever format is out at the time.

    82. Re:There is no "issue." *I* own my files and data by Anonymous Coward · · Score: 0

      How do they do physical presence tests? Virtually?

  2. BeOS was there already by frnic · · Score: 1

    Sounds familiar...

    1. Re:BeOS was there already by Anonymous Coward · · Score: 0

      Interestingly, NTFS has better support for compiling Haiku with attributes than does ext* or HFS+. BtrFS will probably support xattr, but it's still broken garbage right now.

    2. Re:BeOS was there already by Anonymous Coward · · Score: 0

      I was under the impression that ext2 supported attributes, but you need to enable them with mount flags.

    3. Re:BeOS was there already by Anonymous Coward · · Score: 0

      What about reiserfs (metadata) and plan9 (transparent network)?

    4. Re:BeOS was there already by jd · · Score: 1

      The nature of files was also revisited in Plan 9/Inferno, and has been occasionally re-examined in countless other OS'. This isn't to contradict what you said, but rather to reinforce the point that Microsoft is very late in the game.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
  3. Ugh by Anrego · · Score: 2

    I couldn’t make it through the first paper. It came across as meandering and very academic. Didn’t try the second.

    Either way, of all the stuff that is currently broken, files are one of the few things that still mostly work. Yes, it would be nice to have more standardization and maybe metadata, but I don’t foresee it happening. And yes, users sometimes get confused, but they generally figure stuff out, and nothing described in the article seemed any more intuitive; it would probably be just as misunderstood by users.

    We’ll end up with 10 different standards, and no one will bother keeping metadata accurate on all their files. At best metadata is useful for a single person on a small subset of files where they find it useful. Everything else, the only metadata anyone is going to care about (and be bothered to enter) is title, which is served fairly effectively by the file name.

    1. Re:Ugh by fuzzyfuzzyfungus · · Score: 2

      Are you saying that quoting Wittgenstein in a paper that is ostensibly concerned with file structures is pretentious, content-free twaddle?

      Couldn't be...

    2. Re:Ugh by Anonymous Coward · · Score: 0

      should try the 2nd one. it looks more technical

    3. Re:Ugh by JoeMerchant · · Score: 1

      We’ll end up with 10 different standards, and no one will bother keeping metadata accurate on all their files.

      Concur, look at ID3 tags on audio files... any reason to believe that human behavior will improve in other areas?

    4. Re:Ugh by Anonymous Coward · · Score: 1

      Are you saying that quoting Wittgenstein in a paper that is ostensibly concerned with file structures is pretentious, content-free twaddle?

      In my experience, anything that quotes Wittgenstein is pretentious twaddle. And quite frequently content-free as well!

    5. Re:Ugh by Hatta · · Score: 1

      Hmm, all the ID3 tags on my MP3s are in order. Get your files from good people, and you'll get good metadata.

      --
      Give me Classic Slashdot or give me death!
    6. Re:Ugh by Hatta · · Score: 1

      Either way, of all the stuff that is currently broken, files are one of the few things that still mostly work.

      Exactly. It's time to screw that up now.

      --
      Give me Classic Slashdot or give me death!
    7. Re:Ugh by JoeMerchant · · Score: 1

      I've made my own MP3s from my CD collection, over a span of years, most of them are in good shape now, but it was not always so, early software didn't make it easy to do and later it just wasn't worth the effort to go back and fix. I made a pass at it about 2 years ago and I think I got from A to about T before terminal ennui set in.

    8. Re:Ugh by Anonymous Coward · · Score: 0

      I couldn’t make it through the first paper. It came across as meandering and very academic.

      An academic paper coming across as academic...you don't say

    9. Re:Ugh by Anrego · · Score: 1

      Well, there is a point at which an academic paper is so academic that it distinguishes itself as distinctly (or in my phrasing, very) academic. The meandering point was more relevant however. Reading it felt like watching a weatherman who won't just tell you what the damn temperature is gonna be.

    10. Re:Ugh by Hatta · · Score: 1

      I did most of my personal CD ripping with crip about 10 years ago. Super easy, and as automated as possible. CDDB's been around since the mid-90s.

      I haven't bought many CDs in the past 10 years though. Generally I get my music in the form of live recordings from db.etree.org or archive.org. I reward the artist by going to shows.

      --
      Give me Classic Slashdot or give me death!
    11. Re:Ugh by gman003 · · Score: 1

      I keep all the ID3 tags I care about in order as well. Artist, definitely, as well as album and release year. Genre I don't do quite so specifically (I don't split it beyond "Rock" or "Metal" or "Soundtrack" or so on).

      There is one weird bit, though. Several files have the wrong times. A file claims to be, say, 26 minutes, but it only has 5 minutes or so of data. Since it's only a very few songs, it's not a big deal, and I've never cared enough to figure out how to fix it.

    12. Re:Ugh by Tetsujin · · Score: 1

      Hmm, all the ID3 tags on my MP3s are in order. Get your files from good people, and you'll get good metadata.

      Yeah seriously. And you know what? When I had an MP3 file whose metadata wasn't in order... It messed up the sorting in iTunes (back when I used iTunes) - and so I'd invariably edit the metadata to fix it.

      If you have a UI that incorporates the idea of metadata and relies on it, and helps you work with it, you're much more likely to maintain it properly.

      --
      Bow-ties are cool.
    13. Re:Ugh by jd · · Score: 1

      So you're saying their point was academic?

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    14. Re:Ugh by Anonymous Coward · · Score: 0

      Or just use MusicBrainz Picard. I know it's not complete, and YMMV, but about 99% of my collection is there. And I can always submit data when it's not and get that warm tingly feeling that I'm helping my fellow men.

    15. Re:Ugh by Anonymous Coward · · Score: 0

      I propose using content-free grammars to define the universal language. Twaddle that, Wittgenstein!

    16. Re:Ugh by The+Askylist · · Score: 1

      7.

    17. Re:Ugh by Grishnakh · · Score: 1

      And all this shows why you don't need the filesystem to track metadata, all you have to do is embed it into the file.
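      As a rough illustration of metadata riding inside the file itself: ID3v1 keeps its tag in the last 128 bytes of an MP3, starting with the marker "TAG". The sketch below appends and reads that block. The function names are made up for this example, the sample values are placeholders, and ID3v2 (the variable-length tag at the front of the file) is not handled.

```python
import struct

# ID3v1 layout: marker, title, artist, album, year, comment, genre byte.
# "<" disables alignment padding, so the struct is exactly 128 bytes.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def _text(raw):
    """Decode a fixed-width field, dropping NUL and space padding."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def append_id3v1(audio, title, artist, album, year, genre=255):
    """Return `audio` with an ID3v1 tag appended (fields are truncated/padded)."""
    tag = ID3V1.pack(
        b"TAG",
        title.encode("latin-1"),
        artist.encode("latin-1"),
        album.encode("latin-1"),
        year.encode("latin-1"),
        b"",      # comment left empty in this sketch
        genre,    # 255 is conventionally "no genre set"
    )
    return audio + tag


def read_id3v1(data):
    """Return the ID3v1 fields found at the end of `data`, or None if absent."""
    if len(data) < ID3V1.size:
        return None
    marker, title, artist, album, year, _comment, genre = ID3V1.unpack(
        data[-ID3V1.size:]
    )
    if marker != b"TAG":
        return None
    return {
        "title": _text(title),
        "artist": _text(artist),
        "album": _text(album),
        "year": _text(year),
        "genre": genre,
    }
```

      Because the tag is just bytes at the end of the file, it survives a trip through any filesystem or upload that preserves the file contents, which is the point being made here.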

  4. Auto deleting files... by klubar · · Score: 4, Interesting

    I've always thought it would be useful if you could mark a file as automatically deleting at a certain date. If you create a temporary file, it would be nice to flag it as "delete after 60 days" so it doesn't need attention in the future. (The same functionality would be really useful for emails... I want to save this email until after the event (or whatever it's about) and then have it automatically deleted.) I once saw the file functionality on a custom Cray operating system in the 1977.

    1. Re:Auto deleting files... by The+MAZZTer · · Score: 1

      If your file system starts to fill up Disk Cleanup will wipe your Temp folder of old files. So this functionality sort of already exists, it's just not automatic (you have to answer the "disk is filling up" prompt) and doesn't delete files unless it needs to.

    2. Re:Auto deleting files... by The+MAZZTer · · Score: 1

      And of course I'm thinking of Windows, not 'nix, but Ubuntu does have a similar tool and functionality as well. Not sure about other 'nixes.

    3. Re:Auto deleting files... by somersault · · Score: 1

      I think "archiving" rather than deleting after a certain date would be vastly preferable. Then just clear out the archive folder manually from time to time. Otherwise there's too much potential for user error/confusion. In a networked environment especially that could cause some headaches.

      I think with the email thing you should set up a meeting/calendar event rather than have a plain email. Then you get reminders of it up until the event, and afterwards it's out of your hair, unless you want to review it on the calendar.

      --
      which is totally what she said
    4. Re:Auto deleting files... by fuzzyfuzzyfungus · · Score: 1

      That's the sort of function that would probably be the work of a weekend to add if you just wanted it to work on your computer (crudest case, just a wrapper that automatically creates a cron job/scheduled task to delete at the desired time in the future; if you wanted it to still work if the file is moved/copied you'd need a metadata facility and a scrubber task that kills files at their marked expiration times).

      Now, on the other hand, if you want a system that is even possible for random 3rd party systems and devices to voluntarily adhere to (even after http uploads, metadata getting sheared off by a trip across a fat32 flash drive, handling for both HFS+ and NTFS metadata storage variants, support for mobile devices, web services where files are blobs in a DB, etc.), You Have Fun With That, as they say.

      And, of course, if you want 'trusted' expiration on random 3rd party systems, nothing short of a dystopian step back from general purpose computing will do...

    5. Re:Auto deleting files... by grasshoppa · · Score: 1

      I'd take this one step further: there are certain classifications of files that need special behaviors (encryption, reliability, etc.) as well as special permissions (if it's an evidence file, it belongs to the investigators by default, etc.). That's why I'd like file system tags, where if you tag a file with "HR Policy", it will auto-assign the correct permissions AND assign special behaviors (file is deleted 7 years after implementation, it's not allowed to be copied off the network, it's encrypted, etc.).

      One of the largest issues I've seen in corporate culture is the proliferation of data in file shares. Often, file shares are filled with cruft that no one has any idea if anyone else is using. This leads to "Don't touch it!", and file share quotas that balloon out of control. An automated system of this nature would help curb that.

      --
      Mod me down with all of your hatred and your journey towards the dark side will be complete!
    6. Re:Auto deleting files... by deniable · · Score: 2

      Lots of people have 'temp' files that don't live in %TEMP%. I had to move *important* data for one of our units a couple of months ago and saw a file 'To do December 2002' or some such. Things like that should have expiry dates.

    7. Re:Auto deleting files... by deniable · · Score: 1

      When you get to that level you use document management systems that have security and features like retention and disposal schedules. As for cruft, we end up not being allowed to delete files because nobody can tell us who owns it or can make a decision.

    8. Re:Auto deleting files... by Gordonjcp · · Score: 1

      You could probably do this with extended attributes and then something like inotify to watch when a file has an expiry date set. Every so often a cron job would check if something has passed its expiry date and delete or archive it.
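      A minimal sketch of the periodic-check half of that idea, assuming Linux and a filesystem mounted with user extended attributes enabled. The attribute name `user.expires` and the function names are invented for this example; the inotify part is left out, and the sweep would be what the cron job runs.

```python
import os
import time

# Hypothetical attribute name; "user." is the namespace open to ordinary users on Linux.
EXPIRY_ATTR = "user.expires"


def set_expiry(path, when):
    """Record an expiry time (seconds since the epoch) on the file itself."""
    os.setxattr(path, EXPIRY_ATTR, str(int(when)).encode("ascii"))


def get_expiry(path):
    """Return the stored expiry time, or None if the file has none."""
    try:
        return int(os.getxattr(path, EXPIRY_ATTR).decode("ascii"))
    except (OSError, ValueError):
        return None  # attribute missing, unsupported, or not a number


def sweep(root, now=None):
    """Delete every file under `root` whose expiry time has passed."""
    now = time.time() if now is None else now
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            expiry = get_expiry(path)
            if expiry is not None and expiry <= now:
                os.remove(path)
                removed.append(path)
    return removed
```

      The attribute follows the file on a rename within the same filesystem, but most copy tools and non-xattr filesystems will drop it unless told otherwise, so this only works reliably on a machine you control.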

    9. Re:Auto deleting files... by argStyopa · · Score: 1

      I'd bet you could get a government subsidy for developing this file system.

      Bonus points (from the gov'ts point of view) if you could set the delete-date to be file-creation date.

      --
      -Styopa
    10. Re:Auto deleting files... by grasshoppa · · Score: 1

      I have looked into those, but they are almost universally more complex than needed. My user base tends to be a bit...low on the technical knowledge. I don't want to introduce a complex system like that, only to be stuck in perpetual training hell for the duration of its deployment.

      A simple drag/drop system, where I can "tag" a file would be the simplest system I can envision that would do the job I need. It certainly wouldn't be as full featured as many of the document management systems out there, but it would be within reach of the average user. And we wouldn't have to keep training the same material, over and over again.

      --
      Mod me down with all of your hatred and your journey towards the dark side will be complete!
    11. Re:Auto deleting files... by wisnoskij · · Score: 1

      Neither of those would be a change to the concept of a file.
      The "delete after 60 days" would simply be accomplished by sending the request to an application that would store all requests and then every so often check to see if a file needs deleting. This is really the only way to accomplish your functionality, because if you stored the data in the file then the entire HD would have to be scanned constantly looking at all the files until one went out of date.
      The same would happen with emails except that the deletion app in question would pretty much have to be the email manager.
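      Roughly, such a request-storing application could look like the sketch below. The class name, the JSON file format, and the method names are all assumptions made for illustration; a real tool would also want locking and some way to notice renamed files.

```python
import json
import os
import time

SECONDS_PER_DAY = 86400


class ExpiryRegistry:
    """Keeps 'delete this path after time T' requests in one small JSON file."""

    def __init__(self, db_path):
        self.db_path = db_path

    def _load(self):
        try:
            with open(self.db_path, encoding="utf-8") as fh:
                return json.load(fh)
        except FileNotFoundError:
            return {}

    def _save(self, entries):
        with open(self.db_path, "w", encoding="utf-8") as fh:
            json.dump(entries, fh)

    def request(self, path, days, now=None):
        """Ask for `path` to be deleted `days` days from now."""
        now = time.time() if now is None else now
        entries = self._load()
        entries[os.path.abspath(path)] = now + days * SECONDS_PER_DAY
        self._save(entries)

    def sweep(self, now=None):
        """Delete every registered file whose deadline has passed."""
        now = time.time() if now is None else now
        entries = self._load()
        removed = []
        for path, deadline in list(entries.items()):
            if deadline <= now:
                try:
                    os.remove(path)
                    removed.append(path)
                except FileNotFoundError:
                    pass  # moved or already gone; the registry only knows the path
                del entries[path]
        self._save(entries)
        return removed
```

      Only the registry is consulted on each pass, so nothing has to scan the whole disk. The cost is that the request is tied to a path rather than to the file, so a moved file silently escapes deletion.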

      --
      Troll is not a replacement for I disagree.
    12. Re:Auto deleting files... by Anonymous Coward · · Score: 0

      The temp file API in Win32 actually allows temp files to be deleted automatically, it's just that nobody uses it properly.

    13. Re:Auto deleting files... by Anrego · · Score: 1

      On *nix, this tends to be the idea of the /tmp directory. A consistent place to put stuff that is temporary, and yes, many different strategies for keeping it clean.

      The approach I use (on Gentoo, but this can work on any distro), is my /tmp file system is a ramdisk. In other words, it's effectively cleared out every time I reboot (which isn't often.. but I have a lot of ram and temp files are small..). This is also better if you use a SSD. Generally files in /tmp are safe to delete on reboot, at the very least I've never had a problem.

      Also telling firefox (or whatever browser you prefer) to store all cache in ram is also a good way to prevent a buildup of all these silly files. If you restart your browser a lot, you can also set the cache directory to a ramdisk (similar to /tmp), however I've never found this necessary.

    14. Re:Auto deleting files... by gzipped_tar · · Score: 1

      Modern distros do that by calling a script from cron that invokes tmpwatch in a way specified by the user. But that's not the point. The point is that UNIX philosophy values the property of "doing one thing and doing it well" and abhors over-integration of functionalities. If you want automated, event-driven operations on the files, grab an application that exclusively performs the job. Just leave the filesystem and OS alone.

      --
      Colorless green Cthulhu waits dreaming furiously.
    15. Re:Auto deleting files... by Anonymous Coward · · Score: 1

      if only current files had an 'archive' bit. that would clear things up.

    16. Re:Auto deleting files... by Anonymous Coward · · Score: 0

      good point, but i think the general computing trend these days is that "nothing should be deleted", and given the current size of harddrives, this trend is all the more appealing:

      you cannot delete your online account (virtually all websites implement this)
      you cannot delete your payment history
      you cannot delete your publically made posts
      etc...

      the result is that there are giant data stores building up everywhere, containing an incredible amount of information that is completely fucking useless and should be deleted, except that companies take the information, do math on it, and sell the results to advertisers.

      take this post for example, after the story is archived, and since it probably wont be modded > 3, its probably useless and should be discarded. but it wont be. it will exist until the end of slashdot.

      stupid.

    17. Re:Auto deleting files... by Khopesh · · Score: 2

      I have coworkers that do this (on Posix systems). They prefix temporary files' names with commas. Then all they need is a daily cron job like this:

      0 4 * * * find $HOME -name ',*' -mtime +30 2>/dev/null |xargs rm -rf

      Voilà!

      --
      Use my userscript to add story images to Slashdot. There's no going back.
    18. Re:Auto deleting files... by grumbel · · Score: 1

      I would like to see it done the other way: Never ever delete anything. I'd like to see file systems getting the ability to archive content, so that it can still be retrieved, but doesn't clutter up the current workspace. Today drives are so gigantic that one is never ever going to fill them up with regular text documents. Considering how much effort gets put into writing these documents, it's rather ridiculous how insecurely they are stored: two wrong clicks can destroy days or weeks of work, and no current OS has a proper built-in versioning/undo system at the file system level.

      I'd like to see the filesystem becoming more a permanent log-book of the user's activity, saving not only every document version, but essentially every undo-step, every visited webpage and just plain everything the user does. For issues of privacy one could still provide a "wipe" like tool or something similar to Chrome's Incognito mode. We have the storage to essentially never lose a file ever again, yet file systems are so primitive that they don't take any benefit of that and data loss is still a regular occurrence.

    19. Re:Auto deleting files... by GreatBunzinni · · Score: 1

      If you need to auto-delete files after $TIME then schedule a cron job to delete the file on a specific date. You don't need a completely refurbished file system that hijacks your files in order to do trivial tasks.

      --
      Slashdot, fix your code or at least hire someone who is competent at it to do it for you.
    20. Re:Auto deleting files... by Anonymous Coward · · Score: 0

      Um, what? Are we talking about the same "temp file API" that includes GetTempFileName, the documentation for which says "to delete the file, call DeleteFile"?

    21. Re:Auto deleting files... by Arlet · · Score: 1

      They better not have filenames with spaces in them, like ", ~"
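      The cron line pipes names through xargs, which splits on whitespace by default, so a name like ", ~" gets torn into pieces. One way around that is to never turn the names into a text stream at all. The sketch below does the same comma-prefix sweep in Python; the function name and parameters are made up for this example, it only removes regular files (not directories, unlike `rm -rf`), and its age cutoff is a plain "older than N days", which is close to but not exactly what `find -mtime +30` computes.

```python
import os
import time

SECONDS_PER_DAY = 86400


def sweep_comma_files(root, max_age_days=30, now=None):
    """Delete files under `root` whose name starts with ',' and that are too old."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.startswith(","):
                continue
            # The name is passed around as one string, so spaces are harmless.
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime < cutoff:
                    os.remove(path)
                    removed.append(path)
            except OSError:
                continue  # vanished or not removable; skip it
    return removed
```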

    22. Re:Auto deleting files... by Anonymous Coward · · Score: 0

      Expiration date was explicit metadata on at least one file system I've used; Digital's VMS. It also had explicit versioning

    23. Re:Auto deleting files... by nine-times · · Score: 1

      As an IT guy, I'd hate this feature. Users would set this on all kinds of files because they thought they were doing something clever, and then it would be my problem to figure out how to recover them.

    24. Re:Auto deleting files... by Anonymous Coward · · Score: 0

      This can be done better on modern posix systems using posix extended attributes, this has the advantage that the filenames remain unchanged.

    25. Re:Auto deleting files... by Anonymous Coward · · Score: 0

      Star Wars (1977) called. It wants its The back. I see you have it.

    26. Re:Auto deleting files... by Bent+Mind · · Score: 1

      I'm also on Gentoo. However, my /tmp is on the hard drive. Gentoo has an automatic setting that wipes /tmp upon booting the system.

      --
      Request a Linux Shockwave player here: http://www.macromedia.com/support/email/wishform/
  5. Translation: All Our Data are Belong to Them by Anonymous Coward · · Score: 1

    The current understanding of a file is too conducive to local storage and user ownership for giant corporations who want to assume control of our data and rent it back to us for monthly fees or advertising intrusions.

    The delete function is a feature. It means I do not want that data to exist any more. I wonder why Google or Facebook might have a problem with that.

  6. ummm by Anonymous Coward · · Score: 1

    How is this any different from Files-11 (VMS native FS), NTFS, or HFS+?

  7. Hmm where have I seen by OzPeter · · Score: 2

    We suggest that one aspect of this adaptation is to encompass metadata within a file abstraction

    this before? Are resource forks coming back into vogue?

    --
    I am Slashdot. Are you Slashdot as well?
    1. Re:Hmm where have I seen by Prosthetic_Lips · · Score: 1

      That was the first thing I thought of, also. Sounds like the original Mac file system and their "file with payload" idea. It had good aspects and bad aspects, but it got weird when your file type got wrongly assigned somehow.

    2. Re:Hmm where have I seen by Jonner · · Score: 1

      We suggest that one aspect of this adaptation is to encompass metadata within a file abstraction

      this before? Are resource forks coming back into vogue?

      A little known feature that NTFS has had for a long time is support for multiple streams per file, not just two. Some Microsoft apps use that I think, but I doubt many third party ones do. I was more interested in Reiser's approach of being able to treat the same name as either a normal file or a directory containing more ordinary files of metadata. Unfortunately, the fact that he's a convicted murderer and doesn't seem to get along that well with other people generally has limited the dissemination of his interesting ideas. I can't even seem to find copies of his essays about ideas for future work on Reiser4 that used to be available at namesys.com.

  8. an especial need by blackmesadude · · Score: 2

    really?

    1. Re:an especial need by Anonymous Coward · · Score: 0

      Si.

    2. Re:an especial need by Anonymous Coward · · Score: 0

      Perhaps he meant to write e-special?

  9. Also, by isopropanol · · Score: 1

    How is this any different from Files-11 (VMS native FS), NTFS, or HFS+?

    (re-posting my AC comment, logged in this time)

    1. Re:Also, by SJHillman · · Score: 1

      Those are file systems, which is how the OS keeps track of the files - not the files themselves. My understanding is that they're talking about the files themselves. Let's try a bad car analogy. The file system is where the cars are kept. It can be a parking lot, a garage or a field marked with cones. The cars are kept there in some sort of order so that you can go back to find your car later. The files are the cars themselves. You can take a car from a parking garage to a parking lot (e.g. copying from ext3 to NTFS). What they're thinking about is the cars, not the parking lot. Then again, maybe I completely misunderstand this all.

    2. Re:Also, by GIL_Dude · · Score: 1

      And, they are saying the metadata should travel with the file - and not be a bolted on construct supported in different ways by different file systems. To continue your analogy, the car should still say "Toyota" and "Camry" on it even when it is moved from the parking garage to the parking lot. It should still have other metadata like "2006", the info on the door sticker like the curb weight, etc. Past implementations of this at an OS level have been a bit hit or miss with some file systems supporting an add on structure for meta data and others not supporting it. (This is not to say that some file formats don't already have this built in - certainly some do).

    3. Re:Also, by Millennium · · Score: 2

      The thing is, those particular file systems also use a different notion of what a file is than what Unix folks are used to. One major example of this is that on these systems, a file can contain multiple streams of data, which both NTFS and HFS+ call forks. NTFS doesn't use forks much, but Macs used them heavily in the pre-OSX days (not so much anymore).

      Files-11 and HFS+ also support a notion of files as being containers of discrete data records, rather than streams of bytes. Again, Macs used this concept heavily in the pre-OSX days, mostly when dealing with a file's resource fork, but it's not as common anymore.

    4. Re:Also, by petermgreen · · Score: 1

      Files-11 and HFS+ also support a notion of files as being containers of discrete data records, rather than streams of bytes. Again, Macs used this concept heavily in the pre-OSX days, mostly when dealing with a file's resource fork, but it's not as common anymore.

      And the reason it's not as common is that it made cross platform compatibility a nightmare. Everything can handle files that are a simple sequence of octets but many platforms and/or filesystems can't support anything more than that and when they can they often do so in incompatible ways.

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    5. Re:Also, by Nadaka · · Score: 1

      We already have files that include their own metadata. It's called using a file type that includes its own metadata. A JRXML, for example, will remember the zoom level and view position of iReports when it was last saved. A DOCX file stores the registered user of MS Word and the date of creation, among other things. However, every file type is different and there can be no universal metadata format. A simple text file can only contain the metadata you put in it, and there is no need to make a simple text file impossible.

    6. Re:Also, by MikeBabcock · · Score: 1

      While I agree that we need xattr style metadata that describes the file in some circumstances, I think most use-cases are simply misunderstanding the concept of a proper file structure.

      That said, it would be great to be able to do interesting things from the *nix command-line like:

      cat somecomplexfile.dat}meta=text | grep somethinginteresting

      Where "}" is a delimiter I invented that allows selection of subsections of a file.

      A formatted data file with subsections and metadata can be a pain to manipulate with good old CLI tools, and being able to specify a subsection of that data to parse or manipulate while ignoring the rest would be quite nice.
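
A rough sketch of what such a selector could do. The "}" syntax is the commenter's invention, not an existing shell feature, so everything here is hypothetical: the read_section helper, the "path}section" spelling, and the assumption that the structured file is a ZIP container whose members act as subsections.

```python
import os
import tempfile
import zipfile


def read_section(spec: str) -> bytes:
    """Read a whole file, or one named subsection of a container file.

    `spec` uses the hypothetical "path}section" syntax; the container is
    assumed (for this sketch only) to be a ZIP archive.
    """
    path, sep, section = spec.rpartition("}")
    if not sep:
        # No selector given: behave like plain `cat` and return the raw bytes.
        with open(spec, "rb") as f:
            return f.read()
    with zipfile.ZipFile(path) as archive:
        return archive.read(section)


def demo() -> list:
    """Build a small container, then "grep" only its meta section."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "somecomplexfile.dat")
        with zipfile.ZipFile(path, "w") as archive:
            archive.writestr("data", b"\x00\x01 binary payload")
            archive.writestr("meta", "author: alice\nsomethinginteresting: yes\n")
        text = read_section(path + "}meta").decode("utf-8")
        return [line for line in text.splitlines() if "somethinginteresting" in line]
```

The binary payload never reaches the text filter, which is the convenience the comment is asking for. Doing the same thing in the shell itself would need support from the filesystem or from every tool.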

      --
      - Michael T. Babcock (Yes, I blog)
    7. Re:Also, by Coren22 · · Score: 1

      The ownership portion should even be part of this analogy.

      When I move my car from home to work, I don't suddenly lose ownership and control over it. My license plate doesn't suddenly change to a work license plate.

      The ownership portion of the idea sounds like they are trying to allow you to retain full ownership rights and protection of your files, even when they are stored in the Google, Amazon or whoever's cloud. The files don't magically become Google's property just because they are stored there. Microsoft is trying to design a framework for you to retain your rights.

      --
      APK likes to ask for responses to the same things over and over. Maybe he just likes the responses?
    8. Re:Also, by Guy+Harris · · Score: 1

      The thing is, those particular file systems also use a different notion of what a file is than what Unix folks are used to. One major example of this is that on these systems, a file can contain multiple streams of data, which both NTFS and HFS+ call forks.

      (...and UFS and ZFS on newer versions of Solaris call "extended attributes", even though they are arbitrary-sized named streams.)

      Files-11 and HFS+ also support a notion of files as being containers of discrete data records, rather than streams of bytes.

      Files-11, or, at least, RMS (no, not that RMS :-)), was probably inspired by OS/360 and successors in that regard. However, at the lowest layer of Files-11 (QIO), a file could be accessed as an array of fixed-length blocks; the record-oriented stuff ran atop that (in userland in RSX-11; in, as I remember, executive mode in VMS). It sounds as if you're talking about the Resource Manager in Mac OS; I don't know whether that was implemented atop "resource fork as seekable byte stream" in classic Mac OS, but it's definitely implemented that way in Mac OS X.

    9. Re:Also, by spitzak · · Score: 1

      You can use '/' for "a delimiter I invented that allows selection of subsections of a file"

      This is an OLD idea and I'm not sure why it has never really been implemented. Basically *all* files are directories. They also have a single block of data that you get if you "open" the file and read it. Since there is no difference between a file and a directory, you can reuse the delimiter for both.

    10. Re:Also, by lister+king+of+smeg · · Score: 1

      Sure, we all know that Microsoft wants to enable you and has no selfish motivation to use this against you, like saying you do not own a file and they do. No, they wouldn't make you buy their software to view your own data, or use it to make stronger DRM. Microsoft is just a benevolent force for good, really, really it is.

      --
      ---Saying gnome 3 is better than windows 8 not so much a compliment as it is damning with light praise.
    11. Re:Also, by Grishnakh · · Score: 1

      Which to me sounds like a total waste of time. If you want to "retain your rights", then don't send your files to other people or organizations. This scheme isn't going to "protect" any "rights"; it would be trivial to just strip off or ignore the metadata and look at the file directly if the ownership rights and protections are preventing you from accessing the files.

      The only way to protect data is to keep it to yourself.

    12. Re:Also, by Guy+Harris · · Score: 1

      While I agree that we need xattr style metadata that describes the file in some circumstances, I think most use-cases are simply misunderstanding the concept of a proper file structure.

      That said, it would be great to be able to do interesting things from the *nix command-line like:

      cat somecomplexfile.dat}meta=text | grep somethinginteresting

      Where "}" is a delimiter I invented that allows selection of subsections of a file.

      Or just

      grep somethinginteresting}meta=text

      .

    13. Re:Also, by uninformedLuddite · · Score: 1

      I would still rather use emacs

      --
      The new right fascists are bilingual. They speak English and Bullshit.
  10. In short.. by bitflusher · · Score: 1

    Is a file (a) the original data or (b) the original data plus annotation data held in other databases? What does a user expect when he/she creates a "copy" of a file? I have never seen a discussion like: "I downloaded a picture from facebook and now all the likes and comments are missing!" I suppose it will pop up in the near future.

    1. Re:In short.. by Anonymous Coward · · Score: 0

      Excellent point.
      Something similar happens when end users try to tell me that their music is "in" kazaa so they can't uninstall it or they'll lose their music.

    2. Re:In short.. by buchanmilne · · Score: 1

      Maybe first Facebook can support displaying existing comment fields and meta-data on photos like some other photo hosting sites.

    3. Re:In short.. by deniable · · Score: 1

      I'd like to not have the problem of creation date being later than modified date. Happens a lot when people move files around.
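
A small sketch of where that mismatch can come from, assuming the usual behaviour that a copy is a newly created file while some copy tools carry the old modification time along. Python's shutil.copy2 preserves timestamps and shutil.copy does not; the file names are made up for the example.

```python
import os
import shutil
import tempfile


def copy_and_compare(preserve: bool) -> tuple:
    """Copy a file with an old mtime; return (source_mtime, copy_mtime)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "report.txt")
        dst = os.path.join(tmp, "report-copy.txt")
        with open(src, "w") as f:
            f.write("quarterly numbers\n")
        # Pretend the file was last modified in September 2001 (epoch 1_000_000_000).
        os.utime(src, (1_000_000_000, 1_000_000_000))
        # copy2 carries timestamps along; copy carries only content and mode bits.
        copier = shutil.copy2 if preserve else shutil.copy
        copier(src, dst)
        return os.stat(src).st_mtime, os.stat(dst).st_mtime
```

With preserve=True the copy reports the 2001 modification time even though the copy itself was created just now, which is exactly the "created after modified" situation described above.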

    4. Re:In short.. by dotancohen · · Score: 1

      I have never seen a discussion like: "I downloaded a picture from facebook and now all the likes and comments are missing!"

      I have actually run into this when moving photos from BrilliantPhoto (Windows-only) to F-Spot in 2005, and then again when I moved from F-Spot to Digikam some years later. The solution lies in _embedded_ metadata fields such as IPTC or XMP. Some implementations try to use "sidecar" XML files for metadata. It does not work very well as then _all_ applications which use the file must know about the sidecar files and fully support them. In practice that never happens. Now these researchers want the applications and devices that couldn't figure out sidecar files to understand filesystem-specific metadata?
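
The sidecar problem is easy to reproduce. A minimal sketch, assuming a made-up convention where metadata lives next to the photo in "<photo>.meta.json", and a copy tool that knows nothing about it:

```python
import json
import os
import shutil
import tempfile


def naive_copy_loses_sidecar() -> tuple:
    """Return (photo_was_copied, sidecar_was_copied) after a sidecar-unaware copy."""
    with tempfile.TemporaryDirectory() as tmp:
        album = os.path.join(tmp, "album")
        export = os.path.join(tmp, "export")
        os.mkdir(album)
        os.mkdir(export)
        photo = os.path.join(album, "beach.jpg")
        with open(photo, "wb") as f:
            f.write(b"fake image bytes")
        # Hypothetical sidecar convention for this sketch: "<photo>.meta.json".
        with open(photo + ".meta.json", "w") as f:
            json.dump({"tags": ["beach", "2005"]}, f)
        # A tool that is unaware of sidecars copies only the file it was given.
        shutil.copy(photo, export)
        copied = os.path.join(export, "beach.jpg")
        return os.path.exists(copied), os.path.exists(copied + ".meta.json")
```

Metadata embedded in the file's own bytes (IPTC, XMP) would have survived the same copy, because the tool moves those bytes without needing to understand them.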

      --
      It is dangerous to be right when the government is wrong.
  11. Broader definition? by lorinc · · Score: 0

    Have you ever heard of Unix? You know, that strange system where files are more than just collections of bytes.
    Devices can be files, IPC can be files, even kernel hooks can be modeled by files...

    1. Re:Broader definition? by Anonymous Coward · · Score: 0

      Not really. Devices and IPCs can have file handles, so you can mix them inside select(), but they are not files. Files are collections of bytes, even in UNIX. But file handles can represent non-files. And directories on the file-system can host non-files as well, which is simply a perverse way of giving them names.

      It's little different than in any other OS - all OSes need to solve the "how do I wait on a network packet *or* a message on a pipe at the same time?" problem, so they all have to have an underlying abstraction. In Windows the abstraction is named HANDLE. Windows takes this as far as allowing you to wait on a semaphore and a socket in the same call ... but no-one thinks a semaphore is a "file". Windows also embeds the filesystem namespace inside a larger namespace, rather than declaring the filesystem namespace as the universe and placing non-files into it.
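
To illustrate the "wait on a pipe or a socket at the same time" point, here is a minimal sketch for a POSIX system (on Windows, select() accepts only sockets). The function name is made up; os.pipe, socket.socketpair and select.select are standard library calls.

```python
import os
import select
import socket


def wait_for_first_ready(timeout: float = 1.0) -> str:
    """Wait on a pipe and a socket in one call; report which became readable."""
    pipe_r, pipe_w = os.pipe()
    sock_a, sock_b = socket.socketpair()
    try:
        # Only the pipe gets data, so only the pipe should be reported readable.
        os.write(pipe_w, b"message on a pipe")
        readable, _, _ = select.select([pipe_r, sock_a], [], [], timeout)
        if pipe_r in readable:
            return "pipe"
        if sock_a in readable:
            return "socket"
        return "timeout"
    finally:
        os.close(pipe_r)
        os.close(pipe_w)
        sock_a.close()
        sock_b.close()
```

Neither object is a file on disk; they share only the descriptor abstraction that select() understands.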

    2. Re:Broader definition? by deniable · · Score: 1

      How does Unix ship metadata within an arbitrary file type?

  12. Correct use of files by piripiri · · Score: 0

    In the *NIX world, we don't have many problems with files, as everything is a file. But it's clear that when, in Windows, a directory move is not atomic (each child is moved one after another), I can understand why they say the current implementation is broken.

    1. Re:Correct use of files by Anonymous Coward · · Score: 0

      You're confusing directory cut-and-paste in Explorer and moving a directory using MoveFile(). Explorer does a lot of extra work to support paste properly when the destination is on a separate volume, and for cases where the directory name is already taken in the destination. The underlying OS has MoveFile, which assumes the directory name is not taken in the destination, and only works on the same volume, and it is perfectly atomic. Granted, the Windows GUI could sure as hell use a feature that made it use MoveFile when it could, but flaws in the GUI are not flaws in the underlying filesystem.

    2. Re:Correct use of files by spitzak · · Score: 1

      Wrong. At least at the WIN32 level, you can mv a directory (use the rename() call) and it only changes the entry for the directory itself, not recursively for all the contents. It is not atomic, which is just Microsoft being incredibly stupid, but there are other calls in newer versions of WIN32 that are (Microsoft refuses to fix the rename() function in the library to call these new ones as they want to discourage portability between Unix/Windows).

    3. Re:Correct use of files by Guy+Harris · · Score: 1

      I'm not talking about GUI at all. http://stackoverflow.com/questions/167414/is-an-atomic-file-rename-with-overwrite-possible-on-windows

      That's not just a file-vs-directory issue, it's a rename() vs. MoveFile() issue. If the target exists, rename() attempts to remove it, but MoveFile() fails. That's even true for files. For files, but not directories, MoveFileTransacted() can be told to overwrite the target if it exists (I say "overwrite" because the description of MoveFileTransacted() says "If a file named lpNewFileName exists, the function replaces its contents with the contents of the lpExistingFileName file").

      Oh, and if a required attempt by rename() to remove the destination fails, the rename() fails, so the target directory had better be empty if you're moving something in its place. If you're renaming or moving a directory, and the destination doesn't exist, and the source and destination are on the same file system, both rename() and MoveFile() are atomic, even if the directory being moved is non-empty.

      In any case, it's not as if this is Not A Problem on UN*X and A Problem on Windows, much less being solely due to MoveFile() not supporting atomic moves of directories within a file system if the target name already exists.
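
The two cases discussed above can be tried from Python, whose os.rename wraps the platform's rename call. This is only a sketch of the observable behaviour on one filesystem (directory names are made up); it does not prove atomicity.

```python
import os
import tempfile


def rename_directory_demo() -> dict:
    """Rename a non-empty directory, first to a free name, then onto an occupied one."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "project")
        os.mkdir(src)
        with open(os.path.join(src, "notes.txt"), "w") as f:
            f.write("contents travel with the directory\n")

        # Case 1: the destination does not exist. Only the directory entry
        # changes; the children are not moved one by one.
        moved = os.path.join(tmp, "project-renamed")
        os.rename(src, moved)
        results["contents_kept"] = os.path.exists(os.path.join(moved, "notes.txt"))

        # Case 2: the destination is an existing, non-empty directory.
        # The rename is refused rather than merging or overwriting.
        occupied = os.path.join(tmp, "occupied")
        os.mkdir(occupied)
        with open(os.path.join(occupied, "other.txt"), "w") as f:
            f.write("in the way\n")
        try:
            os.rename(moved, occupied)
            results["overwrote_nonempty"] = True
        except OSError:
            results["overwrote_nonempty"] = False
    return results
```

The second case fails on both Unix-like systems and Windows, although the specific error differs between them.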

  13. Queue DRM in... by Anonymous Coward · · Score: 0

    3... 2... 1...

  14. Keeping collections orgainized by Anonymous Coward · · Score: 0

    Meta data would really help keep the porn collection sorted.

  15. I like fuzzy folder structures... by Oswald+McWeany · · Score: 1

    I like fuzzy folder structures where I can tag, or label files and find them in any tag/label.

    Like one does with g-mail or photo managing software. If I have schematics for the pentagon- I want to be able to tag those files as "Pentagon" and "Schematics" and "Operation Zesty Lemon". No matter which tag I look under I can retrieve my files easily.

    --
    "That's the way to do it" - Punch
    1. Re:I like fuzzy folder structures... by wertarbyte · · Score: 5, Interesting

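      # Move the document into a "heap" named by its hash, then hard-link it under each tag.
      # heap/ and tags/ must be on the same filesystem for the hard links to work.
      DOCUMENT=~/myschematics.pdf
      SHAID=$(sha512sum "$DOCUMENT" | cut -f1 -d' ')
      mkdir -p heap tags/Schematics tags/Pentagon tags/Operation_Zesty_Lemon
      mv "$DOCUMENT" "heap/$SHAID"

      # One hard link per tag; every name refers to the same inode.
      ln "heap/$SHAID" tags/Pentagon/
      ln "heap/$SHAID" tags/Schematics/
      ln "heap/$SHAID" tags/Operation_Zesty_Lemon/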

      --
      Life is just nature's way of keeping meat fresh.
    2. Re:I like fuzzy folder structures... by Anonymous Coward · · Score: 0

      Now all you have to do is find a way to handle files with the same name without renaming them to something useless like a hash.

    3. Re:I like fuzzy folder structures... by L4t3r4lu5 · · Score: 1

      Now all you have to do is find a way to handle files with the same name without renaming them to something useless like a hash.

      Sounds like you could use an index.

      Wow, that was a difficult problem to solve.

      --
      Finally had enough. Come see us over at https://soylentnews.org/
    4. Re:I like fuzzy folder structures... by Inda · · Score: 1

      And what if you have a file that doesn't fall into the tag "Pentagon", "Schematics" or "OZL"?

      Ah yes, that's what the "Misc" tag is for. It's sort of a catch-all tag that 99.8% of people use. Don't delay, create it today.

      --
      This post contains benzene, nitrosamines, formaldehyde and hydrogen cyanide.
    5. Re:I like fuzzy folder structures... by dotancohen · · Score: 2

      That will break as soon as I edit the file with a non-supported application (that doesn't know to update the stored SHA1 hash). This is why it is important to implement the feature at the filesystem level.

      --
      It is dangerous to be right when the government is wrong.
    6. Re:I like fuzzy folder structures... by Just+Some+Guy · · Score: 1

      I think you just re-invented Git.

      --
      Dewey, what part of this looks like authorities should be involved?
    7. Re:I like fuzzy folder structures... by Anonymous Coward · · Score: 0

      I'm happy with what we have and like you I add anything extra as needed.

      I've often thought it might be interesting to have a universal open standards file header that can store metadata portably. And a database driven FS capable of providing all manner of indexes and retrieval options. It would be easy to set all of this up. Unfortunately, maintaining all the metadata is impractical. You also have much more complexity, performance issues, privacy concerns, a huge amount of FS overhead data, etc.

      Sometimes simple is best.

    8. Re:I like fuzzy folder structures... by Anonymous Coward · · Score: 0

      Your suggestion doesn't provide the user with a way to perform boolean operations with tags, which is the whole point of having them. Another issue is that your trick breaks if anyone edits a hard link with a program which doesn't preserve the original file, such as VIM.
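
Boolean queries can be layered on top of the hard-link layout, at least as a sketch: files linked under several tag directories share an inode, so an AND query is a set intersection over inode numbers. The function name and the directory layout are assumptions for illustration.

```python
import os
import tempfile


def files_with_all_tags(tags_root: str, *tags: str) -> set:
    """Return the inode numbers that are hard-linked under every given tag."""
    inode_sets = []
    for tag in tags:
        tag_dir = os.path.join(tags_root, tag)
        inode_sets.append({os.stat(entry.path).st_ino for entry in os.scandir(tag_dir)})
    return set.intersection(*inode_sets) if inode_sets else set()


def demo() -> tuple:
    """Tag file "a" twice and file "b" once, then query the tags."""
    with tempfile.TemporaryDirectory() as tmp:
        heap = os.path.join(tmp, "heap")
        tags = os.path.join(tmp, "tags")
        os.makedirs(heap)
        for tag in ("Pentagon", "Schematics"):
            os.makedirs(os.path.join(tags, tag))
        for name in ("a", "b"):
            with open(os.path.join(heap, name), "w") as f:
                f.write(name)
        os.link(os.path.join(heap, "a"), os.path.join(tags, "Pentagon", "a"))
        os.link(os.path.join(heap, "a"), os.path.join(tags, "Schematics", "a"))
        os.link(os.path.join(heap, "b"), os.path.join(tags, "Schematics", "b"))
        both = files_with_all_tags(tags, "Pentagon", "Schematics")
        only_a = both == {os.stat(os.path.join(heap, "a")).st_ino}
        return only_a, len(files_with_all_tags(tags, "Schematics"))
```

The second objection still stands: an editor that saves by writing a new file and renaming it over the old name creates a new inode, and the other tag directories keep pointing at the old content.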

    9. Re:I like fuzzy folder structures... by djdanlib · · Score: 1

      That's clever and might work as a stopgap solution, but it's still waaaaaay more cumbersome than the GMail interface and requires you to use a terminal in a particular breed of OS. Not to mention all the maintenance you have to do when you rename or delete a file... plus having to do this for every file you create... It could get out of hand. I'd like to see it built into something more user-friendly, and I bet someone could do it. Ubuntu might be a good place to do that. Imagine having a "we had it first" for something the common mom & pop user would like!

    10. Re:I like fuzzy folder structures... by wertarbyte · · Score: 1

      There is no need to update the SHA1 hash. It's only needed to create a unique storage ID, and it should probably just be chosen randomly instead of generated from the file content, to avoid collisions if files start from the same template.
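
The difference is easy to show in a few lines. In this sketch (helper names are made up), two files that start from the same template have identical bytes and therefore the same content hash, while a random identifier is distinct for each file regardless of content.

```python
import hashlib
import uuid


def content_id(data: bytes) -> str:
    """Storage name derived from the bytes: identical content gives identical names."""
    return hashlib.sha512(data).hexdigest()


def random_id() -> str:
    """Storage name chosen at random (a version 4 UUID), independent of content."""
    return uuid.uuid4().hex


template = b"Dear ____,\n"
same_name = content_id(template) == content_id(template)   # two copies collide
different_names = random_id() != random_id()                # random IDs do not
```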

      --
      Life is just nature's way of keeping meat fresh.
    11. Re:I like fuzzy folder structures... by dotancohen · · Score: 1

      What you describe is a GUID, not a hash:
      http://en.wikipedia.org/wiki/Globally_unique_identifier

      GP said "hash".

      --
      It is dangerous to be right when the government is wrong.
  16. Are they confusing form with function? by petes_PoV · · Score: 1, Insightful

    A file is essentially just a collection of data - no more and no less. To try and add attributes to that makes little sense and seems as futile as trying to say that each collection of molecules should have a tag saying what it is, who it belongs to and what it's for. Sure, you can add abstractions and structure on top of the basic form, but when you do that you are adding a layer - not redefining the basic building block.

    --
    politicians are like babies' nappies: they should both be changed regularly and for the same reasons
    1. Re:Are they confusing form with function? by hedwards · · Score: 1, Insightful

      To be honest, this sounds like MS is inventing something that Apple already invented. Apple has had forked files for how many years now? With one fork for the data and a resource fork for the icon and a few related pieces of information.

      Personally, I don't like it: it's non-standard and requires special steps to work with at times, and I don't really understand why it's needed in the first place. If it's really that big of a problem you can always zip up the metadata file and the data file and call it a day, but for most purposes I'd rather the data not get corrupted when the metadata does.

    2. Re:Are they confusing form with function? by Narcocide · · Score: 1

      Confusing it? No.

      Purposefully obscuring it? Yes, remorselessly.

    3. Re:Are they confusing form with function? by SuricouRaven · · Score: 2

      NTFS supports the same thing, it's just that hardly anyone ever uses it. Including Microsoft.

    4. Re:Are they confusing form with function? by ultranova · · Score: 1

      But files are not molecules, they are a sequence of bytes. And we do exactly this with other sequences of bytes; it's the whole idea behind object-oriented programming. So, one possible way of "extending" files would be for them to define a type and type-dependent operations; for example, an image file could define "getWidth", "getHeight" and "get24ColourRectangle" functions for reading it, a text processor file could define "getContentsAsAnUtf8String", etc.

      Whether or not this would be a good idea is another matter. And the Microsoft proposal, at the very least, seems to be yet another attempt to push remote-delete DRM ("there is an especial need to support the notion of 'ownership' that adequately serves both users and engineers as they engage with the world of networked sociality").

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

    5. Re:Are they confusing form with function? by MikeBabcock · · Score: 1, Informative

      NTFS already has resource forks as well. Almost nobody uses them but they're there.

      --
      - Michael T. Babcock (Yes, I blog)
    6. Re:Are they confusing form with function? by 0123456 · · Score: 1

      So, one possible way of "extending" files would be for them to define a type and type-dependent operations; for example, an image file could define "getWidth", "getHeight" and "get24ColourRectangle" functions for reading it, a text processor file could define "getContentsAsAnUtf8String", etc.

      How is a file going to define any kind of operation? Are you seriously suggesting we should run arbitrary code from some random file downloaded off the Internet?

    7. Re:Are they confusing form with function? by Anonymous Coward · · Score: 0

      Virii does!

    8. Re:Are they confusing form with function? by jelle · · Score: 0

      MSDOS and its FAT filesystem (and their predecessors) had it too, and called them 'file extensions'. They used things like 'exe' for the apps, 'ico' for the related icons, 'jpg' for the photo, 'txt' for the comments, 'doc' for the related documents, 'bat' for task descriptions, etc, etc.

      They just aren't used like that much...

      And instead of using what already exists, it's much better to reinvent the wheel and give it a whole new name.

      (/sarcasm, or not?)

      --
      --- Hindsight is 20/20, but walking backwards is not the answer.
    9. Re:Are they confusing form with function? by corbettw · · Score: 1

      A file is essentially just a collection of data - no more and no less. To try and add attributes to that makes little sense and seems as futile as trying to say that each collection of molecules should have a tag saying what it is, who it belongs to and what it's for.

      I disagree. I think being able to tag individual files using metadata (perhaps stored on the inode for the file, perhaps in a relational database, perhaps in some other form) could be incredibly useful. For instance, suppose you have a number of photos on your local PC. Being able to tag each photo with information on who the members are in that photo, in such a way that when you transfer the photo to someone else that information is saved and transmitted, too, would be incredibly useful.

      --
      God invented whiskey so the Irish would not rule the world.
    10. Re:Are they confusing form with function? by Anonymous Coward · · Score: 0

      About the only use case for resource forks that I've noticed, is to attach metadata indicating from which 'security zone' the file originated. This is the mechanism by which you can get a warning that a file has originated from another machine than yours.

      Just in case anyone's interested is all.

    11. Re:Are they confusing form with function? by petes_PoV · · Score: 1

      Yes, but that metadata (name, size, creation/modification etc.) consists of attributes held outside of the file. The file itself is just pure data and I believe should stay that way.
      As soon as you start defining a format for files, you run into trouble when someone wants to add another feature/attrib. Do you maintain backwards compatibility with "old" file formats, or do you need to create a new format and go through a standards definition process?

      Then you find that different people interpret the file-attributes in different ways. Even if there are standards (which Microsoft is so, so wonderfully good at adhering to, they never add their own proprietary extensions at all ;)) they will get buggy implementations that will need fixing or working around.

      --
      politicians are like babies' nappies: they should both be changed regularly and for the same reasons
    12. Re:Are they confusing form with function? by jandrese · · Score: 1

      I think the biggest user of those forks is viruses trying to hide their data.

      --

      I read the internet for the articles.
    13. Re:Are they confusing form with function? by MikeBabcock · · Score: 1

      I believe you misunderstand the concept of file streams.

      Extensions are a form of file typing, like MIME is. The extension you mention above is just a part of the naming of the file. Streams are not (although they can be accessed as such from applications that do not understand streams natively, such as notepad).

      --
      - Michael T. Babcock (Yes, I blog)
    14. Re:Are they confusing form with function? by Anonymous Coward · · Score: 0

      Your data has no attributes? That's some strange data you have.

    15. Re:Are they confusing form with function? by Anonymous Coward · · Score: 0

      A file is essentially just a collection of data - no more and no less. To try and add attributes to that makes little sense

      Agreed.

      File systems already provide all the metadata we need for general file usage. A file has a name, size, date of last modification, and a read-only attribute. Some file systems provide some more, and that's fine. But I have found that the basic attributes are all I really need in general.

      All other metadata is stored within the file's data. For example, a HTML file may contain metadata -- such metadata is correctly implemented as part of the HTML format.

      For those interested in better metadata (a praiseworthy goal), I recommend exploring standard systems for defining metadata within the file's data, such as the Dublin Core initiative. These metadata systems will give you the file information you want, without destroying the simplicity and generality of existing file systems.

    16. Re:Are they confusing form with function? by ultranova · · Score: 1

      How is a file going to define any kind of operation? Are you seriously suggesting we should run arbitrary code from some random file downloaded off the Internet?

      Like I said, files would include type info (for example MIMEtype). This would allow the operating system to load the appropriate library to handle the operations the file supports. Even including the code in the file itself is not a problem if we run it in a separate memory space with no access to anything except the byte sequence of the file and the input/output buffer of the operation.
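
A minimal sketch of the first variant, where the type info selects code the system already has rather than code shipped inside the file. The registry, the operation names and the call helper are hypothetical; the only real format detail used is that a PNG starts with an 8-byte signature followed by an IHDR chunk holding width and height as big-endian 32-bit integers.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple:
    """Return (width, height) read from the IHDR chunk of a PNG byte sequence."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    return struct.unpack(">II", data[16:24])


def text_as_utf8(data: bytes) -> str:
    """Decode the byte sequence as UTF-8 text."""
    return data.decode("utf-8")


# Hypothetical registry: declared MIME type -> operations that type supports.
OPERATIONS = {
    "image/png": {"getSize": png_size},
    "text/plain": {"getContentsAsAnUtf8String": text_as_utf8},
}


def call(mime_type: str, operation: str, data: bytes):
    """Look up an operation by the file's declared type and run it on the bytes."""
    return OPERATIONS[mime_type][operation](data)
```

An unknown type or an unsupported operation simply raises KeyError here. Sandboxing code that arrives inside the file, as the second variant proposes, is a much harder problem and is not attempted in this sketch.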

      --

      Forget magic. Any technology distinguishable from divine power is insufficiently advanced.

  17. Needed: Unlimited Most Recent Files List by Compaqt · · Score: 1

    Every so often, someone steps up to the plate to get rid of the file metaphor because people can't find their files.

    But they don't need to abstract away the notion of files.

    Here's what to do: Give us an unlimited Most Recently Used (MRU) list. That's both for files and folders. Not the 9 or so in OpenOffice. How much space would it take to save some inodes?

    You should be able to go back in time and answer the question "What file was I working on a week ago?"

    If you do that, you might not even need continual disk-thrashing full indexing.

    --
    I'm not a lawyer, but I play one on the Internet. Blog
    1. Re:Needed: Unlimited Most Recent Files List by jedidiah · · Score: 1

      It could stand to be a bit more granular than that though.

      An MRU per directory could be very handy. Is very handy, in fact. I have implemented this myself for certain use cases.

      Metadata is useful but there's really no good way to handle it that won't break things and serve as a compatibility barrier. Simpler abstractions are useful because there are fewer moving parts, fewer things that can go wrong, and fewer options for Vendor A to do something different from Vendor B.

      Everyone pointing out how other vendors have already solved this problem is a great illustration of that.

      --
      A Pirate and a Puritan look the same on a balance sheet.
    2. Re:Needed: Unlimited Most Recent Files List by TaoPhoenix · · Score: 1

      Nah, "Unlimited" is bad. That's just like throwing everything into "My Documents". (I hate MS's schemes: programs randomly pick their default location between My Documents, Downloads, /Programs/Data/____ , /_____/____/Temporary/Content/___/____/ThisMail/____ . 5% of my job at work is spent recovering stuff that people "open" out of their email. Yuk!)

      I do "Drive Reads" that gather all files on the disk into a text file, and then search *that* for my long lost files. It's 1000 times faster than old windows search. If you Drive Read by Date, that's your MRU list. But you can also do it by name, by type, etc.

      --
      My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
    3. Re:Needed: Unlimited Most Recent Files List by Anonymous Coward · · Score: 0

      You are describing zeitgeist on linux.

    4. Re:Needed: Unlimited Most Recent Files List by Anonymous Coward · · Score: 0

      You should be able to go back in time and answer the question "What file was I working on a week ago?"

      "find" on *NIX systems can do that by modify time, access time, or inode change time.
      "find Source -type f -mtime -30 -ls"
      will show the files in the directory Source that were modified in the last 30 days.

      "find Source -type f -ctime -30 -ls"
      will show the files whose inode status (contents, permissions, ownership, or links) changed in the last 30 days; ctime is change time, not creation time.
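
For anyone who wants the same query from a script, here is a rough Python equivalent of the -mtime search. It is only an approximation: find rounds ages to whole days and does not follow symlinks for -type f, while this sketch compares exact timestamps and uses os.path.isfile. The function names are made up.

```python
import os
import tempfile
import time


def modified_within(root: str, days: float) -> list:
    """Rough equivalent of `find root -type f -mtime -DAYS`: recently modified files."""
    cutoff = time.time() - days * 86400
    recent = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and os.stat(path).st_mtime > cutoff:
                recent.append(path)
    return sorted(recent)


def demo() -> list:
    """Create one fresh file and one backdated 60 days, then search the last 30 days."""
    with tempfile.TemporaryDirectory() as tmp:
        fresh = os.path.join(tmp, "fresh.c")
        stale = os.path.join(tmp, "stale.c")
        for path in (fresh, stale):
            with open(path, "w") as f:
                f.write("/* source */\n")
        sixty_days_ago = time.time() - 60 * 86400
        os.utime(stale, (sixty_days_ago, sixty_days_ago))
        return [os.path.basename(p) for p in modified_within(tmp, 30)]
```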

    5. Re:Needed: Unlimited Most Recent Files List by 0123456 · · Score: 1

      You are describing zeitgeist on linux.

      Yeah, the pile of crap that would thrash my laptop's hard drive for three minutes every time I logged in until I uninstalled it.

  18. WTF is wrong witth the concept of files? by rossdee · · Score: 0

    I don't see any need to change, although the 3-letter filename extension to determine the type of file is getting a bit long in the tooth. (I was using an OS and filesystem in the late 80's that didn't have that problem.)

    1. Re:WTF is wrong witth the concept of files? by Anonymous Coward · · Score: 0

      Hmm, I gotta uncompress my .jpeg's using .texinfo manuals. Doesn't work. I think I need to change the .config files again.
      But really, even Windows doesn't have that problem.

    2. Re:WTF is wrong with the concept of files? by lwriemen · · Score: 1

      Ignore the "3 letter" wording, and the OP is right to state that using the file extension to determine file type is a problem, mostly confined to Windows and OS X today.

    3. Re:WTF is wrong with the concept of files? by 0123456 · · Score: 1

      I don't see any need to change, although the 3-letter filename extension to determine the type of file is getting a bit long in the tooth. (I was using an OS and filesystem in the late 80's that didn't have that problem.)

      The problem is that if you have local files on your own computer, Microsoft can't rent them back to you.

      The whole push to get files off your own hardware onto The Magic Cloud is pure rent-seeking.

    4. Re:WTF is wrong with the concept of files? by Errol+backfiring · · Score: 1

      (I was using an OS and filesystem in the late 80's that didn't have that problem.)

      Yes, the Commodore 64 was a great machine!

      --
      Nae king! Nae laird! Nae yurrupiean pressedent! We willna be fooled again!
    5. Re:WTF is wrong with the concept of files? by Junta · · Score: 1

      , mostly confined to Windows and OS X today.

      Sadly, aside from the executable bit, Linux desktops are largely afflicted by this as well, as all the desktop environments have pretty much embraced extension-based file typing.
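
      The usual alternative to extension-based typing is to look at the first few bytes of the content, the way file(1) does. A small sketch, assuming a hand-picked signature table (the table and the returned labels here are illustrative and far from complete):

```python
# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x7fELF", "application/x-elf"),  # label chosen for this sketch
]


def sniff_type(path):
    """Guess a file's type from its leading bytes, ignoring its name entirely."""
    with open(path, "rb") as f:
        head = f.read(16)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return None  # unknown content; a caller might fall back to the extension
```

      With this, a PNG that somebody renamed to "photo.txt" is still identified as a PNG, which is exactly the case where extension-based typing falls over.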

      --
      XML is like violence. If it doesn't solve the problem, use more.
    6. Re:WTF is wrong with the concept of files? by jandrese · · Score: 1

      You had a Mac?

      --

      I read the internet for the articles.
  19. when reality stops throwing file like things at me by brokeninside · · Score: 1

    ... then I'll start looking for analogies other than a "file" (or something similar like a notebook) to use with computers.

    Think about it. The objects we use most often (books, physical files, CDs, musical instruments, notecards, kitchen gadgets, etc.) all have a discrete identity that makes their representation by a file on a file system quite intuitive.

    Only when reality starts presenting itself as something other than individual entities with their own discrete identity will most people move to a different paradigm.

  20. Help me out with this: Is "File" Patentable? by withoutfeathers · · Score: 1

    Smells like MS is laying the foundation for a whole new tangle of patents.

  21. The "Paradigm Shift" is back! by EmagGeek · · Score: 1

    I never thought I would see the "Paradigm Shift" return to the common corporate lexicon. Of course, there is also the "Paradigm Shift for Paradigm Shift's sake."

    This, right here, is the kind of blue-sky thinking that can create a paradigm shift that will empower key contributors to cover all directions of the compass in the realization of the critical program objectives. The kind of solution that will be the result of joined-up thinking will easily land and expand across all verticals in a process-oriented organization. However it will be a key component of the storyboard to collect the buy-in from key stakeholders to ensure 100% coverage in gating milestones.

    1. Re:The "Paradigm Shift" is back! by andrewbaldwin · · Score: 1

      Well said :-)

      Just one criticism ... you forgot to mention how "such an approach would interlock horizontally and vertically across business units to leverage the synergies arising from an ongoing optimisation of the function stream envisaged in the up-coming opportunity horizon".

      Some years ago I came across a Word macro called Bullfighter [I can't remember who the original author was but I'd love to credit him/her]. This analysed text for excessive length and presence of buzz words.

      If I had it now (and if it worked in Libre Office) I suspect my PC would have melted :-)

    2. Re:The "Paradigm Shift" is back! by RazorSharp · · Score: 1

      You should be promoted to upper management.

      --
      "From the depths of my skeptical and rationalist soul, I ask the Lord to protect me from California touchie-feeliedom."
    3. Re:The "Paradigm Shift" is back! by EmagGeek · · Score: 1

      Allow me to word that slightly differently:

      "such an approach would pull in the opportunity horizon by interlocking horizontal and vertical business units, leveraging synergy and commonality between their respective focus teams, achieving greater velocity in quality function deployment moving forward." :D

  22. ...What? by Rie+Beam · · Score: 1

    I read the entire paper (the second article), which was essentially an analysis of Apple software that concludes that "Apple writes a lot to the hard drive and we don't know why" and "this raises more questions than it answers".

    Can someone please explain if either article is actually proposing an applicable solution, or simply stating "things need to change!" like a 19-year-old Occupy Wall Street protester?

    1. Re:...What? by KnownIssues · · Score: 1

      I read the first article, and I have to say, I did not see anything that proposed what should replace files. There was the vague "encompass metadata within a file abstraction", but really, what does that mean?

      The main point of the article, as I read it, was that what a user *believes* a file is and what the storage media/application calls a file are often completely different and that the next form of "file" should better represent what a user thinks of as a file--i.e. the smallest allocatable unit of content, e.g. a photo, a contact, a spreadsheet, a document--and the actions they want to perform on it.

      The article gave the example of a OneNote Notebook. On your computer it stores Sections as files and Notebooks as folders of these files. This makes sense from a technology perspective. But a user (a normal one, not a Slashdot one) expects the Notebook to be stored as a whole indivisible unit. And not every storage medium stores the Notebook the same way; SkyDrive was given as an example.

      On the other hand, I don't think this is uncommon for a research paper. Not every research paper is intended to be a fascinating read about deblur technology in Photoshop. We're taught to "not point out a problem unless we have a solution", but that's not always the best philosophy. Sometimes it's perfectly valid to point out the flaws in something without knowing how to fix it; sometimes the problem is that people don't see there's a problem and the first step is just raising awareness there's a problem.

    2. Re:...What? by Hatta · · Score: 1

      I'll bite on that OT troll. OWS has actually published a list of reasonable, workable policy changes. If you want to make fun of someone for being ignorant, look in the mirror.

      --
      Give me Classic Slashdot or give me death!
    3. Re:...What? by gzipped_tar · · Score: 1

      So again, software engineers are attempting to dictate what a user should want.

      The problem is that many users (a normal one, not a Slashdot one, in your words) inherently want things that are logically too inconsistent to hold up in the real world (I'd also like a pony, wouldn't I?) Therefore it is fruitless trying to cater to every whim of the User, especially not at the OS/FS level where the "business logic" is of vital importance.

      In the end, it is not important what a user thinks a file is -- at least not at the FS level. If an application handling a file-based view of data (like OneNote) fails the user's expectations, it's the app's fault, and capitalism + natural selection should take care of its fate. Modern filesystems already do the job they're supposed to do, and do it well, so they should be left alone.

      --
      Colorless green Cthulhu waits dreaming furiously.
  23. temp files by Anonymous Coward · · Score: 0

    My compiler does not create OBJ files, just BINary directly. Actually, you usually go source code --> memory ready for execution.

  24. I thought this was a science story about flies by nullCRC · · Score: 1, Informative

    My bad.

    --
    Vescere bracis meis.
    1. Re:I thought this was a science story about flies by Bosconian · · Score: 1

      I know, man, what's to rethink? Just swat 'em or shock 'em and be done with it, fer Crissakes.

      Let's rethink the nature of mosquitos next. You think 'em and I'll smack 'um.

      --
      Scarce, scared, scarred, sacred... -Col. Bruce Hampton
    2. Re:I thought this was a science story about flies by spidr_mnky · · Score: 1

      I couldn't stop seeing that, either. Frankly, it would have been more interesting, and I'm not much of a biology enthusiast.

  25. test by Anonymous Coward · · Score: 0

    test

  26. Re:files by TaoPhoenix · · Score: 1

    They kinda sorta work, *if* you manage 100 file extensions. Forget the Ribbon, the other disaster from Office 2007 was the 'glorious basterd' new file names, docx xlsx and the others. But of course 'file extensions are too hard for users' so those differences get hidden. One of my 'mission critical' programs from work FINALLY added support for those filenames ... *this past April*.

    So yeah, there's probably a scorpion barb in the Microsoft article.

    --
    My first Journal Entry ever, in 8 years! http://slashdot.org/journal/365947/aphelion-scifi-fantasy-horror-poetry-webzine
  27. It's time to unreserve "special" characters by wallywalters · · Score: 1

    including especially the question mark, quotation mark, colon, forward slash and asterisk. For clarity and accuracy, any punctuation mark that's commonly used in everyday writing should be available for use in filenames going forward.

  28. Metadata? by Rie+Beam · · Score: 1

    There isn't an issue with files. Files are essentially the atomic structures of the filesystem -- the dividing points between different pieces of content. You can add all the abstraction you want, but if you can't find Piece of Information X at the end of it, it's still a worthless abstraction. Redesign the file system, sure, but the nature of files isn't in question here, but rather how they're accessed.

  29. Reinventing RMS? by AB3A · · Score: 1

    First, they're using bloated programs on poorly optimized file systems and they then complain about performance.

    Second, a better optimization would result if you took into account what the file type was. You'd lose some compatibility, but you'd gain a surprising amount of performance. The solution has been sitting around for decades: Anyone remember the infamous Record Management Services (RMS) from DEC? It existed as a layer between the kernel and the user space.

    It would answer the concerns of these researchers, but it would require a massive rewrite of all the programs that use the file systems.

    We're headed back to the future...

    --
    Nearly fifty percent of all graduates come from the bottom half of the class!
    1. Re:Reinventing RMS? by dkleinsc · · Score: 1

      Am I the only one who was thinking this post was going to be about giving Richard Stallman a makeover?

      --
      I am officially gone from /. Long live http://www.soylentnews.com/
  30. A possible real goal by dkleinsc · · Score: 1

    Pretend for the moment that Microsoft has found a way of storing data that completely does away with the directory tree and file concept that have been a basic piece of operating system design since the 1970's. Now, to make Windows do that would require major changes and break backwards compatibility, so why would they possibly want that?

    Well, imagine another family of operating systems had made files the key component of what they do, so much so that it makes practically everything look like a file, whether a network socket or a hardware device. It's even gone the extra mile on compatibility to support using a wide variety of other OS's file systems, including Windows' preferred file system, so that those who want to run multiple OS's on the same machine can do so relatively painlessly.

    Now imagine that Microsoft wants to break that compatibility in an attempt to maintain its market position. Now, their first try (which is a lot cheaper) is to occasionally redesign their filesystem so that the other family of operating systems has to adjust their compatibility layers. But those jerks seem to be keeping up with you, reverse-engineering what you did. So now, to really break compatibility, you have to go after the concept of having a file system, so that instead of something coherent that the other OS can build a driver for, you need special proprietary code to turn the gobbledygook on disk into something a user can read.

    Of course, the only part of this that's really imaginary is that last bit. But my guess is that what they're aiming for is "Want to read data from a Windows machine? You need a copy of a certain Windows DLL running, which will only run on Windows."

    --
    I am officially gone from /. Long live http://www.soylentnews.com/
    1. Re:A possible real goal by skids · · Score: 1

      Good god! We'd have no choice but to return to IBM mainframe COBOL, where everything is a database record, not a file.

      Or we could just ignore them and go on with our lives.

      BTW, while we're on the subject, do any of the meta-data-supporting filesystems support an operation to "export" a file which just creates an archive of a known format with a bunch of EXIF-ish data bundled along with the file's contents? If so they'd better apply for a patent for it before MS does, given we are now a "first to file" nation.

    2. Re:A possible real goal by N1AK · · Score: 1

      Now, to make Windows do that would require major changes and break backwards compatibility, so why would they possibly want that?

      Because sitting still and waiting for someone else to do it isn't always the best business strategy. I suppose it's possible that Microsoft employs enough people who are stupid enough to consider your scenario as a viable business strategy but I don't find it to be a remotely reliable answer. Microsoft can't remove the ability for 'file' interactions between their operating system and competing systems. They wouldn't even get as far as implementing it before it was regulated against.

  31. Replace files with data objects by concealment · · Score: 1

    While XML is annoying, it shows us the importance of both data and data describing that data (including "meta-data").

    My guess is that we'll take a page from object-oriented computing and in the future, see data as stored only within object types, with associated description data and possibly transformation data (something like XSLT).

    In particular, this would open up all file formats to the end user, as understanding the structure of a data object is a lot more sensible than hand-coding a parser for binary files.

    The influence of the semantic web, object-oriented thinking, and the inevitable inclusion of high-capacity databases as part of the operating system (we already see this with LAMP as a popular platform not only for development, but for daily use) will drive this change.

    Personally, I think it's about time. A file is a low-level format, basically a giant string of data between two points. We should not be using files as end users; that's for the operating system. And at the same time, we'd like our data to be there in a form we can manipulate, not dependent on file-types and specific applications.

    Back in the 80s, there was more of this thinking but no one got it to catch on. The original Macintosh file system used a "data fork" and a "resource fork" for objects included with the file. There were other experiments, most notably Taligent and OpenDoc (http://en.wikipedia.org/wiki/OpenDoc).

    A good discussion of what open data formats might mean can be found here:

    http://www.malcolmgroves.com/blog/?p=633

    1. Re:Replace files with data objects by gzipped_tar · · Score: 1

      But we separate the source code we're working on and the Makefile (or some other sort of "metadata") as two different text files for a reason. I prefer my photo album as just a bunch of photo files with external metadata (index, date, tag, etc) for the same reason. The reason is this: in the real world, small bits of resources gathered in a meaningful manner in accord with some standard tend to work better than an obscure serialization of "objects" as instances of some do-everything class.

      To put it in context, I want to do a simple git clone and get a bunch of files with metadata stored in the .git directory, rather than having to do a possibly expensive de-serialize, only to serialize again for patching one single line in a text file. *Especially* not when the said serializer application is blurred into the OS/FS level.

      --
      Colorless green Cthulhu waits dreaming furiously.
    2. Re:Replace files with data objects by Junta · · Score: 1

      The issue is that much of the 'metadata' that is externalized is meaningless outside of a specific context. For example, a song has some informational data that inarguably persists with the file (vintage, artist, cover art, etc) and some data that varies depending on the context (the rating the current listener ascribes to it, order in a playlist, or other data that has special meaning only in relation to something else). 'This is "our" song' isn't a universal attribute of a song.

      --
      XML is like violence. If it doesn't solve the problem, use more.
  32. Oh dear god, please, please, please.... by gestalt_n_pepper · · Score: 3, Insightful

    Do NOT "improve" the file. I'd like to continue to be able to use my computer and other devices.

    --
    Please do not read this sig. Thank you.
    1. Re:Oh dear god, please, please, please.... by Anonymous Coward · · Score: 1

      Behold, the new Ubuntu 12.04 will improve the desktop user experience with a new Singularity filesystem. Now the directories are gone, and the user is free to search and access files through a search engine.

    2. Re:Oh dear god, please, please, please.... by Air-conditioned+cowh · · Score: 1

      I'd like to continue to be able to use my computer and other devices.

      And I'd like to stop you! Ha ha ha ha haaah!

    3. Re:Oh dear god, please, please, please.... by TemporalBeing · · Score: 1

      So that's why they broke the system in 11.10 by moving /var/run to /run and /var/lock to /run/lock...

      --
      Truth is like the sun. You can shut it out for a time, but it ain't goin' away. - Elvis Presley (source: imdb.com)
  33. POSIX xattrs by Salamander · · Score: 3, Insightful

    Look them up. They already allow you to attach arbitrary metadata to a file. Most modern filesystems and user-level utilities support them already. They're even used as the underpinnings for security mechanisms such as POSIX ACLs and SELinux. Sure, there are issues with performance when you have *lots* of xattrs on a file, and that's a fruitful area of research, but we sure don't need some brand-new Microsoft-invented thing to deal with metadata.
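
    A quick sketch of what attaching metadata through xattrs looks like from Python, assuming Linux (where the standard library exposes os.setxattr, os.getxattr and os.listxattr) and a filesystem mounted with user xattr support. The helper names are made up, and the helpers simply report failure where xattrs are unavailable.

```python
import os


def tag_file(path, key, value):
    """Store `value` under the attribute `user.<key>`; return False if unsupported."""
    if not hasattr(os, "setxattr"):
        return False  # non-Linux platform: Python does not expose xattrs here
    try:
        os.setxattr(path, f"user.{key}", value.encode("utf-8"))
    except OSError:
        return False  # filesystem refuses user xattrs, or no permission
    return True


def read_tags(path):
    """Return all `user.*` attributes of `path` as a dict (empty if unsupported)."""
    if not hasattr(os, "listxattr"):
        return {}
    tags = {}
    try:
        for name in os.listxattr(path):
            if name.startswith("user."):
                raw = os.getxattr(path, name)
                tags[name[len("user."):]] = raw.decode("utf-8", "replace")
    except OSError:
        return {}
    return tags
```

    The attribute travels with the inode through renames on the same filesystem, but whether it survives a copy or a network transfer depends entirely on the tool doing the copying.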

    --
    Slashdot - News for Herds. Stuff that Splatters.
    1. Re:POSIX xattrs by Tetsujin · · Score: 1

      Look them up. They already allow you to attach arbitrary metadata to a file. Most modern filesystems and user-level utilities support them already. They're even used as the underpinnings for security mechanisms such as POSIX ACLs and SELinux. Sure, there are issues with performance when you have *lots* of xattrs on a file, and that's a fruitful area of research, but we sure don't need some brand-new Microsoft-invented thing to deal with metadata.

      The issue isn't the underlying mechanism that provides the capability for assigning arbitrary metadata to a file: rather, the important issue is how we treat that metadata in the UI.

      You could think of it like this: it's not necessarily about redefining the filesystem-level notion of what a "file" is, but rather about establishing conventions for how we treat files and work with metadata.

      --
      Bow-ties are cool.
    2. Re:POSIX xattrs by formfeed · · Score: 1

      Dear Sir or Madam,
      That xattrs of yours sounds like a wonderful idea!

      But why not take it a step further? I could imagine an operating system where everything is just a file: You plug in a camera, it shows up as a file. Hardware properties? Read them from a file and change them by writing to a file. CPU Speed? Sleep states? An audio stream? - I know it sounds crazy. But someone should really try it.

  34. even more fundamental by Anonymous Coward · · Score: 0

    Why do we even think in terms of files or more particularly "file operations"?
    Why should I have to "save file" in an editing application? That's a holdover from the days of slow mass storage, when you didn't want to take up time in the middle of your other work.

    1. Re:even more fundamental by 0123456 · · Score: 1

      Why should I have to "save file" in an editing application.

      Because, uh, when you totally screw things up without realising you want to be able to abandon the current version and go back to the last good version? Because when you're editing on a laptop with an HDD you don't want it perpetually spinning and sucking up your battery power? Because saving is freaking slow in many applications even on an SSD?

    2. Re:even more fundamental by Tetsujin · · Score: 1

      Why should I have to "save file" in an editing application.

      Because, uh, when you totally screw things up without realising you want to be able to abandon the current version and go back to the last good version? Because when you're editing on a laptop with an HDD you don't want it perpetually spinning and sucking up your battery power? Because saving is freaking slow in many applications even on an SSD?

      In all likelihood the application is already doing "auto-saves" anyway, so it can do recovery if something goes unexpectedly and badly wrong.
      And then, in terms of resource usage, there's not much difference between an application that auto-saves when you close the window and one in which you manually hit "save" before closing the window.
      Pretty much the only technical snag is if a program tries to read the file while the application is still editing it - the other program may not get the latest version of the file. But mandatory locking on Windows pretty much blocks this scenario anyway, and advisory file locking could be used to provide the same sort of behavior on Linux. Or if you really want another application to be able to open the file while it is being edited, and be guaranteed the most up-to-date version - that is not an insurmountable problem.

      Resource usage isn't the issue here. The issue is user interface. Users have been trained to respect a difference between "in-memory" and "on-disk" data. But that doesn't mean it's necessarily the best choice moving forward, just that it's what people are used to. The situation could be changed (and already has been changed, on some platforms) as long as users were made to understand the change in paradigm.

      PalmOS did this: and while this was in part just natural for early iterations of the platform (everything was in RAM anyway) it was also part of an effort to streamline the UI. I don't know if the same was true of Newton, but I believe it's true of current smart-phones as well.

      There are other problems with auto-save, problems that weren't addressed on PalmOS: for instance, what if the user makes a mistake, which winds up getting auto-saved to the file? If they had been using a more traditional application that required explicit user action to save the data, then maybe (even if the change was beyond the reach of their "undo" history) they would be able to revert to the copy of the file on-disk. Unless they reflexively hit "save" at some point, in which case they're, again, boned.

      The solution, probably, is file versioning incorporated into the app itself. At a very simple level this could mean "undo" history is very long, and saved to disk. (This raises other problems, of course - we have seen cases where someone released a document that contained metadata that they didn't want to release... Certainly releasing a document that included a lengthy history of your edits could be embarrassing and possibly dangerous... So teaching people to strip that data out of a file before they publish it, and incorporating that into the UI is important in that case.)
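
      To make the versioning idea concrete, here is a bare-bones sketch of auto-save that never destroys the previous state. The ".history" folder next to the document and the function names are assumptions made for this example, not how any particular application does it.

```python
import os
import shutil


def _history_dir(path):
    # Hypothetical convention: versions live in "<file>.history" beside the file.
    return path + ".history"


def save_version(path, text):
    """Write `text` to `path`, first copying the previous contents into the history."""
    history = _history_dir(path)
    os.makedirs(history, exist_ok=True)
    if os.path.exists(path):
        number = len(os.listdir(history)) + 1
        shutil.copy2(path, os.path.join(history, f"{number:06d}"))
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def list_versions(path):
    """Return the saved version names, oldest first."""
    history = _history_dir(path)
    return sorted(os.listdir(history)) if os.path.isdir(history) else []


def revert(path, version):
    """Restore an old version; the current state is itself kept as a new version."""
    with open(os.path.join(_history_dir(path), version), encoding="utf-8") as f:
        old_text = f.read()
    save_version(path, old_text)
```

      Because the history sits outside the document itself, publishing the file does not leak the edit trail, which sidesteps the embarrassing-metadata problem mentioned above.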

      --
      Bow-ties are cool.
  35. keep it simple, stupid by Anonymous Coward · · Score: 0

    It took me three tries to get any meaning out of that 'quote from the first paper' mentioned above. Seems too much verbiage is spent on trying to prepare my brain to agree with the ideas before it actually tells me what the idea is.

    As for the paper itself, I am nonplussed. A "file" is a sequence of bytes, with a defined start location and length, recorded on a storage device to be retrieved from that storage device at a later date. What the paper describes doesn't change this idea; it just insists that every "file" should have a wrapper around it and that users should not be able to access the "file" without the wrapper.

  36. But, surely there'll still be file name suffixes? by magbottle · · Score: 0

    .txt, .doc?

    There _has_ to be!

    Otherwise the Mac OS X engineers will look like idiots for dismantling the Mac system of file data types in favor of using file suffixes for file content identification.

  37. Hmm... by J'raxis · · Score: 1

    "Copy," "delete," and "ownership" being three points they're trying to address? Why does this sound like a submarine attempt to embed some sort of IP protection in the lowest levels and very concepts of files on computers, framing it all as merely a technical re-engineering of the "file" concept?

    1. Re:Hmm... by Anonymous Coward · · Score: 0

      "Copy," "delete," and "ownership" being three points they're trying to address? Why does this sound like a submarine attempt to embed some sort of IP protection in the lowest levels and very concepts of files on computers, framing it all as merely a technical re-engineering of the "file" concept?

      It's clearly not a submarine attempt at embedding IP protection, you ignorant fool.

      It's a blatantly obvious attempt. What is WRONG with you?

  38. What? by wisnoskij · · Score: 1

    A file is simply a linear series of data. Period. End of story.
    I don't care where you store the ownership rights, the metadata, or what new fancy things you want to be able to do with files; that is not a groundbreaking new concept.

    --
    Troll is not a replacement for I disagree.
  39. Metadata and sharing by Kjella · · Score: 2

    Personally, I've found that the biggest issue with all the "metadata" systems that try to improve on the basic file/folder system is that they don't transfer anywhere. Send the file once through Samba, NFS, email, FTP, rsync or whatever and the metadata is lost. The only systems that actually get used are those that are embedded in the file, like EXIF for JPG, ID3 for MP3 and so on.

    The stupid thing is that we didn't make that a generic part of all file formats, a simple key-value list appended to the file would do. But today that'd break almost everything, plus most things working on the file system would have to know that each file has a data and metadata part. Maybe use a compatibility layer for metadata-unaware applications, where they only see the data part?

    That way we really could have a standard form of metadata. It might not cover every use but it'd sure cover a lot. Copy the file, copy the metadata (if you want, of course). Of course most of these researchers seem to want to get rid of the file altogether and replace it with some sort of cloud service, but I'd rather not. I'd rather know where I have my stuff and be able to put it where I want.
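
    The appended key-value list and the compatibility layer could look something like this. It is a sketch only: the 8-byte marker, the JSON encoding and the function names are invented for the example and do not correspond to any existing standard.

```python
import json
import struct

MARKER = b"KVMETA01"             # hypothetical marker, not a real format
FOOTER = struct.Struct(">Q8s")   # metadata length (8 bytes) + marker (8 bytes)


def write_with_meta(path, data, meta):
    """Write `data`, then the metadata as JSON, then a fixed-size footer."""
    blob = json.dumps(meta, sort_keys=True).encode("utf-8")
    with open(path, "wb") as f:
        f.write(data)
        f.write(blob)
        f.write(FOOTER.pack(len(blob), MARKER))


def read_split(path):
    """Return (data, meta); meta is {} when the file carries no trailer."""
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) >= FOOTER.size:
        length, marker = FOOTER.unpack(raw[-FOOTER.size:])
        end = len(raw) - FOOTER.size
        if marker == MARKER and length <= end:
            try:
                meta = json.loads(raw[end - length:end].decode("utf-8"))
                return raw[:end - length], meta
            except ValueError:
                pass  # marker matched by accident; treat as plain data
    return raw, {}


def read_data_only(path):
    """What a metadata-unaware application would see via the compatibility layer."""
    return read_split(path)[0]
```

    Since the metadata rides inside the same byte stream, a dumb copy over FTP or email keeps it; the price is that every unaware tool reading the raw bytes sees the trailer unless something like read_data_only sits in between.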

    --
    Live today, because you never know what tomorrow brings
    1. Re:Metadata and sharing by Anonymous Coward · · Score: 0

      Umm, let's choose, say 'xml', and store it next to the file as 'filename.xml'.

      Done. Problem solved.

    2. Re:Metadata and sharing by nine-times · · Score: 1

      Personally, I've found that the biggest issue with all the "metadata" systems that try to improve on the basic file/folder system is that they don't transfer anywhere

      The real problem there is that filesystems and transfer protocols don't have a standard for metadata. The metadata that can be stored in ZFS vs. NTFS vs. HFS+ aren't quite the same, and transferring a file via FTP isn't really going to maintain any of it. Of course, the real problem there isn't technological so much as economic/political. No one has the leverage to push a standard even if a good standard were available to be pushed.

    3. Re:Metadata and sharing by grumbel · · Score: 1

      That doesn't fix the problem. The moment you rename the file the metadata is lost, the moment you copy the file the metadata is lost, etc. That would only preserve the metadata on directory copies.

    4. Re:Metadata and sharing by jandrese · · Score: 1

      Apple used to do this back before OSX. They required special encoding formats to send files through email, but it was hardly impossible. It was quite messy when someone extracted one of those files on a DOS or Unix machine and ended up with a bunch of garbage looking files and directories in addition to what they wanted.

      Apple did do some interesting things with them though. The basic text editor on the system could apply simple formatting (font changes, bold, underline, etc...) to a text document, and if you sent that text document to a DOS or Unix user, they would just see the text without the formatting. Think of it as an early example of separating content from presentation, and just how messy and incompatible Microsoft's solution (RTF) to the same problem was.

      The other nice thing is that programs could store all of their assets (Icons, graphics, sounds, executable code, data blobs, etc...) inside of their own resource fork, so you never had to install anything. There were no libraries to mess with, no .pak files, no registry garf, just a big honking executable that you could run from anywhere. Of course in a multiuser system that doesn't work so well, but a 1980s Mac was in no way a multiuser system. Modern OSX actually does pretty much the same thing, but in a slightly messier manner (applications are bundle directories and the OS has magic to present them as a single item...from the gui. From the commandline it's a bit messier).

      --

      I read the internet for the articles.
    5. Re:Metadata and sharing by mx+b · · Score: 1

      Granted I am not an expert on computer filesystems, but this is what I'm curious of. Multimedia formats like OGG are containers, yes? My understanding is it embeds a video, audio, subtitles, into this one container. I imagine it as holding a few other files inside, reminiscent of me zipping up a few files together (perhaps like ODT?). This could be flawed understanding I admit, but bear with me.

      Wouldn't it be possible to make a "universal" file container, in that any other file type could be embedded with a text file that listed: what type of file it is, what program it is associated with, owner, creation/mod dates, and especially, tags and other types of metadata? (perhaps, author/composer as necessary, things like the publisher or journal it appeared in for pdf, if the main file is source code then the metadata can specify what language it is written in, etc.).

      Then we can get away from every file having different extensions to everything being a container that declares what the file is used for, etc. I mean, they sort of will have extensions, but not in the file name. And anything you change carries over with the file.
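
      For what it's worth, a container like that can be mocked up with an ordinary zip archive, which is roughly what ODT does. A sketch under assumed conventions: the "manifest.json" and "payload/" names are invented here, not part of any spec.

```python
import json
import zipfile


def pack(container_path, payload_name, payload, meta):
    """Bundle one payload file with a manifest describing it."""
    with zipfile.ZipFile(container_path, "w") as z:
        z.writestr("manifest.json", json.dumps(meta, indent=2))
        z.writestr("payload/" + payload_name, payload)


def unpack(container_path):
    """Return (meta, payload_name, payload_bytes) from a container made by pack()."""
    with zipfile.ZipFile(container_path) as z:
        meta = json.loads(z.read("manifest.json").decode("utf-8"))
        members = [n for n in z.namelist() if n.startswith("payload/")]
        name = members[0]
        return meta, name[len("payload/"):], z.read(name)
```

      The type, owner and tags all live in the manifest, so the container's own file name can be anything. The catch is the one raised elsewhere in this thread: every program has to agree on the manifest keys before this is any better than an extension.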

    6. Re:Metadata and sharing by Anonymous Coward · · Score: 0

      I always wondered why metadata wasn't with data from the start. Doesn't do any good for me to put metadata on a file in Windows if it's going to stay local to my machine. Websites where there are tags on someone's picture to identify its content and who did it? Too bad that info's not in the picture itself, that'd save a lot of headaches for when a picture becomes popular but you have no idea who created it. Granted there are cases where I don't care about metadata, so I delete those metadata sidecar files when transferring Mac files to a thumbdrive. But metadata's useful for those rare moments when you need it, and given how many programs beg you to put in metadata today, would it kill them to spend a few days thinking of a viable long-term solution for making these changes stick?

      But again, anyone who actually cares about the metadata to a file will probably have some separate text file identifying it. They might package the text file with the original data when sending it to someone. But then again perhaps not. What kills me is when website services have metadata separate from a picture (like a custom description, or when it was uploaded, or THE FRICKIN' URL YOU DOWNLOADED IT FROM), but don't automatically give that info to you when you download it. So you have to copy+paste that info yourself if you really care about it. Seriously, why does no website on the planet attach the URL you download a file from?

    7. Re:Metadata and sharing by Kjella · · Score: 1

      Wouldn't it be possible to make a "universal" file container, in that any other file type could be embedded with a text file that listed: what type of file it is, what program it is associated with, owner, creation/mod dates, and especially, tags and other types of metadata? (perhaps, author/composer as necessary, things like the publisher or journal it appeared in for pdf, if the main file is source code then the metadata can specify what language it is written in, etc.).

      In theory, it's not a problem. The problem is that you would break all existing file formats, they'd all complain the files are corrupt.

      --
      Live today, because you never know what tomorrow brings
    8. Re:Metadata and sharing by spitzak · · Score: 1

      I agree with you, but I think a solution is to make the metadata always be part of the file.

      Basically if you 'cat' a file, the resulting stream of bytes contain all the metadata. If you copy it to another system that does not store metadata, then the resulting block there will still have it (and there will likely be utilities and even libraries that read the metadata). If eventually the file is copied back to a system that understands the metadata scheme, it may disassemble the block back into the metadata representation.

      An obvious problem is incompatibility, which is the reason this probably has not happened yet. As soon as you attach metadata your text file will be changed. Even if there is no metadata, your file will have to be modified so that it does not match whatever pattern indicates metadata.
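
      To make the incompatibility concrete, here is a minimal sketch of in-band metadata. Everything about the format is an assumption for illustration: a made-up marker `MAGIC`, a 4-byte length, a JSON header, then the data. Note the escape case: a plain file that happens to start with the marker has to be wrapped anyway, which is exactly the "your file will have to be modified" problem.

```python
import json
import struct
from typing import Optional, Tuple

MAGIC = b"\x00META1\n"  # hypothetical marker, not from any real format


def encode(data: bytes, metadata: Optional[dict] = None) -> bytes:
    """Produce the byte stream that 'cat' would show for this file."""
    if not metadata and not data.startswith(MAGIC):
        return data  # ordinary file without metadata: stored unchanged
    # Either there is metadata, or the data collides with the marker and
    # must be wrapped with an empty header so decode() is unambiguous.
    header = json.dumps(metadata or {}).encode("utf-8")
    return MAGIC + struct.pack(">I", len(header)) + header + data


def decode(stream: bytes) -> Tuple[bytes, dict]:
    """Split a byte stream back into (data, metadata)."""
    if not stream.startswith(MAGIC):
        return stream, {}
    start = len(MAGIC) + 4
    (length,) = struct.unpack(">I", stream[len(MAGIC):start])
    metadata = json.loads(stream[start:start + length])
    return stream[start + length:], metadata


tagged = encode(b"hello\n", {"author": "spitzak"})
tricky = MAGIC + b"looks like a header but is just data"
```

      A system that knows the scheme can round-trip both cases; a system that doesn't will show the header bytes at the top of every tagged text file.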

    9. Re:Metadata and sharing by Anonymous Coward · · Score: 0

      Agreed. This is why XMP exists. It's just a shame that not all file browsers take advantage of it (editing/adding/searching metadata).

    10. Re:Metadata and sharing by Kjella · · Score: 1

      XMP only tries to standardize the metadata content, not the location. From the WP page:

      TIFF - Tag 700
      JPEG - Application segment 1 (0xFFE1) with segment header "http://ns.adobe.com/xap/1.0/\x00"
      JPEG 2000 - 'uuid' atom with UID of 0xBE7ACFCB97A942E89C71999491E3AFAC
      PNG - inside a 'iTXt' text block with the keyword 'XML:com.adobe.xmp'
      GIF - as an Application Extension with identifier "XMP Data" and authentication code "XMP"
      PDF - embedded in a metadata stream contained in a PDF object
      For file formats that have no support for embedded XMP data, this data can be stored in external .xmp sidecar files.

      So maybe it's here, maybe it's there, maybe it's not in the file at all. That solves the easy part, not the hard part.
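
      To show what "maybe it's here" costs in practice, here is a sketch that handles just the JPEG row of that list: walk the segments and look for an APP1 segment carrying the XMP header. It is a simplification (it assumes every segment before the image data has a length field, and it ignores extended XMP), and every other format in the list would need its own, different walker. The `segment` helper and the fake file are only there to make the example self-contained.

```python
import struct

XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"


def find_jpeg_xmp(data: bytes):
    """Return the XMP packet from a JPEG's APP1 segment, or None."""
    if data[:2] != b"\xff\xd8":  # SOI marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # not positioned at a marker; give up
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # SOS or EOI: no more metadata segments
            return None
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        body = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and body.startswith(XMP_HEADER):
            return body[len(XMP_HEADER):]
        pos += 2 + length
    return None


def segment(marker: int, body: bytes) -> bytes:
    """Build one JPEG segment (test helper)."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(body) + 2) + body


packet = b"<x:xmpmeta/>"
fake_jpeg = (
    b"\xff\xd8"
    + segment(0xE0, b"JFIF\x00")
    + segment(0xE1, XMP_HEADER + packet)
    + b"\xff\xd9"
)
no_xmp = b"\xff\xd8" + segment(0xE0, b"JFIF\x00") + b"\xff\xd9"
```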

      --
      Live today, because you never know what tomorrow brings
    11. Re:Metadata and sharing by guruevi · · Score: 1

      It exists in various forms. One of those containers is called DICOM and is used extensively in the medical field. It's a really great format (albeit very specific in use) but also a lot of overhead gets into it. It even specifies the Endianness of your data.

      The main problem is standardizing it in a way that is flexible, usable and quick. I see this problem with DICOM all the time: every vendor of medical software attaches its own tags in binary or other unreadable form (XML) and some even make the tags for important metadata invalid in order to lock the end-user into a specific solution for eternity.

      --
      Custom electronics and digital signage for your business: www.evcircuits.com
  40. keep your stinking abstractions... by ynohoo · · Score: 1

    away from my data!

  41. Yes by biodata · · Score: 2

    Yes I would. If I deliberately transmit a message to someone else, then I have no expectation of being able to 'untransmit' that message. The logic error here is thinking that files are like objects. They are not (only), they are also like messages. Big business wants files to be like objects so they can own them. Everyone knows they can't do it, and this effort will fail like all others, due to the nature of reality. Files are not objects.

    --
    Korma: Good
  42. Until there is massive block I/O parallelism ... by Anonymous Coward · · Score: 0

    and the standard open, close, read, write, seek have been replaced by something else, there is no need to update the concept of files.

    The filesystem is an important backend component of almost everything which is done in operating systems. Therefore, like TCP/IP, it should be layered, relatively stupid and stateless. This way, the capacity of it is easy to extend, reliability is easy to achieve, and the backing technologies will enjoy economies of scale.

    History has borne this out. Don't fuck with this.

    As far as "types," the problem is that files can claim to be one type, but really be another type. Trusting files to represent what they say they represent is a security vulnerability. Since any decent program should be verifying data coming from an untrusted source such as a random file from a random location, you might as well let the program determine the type by looking at magic numbers (or gasp, supporting a standard such as XML) anyway.
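
    A minimal sketch of the magic-number approach: ignore what the name claims and look at the leading bytes. The signatures listed are the well-known ones for these formats; the table is nowhere near complete, and the function names are just for this example.

```python
# (leading bytes, type) pairs for a handful of common formats.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff(data: bytes) -> str:
    """Guess a type from the content alone, never from the file name."""
    for magic, mime in MAGIC_NUMBERS:
        if data.startswith(magic):
            return mime
    return "application/octet-stream"


def name_matches_content(filename: str, data: bytes) -> bool:
    """True when the extension's claimed type agrees with the magic number."""
    claimed = {
        ".png": "image/png",
        ".jpg": "image/jpeg",
        ".gif": "image/gif",
        ".pdf": "application/pdf",
        ".zip": "application/zip",
    }
    for ext, mime in claimed.items():
        if filename.lower().endswith(ext):
            return sniff(data) == mime
    return False  # unknown extension: do not trust it


# A zip archive pretending to be a holiday photo.
suspicious = name_matches_content("holiday.png", b"PK\x03\x04rest-of-archive")
```

    A matching magic number is still only a hint; the program has to validate the rest of the data as it parses.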

  43. Retrieving your data 20 years from now by Jazari · · Score: 1

    So now on top of wondering if my backup DVDs will still be readable in 20 years, and if I'll have the right program to interpret the file, I have to wonder if the very concept of a "file" will remain stable over that time?

  44. There is no "issue." Ownership is stupid by Errol+backfiring · · Score: 1

    Ownership is stupid when it comes to files. Or to many other things. If a developer has made 90% of a config file on my system, is it his or mine? But that is not the question. The question is: Who *lovemaking* cares?

    --
    Nae king! Nae laird! Nae yurrupiean pressedent! We willna be fooled again!
  45. Buzz barf by LoRdTAW · · Score: 2

    "Quoting the first paper: 'For over 40 years the notion of the file, as devised by pioneers in the field of computing, has proved robust and has remained unchallenged. Yet this concept is not a given, but serves as a boundary object between users and engineers. In the current landscape, this boundary is showing signs of slippage, and we propose the boundary object be reconstituted. New abstractions of file are needed, which reflect what users seek to do with their digital data, and which allow engineers to solve the networking, storage and data management problems that ensue when files move from the PC on to the networked world of today."

    They pretty much peppered the report with bullshit and buzz words to make "meta data" and "internet based storage" sound all new and shiny for the brain dead market droids and managers.

    This reminds me of that MIT operating system hoax that was going to take current file system ideas and throw them out the window. Face it, how else do you organize bits of information? The concept of a file is simple: an organized arrangement of bits that contains data which can be moved, re-sized or deleted. How do you change that? The only thing that can change is the method in which they are stored on physical media (file system) or cataloged and indexed.

    I just want one thing: a file system that is part database for fast file searches. I don't want to manually build indexes or any other bullshit, just look at the file table and give me my fucking file. Even if you had 100,000 files with file names of 256 characters, it's only about 25 MB; how long does that take to parse? Maybe I don't understand file systems but even a 25 MB file table should only take a few seconds to scan. When I do a search of a directory or entire disk with tens of thousands of files it sometimes takes a minute or two. The disk is thrashing away as if the program is looking all over for the file names. Shouldn't they all be in one place pointing to where they are on disk? Maybe I don't understand file systems in general, someone care to explain?
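
    A back-of-the-envelope check plus a sketch of the "one table of names" idea: 100,000 names at 256 bytes each come to 25,600,000 bytes, which is small enough to hold in memory. The expensive part is the walk, because on many file systems the names live in per-directory structures scattered over the disk; once they are gathered into one flat list, searching it is cheap. This is roughly the approach of tools like `locate`; the function names here are made up for the example.

```python
import os
import tempfile

table_bytes = 100_000 * 256  # worst-case size of the name table, in bytes


def build_index(root):
    """Walk the tree once and return a flat list of full paths."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index.append(os.path.join(dirpath, name))
    return index


def search(index, needle):
    """Case-insensitive substring match on the file name only."""
    needle = needle.lower()
    return [p for p in index if needle in os.path.basename(p).lower()]


# Tiny demo tree so the example runs anywhere.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "photos", "2010"))
    for rel in ("notes.txt", "photos/vacation.jpg", "photos/2010/Vacation-2.jpg"):
        open(os.path.join(root, *rel.split("/")), "w").close()
    index = build_index(root)
    hits = sorted(os.path.basename(p) for p in search(index, "vacation"))
```

    The catch is keeping the list current as files change, which is the part a file system with a built-in index would have to solve.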

    And one thing that just popped into my mind is a better method to tag and store files. When I download a file or save a document/image/whatever I shouldn't have to dig through a huge directory hierarchy. I should be able to type the name of a directory and something along the lines of Google's autocomplete or IntelliSense will begin to complete my search, regardless of what volume it's stored on. As I type vacation.. it should list all directories beginning with that string or tag. Maybe I am ignorant of similar functionality for Windows and Linux. The tags and file/directory names should be system wide and accessible to all programs and commands that interact with files, not just a built-in shell.
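
    A sketch of the autocomplete part, assuming the system already keeps one sorted, system-wide list of directory names and tags (that list is the hypothetical piece). With a sorted list, every name sharing a prefix sits in one contiguous run, so two binary searches find it.

```python
import bisect


def complete(sorted_names, prefix):
    """Return all names starting with prefix from an already-sorted list."""
    lo = bisect.bisect_left(sorted_names, prefix)
    # The highest code point acts as an upper bound for "prefix + anything".
    hi = bisect.bisect_left(sorted_names, prefix + "\U0010ffff")
    return sorted_names[lo:hi]


names = sorted(["vacation-2009", "vacation-2010", "vacuum", "work", "vacation"])
suggestions = complete(names, "vacation")
```

    Each keystroke costs two lookups in the sorted list, so completing across every volume is feasible as long as something keeps that list up to date.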

    1. Re:Buzz barf by benhattman · · Score: 1

      This reminds me of that MIT operating system hoax that was going to take current file system ideas and throw them out the window. Face it, how else do you organize bits of information? The concept of a file is simple: an organized arrangement of bits that contains data which can be moved, re-sized or deleted. How do you change that? The only thing that can change is the method in which they are stored on physical media (file system) or cataloged and indexed.

      If there were one thing you could do to a file that would actually improve it, I would say it would be adding a file header that defines the format of the contents of the file. Any file with that header could then reasonably be parsed by any application. But, even this idea doesn't change the concept of a file as a collection of bits. It would just create a standards committee to define the header format.

    2. Re:Buzz barf by Anonymous Coward · · Score: 0

      Guess you don't care about searching for the CONTENTS of a file.

  46. We should have objects instead of files. by master_p · · Score: 1

    A file is an outdated concept. We should have objects, with attributes and relations (pointers) to other objects.

    We should have an object-oriented database system, not files.

  47. The more you tighten your grip the more will slip by Anonymous Coward · · Score: 0

    ...through your fingers... Whenever M$ speaks of "ownership", I see M$ as thinking of it as theirs and you are renting it, even if you wrote the document from the ground up. I'll just pass on their Kool-aid....

  48. Magic bytes by Anonymous Coward · · Score: 0

    Yes, that's how it's done. The name is also metadata that isn't even in the file, available for any file. And the directory it's in is metadata! And with symbolic or hard links, you can have the same file name metadata and content in different "directory" contexts!

    Oh, PS, when you see a creation date later than the modification date, it's been copied and if you want to know which one is the most recent edit, you check the modification date: the copy won't supersede an earlier edit.

    But I guess you don't like anything.

  49. Missing the point? by porter_haus · · Score: 1

    One argument the paper makes is the ability to export Facebook photos off Facebook, presumably onto your own personal hard drive. It would be a file with metadata that includes friend tags, comments, etc. Doesn't this miss the entire point they try to make in rethinking files for the Cloud? With everything available, wouldn't we lose the "export" grammar in the first place, not to mention personal hard drives?

  50. Unix: EVERYTHING is a file by Anonymous Coward · · Score: 1

    I like the Unix approach most. I have loved it for 30 years already, beautiful and simple.

  51. The ownership society by PotatoHead · · Score: 1

    Looks to me like they are wanting to model "files" after "things", essentially abstracting away basic computing.

    It will make IP much more of a reality than it is today.

    Lessig was right on with "Code".

  52. Interesting for business, not so much for users... by DeeEff · · Score: 1

    I can see this as being valuable for corporate and academic use, as often having direct metadata can really help, especially if small changes to a file can wipe the metadata completely.

    For that matter, if you take a look at certain types of files, such as ESRI shapefiles, picking out and parsing metadata is a chore, since it's all in XML and the schema is hardly ever 100% consistent. Having some form of implementation for metadata to be tagged into the files would be useful for this sort of thing. It would make projects such as GIS in the cloud a much more feasible system to create, support and scale. I used to work on a project where we needed to upload a lot of Geographical/geospatial data to a server, and the hardest part was always collecting and re-working the metadata. I think a structure like this could work, provided the service itself doesn't imply that users should own what they are using.

    While I agree that cloud style data shouldn't be implemented for regular users (not counting services like Dropbox, where it's a minor component), I do think that this sort of file restructuring can be beneficial for businesses and academics, and likely save a lot of money.

  53. NoFile by greywire · · Score: 1

    My first thoughts when reading this (which is not to discount the fact that I've thought about the subject many times before, including concepts like the resource fork in older mac systems, etc) are:

    Why have files at all? Files are only there as abstractions because we are familiar with the physical concept of files and documents.

    I'm sure I'm not the only one to think down these lines. Are there "files" in the volatile memory of your computer? Generally, no.. there are blocks of memory, with address pointers and such to chain them together in a way they can be found. First off, instead of translating back and forth between memory and disk files, why not just have one huge addressing space and store everything in that, and let the system decide what to move to persistent storage. The hard drive, or whatever, is just a big cache/persistent store that you rarely think about (of course you'd want to be able to 'hint' the system about what should be persisted right now). Once you've done this, there are no files anymore.
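
    The closest thing mainstream systems offer today is probably a memory-mapped region, so here is a small sketch with Python's `mmap`. It is only an approximation of the idea: the mapping is still backed by a named file (the very thing I'd like to make disappear), and the path is a throwaway temp file made up for the example.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "store.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * mmap.PAGESIZE)  # reserve one page of backing store

with open(path, "r+b") as f:
    mem = mmap.mmap(f.fileno(), 0)  # map the whole backing store
    mem[0:5] = b"hello"             # an ordinary memory write, no write() call
    mem.flush()                     # the 'hint' that this should be persisted now
    mem.close()

# Later (or after a restart) the same bytes are still there.
with open(path, "rb") as f:
    persisted = f.read(5)
```

    The program only touches memory; when and how the bytes reach the disk is the system's business, apart from the explicit flush.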

    Of course, you need to be able to locate things. And we have tools for this. They're called databases. Whether it's a relational SQL database or a key-value store NoSQL type or even something else, they don't use the file analogy. The actual persisted storage unit would probably just be a database directly stored to the media using whatever formatting was optimal (rather than stuffing a database into a file..).

    Instead of opening a file, you pose a query and the database finds the data for you. Really, not much different than now. open('filename','r') is really just a query too and could still work the same way in such a system for compatibility.
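
    A toy sketch of what I mean, using an in-memory SQLite database as a stand-in for the store. The table layout and the `db_open` name are invented for this example; the point is only that "open by name" becomes one query among many possible ones.

```python
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (name TEXT PRIMARY KEY, kind TEXT, data BLOB)")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?)",
    [
        ("/home/joe/letter.txt", "text", b"Dear boss,"),
        ("/home/joe/cat.jpg", "image", b"\xff\xd8\xff"),
        ("/home/joe/dog.jpg", "image", b"\xff\xd8\xff"),
    ],
)


def db_open(name):
    """Compatibility shim: behaves like open(name, 'rb') but runs a query."""
    row = conn.execute(
        "SELECT data FROM objects WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise FileNotFoundError(name)
    return io.BytesIO(row[0])


letter = db_open("/home/joe/letter.txt").read()

# The same store also answers questions a path lookup cannot.
images = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM objects WHERE kind = ? ORDER BY name", ("image",)
    )
]
```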

    There's a lot of work to get to where this would be efficient I think, and requires changing how we think about some things. Letting go of preconceived notions of physical files etc. But that's happening more and more. Things are going all digital. It may be 10 years or more but I think it will happen...

    At some point there will be no more "files" or "directories", there will just be information, and questions about the information.

    --
    -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
    1. Re:NoFile by 0123456 · · Score: 1

      Instead of opening a file, you pose a query and the database finds the data for you.

      So:

      a) every time I save a file I have to add a whole bunch of metadata to ensure I can find it again.
      b) what do I do when I need to open the file and the database can't find it for me? Finding 'that file I think I saved last Thursday that came from that website whose name I can't remember' seems a lot harder than going to ~/Downloads.

      Your brave new world looks really freaking annoying to me.

    2. Re:NoFile by greywire · · Score: 1

      a) what do you think file names and directory paths are? Some bit of data you are looking for in a database can also be tagged with "metadata" that is analogous to a filename and path.

      b) I can't tell you how many times I've seen somebody (including myself) try to find something they know they saved somewhere, in a folder, but can't remember the exact path, or for that matter thought they'd make it easy and just dump everything into /downloads but now they have to remember the file name and/or the approximate date they saved it, and start sorting by date or name trying to find it..

      In our existing world, file names and nested directories are really freaking annoying too..

      Let me pose this for you: do you think Google stores things in a big organized hierarchical file structure, just like a human would do it with paper, manila folders, file cabinets and such? Or maybe they use, I don't know, some kind of "database" or something? Considering the billions of times per day that people are looking for things on Google, I would say, databases are pretty good at finding stuff.

      I think you must have missed the point I was making about the fact that you can easily build something just like a plain file/directory system on top of a database underneath (in fact, I believe, some filesystems do just this, though I haven't kept up on modern file system design..). But a database built on top of a filesystem uses very little of the functionality of the file system and in fact the file system is probably a hindrance in general.

      Let's make the inevitable car analogy!

      I think it's like combustion engine cars vs electric. Electric is way simpler, far more efficient, provides better torque and a wider performance profile, etc etc. In fact electric kicks the shit out of combustion in every way except for the battery requirement. And that's just a matter of coming up with lighter batteries that hold a bigger charge and can be charged/refueled/replaced quickly and cheaply. And that *will* happen in time.

      Same thing with computers. It will happen. Eventually. Just look at the internet. It's already backed by databases primarily, it's distributed, it's parallel. All that will make its way down to the smallest computer. If you think URLs are like files think again, how many URLs map directly to a file? How many URLs are connected to their data through hash tables in a fixed hierarchical storage structure?

      I, for one, welcome our new file-less overlords!

      --
      -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
    3. Re:NoFile by 0123456 · · Score: 1

      When I download something from the web, it goes into ~/Downloads. I don't have to waste time telling the system what it is, I don't have to figure out where the system put it and if I want to see all the files I downloaded I just 'ls ~/Downloads'.

      In comparison, using a database and having to enter metadata is clunky and slow. I've never understood why anyone thinks it's a step forward.

      And suggesting that because Google uses a database it's also something that Joe Sixpack should be doing is simply laughable. Joe Sixpack doesn't have a bazillion files on his PC and the files he wants are probably in one of a few directories or on the 'recently opened' list in whatever application he uses. Hiding them from him in a database is pure retardisation.

    4. Re:NoFile by RCL · · Score: 1

      People who say that hierarchical filesystems suck probably have a big mess on their table in real life.

    5. Re:NoFile by grumbel · · Score: 1

      a) every time I save a file I have to add a whole bunch of metadata to ensure I can find it again.

      Your browser has all the metadata it needs and thus could write it automatically, you would only need to supply additional metadata if you want to.

      b) what do I do when I need to open the file and the database can't find it for me?

      The same thing you do today when you can't remember the name the file is saved under. There is no difference.

    6. Re:NoFile by greywire · · Score: 1

      I didn't say they suck, just that they can be annoying too. They've served us pretty well for some time. But maybe it's time for something new.

      A neat, rigid hierarchy has its limits. Just ask biologists who sometimes have trouble figuring out just where to put some odd species into the biological classification tree.

      All I'm saying is that computers can handle more complex organizational methods, so why not use them?

      --
      -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
    7. Re:NoFile by greywire · · Score: 1

      I never said that you, or Joe Sixpack, should have to understand what a database is or how to use it. I was thinking about how the computer works internally. For you, or Mr. Sixpack, it would seem to work largely if not exactly the same as it does now. However you'd also then have additional options if you cared to use them (in addition to the fact the system could potentially work faster and more efficiently even if you didn't).

      Just like you don't have to understand that Google and pretty much every website on the net uses a database to store and find your stuff.

      Also would like to note that you are talking about Joe Sixpack but also referring to unix commands as if the two are in any way related.

      Let me guess. Are you also against "hiding" the computer's internal workings behind "icons" and "windows" as well? Or is it the other way around, that having icons and windows is a 'retardisation' and we should all be working with more expressive and simple shell commands?

      Frankly your viewpoint is confusing to me. But I appreciate the argument, I love a good argument!

      Mark my words: standard file systems will go the way of the dodo. They will be replaced (and indeed this is already starting to happen) by database oriented designs. The hard part is not in making such a system, or in making it just as easy to use; the hard part is the transition to such systems. Just like electric cars and the infrastructure to support them (or other alternative fuel systems). The internet though, is making that happen.

      --
      -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
    8. Re:NoFile by RCL · · Score: 1

      Well, we can have multiple hierarchies. Just as Google started to allow hierarchies of labels in GMail.

      But frankly speaking, I wouldn't like to have my files in one big flat "All files" root directory with different labels/filters attached. Sure, they are like that on the disk at some level. Sure, most people treat their files like that (see desktop folder of an average user). Yet I, for one, like rigid hierarchies even if that means that sometimes I need to pick one of possible categorizations of a file (or duplicate it, or create a link), because they bring order to an otherwise chaotic and messy natural way of thinking, just like strict logical reasoning brings order to our understanding of the world.

    9. Re:NoFile by greywire · · Score: 1

      You can have rigid hierarchies in a database too.

      Strict logical reasoning is actually quite the thing with database design, being based on mathematical set theory and such. You might not know this to look at the typical database design, but that's only because most people are idiots and shouldn't be designing databases (sorry, I'm at work, and I'm dealing with a horrid database design..).

      I think everyone is missing the point, that a database oriented system should be able to do everything a typical file/directory based system can do, and more. If you need to have a simple file/directory abstraction on top of it, then you can have that.

      The bigger picture though is the idea that you don't even need to be concerned about 'physical' files in the first place if everything is just a chunk of data in memory. No more loading/saving/copying. Sure you have 'document' concepts, like 'email' or 'letter of resignation' or 'picture of my cat' but these don't have to be explicitly loaded/saved.

      You don't "load" an email or "save" it (typically, I know you can if you need to) you just create it and read it, the computer handles saving it, sending it, bringing it back, etc. I don't think there's an email program in existence that actually saves emails as files, or arranges them in physical folders on your hard drive. Personally I can't stand any email clients anymore because with Google mail I can always search for and find an email instantly.

      I like the idea that a word processing document is not an encapsulated blob but rather should be a connected arrangement of different bits of data stored optimally, i.e., some text here, an image there, etc. We have been heading in this direction for some time now. Look at web pages.. lots of little bits. And look.. most web pages are stored in databases, not files.

      I don't have all the answers.

      --
      -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
    10. Re:NoFile by RCL · · Score: 1

      I think that such abstraction may be fine on certain high (application) level (i.e. basically like it's now), but not on API level. If at all, this should be organized like OSI model, where each layer would see data with less semantic "metadata" involved.

      And even on the application level, this approach ignores users like me. I am more or less fine with not knowing how and where my e-mails are stored (although I do backups to my disk, because I don't 100% trust Google), but I certainly would not like not knowing which disk my files are stored on. E.g. if the police come and take all my computer equipment for a checkup (because someone reported that I am using pirated software - things like that happen in my part of the world, at least to people who run their own businesses) I would like to be sure that my sensitive data is safe on a TrueCrypt'ed pendrive that I store elsewhere.

      To sum up, I think that there's no need for such a change. It's not like people cannot grasp the concept of a "traditional" file system - Dropbox shows that people are perfectly able to store files in a "cloud". Making file storage opaque for the user is short-term thinking, which helps rapid expansion due to being "newbie"-friendly, but limits everyone's ability to control the data (and also introduces a single point of failure that would be that all-encompassing "database"). And 'control freaks' are a sizeable population.

    11. Re:NoFile by Anonymous Coward · · Score: 0

      When I download something from the web, it goes into ~/Downloads. I don't have to waste time telling the system what it is, I don't have to figure out where the system put it and if I want to see all the files I downloaded I just 'ls ~/Downloads'.

      I'm not sure I agree with this guy, but your argument is kind of asinine. Instead of putting your file into ~/Downloads, firefox just tags it "Downloads". Instead of `ls ~/Downloads`, you do `showtag Downloads` or some such command.

      And suggesting that because Google uses a database it's also something that Joe Sixpack should be doing is simply laughable. Joe Sixpack doesn't have a bazillion files on his PC and the files he wants are probably in one of a few directories or on the 'recently opened' list in whatever application he uses. Hiding them from him in a database is pure retardisation.

      So, Joe Sixpack just has a few tags (possibly overlapping now - he can have something that is both "Downloaded" and a "Picture" and doesn't have to remember if he left that droll cat photo he downloaded last Saturday in ~/Downloads or if he remembered to move it into ~/Photos) and whatever application he uses continues to use the same access-time data it still uses.
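
      Here is a rough sketch of that tag model, with SQLite standing in for whatever the system would really use. `showtag` is the made-up command from above; the table layout and the `tag` helper are likewise assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (item TEXT, tag TEXT, PRIMARY KEY (item, tag))")


def tag(item, *tags):
    """Attach any number of tags to an item; repeats are ignored."""
    conn.executemany(
        "INSERT OR IGNORE INTO tags VALUES (?, ?)", [(item, t) for t in tags]
    )


def showtag(*tags):
    """List items carrying every one of the given (distinct) tags."""
    placeholders = ",".join("?" * len(tags))
    rows = conn.execute(
        f"SELECT item FROM tags WHERE tag IN ({placeholders}) "
        "GROUP BY item HAVING COUNT(DISTINCT tag) = ? ORDER BY item",
        (*tags, len(tags)),
    )
    return [r[0] for r in rows]


tag("droll-cat.jpg", "Downloads", "Pictures")  # one file, two overlapping tags
tag("setup.exe", "Downloads")
tag("beach.jpg", "Pictures")

downloads = showtag("Downloads")
downloaded_pictures = showtag("Downloads", "Pictures")
```

      The cat photo shows up under both tags without being moved or copied, which is the one thing a single ~/Downloads directory can't do on its own.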

    12. Re:NoFile by bingoUV · · Score: 1

      do you think Google stores things in a big organized hierarchical file structure, just like a human would do it with paper, manila folders, file cabinets and such? Or maybe they use, I don't know, some kind of "database" or something? Considering the billions of times per day that people are looking for things on Google, I would say, databases are pretty good at finding stuff.

      You would say wrong. Google does NOT use a database to store their search indexes. They use the Google File System.

      Now, you will come back and say GFS is a database. Then, FAT32 is a database too. More so, my BTRFS is surely a database - you know, it uses high tech B-Trees, maybe unicorn blood too.

      Not sure about Bing, but Yahoo search before that didn't use databases either - it used some filesystem.

      --
      Bingo Dictionary - Pragmatist, n. A myopic idealist.
    13. Re:NoFile by bingoUV · · Score: 1

      I don't think there's an email program in existence that actually saves emails as files, or arranges them in physical folders on your hard drive

      You think wrong again. Heard of "mbox"? A fairly popular email storing format. Thunderbird is a very popular email program that uses it, Apple's email client supports mbox too. No point mentioning Sylpheed, Pine or Mutt. If you are unaware of all of these, your talking about email clients only weakens your argument.

      In addition to your misconception of popular search engines using databases, this suggests that your ideas are quite out of touch from reality. No wonder the resulting conclusions make for interesting dreams but not practical workable solutions.

      --
      Bingo Dictionary - Pragmatist, n. A myopic idealist.
    14. Re:NoFile by neminem · · Score: 1

      Rigid file systems can also handle more complex organizational methods. Put a file one place, make a symlink to it (or heck, a hard link if you felt like it) in another place, voila, you now have the file in two places. I actually use this occasionally - music folder, for instance, contains a folder (with subdirectories) for mashups, and a folder (with subdirectories) for game tunes. Mashups of game tunes go in the proper subdirectories of both of those. Done.
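
      For anyone who hasn't tried it, a small sketch of the two kinds of link, assuming a POSIX-style system (symlinks may need extra privileges on Windows). The directory and file names are invented and everything lives in a temp directory.

```python
import os
import tempfile

root = tempfile.mkdtemp()
mashups = os.path.join(root, "mashups")
game_tunes = os.path.join(root, "game-tunes")
os.makedirs(mashups)
os.makedirs(game_tunes)

original = os.path.join(mashups, "zelda-mashup.ogg")
with open(original, "wb") as f:
    f.write(b"fake audio")

# Hard link: a second directory entry for the same underlying file.
hard = os.path.join(game_tunes, "zelda-mashup.ogg")
os.link(original, hard)

# Symlink: a small pointer file that stores the target's path.
soft = os.path.join(game_tunes, "zelda-mashup-link.ogg")
os.symlink(original, soft)

same_file = os.path.samefile(original, hard)
link_count = os.stat(original).st_nlink
with open(soft, "rb") as f:
    via_symlink = f.read()
```

      The hard link survives deleting the original name; the symlink dangles if its target moves, which is the usual trade-off between the two.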

      Of course, I *also* properly ID3 tag everything. I have no issue with the concept that files could contain optional metadata and the file system could let us search for those files using that metadata. Just, don't get rid of my hierarchical structure while you're throwing things out. I like my structure intact, thanks.

    15. Re:NoFile by greywire · · Score: 1

      http://en.wikipedia.org/wiki/BigTable

      Yep, they use GFS, which is not a database, like FAT32 and the rest.

      And on top of that is Big Table.

      Which is a database.

      Of course Big Table is relatively recent, but if you go back and read the original research paper presenting the Google search engine (pretty easy to locate using, of course, Google) you will note that it mentions 'database' more than a few times. Granted, they've implemented their own highly optimized, specific database instead of using, say, an existing relational database, but it's still a database at a much higher level than a file system.

      'Filesystems' of course won't go away, you still need to handle the low level formatting of the media, do error handling, etc. But this will become a transparent thing most people generally don't need to think about.

      --
      -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
    16. Re:NoFile by greywire · · Score: 1

      Yes, mbox. A very modern, high performance email format...

      So, mbox stores all the emails in separate files? And when you want to locate some old email, the complex file/directory structure that it uses to store and sort the emails allows you to search for an email quickly?

      When I gave up using Thunderbird in favor of Gmail, one of the reasons I did so was the fact that Gmail could search for and locate emails instantly, where Thunderbird would take a considerable time to find something, with poor results. Do you know why this is? Because mbox sucks: it's a flat file containing all emails, and databases are much better for this sort of thing. Thunderbird developers discussed the limitations of mbox years ago. Not sure what the result was, but databases like SQLite were considered.

      Now, thunderbird, and I'm sure all the rest of the email clients made in this decade, create and store indexes to find emails. Those indexes in combination with even the archaic mbox 'format' constitute a database. The filing system does nothing more than save these two files to physical media.
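
      To make the "flat mbox file plus an index" point concrete, here is a rough sketch using Python's standard `mailbox` module. The subject-word index is a toy stand-in invented for this example, not how Thunderbird's own index files work.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def build_subject_index(box):
    """Map each lowercase subject word to the mbox keys that contain it."""
    index = {}
    for key, msg in box.items():
        for word in (msg["Subject"] or "").lower().split():
            index.setdefault(word, []).append(key)
    return index


def mbox_demo():
    with tempfile.TemporaryDirectory() as tmp:
        # One flat file holds every message, as mbox does.
        box = mailbox.mbox(os.path.join(tmp, "inbox.mbox"))
        for subject in ("Lunch on Friday", "Quarterly report", "Friday deploy"):
            msg = EmailMessage()
            msg["From"] = "someone@example.com"
            msg["Subject"] = subject
            msg.set_content("body text")
            box.add(msg)
        box.flush()

        # The index is what turns a linear scan into a lookup.
        index = build_subject_index(box)
        hits = [box[key]["Subject"] for key in index.get("friday", [])]
        box.close()
        return hits


if __name__ == "__main__":
    print(mbox_demo())
```

      Without the index, every search has to read the whole file from the top; with it, the flat file only serves as storage.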

      I think everyone equates "database" with "MySQL" or more generally "Relational Database" but there's lots of other kinds of DB systems, like the recent 'NoSQL' trend (including Google's BigTable), and these are showing up everywhere.

      --
      -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
    17. Re:NoFile by bingoUV · · Score: 1

      I don't think there's an email program in existence that actually saves emails as files, or arranges them in physical folders on your hard drive

      --
      Bingo Dictionary - Pragmatist, n. A myopic idealist.
    18. Re:NoFile by bingoUV · · Score: 1

      http://en.wikipedia.org/wiki/BigTable

      Yep, they use GFS, which is not a database, like FAT32 and the rest.

      And on top of that is Big Table.

      Which is a database.

      Google search index data does not use BigTable.

      you will note that it mentions 'database' more than a few times

      Any organized collection of data, especially if stored with special regard to efficiency of read / modify, is called database. So in that regard, all filesystems are databases. But since you talk about filesystems vs databases, Google search does NOT use databases, since it uses raw GFS and not BigTable.

      I remember discussing it with Yahoo engineers, and they also explicitly avoided database usage in favour of filesystems for search index.

      but it's still a database at a much higher level than a file system.

      Yeah, and my m3u playlist, and '/usr/bin/find > fileList'; are also databases at much higher level than a file system. Except that you were talking about something completely different and changed tack after learning that databases are not used where you thought they are used. And filesystems are used where you didn't think they are used (mbox).

      --
      Bingo Dictionary - Pragmatist, n. A myopic idealist.
    19. Re:NoFile by greywire · · Score: 1

      Seems like the real issue here is in defining what a database is.

      Yes, technically, a file system could be considered a database. So too could m3u and mbox files be considered databases. Any collection of data is a database. But I think most relatively educated computer people would define a database more specifically than this. Generally, I think most people would consider a database to be a set or sets of data with accompanying index(es) used to quickly locate individual bits of that data.

      m3u, mbox, etc would not be databases. mbox in combination with *.msf file, could possibly be considered a database.

      Getting back to my two examples, which in hindsight are not the clear cut examples I thought they were (and thanks again for the wonderful argument, I love to argue! :)

      I am sure Google uses some kind of a database, even if it's not BigTable, for Indexing (note this word) the web and providing something to run search Queries (again note this word) against. I am no expert on Google but I am pretty sure when they index the web that they are not storing all this information into a huge, complex, hierarchical file structure. When searches are requested by users, I'm pretty sure they are not somehow mapped to a directory path where the file is the web page they are looking for.

      They have a database (indexes to search against and a snapshot of the websites, minimum) of some kind that is used for their service. That database is of course stored physically using GFS.

      mbox is a flat file containing lots of emails. The mail clients make little use of filesystem features for sorting, organizing or locating emails, though I believe most of them will have an mbox for each 'folder' you create, if you create any. It's not using the filesystem any more than, say, MySQL does when it creates a file for each table.

      Oracle has databases that store directly to a hard drive, but I imagine even then, there's probably a simple filesystem of sorts underneath that handles the low level formatting.

      I still stand by what I said. 'Everything' will go to 'databases' at the 'human level'. Obviously there's still going to be a filesystem at the lowest level. Just like there's still a "DOS" (not MS-DOS, just DOS in general) but nobody thinks about it anymore. Just like there's a "BIOS" of some sort, but nobody really thinks about it much.

      --
      -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
    20. Re:NoFile by Anonymous Coward · · Score: 0

      I don't think there's an email program in existence that actually saves emails as files, or arranges them in physical folders on your hard drive

      Physical folders? No. No, there is not.

    21. Re:NoFile by bingoUV · · Score: 1

      I still stand by what I said. 'Everything' will go to 'databases' at the 'human level'.

      You call this standing by? Earlier you said that a database could be represented as hierarchical filesystem, it is just a database at the back-end. See

      I think you must have missed the point I was making about the fact that you can easily build something just like a plain file/directory system on top of a database underneath

      Now you changed tack again.

      And yes, I thank you too. I enjoy arguing too, but I'd enjoy it better if you really stand by what you say.

      --
      Bingo Dictionary - Pragmatist, n. A myopic idealist.
    22. Re:NoFile by greywire · · Score: 1

      I think you must have missed the point I was making about the fact that you can easily build something just like a plain file/directory system on top of a database underneath

      Now you changed tack again.

      And yes, I thank you too. I enjoy arguing too, but I'd enjoy it better if you really stand by what you say.

      I'm saying a database should be what we commonly interact with, instead of a lower level, limited conventional file system of directories and files. If you don't like interacting in this manner, you can still have a simple directory/file like layer on top of that, and it should work just as well if not better than existing systems. For compatibility I see no reason you couldn't replace the entire filesystem in a current computer with a database and have it still work the same on the surface, while using a database underneath, and then you'd have the option of accessing the database directly to do things you couldn't easily do with a simple filesystem.
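
      A directory-like layer on top of a database might look roughly like this. It is only a sketch using SQLite from the Python standard library; the table layout and the function names (`make_store`, `write_file`, `list_tree`, `find_by_tag`) are assumptions made up for illustration.

```python
import sqlite3


def make_store():
    """Create an in-memory store: one table of paths, one of tags."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (path TEXT PRIMARY KEY, data BLOB)")
    db.execute("CREATE TABLE tags (path TEXT, tag TEXT)")
    return db


def write_file(db, path, data, tags=()):
    """Store or replace a 'file' and reset its tags."""
    db.execute("INSERT OR REPLACE INTO files VALUES (?, ?)", (path, data))
    db.execute("DELETE FROM tags WHERE path = ?", (path,))
    db.executemany("INSERT INTO tags VALUES (?, ?)", [(path, t) for t in tags])


def list_tree(db, directory):
    """The familiar hierarchical view: everything under a path prefix."""
    prefix = directory.rstrip("/") + "/"
    rows = db.execute(
        "SELECT path FROM files WHERE substr(path, 1, ?) = ? ORDER BY path",
        (len(prefix), prefix),
    )
    return [row[0] for row in rows]


def find_by_tag(db, tag):
    """The query a plain directory tree cannot answer directly."""
    rows = db.execute("SELECT path FROM tags WHERE tag = ? ORDER BY path", (tag,))
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = make_store()
    write_file(db, "/music/mashups/a.mp3", b"x", ["mashup", "game"])
    write_file(db, "/music/game/b.mp3", b"y", ["game"])
    print(list_tree(db, "/music"))
    print(find_by_tag(db, "game"))
```

      The path column gives the compatibility view, while the tag table is the part that goes beyond what a directory tree offers.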

      After the discussion so far, I have realized that clearly there is still going to be some kind of low level filesystem to handle the physical formatting of the media, etc. Also there would need to be some sort of way of saying "I want this data to be physically on this particular hardware". But I still think dealing with a simple file/folder structure is going to be marginalized.

      --
      -- Senior Software Engineer, Attorney appearance services, locallawyerapp.com.
  54. Wait until Gnome 3 and Unity get ahold of this! by Anonymous Coward · · Score: 0

    Just wait until Gnome 3 and Unity start doing away with files! They've ruined the desktop, so they need somewhere to go next. Why not design a new user interface that doesn't use files? They'll make computers unusable, or die trying! They're on a mission.

    Microsoft won't be adding this to Win8, by the way, because they've been trying to use a relational database for a decade or more as the Windows file system, and it's never gotten off the ground. That's probably music to the ears of Unity and Gnome 3, though, since they can make a huge mess disaster catastrophe out of the idea!

  55. Access control by Hentes · · Score: 1

    Ownership is just microsoftspeak for access control, it's a security feature. And the idea of metadata and access control in a filesystem is not really new.

  56. Nothing new by Hentes · · Score: 1

    While they might be new for MS these are certainly not new ideas. But at least a move in the right direction.

  57. Just because these are senior people by Anonymous Coward · · Score: 0

    PhD's of course. Just because they have done all this study, doesn't mean the idea is necessarily bad.

  58. just a simple matter of engineering... right? by Thud457 · · Score: 1

    I agree, one infinitely long tape ought to suffice for everybody.

    --

    the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff

  59. For a truly interesting take on the file concept.. by seandiggity · · Score: 1
    --
    Geeks like to think that they can ignore politics, you can leave politics alone, but politics won't leave you alone.-rms
  60. Ubiquitous Windows by Anonymous Coward · · Score: 0

    So if someone sends me a .txt file from a Windows machine, I need to find a Windows machine to open it; I can't open the damn thing on Linux with vim anymore.

    1. Re:Ubiquitous Windows by phonewebcam · · Score: 1

      Welcome to the new m$ business model. They most definitely do want you to be able to open it, but for a fee extorted by anyone writing the tool which lets you do so. Nowadays extortion beats innovation and as their relevance seeps away day by day it's their only desperate way forward.

  61. FILE(1) by tqk · · Score: 2

    I'd like to take this opportunity to point out the brilliance of the "file" command (in *nix). All its smarts, plus all the details mentioned in its manpage, are all I ever needed to know about any file's technical details. This BS from Microsoft is re-inventing the wheel, badly and foolishly, with suspiciously strange priorities. No surprise there.
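
    For anyone who hasn't looked inside it: much of what file(1) does is compare a file's leading bytes against a table of known signatures. The sketch below is a drastically simplified imitation with a handful of well-known magic numbers; the `SIGNATURES` table and `sniff` function are invented for this example and are nothing like the real magic database.

```python
# A few well-known leading-byte signatures ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x7fELF", "ELF binary"),
]


def sniff(data):
    """Return a label for the first signature that matches, else 'data'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "data"


if __name__ == "__main__":
    print(sniff(b"%PDF-1.4\n"))
    print(sniff(b"\x7fELF\x02\x01\x01"))
    print(sniff(b"just some bytes"))
```

    The point is that the type is recovered from the content itself, with no help from the file name or any external metadata.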

    The "file(1)" manpage is a great read, including potshots at SysV, BSD, and mention that it (or at least Debian's version) was written by a fellow Canuck (Ian F. Darwin).

    FYI, a point & click interface to manpages:

                  xman -notopbox -bothshown &

    Enjoy the odd behaviour of the Athena Widget Set's scrollbars. :-)

    --
    "Tongue tied and twisted, just an Earth bound misfit ..." -- Pink Floyd.
  62. And in other news by Anonymous Coward · · Score: 0

    Microsoft plans to reinvent the file system......again

  63. KISS by Nom+du+Keyboard · · Score: 1

    Keep It Simple Stupid. A file is a container for digital data. Add a unique identifier (file name) to locate it. Add external meta data to describe it if you wish. But why does it have to be more complicated when it does nothing more than hold unspecified unstructured digital data? All that these more complicated proposed systems have accomplished is to spend a lot of money with nothing yet to show for it, and delay new operating systems for years.

    --
    "It's the height of ridiculousness to say for those 9 lines you get hundreds of millions."
  64. Keep beating that dead horse by Daniel+Phillips · · Score: 1

    Is this just an attempt to rescue Bill Gates' besmirched reputation as a technical visionary?

    --
    Have you got your LWN subscription yet?
  65. Steve... by Anonymous Coward · · Score: 0

    If only we had Steve Jobs to solve this problem for us. :-(

  66. A simple taxonomy of files. by Animats · · Score: 1

    I looked at this problem once, in the context of distributed file systems. I'd divide files into the following categories, with slightly different semantics for each. This can be fitted into the standard UNIX/DOS/Windows model, but it resolves some issues that result in programs doing elaborate workarounds to get correct file semantics.

    • Unit files A unit file is only meaningful once it has been completely written and closed. Once closed, it will never be rewritten, only replaced. Most files are unit files. The file system should guarantee that 1) unit file replacement by a new version is an atomic operation, 2) unit files not properly closed do not replace old versions, so that when a program fails while writing a unit file, the old version remains undamaged, and 3) a unit file being written is not visible to other processes until closed. This was a feature of some mainframe operating systems. Because the UNIX file system concept doesn't have this, there's much fussing around with ".part" files and renaming strategies to achieve unit file semantics.

      The file system could unduplicate unit files based on their content. One approach would be that the real name of a unit file is its cryptographic hash; any other name it has is an alias. Backup programs can usefully use such information.

      The ability to close and commit a group of unit files as an atomic operation would be useful. This should be the last step of an "install", so that if anything goes wrong, the install is automatically backed out, like a database rollback.

      This is the default form of file.

    • Log files Log files are written sequentially. Writing can only be done at the end. The file system should guarantee that 1) writing must be sequential, 2) in the event of a crash and restart, the file may not be complete but will be correct up to the end position of the file, which will be at the end of a single write operation, and 3) multiple processes can write to one log file, but all processes append, never overwriting each other.

      This is how UNIX/Linux files ought to work in "append" mode.

    • Managed files Managed files are written and read in a random access fashion, but only by programs which understand their format. Managed files typically contain databases of some type. The file system should guarantee that 1) managed files can be opened for full or partial exclusive use, and 2) additional functions for ensuring that file writes are flushed from cache to disk are available.

      Managed files have the most complex semantics, but not many programs use them. The ones that do typically go to a lot of trouble to get the file semantics right, to maintain database integrity. Read what SQLite needs from a file system to get a sense of how managed files need to work.

    • Scratch files Scratch files do not outlive the process group that created them, and are invisible outside that process group. They can be read and written freely, but do not have permanent existence in the file system. The file system should guarantee that when the process group goes away, so do its scratch files, so as not to clutter up the file system. This would stop the accumulation of abandoned temporary files.
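
    On POSIX systems the scratch-file behaviour in the last item can be approximated today by removing the file's name right after creating it. A minimal sketch, assuming a POSIX system (Windows will not let you unlink an open file); `scratch_demo` is a name made up for this example.

```python
import os
import tempfile


def scratch_demo():
    """Create a file, drop its name, and keep using the open descriptor."""
    fd, path = tempfile.mkstemp()
    os.unlink(path)  # the only name is gone; the open descriptor keeps the data
    name_is_gone = not os.path.exists(path)

    with os.fdopen(fd, "w+b") as scratch:
        scratch.write(b"intermediate results")
        scratch.seek(0)
        data = scratch.read()
    # Closing the last descriptor lets the kernel reclaim the space,
    # even if the process had crashed before reaching this point.
    return name_is_gone, data


if __name__ == "__main__":
    print(scratch_demo())
```

    Python's `tempfile.TemporaryFile` wraps the same idea, so programs rarely need to spell it out by hand.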

    This is how UNIX and Linux should have worked. Today, programs struggle to get those semantics across platforms, but don't always succeed, leaving behind truncated files, partial failed installations, junk files, and database disasters where two programs accessed the same managed file.
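
    The "real name is the cryptographic hash, everything else is an alias" idea from the unit files item can be sketched in user space. This is a toy illustration assuming a POSIX system with hard links; the `blobs` directory layout and the `store_unit_file` helper are invented for the example.

```python
import hashlib
import os
import tempfile


def store_unit_file(store_dir, alias, data):
    """Store data under its SHA-256 digest and hard-link a friendly alias to it."""
    digest = hashlib.sha256(data).hexdigest()
    blob_dir = os.path.join(store_dir, "blobs")
    os.makedirs(blob_dir, exist_ok=True)

    blob_path = os.path.join(blob_dir, digest)
    if not os.path.exists(blob_path):  # identical content is stored only once
        with open(blob_path, "wb") as f:
            f.write(data)

    alias_path = os.path.join(store_dir, alias)
    if os.path.lexists(alias_path):
        os.remove(alias_path)
    os.link(blob_path, alias_path)  # the alias is just another name for the blob
    return digest


def dedup_demo():
    with tempfile.TemporaryDirectory() as store:
        first = store_unit_file(store, "report.txt", b"same bytes")
        second = store_unit_file(store, "copy-of-report.txt", b"same bytes")
        blobs = os.listdir(os.path.join(store, "blobs"))
        return first == second, len(blobs)


if __name__ == "__main__":
    print(dedup_demo())
```

    A backup program working against such a store only needs to copy blobs it has not already seen, which is the benefit mentioned above.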

    As for metadata, the original MacOS "resource fork" concept was a good one. But the original implementation was botched. The resource fork was a badly implemented tree-type database store, one that was corrupted if a program failed to close the file properly. If the resource fork had been implemented so that at the end of each write, the resource fork was guaranteed to be in a usable state, the whole concept might have been more successful. It took a long time for Apple to fix this; the phrase "damaged resource fork" appears tens of thousands of times in Google until 2006.

    1. Re:A simple taxonomy of files. by spitzak · · Score: 1

      I absolutely agree with your idea.

      Almost 100% of the files on a modern system should be your "unit files".

      The workaround on Unix to get this behavior is obscenely complex. And changes to file systems can break it: that is what all the complaints about EXT4 were: programs were assuming that link() would always be done after all previous data is written. This probably would not be a big deal normally, but was used everywhere to try to simulate the "unit file" by assuming that writing a temporary file, closing it, then linking the correct name to it would produce the desired effect. If there was a simple direct way to make such files then the people writing EXT4 could support it directly.

      "scratch files" as you describe are the same as the unit files if they are never closed. It may be a good idea to add a call that can be done to an opened for write unit file handle that duplicates what happens if your program exits without closing it, so that resources can go away.

      An extra idea is just a phony file system that just allocates space in the process's own memory. This would be mounted at some well-known location. It would always be blank on startup, and all data written there is lost when the process exits. Mmap calls can be used to instantly create a "file" that matches some block of memory. This is useful for interacting with code that only knows how to read/write files (many image libraries have this problem). Also could be used as that scratch space.
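
      Within a single language runtime, the closest existing thing to that in-memory "file" is a file-like object backed by a buffer. A small sketch, using `zipfile` purely as a stand-in for a library that insists on reading and writing files; `zip_in_memory` is a made-up helper name.

```python
import io
import zipfile


def zip_in_memory(name, payload):
    """Write a ZIP archive into process memory, then read it back."""
    buffer = io.BytesIO()  # behaves like an open file, lives only in RAM

    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(name, payload)

    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        return archive.read(name)


if __name__ == "__main__":
    print(zip_in_memory("notes.txt", b"hello"))
```

      This only helps code that accepts file objects; libraries that demand a path on disk are the case the mounted in-memory file system would cover.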

    2. Re:A simple taxonomy of files. by Tetsujin · · Score: 1

      I looked at this problem once, in the context of distributed file systems. I'd divide files into the following categories, with slightly different semantics for each. This can be fitted into the standard UNIX/DOS/Windows model, but it resolves some issues that result in programs doing elaborate workarounds to get correct file semantics.


      • Unit files A unit file is only meaningful once it has been completely written and closed. Once closed, it will never be rewritten, only replaced. Most files are unit files. The file system should guarantee that 1) unit file replacement by a new version is an atomic operation, 2) unit files not properly closed do not replace old versions, so that when a program fails while writing a unit file, the old version remains undamaged, and 3) a unit file being written is not visible to other processes until closed. This was a feature of some mainframe operating systems. Because the UNIX file system concept doesn't have this, there's much fussing around with ".part" files and renaming strategies to achieve unit file semantics.

      • Managed files Managed files are written and read in a random access fashion, but only by programs which understand their format. Managed files typically contain databases of some type. The file system should guarantee that 1) managed files can be opened for full or partial exclusive use, and 2) additional functions for ensuring that file writes are flushed from cache to disk are available.

        Managed files have the most complex semantics, but not many programs use them. The ones that do typically go to a lot of trouble to get the file semantics right, to maintain database integrity. Read what SQLite needs from a file system to get a sense of how managed files need to work.

      • Scratch files Scratch files do not outlive the process group that created them, and are invisible outside that process group. They can be read and written freely, but do not have permanent existence in the file system. The file system should guarantee that when the process group goes away, so do its scratch files, so as not to clutter up the file system. This would stop the accumulation of abandoned temporary files.

      We have everything or nearly everything we need to implement this - though admittedly it could be a bit cumbersome...

      "Scratch files" can be created by creating a file, opening it, and unlinking (removing) it. The file continues to exist until the last open filehandle to it goes away. But since it's unlinked, it has no permissions and no one else can open it. But if you want to grant a filehandle to that file to another process, you can either fork() and the new process will get a copy of the filehandle, or you can send an open filehandle to another process via... a Unix Domain socket? (IIRC. I know there's a mechanism, I don't remember if Unix Domain Sockets are it. The dbus system uses filehandle passing for certain things, I believe.)

      "Unit files" would be more or less the same thing: create the new version of the "unit file" with a temporary name and unlink it from the filesystem. Processes opening the file will see the old copy - until you're ready to commit the new version, at which point you unlink the old version and link the new version to the old filename. Anyone who opened the old version will continue to see it until the last open filehandle to it is closed. The main bit that's missing, I guess, is that this isn't an atomic operation. If you unlink the old file and then somehow fail to link the new version to the old filename, the file momentarily doesn't exist (until you link the old one back again...) And you don't get the "commit all or revert all" behavior you describe for a batch operation.

      It's worth noting that "Unit files" take the same amount of resources as the current scheme but you lose the ability to peek at the new version before it's committed...
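
      For a single file, the gap between unlinking the old version and linking the new one can be avoided on POSIX systems: rename(2) replaces the destination name in one atomic step, and Python exposes it as `os.replace`. A minimal sketch of the usual write-temp, fsync, rename pattern; `replace_atomically` is a name made up here, and this does nothing for the "commit a whole batch" case.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write data to a temp file in the same directory, then rename over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new contents to disk first
        os.replace(tmp_path, path)  # readers see the old or the new file, never neither
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # a failed write leaves the old version untouched
        raise


def replace_demo():
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "settings.conf")
        replace_atomically(target, b"version 1")

        old_reader = open(target, "rb")  # opened before the replacement
        replace_atomically(target, b"version 2")
        with open(target, "rb") as new_reader:
            result = (old_reader.read(), new_reader.read())
        old_reader.close()
        return result


if __name__ == "__main__":
    print(replace_demo())
```

      The reader that opened the file before the rename still sees the old contents, which matches the behaviour described above for open filehandles.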

      I'm not

      --
      Bow-ties are cool.
  67. What They Omitted by Scarletdown · · Score: 0

    New abstractions of file are needed, which reflect what users seek to do with their digital data, and which allow engineers to solve the networking, storage and data management problems that ensue when files move from the PC on to the networked world of today, and provide us with another trivial little idea that we can claim as our imaginary property, patent, lock down, control, and squeeze licensing fees out of in anything that implements what we are going to try to spew out.

    --
    This space unintentionally left blank.
  68. DRM on pictures is a fairy tale by tepples · · Score: 1

    Would you be opposed to a DRM scheme that would allow you to totally and irrevocably delete a picture you posted to Facebook because it allows you to retain total ownership?

    Yes, because it's a fairy tale. Anybody can take a picture of a screen using a camera.

    1. Re:DRM on pictures is a fairy tale by CharlyFoxtrot · · Score: 1

      Well that's actually a derivative work, an unlicensed derivative work. And the fact you can prove it is unlicensed because it lacks the metadata of the original is valuable by itself. Imagine you upload a picture to Facebook that has a "Creative Commons Free for non commercial use" license embedded in it, if someone exploits the analogue hole and uses it in an ad the fact that it is missing the metadata gives you a legal recourse.

      --
      If all else fails, immortality can always be assured by spectacular error.
    2. Re:DRM on pictures is a fairy tale by Grishnakh · · Score: 1

      You really think the courts are going to care if you're mad because someone made an unauthorized copy of you smoking a joint or whatever? Do you have any idea how much it costs just to file a lawsuit, after accounting for the attorney's fees? "Legal recourse" is basically useless unless you're a big company (or you're suing a big company, and can then get an attorney to work for you without an up-front fee).

  69. Has everyone forgotten the OOXML scandal? by phonewebcam · · Score: 0

    Is anyone outside m$ really considering letting them define what a freaking file is from now on after this? But let's not get ahead of ourselves here. Surely once Microsoft has bullied their shit into everything once again, we can all trust them and no one will end up having to pay any kind of extortion racket like this, and this, this, this and this.

  70. Nepomuk? by mx+b · · Score: 1

    I haven't noticed anyone bring this up yet, but I thought this was the main goal of the Nepomuk project? (in KDE)

    Well, maybe not do away with files themselves but rather how we store and access them. Rather than digging around in folders, we tag new files when we save them and search based on tags, i.e., what the file is used for or what it contains. I think that's a fantastic idea that needs more time to grow. I hate dealing with folder hierarchies, especially because I often run into the situation where a certain document can properly belong in one of several folders, and then I never know where to keep it so I don't lose it. The ability to throw all my documents in one folder and tag them with as many tags as necessary and then search for what I need (or rather, create "virtual folders" that sort things based on tags no matter their location) is a really great step out of that annoyance.
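
    The "virtual folder" idea boils down to a set intersection over a tag index. Here is a toy sketch; the `TagIndex` class is made up for illustration and is not Nepomuk's actual API.

```python
from collections import defaultdict


class TagIndex:
    """Map tags to file paths; a virtual folder is a query over tags."""

    def __init__(self):
        self._paths_by_tag = defaultdict(set)

    def tag(self, path, *tags):
        for name in tags:
            self._paths_by_tag[name].add(path)

    def virtual_folder(self, *tags):
        """Return every path carrying all of the given tags, sorted."""
        if not tags:
            return []
        matches = [self._paths_by_tag.get(name, set()) for name in tags]
        return sorted(set.intersection(*matches))


if __name__ == "__main__":
    index = TagIndex()
    index.tag("docs/lease.pdf", "apartment", "legal")
    index.tag("docs/taxes-2011.pdf", "legal", "finance")
    print(index.virtual_folder("legal"))
    print(index.virtual_folder("legal", "finance"))
```

    A document that "belongs in several folders" simply carries several tags, so it shows up in every virtual folder that asks for one of them.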

  71. On the nature of files by Tetsujin · · Score: 1

    We’ll end up with 10 different standards, and no one will bother keeping metadata accurate on all their files. At best metadata is useful for a single person on a small subset of files where they find it useful. Everything else, the only metadata anyone is going to care about (and be bothered to enter) is title, which is served fairly effectively by the file name.

    Metadata becomes a lot more useful as your collection of files grows, and as the UI develops to better take advantage of the information. To a certain degree this has already happened (and it kind of makes me wonder where they've been the last several years) - though of course you don't see it in every program yet.

    Often, yes, as you say, title is all you're likely to need - or else the few other pieces of information you need (album name, track or episode number) are easy to incorporate into the file name or directory structure. But there may be cases where a piece of media doesn't have a title. This is often the case with large numbers of photos: there may not be a basis for providing a unique title to each image. So photo management software deals with this by encouraging organization and search via metadata (basic stuff like the date, but also tagged events, locations, and individuals). These files probably have filenames, but they're probably meaningless stuff like IMG_1234.JPG - just a sequence number provided by the camera. If you think about how you'd want file operations to relate to file name in that case, the filename actually doesn't come into consideration.

    Consider, for instance, if you're moving one collection of files into another directory. And in that target directory there happens to be another IMG_1234.JPG. Do you overwrite that other file? The traditional answer would be "yes", since files are uniquely identified by their filenames. But the filenames have no particular value in this case: they are purely artificial. In this particular case that's probably not what the user wants. If anything they'd want the files to overwrite only if they're the same file.
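
    One way to get "overwrite only if they're the same file" is to compare content hashes when names collide. This is a rough sketch of that policy; `move_photo` and its "-1", "-2" renaming scheme are assumptions invented for this example.

```python
import hashlib
import os
import shutil
import tempfile


def content_digest(path):
    """SHA-256 of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_photo(src, dest_dir):
    """Move src into dest_dir without clobbering a different photo of the same name."""
    name = os.path.basename(src)
    dest = os.path.join(dest_dir, name)
    if not os.path.exists(dest):
        shutil.move(src, dest)
        return dest
    if content_digest(src) == content_digest(dest):
        os.remove(src)  # same picture already there: nothing to keep twice
        return dest
    stem, ext = os.path.splitext(name)
    counter = 1
    while os.path.exists(os.path.join(dest_dir, f"{stem}-{counter}{ext}")):
        counter += 1
    renamed = os.path.join(dest_dir, f"{stem}-{counter}{ext}")
    shutil.move(src, renamed)  # different picture: keep both
    return renamed


def move_demo():
    with tempfile.TemporaryDirectory() as root:
        album = os.path.join(root, "album")
        os.makedirs(album)
        for card, payload in (("card_a", b"beach"), ("card_b", b"party"), ("card_c", b"beach")):
            os.makedirs(os.path.join(root, card))
            photo = os.path.join(root, card, "IMG_1234.JPG")
            with open(photo, "wb") as f:
                f.write(payload)
            move_photo(photo, album)
        return sorted(os.listdir(album))


if __name__ == "__main__":
    print(move_demo())
```

    Three cameras produce three files named IMG_1234.JPG; two distinct pictures survive and the exact duplicate is dropped.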

    There are other cases where there may be a sensible choice for the filename of each file, but it doesn't necessarily make a lot of sense to have it as a unique identifier. For instance, suppose you have a directory full of videos, and each has its title as its filename. Now suppose there's different variations of a few of the files: maybe two different versions of the same movie (a fansub and an official release) - maybe two different encodes (one for best quality, others to play on specific devices - and remember that different encodes could use the same container format, so the difference isn't necessarily ".AVI" vs. ".MKV" or whatever)... You could incorporate that information into the filename or directory structure - but it makes the filenames increasingly artificial, and the directory increasingly burdened with additional directories - different files treated as different items in the directory when in fact there is good reason to treat them as different versions of the same thing.

    And then there's other cases where the traditional system of filenames works just fine and change will probably only foul things up - source code directories and so on in which it's very convenient to have a simple way to uniquely identify a particular piece of data, without having to address complicated questions like "but which one?" At most, you'd want something analogous to (or integrated with) a versioning system - you could specify "which one" if you wanted to but most of the time that question would already be answered.

    So I think there are problems with the current system of files. There are cases where there is no useful information stored in the filename, and changing the filename to be both unique and informative is a much more cumbersome process than populating the file's metadata with relevant tags. There are cases where multiple "files" may just be different renderings of the same thing, in which case it may not be useful to treat them as separate entities at all, but rather as versions of the same thing. So I think it is worth thinking critically about how we approach the issue in the future.

    --
    Bow-ties are cool.
  72. Confusing by jdavidb · · Score: 1

    At first glance, I thought they were talking about implementing something like the HURD, where every "file" could potentially be a service behind the scenes. Reading a little further, I thought they were talking about implementing something like the old Macintosh HFS resource forks.

    But then I kept reading and realized they were just making some noise about DRM. Nice try, Microsoft; you almost had me going, there.

  73. Make them encapsulated by hesaigo999ca · · Score: 1

    The problem is that a filesystem has its own way of dealing with types of files.
    I say keep everything as binary, and allow for the tool opening that file (binary) to have a centralized view of dealing with that data.
    I have a binary file which could be Word, or could be HTML, or could be PDF, but the file itself knows what it is. This would just require that ALL file systems keep the files unfragmented, which is not possible at this point unless using Linux distros. FAT and NTFS, which are 80% or more of the market share, fragment the files so that the OS knows the file info and where it resides.

    Why break a file at all? If you do, you have no way of knowing what is what. Somewhere along the way the file could have recognizable pointer markers to mark the start and end of a file, within the file, so as to allow for quick data loss recovery as well. I think ext3 does this, if I remember correctly, but it has been so long since I touched Linux that I feel almost like a virgin: fragile and vulnerable... jk.

    A file is a file; keep it together, even when you work with it. Then rebuilding the file structure is easier, and keeping info inside the file itself becomes doable as well.

  74. The filesystem is a database. by blair1q · · Score: 1

    The filesystem is a database with a schema that's baked into the operating system or (in the case of savvy operating systems) bolted onto it.

    Oracle has made a mint by doing away with files and using bare disks to hold data in schemas their users develop (or buy), but all they're doing is generalizing disk access. They do this because dealing with both the filesystem schema and the user schema is redundant and a waste of time, and time is money to users with megajumbo databases. They also do it because it involves a lot of proprietary middleware they can overcharge for, even though it's pretty simple and even Larry Ellison could implement it.

    If Microsoft wants to beef up the filesystem trope beyond directories and inodes and open/close/read/write, then let them. They've cut several versions of the Windows filesystem over the years, there's no reason they can't roll out a new one and let the market vote with its feet.

  75. Be guided, but not bound, by traditional paradigms by Tetsujin · · Score: 1

    A file is simply a linear series of data. Period. End of story.

    Engineers and interface designers have the ability to determine how these abstractions are implemented at the low level, and how they are presented to the user at the conceptual level. While the abstractions of the past and present are very useful, it is not sensible to assume they will continue to be the best course in the future.

    Let's look at an example in kind of the middle ground, between implementation and conceptualization. Suppose you have a file containing plots of some value with respect to time.

    Now, each plot, by itself, is very linear by nature. However, the plots together are not. They run in parallel. You could simply concatenate them or interleave them to turn them into a serial file, but the data is not linear by nature. As a result you pay a certain price, when you edit that data, maintaining that serial format. Suppose you want to add data to the end of the time sequence for each plot? If the plots are simply concatenated, you have to shift the contents of the file around each time. If the plots are interleaved, then you introduce a bunch of file seeks when you read the data back out. If you decide each plot should be a separate file, then each is no longer tightly coupled to the series of time values, unless that time sequence is duplicated in each file... And the collection is no longer a "unit" on the filesystem. The data is separated into multiple units, in that case, for reasons that have no relationship with the nature of the data itself or the ways it is intended to be used.
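
    A rough Python sketch of that trade-off, for illustration only: the 8-byte sample size and the function names are assumptions, and the shift count models the naive approach of inserting into one plot at a time.

```python
import struct

SAMPLE = struct.Struct("<d")  # assumed: each sample is one 8-byte float


def concatenated(plots):
    """Serialize as plot 0's samples, then plot 1's, and so on."""
    return b"".join(SAMPLE.pack(v) for plot in plots for v in plot)


def interleaved(plots):
    """Serialize as sample 0 of every plot, then sample 1 of every plot, ..."""
    return b"".join(SAMPLE.pack(v) for row in zip(*plots) for v in row)


def bytes_moved_on_append(n_plots, n_samples):
    """Bytes shifted when one sample is appended to every plot in the
    concatenated layout, inserting into one plot at a time.

    Appending to plot i pushes the existing data of every later plot
    further down the file; the last plot pushes nothing.
    """
    plot_bytes = n_samples * SAMPLE.size
    return sum((n_plots - 1 - i) * plot_bytes for i in range(n_plots))
```

    In the interleaved layout the same append is just a write at the end of the file, but each individual plot is then scattered across the file, which is the seek cost described above.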

    Therefore, I think there is a certain merit to the idea of separating the concept of creating "contiguous, linear" allocations of disk storage from the concept of creating a "unit" in the directory tree. Forked files allow you to shift the problem of expanding or shrinking these allocations to the filesystem layer - arguably a more appropriate place for it than in the application itself.

    --
    Bow-ties are cool.
  76. Universal container formats by Tetsujin · · Score: 2

    Wouldn't it be possible to make a "universal" file container, in that any other file type could be embedded with a text file that listed: what type of file it is, what program it is associated with, owner, creation/mod dates, and especially, tags and other types of metadata?

    Do you know that they tried this already?
    In 1985? (Well, I'm speaking specifically of IFF - but there were other efforts. Mac's file forks were kind of the same sort of thing, except that they maintained the abstraction all the way down to the filesystem layer.)

    Now, just because they tried it already and more or less failed doesn't mean it couldn't work... But they were in a much better position in 1985 to make this work than they are now (we've gone too long and come too far without a "universal format", it'd be nearly impossible to get people to embrace that kind of change now...) so I think it's kind of a lost cause.

    I found it absolutely fascinating, personally, when I read one of the original documents on IFF. The ambition, the hubris perhaps, with which they were trying to guide the future of personal computing. They weren't just seeking to create "a" format, they were aiming for it to be the format. And it would have been capable of just about everything you suggest - embed a FORM of whatever you like in a LIST, put in descriptive chunks, etc... I believe Amiga embraced the concept to a fairly high degree.

    There are various historical and technical reasons why it didn't really pan out. I think one of the big ones is simply that IFF wasn't the right format for everything. Perhaps no one format can be. Among other things, IFF required that a four-byte payload size appear at the start of each chunk. That limits a chunk (and therefore a file) to 4GiB maximum (not such a big deal in 1985 or even 1995... but these days it'd be an unacceptable limitation). Another problem is that sometimes you need to write out some data and you just don't know how big it's gonna be. Streaming audio and video are a pretty good example. You can discretize the stream, populate it with known-size chunks, but you don't know the size of the whole stream until it ends.
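
    A minimal Python sketch of the chunk layout described above, assuming the usual IFF arrangement of a four-byte ID, a four-byte big-endian size, and a pad byte after odd-length payloads. Treating the size field as unsigned is an assumption made to match the 4GiB figure, and the helper names are hypothetical.

```python
import struct

HEADER = struct.Struct(">4sI")  # chunk ID, then payload size (assumed unsigned)


def make_chunk(chunk_id, payload):
    """Build one chunk: header, payload, and a pad byte if the size is odd."""
    pad = b"\x00" if len(payload) % 2 else b""
    return HEADER.pack(chunk_id, len(payload)) + payload + pad


def iter_chunks(buf):
    """Yield (chunk_id, payload) pairs from a flat sequence of chunks."""
    offset = 0
    while offset + HEADER.size <= len(buf):
        chunk_id, size = HEADER.unpack_from(buf, offset)
        start = offset + HEADER.size
        yield chunk_id, buf[start:start + size]
        offset = start + size + (size & 1)  # skip the pad byte, if any


def fits_in_size_field(n):
    """True if a payload of n bytes can be described by the size field."""
    try:
        struct.pack(">I", n)
        return True
    except struct.error:
        return False
```

    Note that the writer has to know len(payload) before it can emit the header, which is exactly the streaming problem: either buffer the whole payload first, or seek back and patch the size afterwards.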

    I think general-purpose data formats are a good thing - but I believe it's very important to consider that there may be cases where a particular format just isn't right for the problem. And that brings us back more or less to the current scenario, in which different applications tend to have totally distinct file formats, not even sharing an overall containment structure. From that perspective, it's wasteful to continue re-inventing metadata storage for each new file format that comes along, and wasteful to implement all these different methods of reading metadata out of different application-specific file formats. There's also the danger that we will want to change the format of the data in the metadata fields (just as we shifted from "whatever local variant of ASCII your region uses" to mostly using UTF-8 - which still isn't necessarily adequate for all regions, incidentally). Another all-new text encoding so soon after Unicode's introduction isn't too likely, but the OS, in defining how these metadata fields are defined and used, could change the requirements in ways that go beyond what the container format can provide (for instance, storing data that exceeds a particular format's "metadata region" size limit, or storing something that's better encoded in some binary form other than text). Decoupling the encoding of metadata from the definition of file formats eliminates a bunch of redundant work and leaves us more room to change what metadata contains and how those contents are used, as we get a better idea of how, ultimately, it will be used as the dust settles around this whole issue.

    --
    Bow-ties are cool.
    1. Re:Universal container formats by mx+b · · Score: 1

      Wasn't aware of the history, thanks for your response.

  77. Indexing services by Tetsujin · · Score: 1

    I just want one thing: a file system that is part database for fast file searches. I don't want to manually build indexes or any other bullshit; just look at the file table and give me my fucking file. Even if you had 100,000 files with file names of 256 characters, it's only 2.5 MB, how long does that take to parse? Maybe I don't understand file systems but even a 10 MB file table should only take a few seconds to scan. When I do a search of a directory or entire disk with tens of thousands of files it sometimes takes a minute or two. The disk is thrashing away as if the program is looking all over for the file names. Shouldn't they all be in one place pointing to where they are on disk?

    We pretty much already have this. The way it works in practice is that there's some service on the machine that provides indexing, maintaining a central database of metadata. When a file changes, the metadata is re-scanned and the index updated. Then you can use the index to search for things.
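
    As a toy illustration of that arrangement (not how any particular desktop indexer is implemented), here is a Python sketch that walks a tree once, records basic metadata in an in-memory SQLite table, and then answers name searches from the table instead of the disk. The table layout and function names are assumptions; a real service would also watch for changes and update rows incrementally.

```python
import os
import sqlite3


def build_index(root):
    """Scan root once and return a database of path, name, size and mtime."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files ("
        "path TEXT PRIMARY KEY, name TEXT, size INTEGER, mtime REAL)"
    )
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            db.execute(
                "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)",
                (path, name, st.st_size, st.st_mtime),
            )
    db.commit()
    return db


def search(db, fragment):
    """Return sorted paths whose file name contains fragment.

    SQLite's LIKE is case-insensitive for ASCII, and % and _ in the
    fragment act as wildcards.
    """
    rows = db.execute(
        "SELECT path FROM files WHERE name LIKE ? ORDER BY path",
        ("%" + fragment + "%",),
    )
    return [row[0] for row in rows]
```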

    I know this exists for Linux but I don't know to what extent it's actually supported by applications. (I never use the feature.) On Windows, these days, file manager windows show columns containing metadata fields (unless you turn that off - haven't had much luck so far, actually) and you can't swing a dead cat without hitting a search field.

    It's not incorporated at the filesystem level, but for the most part it doesn't really need to be. As long as the feature is there, and you can rely upon it being there, and applications actually take advantage of it and work with it, it's just as good. Well, very nearly.

    One thing I think could be improved is that sometimes file names just aren't meaningful. Filenames from a digital camera, for instance, tell me very little - and renaming the files, while keeping the names unique and making them meaningful is not always easy. I could have two different 003.jpeg's in two different directories with completely different contents. If I move the contents of one directory into the other, I don't really want one to overwrite the other, because that would be dumb. That filename is entirely meaningless, the only reason the file even has a name is because the filesystem requires files to have unique names. But that could be addressed at the UI level (and has been, on Windows anyway, which is how you wind up with things like "003 copy copy copy copy.jpeg") so it doesn't necessarily require a change to the underlying file paradigm.
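
    The UI-level workaround amounts to something like the following Python sketch. The " copy" suffix rule is modeled on the example name above and is an assumption, not any file manager's documented behavior; the function name is hypothetical.

```python
import os


def unique_destination(directory, filename):
    """Return a path in directory that does not clobber an existing file.

    Keeps appending " copy" before the extension until the name is free.
    """
    base, ext = os.path.splitext(filename)
    candidate = filename
    while os.path.exists(os.path.join(directory, candidate)):
        base += " copy"
        candidate = base + ext
    return os.path.join(directory, candidate)
```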

    --
    Bow-ties are cool.
  78. Tagging by Tetsujin · · Score: 1

    When I download something from the web, it goes into ~/Downloads. I don't have to waste time telling the system what it is, I don't have to figure out where the system put it and if I want to see all the files I downloaded I just 'ls ~/Downloads'.

    But sooner or later you're probably moving them somewhere, right? The analogous procedure in a "database" filesystem would be to recategorize the item from "uncategorized recently downloaded item" to something that'll be more useful in the long term. One important thing to remember is that right now tagging and categorizing a file is a chore because that's not something a lot of UI is designed to do... and stashing a file in a directory hierarchy is relatively easier because it's something we're used to, and something UIs are currently geared toward. That's also the only reason sticking files in a database could reasonably be called "hiding" them. It's just a question of what the UI is geared toward. (In the case of folks like me, and you apparently, the "UI" in this case would include the command shell and various mechanisms used to address files in other programs as well...)

    My personal feeling is that the usefulness of a "database filesystem" would be greatest in (and perhaps limited to) certain domains. Media files, hell yes. Source code... Well, apart from version control I really don't think so. :) But having the hierarchy indexed could certainly be useful ("find me the files that implement this method of this virtual base class - because some knob on the programming team doesn't believe in grouping class definitions in a sensible way"... of course some IDEs already do this independently of any system-wide indexing support)

    The thing with media files is that you tend to wind up with a huge collection of 'em. They usually don't need to reference one another the way source files do, they're self-contained and there's not always a good way to name them. Video and music files can usually be named with their title, of course, and it's not hard to put them into a hierarchy, but the hierarchy isn't always useful. If what I want is to play "Doppelganger" - I'm not likely to have to disambiguate a request like that. I can specify the full path from the base directory where I keep all my music, but I could as easily skip that. (And it's not that hard in a shell that supports the ** notation for "find" like searches during globbing...) But if I take a bunch of photos, there may not be any value to giving them filenames at all. It's more useful to organize photos by things like date and tags. "IMG_0286.JPEG" isn't useful in any way, and coming up with unique titles for images that may not even merit a title would get a bit crazy. The filename becomes an afterthought in that case - a necessary "evil" if it's something you have to deal with manually. (If it's dealt with automatically, the filename could still be useful as a unique identifier.)

    --
    Bow-ties are cool.
  79. Epic Legends of the Hierarchs by Tetsujin · · Score: 1

    People who say that hierarchical filesystems suck probably have a big mess on their table in real life.

    I have never tried organizing my table hierarchically. Tell me, where do I put my cell phone in that case? Do I group it with my desk phone, because it's a phone, or with my computer, because my phone is also a little computer that's plugged into my big computer?

    --
    Bow-ties are cool.
    1. Re:Epic Legends of the Hierarchs by RCL · · Score: 1

      Normally you pick one of the possible categories, and I would group it with the desk phone because of similar function.

  80. Re:Stevearino, the Stevinator... by Tetsujin · · Score: 1

    If only we had Steve Jobs to solve this problem for us. :-(

    Hey, he had his chance. How long was he with Apple after his return? Ten years or something? He ushered in OS X and the shift to Intel - both of them representing fairly extensive breaks with previous Apple products, either of which could have been a great time to tackle something like this...

    Apple brought us Spotlight, I guess. It's representative* of the way things have been going: not redefining the filesystem, but building a database of file information and using it for searches.

    (* I don't know who pioneered the concept or what implementations came first, so I say only that Spotlight is an example of filesystem indexing.)

    --
    Bow-ties are cool.
  81. Some unlicensed derivatives are not infringing by tepples · · Score: 1

    Well that's actually a derivative work, an unlicensed derivative work.

    In some cases, the law permits an unlicensed derivative work. For example, as my country's copyright law puts it: "The fair use of a copyrighted work, for purposes such as criticism, [...] is not an infringement of copyright." Granted, your ad example probably isn't a fair use.

  82. That's not really what I was talking about, but... by Tetsujin · · Score: 1

    And all this shows why you don't need the filesystem to track metadata, all you have to do is embed it into the file.

    Well, my post really wasn't addressing that question at all. My post was about whether using metadata, as opposed to directory structure and filename, to find data was a reasonable sort of UI, or if people's tendencies to be lazy about writing metadata would undermine that too much. My point was that if the system is well-designed around the use of metadata, users will tend to keep their metadata well-ordered, because in that case it's actually useful and easy to do so.

    To address the point of whether metadata should be part of the file structure, or adjacent to it - I think there are advantages to each approach. Presently, there's a lot of infrastructure that's just not geared to dealing with metadata that's not stored as part of the file itself. But if you copy an MP3 file, the ID3 tag will be preserved, because it's there in the file structure. So at present that's a definite advantage, and not one to be underestimated.

    There are disadvantages to bundling the metadata: for starters, if you have two files with identical data but different metadata, tools like "diff" or "md5" would reflect that difference. Or if you modify a bit of metadata, you're changing the file's modification time as well. That could be undesirable. Suppose you download a file via bittorrent and tag it according to your preferences - you won't be able to seed the torrent from that file, because its checksum will have changed. Or what if you want to tag an HTML file with metadata, or some other file type for which metadata either isn't supported, isn't adequate, or you just plain don't want it there in the file contents? The reason why it's called metadata is because it's data in reference to the primary data of the file... not part of the primary data of the file. I don't claim that this, by itself, is a conclusive argument in favor of filesystem-level metadata but I hope you take my point that there is a logical basis supporting its separation from the primary data stream.
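
    To illustrate the checksum point, here is a small Python sketch. It appends a simplified ID3v1-style tag (128 bytes at the end of the file, starting with "TAG") to the same stand-in audio bytes; only the title field is filled in, and the helper names are assumptions.

```python
import hashlib


def with_id3v1(audio, title):
    """Return audio plus a simplified 128-byte ID3v1-style trailer.

    Only the 30-byte title field is populated; the rest is zero-filled.
    """
    tag = b"TAG" + title.encode("latin-1")[:30].ljust(30, b"\x00")
    return audio + tag.ljust(128, b"\x00")


def md5_hex(data):
    """Hex MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


audio = b"\xff\xfb" * 1000          # stand-in for identical audio data
mine = with_id3v1(audio, "Doppelganger")
yours = with_id3v1(audio, "doppelganger (live?)")

same_audio = mine[:-128] == yours[:-128]          # the primary data matches
same_checksum = md5_hex(mine) == md5_hex(yours)   # the whole-file hashes do not
```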

    There are also the maintenance issues around supporting each new type of metadata for each new file format as it's introduced - and if, as part of the OS design, you make some decision about metadata or how it's used that doesn't fit well with how it's stored in a particular file format, then resolving that disparity could be a bit of a headache for the implementers as well as anyone who has to use that UI. If you provide metadata as part of the filesystem, its format can be changed to suit the way it's being used, and these changes can be transparent to applications and users.

    But I think the bigger issue, and the main thrust of the articles and the main focus of current work in improving utilization of metadata in the UI, has more to do with how metadata is presented to the user, rather than whether it's stored in the file or adjacent to it. The indexing systems in present use can use both approaches: metadata within files for file formats the indexing system is designed to specifically support, and filesystem-level metadata for others. From a pragmatic standpoint that's probably the way to go: use file-level metadata where it's appropriate, use filesystem-level metadata where it's appropriate, and just do your best to resolve disparities as they crop up. It's hard to effect this kind of change, so there are always advantages to an approach that provides a smooth transition.

    --
    Bow-ties are cool.
  83. meanwhile... by Tom · · Score: 1

    While MS has its research department thinking up old thoughts everyone and his dog has had for the past 20 years, we already have metadata in half a dozen non-MS filesystems, and we have resource forks, extended attributes and user-presentation layers that will happily show the user a directory as the application contained inside because really that's what he cares about.

    What we don't have is some of the other interesting ideas we had 40 years ago. Some of them went out rightfully, some of them we simply lost because they were good ideas that weren't ported to our modern operating systems.

    So, you want to re-invent the file? How about you come up with one idea that's actually new? Because otherwise it's re-hashing, not re-inventing. :-)

    --
    Assorted stuff I do sometimes: Lemuria.org
  84. Re:Be guided, but not bound, by traditional paradi by wisnoskij · · Score: 1

    But you could come up with a million different examples of data, and how they are handled has to be at the application level because only the application knows how to deal with the data.
    And there is a reason the files are linear series of data, because that is what HDs are as well.
    Not that a file cannot be broken apart into different sections to fit/optimize performance but at the application level they have to be considered linear series of data if only because every programming language on earth is set up to read files linearly.

    --
    Troll is not a replacement for I disagree.
  85. Re:That's not really what I was talking about, but by Grishnakh · · Score: 1

    It seems to me that by putting metadata into the filesystem, you're creating some big problems with compatibility: different filetypes need different metadata. For instance, a PDF file might have information on author, title, etc. A jpeg file might have EXIF camera settings. Having the filesystem deal with metadata seems like it's pushing this stuff down into the OS, where it really should be left up to apps. Also, what if people decide they want different metadata? Back when jpegs were first made, they didn't include EXIF data, but now they frequently do thanks to the proliferation of digital cameras. Presumably, the standard was modified to allow this. But changing the standard is easy with metadata encapsulated in the file itself; just have a version number in the file heading saying what standard the file conforms to, and apps will read this and interpret the data accordingly. Changing an application or two is a lot easier than patching the whole OS to deal with a change in metadata standards. Or, what about files made by specialized programs that not many people use? Why should OS makers have to deal with metadata standards for every filetype in existence? The way it is now, only people who write apps dealing with a particular filetype have to deal with metadata standards.

    Taking the metadata out of the file creates a lot of complexity, without any significant gain that I can see. Your examples of bittorrent files and files with changed metadata not md5-matching others just don't seem to be enough of a problem to warrant all these changes. In fact, these problems can be easily fixed by fixing the tools that use these files; BitTorrent, for instance, could be modified so that certain popular filetypes (e.g. video files like avi and mkv) are recognized by the tools and the metadata ignored when creating an md5sum. Modifying a tool that only some people use is a lot easier than modifying an entire OS.
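
    A sketch of what such a tool-side fix might look like in Python, using a trailing ID3v1-style tag (the last 128 bytes, starting with "TAG") as the one recognized kind of metadata. This only illustrates the idea: BitTorrent itself hashes fixed-size pieces with SHA-1 rather than whole files with MD5, and the function name is hypothetical.

```python
import hashlib

TAG_SIZE = 128  # size of an ID3v1 trailer


def payload_md5(data):
    """MD5 of the file contents with a trailing ID3v1-style tag ignored."""
    if len(data) >= TAG_SIZE and data[-TAG_SIZE:-TAG_SIZE + 3] == b"TAG":
        data = data[:-TAG_SIZE]
    return hashlib.md5(data).hexdigest()
```

    Every additional format needs its own rule of this kind, and a payload that merely happens to end in something tag-shaped would be misread, which is part of what the two approaches are trading off.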

    MacOS tried to do this very thing long ago, and finally gave it up. There's probably a good reason for that: the benefits weren't worth the costs.

  86. Re:Be guided, but not bound, by traditional paradi by Tetsujin · · Score: 1

    But you could come up with a million different examples of data, and how they are handled has to be at the application level because only the application knows how to deal with the data.

    Well, I take your point, that it doesn't necessarily make sense for the OS to get too heavily involved in what would normally be application-level decisions. However, I think more flexibility in the structure of files could be useful. Give applications the tools and let them decide how to use them. Providing a feature like file forks is awkward because most software doesn't currently deal with it (as you point out) and there's a little bit of a technical challenge in implementing it and a logistical problem in getting it into all the various filesystems (the ones where it's possible to do so, anyway) - but it is not an insurmountable problem, nor is it a slippery slope into ever-increasing complexity, or a path leading to an unavoidable fate of excessive OS involvement in application file storage strategies. It is one useful organizational tool, allowing an application to have multiple "sequential byte range" abstractions within something that's treated as a single unit on the filesystem. The difference in implementation between file forks and directories would be very minor. The major difference would be in how UIs treat the "forked file".

    And there is a reason the files are linear series of data, because that is what HDs are as well.

    To the extent that this is true now, it is becoming less so over time. Hard disks aren't linear by nature - they have multiple platters, for starters, so they'd be more like multiple contiguous ranges. Then there's firmware in the drives that maps around bad sectors, quietly substituting other areas of the disk. One can certainly still treat the drive as a linear range of storage space, and it's a convenient way of dealing with the disk, but it's an abstraction that we're very quick to abandon... Filesystems, for starters. If you have a directory of files, you don't want to think about that in terms of sequential storage on disk. You want to be able to copy and move and erase and create them and never care about where on the disk they go. The filesystem layer hides the abstraction of the disk as a sequential thing, and then reintroduces it at the file level.

    And there's no guarantee that the "sequential" file data even will be "sequential" on-disk. We try to minimize the fragmentation of files, but the "sequential" nature of the file is, again, really just an abstraction. We could exploit that - tell the filesystem to insert a block of disk space into the middle of a file, and the filesystem wouldn't really have to move any data around to perform an insertion - but as far as I know that's not a supported operation at the application level on any OS. So instead we read all the data out of later parts of the file into RAM, then write it back to disk somewhere else - all for an insertion operation that could be handled much better by the filesystem.
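
    For reference, the application-level workaround looks roughly like this in Python. It is a minimal sketch with a hypothetical function name: it holds the whole tail in memory, which a real tool would do in blocks.

```python
def insert_bytes(path, offset, data):
    """Insert data at offset by rewriting everything after it.

    Returns the number of existing bytes that had to be rewritten.
    """
    with open(path, "r+b") as f:
        f.seek(offset)
        tail = f.read()      # pull the rest of the file into RAM
        f.seek(offset)
        f.write(data)        # write the new bytes over the old position
        f.write(tail)        # then put the old tail back after them
    return len(tail)
```

    The cost is proportional to the size of the tail, not to the size of the insertion, so a one-byte insert near the start of a large file rewrites almost the whole file.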

    Not that a file cannot be broken apart into different sections to fit/optimize performance but at the application level they have to be considered linear series of data if only because every programming language on earth is set up to read files linearly.

    Things change.

    I mean, I get your point here, too. I wasn't a Mac user back in the day, but from what I've seen interoperability was a bitch because of file forking. And that would still be true today.

    But I can't accept "because that's the way it's been for 30 years" as an argument for why a design choice is good. Things will change in the future. I don't know how, or when, but it's bound to happen. Being a part of that change, rather than being left behind by it, requires openness to new ideas. Even the most fundamental concepts of computing, sooner or later, will be subject to revision.

    --
    Bow-ties are cool.
  87. Re:That's not really what I was talking about, but by Tetsujin · · Score: 1

    It seems to me that by putting metadata into the filesystem, you're creating some big problems with compatibility: different filetypes need different metadata. For instance, a PDF file might have information on author, title, etc. A jpeg file might have EXIF camera settings. Having the filesystem deal with metadata seems like it's pushing this stuff down into the OS, where it really should be left up to apps.

    But what we're dealing with now is metadata very much as an OS-level concept. I think right now the implementations have a bit of a "strapped-on" feel, but it's going to become more and more central to how OS UI works.

    Also, what if people decide they want different metadata? Back when jpegs were first made, they didn't include EXIF data, but now they frequently do thanks to the proliferation of digital cameras. Presumably, the standard was modified to allow this. But changing the standard is easy with metadata encapsulated in the file itself; just have a version number in the file heading saying what standard the file conforms to, and apps will read this and interpret the data accordingly. Changing an application or two is a lot easier than patching the whole OS to deal with a change in metadata standards.

    It really isn't. At least if you're patching the OS, you can just make the change once, instead of again and again for every application that uses the file type. How many applications work with video files, or images?

    OS-level metadata also tends to be very flexible. xattr support on Linux, for instance, lets you store name/value pairs with whatever name/value you want. There are limitations (I think a maximum value size limit measured in kilobytes, at least on some filesystems - so it's not like a full "file fork" implementation at this point) - so pretty much, if you want to add a new field, you just add a new field. The same is actually true of most forms of metadata stored within file contents as well - at least the modern ones. I think if a metadata system doesn't have the approximate flexibility of XML then it's pretty much rejected. :)
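
    For the curious, a minimal Python sketch of the xattr route. os.setxattr and os.getxattr are only available on Linux, unprivileged names must live in the "user." namespace, and whether the call succeeds depends on the filesystem and mount options, so the helper (a hypothetical name) returns None when the attribute can't be stored.

```python
import os


def tag_file(path, name, value):
    """Store value under the extended attribute name and read it back.

    Returns the stored bytes, or None if xattrs are unavailable here.
    """
    if not hasattr(os, "setxattr"):
        return None          # not a Linux build of Python
    try:
        os.setxattr(path, name, value)
        return os.getxattr(path, name)
    except OSError:
        return None          # filesystem refused or doesn't support it
```

    Something like tag_file("IMG_0286.JPEG", "user.tags", b"beach;2009") leaves the file's bytes untouched, so its checksum doesn't change, but many copy and archive tools will drop the attribute unless told to preserve it.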

    Or, what about files made by specialized programs that not many people use? Why should OS makers have to deal with metadata standards for every filetype in existence? The way it is now, only people who write apps dealing with a particular filetype have to deal with metadata standards.

    That is not the way it is now. Desktop indexing (present in Windows, OS X, and at least optionally in Linux) monitors the filesystem, re-scanning the in-file metadata when a file is modified, so it can build a central database for quick searches. So the indexing system needs to know how to read these different file types.

    Taking the metadata out of the file creates a lot of complexity, without any significant gain that I can see. Your examples of bittorrent files and files with changed metadata not md5-matching others just don't seem to be enough of a problem to warrant all these changes. In fact, these problems can be easily fixed by fixing the tools that use these files; BitTorrent, for instance, could be modified so that certain popular filetypes (e.g. video files like avi and mkv) are recognized by the tools and the metadata ignored when creating an md5sum. Modifying a tool that only some people use is a lot easier than modifying an entire OS.

    I wouldn't exactly call that "easy", personally... :) Maybe you're right and my examples could be better. But it addresses a general issue that metadata is not conceptually part of the file contents - that's the whole point of metadata. Like the filename, it's just there to tell you what's in the file. If you change the filename or date stamp, it doesn't affect the file's contents. So by the same logic I'd say search tags and so on shouldn't be part of file contents either.

    Concepts don't always m

    --
    Bow-ties are cool.
  88. Nothing to rethink by reboot246 · · Score: 1

    I use my files to sharpen things - knives, mower blades, machete, hoes, spades, etc..

    Simple, huh?

  89. Re:That's not really what I was talking about, but by Anonymous Coward · · Score: 0

    Taking the metadata out of the file creates a lot of complexity, without any significant gain that I can see.

    Actually I think it creates complexity for the OS and filesystem, but simplicity for the application. Apps don't need to worry about anything other than the contents of the file, and knowing how to structure the data appropriately.

    MacOS tried to do this very thing long ago, and finally gave it up. There's probably a good reason for that: the benefits weren't worth the costs.

    Nah, they didn't give it up. HFS+ still uses it. What they did do is start relying on filename extensions to be more compatible with other platforms in a networked environment. Too much for my taste, actually. Filename extensions are a convenience, but they're also a relic that should have been ditched in 1984 (and was, by Apple). There is absolutely no reason the filesystem should have to rely on the filename to identify the file format.

  90. Re:That's not really what I was talking about, but by Grishnakh · · Score: 1

    They never were used by Unix/Linux, to my knowledge. The "file" command will quickly tell you what kind of file you're dealing with, regardless of its name or extension.
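
    The core trick is easy to sketch in Python: look at the first few bytes instead of the name. This is a toy with a handful of well-known signatures and a hypothetical function name; the real "file" command consults a much larger database of rules.

```python
# (prefix, description) pairs for a few well-known leading signatures
MAGIC = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x1f\x8b", "gzip compressed data"),
    (b"PK\x03\x04", "ZIP archive"),
]


def sniff(data):
    """Guess a file type from its leading bytes, ignoring any filename."""
    for prefix, label in MAGIC:
        if data.startswith(prefix):
            return label
    return "data"  # unknown content
```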

  91. Re:That's not really what I was talking about, but by Anonymous Coward · · Score: 0

    True, they've never held any special meaning to the filesystem/OS on Unix/Linux, but they have been in common usage for decades.

    How many times have you found an old program distributed as a .tar.gz ?

  92. Aww, by Anonymous Coward · · Score: 0

    Not this shit again :-(

  93. Yay, recycled again by bryan1945 · · Score: 1

    Terminals, network PCs, cloud, whatever.
    If companies want to put their stuff out "there," feel free.
    I want my grubby hand on my files. AND I use whiteout on them. (PITA to take the drives in and out of the enclosure all the time, though)

    --
    Vote monkeys into Congress. They are cheaper and more trustworthy.
  94. A plug for the InterTubes protocol by ka9dgx · · Score: 1

    The big problem with files is that they get disconnected from context far too easily, especially when you share them with others. This realization is why I want to build the inter-tubes protocol. It syncs up collections of files, deals with permissions, and makes a set of services available to get thumbnails of photos, etc.

    My use case goes like this:
    I have 330,000 photos I've taken in the last 14 years. I'd like to share them. The current choices are

    • email a few at a time
    • post them on Flickr, FaceBook, or some other site
    • give someone a copy of ALL of them on an external drive.

    What I'd like to do instead is to give them a small file which contains permissions to access my tube containing my photos. It would be a very small file, with just a few cryptographic signatures, probably less than 20k. However, this would then allow the user to list all of my photos, and use the thumbnail service associated with it to pull across thumbnails of things (instead of the full size images).

    If they then find a file they like, they can get the full size version. If they then add tags or comments to the file, those would get synced back to me via the tubes.
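
    To make the "small file with permissions" part concrete, here is a rough sketch of what such a grant might look like. Everything in it is an assumption: the names make_grant and check_grant, the JSON layout, and above all the use of an HMAC with a secret held by the tube owner in place of real public-key signatures (Python's standard library has no signature scheme, so this is a swapped-in technique, not the protocol itself).

```python
import hashlib
import hmac
import json


def make_grant(secret: bytes, user: str, rights: list) -> bytes:
    """Build a small grant file: a JSON body plus an HMAC-SHA256 tag over it."""
    body = json.dumps({"user": user, "rights": sorted(rights)}, sort_keys=True).encode()
    tag = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return json.dumps({"body": body.decode(), "tag": tag}).encode()


def check_grant(secret: bytes, grant: bytes, wanted: str) -> bool:
    """Return True only if the tag verifies and the grant lists the wanted right."""
    try:
        envelope = json.loads(grant)
        body = envelope["body"].encode()
        tag = envelope["tag"]
    except (ValueError, KeyError, TypeError, AttributeError):
        return False
    if not isinstance(tag, str) or not tag.isascii():
        return False
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        return False
    return wanted in json.loads(body)["rights"]
```

    A grant made with rights like ["list", "thumbnail"] comes to a few hundred bytes, well under the 20k mentioned above. Note that anyone holding a copy of the file passes the check, so by itself this does nothing to stop a grant being handed on to someone else.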

    What do you all think of this idea?

    1. Re:A plug for the InterTubes protocol by knorthern+knight · · Score: 1

      > My use case goes like this:
      > I have 330,000 photos I've taken in the last 14 years.
      > I'd like to share them. The current choices are: email a
      > few at a time; post them on Flickr, FaceBook, or some
      > other site; or give someone a copy of ALL of them on an
      > external drive.

      It's simple. Set up a database server, and store the database on it...
      Table 1
      All the photos as BLOBS, plus add additional fields for filename, date/place taken etc, etc.

      Table 2
      A list of users with additional fields for files they're allowed to access, and additional stuff (e.g. thumbnail or full, etc.).

      Give him a limited user account on your server that only allows him to run a program that checks which files he's allowed to access, and then download files he's authorized to download. You're basically recreating iTunes.
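
      A minimal sketch of those two tables, using SQLite from Python only because it needs no server to try out. The table and column names, and the choice of one access row per (user, photo) pair instead of a per-user list, are my assumptions, not a fixed design.

```python
import sqlite3


def open_photo_db(path=":memory:"):
    """Create the two tables described above if they do not exist yet."""
    db = sqlite3.connect(path)
    db.executescript("""
        CREATE TABLE IF NOT EXISTS photos (
            id       INTEGER PRIMARY KEY,
            filename TEXT NOT NULL,
            taken_on TEXT,
            place    TEXT,
            image    BLOB NOT NULL
        );
        CREATE TABLE IF NOT EXISTS access (
            username TEXT NOT NULL,
            photo_id INTEGER NOT NULL REFERENCES photos(id),
            level    TEXT NOT NULL CHECK (level IN ('thumbnail', 'full')),
            PRIMARY KEY (username, photo_id)
        );
    """)
    return db


def allowed_photos(db, username):
    """List (filename, level) for every photo this user may fetch."""
    rows = db.execute(
        "SELECT p.filename, a.level FROM photos p "
        "JOIN access a ON a.photo_id = p.id "
        "WHERE a.username = ? ORDER BY p.filename",
        (username,),
    )
    return rows.fetchall()
```

      The limited account would then run something like allowed_photos() before serving anything.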

      My problem is with people who want to change *EVERYBODY'S* filesystem to accommodate their edge case. To use a car analogy... your neighbour uses a Ford F-350 Super-Duty to tow his 5-ton trailer for work-related stuff. Should your Toyota Echo be rebuilt with a diesel V8 engine and 5-ton towing capacity, simply because a few people need it?

      Same thing here. Do what you want on *YOUR* machine. Leave mine alone.

      --

      I'm not repeating myself
      I'm an X window user; I'm an ex-Windows user
    2. Re:A plug for the InterTubes protocol by ledow · · Score: 1

      - Install Opera.
      - Enable Opera Unite.
      - Configure the File Sharing Unite app (there are also ones designed for *exactly* what you are talking about, e.g. thumbnails, etc.) to point to the folder you want to share and allow the people you want / password you want.
      - Send them your Unite URL (they can open it from any browser) and stay online for as long as you want them to access your files.

      You don't have to upload everything to a remote server. You don't have to do anything special. The tagging/comments would have to work in the Unite app but there's no reason that can't be done (or isn't already in some Unite apps) from what I can see.

      But although you solve one problem, you don't really solve the problem that stops EVERYBODY doing this. 24/7 access is a pain. Bandwidth limits are a pain. People using your upload to look at your photos while you're online gaming is a pain. (330,000 photos? Even someone browsing the thumbnails provided by your connection would make a huge dent in your ping.) Managing access is a pain. Stopping people from blanket-downloading everything is a pain. Stopping them giving their crypto-signature to a friend who you then CAN'T distinguish from them is a pain.

      Everything you're proposing is a) possible, b) out there already, c) something that could be cobbled together in an afternoon by someone who can write shell-scripts and use freeware, and d) not being used by anyone else, for a number of reasons.

      You've basically reiterated ideas present in the very first papers of the WWW, not to mention things like XML etc. Sure, you've slapped encryption into it but that's no different, technically, from handing someone an SSH key unique to them that only allows them access to their user's photos over SSH/SCP.

      If people *needed* it, we could whip that up tomorrow, cover it in a commercial face, and bundle software that handled all the "SCP" side of things transparently. Fact is, few people would find it actually beneficial and those few would be the ones who would generally NOT want their upload flooded by their cousins downloading their wedding photos (home connections just aren't good for uploading, and if you have a server elsewhere, why not use it?!).

      The option you REALLY miss is:

      Use a remote dedicated server as a vital backup store that you really want to be using for those 1/3 of a million files anyway, and provide SCP and passwordless SSH keys to your friends so they can read-only access them as an added bonus.

      Sure, it won't auto-thumbnail, but it would be a cinch with imagemagick utils and FUSE to make it auto-create thumbnails on-the-fly in particular folders and have some fancy bit of client software do automatic "Show the thumbnail folder first, show the real file if requested" but you can also do it manually - one folder named thumbnails that you tell them loads really quickly, one folder named "full-size" that you tell them should only be used for full-size image downloads.
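
      The manual two-folder version is about this much work. This is a sketch under assumptions: the function name, the extension list and the 200x200 size are made up for illustration, and it only builds the ImageMagick command lines without running them. On ImageMagick 7 the binary is "magick" rather than "convert".

```python
import os

# Extensions treated as images; an arbitrary choice for this sketch.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def thumbnail_commands(full_dir, thumb_dir, size="200x200"):
    """Return one ImageMagick argv per image that still lacks a thumbnail."""
    commands = []
    for name in sorted(os.listdir(full_dir)):
        if os.path.splitext(name)[1].lower() not in IMAGE_EXTS:
            continue
        src = os.path.join(full_dir, name)
        dst = os.path.join(thumb_dir, name)
        if not os.path.exists(dst):
            # "-thumbnail" resizes and strips profiles to keep the output small.
            commands.append(["convert", src, "-thumbnail", size, dst])
    return commands
```

      Feed each entry to subprocess.run(cmd, check=True) from a cron job and the "thumbnails" folder stays current without any FUSE cleverness.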

      (Hell, get them to plug-in one of those Windows-integrated SCP tools and they can map the server as a drive - one for thumbnails, one for full-size images - browse it in explorer and then just drag-drop the full-size ones they want to their own computer.)

      It exists TODAY, NOW. You're just not using it. Those who are don't need anything special: no more-complicated, fancy filesystems on their clients, no upgraded OS concepts, etc. A copy of WinSCP or a bit of freeware/shareware that maps SCP drives in Windows works perfectly. Hell, you could run the SSH server from your own machine if you were really sadistic and didn't care about your upload.

      If it's such a ground-breaking, useful feature to you - why aren't you already doing it (given that you're on Slashdot, I'm presuming that doing - or finding someone to do - this sort of nerd-work is a walk in the park)?

      "Everything is a file". That concept basically makes possible any number of extremely fancy ideas in an afternoon, especially if you can knock up scripts or a FUSE interface. And that's why most of those fancy ideas (e.g. WinFS) are virtually cancelled after billions in investment - because they rarely work better than something you can knock up in an afternoon after someone's explained what they want.

  95. Re:files by Anonymous Coward · · Score: 0

    Forget the Ribbon, the other disaster from Office 2007 was the 'glorious basterd' new file names, docx xlsx and the others. But of course 'file extensions are too hard for users' so those differences get hidden. One of my 'mission critical' programs from work FINALLY added support for those filenames ... *this past April*.

    Are you minimum wage IT?

    I'd HOPE you realise that renaming 'something.docx' into 'something.doc' isn't going to allow you to magically open it in Word 2003. DOCX is a ZIP file with XML files inside of it. DOC is a binary, legacy clusterfuck of OLE garbage; Microsoft themselves have trouble maintaining compatibility with their files between versions, at least DOCX makes it easier to import that crap into LibreOffice.
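
    If you want to see the difference for yourself, a few lines of Python will do it. This is just a sketch: looks_like_docx is a name I made up, and checking for a word/document.xml member is a quick heuristic, not full validation of the format.

```python
import io
import zipfile


def looks_like_docx(data: bytes) -> bool:
    """True if the bytes are a ZIP archive containing word/document.xml."""
    buf = io.BytesIO(data)
    if not zipfile.is_zipfile(buf):
        return False
    with zipfile.ZipFile(buf) as archive:
        return "word/document.xml" in archive.namelist()
```

    A legacy DOC, by contrast, starts with the OLE compound-file signature (D0 CF 11 E0 A1 B1 1A E1) and fails the ZIP test no matter what the extension says.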

  96. Wikipedia reference? Seriously? by Anonymous Coward · · Score: 0

    It is like citing hearsay. I love wiki for learning new things but I find it ridiculous to cite a source that is not peer-reviewed.

  97. Please add metadata handling to GNU/Linux FSes by Anonymous Coward · · Score: 0

    If metadata were part and parcel of files on proper GNU/Linux filesystems, it would be so very much easier to find and browse your stuff. Now all we have is folders. And that's making the files dumb, like actual silly pieces of paper that can only be put into stupid folders to avoid a mess. But files are not clumsy physical objects but shining ideas.

    Instead, give me all files related to Sarah on my harddisk, please. Now give me all indecent pictures of her. Now give me her pictures in 2007. Or give me all FLACs longer than 3 minutes without vocals. Give me all videos shot in (tagged with) Norway and featuring James... The possibilities to sort files are endless; having to come up with only one singular ontology (your strict directory tree structure) and use it for all your stuff all the time is absolutely insane. Nobody cares where a file is, everybody just wants to find it every now and then.
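
    Roughly what I mean, as a toy sketch. The index here is just a Python dict from path to metadata, and find(), the tag names and the field names are all invented for illustration; a real filesystem would keep this alongside the files rather than in a script.

```python
def find(index, *tags, **fields):
    """Return paths whose metadata has every tag and matches every field."""
    wanted = set(tags)
    hits = []
    for path, meta in index.items():
        if not wanted <= meta.get("tags", set()):
            continue
        if all(meta.get(key) == value for key, value in fields.items()):
            hits.append(path)
    return sorted(hits)


# Hypothetical metadata; note that the directory layout plays no part in the queries.
INDEX = {
    "/a/img1.jpg": {"tags": {"sarah", "picture"}, "year": 2007},
    "/b/img2.jpg": {"tags": {"sarah", "picture"}, "year": 2009},
    "/c/clip.avi": {"tags": {"norway", "james", "video"}, "year": 2008},
}
```

    find(INDEX, "sarah", "picture", year=2007) picks out her 2007 pictures, and find(INDEX, "norway", "james") finds the video, wherever they happen to live in the tree.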

    Metadata adds massive value. Without it, a van Gogh is just an old painting.

    There has to be an easy interface for searching for a specific file or for a specific theme when you need it. And there has to be a sensible browsing mode when you don't know what you're looking for or are trying to figure out what all there is. Right now, making smart searches is absolutely impossible. You're bound to miss some stuff and get embarrassing false positives.

    Of course, you will have to input the metadata for it to be used but that could be a semi-automatic process. Just let us descend finally from the goddamn directory tree to the solid ground of smart metadata.

    And Micro$oft and other cloudy rip off artists can go fuck themselves.

  98. Re:That's not really what I was talking about, but by bingoUV · · Score: 1

    Or, what about files made by specialized programs that not many people use? Why should OS makers have to deal with metadata standards for every filetype in existence? The way it is now, only people who write apps dealing with a particular filetype have to deal with metadata standards.

    That is not the way it is now. Desktop indexing (present in Windows, OS X, and at least optionally in Linux) monitors the filesystem, re-scanning the in-file metadata when a file is modified, so it can build a central database for quick searches. So the indexing system needs to know how to read these different file types.

    Very dishonest argument.

    Desktop search engines have it very easy. Just consider the possibility - they can simply run /usr/bin/strings on the general area of the file where metadata is likely to be found and index the resulting data. In any order whatsoever, without making a distinction between "comment" like metadata (e.g. John's photo) and actionable metadata (e.g. photo taken when camera is at an angle of 47 degrees from the vertical). Even doing so can make a very good desktop search engine. And not supporting specialized file formats is very much a possibility.
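
    To show how little code that heuristic takes, here is a rough stand-in for strings written in Python. It is a sketch, not the real tool: the function name, the 4-character minimum and the 4096-byte window standing in for "the general area of the file" are assumptions.

```python
import re


def printable_strings(data: bytes, min_len: int = 4, window: int = 4096):
    """Return ASCII runs of at least min_len characters from the first `window` bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data[:window])]
```

    Everything it returns goes into the index as-is, with no attempt to work out which string is a comment and which is an orientation flag.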

    This is much much less difficult than the program which has to make sense out of the data. To show the image appropriately rotated according to a piece of actionable metadata for instance. And not supporting specialized file formats is not an option - the particular program is for that specialized file format.

    There is no comparison. At all. Especially when the GP already talked about specialized file formats, "that not many people use".

    So yes, it is true to a great extent that "only people who write apps dealing with a particular filetype have to *really* deal with *the nitty-gritty of* metadata standards"; my alterations marked with asterisks.

    --
    Bingo Dictionary - Pragmatist, n. A myopic idealist.
  99. Re:files by Anonymous Coward · · Score: 0

    Kinda Snarky there AC. Sure I know you can't just rename the file, you have to Save-As back to the older version. But I'd send a contract to a colleague in 2003 and it would come back in 2007 that I'd have to backport again.

    --Tao

  100. the metadata is there to make sure you do not copy by aXi · · Score: 0

    This way you can make sure you do not copy your file to people Microsoft or the government does not want you to copy files to.

    The file-system will check whether the file being copied to it is allowed to be copied to it. And upon copy completion, both file-systems check whether the file on the source and/or destination storage device must become uncopyable, or whether it should be deleted after having been copied, even if you only meant to copy it.

    In short: DRM for every file ever created... In other words, the back door for the DRM no one wants and the MPAA wants everyone to have/use/abide-by.

  101. My crystal balls say by uninformedLuddite · · Score: 1

    That all files will end up in some form of container in which the metadata is embedded. There's probably already a patent for a similar system currently in use and the very idea has RIAA and MPAA people drooling all over themselves in anticipation of finally owning everything in the box.

    --
    The new right fascists are bilingual. They speak English and Bullshit.
  102. Re:That's not really what I was talking about, but by Tetsujin · · Score: 1

    Or, what about files made by specialized programs that not many people use? Why should OS makers have to deal with metadata standards for every filetype in existence? The way it is now, only people who write apps dealing with a particular filetype have to deal with metadata standards.

    That is not the way it is now. Desktop indexing (present in Windows, OS X, and at least optionally in Linux) monitors the filesystem, re-scanning the in-file metadata when a file is modified, so it can build a central database for quick searches. So the indexing system needs to know how to read these different file types.

    Very dishonest argument.

    Desktop search engines have it very easy. Just consider the possibility - they can simply run /usr/bin/strings on the general area of the file where metadata is likely to be found and index the resulting data.

    Great, so the table's gonna have a lot of entries for "JFIF". :) And a lot of good "strings" is gonna do on compressed source data...

    But in fact, this isn't what current desktop search engines do: they recognize known file types and process them specifically, so they know which ID3 field is the artist and which is the title, etc.
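
    As a small illustration of format-specific handling, here is a sketch that reads the old ID3v1 tag, which is a fixed 128-byte block at the end of an MP3: "TAG", then 30 bytes each of title, artist and album, 4 bytes of year, a comment and a genre byte. The function name is mine, and real indexers also handle ID3v2 and many other formats, which this does not attempt.

```python
def read_id3v1(data: bytes):
    """Parse an ID3v1 tag (the final 128 bytes of an MP3), or return None."""
    tag = data[-128:]
    if len(tag) != 128 or not tag.startswith(b"TAG"):
        return None

    def text(start, length):
        # Fields are fixed-width and padded with NULs or spaces.
        raw = tag[start:start + length]
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(3, 30),
        "artist": text(33, 30),
        "album": text(63, 30),
        "year": text(93, 4),
    }
```

    Because the reader knows the layout, "artist" and "title" come out as separate, labelled fields instead of an undifferentiated pile of strings.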

    --
    Bow-ties are cool.
  103. Dynamic Databases. Build it, use it, move on: by Anonymous Coward · · Score: 0

    It's called a Dynamic Database:

    http://c2.com/cgi/wiki?DynamicRelational

    http://c2.com/cgi/wiki?MultiParadigmDatabase

    "parent=rowID" references would make it hierarchical; or put another way, provide a hierarchical view.
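
    A toy sketch of that last point, assuming each row is just a free-form set of attributes keyed by row ID. The rows, the "name" and "parent" attribute names and path_of() are made up for illustration and are not taken from the linked pages.

```python
def path_of(rows, row_id):
    """Walk parent references up to the root and return a '/'-joined path."""
    parts = []
    seen = set()
    while row_id is not None:
        if row_id in seen:
            raise ValueError("cycle in parent references")
        seen.add(row_id)
        row = rows[row_id]
        parts.append(row["name"])
        row_id = row.get("parent")
    return "/" + "/".join(reversed(parts))


# Rows need not share columns; only "name" and an optional "parent" are assumed.
ROWS = {
    1: {"name": "photos"},
    2: {"name": "norway", "parent": 1},
    3: {"name": "fjord.jpg", "parent": 2, "camera": "hypothetical-cam"},
}
```

    The directory tree is then just one view computed from the rows; the same rows can be queried by any other attribute.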

  104. Re:That's not really what I was talking about, but by bingoUV · · Score: 1

    they recognize known file type

    Yes, for known. Whereas we are talking about "specialized programs that not many people use". Not likely to be known. And not a single desktop search engine "knows" about all file types.

    For such types of files, the heuristic I suggested is still the best after a few filterings. And like I also said, for desktop search it is an easy possibility to ignore rare file types. But not for a program whose purpose is to read those rare file types.

    And of course other arguments of mine that you didn't address.

    --
    Bingo Dictionary - Pragmatist, n. A myopic idealist.