Slashdot Mirror


Object Storage and POSIX Should Merge

storagedude writes: Object storage's low cost and ease of use have made it all the rage, but a few additional features would make it a worthier competitor to POSIX-based file systems, writes Jeff Layton at Enterprise Storage Forum. Byte-level access, easier application portability and a few commands like open, close, read, write and lseek could make object storage a force to be reckoned with.

'Having an object storage system that allows byte-range access is very appealing,' writes Layton. 'It means that rewriting applications to access object storage is now an infinitely easier task. It can also mean that the amount of data touched when reading just a few bytes of a file is greatly reduced (by several orders of magnitude). Conceptually, the idea has great appeal. Because I'm not a file system developer I can't work out the details, but the end result could be something amazing.'
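
The claim about touching less data can be illustrated with a minimal sketch. It uses a toy in-memory store, not any real object storage API: `ToyObjectStore`, `get`, and `get_range` are hypothetical names invented here, and the store only counts how many bytes each call hands back.

```python
class ToyObjectStore:
    """Hypothetical in-memory object store that counts bytes returned."""

    def __init__(self):
        self._objects = {}
        self.bytes_transferred = 0

    def put(self, key: str, data: bytes) -> None:
        # Whole-object write, the only kind a plain object store offers.
        self._objects[key] = bytes(data)

    def get(self, key: str) -> bytes:
        # Whole-object read: every byte of the object is transferred.
        data = self._objects[key]
        self.bytes_transferred += len(data)
        return data

    def get_range(self, key: str, offset: int, length: int) -> bytes:
        # Byte-range read: only the requested slice is transferred.
        chunk = self._objects[key][offset:offset + length]
        self.bytes_transferred += len(chunk)
        return chunk


store = ToyObjectStore()
store.put("big.bin", b"HDR!" + bytes(1_000_000))  # 4-byte header + 1 MB of zeros

# Read the 4-byte header by fetching the whole object.
store.bytes_transferred = 0
whole = store.get("big.bin")[:4]
cost_whole = store.bytes_transferred

# Read the same 4 bytes with a byte-range request.
store.bytes_transferred = 0
ranged = store.get_range("big.bin", 0, 4)
cost_ranged = store.bytes_transferred

print(cost_whole, cost_ranged)  # prints "1000004 4"
```

In this toy case the ranged read moves 4 bytes instead of roughly a megabyte; the real savings depend on object size and on how the store is implemented.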

10 of 66 comments

  1. Or in other words... by 14erCleaner · · Score: 4, Insightful

    "If we support POSIX, then we'll support POSIX".

    --
    Have you read my blog lately?
  2. I'm just the idea man by Anonymous Coward · · Score: 5, Funny

    "Because I'm not a file system developer I can't work out the details, but the end result could be something amazing."

    But you can put the check in my name.

  3. Why would you want this? by godrik · · Score: 3, Interesting

    I do not understand this at the highest level. How is this an improvement over POSIX? My understanding is that object storage is essentially a dumbed-down file system where you have to read or write the entire object (file) at once. Why does that improve anything? Is it just because the "address" can be a URL? Just write that as a specific file system so that you can read/write to /dev/url/http/slashdot.org/ and be done with it.

    What am I missing?

    1. Re:Why would you want this? by radish · · Score: 3, Interesting

      It allows the implementation to make a lot of assumptions & simplifications, which in turn makes things like S3 possible. There's no way you could practically offer POSIX style FS access in a cloud-like environment.

      --
      One button is the power button, the other is the Bender voice button "Bite My Shiny Metal Ass"

    2. Re:Why would you want this? by Salamander · · Score: 3, Interesting

      That is very untrue. I'm on the GlusterFS team, and we've had users providing "POSIX style FS access in a cloud-like environment" for years. Amazon recently started doing the same with EFS, and there are others. It's sure not easy, and I wouldn't say any of the alternatives have all of the isolation or ease of use that they should, but it's certainly possible.

      --
      Slashdot - News for Herds. Stuff that Splatters.
    3. Re:Why would you want this? by Dutch+Gun · · Score: 3, Insightful

      Think for a moment about how much network overhead it would require to:

      a) Open a specific file
      b) Seek to a specific location
      c) Read or write data down to the byte level an arbitrary number of times
      d) Close the file.

      Each one of those operations needs back and forth communication across the internet, with error-checking and encryption overhead. Now, remember that these operations need to be synchronized across multiple machines, probably in multiple data centers across the world as well.

      Compare that to atomic per-object operations, and how much more straightforward that is for network-intensive operations. In the end, it's probably much more efficient to simply send an entirely new file than trying to change a single byte inside a file.
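
      A rough sketch of that comparison, counting only round trips. `CountingLink` and both update functions are hypothetical names made up for illustration, and the model assumes the client already holds a local copy of the object. It ignores payload size, which is the other half of the trade-off.

```python
class CountingLink:
    """Hypothetical network link that only counts request/response round trips."""

    def __init__(self):
        self.round_trips = 0

    def call(self):
        self.round_trips += 1


def posix_style_update(link, edits):
    """Open, then seek + write once per edit, then close: one round trip each."""
    link.call()          # open
    for _offset, _data in edits:
        link.call()      # seek to the offset
        link.call()      # write the bytes
    link.call()          # close
    return link.round_trips


def object_style_update(link, edits):
    """Apply every edit to a local copy, then upload the whole object once."""
    link.call()          # single atomic put of the new object
    return link.round_trips


edits = [(10, b"a"), (500, b"b"), (9000, b"c")]
posix_trips = posix_style_update(CountingLink(), edits)
object_trips = object_style_update(CountingLink(), edits)
print(posix_trips, object_trips)  # prints "8 1"
```

      Three small edits cost eight round trips in the file-style model and one in the object-style model, though that one request carries the entire object.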

      Besides, if you really need byte-level access to remote storage... we have that already. It's called a database.

      --
      Irony: Agile development has too much inertia to be abandoned now.
    4. Re:Why would you want this? by Salamander · · Score: 4, Insightful

      It's because they throw out a lot of POSIX features/requirements - e.g. nested directories, rename, links, durability/consistency guarantees. In other areas, such as permissions, they have their own POSIX-incompatible alternatives. These shortcuts do make implementation easier, allowing a stronger focus on pure scalability. The theory is that the combined complexity of POSIX semantics and dealing with high scale (including issues such as performance and fault handling) is just too much, and it becomes an either/or situation. As a member of the GlusterFS team, I strongly disagree. My colleagues, including those on the Ceph team, probably do as well. The semantics of object stores like S3 have been designed to make their own developers' lives easier, and to hell with the users.

      Not all POSIX features are necessary. Some are outdated, poorly specified, or truly too cumbersome to live. On the other hand, the object-store feature set is *too* small. I've seen too many users start with an object store, then slowly reimplement much of what's missing themselves. The result is a horde of slow, buggy, incompatible implementations of functionality that should be natively provided by the underlying storage. That's a pretty lousy situation even before we start to talk about being able to share files/objects with any kind of sane semantics. You want to write a file on one machine, send a message to another machine, and be sure they'll read what you just wrote? Yeah, you can do that, but the techniques you'll have to use are the same ones that are already inside your distributed object store. Even if both their implementation and yours are done well, the duplication will be disastrous for both performance and fault handling. It would be *far* better to enhance object stores than to keep making those mistakes . . . or you could just deploy a distributed file system and use the appropriate subset of the functionality that's already built in.

      A semantically-rich object store like Ceph's RADOS can be a wonderful thing, but the dumbed-down kind is a disgrace.

      --
      Slashdot - News for Herds. Stuff that Splatters.
  4. Re:POOS by U2xhc2hkb3QgU3Vja3M · · Score: 3, Funny

    Posix Over Objects Partitions wasn't a popular choice.

  5. Why Should Object Storage and POSIX Merge? by QuietLagoon · · Score: 2

    There, I fixed the title for you.

    Aside from trying to leverage the huge portability of POSIX by using its name, what exactly would be the benefit if the merger occurred?

    Why not standardize and implement Object Store across many different operating systems (working code would be required for each OS), and then submit Object Store to be a part of the POSIX standard?

  6. Scalable b/c it's stateless by HockeyPuck · · Score: 3, Informative

    One of the great advantages that allows object storage to be scalable is that it's completely stateless. A single command has no dependency on the previous or next command. There's no modification of existing objects, and no "seek then write" commands either. This allows object storage to maintain one of the key tenets of cloud storage: the goal is not to provide high availability of a given instance, but to guarantee that the "retry" or the "allocation of a new resource" always succeeds. For example, VMs can go down at any time, but there should never be a case where you cannot create a new VM to replace the one that just died. While VMs can die at any time, the VM service (EC2, Nova) can never go down.

    With this crap like "seek", "open" then "read" that the author proposes, you now have commands that are dependent on each other and thus create state. Something we want to avoid.
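
    A minimal sketch of why dependent commands hurt retries. Both server classes are hypothetical toys, not any real service: one remembers a per-session offset, the other takes the offset in every request.

```python
class StatefulServer:
    """Hypothetical replica that remembers a file offset between calls."""

    def __init__(self, data: bytes):
        self._data = data
        self._offset = 0  # session state that lives on this replica only

    def seek(self, offset: int) -> None:
        self._offset = offset

    def read(self, length: int) -> bytes:
        chunk = self._data[self._offset:self._offset + length]
        self._offset += len(chunk)
        return chunk


class StatelessServer:
    """Hypothetical replica where each request carries everything it needs."""

    def __init__(self, data: bytes):
        self._data = data

    def get_range(self, offset: int, length: int) -> bytes:
        return self._data[offset:offset + length]


data = b"0123456789"

# Stateful: seek on replica A, A dies, the retry lands on replica B.
replica_a, replica_b = StatefulServer(data), StatefulServer(data)
replica_a.seek(5)
stateful_retry = replica_b.read(3)  # B never saw the seek, so it reads from 0

# Stateless: the retry can go to any replica and get the same answer.
replica_c, replica_d = StatelessServer(data), StatelessServer(data)
stateless_retry = replica_d.get_range(5, 3)

print(stateful_retry, stateless_retry)  # prints "b'012' b'567'"
```

    The stateful retry silently returns the wrong bytes unless the offset is replicated or the client replays the seek; the stateless request needs neither.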