Object Storage and POSIX Should Merge
storagedude writes: Object storage's low cost and ease of use have made it all the rage, but a few additional features would make it a worthier competitor to POSIX-based file systems, writes Jeff Layton at Enterprise Storage Forum. Byte-level access, easier application portability and a few commands like open, close, read, write and lseek could make object storage a force to be reckoned with.
'Having an object storage system that allows byte-range access is very appealing,' writes Layton. 'It means that rewriting applications to access object storage is now an infinitely easier task. It can also mean that the amount of data touched when reading just a few bytes of a file is greatly reduced (by several orders of magnitude). Conceptually, the idea has great appeal. Because I'm not a file system developer I can't work out the details, but the end result could be something amazing.'
"If we support POSIX, then we'll support POSIX".
Have you read my blog lately?
The article seems to go into solutions that let you access S3 as a FUSE module, but it fails to consider that you can go the other way. Gluster, Ceph, and probably others let you access data both as a filesystem and as an object store. It's a little more complex to set up and maintain than what this article seems to be envisioning, but it can offer a lot of flexibility. I suppose it's not as cheap to run these yourself as to use S3 in most cases.
Because I'm not a file system developer I can't work out the details, but the end result could be something amazing.'
But you can put the check in my name.
Wouldn't it be great if you gave me millions of dollars!
Because I'm not a money maker I can't work out the details, but the end result could be something amazing (for me at least).
We can call it POOS, Posix Over Object Store.
I do not understand this at the highest level. How is this an improvement over POSIX? My understanding is that object storage is essentially a dumbed-down file system where you have to read the entire object (file) at once. Or have to write the object (file) at once. Why does it improve anything? Is it just because the "address" can be a url? Just write that as a specific file system so that you can read/write to /dev/url/http/slashdot.org/ and be done with it.
What am I missing?
If file systems allowed arbitrary attributes per folder/file, then file systems could serve as both CMS's and light-duty CRUD storage. Most intranet CMS content is just lists of documents and links, with a few notes. They could be queried via SQL or an SQL-like language[1], along with the usual file-oriented techniques.
In addition to the arbitrary attributes, a set of common attributes would be reserved, at least by convention:
* title (file/folder name)
* synopsis
* content (file bytes)
* type (type of content, perhaps by extension)
* thumbnail (or icon)
* create-date (date/time)
* modif-date
* orig-author (writer user name)
* modif-author (who changed it)
* sequence (preferred ordering [2])
* hidden (internal or system files/rows)
Conventions for display preferences and perhaps an HTML template(s) per folder or folder groups[3] could also be defined as part of the convention.
And perhaps per-folder[3] settings can enforce certain attributes or constraints, such as making synopsis required, for example.
Think about it: a flexible data tool without totally reinventing the wheel. We just soup-up existing file systems (or at least file conventions). Something is more likely to be accepted if it's similar to familiar tools--file systems and RDBMS in this case.
[1] Comparing operations may have to be more type-explicit if using dynamic or "indicator-free" types.
[2] Ordering by "convention" attributes (above) would typically be available, but sometimes the author wants explicit control of ordering
[3] Folders could perhaps define a path and/or grouping so that they can "inherit" selected features from other folders, such as preferred display settings.
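To make the "query it with SQL" part concrete, here is a minimal sketch that stands in an in-memory SQLite table for the folder. The table name `files`, the sample rows and the choice of SQLite are all assumptions for illustration; a real file system would have to expose the attributes itself.

```python
import sqlite3

# Hypothetical stand-in for one folder: each row is a file, each column
# is one of the reserved "convention" attributes listed above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE files (
        title TEXT, synopsis TEXT, type TEXT,
        orig_author TEXT, sequence INTEGER, hidden INTEGER
    )
""")
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("handbook.pdf", "Employee handbook", "pdf", "alice", 2, 0),
        ("holidays.txt", "Office closure dates", "txt", "bob", 1, 0),
        (".index", "Internal index", "sys", "system", 0, 1),
    ],
)

# List the visible documents in the author's preferred order.
rows = conn.execute(
    "SELECT title, synopsis FROM files WHERE hidden = 0 ORDER BY sequence"
).fetchall()
titles = [title for title, _ in rows]
```

The same listing could feed a per-folder HTML template, which is roughly what a light-duty intranet CMS does.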
Table-ized A.I.
Byte-level access, easier application portability and a few commands like open, close, read, write and lseek could make object storage a force to be reckoned with.
Got all of that already. Perhaps not well defined by the POSIX standard. But only because certain implementors whined and cried that they would be cut out of the party if they had to support real O/S standards.
But this isn't 'object storage' (unless all your objects are bytes). Object storage is an extension of higher level record access that VMS and other (mainframe) systems have had for years (decades). But now combined with object method storage. Starting to sound like RPC (server run) or write once, run anywhere. So Java vs ActiveX and maybe Javascript all over again. Just nope.
Have gnu, will travel.
You had an idea. Just implement it. If it is of any value, people will pick it up and you will get famous (and perhaps rich, if you can leverage that).
"Because I'm not a file system developer I can't work out the details..."
I don't need any help from you. And your opinions don't count for much. Thanks for the idea, even though it's essentially worthless without data to back up your assertions.
.
Aside from trying to leverage the huge portability of POSIX by using its name, what exactly would the benefit be if the merger were to occur?
.
Why not standardize and implement Object Store across many different operating systems (working code would be required for each OS), and then submit Object Store to be a part of the POSIX standard?
I always was under the impression that POSIX has something close to byte access with lseek().
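For anyone who wants to see it, a minimal sketch of byte-level access through the POSIX calls as Python's `os` module exposes them. The helper name `read_bytes_at` and the throwaway file are made up for the demo.

```python
import os
import tempfile

def read_bytes_at(path, offset, length):
    """Read up to `length` bytes starting at `offset` via open/lseek/read/close."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)  # move the file offset; nothing is read yet
        return os.read(fd, length)         # may return fewer bytes near end of file
    finally:
        os.close(fd)

# Demo on a throwaway 16-byte file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789abcdef")
    demo_path = tmp.name

chunk = read_bytes_at(demo_path, 10, 4)
os.unlink(demo_path)
```

Only the four requested bytes cross the read call, which is the "amount of data touched" point from the summary.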
Did Hans Reiser get out of jail and change his name?
Oh, wait, this guy is trying to replace file systems with a database, not databases with a filesystem like Reiser did.
Well, nothing new under the sun, just same old ideas being recycled and touted as something totally new.
AWS offers object storage for its scalability. Cloud file services sit on top of that and only accept "complete" uploads.
The only happy medium I know of is www.Bitcasa.com which implements POSIX (most of it) atop S3 in the form of a virtual drive. Their Linux client is only for corporate users due to a lack of focus consumer-side, but their Windows & Mac clients offer virtual desktop.
Ref: I work for Bitcasa
Science & open-source build trust from peer review. Learn systems you can trust.
On cloud services, storing all your files as "objects" is much cheaper than renting a filesystem to store them on. The gist of this article is, "if S3 allowed block-level access, it would be as cheap as S3 and as flexible as a filesystem."
The most powerful sentence in the article is "I can't work out the details." I can't imagine any cloud-services engineer reading this article and thinking, "ooh, I'd never thought of adding block-level access!" I think block-level access is the most-requested feature since S3 was born. The author hasn't described how this will work -- or how S3 works, even.
"POST (Adds an object using HTML forms — similar to PUT but using HTML)"
What does that even mean? Evidently, the author meant "HTTP".
With regard to "merging" object storage and POSIX, that's been done. That's what the Joyent people did with their Manta object storage: you operate on the objects using standard *nix tools. They've recently open-sourced it under a free and GPLv2-compatible license (MPLv2).
Object storage is just block storage with arbitrarily large block sizes. POSIX is just an interface. Implementing the API over an object store should be simple. Leave the details of wiring memory to logical sectors to the object store implementation and forget about it. Not enough memory to cache an entire file? It's time to start rethinking "memory" anyway. Or just look at how it's already done in eg OS/400.
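A rough sketch of what "block storage with large blocks" could look like: one file split into fixed-size chunk objects, with a byte range mapped onto the chunks it touches. The `key/index` naming scheme, the dict standing in for the store, and the 16-byte chunk size are all assumptions for the demo, not how any particular object store lays out data.

```python
def chunks_for_range(offset, length, chunk_size):
    """Return (chunk_index, start_in_chunk, nbytes) triples covering a byte range."""
    if offset < 0 or length < 0 or chunk_size <= 0:
        raise ValueError("offset/length must be >= 0 and chunk_size > 0")
    pieces = []
    pos, remaining = offset, length
    while remaining > 0:
        index, start = divmod(pos, chunk_size)
        nbytes = min(chunk_size - start, remaining)
        pieces.append((index, start, nbytes))
        pos += nbytes
        remaining -= nbytes
    return pieces

def read_range(store, key, offset, length, chunk_size):
    """Assemble a byte range by fetching only the chunk objects it overlaps."""
    out = bytearray()
    for index, start, nbytes in chunks_for_range(offset, length, chunk_size):
        chunk = store[f"{key}/{index}"]  # one object fetch per touched chunk
        out += chunk[start:start + nbytes]
    return bytes(out)

# Toy store: a 100-byte "file" split into 16-byte chunk objects.
data = bytes(range(100))
store = {f"demo/{i}": data[i * 16:(i + 1) * 16] for i in range(7)}

pieces = chunks_for_range(30, 10, 16)
result = read_range(store, "demo", 30, 10, 16)
```

Here a 10-byte read at offset 30 touches two chunks rather than the whole file. Writes are the harder half, since changing a few bytes means rewriting every chunk object they fall in.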
As in byte serving and the range header? https://en.wikipedia.org/wiki/...
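For reference, a small sketch of how a single byte range is expressed in that header, plus a toy function playing the server's part. Both function names are made up, and only the simple `bytes=first-last` form is handled (no suffix ranges, no multiple ranges).

```python
def range_header(offset, length):
    """Build an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length > 0")
    # The last-byte position in a Range header is inclusive, hence the -1.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}

def apply_range(body, header_value):
    """Toy server side: slice `body` for a single 'bytes=first-last' range."""
    unit, _, spec = header_value.partition("=")
    if unit != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled here")
    first, _, last = spec.partition("-")
    if not first:
        raise ValueError("suffix ranges are not handled in this sketch")
    start = int(first)
    # An open-ended range ("bytes=10-") runs to the end of the body.
    end = int(last) if last else len(body) - 1
    end = min(end, len(body) - 1)
    return body[start:end + 1]

hdr = range_header(10, 4)
part = apply_range(b"0123456789abcdef", hdr["Range"])
```

A real server would answer such a request with status 206 and a Content-Range header; that part is left out here.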
You could read and write entire files easily in POSIX, last I checked. You know, as in Python "open(filename).read()".
One of the great advantages that allows object storage to be scalable is that it's completely stateless. A single command has no dependency on the previous or next command. There's no modification of existing objects, and no "seek then write" commands either. This allows object storage to maintain one of the key tenets of cloud storage: the goal is not to provide high availability of a given instance, but to guarantee that the "retry" or the "allocation of a new resource" always succeeds. For example, VMs can go down at any time, but there should never be an instance whereby you cannot create a new VM to replace the one that just died. While VMs can die at any time, the VM service (EC2, Nova) can never go down.
With this crap like "seek", "open" then "read" that the author proposes you now have commands that are dependent on each other and thus create state. Something we want to avoid.
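To illustrate the distinction for reads, here is a sketch in which every request to the store carries its own key, offset and length, so the store itself remembers nothing between calls, while a seek/read cursor lives entirely on the client. The `ObjectStore` and `Cursor` classes are hypothetical, and this says nothing about the harder "seek then write" case.

```python
class ObjectStore:
    """Toy in-memory store; each call carries everything needed to serve it."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)  # whole-object write only

    def get_range(self, key, offset, length):
        # No per-client state: the same request always gets the same answer.
        return self._objects[key][offset:offset + length]


class Cursor:
    """Client-side seek/read; the only state is a local offset."""

    def __init__(self, store, key):
        self.store, self.key, self.offset = store, key, 0

    def seek(self, offset):
        self.offset = offset  # purely local, the store is not contacted

    def read(self, length):
        data = self.store.get_range(self.key, self.offset, length)
        self.offset += len(data)
        return data


store = ObjectStore()
store.put("greeting", b"hello, object storage")
cur = Cursor(store, "greeting")
cur.seek(7)
first = cur.read(6)
second = cur.read(8)
```

If the client dies, nothing on the store side needs cleaning up, and a retried `get_range` is harmless.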
In addition to the HTTP methods, there are the WebDAV methods:
PROPFIND — used to retrieve properties, stored as XML, from a web resource. It is also overloaded to allow one to retrieve the collection structure (also known as directory hierarchy) of a remote system.
PROPPATCH — used to change and delete multiple properties on a resource in a single atomic act
MKCOL — used to create collections (a.k.a. a directory)
COPY — used to copy a resource from one URI to another
MOVE — used to move a resource from one URI to another
LOCK — used to put a lock on a resource. WebDAV supports both shared and exclusive locks.
UNLOCK — used to remove a lock from a resource
And Range Requests:
I'm developing a system that integrates HTTP/WebDAV and SQL with URIs as follows:
http://host:port/table/row_id/column/column_part
With WebDAV integration, an SQL database can be seen as a filesystem. Using SQL references has the advantage that the information can be divided into small data parts for easy editing, and it is more versatile than filesystem trees.
An object can be written with the POST method, or when an object in XML, iCalendar, etc. format is written using the PUT method, it can be processed on the server and its data stored in the appropriate columns.
And it can be used with other protocols or a FUSE filesystem.
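A minimal sketch of how such a URI could be resolved to a single cell, using SQLite as a stand-in database. The `contacts` table, the `ALLOWED` whitelist and the `resolve` helper are assumptions for illustration, and the `column_part` segment is ignored here.

```python
import sqlite3
from urllib.parse import urlsplit

# Hypothetical whitelist of tables and columns that may be addressed by URI.
ALLOWED = {"contacts": {"name", "email"}}

def resolve(conn, uri):
    """Map /table/row_id/column to a single-cell SELECT and return the value."""
    parts = [p for p in urlsplit(uri).path.split("/") if p]
    if len(parts) < 3:
        raise ValueError("expected /table/row_id/column")
    table, row_id, column = parts[0], int(parts[1]), parts[2]
    # Identifiers cannot be bound as SQL parameters, so validate them first.
    if column not in ALLOWED.get(table, ()):
        raise KeyError(f"unknown table or column: {table}/{column}")
    row = conn.execute(
        f"SELECT {column} FROM {table} WHERE id = ?", (row_id,)
    ).fetchone()
    return None if row is None else row[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO contacts VALUES (1, 'Ada', 'ada@example.org')")

value = resolve(conn, "http://localhost:8080/contacts/1/email")
missing = resolve(conn, "http://localhost:8080/contacts/2/email")
```

A GET handler would return `value` as the response body; PUT and PROPPATCH would map onto UPDATE statements in the same way.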