On the State of Linux File Systems
kev009 writes to recommend his editorial overview of the past, present and future of Linux file systems: ext2, ext3, ReiserFS, XFS, JFS, Reiser4, ext4, Btrfs, and Tux3. "In hindsight it seems somewhat tragic that JFS or even XFS didn't gain the traction that ext3 did to pull us through the 'classic' era, but ext3 has proven very reliable and has received consistent care and feeding to keep it performing decently. ... With ext4 coming out in kernel 2.6.28, we should have a nice holdover until Btrfs or Tux3 begin to stabilize. The Btrfs developers have been working on a development sprint and it is likely that the code will be merged into Linus's kernel within the next cycle or two."
The article states that ext4 was a Bull project; that is not correct.
Bull is one of the companies involved with ext4 development, but its developers were by no means the primary contributors. A number of the key ext4 advancements, especially the extents work, were pioneered by the Clusterfs folks, who used them in production for their Lustre filesystem (Lustre is a cluster filesystem that used ext3 with enhancements which they supported commercially as an open source product); a number of their enhancements went on to be adopted as part of ext4. I was the e2fsprogs maintainer and, especially in the last year, as the most experienced upstream kernel developer, have been responsible for patch quality assurance and pushing the patches upstream. Eric Sandeen from Red Hat did a lot of work making sure everything was put together well for a distribution to use (there are lots of miscellaneous pieces for full filesystem support by a distribution, such as grub support, etc.). Mingming Cao from IBM did a lot of coordination work, and was responsible for putting together some of the OLS ext4 papers. Kawai-san from Hitachi supplied a number of critical patches to make sure we handled disk errors robustly; some folks from Fujitsu have been working on the online defragmentation support. Aneesh Kumar from IBM wrote the 128->256 inode migration code, as well as doing a lot of the fixups on the delayed allocation code in the kernel. Val Henson from Red Hat has been working on the 64-bit support for e2fsprogs in the kernel. So there were a lot of people, from a lot of different companies, all helping out. And that is one of the huge strengths of ext4: we have a large developer base, from many different companies. I believe that this wide base of developer support is one of the reasons why ext3 was more successful than, say, JFS or XFS, which had a much smaller base of developers who were primarily from a single employer.
...can I configure to use large extents (like in the megabytes range)?
A cute FA in some ways, but bereft of content. Wish there was something to see here, like comparisons regarding integrity, access costs, evolution from JFS and Andrews journaled FS, etc. No real meat (with apologies to the vegetarians out there). Just a lightweight historical analysis with some glib suggestions of current adaptations.
---- Teach Peace. It's Cheaper Than War.
Very insightful. Just goes to show the power of open source.
Oh, by the way... forgot to mention. If you are looking for benchmarks, there are some very good ones done by Steven Pratt, who does this sort of thing for a living at IBM. They were intended to be in support of the btrfs filesystem, which is why the URL is http://btrfs.boxacle.net/. The benchmarks were done in a scrupulously fair way; the exact hardware and software configurations used are given, and multiple workloads are described, and the filesystems are measured multiple times against multiple workloads. One interesting thing from these benchmarks is that sometimes one filesystem will do better at one workload and at one setting, but then be disastrously worse at another workload and/or configuration. This is why, if you want to do a fair comparison of filesystems, it is extremely difficult to really do things right. You have to do multiple benchmarks, multiple workloads, multiple hardware configurations, because if you only pick one filesystem benchmark result, you can almost always make your filesystem come out the winner. As a result, many benchmarking attempts are very misleading, because they are often done by a filesystem developer who consciously or unconsciously, wants their filesystem to come out on top, and there are many ways of manipulating the choice of benchmark or benchmark configuration in order to make sure this happens.
As it happens, Steven's day job as a performance and tuning expert is to do this sort of benchmarking, but he is not a filesystem developer himself. And it should also be noted that although some of the BTRFS numbers shown in his benchmarks are not very good, btrfs is a filesystem under development, which hasn't been tuned yet. There's a reason why I try to stress the fact that it takes a long time and a lot of hard work to make a reliable, high performance filesystem. Support from a good performance/benchmarking team really helps.
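To make the "multiple workloads, measured multiple times" point concrete, here is a minimal Python sketch of that methodology. It is not the harness behind the boxacle results; the two workloads, their sizes, and the function names are all made up for illustration, and a serious benchmark would also control caching, mount options, and hardware.

```python
import os
import statistics
import tempfile
import time


def many_small_files(root, count=200, size=1024):
    """Hypothetical workload: create many small files, then delete them."""
    payload = b"x" * size
    paths = []
    for i in range(count):
        path = os.path.join(root, f"small_{i}.dat")
        with open(path, "wb") as f:
            f.write(payload)
        paths.append(path)
    for path in paths:
        os.remove(path)


def one_large_file(root, size=4 * 1024 * 1024, chunk=64 * 1024):
    """Hypothetical workload: stream one large file, fsync it, delete it."""
    path = os.path.join(root, "large.dat")
    block = b"y" * chunk
    with open(path, "wb") as f:
        for _ in range(size // chunk):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    os.remove(path)


WORKLOADS = {
    "many_small_files": many_small_files,
    "one_large_file": one_large_file,
}


def run_benchmarks(mount_points, repeats=3):
    """Time every workload on every mount point several times.

    Returns {(mount_point, workload_name): [seconds, ...]}.
    """
    results = {}
    for mount in mount_points:
        for name, workload in WORKLOADS.items():
            timings = []
            for _ in range(repeats):
                # Fresh scratch directory per run so runs do not share state.
                with tempfile.TemporaryDirectory(dir=mount) as scratch:
                    start = time.perf_counter()
                    workload(scratch)
                    timings.append(time.perf_counter() - start)
            results[(mount, name)] = timings
    return results


def summarize(results):
    """Reduce each list of timings to (median, min, max)."""
    return {
        key: (statistics.median(t), min(t), max(t))
        for key, t in results.items()
    }
```

Pointing `run_benchmarks` at directories on differently formatted partitions gives one row per (filesystem, workload) pair. Reporting the whole table, rather than the single row where your favourite filesystem wins, is the part that keeps the comparison honest.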
What Sun needs to do is release ZFS under a proper license so we can finally have 1 unified filesystem. Yes, we can use it under FUSE, but this brings unnecessary overhead and problems. It will be nice when we can transport disks around, similar to fat(32), and not have to worry about whether another OS will be able to read it or not. On top of that, CRC block checksumming, high performance, smb/nfs/iscsi support integrated, Volume AND partition manager.
Come on Sun! Are you listening??
I didn't know there was something other than FAT - when did this happen??? Can I get modules for 0.001 beta kernel? ;-)
It's a real killer.
zosxavius photography
Just my 2 bits. As a user of Linux in a software/algorithm context, my personal beefs with ext3 / the current kernel line are:
1) IO priority isn't linked to process priority, or at least, not in a decent manner. It is all too easy to lock up the system with one process that is IO heavy (or several of these) -- hurting even high priority processes. As the IO call is handled at the system level (handling buffering, etc.) -- it garners a relatively high priority (possibly falling under the RT scheduler) and as a result IO heavy processes can choke other processes.
2) ext3+nfs simply sucks with very large numbers of files. I used to routinely have directories with 500,000 files (very easy to reach such amounts with a cartesian multiplication of options). The result is simply downright appalling performance.
It is rarely an issue for me, but once in a while it is convenient to be able to plug a USB disk into a machine with Windows or Mac OS X. What portable file systems are there that will cover those cases? When I last looked around a few years back, I ended up concluding that the best option for a file system supported on both Linux and Windows was ext2 (with third party drivers for Windows). The only other file system supported on both was FAT, which has several drawbacks.
Moving forward, what file system will be the most portable? Are we going to be stuck with ext2 and FAT for file systems that we need to access across multiple operating systems, or is there going to be some journaled file system with support for large disks and basic unix semantics?
Do you care about the security of your wireless mouse?
A: Because it breaks the natural flow of a message.
Shutting down free speech with violence isn't fighting fascism. It IS fascism!
We're checksumming free disk space. That's dumb.
It makes RAID rebuilds needlessly slow.
We're unable to adjust redundancy according to
the value that we place on our data. Everything
from the root directory to the access time stamps
gets the same level of redundancy.
The on-disk structure of RAID (the lack of it!)
prevents reasonable recovery. We can handle a
disk that disappears, but not one that gets
some blocks corrupted. We can't even detect it
in normal use; that requires reading all disks.
We have extremely limited transactional ability.
All we get for transactions is a write barrier.
There is no way to map from RAID troubles (not
that we'd detect them) to higher-level structures.
With an integrated system, we could do so much
better. Sadly, it's blocked by an odd sort of
kernel politics. Radical change is hard. Giving
up the simplicity of a layered approach is hard,
even when obviously inferior. There is this idea
that every new kernel component has to fit into
the existing mold, even if the mold is defective.
but for grabbing attention, it seems to have worked. It happens all the time--ever watch the evening news?
We're all hypocrites. We all have hidden parts, it's the contrast between them that make us more a hypocrite than others
worth a mention for large media space.
http://www.dragonflybsd.org/hammer/index.shtml
is it ready for the desktop?
Filesystems are the new 'definitive, authoritative, all-encompassing audio userland' on Linux.
Religion is what happens when nature strikes and groupthink goes wrong.
..called TLDRFS It simply ignores any files larger than 64KB.
Never have I been so happy and so angry in such a short period of time. I salute you, yet still shake my fist angrily in your general direction.
If I mod you up, it doesn't necessarily mean I agree with what you've said, sorry.
because "one size does not fit all." Some file systems do better in, say, a database environment handling a large number of small files, while others do better at something else. If you want a standard fs for general use, that's what ext2 is for, as well as ext3 (which is backwards compatible with ext2). Can you use another fs other than what most distros decided upon? Sure, that's what freedom is about. New implementations are created because they are created with different goals in mind.
Windows is not without its own choices mind you either, fat(fat16), fat32, ntfs, WinFS(cancelled). As time passes, even microsoft attempts to create a new and improved fs (key-word: attempt). Sure they tend to force the latest fs on you but that's microsoft way VS linux way of choice.
This seems such a popularity contest. I usually don't know why I'd want one FS over another.
I propose we make a list of OS features that are desirable, and gravitate towards the one that can provide em all.
1) Transparency. Should be easy to move data from disk/vol to disk/vol at/near rated speed of interface.
2) On line snapshot via COW for online maint etc.
3) Journaling to ensure compatibility with power-offs. (control over where logs go too!).
4) Compatible with LVM structures.
5) Compatible with RAID structures.
6) variable block sizes.
7) Fast search via name or metadata.
8) Good with lots of little files
9) Good with GIANT files too
10) Mandatory Access Controls
11) Read only as needed.
12) Noatime as needed.
13) On line defrag/reorg.
Basically, I need to perform maint on File System, as well as needing to provide for operational adjustments for Speed or Security or End-User-Shutdowns.
14) Rsync from a block level would be nice too!
Just a few things. Am sure there are many more.
Perhaps the various F/S out there might be listed by a plusses/minuses rating level for a set of features.
I had used XFS a few years back and was amazed at how fault tolerant it is (slow with lots of little files, but.... rocked otherwise).
JWest
Maybe not for a desktop machine, but for servers I like to use XFS. That started way back when XFS was the first (and then only AFAIR) fs that supported running on softraid. It was not that long ago and CPU cycles were already so cheap on x86 that softraid was already a pretty nice solution for small servers.
For small servers I have not changed that setup (XFS on softraid level one on two cheap drives) ever since.
I guess for the big machines it might be very different. I am pretty happy with XFS as it is.
Could you tell us which song this post is a parody of? Thanks.
I don't really care about better filesystems. ext3, NTFS and Mac OS Extended seem to be extremely reliable and work perfectly well on their respective platforms.
The only real problem I have is there doesn't exist a modern journaling FS which would work just as well on all 3 platforms.
I can use ext3, but cannot plug it into a Mac.
I can use Mac's FS, but cannot plug it into Windows (unless I pay for a proprietary driver every time I use that disk on a different machine)
I can use NTFS, but cannot write to it on a Mac.
This is the real problem I have. I would like one of these 3 (preferably the open source ext3) to have perfect support on both of the other 2 OSes. And if there is a serious such project for ext3, I would be glad to contribute with a donation.
Hans was a jerk who was difficult to work with, and now he is a convicted murderer. That doesn't change the fact that Reiser4 as is may be the best desktop file system for Linux users, even with plenty of room for improvement.
There are filesystems in development like Btrfs and Tux3 that look promising, but why should Reiser4 be abandoned? It is GPL. Anyone can pick it up and maintain it, or fork it.
Does anyone know anything about the future of Reiser4?
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
Exactly. I can use ext3 in Windows, but only if I mount ext3 with the right options. For instance, when I installed openSUSE 11 on my dual-boot box, I decided to use writeback for journaling, and now none of the ext3 options for Windows can mount that drive. ntfs in Linux is painfully slow, even with ntfs-3g.
I'm looking to build my next box with 4 x 1.5TB drives and I'm debating how to partition and format it for a dual-boot Windows/Linux setup. Do I keep most of my shared media (I'm going to rip my entire DVD collection, not to mention my emulator/rom collection, e-books, MP3's, etc) on NTFS or EXT3?
http://blindscribblings.com - Tasty pop-culture in conceptual fashion.
As a result, many benchmarking attempts are very misleading, because they are often done by a filesystem developer who consciously or unconsciously, wants their filesystem to come out on top, and there are many ways of manipulating the choice of benchmark or benchmark configuration in order to make sure this happens.
The other side of the coin can come up as well. If you develop and test everything on one platform with one test workload, it's easy to unintentionally make it highly optimal for that exact setup and terrible elsewhere.
Whenever I have to install some server, I have a metaphysical question: ext3 or reiserfs?
Ext3 has a lot of advantages, including the possibility of fast recovery of files. While it is not needed often, at least once per year I have such a demand. On the other hand, undelete methods with reiserfs are very problematic.
Then again, my servers are usually up for a year or more. This means that most of the company's employees may go on a one-day vacation whenever I want to reboot a machine with a 4TB file system.
Any good idea to solve those two issues with one file system?
you are quite wrong about windows file systems.
fat16 is dos legacy and isn't used anymore. since windows xp came out, fat32 should be used only for flash memory so there is only ntfs.
winfs never was a filesystem (fs standing for future storage). it was just (more or less) an sql database of file metadata, running of course on top of ntfs.
"It's such a fine line between stupid and clever" -- David St. Hubbins, Spinal Tap
As a result, many benchmarking attempts are very misleading, because they are often done by a filesystem developer who consciously or unconsciously, wants their filesystem to come out on top, and there are many ways of manipulating the choice of benchmark or benchmark configuration in order to make sure this happens.
Wouldn't it be logical to assume a filesystem developer has an idea on what the workload and hardware will be like _before_ writing his filesystem, then picking a benchmark that suits his ideas on what a filesystem is supposed to do? No manipulation necessary, intentional or otherwise.
Repeato ad absurdium...
All these fancy features, but we are still using filename extensions (eg. .zip) to specify data types.
Did OOP even happen?
:T:R:A:N:S:
Well actually if you install drivers you can access ext3 on Mac. There is also a third party read write ntfs driver for Mac (but is slow) - But the point being that it would be nice to be able to use a filesystem out of the box on Mac/Windows/Linux that supports larger than 2GB file sizes
The only real problem I have is there doesn't exist a modern journaling FS which would work just as well on all 3 platforms.
I agree with you that's really important. I'd also like zfs to be that filesystem. However, as long as you don't need that drive to be the root drive of your respective file system, you might be interested in some of these links:
I can use ext3, but cannot plug it into a Mac.
Give this a try. The latest news is that you get write support in Tiger, but I use it in Leopard without problems.
Also don't worry about the ext2 part. Ext3 is designed to be backwards compatible with ext2. It can be mounted as ext2 (it just won't get journaling)
You didn't ask for it, so you might already know about this windows driver. There are actually a couple out there, I think that one works the best (which is kind of unfortunate, because it's freeware, but proprietary).
I can use NTFS, but cannot write to it on a Mac.
Sure you can, same way you do it in Linux, through fuse and ntfs-3g.
I can use Mac's FS, but cannot plug it into Windows (unless I pay for a proprietary driver every time I use that disk on a different machine)
Yeah, you got me there. MacDrive works really well, but I'd like a non-proprietary version myself.
For a removable drive that you can plug in anywhere, I'd go with ntfs actually. No FAT size restrictions, no permissions (actually a plus for a removable drive), and most linux distributions come with ntfs-3g installed by default. That means you only have to install the driver in mac os x
Warning: Opinions known to be heavily biased.
What do you want to specify the data type?
Some non-human readable meta data? If someone sends you a Not-a-virus.txt in an email attachment, what kind of file is it? An executable a funny story? How would you know?
This is not the funny you're looking for.
Seconded.
The one most wanted feature is the ability to store the mime type (or something like that) of a file.
With all the search features in modern OSes, even the file name should take second place to that.
Oh, who does that?
My OS certainly doesn't care.
NAME
file - determine file type
SYNOPSIS
file [-bchikLnNprsvz] [--mime-type] [--mime-encoding] [-f namefile] [-F separator] [-m magicfiles] file
file -C [-m magicfile]
file [--help]
DESCRIPTION
This manual page documents version 4.24 of the file command.
file tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic tests, and language tests. The first test that succeeds causes the file type to be printed.
The type printed will usually contain one of the words text (the file contains only printing characters and a few common control characters and is probably safe to read on an ASCII terminal), executable (the file contains the result of compiling a program in a form understandable to some UNIX kernel or another), or data meaning anything else (data is usually 'binary' or non-printable). Exceptions are well-known file formats (core files, tar archives) that are known to contain binary data. When adding local definitions to /etc/magic, make sure to preserve these keywords. Users depend on knowing that all the readable files in a directory have the word "text" printed. Don't do as Berkeley did and change "shell commands text" to "shell script".
The filesystem tests are based on examining the return from a stat(2) system call. The program checks to see if the file is empty, or if it's some sort of special file. Any known file types appropriate to the system you are running on (sockets, symbolic links, or named pipes (FIFOs) on those systems that implement them) are intuited if they are defined in the system header file .
The magic tests are used to check for files with data in particular fixed formats. The canonical example of this is a binary executable (compiled program) a.out file, whose format is defined in , and possibly in the standard include directory. These files have a 'magic number' stored in a particular place near the beginning of the file that tells the UNIX operating system that the file is a binary executable, and which of several types thereof. The concept of a 'magic' has been applied by extension to data files. Any file with some invariant identifier at a small fixed offset into the file can usually be described in this way. The information identifying these files is read from /etc/magic and the compiled magic file /usr/share/file/magic.mgc, or the files in the directory /usr/share/file/magic if the compiled file does not exist. In addition, if $HOME/.magic.mgc or $HOME/.magic exists, it will be used in preference to the system magic files.
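For anyone who has not looked at how the magic tests work, the idea fits in a few lines. The following is only a rough Python sketch of the approach the man page describes, not file's actual implementation: the signature table is a tiny hand-picked sample rather than the real /etc/magic database, the text check is deliberately crude, and the filesystem (stat) tests are left out entirely. The names `MAGIC_SIGNATURES`, `sniff` and `sniff_file` are made up for the example.

```python
# Illustrative signatures only: (offset, bytes expected there, label).
# The real magic database is far larger and more expressive.
MAGIC_SIGNATURES = [
    (0, b"\x7fELF", "ELF executable"),
    (0, b"\x89PNG\r\n\x1a\n", "PNG image"),
    (0, b"PK\x03\x04", "Zip archive"),
    (0, b"\x1f\x8b", "gzip compressed data"),
    (257, b"ustar", "tar archive"),
]


def sniff(data):
    """Classify a byte string by looking for an invariant identifier
    at a small fixed offset, falling back to a crude text test."""
    if not data:
        return "empty"
    for offset, signature, label in MAGIC_SIGNATURES:
        if data[offset:offset + len(signature)] == signature:
            return label
    # Crude text test: printable ASCII plus tab, newline, carriage return.
    text_bytes = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    if all(b in text_bytes for b in data):
        return "text"
    return "data"


def sniff_file(path, limit=512):
    """Read only the first `limit` bytes; enough for the offsets above."""
    with open(path, "rb") as f:
        return sniff(f.read(limit))
```

Note that nothing here consults the file name, which is the whole point: `sniff_file("Not-a-virus.txt")` reports what the bytes look like, whatever the extension claims.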
Because the operating system would tell you what the metadata means. I don't quite see how 'file.notavirus' and 'file' with metadata saying 'notavirus' makes any difference apart from the latter being more flexible. Still, nothing prevents us from doing both, and as it stands it's too hard to establish cross-operating system support for it to be feasible.
There are extended attributes and magic numbers. Some file managers seem to make limited use of these. I guess the problem with extended attributes is that they aren't guaranteed to be transferable along with the file when you, e.g., upload the file to a website, or transfer it to another computer. Only the main stream of a file is guaranteed to be transferred.
But make no mistake, the filesystem fully supports this kind of information. We just don't seem to make much use of it right now.
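As a small illustration of what "the filesystem fully supports this" can look like in practice, here is a hedged Python sketch that stores a MIME type in an extended attribute and reads it back. It assumes Linux (Python's `os.setxattr` and `os.getxattr` are only available there) and a filesystem mounted with user xattr support. The attribute name `user.mime_type` is an assumption based on a freedesktop.org convention, and the helper names are invented for the example.

```python
import os

# Assumed attribute name; nothing in the kernel mandates it.
MIME_ATTR = "user.mime_type"


def tag_mime_type(path, mime_type):
    """Try to record a MIME type as an extended attribute.

    Returns True on success, False if the platform or the filesystem
    does not support user extended attributes.
    """
    if not hasattr(os, "setxattr"):  # os.setxattr is Linux-only
        return False
    try:
        os.setxattr(path, MIME_ATTR, mime_type.encode("utf-8"))
        return True
    except OSError:
        return False


def read_mime_type(path):
    """Return the stored MIME type, or None if absent or unsupported."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        return os.getxattr(path, MIME_ATTR).decode("utf-8")
    except OSError:
        return None
```

The fallbacks are there for exactly the reason given above: copy the file to a FAT stick or upload it somewhere and the attribute is gone, so any program relying on it has to cope with `None`.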
Sad to see JFS being overlooked so. While it may not have the postmodern features to compete in the wake of ZFS, it's still in many cases the best current filesystem for linux. It's remarkably crashproof, has the lowest CPU loading of any of {ext3 jfs xfs reiser3}, good all-round performance (generally either first or second in benchmarks) and is fast at deleting big files. I haven't used anything else in a couple of years - I used to put reiser3 on /var, but got fed up with its crash intolerance. It's sad to see jfs so overlooked, because at least until btrfs or tux3 come out it's arguably the best option available.
"'I pass the test,' she said. 'I will diminish, and go into the West, and remain Galadriel.'"
- JRR Tolkien.
seriously how many filesystems do we need? this is the failing of OSS, too many projects inventing the same thing splitting up their developing efforts so neither of them achieve anything.
Jeez, choose one FS and stick with it.
Just like everyone else does? Oh wait...
We've got FAT32 for thumb drives, NTFS for Windows hard drives, HFS+ for OS X hard drives, and ISO9660 and UDF for CDs/DVDs -- several versions of them, in fact, with Rock Ridge extensions, Joliet extensions, plus UDF2...
Now, most distros do stick to the standard pattern of one swap partition and one big root partition, formatted as ext3.
However, just like the rest of the world, we recognize that different situations call for different filesystems. UbiFS is a filesystem sitting directly on flash media (instead of above some wear-leveling ATA-emulating layer designed for FAT). sshfs is a (fuse-based) filesystem for accessing remote filesystems via ssh. Squashfs is a read-only, heavily-compressed filesystem, ideal for use on livecds and DVDs. And then there are the journaling filesystems like ext3.
It would be stupid beyond belief to suggest that we should unify those things into one giant, lumbering, monolithic, bloated filesystem.
Instead of one stable implementation you have this whole business of a dozen half-assed implementations again.
Look at the above filesystems I've mentioned. What is half-assed about them? Please be specific.
If Linux coders were designing "standards", we'd have TCP/IP, TCP/IP2, TCP/IP3, TCP/IP4-beta5-pre46 and what not.
Actually, I'm betting quite a lot of Linux coders were designing them -- and the situation isn't far from what you describe.
We still have ipv4 and ipv6. We also have TCP, UDP, and ICMP. We have implementations of TCP over UDP. We have older protocols like AppleTalk and IPX -- dead in most places, but there's a reason Linux still supports them. We have NetBIOS and DNS. We have Zeroconf and Bonjour.
This right there shows you what's wrong with Linux.
Really?
No, filesystems are actually one place Linux is king -- with the possible exception of ZFS, but that's being rectified. We support just about everyone else's filesystems -- from 9pfs to sysvfs, and everything in between. With FUSE, we even support a few filesystems that won't go into the kernel, for technical or legal reasons.
In fact, it speaks quite loudly about the quality of desktop Linux that you think this is what's wrong with Linux. You could have chosen to complain about how slow suspend/resume is on some devices. You could have raged about the lack of support for some obscure wireless card, or whined about being forced to modify your kernel commandline for your touchpad to work.
These might all be valid complaints, but you didn't raise them. Instead, you chose to complain about how many filesystems we have -- how dare we provide so much choice!
And seriously, what kind of fucking ridiculous name is Tux3?!
The same kind of fucking ridiculous name that FAT32 is. Or maybe you like HFS Plus?
Don't thank God, thank a doctor!
Over and above this, it'll need a new name. I know it doesn't make one iota of technical difference, but people are fussy about such things; change the name, and people don't care if it was developed by fiends. Keep it and people will find excuses to edge away and it'll wither on the vine.
The Volkswagen was a runaway success despite its Nazi origins, but had it been named the "Hitlerwagen", things would have probably turned out a lot differently.
Ext3 is actually ext2 with a special file used for journaling. In fact you can create this journal on existing ext2 partitions to "convert" them to ext3. Assuming that the machine was shut down correctly, and all the data in the journal has thus been flushed to the disk, ext2 tools should mount it just fine as an ext2 volume - and if they don't, it's time to run fsck and fear the worst.
Forget magic. Any technology distinguishable from divine power is insufficiently advanced.
How are filesystems that were designed from the ground up to be journaled filesystems "essentially bolt journaling to the traditional file system UNIX layout"?
Did you actually bother to look at the layout of an XFS filesystem?
You refer to a Debian article on XFS CPU usage, but fail to mention that the conclusion of the article is "Based on all testing done for this benchmark essay, XFS appears to be the most appropriate filesystem to install on a file server for home or small-business needs" and "Personally, I still choose XFS for filesystem performance and scalability".
You then go on to continue the myth about XFS corrupting your data, a myth that has been debunked countless times. Changes were made to XFS some time ago so that it behaves in the way that users were expecting.
It's a somewhat half hearted blog in my opinion and I'm surprised that slashdot picked it up.
There is folly and foolishness on the one side, and daring and calculation on the other. - Admiral Pellew, Hornblower
Windows is not without its own choices mind you either, fat(fat16), fat32, ntfs, WinFS(cancelled). As time passes, even microsoft attempts to create a new and improved fs (key-word: attempt). Sure they tend to force the latest fs on you but that's microsoft way VS linux way of choice.
Fat16 has been deprecated for ages (decades, I think). Windows only supports it for old floppy disks you have hanging around; it won't let you format any new devices in Fat16. Fat32 is only there for compatibility with Windows 98 and ME; the only reason XP lets you format a new disk as Fat32 is that it's (supposedly) possible to upgrade from Windows 98 to Windows XP, and so it needs the ability to use the same filesystem.
WinFS isn't a filesystem in the same way Ext2 or NTFS is. It's more of a database-backed meta-data layer sitting atop NTFS.
Since Windows 2000, the only real Windows filesystem is NTFS.
Sure they tend to force the latest fs on you but that's microsoft way VS linux way of choice.
1) XP will install and run on Fat32, if you really want to. (Windows 2000 also, IIRC, but it's been awhile.)
2) But would any sane person do that? Seriously, sometimes there's an obvious "best approach". Criticize Microsoft all you want when they deserve it, but that's ridiculous.
Comment of the year
Fat16 has been deprecated for ages (decades, I think). Windows only supports it for old floppy disks you have hanging around; it won't let you format any new devices in Fat16. Fat32 is only there for compatibility with Windows 98 and ME; the only reason XP lets you format a new disk as Fat32 is that it's (supposedly) possible to upgrade from Windows 98 to Windows XP
Maybe you are talking about MS's advanced filesystem tools, but there's an article from MS that points out the bizarre fact that Vista's dialog for "format unformatted drive" defaults to FAT16.
To wit: "As for Windows, I would have expected it to always default to FAT32, but a quick look at the Format dialog's pick for one of my USB drives showed I was wrong." -- Mark Russinovich
There aint no pancake so thin it doesn't have two sides.
on Windows i can see the file extension of every file on my hard drive. i determine the file type based on the same attribute that my shell does. if i get a file attachment or am browsing a directory, i can immediately distinguish executables from non-executables. if i'm looking for a PNG image, i just look for the appropriate icon and the .png extension, and i can double click on the icon and open the image without the possibility of accidentally running a malicious executable.
however, on a lot of people's Windows systems they have explorer configured to hide known extensions. so the shell still uses file extensions to determine file format, but they're now relying solely on the file icon to indirectly determine file type. but since executable files can have embedded icons, it's very easy for an attacker to give a file a deceptive name and icon, disguising a virus or trojan as an image or text document.
sure, the user could right-click on the file and select "Properties" to look at the "Type of file:" field. however, doing this for every single file you want to examine is very tedious and time-consuming. most people simply aren't going to go through that kind of hassle. imagine if you have to examine a directory with 100 images in it. are you going to open the properties dialog 100 times, once for each and every file?
using meta data or magic number to determine file format would have the same drawback. how would you determine the format of a file at a glance using meta data? you wouldn't have a safe/accurate and intuitive means of determining file type.
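One way a shell could surface this "at a glance" is to compare what the extension claims with what the first few bytes say, and flag the disagreements next to the file name. The Python sketch below is only an illustration of that idea, not how Explorer or any real file manager works; the tables are tiny hand-picked samples and the names `EXPECTED_MAGIC`, `check_file` and `scan` are made up.

```python
import os

# Illustrative table: extension -> leading bytes we expect to see.
EXPECTED_MAGIC = {
    ".png": b"\x89PNG\r\n\x1a\n",
    ".gif": b"GIF8",
    ".zip": b"PK\x03\x04",
    ".pdf": b"%PDF",
}
# Leading bytes of two common executable formats (ELF, DOS/Windows "MZ").
EXECUTABLE_MAGIC = (b"\x7fELF", b"MZ")
# Extensions (or no extension) under which an executable is unsurprising.
EXECUTABLE_EXTENSIONS = {".exe", ".dll", ".so", ""}


def check_file(path):
    """Return a verdict comparing the extension with the file's content."""
    with open(path, "rb") as f:
        head = f.read(16)
    ext = os.path.splitext(path)[1].lower()
    if head.startswith(EXECUTABLE_MAGIC):
        if ext in EXECUTABLE_EXTENSIONS:
            return "executable"
        return "disguised executable"
    if ext in EXPECTED_MAGIC:
        return "ok" if head.startswith(EXPECTED_MAGIC[ext]) else "mismatch"
    return "unknown"


def scan(directory):
    """Return sorted (name, verdict) pairs for regular files in a directory."""
    report = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            report.append((name, check_file(path)))
    return report
```

It is a heuristic with obvious false positives (a text file that happens to start with "MZ" gets flagged), but it covers the 100-images case: one `scan` of the directory instead of opening the properties dialog 100 times.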
Ext3 is a hack on top of Ext2. It's popular b/c distributions default to it b/c it's easy to support and when you hit "create filesystem" it defaults to ext2/3. In the long run... XFS is... 1. Faster. 2. Resizable. You buy another hard drive and just add it. It takes a while but it works. 3. Built for journaling. Ext2/3 has its place. Boot partitions and portable drives. But to indicate that ext3 or even ext4 is state of the art or has "won the war" and that xfs or jfs haven't gained traction is... well... wrong.
Is there any plan to support ANSI T10-DIF in EXT4 either initially or in later patches?
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Yes, there is a perfectly sane reason to use FAT32 on a modern computer: the lack of an ACL node to be updated means that compiles that are I/O limited can see a HUGE speedup by being targeted at a FAT32 partition. Under Windows 2000 I was blown away by the 250% compile speed performance one developer got by using a relatively small FAT32 partition for his compiles.
The other reason is of course to format any type of media that needs to be used in another computer or a non-PC electronic device.
The default behavior of EXT filesystems to launch a sometimes lengthy fsck while booting really throws computer novices for a loop; they get scared, or impatient or worse.
Also, I don't think the logic of using EXT3 for large partitions containing large files quite holds up: 1) deleting large files is slow, 2) when fsck strikes you are bound to have a long wait.
EXT3's forte isn't small files either. I just don't think it's a very good filesystem. If EXT4 doesn't address more than storage space issues, then I'll take a pass on that as well.
XFS is still inappropriate for most hardware setups, IMO.
I think that will pretty much leave ReiserFS and JFS as my main options.
Who is using that? Debian/GNOME and it doesn't matter for me what the extension is unless the file mime-type type isn't discoverable from the file itself (e.g. it's empty).
Put identity in the browser.
Pray that your data is truthful! (see integrity)
Pray that your overhead is low and performance is high!
Pray it works with your hardware!
Pray that it scales!
Pray that it is reliable!
Pray that you can easily back it up and restore it!
We are currently accepting donations and prayers...
- Rev Daryl McBush
I don't think that there is a 100% "safe and accurate" way to display the file type, assuming you are depending on a possibly-hostile file to supply the information in the first place. There are, however, a few things that an operating system can do to make life safer for users:
1) Clearly mark executable files. Have some visual indication whether a file is set to be executable (this, of course, assumes that your operating system has an execute bit; if it doesn't, that's a bigger problem). This indication should be consistent, universal, and impossible to override with metadata or custom icons. It should apply both to CLI shells and GUIs. (Although not necessarily in the exact same way; however my personal preference for such an indicator, which is putting the file name in bold, would work both in a GUI and CLI environment.)
2) Don't use the same action to execute as to open. Using the same action (the double-click) both to "run" and to "open" -- which are two very different actions -- is probably responsible for the vast majority of user-propagated malware today. I would love to see an operating system rigorously enforce a separate 'run' action, so that a user clicking on what appears or claims to be a data file (intending to open an application and read that file) could not accidentally execute it.
3) Break the filesystem into 'data' and 'executable' sections, and bar files on the 'data' sections from being marked as executable under any circumstances. I don't think this would be as effective as #2, but it would probably involve less user retraining. In order for content to be executed, it would have to be copied or installed onto the executable partition (which in normal operation could even be mounted read-only).
You could do all of this with the data-type indicator as part of the file name, or as a separate piece of metadata; it doesn't really matter. There's no 'safety' advantage to doing it either way, it's just that keeping it in the file name is considered very ugly by a lot of people (myself included). I'm personally a fan of the way that the Mac used to do it, with a two part code (one for the file's actual type, the other for the application that either created it or should be used to open it), except that unlike the Mac, it should be easily editable by the user, and a lot of standardization and interoperability challenges would have to be solved. I'll be surprised if I see the filename.ext thing die in my lifetime, honestly. It's just too entrenched.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
My command line does exactly what you ask for in the last paragraph by using different colors to represent different file types. I'm sorry, but I cannot be convinced that a gui has less capability than a terminal to represent information using non-textual visual cues, or even a handy-dandy sidebar with file information, including file type.
GNOME makes use of magic numbers and MIME types. Nautilus will detect file types regardless of extension... I do use file extensions, though, for interoperability with other systems in the house (like WinXP and the Xbox.)
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Repeato ad absurdium...
What is that gibberish supposed to mean? Christ, I hate mock-Latin. If you want a fancy-sounding term referring to repeating something again and again, use ad nauseam.
i will admit, i hadn't thought of those other possibilities, and i'm glad you brought those to my attention.
however, it's hard to strike a fine balance between usability and security. and if you make users choose between the two, most will choose usability. that is why i like having file extensions, as it provides usability without significant compromises to security. you could use file icons to indicate file type to the user, but then you would have to disable custom or embedded icons. now, this may be acceptable in a business environment, but custom program icons are a basic aesthetic feature that most consumers have come to expect. and aside from aesthetics, easily distinguishable program icons also significantly enhance usability.
also, the problem with using separate commands for "run" and "open" is that, generally, users expect to perform the default action for a particular file type on double click, regardless of what file type it is. for an audio file it might be "play", or "queue to playlist", and for an image file or text document it'd be "open" in the default editor, and for an executable one would expect it to be "run." i think if the interface gives a clear (and accurate) indication of file type then it's really not necessary to require different actions for "open" and "run" commands. this can be achieved either with visible file extensions, or by putting an executable in bold as you mentioned. though using an execute bit that's off by default might be a prudent policy.
another thing i'd like to point out is that file extensions make searching for specific file types much easier (well, in theory at least; the Windows XP's built-in search feature is absolutely worthless) since you can type something like "*.css" to find all CSS files in a directory. i'm not sure how else you can handle such searches in an intuitive manner.
Resizable, to me, implies growable and shrinkable. XFS is not shrinkable, unless you know something I don't.
The biggest would be it is easy for the user to see what the file type is supposed to be and thus what it'll open with, and change that, if necessary. For example on my system anything with the extension .txt is going to open in UltraEdit. So if something comes in claiming to be a text file but is actually something else, doesn't really matter, the text editor is what opens it. There isn't any room for trickery of making it look like one thing but having the OS open it in another. The extension determines what app is launched and I determine the extension to app mapping.
Also of course it is a simple way of indicating type that doesn't rely on file systems with multiple data streams. That can be a nice feature, but what happens if your file hits a system without it? Mac users dealt with that for years, and still do to some extent, due to Apple's use of multiple data streams. Something would go out to an FTP server and all of a sudden Macs didn't know what it was since the resource fork was gone.
are you just trolling or have you seriously never heard of magic numbers? I often forget to name the type of file and have never had any problems, but generally it is a good idea to use a file.type naming system because it means you can differentiate between text, scripts, configs, etc easily.
IranAir Flight 655 never forget!
I bought mine from Nostradamus. The gould is very nice. Get tour own pick and dig it up yourself, before the askenazim jews bury this technology so they can have their shakira glory. Joseph, that 2-headed kike kraut, he too must be killed; he wears the horns of power as in Psalm 92. Get on withe the spanking scene, zoot: for great justice of the pleasure Chief Wounded Bagpipe. - Yes Massa'd, Israelean Inteligence operator 409, cleans with a shine.
not sure if much on-topic, but Apple Mac(intosh) has done without them for its entire life, still going through various kernel "replacements" and the like.
File extensions have nothing to do with file systems, in fact on an older non-graphical Unix system you wouldn't use them at all...
It's a (double) "resource" issue:
1) On the Mac files used to have a "resource" part in addition to the plain raw data, and in the former you would specify the application that created the file and the filetype itself;
2) On the Internet most of the hosting is done on Unix Servers, but most of the content comes from Windows computers - thus the ubiquitous useless file extension...
3) The world could easily do without file extensions TODAY, given that most of the files are opened via a browser of some sort, it would be a one-day switchover to start using MIME headers to decide on the filetype and say adieu to the threesome extension...
Just my 2€cent
I dont think it would be logical at all.
First, It would mean that each workload would require a different filesystem design.
Then it would also mean that you dont need synthetic benchmarks at all, just run the expected workload as your "benchmark".
...probably one of the world's best filesystems (Digital's AdvFS) has been released as open source and it seems that nobody cares...
I've used it a lot on Tru64 and I am quite sure that it would be a real hit for Linux.
We are trying to reinvent something we just can take for free...
I can use NTFS, but cannot write to it on a Mac.
O but you can. First, you install MacFUSE and then install NTFS-3G on top of it. You can even format disks as NTFS with Disk Utility.
Repeato ad absurdium...
All these fancy features, but we are still using filename extensions (eg. .zip) to specify data types.
Did OOP even happen?
I really don't know what you're talking about.
Apple Macs don't need the filetype specified in a file extension. Neither does Gnome, nor (IIRC) KDE - and the underlying OS certainly doesn't.
In fact, if you go back in time a little, RISC OS didn't either - it stored the filetype in a small piece of metadata. (And applications were actually directories but with a filename starting in "!" - the OS automatically ran a program called "!Run" inside the directory when you double clicked on it. OS X does something fairly similar today).
The only operating system I can think of in common use today which does is Windows - but seeing as we're talking about Linux filesystems, you can't possibly be referring to that.
Then you've never used a windows gui ;)
This is not the funny you're looking for.
Your post is precisely one example why I wish OOP never did happen. Now we have waves of people (like you!) demanding that we hide stuff left and right. What's wrong with a file extension? Why can't I see what my files are? Why do I have to rely on some complex filesystem to determine things for me?
Stop STOP trying to hide stuff! My filetypes are NONE of your business. (And your bloated OOP code to implement it does not belong in my kernel either.)
since some still don't know what the proper name of reiserfs is. It's getting a bit old.
Twitter supports and protects racists - by smearing their critics with the "Hate Speech" label.
Look around, there are people who like to dominate and like to be dominated. ...
It's all one bestiary world where the strongest will survive and the weakest will just follow like cattle
There would be no single hair on my body who would think to drink poison, just because of belief.
Believing really controls that much, I would think these people were semi-forced drinking their coolaid by misusing their (trust and) beliefs!
Just like the phrase "If you believe good enough in it, it will happen" it's all based on common willpower...
If not, explain me why placebo's work in some cases ... belief!
To end this rant ...
I might have found "the true poison" in belief, it's not the name or origin but it's the abusal of it's ancient words towards unfettered deeds...
--- I am known for the ones who want to find me on the net. Is that a privacy risk or a privilege? One might wonder..
All DB folks love JFS due to its extent-based allocator (contiguous chunks of up to 4MB iirc).
I've done enough benchmarks and JFS outperforms ext3 by around 10-20%.
Further, when hosting big files the space efficiency is notable (no more block lists, but only a few extent entries).
JFS had a small problem in 2.4 when there was a memleak in the metadata paths so a fileserver would run out of memory after a couple of weeks. But I traced that down together with Shaggy@IBM.
XFS is also nice, but the lack of a proper userspace fsck has turned me away there.
Also, come back to me when ext3/ext4 has *dynamic* and not static, pre-allocated inodes. I have numerous data sets where we'd run out of inodes because we'd have to host a LOT of small files with a LOT of directories (simulations etc).
Reiserfs outperforms all of them here.
I personally run all home and group drives on reiser due to its space efficiency for smaller files and short fsck times (yes, we are paranoid and run regularly full fscks on our TB data partitions! And you should, too)
Don't get me wrong - ext3 has proven to be a good filesystem for the casual user and for the system partitions. But for serious workloads there are better alternatives.
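For anyone who wants to check whether they are close to the inode limit described above, here is a minimal sketch in Python, assuming a POSIX system where `os.statvfs` is available. The helper name `inode_usage` is made up for illustration. Filesystems that allocate inodes dynamically may report a total of zero, so the sketch returns `None` for the percentage in that case rather than dividing by zero.

```python
import os

def inode_usage(mount_point):
    """Return (total_inodes, free_inodes, percent_used) for a mounted filesystem."""
    st = os.statvfs(mount_point)
    total = st.f_files   # total number of inodes the filesystem reports
    free = st.f_ffree    # inodes still available
    used = total - free
    # Some filesystems report 0 total inodes; avoid dividing by zero.
    percent = (100.0 * used / total) if total else None
    return total, free, percent

if __name__ == "__main__":
    print(inode_usage("/"))
```

On a statically allocated filesystem, a high percentage here with plenty of free blocks left is the "ran out of inodes" situation.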
When will people actually write articles about filesystems that matter? The linked blog is just a horribly assorted list of cliches.
NTFS-3g in Linux (and OSX, and probably all other) is ridiculously fast. Benchmarks: http://www.ntfs-3g.org/performance.html and in my personal experience I can't tell when I'm on my ext3 partition and when on the NTFS partition. The best thing is the excellent crossplatform support, it just works everywhere (if NTFS-3g is available of course).
Confused - what's OOP got to do with this?
On Windows, the extension is the execute bit. At least that is more or less the case. I don't know at what layer that is implemented. It is a convention inherited from DOS. In DOS the semantics of the execute bit (extension) was implemented by the shell rather than the kernel, might still be the case in Windows. Might not be the best idea, it is just the kind of heritage that is hard to get rid of.
Good point. I hadn't thought about it that way. But thinking about it a bit more I realize that when working from the command line, those two really are different. The main reason you don't unintentionally run an executable from the command line is that there is a difference between typing less virus.txt and ./virus.txt.
Do you care about the security of your wireless mouse?
Maybe it would be ok if you had to go through some additional steps the first time you run an executable that wasn't installed on the system. A dialog that asks you if you really want to run this program would be a step in the right direction. (But we know that such a solution is too convenient, many users really need something more inconvenient). You could open a window explaining that to run this program you have to right click on the icon and choose to mark it executable.
There are different ways to keep track of what files are ok to be executed. You could have an execute bit on the file. You could have a list of accepted file paths stored somewhere, or a list of accepted file hashes.
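As a rough sketch of two of those options combined (an execute bit plus a list of accepted hashes), assuming a POSIX-style permission model: the function names `sha256_of` and `may_execute` and the `approved_hashes` set are hypothetical, not part of any real operating system policy.

```python
import hashlib
import os
import stat

def sha256_of(path):
    """Hash a file's contents in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def may_execute(path, approved_hashes):
    """Allow execution only if the owner execute bit is set AND the hash is approved."""
    mode = os.stat(path).st_mode
    has_exec_bit = bool(mode & stat.S_IXUSR)
    return has_exec_bit and sha256_of(path) in approved_hashes
```

The hash check means that a file modified after approval has to be approved again, which a plain execute bit or a path list would not catch.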
Why would you right click on it? Why wouldn't the operating system tell you what type the file has automatically?
You're insisting we have to make the file type part of the filename because of some false assumption that the operating system can and/or will only ever tell you the name of the file, that for some reason the name would only ever be the exposed metadata presented to the user. There is no reason why this would be the case.
The name is the name. There is no reason for the name to contain the file type any more than it should contain the date the file was created or the name of the person who created it. One could come up with equal justifications for either of those pieces of metadata to be part of the filename, and if it were the case that some third rate single-tasking binary loader like CP/M had done just that, and we were still using filesystems based upon said operating system's quirks, I don't doubt many people - probably yourself included - would come up with complaints about us moving that metadata out of the filename. "But how am I supposed to know who created this file?" they'd cry.
It's not the right place for this data, especially in a world where 50% of the population have no idea that .EXE is a program not a document, and 90% have no idea that .BAT is likewise.
You are not alone. This is not normal. None of this is normal.
I suggest everyone tag this article "whore", "whoring", "blogwhore" and "blogwhoring". Because that is what it is.
It's about a silly ideal of "code going with data". To anyone reasonable, on the Intarbutts that's a bloody horrible idea just because of the security implications of running code downloaded from J. Random Webpage on the client computer.
Yet these OOP wankers call structs and arrays "plain old data", as though being time-tested and proven over some fifty years was suddenly a bad thing because it isn't the Latest New Thing. Apparently some people will do anything to have more mechanisms, particularly ones that dictate policy, in the name of hypothetical "flexibility" that never appears or is required.
An extension is an easy way to organize something. I can write a script that say find .jpg and move them to my images folder. If I need to use metadata, my job just got harder because I have to know now what I'm looking for.
Granted, if you're complaining that a .zip shows up as a zipped icon based only on .zip, then yes, it's a bit absurd.
No, resizable implies that the size can be changed. It does not imply both making it larger and smaller; in fact, a filesystem that can only be shrunk would qualify as resizable.
You'll have to find another expression.
Wouldn't it be logical to assume a filesystem developer has an idea on what the workload and hardware will be like _before_ writing his filesystem, then picking a benchmark that suits his ideas on what a filesystem is supposed to do?
No, that would be illogical, unless again they were trying to craft bullshit benchmarks. The developer does not know how I will use the filesystem, and so any such benchmark is not useful to me. I also want to know how well the filesystem will perform if I have to perform some new task on it.
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
That some people are idiots is no reason to do away with a convention that's been extremely useful to those of us who aren't.
Sure, you could get operating systems to display the file type. But could you get EVERY operating system to use the same convention again? How about every email program, website, or file transfer application as well?
Most importantly though, could you get every bit of code everywhere to display the file type as instantly and unambiguously as foo.bar?
You don't make changes just to make changes, Barak. You make changes when they're going to be useful enough to justify all the ass-pain they're going to cause.
Violence is like duct tape. If it doesn't solve the problem, you didn't use enough.
Maybe it would be ok if you had to go through some additional steps the first time you run an executable that wasn't installed on the system. A dialog that asks you if you really want to run this program would be a step in the right direction.
You mean like OSX does?
I don't recall seeing that on OS X. But then again, the times I have actually put a GUI program on an OS X machine can be counted on one hand. Turns out that for work a browser and a terminal will suffice most of the time. And the laptop came with firefox, terminal, and ssh client all installed.
Like Java or flash, it shouldn't matter where an executable comes from; the operating system should make it safe to execute by restricting what it can do. Give it no rights to the network, file system, and limited ability to draw things to the screen by default, and only allow privilege escalation by well defined user interface dialogs, and even then never a blanket permission to read or write to the entire file system, or to open up an unlimited number of network connections, or draw fake security dialog boxes on the screen.
What is this, 1980 when a DOS program had full control over the entire computer? It's 2008, and it should be slightly easier to do real privilege separation.
XFS is also nice, but the lack of a proper userspace fsck has turned me away there.
Eh? Man xfs_repair(8)
Just because it's not called "fsck" (and not run at boot time) does not mean that the functionality is not there when you need it.
A crash does not mean you need to run fsck; that is why you pay the price for the journaling overhead, right? When xfs detects errors at runtime, run xfs_repair, and bask in the glory of "a proper userspace fsck."
The OP was retardo infinitium.
OS X does not do anything of the sort.
All we need is a find(1) that can do something like this:
find . -mime image/jpeg -exec mv {} /var/somewhere \;
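Since the `-mime` test above is a wish rather than an existing find(1) option, here is a minimal Python sketch of the same idea, assuming that checking the first three bytes for the JPEG signature is good enough. The function name `find_jpegs` is made up, and a real tool would more likely lean on something like file(1) than on a single hard-coded signature.

```python
import os

JPEG_MAGIC = b"\xff\xd8\xff"  # the bytes a JPEG stream starts with

def find_jpegs(root):
    """Yield paths under root whose content starts with the JPEG signature,
    regardless of what the file is named."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    if fh.read(3) == JPEG_MAGIC:
                        yield path
            except OSError:
                continue  # unreadable file: skip it
```

Moving the matches would then be a loop calling `shutil.move(path, "/var/somewhere")` over the results. A text file named `notes.jpg` is skipped, and a JPEG with no extension is found.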
Usage: km/h for speed (kilometers per hour); kph for very slow impulses (kilopond hours).
Yes, it does - albeit new in Leopard, and only for downloaded applications (but that's the source of most new executables on most systems these days).
Here is a guide on how to disable it... http://www.macosxhints.com/article.php?story=20071029151619619
I think transami was confusing ad nauseam with reductio ad absurdum.
Isn't this kind of crying out to be automated?!
... on more factors than just a group of people, even if I know them very well.
If there is a fire ready to toast my body, ok, I'll probably use the same exit wisely if no others are found. ;)
Again, if I believe I'm going to toast in 0-3 seconds like in the movies, it would be too late anyways to do stupid things
It's not to specify a data type, it's to give the user (and file browsers) a hint as to what it is. I can call my zip file "foobar" if I want, but only the person who created it would know what it contains (unless it were documented somewhere, or a browser pre-emptively opens the file to determine it).
Say I create a directory with some extensionless© files in it, and I run the following command:
I now have a text file (although only I know it's a text file) with a list of files that only I know what they are (or maybe I don't!)
Homonyms are fun!
You're driving your car, but they're riding their bikes there.
Windows XP already does something like this for downloaded executable files. It seems to have been introduced in SP2, and will pop up the warning before running an untrusted program:
"The publisher could not be verified. Are you sure you want to run this software?"
The third in a series?
Semi-automatic amateur armchair Australian philosopher; conjecture ready at any moment...
Well, apparently a lot of programmers think it's a good idea to put the type of something in its name.
Like strName. It's just useful to know what you're dealing with in general.
"...it seems somewhat tragic that JFS or even XFS didn't gain the traction that ext3 did to pull us through the 'classic' era...
I always had high hopes for Reiser FS... it was a real killer.
That is all.
You are definitely off-topic here since the discussion was on Linux file-systems, however you have brought up an interesting argument that has been running over the years. In reply I have always found that so called file extensions are meaningless in a Linux/Unix environment and in a MS Windows environment the dependency on file extensions is not a good way of managing your files, in fact this dependency on file extensions is one of the main reasons why many MS Windows users get fooled into running what they think is an innocuous file.
:-)
When using a GUI browser in Linux/Unix you can normally see what type of file you have because many decent browsers use the meta data in the file not the file extension and display the information accordingly. It is even easier on the command line where the command "file *" is a great way to list the attributes of each file. You can list thousands of files quickly by doing this.
As for needing to know how to distinguish executables from normal files: in Linux/Unix you can see this in a decent browser or just use the command "ls -l", although I would definitely query why you want to mix executables with non-executables, since in Linux/Unix this is a poor approach to file management.
I have always found that running a command via a browser is risky unless you know what that command does. Executables in Linux/Unix rarely use the "exe" extension and are normally located in clearly defined directories such as "bin", "sbin" and even "lbin".
Being a mild Grammar Nazi I think having a capitol letter at the start of a new paragraph is a better way to express yourself
There ain't no such thing as proprietary standards only proprietary formats. Standards are by definition open.
A few months ago I noticed quite a serious problem with Ext3. A power outage caused an error on the primary Ext3 superblock. Ext3 driver has a problem with the specific SATA error and refuses to mount filesystem in normal mode and displays some cryptic SATA errors instead (but it does mount in recovery mode). FSCK is unable to fix the error, even when feeding it the correct offsets of superblock copies, because it gets thrown off course by the same SATA read error and it quits before even starting any work using alternative superblock copies. So a single sector error causes you quite a big problem, practically you are forced to copy the partition to fix it quickly, not to mention the time wasted to figure out what's going on.
Not exactly Object-Oriented Programming, but IBM has had this concept on the System i (AKA AS/400) for some time now: http://en.wikipedia.org/wiki/AS/400_object. Best command line interface ever...
So is it totally unreasonable to look at existing workloads that perform badly, then create a new filesystem optimized for those workloads, then benchmark against those workloads? Maybe the developer doesn't know how YOU will use the filesystem but if you don't know for what kinds of work it was optimized, you might be using the wrong tool for your new task.
"Being a mild Grammar Nazi I think having a capital letter at the start of a new paragraph is a better way to express yourself :-)"
Fixed it for you.
BeOS used human-readable MIME types to specify file types. They were automatically assigned to new files by the system, using magic numbers and falling back to the extension if magic wouldn't do it.
- chrish
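The "magic first, extension as fallback" order described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how BeOS implemented it: the `MAGIC_TABLE` entries and the `guess_type` helper are assumptions made for the example, and the table covers just four well-known signatures.

```python
import mimetypes

# A tiny, illustrative signature table; real detectors know far more formats.
MAGIC_TABLE = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"PK\x03\x04", "application/zip"),
    (b"%PDF-", "application/pdf"),
]

def guess_type(path):
    """Return a MIME type: content signature first, file extension second."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    for signature, mime in MAGIC_TABLE:
        if head.startswith(signature):
            return mime
    # No signature matched: fall back to the name, as a last resort.
    mime, _encoding = mimetypes.guess_type(path)
    return mime or "application/octet-stream"
```

With this order, a PNG saved as `picture.txt` is still reported as `image/png`, while a plain text file only gets a type because of its `.txt` name.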
There's some convenience in having type be part of the name. Do you want to type rm *.o or rm --filetype=compiled_binary *?
As copyright owner of this comment, I authorize everyone to defeat any technological measure which limits access to it.
The benchmarks are useful but still misleading in many cases. It's okay to say "this filesystem performs like this on this workload" but all too often the results are used to say "mynewfs is 100 times faster than yourmamasoldfs!" when in truth it's only 100 times faster at setting extended attributes or something. A useful benchmark will at least show how a filesystem holds up under non-optimal conditions, so that you don't have a massive fail when the system is just a little bit "off".
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
OS X puts up a warning when you open a document and it begins, as a result of being associated to a particular app, to run a program that you've never used before. If you explicitly launch an application I don't think it warns you; it assumes you know what you're doing. But since opening a document could have the effect of launching an application you didn't mean to open, and this might be non-obvious even if you're a careful user, it warns you if it's never been run before.
I'm not sure it's really meant as a security feature as much as a convenience one -- I typically get it as a result of just having installed some utility that's associated itself with a wide swath of common file types.
"Ladies and gentlemen, my killbot features Lotus Notes and a machine gun. It is the finest available."
Sure, a benchmark can be misunderstood or misinterpreted and vendors often count on this to work in their favour. But that doesn't negate the fact that the filesystem is 100x faster at something, and if your workload depends on setting extended attributes (to use your example) a lot, that may matter significantly.
Uhh, NTFS has ACLs just like *NIX... Though I agree with the rest. Ironic, the M$ solution the most portable, and all that...
I know tobacco is bad for you, so I smoke weed with crack.
...your kernel command line
Uhh, bash is integrated in the kernel? Interesting...~
How on earth is this flamebait? Leopard does exactly that for newly downloaded executables!
Sigh.