Yes, even a 486 box can have difficulty saturating a 10Mb/s Ethernet link, if it uses PIO. With DMA and busmastering you can do better, but still I'd question the ability of the OS and applications to keep up on a slower machine. (In practice, the older Ethernet cards are limited to about 3.5Mb/s due to a conservative implementation of checking whether the medium is free.)
I think classic Ethernet is about the oldest and slowest networking medium that's still widely supported, unless you want to use serial connections.
The idea is you first write a record to the journal - an area of disk set aside for this. If power fails while writing that, this record is lost, but the filesystem is not corrupted because the partially-written record can be cleanly ignored on next boot. And the write() system call or whatever won't return until the disk has finished writing to the journal (assuming ext3 with data=journal), so you know that if the write completed, it's definitely on the disk.
Later on something can come along and move data from the journal to another area of the disk, for efficiency when reading. This too can be done in such a way that no data is lost - if the power fails you can just reexamine the whole journal on the next reboot.
So if you lose power while the application is saying 'Busy saving...' you obviously won't get the file it was in the middle of writing. But once the app says 'Save complete' you know it really has gone to the disk. And no matter what happens, the filesystem won't be corrupted (short of bugs in the FS code or physical disk corruption, of course).
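To make the journal idea concrete, here is a rough sketch in Python. It is not how ext3 lays out its journal: the record format (length, CRC32, payload) and the names append_record and replay are made up for illustration, and the 'journal' is just a bytearray in memory. The point is only that a record cut short by a power failure fails its length or checksum test and can be cleanly ignored.

```python
import struct
import zlib

# Made-up record header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")

def append_record(journal: bytearray, payload: bytes) -> None:
    """Append one checksummed record to the in-memory 'journal'."""
    journal.extend(HEADER.pack(len(payload), zlib.crc32(payload)))
    journal.extend(payload)

def replay(journal: bytes) -> list:
    """Return every complete record; stop at the first torn or corrupt one."""
    records = []
    offset = 0
    while offset + HEADER.size <= len(journal):
        length, crc = HEADER.unpack_from(journal, offset)
        start = offset + HEADER.size
        payload = bytes(journal[start:start + length])
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # partially written record: ignore it and anything after it
        records.append(payload)
        offset = start + length
    return records

journal = bytearray()
append_record(journal, b"rename a -> b")
append_record(journal, b"extend file c by 4096 bytes")

# Simulate the power failing partway through writing the second record.
torn = journal[:-5]
```

Replaying the intact journal gives both records; replaying torn gives only the first, which is the 'this record is lost but nothing is corrupted' case.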
Actually I was thinking of data journalling as well (ext3 with data=journal); I agree that an IDE disk may buffer writes internally and so risk losing data if power goes down, but doesn't the OS compensate for this by issuing explicit flush instructions? A more high-end disk system will store just enough charge to write dirty buffers when power fails, or otherwise ensure that when the disk says it's written it's written.
Disk corruption due to uncommitted writes is kinda orthogonal to the filesystem used. Any filesystem can be corrupted if you write some updated blocks and leave out others. A simple filesystem like FAT might be better able to recover from corruption than a complex one, but still it is possible to corrupt a FAT filesystem with uncommitted writes. When MS-DOS 6 was introduced, many users reported disk corruption. This was not due to any bug in the OS but because smartdrv disk caching was being loaded and the users were switching off PCs before exiting applications to a command prompt (at which point smartdrv flushes writes to disk). Later MS-DOS 6.x versions turned off write-back caching in the default configuration.
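A toy model of the problem, in Python. This is not smartdrv's actual design; the class name WriteBackCache and the block contents are invented. It only shows how switching off between two related writes leaves the disk with one updated block and one stale one.

```python
class WriteBackCache:
    """Toy model: writes sit in RAM until flush() copies them to 'disk'."""

    def __init__(self, disk: dict):
        self.disk = disk    # block number -> contents that survive power-off
        self.dirty = {}     # block number -> contents not yet written out

    def write(self, block: int, data: str) -> None:
        self.dirty[block] = data    # returns at once, nothing is on disk yet

    def flush(self) -> None:
        self.disk.update(self.dirty)
        self.dirty.clear()

    def power_off(self) -> None:
        self.dirty.clear()  # whatever was only in RAM is gone

disk = {0: "FAT v1", 1: "dir v1"}
cache = WriteBackCache(disk)
cache.write(0, "FAT v2")
cache.flush()               # first half of the update reaches the disk
cache.write(1, "dir v2")
cache.power_off()           # second half never does
```

Afterwards disk holds the new allocation table next to the old directory entry, which is exactly the 'some updated blocks and not others' state, and it would look the same whatever filesystem sat on top.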
If 'just switching off' causes problems because of disk corruption, you could try ext3 with data=journal. This makes sure that _everything_ is committed to the journal as soon as it's written. Another way of looking at it is a 'no lies' property - if the operating system says to the application that the write completed then you know it did complete and isn't just sitting as dirty buffers waiting to be flushed. So you could switch off the machine immediately the 'save' command finishes. (This assumes your hard disk isn't doing its own slightly dishonest buffering, of course.)
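From the application's side, a sketch of what 'save' might look like if you want that guarantee without relying on mount options. As far as I know the kernel only promises the data is on stable storage once fsync() (or an O_SYNC write) has returned, whatever the journalling mode, so this sketch calls os.fsync() itself. The function name durable_save is made up, and the caveat about the disk's own buffering still applies.

```python
import os
import tempfile

def durable_save(path: str, data: bytes) -> None:
    """Write data to path, returning only after the OS was asked to flush it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()               # push Python's own buffer to the OS
            os.fsync(f.fileno())    # ask the OS to push its buffers to the disk
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "letter.txt")
    durable_save(target, b"Save complete")
    with open(target, "rb") as f:
        saved = f.read()
    leftovers = os.listdir(d)
```

Writing to a temporary file and renaming means a power cut mid-save leaves the old file intact rather than a half-written new one. A more careful version would also fsync the containing directory so the rename itself is on disk.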
If the corruption is caused by the applications leaving things in an inconsistent state if they don't get closed properly, then obviously no filesystem can do anything about that. Using a USB key for /home would stop the system being corrupted but still the user's files might be, and that's probably more serious.
This is considerably worse than necessary; it is possible to cut the bandwidth down to O(n/k+kI+S), where n,k are as above, I is the number of inserts/deletes, and S is the number of substitutions.
Got a source to back that up? (Eg a description of such an algorithm, source code that implements it, etc.)
Yes, I wasn't so much responding to the article (which is a reasonable comparison of disk technologies) as to the parent post which advocated varying filesystems, file sizes and so on to be more 'real world'.
Who are these 'zealots'? I've never come across one. Their main existence seems to be as straw men for lame Slashdot posts to argue against.
RMS may be a zealot but he doesn't match the template given in the article (eg, Microsoft is not the Great Satan) or many of the beliefs attributed to him in Slashdot postings ('all software has to be GPL not BSD-licensed').
Exactly - if extra RAM is a waste for your application, then SCSI would win the benchmark. So I don't see why it's unfair. For other applications IDE + tons-of-ram might beat SCSI for the same hardware cost, and that's a useful thing to know if you're deciding what hardware to buy.
If you increase the budget to $2000 it's by no means certain that IDE would lose out - it depends on the application. For example heavy updates and queries on a ten gigabyte database would always be faster if the whole database can be cached in RAM. (No matter how fast a disk, it will always be much slower than RAM.) And even for a terabyte database you can't be sure that SCSI would win; it depends on the access pattern.
Good point that too much RAM may hinder performance: I imagine to check for this you'd perform the benchmark with as much RAM as you can afford, and then again with half that amount (which would still come in within the budget).
My first step was to use a common compiler, so I chose GNU GCC's latest, 3.3.1. For Solaris, I used the 2.95 GCC build I obtained from Sun's own freeware site (http://wwws.sun.com/software/solaris/freeware/). It compiled without any errors, and took 22 minutes and 26 seconds to compile.
In the real world, you must also take into consideration cost. A fair test would be to take a budget of $500 and try two setups, one with IDE and one with SCSI, with any leftover cash spent buying as much RAM as possible. Then see which system performs better with a variety of benchmarks.
The biggest gap in the benchmarks (IMHO) is the gcc comparison - it's well known that gcc-3.2 is much slower than 2.95. He should have used the same gcc version on both OSes, since this was an OS benchmark not a gcc benchmark.
It would make sense to have a 'non-whoring' button so you can still post with accountability (and a starting score of +1 or +2) but not accumulate extra karma from being modded up. Not that I would ever use such a button myself.
Copying form letters is very bad - they don't get read. But if you're wondering about how to use LaTeX you might want to look at some stuff wot I wrote earlier today. (mp_letter.tex in that directory; the other .tex files are earlier stuff.) I don't want to hold it up as a great exemplar though... write your own words and opinions. (But follow the FFII guidelines!)
You don't have to represent the filesystem as XML on disk at all. You'd probably store it in some performance-optimized binary format as at present, and generate XML when it's wanted.
It would be insane to represent the filesystem on disk as a big XML document. It makes a lot of sense to let you view particular portions of the filesystem as XML, or to give it XML-like semantics (ordered elements, unordered attributes, elements can contain text mixed with other elements) and adapt existing query languages (XQL etc etc) to work with it.
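A small sketch of the 'generate XML when it's wanted' idea, in Python. The element and attribute names (dir, file, name, size) are my own invention, and the limited XPath subset in the standard xml.etree.ElementTree module stands in for a real query language like XQL. The on-disk format is untouched; the XML view exists only in memory, for the one subtree you asked about.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def tree_to_xml(path: str) -> ET.Element:
    """Build an XML element for one directory, generated on demand."""
    node = ET.Element("dir", name=os.path.basename(path) or path)
    for entry in sorted(os.scandir(path), key=lambda e: e.name):
        if entry.is_dir(follow_symlinks=False):
            node.append(tree_to_xml(entry.path))
        else:
            size = entry.stat(follow_symlinks=False).st_size
            ET.SubElement(node, "file", name=entry.name, size=str(size))
    return node

# Demo on a throwaway directory so the sketch doesn't depend on a real layout.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "docs"))
    with open(os.path.join(root, "docs", "mp_letter.tex"), "w") as f:
        f.write("\\documentclass{letter}\n")
    with open(os.path.join(root, "notes.txt"), "w") as f:
        f.write("hello")
    view = tree_to_xml(root)

# Query the view: every file anywhere below the root whose name ends in .tex.
tex_files = [f.get("name") for f in view.findall(".//file")
             if f.get("name").endswith(".tex")]
```

A real implementation would presumably build nodes lazily as a query walks into them rather than reading a whole subtree up front.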
You may be right that OpenOffice can't read all Office documents, but given that Word on the Mac doesn't support Hebrew at all and thus cannot read _any_ Hebrew Office documents, OpenOffice has the advantage.
Perhaps this is the best argument for charging for bandwidth usage, or at least the most acceptable to Slashdotters. It gives a financial incentive to people to clean up their systems and stop being easy prey to worms and viruses, and makes them pay for the damage they cause (whether deliberately or just through carelessness and using insecure software).
What makes you so sure that 'Stroustrup refuses to fix X'? C++ has been through a lot of changes since its beginning, most of them initiated by Stroustrup. What makes you think that he'd want to fix the language in stone now when he's shown no sign of doing so in the past?
More likely is that the standards committee is waiting for compilers to catch up to the _current_ standard before making new ones :-(.
'Read what he's written about the C++200x standards revision cycle.' OK, fair enough, you do have evidence. Got a link?
'Meanwhile, C++ is being abandoned for...' - hmm, I can't help feeling I have heard this on Slashdot before. I don't think C++ will be Dying any time soon. Some places (I speak from experience) are migrating away from Java towards C++.
If ECC memory isn't there and an error occurs, the system reboots.
How is the system to know whether an error has occurred? If the value at some byte address should be 202 but it comes out 201, how to determine that the result is wrong? For that you need parity memory, which is able to detect (though not recover from) single bit errors. As you say, the detection is called a parity error - but this requires memory with parity information (usually one extra bit for every eight).
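For the record, here is the parity check worked through in Python with one even-parity bit per byte (the function names are mine). It also shows the limit of parity: as it happens, 202 and 201 differ in two bits, so that particular corruption would slip straight past a single parity bit, whereas 202 turning into 203 is a one-bit flip and gets caught.

```python
def parity_bit(byte: int) -> int:
    """Even parity: the extra bit that makes the total count of 1s even."""
    return bin(byte & 0xFF).count("1") % 2

def looks_ok(byte: int, stored_parity: int) -> bool:
    """True if the byte still agrees with the parity bit stored beside it."""
    return parity_bit(byte) == stored_parity

stored = parity_bit(202)        # 202 is 0b11001010: four 1s, parity bit 0
one_flip = 202 ^ 0b00000001     # 203: a single bit flipped
two_flips = 202 ^ 0b00000011    # 201: two bits flipped

caught = not looks_ok(one_flip, stored)    # parity notices the single flip
missed = looks_ok(two_flips, stored)       # but not the double flip
```

Without that ninth bit there is nothing to compare against at all, which is the point: plain memory has no way to know 201 is wrong.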
I believe that common ECC memory is able to recover from single bit errors and detect two bit errors.
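Here is the textbook version of that 'correct one, detect two' behaviour: an extended Hamming code protecting 4 data bits with 4 check bits. Real ECC modules use a wider code (72 bits stored for every 64 of data, as far as I know) and I'm not claiming this is the code any particular memory controller uses; it is only the smallest example I know that shows the property.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits as 8 bits: Hamming(7,4) plus an overall parity bit."""
    code = [0] * 8                      # index 0 is the overall parity bit
    code[3], code[5], code[6], code[7] = [(nibble >> i) & 1 for i in range(4)]
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2
    return code

def decode(code: list) -> tuple:
    """Return (nibble, status); nibble is None if the word is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions holding a 1
    overall_ok = sum(code) % 2 == 0
    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        code[syndrome] ^= 1             # one flip: the syndrome names the bit
        status = "corrected"
    else:
        return None, "double-bit error detected"
    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return nibble, status

word = encode(0b1011)
single = list(word)
single[5] ^= 1                          # one bit flipped in storage
double = list(word)
double[5] ^= 1
double[6] ^= 1                          # two bits flipped in storage
```

decode(single) recovers 0b1011 and reports a correction; decode(double) refuses to guess and reports the error, which is where the system would halt or reboot.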
Or does the Mac G5 use parity memory? That seems hard to believe: hardly anyone manufactures parity memory these days; it's either cheapo no-checking sticks or full ECC.
It seems that there is some duplication of work. One group of people is working (with some success) to replace the older X11 bitmap fonts with outline fonts drawn at the right resolution, anti-aliased, and (modulo legal obstacles) hinted. Another group is replacing pixmap icons with SVG files rendered at the right resolution and again probably anti-aliased. Couldn't there be a single piece of code to do both? Why not use SVG to describe glyphs?
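To show what I mean, a tiny Python sketch that takes one outline and emits it as SVG at two sizes, once as a 'glyph' and once as an 'icon'. The outline is a crude capital L that I made up on a 1000-unit grid, not taken from any real font, and render_outline is a hypothetical helper. It says nothing about hinting, which is the hard part at small sizes. (If I remember right, SVG 1.1 has font and glyph elements of its own; I haven't used them here.)

```python
import xml.etree.ElementTree as ET

def render_outline(path_d: str, units: int, pixels: int) -> str:
    """Wrap one outline, drawn on a units x units grid, as an SVG image."""
    svg = ET.Element("svg", xmlns="http://www.w3.org/2000/svg",
                     width=str(pixels), height=str(pixels),
                     viewBox=f"0 0 {units} {units}")
    ET.SubElement(svg, "path", d=path_d, fill="black")
    return ET.tostring(svg, encoding="unicode")

# Invented outline: a blocky 'L' on a 1000-unit design grid.
GLYPH_L = "M200 100 H350 V750 H800 V900 H200 Z"

as_text = render_outline(GLYPH_L, 1000, 12)   # 12 px glyph in running text
as_icon = render_outline(GLYPH_L, 1000, 48)   # the same outline as a 48 px icon
```

The path data is identical in both outputs; only width and height change, and the renderer does the scaling and anti-aliasing.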
Are you saying the Mac uses parity memory? I assumed it did not (as mentioned above).
Well they were going to call the new service 'SCO', but...
Given your Slashdot username, it's likely you don't see any problem with such identifiers ;-).
TextOut (hdc, 100, 100, "Hello world!");
EndPaint (hWnd, hdc);
Feh. If I wanted to write that kind of JanglyCaps'd verbiage I'd just use Windows. If you are making a pretty and tasteful GUI, why spoil it by making the code ugly?