If you just want dotfiles, you can use a floppy for that. Any real 'work' I do is version controlled with CVS, so I'd just need to get a checkout to ramdisk or other temporary storage.
I'm still not saying it would be entirely practical, just that it would be so nice if it were.
First, I'd hope that Knoppix doesn't boot with any services enabled, at least none that have externally open ports. So probably there won't be too many remote exploits. Second, if these do happen then just get a new DVD - they're cheap enough to burn, and changing the disc over is probably rather easier than updating packages, especially for a non-technical user. Third, even if the machine is rooted the potential for installing rootkits on a read-only medium is somewhat limited :-).
Seriously, you could subscribe to a Knoppix service where they mail you a new DVD once a month or so, with more frequent deliveries in the unlikely event of a serious security hole.
If there were a DVD version of Knoppix with *every* free program you could possibly want to use installed - essentially Debian testing on a DVD - then maybe you could do without ordinary Linux distributions altogether. I'd certainly consider it, if I had a PC that was left on 24x7 and important things like mail and CVS on a central server.
If you have a Windows box, this is an important step forward in the quest to Run Everything Under Cygwin. You can try out your existing apps to see if they work under Wine. If eventually you manage to get all your applications working on top of Cygwin (including more or fewer of them through Wine), then you can yank away the bottom two layers and switch to a Unixlike OS.
Re:Not to be nitpicking... on Science Askew · Score: 2
I still remember the Bagpuss episode where the frog tries to woo the princess after retrieving her silver ball, but she's not too keen on the idea. Then at the end he jumps up and kisses her, she turns into a frog and they live happily ever after. At least that's how I think it went.
What pisses me off about tar is that it appears to contain large chunks of zeroes to round everything off to a number of 'blocks'. Which, if they ever were relevant for tape devices, are surely not needed now. For example a tar file containing only a single empty file takes 10240 bytes - a large and suspicious number. By contrast a zipfile containing that empty file is a more reasonable 142 bytes.
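The padding is easy to reproduce. Here is a minimal sketch using Python's standard tarfile and zipfile modules. The member name `empty` is only an illustration, and the zip size depends on the length of that name, so it will not match the 142 bytes quoted above exactly.

```python
import io
import tarfile
import zipfile


def tar_size_of_empty_file(name="empty"):
    """Size in bytes of an uncompressed tar archive holding one empty file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        # A TarInfo has size 0 by default, so no file data is needed.
        tar.addfile(tarfile.TarInfo(name))
    return len(buf.getvalue())


def zip_size_of_empty_file(name="empty"):
    """Size in bytes of a zipfile holding the same empty file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr(name, b"")
    return len(buf.getvalue())


print("tar:", tar_size_of_empty_file(), "bytes")
print("zip:", zip_size_of_empty_file(), "bytes")
```

The tar figure breaks down as one 512-byte header block plus two 512-byte end-of-archive blocks of zeroes, all padded out to a 'record' of twenty blocks, which is where 10240 comes from.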
'tar xvf' is fine; 'tar zxvf' is not because it has to read the whole archive sequentially and decompress it. (At least I think it does; in principle it could do a little better because gzip compression works in 32Kbyte blocks AFAIK.)
Hmm. 'Do one thing and do it well' might be a better strategy. There are existing very capable encryption and signing programs you can use on individual files or the whole zipfile; there are plenty of existing version management tools. Let the archiver just archive files.
With MS Office, try setting a password on your document. It gets compressed before being encrypted, so this is the easiest way to save disk space provided you can remember the password. At least, this was the case with the last versions of Word I used.
In the days when I used pkzip, I first bundled up the files into an uncompressed zipfile with -e0, and then compressed that. This gives you a few percent over compressing the files straight into a zipfile, when they are compressed individually. You lose the ability to extract individual files but who needs that anyway?
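As a rough illustration of why the single lump compresses better, here is a sketch using Python's zlib (the same deflate method zipfiles use). The file contents are made up for the example and are far more repetitive than real files, so the gain shown is much larger than the few percent I saw with pkzip. It also ignores the per-file headers a real zipfile adds.

```python
import zlib


def make_files(count=50):
    """Hypothetical stand-ins for a directory of small, similar files."""
    files = []
    for i in range(count):
        text = ("int handler_%d(void) { return %d; }\n" % (i, i)) * 5
        files.append(text.encode("ascii"))
    return files


def compressed_individually(files):
    """Total size when each file is compressed on its own, as in a normal zipfile."""
    return sum(len(zlib.compress(data, 9)) for data in files)


def compressed_as_one_lump(files):
    """Size when the files are bundled first and the bundle is compressed once."""
    return len(zlib.compress(b"".join(files), 9))


files = make_files()
print("individually:", compressed_individually(files), "bytes")
print("as one lump: ", compressed_as_one_lump(files), "bytes")
```

The lump wins because the compressor can refer back to text it saw in earlier files, instead of starting from scratch on each one.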
IMHO, since 99% of the time all you do with archives is create them or extract them, it's not worth implementing features like 'add to archive', 'delete from archive' or 'update archive'. Maybe those made sense with SEA ARC on CP/M when disk space was scarce and CPUs slow, but not now. You might as well take advantage of the simplicity and better compression that comes from treating the archive as a single lump.
Therefore the Unix model of tar and then a separate compression program makes more sense - even though tar is such a crusty and wasteful format. The only reason to use zipfiles still is compatibility.
(Although maybe someone will prove me wrong and say 'I update existing zipfiles every day, it's an essential feature, what I do is...'.)
The important part isn't the number of FLOPS (to get those you can just keep buying more PCs until you reach the desired number) but the performance in applications which are not 'embarrassingly parallel'. In other words, how good is the interconnect between machines? The article talks about a new network to replace Gigabit Ethernet.
You can work around faulty memory in software (like the BadRAM project) but it ought to be possible in hardware too. You could make a stick that is 128 gigabits + N, where N is some number of spare bits greater than the number of faults there will be, and have some hardware remapping. This might be slow if you want to do it with only a few extra transistors (I don't know), but for slow memory it might work.
This would be different to having a ROM with a defect list which is read by software: to the machine the memory would appear perfect. But I think that doing it in software is the technically better solution.
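To make the remapping idea concrete, here is a toy software model of it. This is only a sketch: the class name, the cell-level granularity and the lookup table are all my own assumptions, and real hardware would do the lookup in logic, not in a dictionary.

```python
class RemappedMemory:
    """Toy model: `size` visible cells backed by `size + spares` physical cells."""

    def __init__(self, size, spares, faulty):
        faulty = sorted(set(faulty))
        if len(faulty) > spares:
            raise ValueError("more faulty cells than spares")
        if any(not 0 <= bad < size for bad in faulty):
            raise ValueError("faulty address outside the visible range")
        self.size = size
        self.cells = [0] * (size + spares)
        # Remap table: each faulty address is redirected to a spare cell
        # that lives beyond the visible address range.
        self.remap = {bad: size + i for i, bad in enumerate(faulty)}

    def _physical(self, addr):
        if not 0 <= addr < self.size:
            raise IndexError("address out of range")
        return self.remap.get(addr, addr)

    def read(self, addr):
        return self.cells[self._physical(addr)]

    def write(self, addr, value):
        self.cells[self._physical(addr)] = value


mem = RemappedMemory(size=8, spares=2, faulty=[3])
mem.write(3, 42)       # lands in spare cell 8, not the faulty cell 3
print(mem.read(3))
```

From the outside the memory looks perfect: the caller only ever sees addresses 0 to size - 1 and never learns which of them are redirected.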
I don't have a permanent net connection so it's a bit awkward to use the validation web page; in any case, I'd have to write code to provide a command-line interface to it (I want the validation to run whenever I hit a keystroke in Emacs, ideally - and then hit Enter to jump to the offending line).
The link you posted to explains how to set up a RAM disk using the extra memory; that has been possible with Linux for a long time. But it doesn't really do what I want.
The lower, fast memory is only eight megabytes on the two particular systems I'm thinking of. If I used a process whose working set was greater than eight megs (or allowing for the kernel and X server, six megs) it would 'thrash' constantly to and from the RAM disk. Of course this is not nearly as bad as thrashing to physical disk, but it could be a lot worse than having a full 40 megs of RAM to use, even if most of that is slower.
The extra memory I have is real memory, you can address it just the same as RAM on the motherboard. If there is a need to, the kernel should be able to map processes into this slower RAM and run them from there, or run a process partly in faster and partly in slower RAM. Also, it should be able to choose between using the slower RAM for running processes, for 'swapping' or for disk cache, whereas if you set it up as a RAM disk it cannot be used as a disk cache. I would like the kernel to use the whole memory space but just *prefer* the lower eight megs, and maybe the memory manager could occasionally migrate pages between slow and fast memory depending on how recently they have been used.
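Roughly the policy I have in mind, as a toy simulation and not anything resembling real kernel code. The class and its names are hypothetical, and for simplicity it migrates a page into fast memory on every access instead of only occasionally.

```python
from collections import OrderedDict


class TieredMemory:
    """Toy model of a page placement policy that prefers a small fast region."""

    def __init__(self, fast_frames):
        self.fast_frames = fast_frames
        self.fast = OrderedDict()  # pages in fast RAM, least recently used first
        self.slow = set()          # pages currently held in the slower RAM

    def touch(self, page):
        """Record an access to `page`, migrating it to fast RAM if needed."""
        if page in self.fast:
            self.fast.move_to_end(page)  # now the most recently used
            return
        self.slow.discard(page)
        if len(self.fast) >= self.fast_frames:
            # Fast RAM is full: demote the least recently used page to slow
            # RAM. It stays directly addressable, it is just slower to reach.
            victim, _ = self.fast.popitem(last=False)
            self.slow.add(victim)
        self.fast[page] = None


mem = TieredMemory(fast_frames=2)
for page in [1, 2, 3, 1]:
    mem.touch(page)
print("fast:", list(mem.fast), "slow:", sorted(mem.slow))
```

The difference from a RAM disk is that a demoted page is never copied back before it can be used; it simply runs from the slower memory until the policy decides it is worth moving.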
In fact the machine has three memory speeds: the lower 8Mbyte, between 8Mbyte and 32Mbyte which is across the MCA bus but still cached by the processor's L1 cache (and I think by the L2 on the motherboard); and above 32Mbyte which is uncached. A simple RAM disk arrangement may be better than nothing but it is not ideal.
Does anyone know how I can tell Linux that some of the RAM is faster and some slower, while still having enough in total to run bloated apps like Emacs and Netscape 4.x without swapping?
At university I found you could usually tell how good a lecturer would be by the material used for slides. Those using LaTeX and its slides package usually had the most interesting courses (if more difficult); those with wordprocessors in the middle; PowerPoint usually meant fairly fluffy. There were exceptions and it wasn't a perfect correlation, but it was certainly a factor in choosing what course to take.
It's probably not what you want but FleXML is a very fast way of parsing XML that conforms to a particular DTD. It's like lex and yacc - is that C++-like enough?
I completely agree about all the weird reinvent-the-wheel stuff that DOM and similar libraries contain: it would be so much better if they could use the STL in C++ and native data structures in other languages (nested lists in Lisp, etc etc). It's just that a basic function call interface is the lowest common denominator, so if you want the same library on every language you have to invent a whole new list and tree API. Perhaps this is an indication that the same library on every different language isn't such a good idea. (Think of the Mozilla debate: 'the same on every platform' versus 'native on every platform'. I have a feeling that in programming languages as well as GUIs the second choice is better.)
Does anyone ever bother checking that their HTML is compliant? By which I mean validating it against the DTD. This ought to be an elementary step in HTML writing - just like compiling a C program is a first step towards checking it works - but it seems so difficult to set up that hardly anyone does it.
Most Linux systems nowadays include nsgmls, but that command has so many obscure options and SGML prologues are hard to understand. There needs to be a single command 'html_validate' which runs nsgmls with all the necessary command-line options (and obscure environment variables, and DTD files stored in the right place) to validate an HTML document. If that existed then I'd run it every time before saving my document and I'm sure many others would too. But at the moment setting up nsgmls to do HTML validation (at least on Linux-Mandrake) is horribly complex. (Especially if you want to validate XML as well; you need to set environment variables differently between the two uses.)
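Something like the following is what I mean, sketched in Python. The two file paths are assumptions and will differ between distributions; the `-s` (suppress output), `-c` (catalog) and `-wxml` options and the two `SP_` environment variables are the ones nsgmls uses, but treat the exact combination as a starting point, not a tested recipe.

```python
import os
import subprocess

# Assumed locations: adjust for wherever your distribution puts these files.
HTML_CATALOG = "/usr/share/sgml/html/catalog"
XML_DECLARATION = "/usr/share/sgml/xml.dcl"


def build_command(path, xml=False):
    """Return the nsgmls command line and environment for validating `path`."""
    env = dict(os.environ)
    if xml:
        # XML needs different environment settings from HTML.
        env["SP_CHARSET_FIXED"] = "YES"
        env["SP_ENCODING"] = "XML"
        cmd = ["nsgmls", "-s", "-wxml", XML_DECLARATION, path]
    else:
        cmd = ["nsgmls", "-s", "-c", HTML_CATALOG, path]
    return cmd, env


def html_validate(path, xml=False):
    """Run nsgmls on `path`; a return value of 0 means no errors were reported."""
    cmd, env = build_command(path, xml)
    return subprocess.run(cmd, env=env).returncode
```

Calling `html_validate("index.html")` would then be the whole job, with the options, catalog and environment variables hidden away in one place.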
It's a pity that some of the research into making faster, physically smaller RAM modules hasn't gone into making slower, cheaper ones. ('Fast, small, cheap' - pick any two.) I mean producing massive 128 gigabit memory sticks but running at a slow speed of 160ns, say. That could be used as swap space and disk cache, so that main memory becomes a kind of L3 or L4 cache in effect.
I wonder what the yield is on current RAM chips, and whether the faulty ones would work reliably if clocked at a much lower speed? Probably the current yield is fairly high, and the answer to the second question is no, so we can't expect ultra-cheap supplies of slow RAM to hit the market.
On the first-generation IBM PS/2s, the amount of RAM on the motherboard (or in IBM-speak 'planar') was limited, with more added by plugging cards into the MCA bus. I have a Model 80 which has only eight megabytes on the motherboard but another 32 on a Kingston MCA card. Back then, RAM speeds were a lot slower and the new bus was fast - memory on the expansion card is only about twice as slow as that on the motherboard. (I haven't yet found a way of persuading Linux of this fact; I would like the kernel to prefer the lower eight megs.)
There was even a feature called 'matched memory cycles' in the very early machines where the MCA bus would be temporarily underclocked when accessing memory so that it could work synchronously (cutting some wait states). But then the increasing speed of RAM and the fairly constant bus speed (MCA was 32 bits wide at 10MHz, standard PCI not that much better at 33MHz, while RAM access times have gone down hugely from 85ns to goodness knows what) made the idea look silly, and IBM abandoned MCA-bus memory cards for its second-generation models in 1992 or so. Nowadays you could never get away with using something as slow as the PCI bus for 'memory', so it has to be marketed as 'RAM disk'.
I suppose a good use of this is it may support much more RAM than you can get on the motherboard. You might have six PCI slots - filling each one of those with a RAM drive gets say 12 gigabytes of extra RAM or at least extremely fast swap space. With four DIMM sockets (which most motherboards don't have AFAIK) it would be hard to get more than 4 gigabytes on the motherboard.
OTOH, if you have such large memory requirements you'd probably be using some serious 64-bit hardware and not Intel-based toys.
Or computer monitors shared between several users, some of whom seemingly have a tendency to fry eggs on the screen.
At least they won't run out of toilet paper. Mmm, lay-flat binding...
Step 1: Imagine a Beowulf cluster of these.
Step 2: ???
Step 3: Profit!
I don't know who chose the domain 'kids' but goat-related domain names do not always have a spotless record...
I'm fed up with these xenophobic jokes about those crazy Poles and how they are always 'about to flip'.