If you use Emacs then you can turn on some wacky option to keep numbered backups and then you really do have every saved version hanging around. But it's not the same as having it for _all_ applications.
I think you could probably hack libc to give numbered backups without that much difficulty, but there isn't really the demand for it when explicit version control systems are available.
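For what it's worth, here is a rough sketch of what "numbered backups on every save" could look like if an application did it itself, rather than libc. The function name is made up for illustration, and the `file.~N~` naming just copies the Emacs convention; it is not how a libc hack would actually be wired in.

```python
import os
import re
import shutil


def save_with_numbered_backup(path, data):
    """Write data to path, first copying any existing file to path.~N~.

    Hypothetical helper: the name and behaviour are assumptions for
    illustration, loosely modelled on Emacs numbered backups.
    """
    if os.path.exists(path):
        directory = os.path.dirname(path) or "."
        base = os.path.basename(path)
        pattern = re.compile(re.escape(base) + r"\.~(\d+)~")
        # Find the highest backup number already used for this file.
        numbers = [
            int(m.group(1))
            for name in os.listdir(directory)
            if (m := pattern.fullmatch(name))
        ]
        backup = f"{path}.~{max(numbers, default=0) + 1}~"
        # copy2 keeps the old timestamps on the backup copy.
        shutil.copy2(path, backup)
    with open(path, "w") as f:
        f.write(data)
```

Saving the same file three times leaves the latest text in place plus `.~1~` and `.~2~` copies of the earlier versions. Nothing here prunes old versions, which is where the disk space argument comes in.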
But you are perfectly able to stick your files in RCS, CVS or any other version control system. CVS has a steep learning curve at first, but once you know the half-dozen useful commands it's very easy (especially with the Emacs vc keybindings).
I agree it's not quite the same as getting past revisions _every_ time you save the file, and you do have the effort of typing a log message when you commit a new version of a file, but it's no trouble once you get used to it. I have a directory ~/cvsroot for my personal stuff and keep almost everything important in CVS nowadays.
People always talk about how Windows NT is a descendant of VMS, and that may be true internally, but from the user's and programmer's point of view, Windows seems more like a perverse Unix variant with slashes the wrong way round. It hasn't adopted the weird VMS pathnames or shell language or file versioning that are the most obvious things a user sees when interacting with the OS. And the Windows API owes more to Win16 and strange things that happened back in the mists of time with OS/2 and Windows 1.0 than to any user interface DEC provided with VMS.
I think Mac OS X is a lot closer to Unix than Windows is to VMS.
Are you arguing that users should not have disk quotas? That if there are several users with home directories on a disk, each user should be allowed to take as much as he wants until the disk fills up? (Including buggy programs which generate huge files of log messages, and so on.)
Apparently Dave Cutler wanted to put the file versioning in NT as well, but Microsoft thought that Windows users wouldn't understand it. From a response he sent to the (now 404) 'Dave Cutler Fan Club':
Versioning in the VMS file system was a great feature and one that I would have liked to brought forward into NT. However, it was so hard to sell a new file system at all and multiple versions of the same file, although managable by programmers, might not have been so manageable by PC Users.
100 megs of hard disk space may cost ten cents, but when it has to be backed up and shared across the network (and you have a large number of users all wanting 100 megs) the costs increase. Besides, I am pretty sure that two megs back in the VMS days cost rather more in hardware than a hundred does now.
Steady on - C++ is a high level language, at least by the standards people always used to use, and it does have a good range of bounds-checked containers (safe arrays and the like). It does allow you to do low-level stuff if you need to and you know what you're doing; the trouble is that too many people who don't really need the low-level memory access still don't use safe containers for spurious performance reasons. Yes, moving everyone to Lisp or ML or Perl or even Java would solve a lot of memory scribbling problems, but so would making sure everyone uses safe string libraries rather than char*s and safe container libraries rather than raw arrays.
Also you can use tools like Splint and CCured to check at compile time and at run time for unsafe memory accesses. As with so much else in C and C++, you can write safe and correct memory access, and you can shoot yourself in the foot. You just have to take care and do the right thing, or if you don't want to worry about being careful then switch to a different language. But switching is not the only way to solve the problem.
The advantage in peeling might not be the CPU time so much as making things easier to download. If you are streaming music and there is a temporary network outage, a peeled format can continue but at a lower quality until it catches up. OTOH, the peeled format would need to download slightly more before it could start playing at all, since it is no longer the case that the first 10% of the piece is in the first 10% of the file. And the initial few seconds of music would be of noticeably lower quality, unless they were given special treatment. But still it might make a more robust system for distributing music files.
Imagine - keep downloading until you can no longer hear any distortions in the sound, then stop the download. People with less sensitive ears would no longer need to pay the same high bandwidth bills as 'audiophiles'. And you wouldn't need to choose a bit rate before starting your download.
Re: XML is NOT just text! (on 'XML and Perl')
XML is text. XML is not just text.
The point is that the document conforms to a certain structure: either rigidly (as when validating against a DTD or similar schema definition), loosely (as with well-formed XML, where elements must be closed correctly, but you can mix any elements and attributes you want), or something in between.
It's not obvious at all that Perl is a natural fit for processing XML. The things which Perl does so well - line-by-line file processing, string operations, regular expressions - are not very useful on XML. (For example you cannot match a balanced tree structure with a regular expression, so you can't use the standard string processing to do something so simple as extract an element and its contents.) Indeed they may lead you in a false direction at first. For quick throwaway tools, where the file is already pretty-printed in a certain way, Perl string operations may do the trick; for building applications that need to handle XML they are inadequate.
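To make the nesting problem concrete, here is a small sketch. It is in Python rather than Perl purely for illustration, and the sample document is invented; the same thing happens with any plain (non-recursive) regular expression.

```python
import re
import xml.etree.ElementTree as ET

# Invented sample: an <item> nested inside another <item>.
doc = "<list><item>outer <item>inner</item> tail</item></list>"

# Naive string processing: a non-greedy regex stops at the first closing
# tag it sees, so the outer element is cut short.
naive = re.search(r"<item>.*?</item>", doc).group(0)

# A parser tracks nesting and hands back the whole outer element.
root = ET.fromstring(doc)
first_item = root.find("item")
parsed = ET.tostring(first_item, encoding="unicode")

print(naive)
print(parsed)
```

Making the regex greedy does not rescue it either: it then swallows everything up to the last closing tag in the file, which is wrong as soon as there are two sibling items.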
To read and write XML you will need libraries, and that is the case in any language. Perl has a good selection including the standard-API-but-very-slow XML::DOM, the nonstandard-API-but-useful XML::Twig, and the I-used-to-use-it-but-IMHO-it-is-best-avoided XML::Simple. But using these libraries isn't particularly easier from Perl than from any other language.
The ideal XML processing language would have a type system which could check at compile time whether the output you are generating will be valid for the DTD you have chosen; and it would also map the XML's DTD or schema onto the language's type system at input. For example, no need to get the list of child elements and get the first element from it, if the DTD specifies that there must be exactly one child.
I can't wait for this technology to become sufficiently miniaturized that you can have it fitted internally, and just excrete pure white snow directly. It would certainly make snowball fights more interesting.
I thought progressive JPEGs were the kind where it loads from top to bottom, so if you download half the file you get the top half of the image. But I think I've also seen a kind where it does a first pass in really fuzzy detail, then later passes improve the image quality. Are these two the same thing (or am I just imagining things)?
I think it is a reference to the tutorial/textbook 'Oh, Pascal!', which at the time it was published was noted for referring to the programmer as 'she'. I don't know if that choice fits in with the book's rather girly-sounding title - the sequel was called 'Oh, my! Modula-2'. (The next logical step, 'Oh bugger, Oberon' has yet to be published.)
So perhaps the author is a Pascal sympathizer. Get the pitchforks!
'single out the things you need everyday' - yeah, right. Nobody _needs_ a colour screen, few people need a digital camera or GPS or Linux-on-handheld. People buy these things because they are kewl. Those whose kewlness threshold has already been met will have bought a PDA already. Other people want a wider array of gadgets, so they will be later adopters.
There was a Slashdot article a while back mentioning Ogg Vorbis 'peeling'. Like interlaced GIFs (or those weird blocky JPEGs whose correct name I don't know), the first part of the file is a low-quality version and then downloading more bits gives progressively higher quality.
I wonder if FLAC could be adapted to do this too, so you could 'head --bytes 1000000' to get a lossy version. Okay, maybe not quite such good quality as an Ogg Vorbis file of the same size, but it might be good enough.
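As a toy illustration of the general idea (emphatically not how Vorbis peeling or FLAC actually lay out their data), you can store samples one bit plane at a time, most significant bits first. Then any prefix of the file decodes to a coarser version of the same signal. The function names are made up, and the encoding wastes a whole byte per bit to keep the sketch short.

```python
def encode_progressive(samples):
    """Store 8-bit samples as bit planes, most significant plane first.

    Toy layout: one byte per bit, so the output is 8x the input size.
    """
    planes = []
    for bit in range(7, -1, -1):
        planes.append(bytes((s >> bit) & 1 for s in samples))
    return b"".join(planes)


def decode_prefix(data, n_samples):
    """Rebuild samples from however many complete planes the prefix holds."""
    n_planes = min(len(data) // n_samples, 8)
    samples = [0] * n_samples
    for p in range(n_planes):
        bit = 7 - p
        plane = data[p * n_samples:(p + 1) * n_samples]
        for i, b in enumerate(plane):
            samples[i] |= b << bit
    return samples


samples = [0, 37, 128, 200, 255]
encoded = encode_progressive(samples)
coarse = decode_prefix(encoded[:3 * len(samples)], len(samples))
exact = decode_prefix(encoded, len(samples))
```

Here `coarse` keeps only the top three bits of each sample, while `exact` recovers the original list. Truncating the file is then the whole "transcoding" step, which is the attraction.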
Yes, do your design before your code, but you will _still_ need to refactor, partly because requirements change, but also because after doing some implementation you may realize the original design was not quite right.
That much I think is not contentious. Very few people (even those with experience) can pick the ideal design ahead of doing any implementation or predict what the changes in requirements will be.
More controversial, perhaps, is the XP idea that your initial design should not be any more general than it needs to be to implement the functional requirements of your first code drop. There is some merit in this, since the requirements _will_ change and unused generality is often a waste of coding effort (not to mention creating extra complexity which may not be tested enough), but still I feel you have to use common sense and often design for extensibility at the start, even if you are not 100% certain the extra flexibility will be needed. You might be 50% certain and that is often enough.
But I do feel that the XP approach fits in with my personality. If there is no bus approaching the bus stop, I would rather walk to the next stop than wait for a bus to come along, because at least I am making some small progress and this journey strategy minimizes risk, even if the mean journey time would be shorter by waiting around at the bus stop.
'If the code is well written, there's few need for writing tests (except for sophisticated algorithms)'
Well, duh. If it is well written it won't have bugs, by definition. But how do you find out whether the code is well written, except by thoroughly testing it?
Actually I think perl5 was a rewrite and perl4 was thrown away. So they are throwing away two.
But I think Brooks was talking about implementing the same specification, while perl5 is a much more powerful and grown-up language than perl4, and perl6 vs perl5 may turn out to be the same. So they are throwing away (or heavily modifying) the specification too, which is different.
The other day I hacked together a script, similarity, which uses gzip compression to work out how similar two files are. I find this useful when searching for almost-duplicate files.
The article mentioned a sysadmin who bought Dell hardware but immediately wiped off the installed Linux and put Debian on there. The important part of buying Linux hardware is not the preinstalled OS (after all, there is no licence to worry about) but the fact that, because it ships with Linux, you know that all the hardware is supported.
Therefore if Dell sold Linux laptops with Red Hat on them, plenty of people would buy them and immediately install Mandrake. They wouldn't be as happy as if Mandrake were preinstalled, but it's a whole lot better than buying a laptop full of cheesy Winhardware. Also, don't forget you wouldn't have to pay for a copy of Windows you don't use (unless the vendor has restrictive agreements with Microsoft).
Check out Splint (formerly LCLint). Whereas traditional lint is less needed now that compilers have -W switches, splint has a whole bunch of extra stuff which gcc won't warn about. The only trouble with it is that by default, it wants you to add annotations to your program to help it be checked (for example if a function parameter is a pointer, you can annotate whether it is allowed to be null). If you go along with that then splint can give lots of help in finding places where null pointer dereferences could happen, and other bugs. But even if you don't want to annotate and you use the less strict checking it's still a handy tool. (OK, maybe C99 has some of this stuff too, but splint has more.)
I think it is the software version of 'zero tolerance'. Get rid of beggars and squeegee merchants and you make the more serious crimes (bugs) easier to detect and solve. Or something like that.
When I said 'seriously argue' I meant, like, reasoned debate by people who know what they are talking about. Which is not to say that the peer-review advocates don't have their own personal interests too, even Whit Diffie, but in general they are trying to discuss what makes for better security, rather than just blowing FUD.
Or in other words: an 'unnamed MS source' quoted on ZDnet does not qualify as serious argument.
(I mentioned Slashdot trolls because there is a possibility that some of them sincerely believe that binary-only software is more secure.)
This brings to mind the most annoying thing you can say to a doctor:
Physician, heal thyself!
But I imagine their home brew DB is brewed to a slightly higher strength than the American server?
I think it's kind of the opposite of cromulent.