I strongly disagree. I should not have to give up my constitutional right to protection from search and seizure in order to exercise my constitutional right to petition the government for redress of grievances. This world will always have a wacko or two like McVeigh, and it will always be necessary to catch them after the fact. I can live with that.
It is the responsibility of a free democratic government to not create entire regions of the world and peoples that want to kill us. Our government has completely failed by its complicity in creating and supporting oppressive regimes in the Middle East. Now it's trying to cover its ass by removing our freedoms, and I won't stand for it.
The argument that "we must search you because you might be carrying a bomb" is true for any person, anywhere, any time. It was also true at the time the Constitution was written, yet the forefathers had the wisdom to write the Fourth Amendment.
The terrorists have already won. We have a fight ahead of us. The fight to turn the United States from a police state back into the Home of the Free. But it is a good fight, one we must take up. We must fight both for our own freedom, and for the freedom of those we have oppressed for so long in the name of secure access to oil.
Do you have any pointers as to the computational difficulty of parsing XML? Someone else provided a pointer to expat, and while it has impressive benchmarks, a single timing is really meaningless. That it can parse a small XML file in 0.01s is not impressive if the time grows like 0.01*N^3 for a file of N bytes.
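One way to answer the scaling question without relying on a single benchmark is to time the parser on documents of doubling size and see how the time grows. The following is only a rough sketch: it assumes Python's xml.etree.ElementTree (which sits on top of expat), and the document layout and the helper names make_document and time_parse are made up for illustration. It measures parsing only, not diffing.

```python
import time
import xml.etree.ElementTree as ET


def make_document(n_items):
    """Build a synthetic XML document with n_items <bookmark> elements."""
    items = "".join(
        '<bookmark href="http://example.org/%d"><title>Item %d</title></bookmark>' % (i, i)
        for i in range(n_items)
    )
    return "<xbel>" + items + "</xbel>"


def time_parse(n_items):
    """Return (document size in bytes, seconds spent parsing it)."""
    doc = make_document(n_items).encode("utf-8")
    start = time.perf_counter()
    root = ET.fromstring(doc)
    elapsed = time.perf_counter() - start
    # Sanity check: every top-level element was actually parsed.
    assert len(root) == n_items
    return len(doc), elapsed


if __name__ == "__main__":
    # Doubling the input each step makes the growth rate easy to read off:
    # time doubling means linear, quadrupling means quadratic, and so on.
    for n in (1000, 2000, 4000, 8000, 16000):
        size, seconds = time_parse(n)
        print("%8d bytes  %.4f s" % (size, seconds))
```

Timings will vary by machine, so the ratio between successive rows is the number to look at, not the absolute values.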
For the record, xmldiff is STILL running (that's > 22 hrs). "Extraordinarily stupid" is right. But nonetheless I want to do diffing of XML. The only other tools I can find are proprietary. This can't be that hard.
Also, galeon does not handle its bookmarks file in "a blink of an eye", it often sits for several seconds while the interface hangs, munging its bookmarks.
Also be careful with your comparisons. Giving XML an advantage by comparing it to a poorly designed binary format is not a reasonable comparison.
I have not seen the case you mention, multiple XML files zipped up, one of them an index. But that's an interesting idea.
It is not CPU usage that hogs the system, it is disk I/O. tar (for a large file) forces all running programs onto disk, so that all memory is being used as a cache for this huge file. Then whenever you try to do something with an interactive program it must swap the entire thing back into memory. It then stays in memory for about 5s until tar provides some more memory pressure and puts it back on disk.
It is the constant swapping-in and swapping-out that make the system unusable. Nice has absolutely no effect on this.
I am simply pointing out with an example that parsing XML is O(n^3) while diff is O(n). On the XML, that's one 'n' to find the end of the tag, another 'n' to find the closing tag, and another 'n' for the number of tags in the document. Ok now go argue that some of my n's are bigger than others, and reduce it to O(n^2), but that is the best you will do. O(n^2) is nothing to be proud of, especially for a data STORAGE format.
Show me a better xmldiff and I'll test it. Also see my other responses in this thread.
Another horrible example is XML-RPC. You want RPC to be fast, dammit.
And I still want an xmldiff that runs in a reasonable amount of time. I want to share my bookmarks among my computers.
10MB is not "large" by today's standards. And I simply don't believe you on that one. Any reasonable binary format will have some kind of internal structure (like an index of pointers) that will allow accessing the data without parsing the entire document. Only comparing parsing-the-entire-document to parsing-the-entire-document does XML come out close (and it will not be even because of the overhead of parsing the tags themselves, which a binary format does not have to do). For all other possible tasks it is slower (because all other tasks require parsing the entire document). A solution which can only reasonably handle data up to 10MB on modern hardware should not be considered a success...
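To make the "index of pointers" idea concrete, here is a minimal sketch of a binary container where any record can be fetched by reading the header, one index entry, and the payload, without touching the rest of the file. The layout is invented for this example and is not any real format; write_records and read_record are hypothetical helper names.

```python
import io
import struct

# Hypothetical layout, made up for this sketch:
#   4 bytes   record count (unsigned, little-endian)
#   N * 8     index: one (offset, length) pair per record, 4 bytes each
#   ...       record payloads, back to back
HEADER = struct.Struct("<I")
ENTRY = struct.Struct("<II")


def write_records(stream, records):
    """Write byte-string records preceded by an offset index."""
    offset = HEADER.size + ENTRY.size * len(records)
    stream.write(HEADER.pack(len(records)))
    for rec in records:
        stream.write(ENTRY.pack(offset, len(rec)))
        offset += len(rec)
    for rec in records:
        stream.write(rec)


def read_record(stream, k):
    """Fetch record k by reading only the header, one index entry, and the payload."""
    stream.seek(0)
    (count,) = HEADER.unpack(stream.read(HEADER.size))
    if not 0 <= k < count:
        raise IndexError(k)
    stream.seek(HEADER.size + ENTRY.size * k)
    offset, length = ENTRY.unpack(stream.read(ENTRY.size))
    stream.seek(offset)
    return stream.read(length)


if __name__ == "__main__":
    buf = io.BytesIO()
    write_records(buf, [b"http://example.org/a", b"http://example.org/b", b"http://example.org/c"])
    print(read_record(buf, 1))
```

The cost of a lookup here depends on the size of the record, not the size of the file, which is the property being argued for above.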
But no parsing of a 650k file should take 45 minutes (it's up to 45 minutes CPU now).
In other projects like this I have directly compared regex parsing and tag-based parsing. You are right, doing it "right" takes about a second for most reasonable tasks. Using regexes to accomplish the same thing I can do most tasks in 0.01 seconds.
Again, parsing XML is a CPU-intensive task, thereby making it useless for anything which requires moving a large amount of data. It comes down to having to parse the entire tree to get at any tiny piece of data. (exactly as the author of the original article suggests)
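For anyone who wants to repeat the regex-versus-parser comparison, here is a small sketch of the two approaches pulling bookmark URLs out of an XBEL-like document. The sample document and function names are made up, and the regex assumes double-quoted href attributes with no comments, CDATA, or escaped characters in the way, which is exactly the kind of shortcut that makes it faster and more fragile than a real parser.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample in the spirit of an XBEL bookmarks file.
SAMPLE = """<xbel>
  <folder>
    <title>News</title>
    <bookmark href="http://example.org/one"><title>One</title></bookmark>
    <bookmark href="http://example.org/two"><title>Two</title></bookmark>
  </folder>
  <bookmark href="http://example.org/three"><title>Three</title></bookmark>
</xbel>"""

# Assumes href is double-quoted and appears inside the opening <bookmark> tag.
HREF_RE = re.compile(r'<bookmark\b[^>]*\bhref="([^"]*)"')


def hrefs_by_regex(text):
    """Scan the raw text for href attributes without building a tree."""
    return HREF_RE.findall(text)


def hrefs_by_parser(text):
    """Parse the whole document, then walk the tree in document order."""
    root = ET.fromstring(text)
    return [b.get("href") for b in root.iter("bookmark")]


if __name__ == "__main__":
    print(hrefs_by_regex(SAMPLE))
    print(hrefs_by_parser(SAMPLE))
```

Wrapping each function in a timer on a real bookmarks file would show how large the gap is on a given machine.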
It's quite simple really. Taking a look at my 650k galeon bookmarks file (stored in XBEL, an XML syntax for bookmarks), and using the utility xmldiff (a python script that attempts to find differences in xml), I obtain the following results:
$ time diff bookmarks.xbel bookmarks.xbel.old >/dev/null
0.073u 0.042s 0:00.12 91.6% 0+0k 0+0io 0pf+0w
$ time xmldiff bookmarks.xbel bookmarks.xbel.old >/dev/null
(I got impatient, top tells me 7:28 and counting -- if I remember correctly it takes more than 1/2 hour)
That is, parsing XML is more than 5000 times SLOWER than the line-by-line comparison. It is inappropriate for most purposes simply because the resources required to parse any decent-sized XML file are prohibitive.
The horrible problem with Linux right now, though, is that the memory management is so braindead that the backup will swap out everything in memory in favor of caching your multi-gigabyte backup file. Thus your method brings the machine to a standstill while the backup is occurring (which can take hours to days depending on the size of your filesystem).
Not a criticism of your method (in fact, I use this), just a rant that the Linux MM system NEEDS TO BE FIXED. I'm sick of watching as some trivial process that will only read or write once gets the whole filesystem cached for it while programs I'm using interactively get swapped to disk. Video recording and playing programs (mplayer, ogle) have the same problem.
Let's hope 2.6 is better than 2.4. Can any kernel hackers comment on this? In 2.5 will tar cvjf /mnt/backup/home.tar.bz2 /home bring my system to its knees?
I recently purchased a linux laptop, and dumped everything I learned from the experience onto this page. If I've missed any vendors feel free to email me and I'll add them.
First of all, hit ctrl-shift-numlock. Then you can use the numeric keypad to move the mouse cursor (probably to the lower right where it is not visible).
Second of all, you might be able to find a way to change the cursor pixmap/bitmap to be transparent. It can be done via an X API call, but I don't know if there are any command line utilities that will do it for you.
-- Bob
I disagree. Extracting 1-2% of profits for copyright renewal will still allow corporations to keep unpopular works (which had little profit to begin with) locked up.
Furthermore, I see no reason that popular things should receive more protection than unpopular things. The copyright clause was intended to stimulate innovation and creation. Milking old works for hundreds of years is not stimulating innovation.
Also, as time passes popular works become part of the creative unconscious. Nobody can write cartoons about mice with big black ears because they would get sued for infringement. Long copyrights on popular works effectively censor that which people think about most. You prevent derivative works, for all time.
No, popular works should enter the public domain, during the lifetime of those who knew them best. Casablanca, The Wizard of Oz, the Beatles, all should by now be in the public domain. Why should authors get to milk their old works until they die? Why shouldn't authors have to save their money for retirement during their best working years like the rest of us? Why are authors special?
Having recently had a lot of trouble with my laptop's BIOS, on an issue that I could most certainly fix if I had access to the code, I started wondering what benefit AMI and other vendors get from keeping BIOS code secret. I can think of none whatsoever.
An open-source TCPA BIOS might go a long way to alleviating the fears of the open source community, since we could see exactly what it is you're forcing on us. And hey, no doubt you'd get a few bug-fixing patches in return for your efforts.
So, is an open-source BIOS a possibility? (TCPA or otherwise)
The speed-up is indeed approximately 400% (or larger). Just gzip an html document and that compression ratio is the speedup. (as you say though this is server-side) Of course, the real killer for web page speed is images. The killer for modems is latency. The more items you have to request, the slower it is. Most modems have 300ms ping time to anywhere. That's why you want to strip out the banner ads to remove excess annoying images. In my experience the combination of the two makes for a significant speedup in load times over a modem.
I was downright alarmed at how fast my proxy was when running on a fast server, and I was pulling pages over the modem.
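The claim that the gzip ratio is roughly the speedup is easy to check on any page you have lying around. Below is a rough sketch that assumes transfer time is dominated by bytes on the wire (it ignores the latency issue mentioned above); the sample page is synthetic and far more repetitive than real HTML, and the 56000 bits/s figure is just an assumed nominal modem speed.

```python
import gzip


def compression_ratio(data):
    """Return original size divided by gzip-compressed size."""
    return len(data) / len(gzip.compress(data))


def transfer_seconds(n_bytes, bits_per_second):
    """Idealized transfer time: ignores latency and protocol overhead."""
    return n_bytes * 8 / bits_per_second


if __name__ == "__main__":
    # Synthetic, highly repetitive page; real pages will compress less.
    page = (
        "<html><body>\n"
        + '<p class="story">Some story text here.</p>\n' * 200
        + "</body></html>\n"
    ).encode("utf-8")
    modem_bps = 56000  # assumed nominal modem speed
    packed = gzip.compress(page)
    print("ratio: %.1fx" % compression_ratio(page))
    print("plain: %.2f s" % transfer_seconds(len(page), modem_bps))
    print("gzip:  %.2f s" % transfer_seconds(len(packed), modem_bps))
```

Feeding it a saved copy of a real page instead of the synthetic one gives a more honest number.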
None of the responses yet seem to have noticed the "antenna" part of your post. Anyway, there is ample support for FM radio cards in linux. Check out drivers/media/radio/* in the linux kernel tree. A number of the TV tuner cards can also tune in to FM. Now then, let's see.
Here is a list of radio tuner apps for linux and here's another. Also try googling for "linux FM radio tuner card". With these apps, a sound card (depending on what kind of FM tuner you get), oggenc/lame, and a little scripting (hint: cron job), you should be in business.
At the Fermilab Computer Center there is a display at the entrance. On a round table about 4 feet in diameter are various storage devices over the years of various density. Floppies, hard drives, zip disks, etc. Then you realize the table itself is one of those 4 foot platters from one of those ancient hard drives...
The single best thing you can have for good drivers is a group of dedicated people working on the driver, with proper access to the specifications of the device. This is the primary reason why you see the variation in the drivers, not the GPL. Given two driver development groups where everything else is equal, the GPL'ed driver will quickly be superior, and it will outlive its binary counterpart.
Of course, there's also the fact that large parts of the ATI specs are not ever going to be released to the OSS community, so there's no way they could write a really fast driver... (from what I've heard)
That driver is written and supported by the manufacturer of the product.
This is almost universally a bad way to do it. It results in crappy drivers and poor user support (companies only want to sell you shit...after you buy it they don't really care if you can't get the driver to work). This is the source of 99% of the instability of Windoze...crappy drivers that bring the system down.
Back in the day there was this company called WordPerfect. Their schtick was that there were thousands of printers, but no universal way to get shit printed. So they wrote printer drivers, for all of them, and they were fantastic. WordPerfect quickly took over the market because they wrote printer drivers. They knew how the printers would be used and figured the best way to access them, and were motivated to maintain the whole base of drivers.
Open source drivers are much the same way. Owners of hardware have a pretty serious motivation to make the drivers work. You also get higher quality drivers because of the many-eyeball effect. The best situation for customers and companies alike is for the companies to release a "beta" driver, detailed specs, and hire one guy to coordinate work on the driver. Then let the thing evolve. It's the beauty of the source.
"Source code is like manure, if you spread it around things grow. If you hoard it, it just smells bad."
Do not fall into the must-upgrade fallacy. Just because there is a new kernel version does not mean that company has to use it. Older kernels are still maintained, and security updates do make it in.
If it didn't do what you wanted at the time you installed linux, you made the wrong choice. If it did, then there's no motivation to upgrade.
This cyclic consumerism where we have to upgrade our computers once a week makes me sick. Buy tools to perform a particular job. They're not going to magically stop doing that job years down the road.
That's the point...Dell doesn't manufacture them, they're contracted out to ODMs (Original Design Manufacturers): Compal/Clevo/Asus/Wistron/Mitac/FIC/Uniwill/Quanta/GVC. (I think I was wrong about Sager -- according to this page they're actually made by Clevo.) So you'd actually want to go to one of them and ask for 2000 notebooks. But that's exactly what the companies on my list do. They resell some of the same laptops that you find from Dell/Compaq/Sony/etc.
Now, Toshiba, Sony, Apple, and IBM do manufacture many of their own laptops, but not all. So in principle you could go to them and ask for 2k laptops... (I think only IBM and Apple manufacture all their own...)
We need to increase the number of Linux vendors though. No-OS laptop vendors have a hell of a time identifying and diagnosing hardware problems (since the software that gets installed varies wildly). If you ordered a batch of 2k, a percent or two would have some hardware problem that you'd have to deal with...
I can't stand them either. I got a logitech optical USB mouse and it works great. I think you can pick them up for $25 at best buy or compusa. You can configure X to use both the trackpad and mouse simultaneously.
I haven't seen a trackball on a laptop in a long time. Everyone either has the trackpad or the eraserhead. Sorry. :(
Having recently purchased a laptop, I extensively researched the companies that will sell laptops with no-OS or Linux preinstalled. This information is distressingly difficult to find, so I present a list below.
I encourage you all to vote with your dollar and do not send a single penny to the monopoly in Redmond.
You should realize though that most of these companies purchase the hardware from companies like Sager (Linux forum) and Compal, and those companies also supply the big-name guys like Compaq, HP, Dell, and Toshiba. So when you find some no-name laptop, it is usually equivalent to some branded laptop that never touched the hands of HP/Compaq/Toshiba/Dell. (And figuring out exactly *which* brand-name laptop it is equivalent to can be extremely difficult.) Some of the below claim to manufacture their own notebooks, but what this means is that they buy them from Sager/Compal or someone else, and put in a hard drive/CPU/RAM, which is why you will find identical looking cases at several of these vendors.
If you find an HP/Compaq/Toshiba/Dell/IBM/Sony branded laptop that has linux preinstalled, it means that the vendor paid for windows and removed it. I do not list them below because I think this is a despicable and deceptive practice. These manufacturers do not (yet) sell no-OS or linux laptops. (But please, call them and ask!! The squeaky wheel gets the grease!) Also if you order a no-OS laptop, please request linux to be installed anyway, and tell them you'll pay for it! Let them know there is demand!
Mtech Laptops (these guys outright lied to me about what they could deliver, in order to get my order, were not able to deliver the laptop, and I had to cancel my order -- which took 3 months to process and they kept $5 for the privilege -- do not do business with them)
-- Bob
Not only that, but MSN finds 604 pages, google finds over 50 million. Are they actively removing linux sites from their crawler?
Well it's sqrt(1) = +/- 1. And that is the error.