Link spamming has happened on The Metaweb too. But it has not been a serious problem.
Mediawiki, the code base the Metaweb uses, has several features that minimize the damage done by link spammers. The first is watchlists - anyone who uses the Metaweb can set up a watchlist so that they are notified when there's a change to one of their favorite pages. Link spammers usually hit someone's favorite page, and that someone undoes the damage in a few minutes to a few hours.
Mediawiki also allows administrators to lock pages from edits (the front page is locked), and to ban IP addresses. And, like any Wiki, anyone can roll back changes to a page.
This combination of peer review, notification of changes, and tools that make it easier to undo damage than to create it, together with a (small) community of active users, means that while link spammers have hit us, the damage gets undone quickly enough that we're not worth their time.
$5 is only possible if the author didn't count the cost of the parts (yes, I'm too lazy to read the article).
Anyhow, I have him beat.
I built a cardboard and tinfoil microwave horn antenna. Construction was incredibly simple, and the design is robust. I get 16dB of gain over an Orinoco wireless card when I hook up the antenna (which probably means 18-19dB of antenna gain, assuming that the card's built-in antenna gets 2-3dB).
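For anyone checking the arithmetic: decibels add, so the antenna's total gain is the measured improvement over the card plus whatever the card's built-in antenna already provides. A minimal sketch (the function names are mine, and the 2-3dB card figure is the assumption stated above, not a measurement):

```python
def antenna_gain_db(gain_over_card_db, card_antenna_db):
    """Total antenna gain: measured gain over the card plus the card's own antenna gain."""
    return gain_over_card_db + card_antenna_db


def db_to_power_ratio(db):
    """Convert a gain in dB to a linear power ratio."""
    return 10 ** (db / 10)


# 16dB measured over the card; the card's antenna is assumed to give 2-3dB.
low = antenna_gain_db(16, 2)
high = antenna_gain_db(16, 3)
print(f"estimated antenna gain: {low}-{high}dB")
print(f"16dB is roughly a {db_to_power_ratio(16):.0f}x power ratio")
```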
I found a design on the net, learned enough microwave theory to make sure I knew which measurements had to be particularly good, and which ones didn't matter as much, and built it in a weekend.
The only part I had to purchase was a pigtail connector for the Orinoco card, and I think that was $20.
I was able to establish a link across a seriously cluttered and heavily WiFi enabled San Francisco skyline to an open access point 8 miles away (from 5th and Howard to the BARWN access point on Mt. San Bruno).
I actually have downloaded and grepped the internet.
I used to do research on the Internet Archive's web collection. Each web snapshot was distributed across many unix boxes stuffed with disk in ARC files (a text archive format developed by IA for web crawls).
With the architecture at that time (around 1999) you could grep the internet in, I believe, half an hour. The way you would do it would be to remotely run grep on each box and then collect the results.
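The fan-out looks roughly like the sketch below. This is not the tooling IA actually used; it just assumes each box is reachable over ssh and has grep installed, and the host names and data path in the usage note are hypothetical.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_grep(host, pattern, path):
    """Run a recursive grep on one remote box over ssh and return its matching lines."""
    # Quote the arguments because ssh hands the command string to the remote shell.
    remote_cmd = f"grep -r -e {shlex.quote(pattern)} {shlex.quote(path)}"
    result = subprocess.run(["ssh", host, remote_cmd],
                            capture_output=True, text=True)
    return result.stdout.splitlines()


def grep_cluster(hosts, pattern, path, run=ssh_grep, max_workers=32):
    """Search every box in parallel and collect (host, line) pairs in host order."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_host = pool.map(lambda h: run(h, pattern, path), hosts)
        return [(h, line) for h, lines in zip(hosts, per_host) for line in lines]
```

You would call it as something like grep_cluster(["box01", "box02"], "some phrase", "/data/arc"). The run parameter is only there so the fan-out logic can be exercised without a real cluster.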
This is also remotely like Desktop.com - an office environment originally written in, I believe, Javascript. It had a desktop, file storage, and some basic applications (word processor, possibly a spreadsheet, and most importantly, Tetris.)
If I remember correctly, it ran almost tolerably on PIII 500MHz machines of the era.
I believe people here are missing the point. KCEasy is one of the few Windows clients that is compatible with Kazaa's Fastrack network (besides Kazaa-lite). Sharman probably doesn't care nearly as much about Linux clones.
The KCEasy site still links to the old version, 0.11, which includes the Fastrack plugin. It runs fine on Windows XP and works with Fastrack (Kazaa), Gnutella, and OpenFT.
Some of the complexity of new cars also makes them much lower maintenance. For example, the engine computer on most cars replaces a system that required serious and frequent maintenance.
This trend is also driving mechanics out of business. It used to be that a car would generate serious $$$ in terms of annual scheduled maintenance.
So consider the plight of independent mechanics - not only does it now require the equivalent of a college degree's education to understand most cars, but it's also less rewarding because there are fewer opportunities for maintenance.
Warren Buffett is the classic counterexample. He lives frugally in a middle class home. He has 14 employees. He's giving his 43 billion to charity when he dies, minus a small trust fund for his kids. Admittedly rare, but still a good counterexample.
This is mostly, but not entirely, correct. Warren Buffett also has his own corporate jet called, I believe, the Indefensible, and also I think has a nice second home in California. He does not live a completely middle-class life. That said, he certainly does live a more normal life than almost anyone with even a ten-thousandth of his money.
Also, while Warren Buffett's office only has a handful of employees, his company owns all or a large part of many large corporations and retail stores (General Re, Jordan's Furniture, See's Candy, Coke). Just counting the 100% ownership subsidiaries, he has thousands of employees.
Finally, a small percent, but a very large dollar value, of his jointly owned stock with his wife Susan will go to her control, and not the charity of his choice, when he dies.
I forget why, possibly based on some of his existing charitable contributions, but while Buffett hasn't said, I suspect the charity that will receive most of the money is Planned Parenthood.
I just wanted to clear up one point about the Metaweb (the site linked to in the article).
I'm one of the Metaweb administrators; I am also a Wikipedia administrator. The two sites run the same software - Mediawiki - but have different goals.
The Metaweb currently has extensive annotations on Quicksilver, many written by Neal, but also many contributed by readers. I hope that any Slashdot readers who are interested in The Confusion and would like to annotate the book will participate on the Metaweb.
The Cray CTO makes the point that Linux clusters get, at best, just under 10% peak as sustained performance and uses this as a justification that Linux clusters are not HPCs.
This is a reasonable criticism. Let's take the percentage he cites as real for a moment.
Now what is the cost difference between a Linux cluster and a Cray (not some future offering, but today) and how much more of a Linux cluster could you afford?
Would that offset the quoted inefficiency?
Would the flexibility of being able to use commodity components further offset any advantage Cray might have?
What about 24hr or same-day parts replacement without a hyper-expensive service contract?
At the end of the day, I suspect the Linux cluster wins out even given the sub-10% efficiency figure Cray cites.
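To make the question concrete, here is a rough sketch of the break-even arithmetic. The 10% cluster efficiency is the figure cited above; the 50% efficiency for the vendor machine is a made-up placeholder, not a number from Cray, and the function names are mine.

```python
def sustained_per_dollar(peak_flops, efficiency, cost_dollars):
    """Sustained performance bought per dollar spent."""
    return peak_flops * efficiency / cost_dollars


def breakeven_price_ratio(cluster_efficiency, vendor_efficiency):
    """How many times more per peak FLOP the vendor machine can cost
    before the cluster delivers more sustained performance per dollar."""
    return vendor_efficiency / cluster_efficiency


# Cluster at the cited 10% of peak; vendor machine at a hypothetical 50%.
ratio = breakeven_price_ratio(0.10, 0.50)
print(f"cluster wins if the vendor machine costs more than {ratio:.1f}x per peak FLOP")
```

Parts availability and service contracts are not in this calculation; they would only push the comparison further toward the cluster.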
--Pat / zippy@cs.brandeis.edu
I was persistent - I applied two or three times over as many years. I think the rejection would have been less annoying had it been accompanied by a "rejected by so-and-so for the following reason" with at least some chance of dialog.
Instead, rejection was denial without any appeal. As you pointed out, this is a good way to turn off potential volunteers.
Building a successful community is a surprisingly tough balance of rules and social structures. It's something that communities almost never get right in the first few iterations. I hope ODP/DMoz continues to work on this.
Permit me a constructive mini-rant here - please read it before moderating it as -5 troll.
ODP/DMoz is dead.
I don't mean that it's a bad idea, I mean that while I found ODP/DMoz to be very, very useful four years ago, I no longer search it for starting points. The links in ODP are stale and rarely of better quality than what I get back from Google.
And now to my rant.
For several years, I've volunteered to participate as a DMoz/ODP editor. I enjoy helping out and volunteering, and I submitted applications in which I had very, very strong domain knowledge (collaborative filtering was one).
I went through a fair amount of work filling out the application form for ODP/DMoz editor status, for a subject that had no editor, and what happened? They rejected me without comment.
Here I am, a domain expert on collaborative filtering, not just with academic credentials, but with two deployed and fairly heavily used systems, and they dropped my application without comment. (And at the time, I had no commercial relationship with either filter, so I doubt it was because of perceived bias).
Same thing happened when I applied to be an editor of another unrelated category.
These were both categories that did not yet have editors, and here I was, a pretty qualified applicant, and getting rejected without comment.
So I gave up. I just didn't get it, and left with the perception that DMoz/ODP was some collection of people who all knew each other, rather than an open volunteer effort. I don't know that this is true, but it's why I didn't volunteer any more.
Is ODP/DMoz dead? I don't know, but as a user, I find Google better, and as someone who volunteers for community projects (Wikipedia admin, journal reviewer, scientific conference organizer), I think ODP/DMoz seems broken from the community side as well.
Here are my suggestions: ODP should open up the editorial application process. None of this secret anonymous stuff. Further, they should actively seek qualified volunteers. Finally, they should automate as much as possible to increase coverage and accuracy. DMoz is still a great idea, and I believe it can again become the directory of useful knowledge - the place I would turn to when a straight search fails.
I am curious whether the wiretapping law they're talking about is specific to phones, to person-to-person communications, or to any wire that carries signals.
If the latter, I'm going to unplug my toaster, as it's intercepting the 60Hz signal from my power company.
I also discovered that I'd apparently had an Alexa phone-home browser extension installed as a "Browser Helper Object" in IE, god knows for how long.
I believe the Alexa BHO you saw is the one that Microsoft includes in IE for the "Show Related Links" tool. This is similar to Netscape and Mozilla's "What's Related" button. This BHO only phones home when you do "Tools -> Show Related Links".
Alexa also makes a separate downloadable toolbar that shows related links automatically on each page transition, and so tracks (almost) every site you visit, but this is different than the BHO bundled with IE.
No, they've just said no one has proposed it. From the IAU's FAQ:
No proposal to change the status of Pluto as the ninth planet in the solar system has been made by any Division, Commission or Working Group of the IAU responsible for solar system science. Accordingly, no such initiative has been considered by the Officers or Executive Committee, who set the policy of the IAU itself.
Reading the rest of the FAQ, their position seems to be that a) Pluto's status is a sensitive issue, b) it probably shouldn't be a planet, c) for the IAU to change its status requires that someone propose the change, d) no one within the IAU has proposed this, e) the Planetary Systems Sciences Small Bodies Naming Commission in particular does not want to push the issue.
A decent motherboard costs $100 or less. Is there anything else I would have to replace, besides the CPU, if I wanted to upgrade from the current chipset?
If not, I don't see why I would want to wait for the next chipset.
Let me make it simple for you, in numbered fashion.
...
5. Commercial ISPs are largely covered by laws that make them a "common carrier" - meaning they are not generally liable for the content their users consume or provide. Universities are not presently generally protected in this manner. Therefore, the actions of a few infringing students put the entire University and ultimately the citizens at financial risk.
This is the most interesting one to me. What surprises me is that campuses don't deal with this by outsourcing their ISP-like functions to someone like Speakeasy - someone who does have common carrier protection.
You'd be insulated from the liability and much of the end-user handholding.
Trading 10 bandwidth tyrants for 10,000 isn't progress.
Like I said in the parent post, the nerve of those 10,000 users to actually use bandwidth.
But the real point is that, while no decent ISP blocks, University ISPs (aka IT departments) do. They do this for a mixture of reasons, but one key reason is that they cannot afford sufficient or sufficiently competent staff to put a more complex rule into place than "block port X."
What would any ISP do in the situation Dan describes? They would either buy additional capacity, or boot the smallest fraction of users necessary to unclog the network, or do bandwidth throttling.
"Will the recent acceptance by such reputable companies open the possibility to Universities that not all P2P distribution is inherently bad?"
No.
Many universities (my own alma mater being an exception) tend not to be particularly progressive in any area but instruction. IT departments at universities often have very limited staff and budget, and block P2P services as much due to the hassle or threat of lawsuits as to cut down on bandwidth (the nerve of people to actually use the network connection!)
Nike missile bases are very different than this. There's one in Marin County that's restored and open to the public: SF-88.
The most extensive Nike bases I've seen, SF-88 and one of its sister sites in the Presidio of SF, had an above-ground radome or two, an above-ground launching area, and a small below-ground bunker - more of a garage, really - for storing the Nike missiles. The bunkers were not hardened; their purpose was to protect the surrounding area if one of the missiles exploded in storage.
--Pat / zippy@cs.brandeis.edu
My 31MB inbox in mbox format became 10MB after gzip -9.
I have a fair number of photos and .doc files in there, but little spam.
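If you want to check your own mailbox, a minimal sketch using Python's gzip module at the same compression level looks like this. The sample data is a stand-in for a real mbox file, and plain repetitive text like this compresses far better than a mailbox full of photos will.

```python
import gzip


def gzip_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size as a fraction of the original size."""
    if not data:
        raise ValueError("cannot compute a ratio for empty input")
    return len(gzip.compress(data, compresslevel=level)) / len(data)


# Stand-in data; for a real mailbox, read the mbox file in binary mode instead.
sample = b"Subject: meeting notes\nSee you at the usual place.\n" * 1000
print(f"compressed to {gzip_ratio(sample):.1%} of the original size")
```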
--Pat / zippy@cs.brandeis.edu
Of course it's a cool idea. I'm just using the same accounting that the article uses (that is, only counting the cost of the antenna).
In my case, I could have used the USB solution and placed the USB transceiver in the microwave horn. The antenna doesn't care whether I'm using an Orinoco card or not, just that the source of the signal is at the right place.
--Pat
The math is more like this.
With gzip -9, you get between 10% and 20% of the original size of a text document (this is what the Internet Archive used for its archive of HTML when I was there).
An index into a text collection is on the order of 10% - 100% of the size of the original collection, depending on what features you want to offer at speed. 10-50% is a reasonable size.
So for 10M messages at 10k each, assuming the compression ratio above (which might not hold for MS Word attachments - a big caveat) you have 100G of source, 10-20G compressed, with a 10-50G index.
Total storage is then between 20G and 70G, or 20-70% of the original uncompressed text.
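The same estimate as a quick sketch, in case you want to plug in your own numbers. The function name and defaults are mine; the ratios are the ones above, and decimal units (1G = 1,000,000K) are assumed.

```python
def estimate_storage_gb(num_messages, avg_message_kb,
                        compression_ratio=(0.10, 0.20),
                        index_ratio=(0.10, 0.50)):
    """Return (source_gb, low_total_gb, high_total_gb) for compressed text plus index."""
    source_gb = num_messages * avg_message_kb / 1_000_000  # KB -> GB, decimal units
    low = source_gb * (compression_ratio[0] + index_ratio[0])
    high = source_gb * (compression_ratio[1] + index_ratio[1])
    return source_gb, low, high


source, low, high = estimate_storage_gb(10_000_000, 10)
print(f"source: {source:.0f}G, total: {low:.0f}G to {high:.0f}G")
```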
--Pat
This has to be the stupidest approach to the problem. Their networks are too slow, so instead, they're going to have each employee go through their old email and save individually important messages to their local hard disk? Not only are they going to tie up employees with this manual effort, they're also going to lose key documents and a key service - the ability to centrally search and reply to requests for information. In the future, each department will have to search their local hard drives for this information.
They've taken a simple problem of old or improperly specced equipment and turned it into a manual labor solution instead. That's an insane waste of time and salary. They should just upgrade their network and storage. If I can build a 4 terabyte RAIDed PC for a few thousand dollars, they can centralize their mailserver and back it up for, say, a hundred thousand, even with extra redundancy and inefficiencies and admin costs.
By contrast, forcing every current employee to perform a task that would eat up weeks of time per employee per year, in a city of Baltimore's size, will cost tens of millions of dollars.
Dumb, dumb, dumb.
--Pat / zippy@cs.brandeis.edu
--Pat / zippy@cs.brandeis.edu
--Pat / zippy@cs.brandeis.edu
--Pat / zippy@cs.brandeis.edu
KCEasy v0.11 with Fastrack Goodness
--Pat / zippy@cs.brandeis.edu
This is a double-hit.
--Pat / zippy@cs.brandeis.edu
--Pat / zippy@cs.brandeis.edu
For a summary of the differences, see the Metaweb vs Wikipedia FAQ
--Pat / zippy@cs.brandeis.edu
--Pat
--Pat
--Pat
--Pat / ex-Alexan
--Pat / zippy@cs.brandeis.edu
Along with Googlebar and MultiZilla, it's my favorite Mozilla plugin.
--Pat / zippy@cs.brandeis.edu
Age / Sex / Location.
--Pat / zippy@cs.brandeis.edu
P.S. If I get +5 Informative for this, it's a sign of the coming apocalypse.
--Pat / zippy@cs.brandeis.edu
Are any universities doing this?
--Pat / zippy@cs.brandeis.edu
What do most universities do? Block port X.
--Pat
--Pat / zippy@cs.brandeis.edu
--Pat / zippy@cs.brandeis.edu