So I did talk to some of these lenders. Apparently they buy leads from
www.lendergateway.com . One
guy that I talked to was irritated because it costs him $100 per lead
they sell him and it's supposed to only be sold to him. He apologized
quite a bit and was nice enough to give me the information on who sold
him the names. The number he gave me goes to voicemail which I'm going
to try later. A couple other people told me what I can do with myself
and one lady kept saying that she couldn't give me information on who
provided her with my information.
The stupid thing is each time I talk to them I tell them I'm on a cell
and that I need their name and number and I'll call them right back.
They give it to me... So when they hang up I start calling again and
again. I've been irritating the hell out of them...
Anyways, that's the fun story of what happens when these forms are
filled out.
I agree -- this sounds like a very effective way to cause trouble for the spammers; if their customers aren't happy, they won't be ordering many more spam runs....
Actually, I'd suggest the poster learn a little more about patents, instead of ranting.
"The f**king summary" -- or at least the Claims part -- is exactly what governs what other implementations are judged to be infringing, or not. No matter how complex the further explanation is, if the claims are simple and broad, the danger for other software developers is similarly simple and broad.
(The further explanation is supposed to be a way for other implementors to easily use the patent to implement the same system, assuming they then go ahead and license it from the original inventor. Of course, in the software field, that explanation is generally never coherent or detailed enough to do so, without having to expend pretty much as much effort as if you wrote it from scratch yourself.)
To save the 'poor bastard', I'd suggest getting this 10MB Quicktime .mov version instead; it loses the unfunny subtitles, and is hosted on archive.org, which can handle the traffic.
I don't think you understand what happens if a developing country annoys the WTO by ignoring provisions of the WIPO and TRIPS treaties.
Check out what happened to Brazil when they tried to manufacture generic AZT without paying license fees to Glaxo. Here's a snippet from this doc:
'In 1996, Brazil passed a law authorizing the local production of five key anti-retroviral drugs used in the US. Some of the medications, such as AZT, an anti-retroviral drug that prevents the transmission of HIV from mother to child, were patented prior to 1995 when the WTO provisions first applied. These medicines fall outside the scope of TRIPS. Through its patent law, Brazil allows the drugs to be produced legally, without paying royalties. As a result, Brazil is able to provide free drugs to people living with HIV/AIDS. Recently, Brazil managed to persuade the US company Merck to lower the prices of two of its drugs, Crixivan and Stocrin, used to treat people with AIDS, by threatening to permit compulsory licensing if Merck did not cut prices by 50 per cent.
In the US government's view, a section of Brazil's law discriminated against foreign owners of patents. Under the law, designed to help build a national pharmaceutical industry and reduce the price of medicines, Brazil will honour a patent only if the drug is produced locally. Therefore, foreign companies must establish a presence in Brazil in order to enjoy protection. According to the US, TRIPS prohibited this kind of discrimination. The US government maintained steady diplomatic pressure on Brazil to get it to change its patent regime and medicines policy, backing up the pressure with a threat of unilateral trade sanctions.'
So, a developing country that ignored specific IP-related trade treaties, and so came to the attention of a sufficiently powerful US corporation, got slapped down with threats of unilateral trade sanctions.
For a developing country, sanctions are no small deal. Hell, even for the US, they're no small deal ;) I'd say the local software industry would quickly find out that TRIPS was back on the menu....
Do yourself a favour -- don't try running the server on a Windows machine, it'll be a world of pain.
Just get hold of a clunky old PC, install Linux, and use that as a dedicated source code control server with whatever system you want to use. You'll save yourself a lot of bother (and get a bit more immunity to disk crashes, too).
'One, wouldn't a normal Bayesian filter do this automatically? I.e., pick up that url in mail classified as spam and then weight it positively in the future?'
Yep, that's the case, in SpamAssassin 2.6x at least.
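To make that concrete, here's a minimal sketch of how a Bayesian filter ends up weighting a URL once it has been seen in mail classified as spam. This is not SpamAssassin's actual tokenizer or probability combiner; the `TinyBayes` class, the `url:` token prefix and the smoothing constants are all assumptions made up for illustration.

```python
import math
import re
from collections import Counter

URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def tokenize(text):
    """Split a message into word tokens plus one token per whole URL."""
    lowered = text.lower()
    urls = URL_RE.findall(lowered)
    words = re.findall(r"[a-z0-9$'-]+", URL_RE.sub(" ", lowered))
    return words + ["url:" + u for u in urls]


class TinyBayes:
    """Toy per-token Bayesian classifier (illustrative only)."""

    def __init__(self):
        self.spam = Counter()  # token -> number of spam messages containing it
        self.ham = Counter()   # token -> number of ham messages containing it
        self.nspam = 0
        self.nham = 0

    def learn(self, text, is_spam):
        tokens = set(tokenize(text))
        if is_spam:
            self.spam.update(tokens)
            self.nspam += 1
        else:
            self.ham.update(tokens)
            self.nham += 1

    def token_prob(self, token):
        """Smoothed estimate of P(spam | token); 0.5 for unseen tokens."""
        s = (self.spam[token] + 1) / (self.nspam + 2)
        h = (self.ham[token] + 1) / (self.nham + 2)
        return s / (s + h)

    def score(self, text):
        """Combine token probabilities in log-odds space."""
        logit = 0.0
        for token in set(tokenize(text)):
            p = self.token_prob(token)
            logit += math.log(p) - math.log(1.0 - p)
        logit = max(-30.0, min(30.0, logit))  # keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-logit))
```

Train it on a couple of spams that advertise the same URL and the `url:` token's probability climbs above 0.5 without any URL-specific rule being written.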
I find it amazing that, because spam comes from various offshore addresses, people always say that spam laws are pointless because the spam "all comes from overseas anyway". People say this no matter where the law is being discussed!
If it's not illegal *anywhere* then we've made no legislative progress whatsoever.
Basically, if spamming is illegal in the UK (and Ireland, and Australia, etc.) then (A) spammers cannot offshore to those countries, or outsource to spam bureaus there, so that's one set of possible spamhosting ISPs we don't have to worry about; (B) if a multinational company spams from the US to a recipient in the UK/Ireland/Australia etc., and has an office in those countries, it can still be held accountable for its spamming despite the US' weak laws; and (C) at least in Ireland, it may be possible to prosecute spammers in other European countries due to EU harmonization of laws and jurisdictions. (I think. IANAL.)
Me, I'm thinking that (B) may turn out to be handy against the serious mainsleaze spammers -- of which there are plenty, and given the CAN-SPAM act, there will be many many more quite soon.
Ironport's appliances are good for sending lots of mail. That probably means they may be useful for sending lots of spam, but it also definitely means they're useful for sending legit bulk mail. It's a "dual-use" thing.
In fact, that applies to SMTP email in general. Consider all the mailing lists you read; they're "bulk email". That's one reason why spam filtering is harder in email than other protocols like IM; bulk one-to-many contact is a lot more common in the SMTP case. The IETF recognised this, and hence we have ESMTP.
Given this story, if I was Eric Allman or Wietse Venema, I'd be worried about people complaining that sendmail or postfix are spammer tools...
'Eugenia Loli-Queru: In your opinion, which is the hardest step to take in the road ahead for full interoperability between DEs? How far are we from the realization of this step?
'Havoc Pennington: I think the "URI namespace" or "virtual file system" issue is the ugliest problem right now. It bleeds into other things, such as MIME associations and WinFS-like functionality. It's technically very challenging to resolve this issue, and the impact of leaving it unresolved is fairly high. Here are some links on that here, here and here. '
OK -- so, unsurprisingly, having GNOME have one set of apps that can read one namespace, KDE have another set that can read another namespace, and a whole load of command line tools that can't read either, is a problem.
I still can't understand why this hasn't made it into a mainline kernel hook, or at least a shared library kludge. Something like AVFS is infinitely preferable to a filesystem that can only be accessed by a small subset of applications...
Stop the presses -- the original paper looks like it was correct, as far as
review of the M&M results reveals so far. It seems a screw-up
somewhere resulted in exporting 159 columns of data into a 112-column Excel
spreadsheet, which screwed up the analysis for this. (Blame MS! ;)
Also, theirs is not the only paper that supports the 'hockey stick' graph
anyway -- there's quite a few others, too.
But anyway -- we're jumping the peer-review process heavily here. USA Today
stories are supposed to happen after the peers do the reviewing;)
Gotta say, I hate the idea. I've dealt with unusual apps in charge of
starting services in the past (AIX had some kind of DCE-based service
control daemon) -- and it was a world of hell. Shell scripts, by
comparison, are comprehensible, tweakable, and very very easy to deal
with. I know -- this sounds very unlikely -- but any system that has
to deal with as many settings/dependencies/external hooks etc. as the boot
scripts, is going to be that confusing anyway no matter what language
it's in!
But I do like the idea of parallelization of the boot scripts, and
starting X a whole lot earlier (like before the daemons are all started);
I hacked up the init scripts to do this on my desktop linux machine a few years ago, and on Solaris and SunOS machines before that, and it
was great for boot time.
Richard Gooch's need(8) and provide(8) tools look like a fantastic way to do this simply, comprehensibly, and without rewriting everything
in a new language.
That's available here, and that page notes that it should be in
versions of init in util-linux since 2.10q.
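Here's a rough sketch of the general idea, i.e. starting services in parallel while respecting "needs" relationships. It is not how need(8)/provide(8) are implemented; the function name, the service names in the usage note and the thread-pool approach are assumptions for illustration, and it relies on `graphlib` from the Python 3.9+ standard library.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def start_in_parallel(needs, start, max_workers=4):
    """Start every service, running independent ones concurrently.

    needs: dict mapping a service name to the set of services it needs.
    start: callable invoked with a service name to actually start it.
    Returns the service names in the order they finished starting.
    """
    sorter = TopologicalSorter(needs)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished_order = []
    running = {}  # future -> service name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Launch everything whose needs are already satisfied.
            for name in sorter.get_ready():
                running[pool.submit(start, name)] = name
            # Block until at least one running service has come up.
            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any startup failure
                name = running.pop(future)
                finished_order.append(name)
                sorter.done(name)  # unblocks services that need this one
    return finished_order
```

For example, with `needs = {"syslog": set(), "network": {"syslog"}, "xdm": {"syslog"}, "sshd": {"network"}}`, X (`xdm`) can come up as soon as `syslog` is running, without waiting for the network daemons.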
'Snopes was set up in early 1995 by the CIA as a way to debunk popular conspiracy theories, Companies and individuals can now pay to have their urban legend denied on the site, a prime beneficiary being Richard Gere.'
'because you have to mark as GM anything that even could have come into contact with GM crops - this is 99.9% of American crops - nobody in the EU will buy any food exports from the US'.
Come on. Is this really a good argument? Why would you be against labelling a foodstuff as to its origin and provenance?
Sorry, I don't agree. IMO, the more info a consumer has on where their food comes from, how it was grown, what pesticides were used, whether it may contain GM pollen, how it was treated after picking, etc. -- the better.
It's simply called informing the consumer. Then the consumer can use their judgement instead of trusting some big, faceless organisation who Knows What's Good For You.
And then interested parties can persuade the consumers that GM is safe, and eating the tomatoes with the GM sticker is fine. That's OK, that makes sense. But don't use this 'information is bad' line, it's crap.
PS: re GM patents, etc. IMO the GM industry at the moment is acting like the RIAA; there's lots of good ways to use GM, but they're focused on the short term gain -- make $$$$ fast.
'Any legislation that permits all of America's estimated 23 million small
businesses to legally send everyone at least one email cannot be
considered anti-spam. And any bill that limits a consumer's recourse to
clicking an opt-out link 23 million times isn't going to make our lives
any better.....
Opt-out laws have let the problem grow to the state it is today; no one in
Congress can supply an adequate explanation as to why opt-out at a
national level will make any difference. Opt-out in Korea has been an
unmitigated disaster and their legislature is rushing to repair the
global damage their opt-out law has done to their Internet economy.
California's opt-out law is being scrapped. And the European Union knew
better than to waste time with a discredited approach and went straight
to opt-in.'
At least this law allows ISPs to prosecute spammers, and it does not block class action suits from multiple spam recipient consumers (AFAICS). Also the damages of $500 per message is a lot better than the proposed Texas state law's puny $10 per message.
But consider these facts: there's 23 million small businesses in the US. That means a lot of "I would like to opt out" mails you'll be sending. Multiply that by however many possible addresses you can receive mail at: foo@domain1.com, foo@[211.11.22.34], foo%domain1.com@domain1.com, root@domain1.com, postmaster@domain1.com, foo@forwardingservice.net, foo@perl.org, foo@users.sourceforge.net, etc. etc. etc.
Then there's the "tagged addressing" concept,
where you "tag" the addresses you give out with
additional text to identify who you gave it to, e.g. foo+amazon@domain1.com, foo+slashdot@domain1.com. Each of those is a different "e-mail address".
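To put rough numbers on that, here's a small sketch of tagged addressing and of the opt-out arithmetic. The helper names are made up for illustration, and it assumes the worst case where every one of the 23 million businesses mails every address once.

```python
SMALL_BUSINESSES = 23_000_000  # figure quoted above


def tag_address(address, tag):
    """Return the '+tag' form of an address, e.g. foo+amazon@domain1.com."""
    local, domain = address.rsplit("@", 1)
    return f"{local}+{tag}@{domain}"


def split_tag(address):
    """Return (untagged address, tag or None) for a possibly tagged address."""
    local, domain = address.rsplit("@", 1)
    base, _, tag = local.partition("+")
    return f"{base}@{domain}", (tag or None)


def opt_outs_needed(addresses, senders=SMALL_BUSINESSES):
    """Worst case: one opt-out per distinct address per sender."""
    return len(set(addresses)) * senders
```

With just three addresses (`foo@domain1.com` plus the `amazon` and `slashdot` tagged forms), `opt_outs_needed` comes to 69 million opt-out requests.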
Here the World Trade Organization (WTO) lent the biotech industry a shoulder to cry on by allowing the major players to formulate the Trade Related Intellectual Property Rights Agreement (TRIPS) which came into force in 1995. TRIPS aims to force all countries to take on board a menu of biotech patents and 'harmonize' their national patenting regimes accordingly - the aim is to make the world follow the US example.
This book review at Nature says: 'Central to this analysis is the account of the negotiation of TRIPS, whereby the campaign for globalized intellectual-property standards was shifted to the international trade agenda. Developing countries were persuaded to sign up to TRIPS in exchange for the liberalization of world trade markets. The subsequent failure of these markets to materialize (witness US steel tariffs and farm subsidies in the United States and Europe) also goes some way to explaining the growing disenchantment with TRIPS.'
See also
why Biotech patents are patently absurd. As members of the WTO, and signatories to TRIPS,
these countries really don't have a choice;
they'd be in breach of the TRIPS treaty if they
do not ratify these laws.
There is one, for exactly this reason -- the SpamAssassin
public corpus. I made it available for developers of spam tools to compare effectiveness using a good, recent corpus from 1 person's mail feed (as much as that was possible).
This is a selection of mail
messages, suitable for use in testing spam filtering systems. Pertinent
points:
All headers are reproduced in full. Some address obfuscation has taken
place, and hostnames in some cases have been replaced with
"spamassassin.taint.org" (which has a valid MX record). In most cases
though, the headers appear as they were received.
All of these messages were posted to public fora, were sent to me in the
knowledge that they may be made public, were sent by me, or originated as
newsletters from public news web sites.
Relying on data from public networked blacklists like DNSBLs, Razor, DCC
or Pyzor for identification of these messages is not recommended, as a
previous downloader of this corpus might have reported them!
Copyright for the text in the messages remains with the original senders.
OK, now onto the corpus description. It's split into the following parts:
spam: 500 spam messages, all received from non-spam-trap sources.
easy_ham: 2500 non-spam messages. These are typically quite easy to
differentiate from spam, since they frequently do not contain any spammish
signatures (like HTML etc).
hard_ham: 250 non-spam messages which are closer in many respects to
typical spam: use of HTML, unusual HTML markup, coloured text,
"spammish-sounding" phrases etc.
easy_ham_2: 1400 non-spam messages. A more recent addition to the set.
spam_2: 1397 spam messages. Again, more recent.
Total count: 6047 messages, with about a 31% spam ratio.
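As a quick sanity check on those figures, the total and the spam ratio can be recomputed from the per-part counts listed above (the dict layout is just for illustration):

```python
# Message counts per corpus part, as listed above: (count, is_spam).
CORPUS = {
    "spam": (500, True),
    "easy_ham": (2500, False),
    "hard_ham": (250, False),
    "easy_ham_2": (1400, False),
    "spam_2": (1397, True),
}

total = sum(count for count, _ in CORPUS.values())
spam_total = sum(count for count, is_spam in CORPUS.values() if is_spam)
spam_ratio = spam_total / total

print(f"{total} messages, {spam_total} spam, {spam_ratio:.1%} spam ratio")
```

That agrees with the stated total of 6047 messages and a spam ratio of about 31%.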
One reason this has come up as an issue, is because the US (via the WTO) have been applying pressure to countries around the world to "reform" their IP systems -- to match the US' own system -- for quite a while.
The TRIPS (Trade Related Aspects of Intellectual Property Rights) treaty, and GATT, are the main methods used to do this. The
FFII page on the treaty notes 'Article 27 has often been construed by patent lawyers to imply that patent claims must be allowed to extend to computer programs' (my emphasis).
FFII go on to make the case that this can be circumvented BTW; here's hoping, since all of
Europe has signed up to TRIPS AFAIK.
If you are a European and bothered by software patents, now is the time to write to (or even email) MEPs asking them to oppose this directive; it's the 'proposed software patentability directive as amended by JURI' (COM(2002)92 2002/0047). The letter should support the
FFII/Eurolinux and/or Green position.
1. Mrs AHERN, Nuala
Group of the Greens/European Free Alliance
2. Mr ANDREWS, Niall
Union for Europe of the Nations Group
3. Mrs BANOTTI, Mary Elizabeth
Group of the European People's Party (Christian Democrats) and European Democrats
4. Mr COLLINS, Gerard
Union for Europe of the Nations Group
5. Mr COX, Pat
Group of the European Liberal, Democrat and Reform Party
6. Mr CROWLEY, Brian
Union for Europe of the Nations Group
7. Mr CUSHNAHAN, John Walls
Group of the European People's Party (Christian Democrats) and European Democrats
8. Mr DE ROSSA, Proinsias
Group of the Party of European Socialists
9. Mrs DOYLE, Avril
Group of the European People's Party (Christian Democrats) and European Democrats
10. Mr FITZSIMONS, James (Jim)
Union for Europe of the Nations Group
11. Mr HYLAND, Liam
Union for Europe of the Nations Group
12. Mr McCARTIN, John Joseph
Group of the European People's Party (Christian Democrats) and European Democrats
13. Mrs McKENNA, Patricia
Group of the Greens/European Free Alliance
14. Mr O' NEACHTAIN, Sean
Union for Europe of the Nations Group
15. Mrs SCALLON, Dana Rosemary
Group of the European People's Party (Christian Democrats) and European Democrats
Please take the time to send them a letter, or even a mail. This really is a terrible proposal, and the last thing open source and small software developers need, is more software patents with an expanded range.
My $.02. Disclaimer: I'm one of the SA developers.
"The Corpus was Classified by SpamAssassin, for SpamAssassin", and "The Accuracy of the Test Subject's Corpus is Questionable":
No, this is incorrect. Firstly, he states that he used user feedback to reclassify FNs and FPs (p. 4).
The misunderstanding probably comes from p. 6, where he notes that he also ran SpamAssassin 2.63 over the "gold standard" corpus once it was complete, to verify his original classifications.
However, in addition to that, he states 'all subsequent disagreements between the gold standard and later runs were also manually adjudicated, and all runs were repeated with the updated gold standard. The results presented here are based on this revised standard, in which all cases of disagreement have been vetted manually.' So in other words, the "gold standard" should be as near as possible to 100% accurate, since all the tested filters and the human classification have "had a shot" at classifying every mail, and the human has had final say on every misclassification.
In other words, if any misclassifications remain in the "gold standard" corpus, every one of the tested filters agreed on that misclassification.
IMO, that's as good as a hand-classified corpus can get.
"old versions of software were used":
It's unrealistic to expect the author to use the most up-to-date versions of filters available by the time the paper is made available to the public. That's the difference between results and a paper -- it takes time to analyze results, write it up and come to valid conclusions, once the testing results are obtained. IMO, the author can't be faulted for spending some time on that end of things.
Given that, using 6-month old release versions of the software under test seems reasonable.
SpamAssassin 2.60, when new SpamAssassin rules were last added to a released ruleset, is 9 months old (released 2003-09-22); so logically, in testing against DSPAM 2.8 (released 2003-11-26), DSPAM should therefore have had the edge. ;)
"test started with untrained filters":
IMO, that's the real world. People don't start with fully-trained filters.
In addition, the graphs on pp. 15-20 show accuracy over the course of the entire 8 month period, so "post-training" accuracy can be viewed there.
"spam in the test is as old as 14 months":
Nope, he states (p. 4) that the corpus uses mail between August 2003 and March 2004.
"it should purge old data":
SpamAssassin purges its Bayes databases automatically, based on the age of messages in the corpus. We call it "expiry".
In that test, the "SA-Standard" dataset would be using this, so stating "Cormack did not perform any purge simulation at all" is not accurate. However, that would not have increased SpamAssassin's accuracy figures, since we have generally found that while it keeps the overhead of Bayes database sizes and memory down, it marginally reduces accuracy, instead of increasing it (at the default settings).
(Also worth noting that it can deal with being run from an en-masse check over a static corpus, as it uses the timestamp information in the Received headers rather than the current system time. So even if this test was run in the course of 4 hours, it'd still be an accurate simulation of what would happen in "real world" use over the course of 8 months.)
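A rough sketch of that idea, for the curious: expire tokens relative to the newest message timestamp seen, taken from the Received header, rather than relative to the wall clock. This is not SpamAssassin's actual expiry code; the `ExpiringTokens` class, the 90-day `max_age` and the header parsing are simplifying assumptions for illustration.

```python
from datetime import timedelta
from email import message_from_string
from email.utils import parsedate_to_datetime


def received_time(raw_message):
    """Return the timestamp in the topmost Received header, or None."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    # The date follows the final ';' in a Received header.
    date_part = received[0].rsplit(";", 1)[-1].strip()
    return parsedate_to_datetime(date_part)


class ExpiringTokens:
    """Track when each token was last seen, in message time."""

    def __init__(self, max_age=timedelta(days=90)):  # hypothetical setting
        self.max_age = max_age
        self.last_seen = {}  # token -> newest message time containing it
        self.newest = None   # newest message time seen overall

    def learn(self, raw_message, tokens):
        when = received_time(raw_message)
        if when is None:
            return
        for token in tokens:
            previous = self.last_seen.get(token)
            if previous is None or when > previous:
                self.last_seen[token] = when
        if self.newest is None or when > self.newest:
            self.newest = when

    def expire(self):
        """Drop tokens older than max_age relative to the newest message."""
        if self.newest is None:
            return 0
        cutoff = self.newest - self.max_age
        stale = [t for t, seen in self.last_seen.items() if seen < cutoff]
        for token in stale:
            del self.last_seen[token]
        return len(stale)
```

Because the cutoff is derived from message timestamps, replaying 8 months of mail in an afternoon expires the same tokens as a live 8-month run would.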
And finally, what Henry said in comment 9520473.
--j.
We've also had problems on the SpamAssassin Wiki.
Our solution has been to ensure that all changes are emailed to a mailing list, where we can monitor them and remove the spam links within minutes of their arrival.
An ideal solution: Google should define an attribute for the A tag, which indicates that a URL should not be used in computing PageRank. We could then modify our Wikis so that page links from Wikis are not included.
Same thing would work for weblog comment spamming, too.
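On the Wiki side, that change could be as small as the sketch below. The attribute used here, `rel="nofollow"`, is an assumption for illustration (the proposal above doesn't name one; this is the value search engines later documented for the purpose), and `render_link` and `wiki.example.org` are hypothetical names.

```python
from html import escape
from urllib.parse import urlparse


def render_link(url, text, own_host="wiki.example.org"):
    """Render an <a> tag, marking links to other hosts as not-to-be-ranked."""
    host = urlparse(url).hostname or ""
    external = host != "" and host != own_host
    # Only user-supplied links pointing off-site get the attribute.
    attr = ' rel="nofollow"' if external else ""
    return f'<a href="{escape(url, quote=True)}"{attr}>{escape(text)}</a>'
```

Links within the Wiki itself stay untouched, so internal pages still get whatever ranking benefit they had before.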
I took a look -- it's crazy.
One group seems to have written this 'Warp Pipe' tool, using Sourceforge infrastructure, declaring it under a BSD license (as far as I can make out from the comments) when they set up the SF project.
Another group then started working off that (supposedly open-source) codebase. The first group are not happy about this, and have decided it's now proprietary and want to remove rights to use that code.
(Either that, or they think users of a BSD-licensed package need 'express written consent of Warp Pipe to repackage or redistribute in any way'.)
Apparently, they didn't *actually* specify license terms in the source; but they must have claimed an open-source license in order to use Sourceforge. So at some point, they were a little 'unclear' about the license.
All very amateurish...
BTW, the sf.net project page is still there: here's a link: http://sourceforge.net/projects/cubeonline23/
And CVS: http://cvs.sourceforge.net/viewcvs.py/cubeonline23/WarpPipe/
doh. 'apply often and apply for anything' was what I meant to say. (mental note: use preview in future!)
IMO, IBM are doing the right thing in many areas, but their patent policy (apply and apply for anything) seems to be out of control.
Yeah -- and I, as an Irishman, am well ashamed of that.
But I do like the idea of parallelization of the boot scripts, and starting X a whole lot earlier (like before the daemons are all started); I hacked up the init scripts to do this on my desktop linux machine a few years ago, and on Solaris and SunOS machines before that, and it was great for boot time.
Richard Gooch's need(8) and provide(8) tools look like a fantastic way to do this simply, comprehensibly, and without rewriting everything in a new language. That's available here, and that page notes that it should be in versions of init in util-linux since 2.10q.
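To make the idea concrete, here's a rough sketch of dependency-driven parallel startup. It is not how need(8) and provide(8) are implemented, just the general technique: each service declares what it needs, and anything whose needs are satisfied gets started right away. The service names, the NEEDS table and the start() stub are all made up for illustration.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical services: each maps to the set of services it needs first.
NEEDS = {
    "syslog": set(),
    "network": {"syslog"},
    "xserver": {"syslog"},  # X needs very little, so it can start early
    "sshd": {"network"},
    "nfs": {"network"},
}


def start(service):
    # Stand-in for running the real init script for this service.
    return service


def boot(needs, workers=4):
    """Start services in parallel, each only after everything it needs."""
    sorter = TopologicalSorter(needs)
    sorter.prepare()
    started = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = set()
        while sorter.is_active():
            # Launch everything whose dependencies have all finished.
            for service in sorter.get_ready():
                pending.add(pool.submit(start, service))
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                name = future.result()
                started.append(name)
                sorter.done(name)  # may unblock further services
    return started


if __name__ == "__main__":
    print(boot(NEEDS))
```

The point is that "xserver" only waits for "syslog" here, so it doesn't sit behind the network daemons the way it would in a strictly sequential rc script.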
Come on. Is this really a good argument? Why would you be against labelling a foodstuff as to its origin and provenance?
Sorry, I don't agree. IMO, the more info a consumer has on where their food comes from, how it was grown, what pesticides were used, whether it may contain GM pollen, how it was treated after picking, etc. -- the better.
It's simply called informing the consumer. Then the consumer can use their judgement instead of trusting some big, faceless organisation who Knows What's Good For You.
And then interested parties can persuade the consumers that GM is safe, and eating the tomatoes with the GM sticker is fine. That's OK, that makes sense. But don't use this 'information is bad' line, it's crap.
PS: re GM patents, etc. IMO the GM industry at the moment is acting like the RIAA; there's lots of good ways to use GM, but they're focused on the short term gain -- make $$$$ fast.
At least this law allows ISPs to prosecute spammers, and it does not block class action suits from multiple spam recipients (AFAICS). Also, damages of $500 per message are a lot better than the proposed Texas state law's puny $10 per message.
But consider these facts: there's 23 million small businesses in the US. That means a lot of "I would like to opt out" mails you'll be sending. Multiply that by however many possible addresses you can receive mail at: foo@domain1.com, foo@[211.11.22.34], foo%domain1.com@domain1.com, root@domain1.com, postmaster@domain1.com, foo@forwardingservice.net, foo@perl.org, foo@users.sourceforge.net, etc. etc. etc.
Then there's the "tagged addressing" concept, where you "tag" the addresses you give out with additional text to identify who you gave it to, e.g. foo+amazon@domain1.com, foo+slashdot@domain1.com. Each of those is a different "e-mail address".
Better get those typing fingers in shape :(
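For anyone who hasn't seen tagged addressing, a quick sketch of how the tags are built and taken apart, plus the back-of-the-envelope arithmetic. The function names are made up, the '+' separator is only a common convention (whether it works depends on how your mail server is set up), and the figure of 10 addresses per person is just an illustrative guess.

```python
def tagged(address, tag):
    """Insert +tag before the @, e.g. foo@domain1.com -> foo+amazon@domain1.com."""
    local, domain = address.rsplit("@", 1)
    return f"{local}+{tag}@{domain}"


def split_tag(address):
    """Return (base_address, tag); tag is None when the address is untagged."""
    local, domain = address.rsplit("@", 1)
    base, sep, tag = local.partition("+")
    return f"{base}@{domain}", (tag if sep else None)


def opt_out_mails(senders, addresses):
    """One opt-out mail per sender per address you can receive mail at."""
    return senders * addresses


if __name__ == "__main__":
    print(tagged("foo@domain1.com", "amazon"))
    print(split_tag("foo+slashdot@domain1.com"))
    # 23 million small businesses, assuming 10 addresses (illustrative only).
    print(opt_out_mails(23_000_000, 10))
```

Every new tag you hand out adds another address to the second factor, which is why opt-out scales so badly for the recipient.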
See also why Biotech patents are patently absurd. As members of the WTO, and signatories to TRIPS, these countries really don't have a choice; they'd be in breach of the TRIPS treaty if they do not ratify these laws.
There is one, for exactly this reason -- the SpamAssassin public corpus. I made it available for developers of spam tools to compare effectiveness using a good, recent corpus from 1 person's mail feed (as much as that was possible).
Here's the pertinent part of the README:
One reason this has come up as an issue is that the US (via the WTO) has been applying pressure to countries around the world to "reform" their IP systems -- to match the US' own system -- for quite a while.
The TRIPS (Trade Related Aspects of Intellectual Property Rights) treaty, and GATT, are the main methods used to do this. The FFII page on the treaty notes 'Article 27 has often been construed by patent lawyers to imply that patent claims must be allowed to extend to computer programs' (my emphasis).
FFII go on to make the case that this can be circumvented BTW; here's hoping, since all of Europe has signed up to TRIPS AFAIK.
1. Mrs AHERN, Nuala
   Group of the Greens/European Free Alliance
2. Mr ANDREWS, Niall
   Union for Europe of the Nations Group
3. Mrs BANOTTI, Mary Elizabeth
   Group of the European People's Party (Christian Democrats) and European Democrats
4. Mr COLLINS, Gerard
   Union for Europe of the Nations Group
5. Mr COX, Pat
   Group of the European Liberal, Democrat and Reform Party
6. Mr CROWLEY, Brian
   Union for Europe of the Nations Group
7. Mr CUSHNAHAN, John Walls
   Group of the European People's Party (Christian Democrats) and European Democrats
8. Mr DE ROSSA, Proinsias
   Group of the Party of European Socialists
9. Mrs DOYLE, Avril
   Group of the European People's Party (Christian Democrats) and European Democrats
10. Mr FITZSIMONS, James (Jim)
    Union for Europe of the Nations Group
11. Mr HYLAND, Liam
    Union for Europe of the Nations Group
12. Mr McCARTIN, John Joseph
    Group of the European People's Party (Christian Democrats) and European Democrats
13. Mrs McKENNA, Patricia
    Group of the Greens/European Free Alliance
14. Mr O' NEACHTAIN, Sean
    Union for Europe of the Nations Group
15. Mrs SCALLON, Dana Rosemary
    Group of the European People's Party (Christian Democrats) and European Democrats
Please take the time to send them a letter, or even a mail. This really is a terrible proposal, and the last thing open source and small software developers need is more software patents with an expanded range.
'ghosts' is a command which has been included with perl in the 'eg' directory since at least 4.036. It does this effectively, allowing you to do
gsh somemachines somecommand
or
gcp somefile somemachines:/etc/newfile
Worked great, last time I had to admin a large network (about 5 years ago ;). *EXTREMELY* simple, too.
http://outflux.net/unix/software/gsh/ seems to be an updated version of this tool.
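If you just want the flavour of it, here's a minimal sketch of the same idea: pick hosts by group, then run one command on each over ssh. This is not gsh's code, and the HOSTS table below is a made-up format, not the real ghosts file syntax; the host names and function names are hypothetical too.

```python
import subprocess

# Hypothetical host table: host name -> set of group tags.
# This is NOT the real ghosts file syntax, just an illustration.
HOSTS = {
    "alpha": {"web", "linux"},
    "beta": {"db", "linux"},
    "gamma": {"web", "solaris"},
}


def select_hosts(hosts, group):
    """Return the sorted host names that carry the group tag or match by name."""
    return sorted(
        name for name, tags in hosts.items() if group in tags or name == group
    )


def build_commands(hosts, group, command):
    """Build one 'ssh host command' argument list per selected host."""
    return [["ssh", host, command] for host in select_hosts(hosts, group)]


def run_everywhere(hosts, group, command):
    """Run the command on each selected host in turn; return exit codes by host."""
    return {
        argv[1]: subprocess.run(argv).returncode
        for argv in build_commands(hosts, group, command)
    }


if __name__ == "__main__":
    # Only prints the commands; call run_everywhere() to actually execute them.
    for argv in build_commands(HOSTS, "web", "uptime"):
        print(" ".join(argv))
```

It assumes key-based ssh logins are already set up on every host, otherwise you'll be typing a lot of passwords.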