I have a similar problem - when I go to bed late I tend to just smack the alarm clock in the morning and continue sleeping. I don't even remember a thing.
A very quick and dirty solution for me has been to put the alarm clock across the room so that I have to physically get up to turn it off. That forces my body to wake up so that I won't go right back to bed. HTH
Get your facts straight - H-1B workers work in THIS country and PAY TAXES JUST LIKE YOU DO. And yes there is a lower limit on how much they can be paid - it's called the prevailing wage. And yes companies are supposed to make sure that an American citizen cannot fill the position.
For your info the leading Chinese manufacturer Legend is one of the biggest in the world. And it's only going to get bigger. Check out the population of China vs that of the US sometime.
There have been a number of people recently in academia and prestigious labs who fabricated evidence, and people believed them for years until they were discovered. Now I am not saying that this guy did... Also, what if YOU are the poor and infirm, huh?
I disagree - others have posted a number of reasons why a million bucks actually mean something. And then there is the precedent that this sets. Once a movement gains momentum other donations are sure to follow.
Your arguments don't hold water. If broadband were so commonplace that PC manufacturers started selling computers with NICs rather than modems, the "normal" user would be more than happy to use all of that excess bandwidth they "don't need or want". And for all of that to happen the prices need to come down. And they won't until telecoms are in the driving seat.
DRM or not, the fact of life is that paid "content" providers will not take off until they can actually offer something better than the mainstream pimpled-teen crapola that is out there. Yah I have checked Pressplay and they have precisely 2% of the over 100 bands that I like to listen to. Yes, I am willing to pay to download MP3's but I want unlimited download for a set monthly fee and no, I don't want the MP3's to expire after 3 months so I can't play them any more, thank you very much.
Must be tough to be a RIAA/MPAA/Label Exec - loosen those ties, dudes, they're restricting the oxygen supply to your brains!
I would even be willing to pay the full price of CDs if I knew the money was going into the hands of the artist and not into some phat exec's pockets. Cut out the middle-man, yeah - cry me a river because technology is making you obsolete.
For your info WorldCom did not go bankrupt because their customers demanded more bandwidth. This is the stupidest economic statement I have heard in a while. They went bankrupt because of sloppy management, corrupted accounting practices, and greed. Can't wait to see which one is next?!
It is widely acknowledged that Jonathan Swift predicted the existence of the two satellites of Mars, Phobos and Deimos. Although the two moons were not discovered until 1877, Swift had written in 1726 (in Gulliver's Travels, Part III, chapter 3) that the inhabitants of Laputa had made important astronomical observations of 10,000 fixed stars and of the two satellites of Mars, one orbiting with a period of 10 hours and the other with a period of 21.5 hours. This is pretty spooky for a prediction!
Also, Sir Arthur C. Clarke is widely credited with the idea of the geostationary communications satellite.
Win 95 sure did work flawlessly. It was a bummer when 98 came out. My BSOD count went through the roof, from 5/day to 30/day.
The Japanese Earth Simulator went live in May 2002 - see http://www.es.jamstec.go.jp/esc/eng/ES/birth.html.
Try xCache - I've used it before and it's quite good: http://www.xcache.com/home/default.asp
Many Fortune 500 companies use it.
... most Slashdot readers are safe. Smart forever, mooohaha ...
Originally posted on kuro5hin.org
By KWillets
Sun Jan 26th, 2003 at 07:03:35 AM EST
While many people see gzip as a compression tool, it also makes a credible spam filter. Here's how.
I was reading through a bioinformatics book the other day, and was reminded of a useful shortcut for comparing a text against various corpora. A number of researchers have simply fed DNA sequence data into the popular Ziv-Lempel compression algorithm, to see how much redundancy it contains.
Loosely speaking, the LZ (Zip) and the related gzip compression algorithms look for repeated strings within a text, and replace each repeat with a reference to the first occurrence. The compression ratio achieved therefore measures how many repeated fragments, words or phrases occur in the text.
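As a rough illustration of that idea, here is a minimal shell sketch. The helper name compressed_percent is made up for this example and is not part of gzip; it assumes the file is non-empty. It reports a file's gzipped size as a percentage of its original size, so a lower number means more repetition was found.

```shell
# compressed_percent FILE  (hypothetical helper, for illustration only)
# Prints the gzipped size of FILE as a whole-number percentage of its
# original size. Assumes FILE is non-empty (otherwise it divides by zero).
compressed_percent() {
    original=$(wc -c < "$1")
    compressed=$(gzip -c < "$1" | wc -c)
    echo $((compressed * 100 / original))
}
```

Calling it on a highly repetitive text should print a much smaller number than calling it on text with little repetition.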
A related technique allows us to measure how much a given "test" text has in common with a corpus of possibly similar documents. If we concatenate the corpus and the test text, and gzip them together, the test text will get a better compression ratio if it has more fragments, words, or phrases in common with the corpus, and a worse ratio if it is dissimilar. Since the LZ algorithm scans back through the preceding input for repetitions (in gzip's case only the most recent 32 KB, so mostly the tail of the corpus), it tends to map pieces of the test text to previous occurrences in the corpus, thereby achieving a high "appended compression ratio" if the test text is similar to what it's appended to.
In this case, we wish to compare an incoming email message against two possible corpora: spam and non-spam (ham). If we maintain archives of both, we can compare the appended compression ratios relative to each, to judge how similar a new message is to spam or ham.
As a simple test, I downloaded some sample spam and ham from the Spamassassin archive. I removed headers from the messages (to focus on message text only), and created spam and ham "training sets" 1-2 megabytes in size. I then tested spam and ham messages not in the training sets for their compressed sizes when appended.
Compression was measured as follows:
$ cat spam.txt new-message-body.txt |gzip - |wc -c
$ cat ham.txt new-message-body.txt |gzip - |wc -c
The sizes printed were compared with the compressed sizes of spam.txt and ham.txt alone (without new-message-body.txt appended), to see how many additional bytes the new message body consumed.
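The subtraction described above can be sketched as a small shell function. The name appended_cost is an assumption made for this example, not something from the original test; it simply wraps the two gzip measurements and prints their difference.

```shell
# appended_cost CORPUS MESSAGE  (hypothetical helper, for illustration only)
# Prints how many bytes MESSAGE adds to the gzipped size of CORPUS.
appended_cost() {
    corpus=$1
    message=$2
    # Compressed size of the corpus on its own.
    base=$(gzip -c < "$corpus" | wc -c)
    # Compressed size of the corpus with the message appended.
    both=$(cat "$corpus" "$message" | gzip -c | wc -c)
    echo $((both - base))
}
```

For example, `appended_cost spam.txt new-message-body.txt` and `appended_cost ham.txt new-message-body.txt` give the two numbers to compare.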
The results for "ham" messages were the most dramatic. The average compressed size of a ham message appended to spam was 38% higher than when appended to other ham. For spam messages, the same comparison yielded a compressed size 6% smaller when appended to spam vs. ham, so in both cases, compressing a message with others of its kind yielded a smaller file, on average.
Individual results were also quite clear: while some spam messages compressed slightly better when mixed with ham, there was still a margin of 15% or more between the most spamlike ham and the most hamlike spam. I would put the threshold somewhere around 110%: if a message's size when gzipped with spam is less than 110% of its size when gzipped with ham, it's probably spam.
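The 110% rule of thumb could be applied along these lines. This is only a sketch: the function names appended_cost and classify are invented for illustration, and the rule is written in integer arithmetic (spam cost * 100 < ham cost * 110) to avoid floating point in the shell.

```shell
# Hypothetical helpers, for illustration only.

# Prints how many bytes MESSAGE adds to the gzipped size of CORPUS.
appended_cost() {
    base=$(gzip -c < "$1" | wc -c)
    both=$(cat "$1" "$2" | gzip -c | wc -c)
    echo $((both - base))
}

# classify SPAM_CORPUS HAM_CORPUS MESSAGE
# Prints "spam" if the message's appended size against the spam corpus is
# less than 110% of its appended size against the ham corpus, else "ham".
classify() {
    spam_cost=$(appended_cost "$1" "$3")
    ham_cost=$(appended_cost "$2" "$3")
    if [ $((spam_cost * 100)) -lt $((ham_cost * 110)) ]; then
        echo spam
    else
        echo ham
    fi
}
```

Usage would be `classify spam.txt ham.txt new-message-body.txt`. The 110 figure is just the rough threshold suggested above and would need tuning against real mail.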
In conclusion, gzip is a fairly blunt instrument for spam detection, but the effectiveness of its relatively blind repetition-finding is worth noting. The current fad among spam filters is word-counting, with various statistical heuristics applied to the results. Algorithms like LZ and gzip go beyond word matching, finding entire phrases and paragraphs of repetition, but do not attempt to measure their statistical significance. More sophisticated approaches, which combine phrase matching with statistical analysis, may be more effective.
Dork - I use Kylix in a live site. And no I am not going to give you the URL.
... a Beowulf cluster of those. :) Just hadda do it.
1. Guilt
2. Altruism
3. Pure idealism
Aagh!
Isn't this going to be discrimination against mental and heart patients?
Since when is CDNow a peer-to-peer network?
Bulky, grotesque piece of overhyped fad electronics. Not to mention the atrocious flash animation.
You must be frickin rich, big daddy to be spending $30B of *your* tax money.
yeah and
4) Buy a dictionary and a grammar. Learn to spell.
>>Are you ready for the answer? ... 50%
50% NOT - because probability is based on the facts - how a certain entity has behaved in the past and not on how *you* think it should behave.
What do you work as to be able to afford hi-fi in college? Gigolo? Looks like mommy and daddy send a check after all ...