Domain: pobox.com
Stories and comments across the archive that link to pobox.com.
Stories · 145
-
Righthaven Loses In Colorado; Abused the Copyright Act
First time accepted submitter djl4570 writes "Federal Judge John Kane ruled that Righthaven LLC of Las Vegas lacked standing to file copyright infringement lawsuits in Colorado under its lawsuit contract with the Denver Post and abused the Copyright Act in doing so. Righthaven was ordered to reimburse the defendant his costs including reasonable attorneys fees." -
Why Is It So Difficult To Allow Cross-Platform Play?
cookiej writes "I just got the most recent version of the Madden franchise ('10) for the PS3. Can somebody explain to me why EA has separate networks for the different platforms, only allowing players to compete with people using the same console? Back in the day, there were large discrepancies between the consoles, but these days it seems like the Xbox and the PS3 are at least near the same level. After so many releases for this franchise, they've got to have a fairly standardized protocol for networking; it seems arbitrary not to let them compete. Or am I just missing something obvious? Is it just a matter of Xbox Live and the PlayStation Network not working together?" -
California Student Arrested For Console Hacking
jhoger writes "Matthew Crippen was arrested yesterday for hacking game consoles (for profit) in violation of the Digital Millennium Copyright Act. He was released on a $5,000 bond, but faces up to 10 years in prison. This is terribly disturbing to me; a man could lose 10 years of his freedom for providing the service of altering hardware. He could well lose much of his freedom for providing a modicum of it to others. There is no piracy going on, necessarily — the games a modified console could run may simply not be signed by the vendor. It's much like jailbreaking an iPhone. But it seems because he is disabling a 'circumvention device' it is a criminal issue. Guess it's time to kick a few dollars over to the EFF." -
Recovering Moldy Electronics?
cookiej writes "We just completed having our basement gutted and our house decontaminated from mold. The finished basement is gone, my office floor has been removed as well as 24' of drywall around the base of the room. So, we had a full home theater downstairs along with a couple of computers in the electronics closet that were completely immersed (rainwater, not sewage). We moved them to a sheltered area outside and covered them with a plastic tarp. Since the electronics were off when the water hit them, 1) do I have a chance of recovering them? 2) If so, is there a way to clean them with some sort of liquid bath that would not damage the electronics? and 3) I don't want to bring moldy pieces back in the clean house. How could I decontaminate the electronics themselves, pre-bath? Not looking to save the speakers, just the amp, DirecTV box, video switch, etc. Thanks for any help, here, Slashdot." Read on for more details of this reader's plight.
Early last month, we had about 10" of rain in the course of two hours. Many houses in our neighborhood were damaged. We had rainwater coming in our back door and cascading down the basement steps. We have two sump pumps that weren't keeping up (and of course, no battery backup) and as the water rose in the basement, it was getting dangerously close to the breaker panel. So I made the hard decision to shut down the main power and we got the hell out.
The water reached about 6' in the basement before it drained out. Once we got back, we could not move fast enough to get all the debris out before mold set in and boy did it.
Since we are not in a flood plain, our insurance for this is woefully inadequate. While I would love to just go out and buy replacements, there are far more pressing things to re-buy (washer/dryer, furnace, water heater, etc.) and if there is a chance I can salvage some of this it might be a nice change of luck. -
Advanced Excel for Scientific Data Analysis
cgjherr writes "If the recent financial meltdown has left you wondering, 'When does the exponential decay function stop?' then I have the book for you. Advanced Excel for Scientific Data Analysis is the kind of book that only comes along every twenty years. A tome so densely packed with scientific and mathematical formulas that it almost dares you to try and understand it all. A "For Dummies" book starts with a gentle introduction to the technology. This is more like a "for Mentats" book. It assumes that you know Excel very well. The first chapter alone will have you in awe as you see the author turn the lowly Excel into something that rivals Mathematica using VBA, brains, and a heaping helping of fortitude." Read on for the rest of Jack's review. Advanced Excel for Scientific Data Analysis author Robert de Levie pages 700 publisher Oxford Press rating 9 reviewer Jack Herrington ISBN 9780195370225 summary Use Excel for high end scientific data analysis akin to Mathematica
When I first opened this book my mouth just dropped. It had been years since I had seen a book typeset using LaTeX. But in an instant it made sense, as the book is crammed with the kind of equations that would have been a nightmare to build with any other tools. Chapter after chapter has everything a really smart person needs to do curve fitting, statistical measures, differential equations, time-frequency analysis. But don't expect a play by play here. You will get the equations, set within a few dense paragraphs, with maybe a spreadsheet and a chart or two to show the results.
The first chapter concentrates on getting the most out of Excel as a tool. All the chapters that follow dig into specific data analysis techniques. Chapters two, three and four are on least squares. Chapters five and six cover analysis in the time domain, including Fourier transforms. Chapter seven covers differential equations. Chapter eight returns to Excel by digging deeper into macros, which leads into chapter nine, where we dig deeper into basic mathematical operations. Chapter ten covers matrix operations. And chapter eleven wraps it all up by giving you some spreadsheet best practices.
In University style there are also some exercises that you can do along the way if you want to tweak your brain pan a little more. To amuse myself I tried a few and I believe the book would have assessed my attempts 'wanting' if it had a voice to tell me.
Where most books like this would have several authors, this book has just one: Robert de Levie. This means that the tone, style and quality of the book are consistent throughout, a fact that you will come to appreciate as the book wades into ever deeper data analysis concepts as the chapters roll on.
Though I would have preferred the book to have code samples in C#, I understand that the language of Excel is VBA and I guess I have to live with that. Thankfully VBA has come a long way, and if you are so inclined it would likely be easy to translate the code into C#, Java, or whatever else you like.
The fact that one person wrote the book left me wondering, "Who is this guy?" In my mind's eye I kind of figured he would look like one of those pulsing brain guys from Star Trek. Turns out he is a professor at Bowdoin College. And his fields of study include ionic equilibria, electrochemical kinetics, electrochemical oscillators, stochastic processes, and a whole lot more stuff that almost seems made up to sound impressive.
When this book isn't serving as an amazing reference for Excel, scientific problem solving, or just insane equations, it serves other purposes as well. It's a handy portable IQ test, as the count of pages you can grind through in one sitting, plus 90, is roughly your intelligence quotient. And if you fail at that you can always put a copy of the book, along with the Orange Bible, under your pillow and try to osmose your way to becoming the Kwisatz Haderach.
In all seriousness, this is a great book. It represents the kind of in-depth work and research we used to see in books that came out twenty years ago. Robert is to be applauded for his work. This is an excellent resource for anyone looking to do scientific data analysis who was unaware of the powerful capabilities that Excel provides, which are likely waiting just one Start menu click away.
The book is not without fault. I would have preferred that it had been in color, or at least have one color section to show some of the more impressive visualizations that I'm sure would look great in color. In addition the index is silly short for a book that clocks in at 700 pages. But those are only minor quibbles for what is all-in-all an amazing piece of work.
You can purchase Advanced Excel for Scientific Data Analysis from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Inside Steve's Brain
cgjherr writes "There are management insights to be learned from Steve Jobs? You're nuts. The only thing you can learn from Jobs is how to drive people nuts. Or at least, that's what I thought up until I read 'Inside Steve's Brain.' Turns out, there are things to learn from Steve's obsessive perfectionism. Certainly I wouldn't copy every aspect of Jobs' management style. Doing that will likely get you fired, or at least reprimanded, in most companies. But there is some stuff to be learned from how Jobs designs products and analyses the market, and that's the view that Leander Kahney gives us access to." Keep reading for the rest of Jack's review. Inside Steve's Brain author Leander Kahney pages 304 publisher Portfolio rating 10 reviewer Jack Herrington ISBN 1591841984 summary A look inside Steve Jobs' management style at Apple and Pixar
Chapter one covers in some detail Jobs and his relationship with Apple, both before he left and after he came back. Kahney talks about exactly what steps Steve took to revive the company and restore the morale of the employees. As with all of the chapters, it ends with a summary of what Leander thinks are the takeaways from each of the anecdotes.
Chapters two and three, Despotism and Perfectionism, talk about the two traits that are most often associated with Steve. In Despotism Leander offers some stories about just how in control Steve is of every aspect of development at Apple. And Perfectionism, well, that's self explanatory. Though you'll probably find some things you don't know about exactly where Jobs gets his design and style influences.
Chapters four and five, Elitism and Passion, dig into how Jobs cultivates that magical Apple touch. He works his people inside the company and inculcates a sense of pride and perfectionism in the Apple brand. And he works the customer base through innovative advertising that promotes the ideals and the brand, even when the product was inferior when he first took over. In the short Passion chapter Leander talks about how Jobs builds a wider sense of world-changing responsibility in the company and through his products.
The sixth chapter, Inventive Spirit, cites several examples of how Jobs used his relentless management style to refine products and, most interestingly, the Apple Store. He went so far as to develop a prototype store in a warehouse at the edge of the Apple campus, and he was willing to completely scrap the design of the store when it wasn't exactly right, costing him months of time.
The seventh chapter provides a complete case study on the development of the iPod and Jobs' role in that effort. It's intriguing to see how, while there had been MP3 players in the market already, Steve and his team were able to stand back and look at the larger picture of the iPod in its complete product ecology.
The final chapter, the Whole Widget, covers what I think is the most important lesson to be learned from Apple; that they take care of the entire product cycle. Where other vendors take care of just one piece, the hardware, the software, the network, Apple takes care of everything. If there is a problem with an Apple product you take it to the Apple store and they fix it.
Leander Kahney is the same guy who wrote "The Cult of Mac" and "The Cult of iPod". He knows his way around Apple. He has a clear grasp of the history of Apple in the large and the evolution of their key products. His insights prove that he also has a good working relationship with some of the people on the ground in Apple.
There are certainly some interesting anecdotes about Steve in this book. But it would be a mistake to look at the book as just some psychoanalysis of one man. Steve doesn't make all of the products himself. The developers and designers at Apple do. It's the culture of the company that Jobs controls, but the people who work there are motivated by it and produce within it. What you really learn here is just how passionate these folks are about finely tuning everything about their products, their services, the whole deal. It's inspiring.
You can purchase Inside Steve's Brain from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
The Myths of Innovation
cgjherr writes "Ah, the technology history book. Normally I'm not a fan. The writing is aloof and dry. The topics are vague, the history misinterpreted, and the lessons presented too vague to be applicable. And don't get me started on the illustrations, which are all too often pyramids with the authors perched at the top looking down on the lowly reader at the base. Thankfully, this book, "The Myths of Innovation," breaks all of these rules. It's an engaging, fun and quick read. The history is interesting, and the lessons presented are practical. I particularly like the author's tone. It's witty and light. Which makes this a very fast read, one that leaves you wanting even more by the end." Read below for the rest of Jack's review. The Myths of Innovation author Scott Berkun pages 176 publisher O'Reilly rating Excellent reviewer Jack Herrington ISBN 0596527055 summary The history of innovation with lessons learned
The Myths of Innovation is about how innovation happens in the real world in companies, universities and garages around the company. The first two chapters really draw the reader in by showing the twin fallacies of the epiphany moment and the historically clean line of innovation. Learning that innovation doesn't just come as a flash, and that lots of successes have come out of copious failure, encourages us to try to innovate, and to keep trying even when we believe we have failed.
This short book (147 pages of content) is presented in ten short chapters. The first two show you how anyone can be an innovator. You can think of those as the debunking chapters. The third chapter is where the author starts helping you to build some techniques to innovate. He presents some reasonable methods to spur innovation and shows examples from Apple, Google, Edison, Craigslist and more.
In chapter four he shows how to overcome people's fears of innovation and the common problems with the adoption of new technologies. Chapter five, "the lone innovator", debunks the legend of, well, the lone innovator. It sounds good, and plays into our noble story of the hero, but it's not common in reality. Chapter six talks about ideas and surveys where innovators have found the ideas that they start out with. Of course, where you start is often not where you end, but that's ok, since innovation is a lot more about failure than it is about success.
Chapter seven covers something I think most of us can relate to, which is that managers aren't often the innovators. Chapter eight talks about how we believe that the "best ideas always win" but that's least often the case. This sounds pessimistic, but it's actually an interesting study in how the biggest product with the most features isn't always the best for the customer. Chapter nine, "problems and solutions", talks about framing problems to constrain the creativity and innovation. The final chapter, "innovation is always good", is at the same time the most amusing and disturbing. It covers innovations from the automobile to DDT and presents that innovation, no matter what, is always good. Agree or disagree, the points are well presented.
As I say I really enjoyed this book. It's an easy read that is hard to put down. What's more it's really motivating. After reading this book you will want to dig right back into those crazy ideas lurking around in the back of your mind and give them another shot. With this book, you will have a few more tools at your disposal to turn your ideas into reality.
You can purchase The Myths of Innovation from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Mars Rovers' Software Upgraded
cheros writes to note the news that NASA is upgrading the software in the Mars rovers to make them smarter in a number of ways. From the article: "The unexpected longevity of Spirit and Opportunity is giving the space agency a chance to field-test on Mars some new capabilities useful both to these missions and future rovers. Spirit will begin its fourth year on Mars on Jan. 3 (PST); Opportunity on Jan. 24. In addition to their continuing scientific observations, they are now testing four new skills included in revised flight software uploaded to their onboard computers." -
Fedora Project to Help Revitalize RPM
-=Moridin=- writes "The Fedora Project has announced plans to revitalize RPM, the package manager used by many Linux distros. According to the announcement, 'Job #1 is to take the current RPM codebase and clean it up, and in doing so work with all the other people and groups who rely on RPM to build a first-rate upstream project.' For more information, see the RPM web site and the new wiki-based RPM FAQ. RPM's upstream development has been a thorny issue ever since Jeff Johnson, the original maintainer of RPM, left Red Hat." -
Google De-indexes Talk.Origins, Won't Say Why UPDATED
J. J. Ramsey writes "Talk.Origins is an archive with thousands of pages exposing creationist pseudoscience. Rather mysteriously, Google pulled the plug on its search engine, giving only the vague reason: 'No pages from your site are currently included in Google's index due to violations of the webmaster guidelines.' This was apparently triggered by a recent cracking of the site that added 'hidden links to non-topical sites,' but Google won't say just what the violations were. Talk.Origins webmaster Wesley R. Elsberry believes that this Google policy harms honest webmasters." From the article: "My mission, whether I liked it or not, was to find and fix whatever problem the [Talk.Origins Archive] might have, with no guidance as to what the problem was and nothing at all about where to start looking... I was extremely lucky. The damage to my site was limited and in the first place that I happened to look. Other honest webmasters might not be so lucky. They may have to undertake an arduous process of vetting pages, essentially having to second-guess the mind of the cracker in trying to locate a problem that Google knows the exact location of." Thanks to an alert reader who sent in Matt's blog posting about how Google handles hacked sites. -
IBM's Counterclaim 10 Outlines 5 Ways SCO's Wrong
ColonelZen writes "My article at IPW reads: But, however slowly, the wheels of justice do grind on. The discovery phase of SCO v. IBM is now complete, and as per the court's schedule the time to raise Summary Judgment issues is now. And IBM has indeed raised them ... such that it is very possible that all of SCO's claims against IBM could wind up dismissed piecemeal in those motions. ... Yesterday, IBM's redacted memo in support of CC10 hit Pacer. ... This is 102 pages detailing five independent but overlapping, direct and powerfully detailed reasons why SCO's claims of Linux infringement against its code are nonsense." -
Selecting Against Experience - Do Employers Know?
IBitOBear asks: "A couple days ago I did 'the interview loop' at that leading online retailer. Over the course of six hours I was repeatedly introduced to a guy in his early twenties, who would then ask me to write out code on a white-board for a problem that you might find in the study guide for a 200-level computer science class. I have 20 years of experience in programming and systems design. And in several cases the interviewers were vague, semantically incorrect, or self-contradictory. Interviewer blunders included not understanding that non-normal forms in databases -can be- more correct or efficient when the domain of the data is extremely limited, or that choosing a leader among N candidates -is- a byzantine agreement problem. In short, the loop would have been perfect to weed out some guy getting his first job fresh out of school, but it definitely exerted selection pressure towards excluding experienced candidates. So employers, what are you doing to make sure that you are not culling out candidates with the low-ball? Job seekers, what do you do when you find yourself trapped in a sophomore study group?" -
China Prepares to Launch Alternate Internet
Netfree writes "The Chinese government has announced plans to launch an alternate Internet root system with new Chinese character domains for dot-com and dot-net. This may mean that Chinese Internet users will no longer rely on ICANN, the U.S.-backed domain name administrator, and, as one commentator notes, could be the beginning of the end of the globally interoperable Internet." -
Ending Spam
Shalendra Chhabra writes "Jonathan Zdziarski has been fighting spam since before the first MIT spam conference in 2003, and has now released a full-on technical book, Ending Spam, on spam filtering. Ending Spam covers how the current and near-future crop of heuristic and statistical filters actually work under the hood, and how you can most effectively use such filters to protect your inbox." Read on for the rest of Chhabra's review. Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification author Jonathan A. Zdziarski pages 312 publisher No Starch Press rating 8 reviewer Shalendra Chhabra ISBN 1593270526 summary Very Good Book Covering Statistical Models and Techniques Implemented in Current Spam Filters
Spam (unsolicited commercial email) and phishing (fraudulent emails) are causing losses of billions of dollars to businesses. Many initiatives are currently underway for fighting this challenge. On the legal front, a Virginia court recently sentenced a prolific spammer, Jeremy Jaynes, to nine years in prison, and a Nigerian court sentenced a woman to two and a half years for phishing. Michigan and Utah have both passed laws creating "do-not-contact" registries in July/August 2005, covering e-mail addresses, instant messaging addresses and telephone numbers. Technical initiatives to fight spam include server- or client-side spam filtering, using Lists (Blacklists, Whitelists, Greylists), Email Authentication Standards (IIM, DK, DKIM, SPF, SenderID), and emerging sender reputation and accreditation services.
Ending Spam is the first book explaining the fine details of the theoretical models and machine-learning algorithms implemented in these filters. The book is divided into three parts: introduction to spam filtering, fundamentals of statistical filtering, and advanced concepts of statistical filtering.
The first section of the book discusses the history of spam, spam kings, different approaches for fighting spam such as blacklisting, whitelisting, heuristic filtering, challenge response, throttling, collaborative filtering, Authenticated SMTP, Sender Policy Framework and SenderID, spammer fingerprinting, etc. However, the author omitted any mention of locality-sensitive hash functions (such as Nilsimsa Hash) to counter spammers' random insertion of words, the use of CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart), Greylisting, Identified Internet Mail, and Domain Keys (now Domain Keys Identified Mail).
In the next chapter, the author clearly explains various components of a Language Classifier Pipeline, including the Historical Dataset (aka wordlist, database, dictionary, filter memory), Tokenizer, and the Analysis Engine with its feedback loop. However, the process flow of a language classifier could have been more generalized, e.g. incorporating an initial text-to-text transformer. This chapter also covers the advantages and disadvantages of various training modes for filters, such as Train Everything (TEFT), Train-on-Error (TOE), and Train Until No Errors (TUNE). This part concludes with the description of Paul Graham's famous spam-filtering technique using Bayesian classification (as described in "A Plan for Spam"), Gary Robinson's Geometric Mean Test, Fisher-Robinson's Inverse Chi Square (including the source code for the inversion function), and some other tricks for optimizing spam-filtering accuracy.
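To give a feel for the combining rules named above, here is a minimal Python sketch. It is not the book's listing: `graham_combine` follows the product formula from "A Plan for Spam", and `chi2q` / `fisher_robinson` follow the inverse chi-square formulation as it is commonly circulated in open-source filters. The function names are hypothetical, and the sketch assumes every per-token probability has already been clamped strictly between 0 and 1.

```python
import math

def graham_combine(probs):
    """Graham-style combination: prod(p) / (prod(p) + prod(1 - p))."""
    prod_spam = math.prod(probs)
    prod_ham = math.prod(1.0 - p for p in probs)
    return prod_spam / (prod_spam + prod_ham)

def chi2q(x2, dof):
    """Chi-square survival function for an even number of degrees of freedom."""
    assert dof % 2 == 0
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def fisher_robinson(probs):
    """Combine token probabilities into one score in [0, 1]; about 0.5 means unsure.

    Assumes each probability lies strictly between 0 and 1 (log(0) is undefined).
    """
    n = len(probs)
    spamminess = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    hamminess = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (1.0 + spamminess - hamminess) / 2.0

if __name__ == "__main__":
    tokens = [0.99, 0.95, 0.2, 0.9]  # made-up per-token spam probabilities
    print(graham_combine(tokens), fisher_robinson(tokens))
```

One practical difference between the two rules: the Graham product tends to saturate near 0 or 1, while the chi-square score stays near 0.5 when the evidence is mixed, which leaves room for an explicit "unsure" band.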
The second part of this book deals with the fundamentals of statistical filtering. The author explains HTML and Base64 encoding, followed by a detailed description of tokenization techniques (e.g. Sparse Binary Polynomial Hashing). Then there's a discussion of the various tricks that spammers use for penetrating filters. Although these tactics are mentioned in John Graham-Cumming's "Spammers Compendium," Jonathan has very elegantly explained why some tricks work for spammers and some don't. This part concludes by addressing some of the resource, storage and scaling concerns raised by the large number of features generated from tokenization techniques.
The third part of this book deals with advanced concepts of statistical filtering. This includes the testing criteria for measuring accuracy of an email filter, and some advanced tokenization concepts, e.g. chained tokens (taking word-pairs and phrases into account, instead of individual words) generated using a sliding 5-byte window as mentioned in Sparse Binary Polynomial Hashing. The next chapter describes the Markovian Model implemented in the CRM114 Discriminator, but the author fails to describe different weighting schemes for features implemented in the Markovian-based version of CRM114. The author then describes the Bayesian Noise Reduction Technique for purging "out of context" data from the mail text. This chapter concludes with a very nice summary of collaborative algorithms and techniques, such as Message Inoculation, Streamlined Blackhole List, Fingerprinting, Automatic Whitelisting, URL Blacklisting, and Honeypot email addresses for snaring spammers' address harvesting bots.
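The tokenization ideas in this part can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not code from the book or from CRM114: the function names and the `<skip>` placeholder are hypothetical, the input is assumed to be already split into words, and the sliding window here is counted in tokens rather than bytes.

```python
def chained_tokens(words):
    """Single words plus every adjacent word pair (a simple form of chaining)."""
    return list(words) + [f"{a} {b}" for a, b in zip(words, words[1:])]

def sbph_features(words, window=5):
    """SBPH-style features over a sliding window of `window` tokens.

    For each word, emit every combination of the preceding window-1 words
    (kept or replaced by a placeholder), always ending with the current word.
    """
    features = []
    for end in range(len(words)):
        context = words[max(0, end - window + 1):end]
        for mask in range(2 ** len(context)):
            parts = [w if (mask >> i) & 1 else "<skip>"
                     for i, w in enumerate(context)]
            parts.append(words[end])
            features.append(" ".join(parts))
    return features

if __name__ == "__main__":
    message = "buy cheap pills online now".split()
    print(chained_tokens(message))
    print(len(sbph_features(message)))
```

A full five-token window yields 16 features for a single word, which is why feature counts grow quickly and why storage and scaling become a concern with this family of tokenizers.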
The most interesting part of this book is the appendix, where the author presents interviews with John Graham-Cumming of POPFile, Brian Burton of SpamProbe, Marty Lamb of TarProxy, Bill Yerazunis of CRM114 Discriminator, and Jonathan Zdziarski of DSPAM (himself). I loved this section.
The salient points of the book: it's very easy to read; each chapter begins with a very thought-provoking introduction, and concludes with a crisp "final thoughts" section. The technical errors in this printing are very few, and the illustrations are of good quality. Since the book is geared more toward the Bayesian and statistical generation of spam filters, the absence of certain spam-busting technologies is acceptable. However, a noticeable omission is the lack of discussion about measuring spam-filter accuracy, and what impact this has on setting filtration thresholds. A section on the economics of tradeoffs, and the use of a Receiver Operating Characteristic (ROC) curve, would have been very helpful.
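The threshold tradeoff the review asks for can be sketched in a few lines of Python. This is a generic illustration, not anything from the book: the function names, the toy scores, and the convention that a message is flagged when its score is at or above the threshold are all assumptions made for the example.

```python
def roc_points(scores, labels):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples.

    labels use 1 for spam and 0 for legitimate mail; a message is flagged
    as spam when its score is >= threshold.
    """
    spam_total = sum(labels)
    ham_total = len(labels) - spam_total
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, fp / ham_total, tp / spam_total))
    return points

def pick_threshold(points, max_fpr):
    """Lowest threshold (most spam caught) whose false-positive rate <= max_fpr."""
    allowed = [p for p in points if p[1] <= max_fpr]
    return min(allowed, key=lambda p: p[0]) if allowed else None

if __name__ == "__main__":
    scores = [0.95, 0.9, 0.7, 0.6, 0.4, 0.2]  # made-up filter outputs
    labels = [1, 1, 0, 1, 0, 0]               # made-up ground truth
    for point in roc_points(scores, labels):
        print(point)
    print(pick_threshold(roc_points(scores, labels), max_fpr=0.0))
```

Because losing a legitimate message usually costs far more than letting one spam through, a filter operator would typically fix a very small acceptable false-positive rate first and then read the achievable catch rate off the curve.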
Overall, by putting together Ending Spam, Jonathan Zdziarski has made another significant contribution (after DSPAM) to the anti-spam community. Whether you are a system administrator, anti-spam researcher, engineer or a newbie interested in fighting spam, this book is a great reference.
William S Yerazunis and Richard Jowsey also contributed to this review. Shalendra Chhabra is a Graduate Student in Department of Computer Science and Engineering at University of California, Riverside. He is on the development team of CRM114 Discriminator and has presented his work at MIT Spam Conference 2005, Cisco Systems, and Stanford University. You can purchase Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Ending Spam
Shalendra Chhabra writes "Jonathan Zdziarski has been fighting spam since before the first MIT spam conference in 2003, and has now released a full-on technical book, Ending Spam, on spam filtering. Ending Spam covers how the current and near-future crop of heuristic and statistical filters actually work under the hood, and how you can most effectively use such filters to protect your inbox." Read on for the rest of Chhabra's review. Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification author Jonathan A. Zdziarski pages 312 publisher No Starch Press rating 8 reviewer Shalendra Chhabra ISBN 1593270526 summary Very Good Book Covering Statistical Models and Techniques Implemented in Current Spam Filters
Spam (unsolicited commercial email) and phishing (fraudulent emails) are causing losses of billions of dollars to businesses. Many initiatives are currently underway for fighting this challenge. On the legal front, a Virginia court recently sentenced a prolific spammer, Jeremy Jaynes, to nine years in prison, and a Nigerian court sentenced a woman to two and a half years for phishing. Michigan and Utah have both passed laws creating "do-not-contact" registries in July/August 2005, covering e-mail addresses, instant messaging addresses and telephone numbers. Technical initiatives to fight spam include server- or client-side spam filtering, using Lists (Blacklists, Whitelists, Greylists), Email Authentication Standards (IIM, DK, DKIM, SPF, SenderID), and emerging sender reputation and accreditation services.
Ending Spam is the first book explaining the fine details of the theoretical models and machine-learning algorithms implemented in these filters. The book is divided into three parts: introduction to spam filtering, fundamentals of statistical filtering, and advanced concepts of statistical filtering.
The first section of the book discusses the history of spam, spam kings, different approaches for fighting spam such as blacklisting, whitelisting, heuristic filtering, challenge response, throttling, collaborative filtering, Authenticated SMTP, Sender Policy Framework and SenderID, spammer fingerprinting, etc. However, the author omitted any mention of locally-sensitive hash functions (such as Nilsimsa Hash) to counter spammers' random insertion of words, the use of CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart), Greylisting, Identified Internet Mail, and Domain Keys (now Domain Keys Identified Mail).
In the next chapter, the author clearly explains various components of a Language Classifier Pipeline, including the Historical Dataset (aka wordlist, database, dictionary, filter memory), Tokenizer, and the Analysis Engine with its feedback loop. However, the process flow of a language classifier could have been more generalized, e.g. incorporating an initial text-to-text transformer. This chapter also covers the advantages and disadvantages of various training modes for filters, such as Train Everything (TEFT), Train-on-Error (TOE), and Train Until No Errors (TUNE). This part concludes with the description of Paul Graham's famous spam-filtering technique using Bayesian classification (as described in "A Plan for Spam"), Gary Robinson's Geometric Mean Test, Fisher-Robinsons Inverse Chi Square (including the source code for the inversion function), and some other tricks for optimizing spam- filtering accuracy.
The second part of this book deals with the fundamentals of statistical filtering. The author explains HTML and Base64 encoding, followed by a detailed description of tokenization techniques (e.g. Sparse Binary Polynomial Hashing). Then there's a discussion of the various tricks that spammers use for penetrating filters. Although these tactics are mentioned in John Graham-Cumming's "Spammers Compendium," Jonathan has very elegantly explained why some tricks work for spammers and some don't. This part concludes by addressing some of the resource, storage and scaling concerns raised by the large number of features generated from tokenization techniques.
The third part of this book deals with advanced concepts of statistical filtering. This includes the testing criteria for measuring accuracy of an email filter, and some advanced tokenization concepts, e.g. chained tokens (taking word-pairs and phrases into account, instead of individual words) generated using a sliding 5-token window as mentioned in Sparse Binary Polynomial Hashing. The next chapter describes the Markovian Model implemented in the CRM114 Discriminator, but the author fails to describe different weighting schemes for features implemented in the Markovian-based version of CRM114. The author then describes the Bayesian Noise Reduction Technique for purging "out of context" data from the mail text. This chapter concludes with a very nice summary of collaborative algorithms and techniques, such as Message Inoculation, Streamlined Blackhole List, Fingerprinting, Automatic Whitelisting, URL Blacklisting, and Honeypot email addresses for snaring spammers' address harvesting bots.
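The chained-token idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: tokens are produced by splitting on whitespace, chains are joined with a `+`, and the function name `chained_tokens` and the `max_chain` parameter are hypothetical rather than taken from the book or from any particular filter.

```python
def chained_tokens(text, max_chain=2):
    """Return single tokens plus chains of up to `max_chain` adjacent tokens.

    Tokenization here is a plain lowercase whitespace split, which is far
    cruder than what a real filter would use.
    """
    tokens = text.lower().split()
    features = []
    for start in range(len(tokens)):
        for length in range(1, max_chain + 1):
            if start + length <= len(tokens):
                features.append("+".join(tokens[start:start + length]))
    return features
```

For the input `"Buy cheap pills"` this yields the single words plus the pairs `buy+cheap` and `cheap+pills`, so the classifier can learn that a pair is spammy even when each word alone is not. The cost is a larger feature set, which is the storage concern the review raises for the second part of the book.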
The most interesting part of this book is the appendix, where the author presents interviews with John Graham-Cumming of POPFile, Brian Burton of SpamProbe, Marty Lamb of TarProxy, Bill Yerazunis of CRM114 Discriminator, and Jonathan Zdziarski of DSPAM (himself). I loved this section.
The salient points of the book: it's very easy to read; each chapter begins with a very thought-provoking introduction, and concludes with a crisp "final thoughts" section. There are very few technical errors in this printing, and the illustrations are of good quality. Since the book is geared more toward the Bayesian and statistical generation of spam filters, the absence of certain spam-busting technologies is acceptable. However, a noticeable omission is the lack of discussion about measuring spam-filter accuracy, and what impact this has on setting filtration thresholds. A section on the economics of tradeoffs, and the use of a Receiver Operating Characteristic curve (ROC) would have been very helpful.
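To show what such a threshold discussion might look like, here is a small Python sketch that computes points on a ROC curve from filter scores. Nothing here comes from the book: the function name `roc_points` is hypothetical, and it assumes each message has a score between 0 and 1 and a known label (True for spam, False for ham), with a message flagged as spam when its score is at or above the threshold.

```python
def roc_points(scores, labels, thresholds):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples.

    scores: filter output per message, higher means more spam-like.
    labels: True for spam, False for ham, in the same order as scores.
    """
    spam_total = sum(1 for is_spam in labels if is_spam)
    ham_total = len(labels) - spam_total
    if spam_total == 0 or ham_total == 0:
        raise ValueError("need at least one spam and one ham message")
    points = []
    for t in thresholds:
        true_pos = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        false_pos = sum(1 for s, y in zip(scores, labels) if not y and s >= t)
        points.append((t, false_pos / ham_total, true_pos / spam_total))
    return points
```

Raising the threshold lowers the false-positive rate (legitimate mail lost) at the price of a lower true-positive rate (spam caught). Since losing a legitimate message usually costs far more than letting one spam through, an operator would pick the threshold from this curve with those costs in mind.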
Overall, by putting together Ending Spam, Jonathan Zdziarski has made another significant contribution (after DSPAM) to the anti-spam community. Whether you are a system administrator, anti-spam researcher, engineer or a newbie interested in fighting spam, this book is a great reference.
William S Yerazunis and Richard Jowsey also contributed to this review. Shalendra Chhabra is a graduate student in the Department of Computer Science and Engineering at the University of California, Riverside. He is on the development team of CRM114 Discriminator and has presented his work at MIT Spam Conference 2005, Cisco Systems, and Stanford University. You can purchase Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
HOWTO: 0.5TB RAID on a Budget
Compu486 writes "Inventgeek.com has a new how-to article titled 'The Poor Mans Raid Array.' The article details how to make a modular .5 terabyte Raid 5 array for under $250 (USD), and it all runs on the Mandriva flavor of Linux." Drive prices being what they are, this seems cooler than it is practical. Update: 06/25 23:31 GMT by T : If that's not enough storage, Yeechang Lee writes "Let me show off the 2.8TB Linux-powered RAID 5 array I built for home use a few months ago. I provide lots of details on how I did it, what I used, and the results. The Usenet thread has good followup posts from others, too." -
IETF Approves SPF and Sender-ID
NW writes "According to the records in the IETF's database (here and here), both the SPF and Sender-ID anti-spam proposals were tentatively approved by the IESG (the approval board of the IETF) as experimental standards. It remains to be seen whether either of them will actually put a dent into spam." At the same time, the FTC has opened a central site about email authentication. -
Favorite Programming Contests?
SandSpider asks: "Sometimes, the daily grind of programming can wear a person down. Sometimes, people need challenges to expand their abilities and outlook. My personal favorite is the ADHOC/MacHack Showcase, where you spend up to 48 hours straight programming something impressive, perhaps with the conference theme, perhaps not. Sure, there's no prize, but it's the recognition from other programmers that makes it worthwhile. What is everyone else's favorite programming competition, and what did you do for it?" -
Plugging Internet Explorer's Leaks
jgwebber writes "If you're developing DHTML web apps, you probably already know first-hand that Internet Explorer has horrendous memory leak issues. You can't not run on IE, so you've got to find a way to plug those leaks. So I've created a tool to help you find them. So until Microsoft decides to fix its browser architecture (ha!), at least we can keep it from blowing huge amounts of memory." -
Canada Task Force Calls For Anti-Spam Law
Canrights writes "Canada's National Task Force on Spam released its final report today. Despite prior spam actions on privacy grounds in Canada, the task force is calling for a tough new anti-spam law including penalties for failure to obtain appropriate opt-in consents before sending commercial email as well as private right of action to encourage Canadian lawsuits against spammers. Professor Michael Geist, who headed up the legal aspects of the task force, provides a good summary of the recommendations." -
Using Email Networks as P2P Spam Filters
Oscar Boykin writes "New Scientist is running a story on using the social network in email as a P2P network. The idea is that email networks have structure that is conducive to a type of search called percolation search. This means email clients could query the social network of email users to filter spam. This story is based on a preprint available." -
First Hand Look At Chinese Internet Censorship
Blanchek writes "Few Internet quotes have had a longer shelf life than John Gilmore's 'the Internet interprets censorship as damage and routes around it.' An Ottawa Citizen article from Professor Michael Geist notes that the maxim may be dead. The article reflects on a recent experience with Chinese Internet censorship and the blocking of news, email, and Google searches, while providing a caution that it would be a mistake to think that the Internet in Canada, the U.S. and Europe will always remain as free as China's is censored." -
Canadians May Face 25% Download Tariff
C-Yo writes "While Canadians have battled against an iPod tariff for more than a year, now comes news that Canada's copyright collectives are seeking a tariff on iTunes as well. Professor Michael Geist (who last week dismantled music industry claims about peer-to-peer) reports that one collective is demanding an incredible 25% of the gross revenue of music download services as well as 15% of webcasters' gross revenue and 10% of gamers gross revenue (free version of report or Toronto Star reg. version). When combined with other tariff proposals, it would appear that Canada's collectives want to kill the download industry, demanding at least 40% of everything iTunes, Napster, and other new services earn." -
Revolution In The Valley
Jack Herrington writes "For most companies, lightning never strikes. The promised miracle product fails, and the revolutionary dreams meet evolutionary reality. But for Apple, lightning struck twice: first with the Apple computer, which can be justifiably named the first personal computer, then with the Macintosh. Introduced with the groundbreaking 1984 commercial the Mac started the GUI revolution which brought millions of new users into the once inhospitable world of computing." Read on for Herrington's review of Revolution in the Valley.
Revolution in the Valley
author: Andy Hertzfeld
pages: 240
publisher: O'Reilly
rating: 9
reviewer: Jack Herrington
ISBN: 0596007191
summary: The birth of the Mac, as told by one of its creators
At the heart of this revolution was a set of brilliant engineers and coders who through their work inspired individuals and companies alike. Andy Hertzfeld captured this revolutionary time at Apple through the eyes of the engineers involved at his site, folklore.org. Now he's published these stories in the book Revolution in the Valley.
Apple Confidential 2.0 will give you history. Cult of Mac describes the phenomenon from the outside. But only Revolution in the Valley tells the story of a computer revolution from the perspective of the team in the center of the storm.
The book consists of concise stories, separated by pages of notes, drawings and photographs from the three years it took to develop the original Mac. The stories run in length between one and eight pages, with most ending in the two- or three-page range. Each is told from a personal perspective, mainly by Hertzfeld himself. Sidebars with comments from Woz and others are included to round out the perspective.
The stories are organized chronologically, starting with Hertzfeld's first days at Apple and ending around the time when Jobs was ousted in Sculley's palace coup. Most of the stories are technical in nature, often going down to the level of hardware detail. Others are more personal in nature, detailing Jobs' odd hiring or management style, talking about the stresses of a 90-hour work week, or recounting Adam Osborne's threats about the destruction of Apple and Jobs' famous response.
With its roughly one hundred stories weighing in at a little under 300 pages this is a relatively quick read. This is especially true since the stories work on many levels and are told with remarkable skill. There are some standouts: The development of the GUI, replete with Polaroids taken at key points along the way, is excellent. The story on the first meeting with Microsoft is told from a whole new perspective from what we have heard in the past. The genesis of the 1984 commercial is fascinating, and the meeting with Mick Jagger is hysterical.
There isn't a whole lot here that you won't find on folklore.org, though some of the later chapters do some summation work that I couldn't find on the site. These bring the book together as a coherent, readable whole. The note pages, which separate the chapters and are not on the site, are interesting on their own, particularly the notes from the session with Alan Kay.
Apple's development of the Macintosh has been seen as the prototype of the dot-com death marches that would follow. What we see here is the potent mix of technical brilliance, insane work hours and pressure, and management arrogance that paints a much more chaotic and realistic picture.
On a personal level, this is the book I have been waiting for my whole career. Andy Hertzfeld and Bill Atkinson are legends to me and many others. The passion and brilliance they demonstrated set the bar for all of us who look at computer science not as a job, but as a calling. To see the Mac development from Andy's perspective is simultaneously deflating and uplifting. Their project suffered from all of the usual trials. But somehow the team got through it, their creativity and hard work paid off, and they changed the world.
How many revolutions can there be? How many times can lightning strike? How can one small group of people change the world? That's what we all got into this business to find out. And this book shows us an example of how it was done and inspires us to do the same. Thank you, Andy, for what you did then and what you are doing now.
Jack Herrington is an engineer with a twenty-year career inspired by people like Andy Hertzfeld, and the editor-in-chief of the Code Generation Network, as well as the author of Code Generation in Action. You can purchase Revolution in the Valley from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Verified Voting
Joe from the EFF writes "Verified Voting has just gone live with a number of tools for all you data-hungry election nerds out there. Amongst the goods: an election guide for geeks, a voter's guide to electronic voting, the Verifier database of county-by-county election information and the Election Incident Reporting System (EIRS) which will be used on E-day by attorneys and observers in the field to collect data about election incidents called into the Election Protection Coalition's hotline, 1-866-OUR-VOTE. The geek community is playing a particularly active role in this year's election via VV's TechWatch program. However, we could still use the help of the Slashdot community, and all you have to do is click: We need to test the resiliency of the Verifier database and the EIRS before the election." -
Fedora Project Considering "Stateless Linux"
Havoc Pennington writes "Red Hat developers have been working on a generic framework covering all cases of sharing a single operating system install between multiple physical or virtual computers. This covers mounting the root filesystem diskless, keeping a read-only copy of it cached on a local disk, or storing it on a live CD, among other cases. Because OS configuration state is shared rather than local, the project is called 'stateless Linux.' The post to fedora-devel-list is here, and a PDF overview is here." -
IETF Decides On SPF / Sender-ID issue
Zocalo writes "The MARID working group at the IETF responsible for deciding on which extensions to SMTP will be used to try and prevent spoofing of the sender has made their decision. At issue was whether Microsoft's patent-encumbered Sender-ID would be eligible for inclusion in an Internet standard. An initial analysis of the text of their decision, available here with a brief analysis, would suggest not. Unless Microsoft is going to make any dramatic concessions out of desperation, that pretty much clears the way for Meng Wong's Classic SPF to become the standard and hopefully make Joe-Jobs a thing of the past." -
MS Releases License For Sender-ID
NW writes "Microsoft published today a new license and FAQ for the Sender-ID anti-spam standard being developed by the IETF's MARID WG (based on SPF). To use the license, a signed agreement with MSFT is required. Compatibility with the Open Source Definition, the Free Software Definition, the Debian Free Software Guidelines, and the GPL/LGPL licenses is already in question." -
Microsoft to Deploy SPF for Hotmail Users
wayne writes "In a show of just how much Microsoft wants to put an end to email forgery, Hotmail, MSN and Microsoft.com will start enforcing Sender ID checks by Oct 1. In late May, Microsoft announced that they would be adopting the Open Source SPF anti-forgery system (with a slight modification to make it Sender ID) and they have been working together with the IETF MARID working group to help create an RFC to define the Sender ID standard. Already tens of thousands of domain owners, such as AOL, Earthlink, and Gmail, have published SPF records, and thousands of systems are already checking SPF records. Publishing SPF records is easy, as is checking SPF records." -
What Keeps You Off of Windows?
J. J. Ramsey asks: "schnell has already asked the question What's Keeping You On Windows? It seems only fair to ask the opposite question. For those of you who have elected to not use Windows, what keeps you away from it? Concerns about stability? Security? Dislike of Microsoft's business practices? Or are you simply a fan of your chosen platform and just don't care about Windows one way or the other?" Might recent events sway your decision to keep Microsoft's premier software offering off of your computer? -
Mandatory Banknote Detection Code?
metamatic writes "The European Union is planning to introduce legislation to make it mandatory for software developers to add black-box banknote detection code to their graphics software. How will this apply to open source software? Is it time to get writing to your Euro-MP?" -
SPF To Be Integrated With MS 'Caller ID' System
An anonymous reader submits "CNET's news.com is reporting 'An ongoing effort to consolidate antispam authentication schemes took a big step forward with the merging of Sender Policy Framework (SPF) and Microsoft's Caller ID for E-mail.' This is potentially good news." For more background, here are three previous mentions of Microsoft's proposed Caller ID-style system. -
Microsoft Submits Email Caller ID to the IETF
NetWizard writes "Following on the heels of Yahoo submitting DomainKeys, Microsoft decided to submit their 'Caller ID' anti-spam proposal as a draft to the IETF. This proposal tries to tie in IP addresses to the domain of the sender just like SPF does. To make things even more interesting, it looks like the SPF and MSFT Caller-ID proposals are merging. On a related note, Yahoo submitted an IPR disclosure for DomainKeys to the IETF." -
Yahoo Submits DomainKeys Draft To IETF
NetWizard writes "According to a mailing list post at the IETF, Yahoo's website and a Wired News story, Yahoo has made the DomainKeys draft public and submitted it to the IETF." Russ Nelson explains "Basically, your MTA uses RSA-SHA1 to sign the headers and body of your email and inserts that signature before sending the email. The recipient MTA looks up $selector._domainkey.$domain in the DNS, gets your public key, verifies it, and inserts a notice. There's also a SourceForge project for a DomainKeys library." An anonymous reader asks "It seems to me that it doesn't offer anything more than the Sender Policy Framework by pobox.com, other than doing relay-based signing of the messages to provide the sender verification. SPF has already grown to over 14,000 domains so far and only requires an addition to your DNS to support (from the sending side). Verifying messages on the receiving MTA is as simple as doing a DNS lookup; most MTAs can support SPF now, and the code is available and well tested. What advantages do people see in DomainKeys over SPF that are actually useful, and what standard should people implement?" -
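The lookup step Russ Nelson describes can be sketched in Python. This is only an illustration: it builds the DNS name to query and splits a key record into its tags, and it performs neither the DNS query nor the RSA-SHA1 verification. The function names are hypothetical, the selector and key data in the usage note are made up, and the `tag=value; tag=value` record layout is an assumption based on how DomainKeys key records are commonly written.

```python
def domainkey_query_name(selector, domain):
    """Build the DNS name a receiving MTA would query for the public key."""
    return f"{selector}._domainkey.{domain}"


def parse_key_record(txt):
    """Split a 'tag=value; tag=value' TXT record into a dict.

    Only the first '=' in each part separates tag from value, so base64
    padding at the end of a key is preserved.
    """
    tags = {}
    for part in txt.split(";"):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip()] = value.strip()
    return tags
```

With a made-up selector, `domainkey_query_name("mail2004", "example.com")` gives `mail2004._domainkey.example.com`, and parsing a record such as `"k=rsa; p=QUJDRA=="` returns the key type and the base64 key data as separate entries. The contrast with SPF is visible even at this level: SPF needs only the DNS answer and the connecting IP address, while DomainKeys also needs the full message to check the signature.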
Painlessly Update FreeBSD
boarder8925 writes "Over at BSDnews, Steve Wingate has written an article on how to easily update FreeBSD. Wingate begins his article by saying, 'One of the greatest advantages that *BSD has over other Unix variants is the cvsup/make world process. Unlike most Linux distributions it isn't necessary to wait months for a new version to be released for you to upgrade your system. The cvsup/make world process allows you to update your system at any time. I'm going to show you how to make the process as painless as possible.' The article discusses the following: installing CVSup, choosing a cvsup server, configuring make.conf, and, finally, performing the upgrade. The piece is also available as a .pdf file." -
Microsoft Releases 'Caller-ID For Email' Specs
gfilion writes "Microsoft has released a draft specification for Caller-ID for email, 'to address the widespread problem of domain spoofing' - the concept is similar to SPF, but uses XML. There's already a Caller-ID to SPF converter in the works. A few weeks ago, Microsoft discussed compatibility between the projects with Meng Weng Wong (SPF's project leader), but most SPF users are against using XML, so nothing has come of it thus far." We recently covered a brief article mentioning Microsoft's anti-spam work, though this is a clearer indication of their intentions. Update: 02/26 21:36 GMT by T : NewsForge is carrying a brief article with FSF counsel Eben Moglen's take on the draft; Moglen says it is "encumbered with unclear and unnecessary patent license claims." -
Eiffel Programming Contest Results
Berend de Boer writes "NICE, the nonprofit International Consortium for Eiffel, has announced the results of its fifth International Eiffel Programming Contest. This year had cash prizes of up to $1400 USD and software valued up to $8000 USD. There were 17 entries. The top scores were: ePalm, bringing Eiffel to PalmOS; ewg, generating C code binding glue; and Hbchess, a chess engine." -
AOL Tests Sender Permitted From / E-mail Caller ID
securitas writes "ZDNet reports that AOL is testing Sender Permitted From (SPF), 'an antispam filter intended to accurately trace the origin of e-mail messages.' AOL is performing the widescale SPF test with its 33 million subscribers worldwide. The system works by letting recipients use the SPF record to cross-check DNS data associated with AOL's IP addresses and confirm that the message originated from AOL's servers. The system is one of three competing e-mail authentication protocols. The other IP-identifying protocols are the Designated Mailers Protocol (DMP) and Reverse Mail Exchange (RME/RMX). All systems alter the DNS database to let e-mail servers publish the IP addresses that they use to send e-mail." -
AOL Now Publishing SPF Records
SPF Fan writes "It looks like SPF is starting to catch on with the bigger ISPs. AOL is now publishing SPF records which you can verify with 'dig aol.com txt'. Will Hotmail and Yahoo be far behind? Who else is publishing SPF records for their domains? Slashdot has covered SPF a couple of times in the past." -
SPF Design Frozen
Eric S. Smith writes "SPF, previously mentioned here, is a step closer to becoming a real, live RFC. We are encouraged to publish SPF records and thus to hasten the beginning of the end for annoying spam forgeries. SPF describes DNS TXT records that define the hosts authorized to send mail on behalf of users in your domain. Sites can then consult your SPF records and reject spam forged to look like it comes from you." (SPF stands for "Sender Permitted From.") -
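A receiving site's check against such a TXT record can be sketched in Python. This is a deliberately partial illustration, not a conforming implementation: it assumes the `v=spf1` record syntax of the later published SPF specification, handles only the `ip4`, `ip6` and `all` mechanisms, and silently skips everything else (`a`, `mx`, `include`, modifiers), which a real checker must not do. The function name `spf_result` and the addresses in the usage note (taken from documentation ranges) are made up for the example.

```python
import ipaddress

# Map SPF qualifier characters to result names.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_result(record, client_ip):
    """Evaluate a simplified SPF record against a connecting IP address.

    Returns "none" if the text is not an SPF record, otherwise the result
    of the first matching ip4/ip6/all term, or "neutral" if nothing matches.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier is pass
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, arg = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6") and arg:
            network = ipaddress.ip_network(arg, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        elif name == "all":
            return QUALIFIERS[qualifier]
        # Other mechanisms are skipped in this sketch.
    return "neutral"
```

Given the record `"v=spf1 ip4:192.0.2.0/24 -all"`, a connection from 192.0.2.25 evaluates to `pass` and one from 198.51.100.7 to `fail`, which is the case where a site could reject mail forged to look like it comes from that domain.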
Eiffel Programming Contest Deadline Nears
berenddeboer writes "Slightly more than two weeks left to polish up your Eiffel application or library and submit it to the world-wide Eiffel 2003 contest, the infamous Eiffel Class Struggle. As previously reported here, the closing date is October 31. You can use any Eiffel compiler such as the GNU Eiffel compiler SmartEiffel. The top cash prize is $1400 USD. Entries are judged according to 12 criteria by an international panel of judges." -
Spoofed From: Prevention
An anonymous reader writes "It looks like the next promising advance in the war on spam is here! Introducing SPF: Sender Permitted From. A draft RFC is still being written, but the idea is simple: we can prevent forged emails by having domain owners publish a list of IP addresses authorized to send mail from their domain. It's no silver bullet, but how much spam can we eliminate by preventing forged mail from spoofed domains? Maybe we really don't need anti-spam legislation after all? The SPF site is chock-full of juicy info for our reading enjoyment. Bon appetit!" Interestingly, the to-do list mentions the possibility of seeking a defensive patent on this scheme, too. -
2002 SAGE Salary Survey Finally Released
Ted Cabeen writes "The 2002 Salary Survey run by SAGE, SANS, and Sun's BigAdmin Group profiled in a March Slashdot Article has finally been released. Everybody who participated in the survey is entitled to a copy, as well as current members of those groups. How does your salary stack up in the post-crash economy?" -
Chinese Government to Use Only Local Software
owlmon writes "CNET Asia is reporting that China has outlawed foreign software in government applications. I expect that software buyers outside of the government will have to follow this lead. It's the same 'network effect' that has powered Microsoft's growth for years. When the entire Chinese government is using WPS Office, anyone doing business with the government will feel mighty encouraged to follow suit. Otherwise, how will they exchange documents?" -
Celebrating the Mars Encounter with a DVD?
Berend de Boer asks: "To celebrate the upcoming encounter of The Mars Kind, I'd like to watch a DVD about Mars with my kids. Is there something worthwhile people can recommend? It should be suitable for younger kids (max 10), so Total Recall 2070 is out of the question. It does not necessarily have to be an action film...something educational will be fine as well." -
Perl 1.0?
James A. A. Joyce writes "The title says it all. There's a tiny blurb over at dev.perl.org. Download Perl 1.0 here, for all of those nostalgics in the Slashdot audience! It's only 263KB, so why not give this piece of 1980s computing history a try?"