Domain: wikipedia.org
Stories and comments across the archive that link to wikipedia.org.
Stories · 7,048
-
Last Peacekeeper Deactivated
Inthewire writes "The United States Air Force deactivated the last of 50 Peacekeeper missiles yesterday. The Peacekeeper was an Intercontinental Ballistic Missile capable of accurately placing a 300 Kt W-87 warhead on ten individual targets." -
Learning to Code with a Boardgame
markmcb writes "While some of us cling tight to our memories of Apple-filled classrooms playing The Oregon Trail and driving our Turtle around in Logo, children today have many other ways to learn about the inner workings of computers and the code that drives them. Wired.com is running an interesting article about a boardgame in which players must use simple logic similar to that used in programming to get their skier down the mountain. From the article: 'Using basic math, players have to figure out which paths are open to them and then decide the fastest way to the finish line. The trick, however, is learning which paths are open to you using only programmer jargon like 'if (X==1)' then you can take the green path or 'while (X<4)' you can take the orange path, where X is the roll of the die.'" -
Quantum Link Reverse Engineered
JeffLedger writes "A group of retro-geeks has rebuilt the old Quantum Link system to allow both emulated and real C64s to sign in over the Internet using the original software. Before it was called America Online, Quantum Link provided a pre-Internet online service to Commodore users." -
Wikipedia's New Archnemesis
euniana writes "Forget about Britannica, and meet Uncyclopedia. Formally the adoptive first cousin of Wikipedia, Uncyclopedia stands for everything Wikipedia cannot have: misinformation, satire, and lies. Does this prove that satire and humour can take off in a collaborative environment, a possibility often contested by grumpy Wikipedians? What many people don't know is that the Wikipedia article on the Flying Spaghetti Monster was partly copied from the FSM article on Uncyclopedia. Will the confusion ever end?" -
Study Puts Hole In Comet Theory Of Life's Origin
Astervitude writes "A new study by US and Japanese scientists has put a serious dent into one version of the popular panspermia theory that credits comets for bringing the seeds of life to Earth. Surveys conducted by the University of Arizona, the National Astronomical Observatory of Japan and others now show that objects from the main asteroid belt between Jupiter and Mars were largely responsible for the period of Late Heavy Bombardment that ended 3.9 billion years ago. UA Professor Emeritus Robert Strom believes that no more than 10 percent of the Earth's water comes from comets and any oceans then extant would have been 'vaporized by the asteroid impacts during the cataclysm.'" Interesting, because this directly contradicts the Nova mini-series Origins that just finished running on PBS. Science never stops moving. -
The Slurpee at 40
theodp writes "Oh Thank Heaven for 7-Eleven! Slate reports on the 40th birthday of the Slurpee, which has frozen an estimated 6 billion brains and arguably provided the inspiration for Starbucks' Frappuccino, Dunkin' Donuts' Coolatta and Kwik-E-Mart's Squishee. Wikipedia has more Slurpee facts and links." -
A Useful Grammar Checker?
burtdub asks: "With the amount of raw text data available, there seems to be no shortage of ambitious language projects on the horizon, from Universal Language Translators to Junk Email Filtering. However, the mess that is the English language still seems to elude commercial attempts while being relatively ignored by the open source community. What would it take to make a useful, functional grammar checker?" -
A New Replacement for TV Tome
Randall311 writes to tell us about what the creators hope will be a replacement for the old TV Tome website: the TV IV Wiki. The once-popular TV Tome website was absorbed by CNET in April of this year, and most of its content was added to CNET's TV.com website. Many users dislike the new format, with its vast amounts of Flash, obnoxious ads, and missing content. So, if you liked the old TV Tome website, perhaps this will allow the community to rebuild what it has lost. -
Making Ice Without Electricity
j-beda writes "Time Magazine is running an article telling us how Dave Williams is trying to make ice for third-world applications using the Hilsch-Ranque vortex-tube effect (first developed in 1930 by G.J. Ranque), where swirling air is split into hot and cold components." The method is horribly inefficient but Williams is hoping it could yield helpful results in areas where electricity is really not an option. -
What is the Current Status of WiMAX?
PalletBoy asks: "I live in BFE (read 'remote') Pennsylvania where broadband is not available in any form save satellite, which is no good for price and latency reasons (curse my MMO addiction!). My big question is: what is the -actual- current status of WiMAX technology? Different sites have me believing different things and I can't find an exact answer to the question 'When will I be able to buy a WiMAX router and cards so I can remotely receive broadband?' When will WiMAX (802.16) be solidly standardized, out, and affordable? Or is it already there?" -
Brute Force
ijones writes "Brute Force, by Matt Curtin, is about an event that many Slashdotters will remember: the cracking of the Data Encryption Standard. In June of 1997, a 56-bit DES key was discovered, and its encrypted message decoded, by an ad-hoc distributed network of computers, cooperating over the Internet. Four and a half months earlier, RSA had issued a challenge to the cryptography community, offering $10,000 to the first group to crack a 56-bit DES encrypted message. In Brute Force, Matt Curtin offers his first-hand account of the DESCHALL team's winning effort." Read on for the rest of Jones' review.
Brute Force: Cracking the Data Encryption Standard
author: Matt Curtin
pages: 291
publisher: Copernicus Books
rating: 9
reviewer: Isaac Jones
ISBN: 0387201092
summary: Volunteers working collaboratively over the Internet manage to crack the Data Encryption Standard.
Although I wasn't involved with the DES cracking challenge, I am friends with the author of this book. I took a Lisp course from Matt at Ohio State University and I'll be forever grateful that Matt introduced me to functional programming with a great deal of humor and enthusiasm. I don't think I've ever seen Matt stay so serious for so long, but his enthusiasm comes through clearly in this book.
Brute Force can be enjoyed by both nerds and non-nerds interested in cryptography or codes. Those who have been a part of this or subsequent DES challenges may be particularly interested in this book. Curtin covers some technical details of DES and the brute force attack that the DESCHALL team used to discover a DES key. He also discusses the political and historical significance of this event. This is a fairly technical book, but it goes out of its way to explain non-obvious technical topics, so one doesn't need a lot of technical background to understand it.
Curtin briefly explains a lot of stuff: the C programming language, firewalls, UDP, one-time pads, protected memory, etc., in order to make this book readable for novices. Although I generally did not need such explanations, I did not find them annoying or distracting, as they were fairly brief. In fact, it's fun to read concise explanations of such topics. Occasionally, Curtin does go into just a little too much detail. The chapter on Architecture gives an explanation of some of the many pieces of software that were involved in this effort. This chapter sometimes gets a bit bogged down with explanations of useful scripts that folks wrote to analyze data or forward packets through firewalls.
Brute Force is a very readable and enjoyable book. It is well organized as a narrative, though it is not chronological; Curtin presents the background and substance of each aspect of the story together rather than in strict time order. This can be slightly confusing at times, but I think it improves the overall flow of the story.
In a way, Curtin gives away the ending to the book at the beginning (and in the title), but this isn't ancient history, and most readers will probably already know that DES was defeated by this effort. He still manages to maintain a good sense of suspense throughout the book. He presents tables and analysis of the effort, along with predictions about completion dates that volunteers had made at the time. Unfortunately, he doesn't tell us whether those tables turned out to be correct. What percentage of the keyspace was searched by Macintoshes? How many different kinds of client machines were there in the end? Did Ohio State University try more keys than Oregon State University? Which one is the real OSU?
One of the main themes running throughout the book was that of community. The DESCHALL project was made up of thousands of volunteers from all over the US. Anyone with some spare CPU cycles could get involved by downloading the client software. This may remind you of other distributed computing projects like SETI@home. The community was further broken down into sub-groups like schools who would compete for bragging rights. The organization of the DESCHALL project was much like an open source project, though the key-cracking tools were not open source. Spreading the Word is a chapter about how people started to hear about DESCHALL and what the earliest adopters were like. Some of the tables in a later chapter list the operating system and hardware that the clients were running, which was a pretty cool snapshot of the Internet from 1997. It included lots of OS/2 clients, labs full of SGI machines, and plenty of computers which were only connected to the Internet via dial-up modems. Special scripts were developed for such machines so they could phone home when they needed a new block of keys.
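To make the "block of keys" idea concrete, here is a minimal sketch of how a 56-bit keyspace might be carved into work units and searched one block at a time. This is not the DESCHALL protocol, which the review does not describe: the block size (`BLOCK_BITS`), the function names `key_blocks` and `search_block`, and the `is_match` callback (standing in for a real DES trial decryption) are all assumptions made for illustration.

```python
from typing import Callable, Iterator, Optional, Tuple

KEY_BITS = 56                # DES keys have 56 effective bits
KEYSPACE = 1 << KEY_BITS     # 72,057,594,037,927,936 candidate keys
BLOCK_BITS = 30              # hypothetical work-unit size: 2**30 keys per block


def key_blocks(block_bits: int = BLOCK_BITS) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first_key, last_key) ranges that cover the keyspace."""
    block_size = 1 << block_bits
    for start in range(0, KEYSPACE, block_size):
        yield start, min(start + block_size, KEYSPACE) - 1


def search_block(first_key: int, last_key: int,
                 is_match: Callable[[int], bool]) -> Optional[int]:
    """Try every key in one block; return the matching key, or None."""
    for key in range(first_key, last_key + 1):
        # is_match is a placeholder for "decrypt with this key and
        # check whether the result looks like the known plaintext".
        if is_match(key):
            return key
    return None
```

Under these assumptions a coordinating server would hand out one `(first_key, last_key)` pair per request, and a dial-up client would only need to connect when it finished a block. With 2**30 keys per block there are 2**26 blocks in total, so the bookkeeping stays small even though the keyspace is enormous.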
Though the key cracking clients were not open source, they were free as in beer, at least for Americans. Since such cryptography-related software could not be exported at the time, this was a US-only effort. There was a European team, however, with their own software, called SolNet, and Curtin keeps us updated on their progress. In fact the DESCHALL project had an impact on the political debate of this time with regard to the export and control of cryptographic technologies. Curtin gives us interesting periodic updates on the political debate as the DES cracking story moves forward. Cryptography control was defeated at that time, but the use of cryptography is a right that will need continued protection.
The political story of DESCHALL was one aspect of the historical impact of the project. Another impact was the explosion of volunteer distributed computing networks after the DESCHALL project, with SETI@home being one of the most obvious examples. DESCHALL clearly demonstrated the viability of this kind of computation. Curtin touches briefly on this here and there, but does not go into detail. I would like him to more clearly spell out the trends in Internet distributed computing. I would like to hear that DESCHALL was derived from project A and that it inspired projects B, C, and D. Was it the original Internet distributed computing network? Was it a fad that has abated in the last few years? Curtin touches on this a bit, but says only, "Some other distributed computing projects like DESCHALL were around" (pg. 200). He says which ones, but doesn't make any claims that DESCHALL inspired SETI@home, for instance. Perhaps such things are never quite clear in the free exchange of ideas on the Internet.
The political and community aspects of the story wrap up very nicely. Curtin outlines DESCHALL's impact on driving the AES standard, and its (perhaps much smaller) impact on the debates on key escrow and encryption exports. Brute Force is a very enjoyable read about an important event, and I can happily recommend my friend Matt's book to the Slashdot crowd. My only criticisms can really be summed up by saying, "I want to hear more."
You can purchase Brute Force: Cracking the Data Encryption Standard from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Converting TeX to Microsoft Word?
belmolis asks: "For many years I've done almost all of my writing in TeX. This has increasingly caused problems with publishing in journals. For a long time, many journals reset what you sent them, so they didn't care what program you used. More and more, I find, they do, and in most cases, what they want is MS Word. Is there any good way to convert TeX to Word?" "I've seen some advertised. Some only work with LaTeX, which doesn't help. One claims to use a full-scale TeX interpreter, but my queries as to whether it can handle home-brew Metafont fonts, PIC graphics etc. have gone unanswered. These products also all seem to be plugins for MS Word. I don't use MS Windows or any other MS products, and hate WYSIWYG word processors (I hated Bravo before it was reincarnated as Word) so a Word plugin is not a great solution, even if it works.
Furthermore, I wonder what exactly these programs do. If they interpret the TeX and then generate very low level Word, that may result in a document that looks similar, but a journal editor probably won't be able to edit it the way he wants to. In some cases the editor can be persuaded to accept a camera-ready PDF, since it turns out that the publishers often want PDF and the reason the editor wants Word is so he can edit the text, but when the editor can't or won't budge, is there any alternative to reformatting the document entirely in Word or a clone?
The larger question this raises is, where are we going? Even if formats are open, translation is difficult if they are only commensurable at a very low level. Is the solution to write in something very abstract like DocBook? And if so, will the market go this way?" -
Rebuilding New Orleans With Science
EccentricAnomaly writes "The New York Times has a discussion of flood control methods in use in Holland, England, and Bangladesh that could be used in the rebuilding of New Orleans. Of particular interest is the $8 billion Delta Works built by the Netherlands in response to the North Sea flood of 1953, which almost destroyed the city of Rotterdam, but for a heroic captain who plugged a breach in a dike with his ship." From the article: "While scientists hail the power of technology to thwart destructive forces, they note that flood control is a job for nature at least as much as for engineers. Long before anyone built levees and floodgates, barrier islands were serving to block dangerous storm surges. Of course, those islands often fall victim to coastal development." -
Firefox Moving On From SSL 2.0
Juha-Matti Laurio writes "Plans are afoot to remove support for SSL version 2.0 in Mozilla Firefox, reports the MozillaZine portal. The Mozilla Foundation is eager to disable support for SSL 2.0 and have all Firefox installations use only the newer and more secure SSL 3.0 and TLS 1.0 protocols." From the post: "Netscape Communications Corporation introduced SSL 2.0 with the launch of Netscape Navigator 1.0 in 1994. Netscape Navigator 2.0 included support for SSL 3.0 when it was released in 1996. The specification for TLS 1.0, essentially a standardized version of SSL 3.0 with some differences, was published in 1999." -
How I Failed the Turing Test
chrisjrn writes "I stumbled across this article today, detailing a man's experiences of being added to AIM Screen Name lists - one full of "celebrities" and the other full of "Sex Bots" (he was, of course, neither of these). Raises a few questions as to how easy it is to get a hold of your screenname, and also of the effectiveness of the Turing Test for AI, in the online world. Or is it just that people aren't bothered trying to tell the humans apart anymore?" Also, it's funny. Don't try to read anything deep into it. -
OpenOffice Goes LGPL
Motor writes "According to the OpenOffice.org site, Sun has decided to relicense OpenOffice under the LGPL alone and retire its Sun Industry Standards Source License (SISSL). Sun supporters claim that it's part of Sun's move to reduce the number of open source licenses. Of course it could just be PR, since Sun stirred up a lot of bad publicity with the introduction of the CDDL for the release of Solaris. Either way, it's good news for OpenOffice." -
Google Plans To Destroy Unindexed Information
linolium writes "Executives at Google, the rapidly growing online-search company that promises to 'organize the world's information,' announced Monday the latest step in their expansion effort: a far-reaching plan to destroy all the information it is unable to index. 'Book burning is just the beginning,' said Google co-founder Larry Page. 'This fall, we'll unveil Google Sound, which will record and index all the noise on Earth. Is your baby sleeping soundly? Does your high-school sweetheart still talk about you? Google will have the answers.'" FYI: it's The Onion, so yes, it's a joke. -
Nintendo's First Podcast
celerityfm writes "With the US release of an MP3/multimedia player add-on for the Nintendo DS and Gameboy Advance just around the corner, Nintendo is already busy creating content for it with their first Podcast, produced by podcast pioneer Carl Franklin. Check out the first episode, it's all about Nintendogs." Commentary is available at Press the Buttons. From that post: "From the sound of things, girls love Nintendogs. Dog training tips are exchanged, fans are briefly interviewed, and even a parent weighs in now and then. Ms. McCollom's segment goes into why girls are apt to love raising portable puppies and just how the Nintendo DS's wireless mode enables gamers to meet new players and their dogs. Teen People even proclaims the experience 'better than Barbie', so if that's not a young girl stamp of approval, I don't know what is." -
Communications Infrastructure No Match for Katrina
jfourier writes "In this age of cheap commoditized consumer electronics and advanced mobile technology, why can't all the people of a city make contact during an emergency? Cell phone circuits filled up during the 9/11 attacks, and in the wake of Hurricane Katrina very few victims can make contact with their families, despite the fact that they have all those mobile phones. The Red Cross is looking to deploy satellite equipment to restore communications in affected areas." From the article: "Katrina made landfall in Louisiana early this morning with sustained winds of 145 mph, but veered just enough to the east to spare New Orleans a direct blow. Even so, flooding, power outages and heavy damage to structures were reported throughout the region. The Red Cross tomorrow expects to begin deploying a host of systems it will need, including satellite telephones, portable satellite dishes, specially equipped communications trucks, high- and low-band radio systems, and generator-powered wireless computer networks, said Jason Wiltrout, a Red Cross network engineer." -
Open Source Autos Hit the Streets in Spain
markdowling writes "BBC News has a story about electrically powered tourist cars in Cordoba which provide tourist information in French, English and Spanish as landmarks are passed. The promoter, Alfredo Romeo, calls them Blobjects which he heard described in a speech by Bruce Sterling. The car's tourist guide software is open source - Romeo's quoted reason: 'With proprietary software, innovation comes from the people in marketing. But with open source, innovation comes from the guy who is really in the market. It comes from someone who knows the city.'" -
Yet Another Method Of Achieving Nuclear Fusion
deglr6328 writes "Recent research has seen the use of the pyroelectric effect, the compression of bubbles using ultrasound and gas jet irradiation for producing nuclear fusion on small tabletop-scales. Yet another method can now be added to the list which uses ultraintense laser irradiation striking a borated plastic target to heat a plasma to billion kelvin temperatures and achieves aneutronic (clean) proton-boron fusion. (The PRL paper can be read online.) Though, like the other recently discovered exotic methods of attaining fusion, it does not look like a method which can be scaled up to ignition or even anywhere near break even, it still may have important use in the laboratory for the examination of such incredibly high temperature plasmas." -
SpaceShipThree to be Orbital Spacecraft
FleaPlus writes "The president of spaceflight company Virgin Galactic has recently stated that if the upcoming suborbital service with SpaceShipTwo is successful, the follow-up SpaceShipThree will be an orbital craft. Although orbital spaceflights would be much longer and could potentially dock with orbital space stations, they are also considerably more difficult than suborbital spaceflights. Other private firms working on orbital spaceflight (and potentially in the running for Robert Bigelow's $50 million America's Space Prize for orbital flight) include t/Space and SpaceX." -
Lucene in Action
Simon P. Chappell writes "I don't know about you, but I hardly bother with browser bookmarks any more. I used to have so many bookmarks, back in the early days of Netscape's 4 series, that I would have to regularly trim and edit my bookmark file to prevent my browser from crashing on startup -- that's a lot of bookmarks, folks! Now, I go to my favourite web search engine, enter a couple of appropriate search terms and voila, there's my page! Search engines are so ubiquitous that we rarely give much thought to the technology that powers them. Lucene in Action by Otis Gospodnetic and Erik Hatcher, both committers on the Lucene project, goes behind the HTML and takes you on a guided tour of Lucene, one of a generation of powerful Free and Open-Source search engines now available." Read on for the rest of Chappell's review.
Lucene in Action
author Gospodnetic and Hatcher
pages 421 (7 pages of index)
publisher Manning
rating 9
reviewer Simon P. Chappell
ISBN 1932394281
summary Solid introduction to Lucene
Who's it for?
Lucene is a library and framework, rather than a complete application. It truly is an engine, around which you are expected to build and extend your own application. Like Lucene, the book is targeted at those who are looking for a tool to build their own search facility application rather than just "download and go." The book does include a number of case studies of Lucene usage (including at least one download and go search engine) but those are included to show how to use and adapt Lucene to fit differing environments rather than as ends in themselves.
The Structure
The book is sensibly divided into two parts. The first part looks at "Core Lucene" functionality, while the second part addresses "Applied Lucene".
Part one has six chapters, covering the central components and inner workings of Lucene. It's here that the book starts with a tutorial introduction, familiarising the reader with the concepts of Lucene as a search engine around which you wrap your own code. The other five chapters move steadily through good search engine fare, with indexing getting the whole of chapter two to itself. The discussion of how to retrieve text from the documents being indexed is mentioned here but postponed until chapter seven, where it is dealt with exhaustively. Chapter three covers searching, and especially how Lucene ranks documents.
Chapter four examines analysis. In its chapter introduction, the book explains that "Analysis, in Lucene, is the process of converting field text into its most fundamental indexed representation, terms." This process is performed by an analyser, which tokenises text according to its own built-in rules. Each analyser has a different emphasis: some want only dictionary words, others might explicitly include acronyms, and sometimes you'll want an analyser that will block stop words (those words in languages that are part of the structure, but that add nothing to the information being conveyed by the text; classic examples of stop words in English include "a", "and" and "the").
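To make the idea of analysis concrete, here is a minimal sketch in Python of the kind of work an analyser does: lower-casing, tokenising, and dropping stop words. This is not Lucene's API (Lucene is a Java library); the function name `analyse` and the tiny stop-word list are assumptions made purely for illustration.

```python
import re

# Hypothetical stop-word list, taken from the three examples in the review.
STOP_WORDS = {"a", "and", "the"}

def analyse(text, stop_words=STOP_WORDS):
    """Lower-case the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stop_words]

print(analyse("The quick fox and a lazy dog"))
# -> ['quick', 'fox', 'lazy', 'dog']
```

A different analyser would simply make different choices at each step, for example keeping acronyms intact or using a longer stop-word list.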
Chapter five looks at advanced search techniques: everything from sorting search results and searching on multiple fields to filtering searches. Many free or open source software tools are extensible, and Lucene is no exception. Chapter six addresses creating and using custom components within Lucene, everything from custom sort methods to custom filters.
Part two, the final four chapters, covers Applied Lucene. It is dedicated to practical uses of Lucene and answers the question "So, what can I do with a search engine?" Chapter seven covers ways and means to parse common, non-plain text document formats. The primary formats covered are RTF, XML, PDF, HTML and Microsoft Word. The ability to parse and index these file formats will cover the search engine needs of the majority of Lucene users. Chapter eight looks at a number of Lucene tools and extensions that are available, many of them being free and open source software. Chapter nine covers ports of Lucene. While for many users, Lucene being a Java library is not a problem, some users want its functionality in environments that do not have Java. The chapter looks at ports written in C++, C#, Perl and Python. Lastly, chapter ten takes a thorough look at seven Lucene case studies. Perhaps the "star" case study is the one about Nutch, a download and go search engine written by Doug Cutting, the original author of Lucene.
There are three appendices. The first offers installation advice for Lucene; a useful addition that those newer to working with Java libraries will surely appreciate. The second appendix has a very well explained description of the Lucene index format. This is the kind of information that can be hard to find, so it is welcome in a book of this sort. The last appendix contains a number of categorised resource references. The number and breadth of the resources provided could provide quite an incredible education in information retrieval theory if the reader were inclined to read them all.
What's to Like?
There are several things to like about this book. Let's start with the fact that the authors are part of the core development team of Lucene. This gives them both credibility and an excellent understanding of the internal workings of Lucene. Co-author Erik Hatcher is a fantastic writer, having previously been a co-author of the only Ant book worth bothering with, Manning's Java Development with Ant. (Full disclosure: I do know Erik personally.)
The structure of the book is well thought out and each chapter does seem to move your understanding forward when combined with what you learned from the preceding ones. The division into core and applied Lucene is also helpful. While you'd hope that this was the case, it often isn't; hence I note it as a positive.
I especially appreciate that this book does not fill up page after page with API documentation. The authors appear to have grasped that if you have Internet access to download the software, you might just be able to access the documentation online; rather, they concentrate on the way to use the software. What a concept!
As a part of Manning's "in Action" series, the book has excellent layout and has obviously been thoroughly edited by both technical evaluators and copyeditors. This might seem to be a small thing to some, but a well-edited book stands out clearly from the crowd.
What's to consider?
If you are looking for a book on using and configuring a download and go style of search engine, this book would be less suitable. While the case study on Nutch is of good length, it would be too short to be useful as a configuration guide.
Conclusion
I enjoyed reading this book. If you have any text searching needs, this book will be more than sufficient equipment to guide you to successful completion. Even if you are just looking to download a pre-written search engine, this book will provide a good background to the nature of information retrieval in general and text indexing and searching specifically.
You can purchase Lucene in Action from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Steganography with Flickr
yiangocy writes "Steganography is nothing new: techniques and programs for hiding data in picture and audio files have been available for a long time now. One step further, however, is using popular online photo sharing sites, such as Flickr, to hide your data successfully." -
New, Faster Attack against SHA-1 Revealed
VxSote writes "According to Bruce Schneier's blog, a team of Chinese cryptographers has announced new results against SHA-1 that speed up the time required to find collisions compared to their previously published attack. Schneier says that a SHA-1 collision search is now 'squarely in the realm of feasibility,' and that further improvements are expected." -
The Milky Way is Not a Spiral?
ETEQ writes "Space.com reports on new data from the Spitzer Space Telescope showing that the Milky Way is in fact a barred spiral! Looks like all our old astronomy textbooks will have to be thrown away..." -
Crocodile's Immune System Kills HIV
ASEville writes "In an ongoing effort to stop the spread of HIV, scientists in Australia have discovered that crocodiles can fight off HIV and kill the virus. This is a major boon to medicine because the crocodile serum can also fight things that are penicillin resistant such as staphylococcus aureus." -
Ending Spam
Shalendra Chhabra writes "Jonathan Zdziarski has been fighting spam since before the first MIT spam conference in 2003, and has now released a full-on technical book, Ending Spam, on spam filtering. Ending Spam covers how the current and near-future crop of heuristic and statistical filters actually work under the hood, and how you can most effectively use such filters to protect your inbox." Read on for the rest of Chhabra's review.
Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification
author Jonathan A. Zdziarski
pages 312
publisher No Starch Press
rating 8
reviewer Shalendra Chhabra
ISBN 1593270526
summary Very Good Book Covering Statistical Models and Techniques Implemented in Current Spam Filters
Spam (unsolicited commercial email) and phishing (fraudulent emails) are causing losses of billions of dollars to businesses. Many initiatives are currently underway for fighting this challenge. On the legal front, a Virginia court recently sentenced a prolific spammer, Jeremy Jaynes, to nine years in prison, and a Nigerian court sentenced a woman to two and a half years for phishing. Michigan and Utah have both passed laws creating "do-not-contact" registries in July/August 2005, covering e-mail addresses, instant messaging addresses and telephone numbers. Technical initiatives to fight spam include server- or client-side spam filtering, using Lists (Blacklists, Whitelists, Greylists), Email Authentication Standards (IIM, DK, DKIM, SPF, SenderID), and emerging sender reputation and accreditation services.
Ending Spam is the first book explaining the fine details of the theoretical models and machine-learning algorithms implemented in these filters. The book is divided into three parts: introduction to spam filtering, fundamentals of statistical filtering, and advanced concepts of statistical filtering.
The first section of the book discusses the history of spam, spam kings, different approaches for fighting spam such as blacklisting, whitelisting, heuristic filtering, challenge response, throttling, collaborative filtering, Authenticated SMTP, Sender Policy Framework and SenderID, spammer fingerprinting, etc. However, the author omitted any mention of locally-sensitive hash functions (such as Nilsimsa Hash) to counter spammers' random insertion of words, the use of CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart), Greylisting, Identified Internet Mail, and Domain Keys (now Domain Keys Identified Mail).
In the next chapter, the author clearly explains various components of a Language Classifier Pipeline, including the Historical Dataset (aka wordlist, database, dictionary, filter memory), Tokenizer, and the Analysis Engine with its feedback loop. However, the process flow of a language classifier could have been more generalized, e.g. incorporating an initial text-to-text transformer. This chapter also covers the advantages and disadvantages of various training modes for filters, such as Train Everything (TEFT), Train-on-Error (TOE), and Train Until No Errors (TUNE). This part concludes with the description of Paul Graham's famous spam-filtering technique using Bayesian classification (as described in "A Plan for Spam"), Gary Robinson's Geometric Mean Test, Fisher-Robinson's Inverse Chi Square (including the source code for the inversion function), and some other tricks for optimizing spam-filtering accuracy.
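For readers unfamiliar with these combining techniques, the following Python sketch shows two of them under stated assumptions. It is not the book's source code. `graham_combine` follows the combination formula published in "A Plan for Spam"; `chi2q` and `fisher_indicator` follow the inverse chi-square approach as Gary Robinson has described it publicly. The function names are hypothetical, and the per-token probabilities are assumed to lie strictly between 0 and 1.

```python
import math

def graham_combine(probs):
    """Combine per-token spam probabilities into one message probability."""
    spam = math.prod(probs)
    ham = math.prod(1.0 - p for p in probs)
    return spam / (spam + ham)

def chi2q(x2, df):
    """Upper-tail chi-square probability for an even number of degrees of freedom."""
    if df % 2 != 0:
        raise ValueError("df must be even")
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, df // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def fisher_indicator(probs):
    """Return a score near 1 for spammy token sets and near 0 for hammy ones."""
    n = len(probs)
    # Sums of logs avoid floating-point underflow on long messages.
    spam_side = chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    ham_side = chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    return (1.0 + spam_side - ham_side) / 2.0

print(graham_combine([0.9, 0.9]))
print(fisher_indicator([0.99, 0.99, 0.99]))
```

One practical difference is that the chi-square indicator stays close to 0.5 when the evidence is mixed or weak, which gives a filter a natural "unsure" band, whereas the simple product formula tends to swing hard toward 0 or 1.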
The second part of this book deals with the fundamentals of statistical filtering. The author explains HTML and Base64 encoding, followed by a detailed description of tokenization techniques (e.g. Sparse Binary Polynomial Hashing). Then there's a discussion of the various tricks that spammers use for penetrating filters. Although these tactics are mentioned in John Graham-Cumming's "Spammers Compendium," Jonathan has very elegantly explained why some tricks work for spammers and some don't. This part concludes by addressing some of the resource, storage and scaling concerns raised by the large number of features generated from tokenization techniques.
The third part of this book deals with advanced concepts of statistical filtering. This includes the testing criteria for measuring accuracy of an email filter, and some advanced tokenization concepts, e.g. chained tokens (taking word-pairs and phrases into account, instead of individual words) generated using a sliding 5-byte window as mentioned in Sparse Binary Polynomial Hashing. The next chapter describes the Markovian Model implemented in the CRM114 Discriminator, but the author fails to describe different weighting schemes for features implemented in the Markovian-based version of CRM114. The author then describes the Bayesian Noise Reduction Technique for purging "out of context" data from the mail text. This chapter concludes with a very nice summary of collaborative algorithms and techniques, such as Message Inoculation, Streamlined Blackhole List, Fingerprinting, Automatic Whitelisting, URL Blacklisting, and Honeypot email addresses for snaring spammers' address harvesting bots.
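The chained-token idea can be illustrated with a short Python sketch. This shows only the simplest variant, adjacent word pairs added alongside the individual words; it is not the windowed Sparse Binary Polynomial Hashing scheme, and the function name `chained_tokens` is an assumption for illustration.

```python
def chained_tokens(text):
    """Return the individual words followed by each adjacent word pair."""
    words = text.split()
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs

print(chained_tokens("buy cheap pills now"))
# -> ['buy', 'cheap', 'pills', 'now', 'buy cheap', 'cheap pills', 'pills now']
```

The appeal is that a pair such as "cheap pills" can carry far more evidence than either word alone; the cost is a much larger feature set to store.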
The most interesting part of this book is the appendix, where the author presents interviews with John Graham-Cumming of POPFile, Brian Burton of SpamProbe, Marty Lamb of TarProxy, Bill Yerazunis of CRM114 Discriminator, and Jonathan Zdziarski of DSPAM (himself). I loved this section.
The salient points of the book: it's very easy to read; each chapter begins with a very thought-provoking introduction, and concludes with a crisp "final thoughts" section. There are very few technical errors in this printing, and the illustrations are of good quality. Since the book is geared more toward the Bayesian and statistical generation of spam filters, the absence of certain spam-busting technologies is acceptable. However, a noticeable omission is the lack of discussion about measuring spam-filter accuracy, and what impact this has on setting filtration thresholds. A section on the economics of tradeoffs, and the use of a Receiver Operating Characteristic curve (ROC) would have been very helpful.
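As a rough illustration of the threshold tradeoff mentioned here, the Python sketch below computes the points of an ROC curve: for each candidate threshold, the fraction of legitimate mail wrongly flagged (false positive rate) against the fraction of spam caught (true positive rate). The function name and the scores and labels are invented for the example and do not come from the book or any real filter.

```python
def roc_points(scores, is_spam, thresholds):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples."""
    spam_total = sum(is_spam)
    ham_total = len(is_spam) - spam_total
    points = []
    for t in thresholds:
        # A message is flagged as spam when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, is_spam) if y and s >= t)
        fp = sum(1 for s, y in zip(scores, is_spam) if not y and s >= t)
        points.append((t, fp / ham_total, tp / spam_total))
    return points

# Hypothetical filter scores and true labels, for illustration only.
scores = [0.95, 0.80, 0.60, 0.30, 0.10]
is_spam = [True, True, False, True, False]

for point in roc_points(scores, is_spam, [0.5, 0.9]):
    print(point)
```

Raising the threshold lowers the false positive rate at the price of letting more spam through, which is exactly the economic tradeoff a filter administrator has to settle.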
Overall, by putting together Ending Spam, Jonathan Zdziarski has made another significant contribution (after DSPAM) to the anti-spam community. Whether you are a system administrator, anti-spam researcher, engineer or a newbie interested in fighting spam, this book is a great reference.
William S Yerazunis and Richard Jowsey also contributed to this review. Shalendra Chhabra is a graduate student in the Department of Computer Science and Engineering at the University of California, Riverside. He is on the development team of CRM114 Discriminator and has presented his work at MIT Spam Conference 2005, Cisco Systems, and Stanford University. You can purchase Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Ending Spam
Shalendra Chhabra writes "Jonathan Zdziarski has been fighting spam since before the first MIT spam conference in 2003, and has now released a full-on technical book, Ending Spam, on spam filtering. Ending Spam covers how the current and near-future crop of heuristic and statistical filters actually work under the hood, and how you can most effectively use such filters to protect your inbox." Read on for the rest of Chhabra's review.
Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification
author Jonathan A. Zdziarski
pages 312
publisher No Starch Press
rating 8
reviewer Shalendra Chhabra
ISBN 1593270526
summary Very Good Book Covering Statistical Models and Techniques Implemented in Current Spam Filters
Spam (unsolicited commercial email) and phishing (fraudulent emails) are causing losses of billions of dollars to businesses. Many initiatives are currently underway for fighting this challenge. On the legal front, a Virginia court recently sentenced a prolific spammer, Jeremy Jaynes, to nine years in prison, and a Nigerian court sentenced a woman to two and a half years for phishing. Michigan and Utah have both passed laws creating "do-not-contact" registries in July/August 2005, covering e-mail addresses, instant messaging addresses and telephone numbers. Technical initiatives to fight spam include server- or client-side spam filtering, using Lists (Blacklists, Whitelists, Greylists), Email Authentication Standards (IIM, DK, DKIM, SPF, SenderID), and emerging sender reputation and accreditation services.
Ending Spam is the first book explaining the fine details of the theoretical models and machine-learning algorithms implemented in these filters. The book is divided into three parts: introduction to spam filtering, fundamentals of statistical filtering, and advanced concepts of statistical filtering.
The first section of the book discusses the history of spam, spam kings, different approaches for fighting spam such as blacklisting, whitelisting, heuristic filtering, challenge response, throttling, collaborative filtering, Authenticated SMTP, Sender Policy Framework and SenderID, spammer fingerprinting, etc. However, the author omitted any mention of locally-sensitive hash functions (such as Nilsimsa Hash) to counter spammers' random insertion of words, the use of CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart), Greylisting, Identified Internet Mail, and Domain Keys (now Domain Keys Identified Mail).
In the next chapter, the author clearly explains various components of a Language Classifier Pipeline, including the Historical Dataset (aka wordlist, database, dictionary, filter memory), Tokenizer, and the Analysis Engine with its feedback loop. However, the process flow of a language classifier could have been more generalized, e.g. incorporating an initial text-to-text transformer. This chapter also covers the advantages and disadvantages of various training modes for filters, such as Train Everything (TEFT), Train-on-Error (TOE), and Train Until No Errors (TUNE). This part concludes with the description of Paul Graham's famous spam-filtering technique using Bayesian classification (as described in "A Plan for Spam"), Gary Robinson's Geometric Mean Test, Fisher-Robinsons Inverse Chi Square (including the source code for the inversion function), and some other tricks for optimizing spam- filtering accuracy.
The second part of this book deals with the fundamentals of statistical filtering. The author explains HTML and Base64 encoding, followed by a detailed description of tokenization techniques (e.g. Sparse Binary Polynomial Hashing). Then there's a discussion of the various tricks that spammers use for penetrating filters. Although these tactics are mentioned in John Graham-Cumming's "Spammers Compendium," Jonathan has very elegantly explained why some tricks work for spammers and some don't. This part concludes by addressing some of the resource, storage and scaling concerns raised by the large number of features generated from tokenization techniques.
The third part of this book deals with advanced concepts of statistical filtering. This includes the testing criteria for measuring accuracy of an email filter, and some advanced tokenization concepts, e.g. chained tokens (taking word-pairs and phrases into account, instead of individual words) generated using a sliding 5-byte window as mentioned in Sparse Binary Polynomial Hashing. The next chapter describes the Markovian Model implemented in the CRM114 Discriminator, but the author fails to describe different weighting schemes for features implemented in the Markovian-based version of CRM114. The author then describes the Bayesian Noise Reduction Technique for purging "out of context" data from the mail text. This chapter concludes with a very nice summary of collaborative algorithms and techniques, such as Message Innoculation, Streamlined Blackhole List, Fingerprinting, Automatic Whitelisting, URL Blacklisting, and Honeypot email addresses for snaring spammers' address harvesting bots.
The most interesting part of this book is the appendix, where the author presents interviews with John Graham-Cumming of POPFile, Brian Burton of SpamProbe, Marty Lamb of TarProxy, Bill Yerazunis of CRM114 Discriminator, and Jonathan Zdziarski of DSPAM (himself). I loved this section.
The salient points of the book: it's very easy to read; each chapter begins with a very thought-provoking introduction, and concludes with a crisp "final thoughts" section. The number of technical errors are very few in this print, and the illustrations are of good quality. Since the book is geared more toward the Bayesian and statistical generation of spam filters, the absence of certain spam-busting technologies is acceptable. However, a noticeable omission is the lack of discussion about measuring spam-filter accuracy, and what impact this has on setting filtration thresholds. A section on the economics of tradeoffs, and the use of a Receiver Operating Characteristic curve (ROC) would have been very helpful.
Overall, by putting together Ending Spam, Jonathan Zdziarski has made another significant contribution (after DSPAM) to the anti-spam community. Whether you are a system administrator, anti-spam researcher, engineer or a newbie interested in fighting spam, this book is a great reference.
William S Yerazunis and Richard Jowsey also contributed to this review. Shalendra Chhabra is a Graduate Student in Department of Computer Science and Engineering at University of California, Riverside. He is on the development team of CRM114 Discriminator and has presented his work at MIT Spam Conference 2005, Cisco Systems, and Stanford University. You can purchase Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Wikipedia Used For Apparent Viral Marketing Ploy
jangobongo writes "An article over at BoingBoing discusses what appears to be a viral marketing ploy appearing in a Wikipedia entry. Quote: 'Someone has apparently abused collaborative reference site Wikipedia in a viral marketing campaign for a BBC online alternate reality game.'" -
YouTube -- The Flickr of Video?
An anonymous reader writes "A new folksonomy website that seems to be catching on is YouTube, a service similar to Flickr, except that it is for sharing and hosting short video clips instead of photos. Like Flickr, its core functionality is implemented in Flash. Videos can be tagged, searched, discussed, etc. through a social network. YouTube has developer APIs, RSS feeds, and the ability to embed videos directly into other web pages. The website was recently profiled on TechCrunch as an up-and-coming Web 2.0 application." -
Podcasting
SFEley (Stephen Eley) writes "Todd Cochrane's Podcasting: The Do-It-Yourself Guide has been heavily pushed in the podcasting community as the first of a wave of podcasting books to be released in the next several months. All of these books will surely cover the same themes, more or less: what podcasts are, how to listen to them, and how to produce your own. The popularity of podcasting is exploding right now, with coverage in every press outlet and Apple hyping it as The Next Big Thing. It's easy to see that there will be a huge demand for these books, even if they don't do much more than state the obvious. So what about this one? Other than being the first, does it offer any compelling virtues for the would-be podcaster or listener?" Read on for Eley's answer to that question.
Podcasting: Do-It-Yourself Pirate Radio for the Masses
author: Todd Cochrane
pages: 281
publisher: Wiley
rating: 4
reviewer: Stephen Eley
ISBN: 0764597787
summary: How to find, record, and publish podcasts
Before we can even begin to talk about the book, we ought to cover the preliminaries. If you've been living under a rock for most of 2005, you may not know that podcasting is the latest Internet publishing wave, getting most of the same hype that blogging has gotten but much faster. In its simplest form, it's just people producing audio files (talk, music, whatever) and syndicating them over an RSS feed. Listeners can then use one of several apps to automatically download them and load them onto an MP3 player. The mainstream media, feeling some embarrassment for missing the last few Web boats, has jumped on podcasting and given it, frankly, a lot more press than it probably deserves right now.
A note on the author: Todd Cochrane produces Geek News Central, a very popular tech podcast wherein he reads out news headlines and offers commentary. He also founded and manages the Tech Podcast Network, a consortium of other technology podcasts that band together for cross-promotion, content standards and advertising, and he's the main force behind the heavily advertised and sponsored Podcast Awards. It's fair to say that Cochrane has done a lot for podcasters in various ways, and although I've disagreed with him on some of the details of his projects, I respect him highly for his tremendous energy and the work he's done to make podcasting a respectable form of media.
Another note (and disclaimer) on myself: I also have my own podcast, a moderately popular one that narrates science fiction short stories. In a practical sense this makes me both a podcaster and a literary editor. Which means, in turn, that I have a sensitivity both to poor information on podcasting and poor writing.
And with all that said... I'm afraid Podcasting: The Do-It-Yourself Guide is a marginal book at best. It doesn't suck, and there's nothing horribly wrong with the information it gives, but it has two endemic problems. Cochrane's responsible for both, but I put the real blame on his editors at Wiley, who likely ignored them in their rush to get the book out before any others.
The first problem is the writing. It's possible that this bothers me more than it would others. Todd Cochrane may be an intelligent, selfless, wonderful guy -- I truly believe that he is -- but the man can't write. The entire book exhibits a rushed, forced-casual, eighth-grade English paper style that grates on me like nails on a chalkboard. Cochrane even admits this in his acknowledgments: "Early on, I made it clear to Chris [Webb], my acquisitions editor, that I was a geek/tech guy first and that he did not want to see my English grades. Even so, he assured me that I was their man, and I went to work."
Well, Chris Webb, you're a dumbass. You picked someone who admitted he couldn't write to write a book on a breakthrough technology. As a result, the book is vague, meandering, and frequently redundant, e.g.: "You will want to use this Recording Control window to control your default recording device." That phrase ("You will want to ...") crops up everywhere: the book's not only in second person, but it's a second person that tells the reader what he/she wants. The only sentence opener that appears more often is "Obviously" -- which frequently precedes a thought that is neither obvious nor related to the sentence before it.
You will also want to ignore the poor punctuation and comma splices, the frequent intersplicing of Notes and Tips paragraphs that seem indistinguishable (in both font and content) from the main text, and very often, the simple use of the wrong words. In many cases this is simply amusing: "[Dave Winer's] analogy was that it was taking longer to download the video than it was to play it." Uh, that's not an analogy, dude. In at least one case it leads to a technically incorrect statement: "The reading on the software-controlled meter in my audio-recording package showed nearly 40 dB of baseline noise," when what he really meant was a noise floor of -40 dB. Two very different things.
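The arithmetic behind that last distinction can be sketched in a few lines of Python. This assumes the meter reads amplitude relative to digital full scale (dBFS), which is common in recording software but is not stated in the review; the function names are made up for illustration.

```python
import math


def amplitude_ratio_to_db(ratio):
    """Convert an amplitude ratio (signal / reference) to decibels."""
    return 20 * math.log10(ratio)


def db_to_amplitude_ratio(db):
    """Convert decibels back to an amplitude ratio."""
    return 10 ** (db / 20)


# Noise at 1% of full-scale amplitude is a noise floor of about -40 dB.
print(amplitude_ratio_to_db(0.01))

# -40 dB means the noise is roughly 1/100 of the reference amplitude...
print(db_to_amplitude_ratio(-40))

# ...while +40 dB would mean noise about 100 times the reference amplitude.
print(db_to_amplitude_ratio(40))
```

Under that assumption, the two readings differ by a factor of about 10,000 in amplitude, which is why the sign matters.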
The other major problem is the narrow perspective. It's really Podcasting: The Do-It-Todd-Cochrane's-Way Guide. Everything in this book is about Cochrane. Every example is his own podcast, every screenshot of a Web page is his own, and he's got multiple photos of himself in various dorky situations. Any photos of other podcasters? Mur Lafferty, perhaps, or Soccergirl? You wish. I have no problem with Cochrane using himself as a starting point, but it's a very diverse field, and nobody podcasts with quite the same gear or the same techniques as anybody else. Cochrane says he spent significant time interviewing software developers for the chapters on applications, but there's no indication anywhere that he spoke to any other podcasters in writing this book. That's a huge mistake, rushed deadlines or no rushed deadlines. Not only does it reduce the book's utility, but it also makes the prose seem dreary, monotonic, and egocentric.
So there's my overview. For those who think the book may still have some use to you (and it might, if you can put up with the above) I'll break it down by section:
Part I: Listening to the Podcast Revolution
This section has three chapters, and they're useless. The book begins, "Do you have specific interests? How about triathlons? I have to admit, most radio broadcasts don't deal with those kind of subjects. But that's about to change." Yeah, okay. The problem here (beyond the clumsy writing) should be obvious: if you have no idea what podcasting is, you're not interested enough to buy a book on podcasting. The first chapter, "What Is a Podcast?" has Cochrane spiraling around the subject of podcasting for twelve pages without ever giving a simple definition. Then we've got two chapters which together describe the leading software tools used to download podcasts, and tutorials for using them to subscribe to -- can you guess? -- Todd Cochrane's podcast. To be fair, it was a pretty decent overview of the major client applications at the time of the book's writing; which means it's already obsolete, as iTunes 4.9 has totally changed the landscape since then. Of course, that can't be helped. The real weakness of this section is its superfluity: if you're willing to pay $20 for a book on podcasting, it's because you want to make podcasts. Even Grandma's not going to buy this book to learn how to listen to them.
Part II: Joining the Revolution: Your Own Podcast
Here's where the book starts to get genuinely interesting. The obligatory but stupid chapters on listening to podcasts are behind us; now it's all about making them. The first chapter here, "Choosing a Podcast Format," actually has little to criticize. His basic message is sound: Follow your passions; develop a show structure and follow it; and be aware of copyright issues if you're playing music. All of that is good advice, and his detailed description of his own show structure and notes is appropriate here. This is followed by a completely unnecessary chapter about computer choices, in which he shows his Windows colors and comes off a trifle condescending toward the Mac. ("In researching materials for this book, I found I could not do the reviews justice unless I had a Mac, so I purchased a Mac Mini ... I knew that if I could record a podcast on a Mac Mini, it would probably make the Mac fans happy.") Then, at last, he delivers the first truly crunchy chapter: "The Semiprofessional Podcast Studio." This chapter's honestly very good, running the gamut of sound cards, microphones, mixers, Firewire interfaces (he dismisses USB interfaces rather unfairly), digital recorders, even quiet case fans. Some of it's hand-waved, and some of it's so vague it's just silly: "A condenser microphone is generally never found in households. People might have them, but they usually are not aware that they do." On the other hand, his discussion of quality sound cards does have much of value (barring the "40dB of baseline noise" misstatement I mentioned above), and he gives one of the best descriptions of mixers and effects processors for novices that I've found. If you have no idea what sort of equipment you might need for quality sound in your podcast, you'll get a decent grounding here. Not an excellent grounding, but perhaps enough to parse a little bit more of the serious sound FAQs on the Web.
Part III: Recording Your Podcast and Performing Postproduction Tasks
(Yes, the man can't even name things with brevity.) There's one weak chapter here and two great ones. In "Recording Locations," Cochrane reveals that you can podcast at home, in your car, at a restaurant, or walking around. Whee. Then we get to the actual process of recording and postproduction, and the book honestly shines. He describes step-by-step how to set up Audacity (the excellent freeware Win/Mac/Linux sound editor) to record, how to set up a typical mixer, and best of all, how to set levels properly. Levels are the bane of any audio amateur, and these half-dozen pages are gold; it's the one thing a novice podcaster is likely to turn back to and reference several times over in his first few recordings -- or ought to, anyway. His advice on noise reduction, amplifying, and normalizing is spot-on, the steps listed for MP3 encoding are simple but solid, and he even gives several good options for ID3 tagging. (A step too often overlooked by podcasters.) I could complain about a few weird digressions -- e.g., the postproduction chapter tells you how to upload to Openpodcast.org, which is an utterly bizarre thing to advise -- but they're easily ignored, and overall this section truly shines.
Part IV: Hosting and Preparing to Publish Your Podcast
This section's ... okay. His chapter on hosting is mostly a treatise on how to evaluate service agreements, which is valuable enough in itself but can be overkill for someone just starting out. There are a few math exercises for estimating bandwidth -- useless when you don't know your potential audience size -- and a brief list of "podcast-friendly hosts" which is, of course, already obsolete. His coverage of publishing methods is about weblog software -- wait, scratch that, it's about MovableType. He's infatuated with MT, and devotes several pages to a step-by-step for hacking MT's code and templates to support enclosures with full-source RSS code listings, then mentions virtually offhand that Wordpress and Radio Userland support enclosures out of the box. This is another case where having multiple podcaster perspectives would have helped. Finally, we get a chapter named "The Life Breath of a Podcast: RSS 2.0 With Enclosures," just barely longer than its title, which covers how to use FeedForAll to hand-crank an RSS file if you don't have blogging software that will make one for you. It might have been a valuable chapter if he'd spent any real time explaining RSS 2.0 or enclosures.
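For readers wondering what an enclosure actually is: in RSS 2.0, each item can carry an `enclosure` element with three attributes, `url`, `length` (size in bytes), and `type` (MIME type), and podcast clients download whatever it points to. The following Python sketch builds a bare-bones feed with the standard library. It is not from the book, and the show name, URLs, and file size are placeholders.

```python
import xml.etree.ElementTree as ET


def build_feed(channel_title, channel_link, channel_description, episodes):
    """Build a minimal RSS 2.0 feed; each episode becomes an <item>.

    episodes: list of dicts with "title", "url", and "bytes" keys
    (a shape chosen for this sketch, not a standard).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # title, link, and description are the required channel elements.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode["title"]
        # The enclosure is what turns an ordinary feed item into a podcast episode.
        ET.SubElement(
            item,
            "enclosure",
            url=episode["url"],
            length=str(episode["bytes"]),
            type="audio/mpeg",
        )
    return ET.tostring(rss, encoding="unicode")


# Placeholder data, invented for illustration only.
feed_xml = build_feed(
    "Example Show",
    "https://example.com/",
    "A made-up podcast",
    [{"title": "Episode 1", "url": "https://example.com/ep1.mp3", "bytes": 12345678}],
)
print(feed_xml)
```

A real feed would normally add per-item details such as `pubDate` and `guid`, but the enclosure is the only part that is specific to podcasting.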
Part V: It's Show Time
A closing section that's nearly pointless, but mercifully brief. There's an entire chapter about using graphical FTP clients -- lame because anyone who's that blinking-twelve was lost back at Chapter 6. The meaty chapter is called "Feedback, Promotion, and Paying the Bills," and it has some moderately useful information and some large gaps. Feedback apparently means "have a mailing list and a voicemail line, and hang out on Skype." Okay. Promotion's about directory listings and exchanging promos with other podcasters; then he offers a long commentary on advertising and why it's a fine thing to have. Unfortunately, other than creating a media kit he has nothing much to say on how to contact and market your show to advertisers. And the final chapter of the book, "Where Do We Go From Here?" offers a few vapid musings of the sort all podcasters talk about over beer: we're going to kill mainstream radio, podcasts will band together and commercialize, all the starving children of the world will have an MP3 player ... And yes, in his final sentences he invokes the already-tired "Podcasting Revolution" chestnut. Not much to say here, but rest assured, he says it.
So there you have it. That's the entire book. Worth buying? That depends. If you're itching to get started with podcasting, if you're an absolute beginner when it comes to sound recording, if the online resources at Podcast411 and other sites don't float your boat, and if you can't wait a few more months for books like Podcast Solutions and Podcasting for Dummies to come out ... then sure. There are at least three or four good chapters in here with information you can use. It's not all the information, and you have to take Cochrane's style and limited viewpoint with a big grain of salt, but it'll get you started. For less than twenty bucks, at least it isn't a high-risk investment.
On the other hand, if you're the bootstrapping type, or you already know most of what you're doing, then there's not much in here you can't figure out online and through experience. And if you're patient, there will be other books, and I'm almost positive they'll be better written.
You can purchase Podcasting: the Do-It-Yourself Guide from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Slashback: Start, Trash, Explain
Slashback tonight with more on the Microsoft start page project vis-a-vis Google's similar one, a wee $40 million slap on the wrist for Amazon over shopping-cart patent infringement, new animals for the CodeZoo, and a strong denial that Wikipedia has announced a more stringent editorial policy. Details on these stories and more, below.
What's done is done, and in a certain order. MSN.com general manager Hadi Partovi writes: "A few days ago I read your Slashdot post about start.com.
Thanks for the note!
Thank you for the promotion :-). Meanwhile, I wanted to make sure you know that the work we've been doing on the start.com project actually predates the Google personalized page. I manage a tiny incubation team that has been building start.com since November, and it was first live on the Web in February, 3 months before Google released their personalized page. Of course we are missing some capabilities that Google has, and vice versa. It's a tight competition. But I'm emailing you because our team takes a lot of pride in its innovation. You may point out at a lot of place where Microsoft is following competitors, but if you track the functionality and UI changes that the companies have made over the past 6 months, this has clearly been a place where Google has been following Microsoft's lead.
(Our main engineer on the project has written a bit more about this to respond to your post.)
Anyway, I'm not sending this to be defensive. Heck, I have a lot of work to do to bring an innovation culture to the MSN organization and in many areas we have our work cut out for us. But I guess I want my small incubation team to get credit for being the leading innovators on this one small product :-)"
Always clean out the trashcan. dotpavan writes "The Register and Cnet have this report about Kai-Fu Lee not cleaning his recycle bin at his previous workplace, and now MS has stumbled upon an interesting document which shows that Google anticipated the MS move and had planned to put him on a leave of absence or have him as a consultant to thwart any attempt by MS to get him back."
Amazon Settles Patent Suit For $40M theodp writes "In today's SEC filing, Amazon.com disclosed it will pay $40 million to settle an e-commerce patent infringement lawsuit that was reported earlier on Slashdot. The terms of the settlement also provide for dismissal of all claims and counterclaims and grant Amazon a nonexclusive license to Soverain's patent portfolio."
29+36 more = 65 vector drawing apps. Anonymous Coward writes "There were many useful comments made for 29 Vector Drawing Programs. After incorporating most of them, the revised column has 65 Vector Drawing Programs."
And each after its own kind. chromatic writes "As seen on the O'Reilly Radar and announced at OSCON 2005, CodeZoo now lists Python and Ruby components. CodeZoo is a human-edited directory of useful, well-maintained, and redistributable software components in various languages. (Slashdot previously covered CodeZoo's launch.)"
The chair recognizes Mr. Wales for a point of clarification. brajesh writes "There has been news on Slashdot and others about Wikipedia announcing tighter editorial control. It seems that everyone jumped the gun. Jimmy Wales, a founder of Wikipedia, has clarified his stance on the idea of freezing stable content on Wikipedia. Apparently, [Jimbo writes] 'I spoke in English, and this was translated to German. Then the German was translated back to English, and then translated again into the Slashdot story.' Also, 'There was no "announcement." We are constantly reviewing our policies and looking for ways to improve, but we have not "announced" anything. We don't even really work that way ... if you know how Wikipedia works, it's through a long process of community discussion and consensus building, not through a process of top-down announcements.' This has also been covered on Ars Technica."
Google Earth not a security risk after all. mister_tim writes "In a follow-up to yesterday's story about ANSTO's request that Google censor images of Australia's only nuclear reactor, the Australian government has now come out and said that Google Earth poses no security risk. Australia's Attorney General has come to the view, also noted by many /. readers, that the Google images have been available for several years from other sources and add nothing to the existing publicly available data. Chalk this one up as a victory for common sense." -
Artificial Intelligence for Computer Games
Craig Maloney writes "Artificial Intelligence (AI) is a very hot topic today in computer circles because of the interest in modeling behaviors on machines that we find in nature. Many books have been dedicated to studying and expanding the field of AI, but they generally fall into two categories: those that concentrate on AI as a research topic, and those that concentrate on AI in the field of game development. Artificial Intelligence for Computer Games (AI for Computer Games) is unique in how it takes classical AI and merges that knowledge into AI for game development. It's an approach that will be fascinating to those currently studying AI, but the approach limits the usefulness of this book to a select audience of AI researchers interested in game development." Read on for the rest of Maloney's review.
Artificial Intelligence for Computer Games
author: John David Funge
pages: 127
publisher: A K Peters, Ltd.
rating: 6
reviewer: Craig Maloney
ISBN: 1568812086
summary: An introduction to Gaming Artificial Intelligence
AI for Computer Games begins with a brief introduction to the historic roles that AI has played in games such as Pac Man and Mario, and how these Non-Playable Characters (NPCs) achieved fame through their roles as NPCs. The NPCs play important roles in games, and their behavior can ultimately determine if the game is entertaining or frustrating. The author then describes the differences between the field of Artificial Intelligence as compared with Gaming Artificial Intelligence. Later he shows how these two fields can intertwine with each other, and how Gaming Artificial Intelligence can be useful to AI researchers via game-playing robots and other similar experiments. The author also introduces the architecture of the components of a game. They are:
- Game State: The current state of the world
- Simulator: Encodes the rules for how the game state changes, and the rules for the game (physics, etc.)
- Renderer: The display of the game
- Controllers: The player and NPC methods for interacting with the game.
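A minimal Python sketch of how the four components listed above might fit together. This is not code from the book (whose snippets are in C++); the class names, the one-dimensional world, and the movement rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GameState:
    """The current state of the world: entity name -> position on a line."""
    positions: dict = field(default_factory=dict)


class Simulator:
    """Encodes the rules for how the game state changes."""

    def step(self, state, moves):
        # Hypothetical rule: entities move at most one cell per tick
        # and cannot leave the range 0..9.
        for name, delta in moves.items():
            delta = max(-1, min(1, delta))
            state.positions[name] = max(0, min(9, state.positions[name] + delta))


class Renderer:
    """Displays the game; here, as one row of text."""

    def draw(self, state):
        row = ["."] * 10
        for name, pos in state.positions.items():
            row[pos] = name[0]
        return "".join(row)


class ChaseController:
    """An NPC controller that steps toward a target entity."""

    def __init__(self, me, target):
        self.me, self.target = me, target

    def choose_move(self, state):
        diff = state.positions[self.target] - state.positions[self.me]
        return (diff > 0) - (diff < 0)  # sign of diff: -1, 0, or 1


state = GameState({"player": 7, "npc": 2})
sim, renderer = Simulator(), Renderer()
npc = ChaseController("npc", "player")
for _ in range(3):
    sim.step(state, {"npc": npc.choose_move(state), "player": 0})
frame = renderer.draw(state)
```

Keeping the rules in the simulator means the player's controller and the NPC's controller can only change the world by proposing moves, which is one way to read the "player and NPCs as equals" idea.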
Next, AI for Computer Games discusses NPC perception. Players in a gaming environment are hindered by what the renderer will display to them, so likewise, the NPCs should not have omniscience in the game. The author recommends a strategy for handling this for NPCs: use the render engine for determining the perception of the NPCs as well. This allows the players and NPCs to work from the same rules. The author also describes how NPCs can handle partial observability, as well as prediction.
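The idea of giving NPCs no more information than a player would see can be sketched roughly as follows. This is an illustrative Python sketch, not the book's implementation: the distance-based visibility rule, the `PerceivingNPC` class, and the "last seen" memory are assumptions standing in for a real render-engine query.

```python
def visible_entities(positions, viewer, sight_range):
    """Return only the entities the viewer could see (same rule for everyone)."""
    origin = positions[viewer]
    return {
        name: pos
        for name, pos in positions.items()
        if name != viewer and abs(pos - origin) <= sight_range
    }


class PerceivingNPC:
    def __init__(self, name, sight_range):
        self.name = name
        self.sight_range = sight_range
        self.last_seen = {}  # memory used when a target is out of view

    def perceive(self, positions):
        seen = visible_entities(positions, self.name, self.sight_range)
        self.last_seen.update(seen)
        return seen

    def believed_position(self, target):
        # Partial observability: fall back to the last sighting, if any.
        return self.last_seen.get(target)


guard = PerceivingNPC("guard", sight_range=3)
first = guard.perceive({"guard": 0, "player": 2})   # player in view
second = guard.perceive({"guard": 0, "player": 8})  # player out of view
```

After the second call the guard no longer sees the player and can only act on the remembered position, which is the kind of partial observability and prediction problem the review mentions.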
The rest of the book deals with the NPCs' abilities to react to, remember, search, and learn from the game environment. This is the heart of the book, and provides a good analysis of the various methods available to the developer to model complex behaviors. The section on learning is especially interesting, as the idea of rewarding the algorithm when it performs correctly seems both strange and obvious at the same time (although the author points out that sometimes the algorithm can do undesirable things in order to obtain that reward). There are many ideas in these sections for perfecting the AI of the game, and the author expertly describes each one and where each would best be used.
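To make the reward idea concrete, here is a small Python sketch of learning action values from rewards alone. It is a simplified illustration, not the book's algorithm: the action names, the reward values, the round-robin exploration schedule, and the learning rate are all assumptions.

```python
def train(actions, reward_fn, episodes=30, learning_rate=0.5):
    """Learn a value estimate per action using nothing but observed rewards."""
    values = {a: 0.0 for a in actions}
    for i in range(episodes):
        # Explore by cycling through the actions for the first rounds,
        # then exploit the action with the best estimate so far.
        if i < 2 * len(actions):
            action = actions[i % len(actions)]
        else:
            action = max(actions, key=values.get)
        reward = reward_fn(action)
        # Move the estimate part of the way toward the observed reward.
        values[action] += learning_rate * (reward - values[action])
    return values


def reward_fn(action):
    # Hypothetical reward: the designer wants the NPC to take cover.
    return {"take_cover": 1.0, "charge": 0.2, "idle": 0.0}[action]


values = train(["idle", "charge", "take_cover"], reward_fn)
best = max(values, key=values.get)
```

The loop never sees the designer's intent, only the numbers `reward_fn` returns. If the reward were attached to the wrong behavior, the same loop would learn that behavior just as readily, which is the pitfall the author points out.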
AI for Computer Games was both enlightening and frustrating at the same time. The author obviously possesses a lot of knowledge in the AI field; the frustration is in his telling of that knowledge. The book reads much like an academic paper on AI applications in games, and could put off many potential readers with its rather dense descriptions of complicated material. The book also suffers from being rather short. The book is 127 pages in total length with code snippets, diagrams, and other page artwork. The brevity makes the book easy to pick up and read for a bit, but the density ensures you'll be re-reading several chapters in order to catch what the author is trying to convey. The code snippets also suffer from brevity. The code snippets are in C++, but are primarily constructors, with precious few methods defined. The author has excellent ideas; using an environment where the player and the NPCs are equals removes much of the complexity for the example AI to handle. Unfortunately the execution in this book leaves me wanting more.
You can purchase Artificial Intelligence for Computer Games from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
What are the Next Programming Models?
jg21 writes "In this opinion piece, Simeon Simeonov contemplates what truly new programming models have emerged recently, and nominates two: RIAs and what he calls 'composite applications' (i.e. using Java, .NET or any other programming language). He notes that Microsoft will be trying to achieve RIAs in Avalon, but that it's late out of the gate. He also cites David Heinemeier Hansson's Ruby on Rails project as showing great promise. 'As both a technologist and an investor I'm excited about the future,' Simeonov concludes. It's a thoughtful piece, infectious in its quiet enthusiasm. But what new models are missing from his essay?" -
Fun Stuff at OSCON 2005
OSCON 2005 was held in a convention center this year, instead of a hotel, because it just got too big (2000+ people). Too big, in fact, for pudge and myself to cover more than a fraction of the talks and the ideas flitting around the hallways. But here's some of what I found cool last week. And if you attended or presented at OSCON and want to tell us about all the neat stuff we missed, please, share your thoughts in the comments, or submit a fact-rich writeup and we'll maybe do a followup story later.
Mike Shaver's talk on writing Firefox extensions was packed to the walls. If you've been wanting to try it, Firefox 1.5 makes development easier, and should be out soon, so now's a good time. This talk and the tutorial on Ajax persuaded me to start using the DOM Inspector and debugging some JavaScript to get a better understanding of webpage manipulation.
Aaron Boodman's talk on his extension Greasemonkey was a walkthrough of writing a simple GM user script, a discussion of what's coming up, and some Q&A. Greasemonkey 0.5 ("Now With Security!") is in beta: there are multiple security changes that suggest someone really has sat down and thought the whole model through. GM works with Firefox, Seamonkey, Opera, and Windows MSIE (but not, oh please somebody correct this oversight, Safari).
Ruby on Rails is hot; if you want to develop a web app quickly you can't ignore it. It stresses "convention over configuration" with reasonable defaults. The tutorial went from installation to the "hello world" of the web, a blog (!), in a few hours. Anyone have a real-world example of Rails scaling to a large project and lots of traffic?
DarwinBuild is an open-source project from Apple that aids in building the open-source components of Darwin/Mac OS X. Given a build number of Mac OS X, it will fetch and build the software for that version, allowing you to modify the source as needed, making it easy for any developer to modify everything from the kernel to various utilities (just remember to reapply the modifications after running Software Update, if necessary). You can read more about it on the web site and in the presentation slides.
Google and O'Reilly gave out the 2005 open source awards, with $5000 attached to each. Congratulations to the winners.
Tony Baxter's Shtoom is a cross-platform VoIP client and software framework, written in Python, for writing your own phone applications.
Novell is still moving its employees from Windows to Linux, which we first heard at last year's OSCON. The migration from Microsoft Office to OpenOffice is complete, and the big step, from Windows to Linux, is 50% complete, projected to be 80% by November. Miguel de Icaza gave flashy demos of some Linux desktop applications that didn't impress this cynical observer very much.
PlaceSite is an open-source project looking to bring physical proximity awareness to Internet access at coffeeshops and other meetingplaces: think "local-only Friendster" and you're not far off. They got feedback from a monthlong trial earlier this year and are working on a new version that will be easy to deploy. Could be neat.
In a great 2-hour session on Wednesday, we got to hear from representatives of four leading open source databases about what they've been working on lately. Here are the summaries...
Ingres r3 has an impressive list of big features. Ingres was just open-sourced by Computer Associates this summer, and it's gotten a lot of attention for being a full-featured enterprise database. Ingres supports table partitioning that can be either range-based or hash-based, which can greatly improve performance in many cases. Its optimizer can now come up with parallel execution plans, which can be useful even on single-CPU machines and non-partitioned tables. There's also federated data storage (one can access data stored in another RDBMS through Ingres) and replication. And they're working on a concurrent access cluster, to allow data to be manipulated not just by multiple threads on one machine, but multiple machines.
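The difference between range-based and hash-based partitioning can be sketched in a few lines of Python. This is a generic illustration of the two schemes, not Ingres syntax or internals; the boundary values, the partition count, and the use of CRC32 as the hash are assumptions.

```python
import bisect
import zlib


def range_partition(key, boundaries):
    """Partition index for a key, given sorted boundary values."""
    return bisect.bisect_right(boundaries, key)


def hash_partition(key, partition_count):
    """Partition index from a stable hash of the key."""
    return zlib.crc32(str(key).encode("utf-8")) % partition_count


boundaries = [2000, 2003]  # hypothetical: <2000, 2000-2002, >=2003
years = [1998, 2000, 2002, 2003, 2005]
by_range = [range_partition(y, boundaries) for y in years]
by_hash = [hash_partition(y, 4) for y in years]
```

Range partitioning keeps neighboring keys together, so a query over a span of years touches few partitions. Hash partitioning scatters keys across partitions regardless of order.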
A side note: Computer Associates was invited by O'Reilly to talk about its recent open-sourcing of Ingres. Its representative, while confessing that introducing a new license was "probably the wrong thing to do," said that other licenses wouldn't have worked for them (the GPL "was seen as viral"). The one question that the audience had time to ask was "is Ingres a dump" -- is CA making it open-source to transfer the responsibility of support from the company to the community? The three-part "no" answer was that there are more CA developers working on Ingres now, that Ingres is at the core of their new releases, and that they've sponsored a "million-dollar challenge" to foster community interest. Time will tell, I guess.
Firebird 2.0 has been in alpha since January and a beta is expected soon. Since 2000 much of their development has been aimed at making the product easy to install, and making the code easy for a distributed group of developers to work on. This year they're building features on that groundwork. Their design includes 2-phase commits (since the beginning), cooperative garbage collection (as a transaction encounters unneeded data, it removes it) and self-balancing indexes. Backup has been improved. When 2.0 gets to beta, I'm going to check this out; it sounds like very interesting technology (and apparently it will install with four clicks!).
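For readers unfamiliar with the term, a 2-phase commit can be sketched as below. This is a textbook-style Python illustration of the protocol's voting logic only, not Firebird's implementation; the `Participant` class and its states are hypothetical, and a real system would also log decisions and handle crashes and timeouts.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: promise to commit, or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: everyone promised, so tell everyone to commit.
        for p in participants:
            p.commit()
        return True
    # Phase 2: at least one "no" vote, so everyone rolls back.
    for p in participants:
        p.rollback()
    return False


ok = two_phase_commit([Participant("a"), Participant("b")])
mixed = [Participant("a"), Participant("b", can_commit=False)]
failed = two_phase_commit(mixed)
```

The point of the first phase is that no participant makes its changes permanent until all of them have promised they can.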
MySQL 5.0 is in beta, and has been feature-frozen since April. Back in 4.1, its abstracted table type was put to advantage with odd engines like Archive (insert only, no update); Blackhole for fast replication; and an improvement to MyISAM for logging (allowing concurrent selects with inserts at table end). Their Connector/MXJ lets you run a native MySQL server embedded inside a Java application. In 5.0 we're seeing stored procedures per the SQL:2003 standard, triggers, updatable views, XA (distributed transactions), SAP R/3 compatible server-side cursors, fast precision math, a federated storage engine, a greedy optimizer for better handling of many-table joins, and an optional "strict mode" to turn some of MySQL's friendly nonstandard warnings into compliant errors. And they're working on partitioning, ODBC, and letting MySQL Cluster's non-indexed columns be stored on disk.
PostgreSQL 8.1 is expected to be released in November or December, after a feature-freeze in July -- and it's an impressive list of new features. Their optimizer will make use of multiple indexes when appropriate, which is pretty darn exciting. The recommendation will be that in most cases it will be most efficient to have only single-column indexes and let the optimizer figure out which combination to use. They're implementing a 2-phase commit, they're bringing the automatic vacuum into the core code, and they removed a global shared buffer lock so they're now getting "almost linear" SMP performance scaling. I've never felt the need for Postgres, but I'm definitely going to look at 8.1.
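The appeal of combining several single-column indexes can be illustrated with plain Python sets. This is only a conceptual sketch under assumed data, not how PostgreSQL implements it: each "index" here is a dictionary from a column value to the set of matching row ids, and the table, column names, and values are invented.

```python
from collections import defaultdict

rows = [
    {"id": 0, "city": "Oslo", "year": 2004},
    {"id": 1, "city": "Oslo", "year": 2005},
    {"id": 2, "city": "Portland", "year": 2005},
    {"id": 3, "city": "Oslo", "year": 2005},
]


def build_index(rows, column):
    """Single-column index: value -> set of matching row ids."""
    index = defaultdict(set)
    for row in rows:
        index[row[column]].add(row["id"])
    return index


city_index = build_index(rows, "city")
year_index = build_index(rows, "year")

# WHERE city = 'Oslo' AND year = 2005: intersect the two id sets,
# then fetch only the rows that survive.
matching_ids = sorted(city_index["Oslo"] & year_index[2005])
```

With this approach one index per column can serve any combination of conditions, instead of needing a separate multi-column index for each combination a query might use.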
-
New Graphics Chip Relies on Reduced Gate Count
absolut.evil writes "HardOCP has an interesting story about a small integrated graphics chip maker from Norway: Falanx. The hardware is scalable and has a much lower power consumption thanks to its reduced gate count. So in theory we could see GeForce/Radeon-style graphics in a set-top box -- maybe even a cell phone. They say they'll have hardware by mid-2006."