Domain: aaronsw.com
Stories and comments across the archive that link to aaronsw.com.
Stories · 9
-
Aaron Swartz Indicted in Attempted Piracy of Four Million Documents
An anonymous reader writes "New York Times has reported that Internet activist Aaron Swartz has been indicted for stealing more than 4 million documents from JSTOR." The indictment contains an exciting tale featuring trespassing, MAC address forgery, a Python script or two, and even computers hidden under a cardboard box. El Reg has a decent summary. Demand Progress has released an official response claiming the charges are trumped up nonsense. -
FBI Investigates Liberator of Court Records
eldavojohn writes "Federal court documents aren't free to the public; they cost $0.08/page through a system called PACER. During a period when the US Government Printing Office was trying out free access at a number of courthouses around the US, a 22-year-old programmer named Aaron Swartz installed a small Perl script at the 7th US Circuit Court of Appeals library in Chicago, a script that uploaded a public document every three seconds to Amazon's EC2 cloud computing service. Swartz then donated over 19 million documents to public.resource.org. That's when the FBI took interest in the programmer responsible for this effort and ran his name through government databases. How did he discover this? His FOIA request was approved, of course, and he received the FBI's partially redacted report on himself. The public.resource.org database was later merged with that of the RECAP Firefox extension, which we discussed a couple of months back." Update: 10/06 18:22 GMT by KD: Timothy Lee pointed out that the summary as originally posted garbled the Swartz / RECAP connection. Improved now. -
Who (Really) Writes Wikipedia
Nico ? La ! writes "Aaron Swartz questions the belief, evangelized by Jimbo Wales (Wikimedia's founder), that only around 500 people are the most important contributors to Wikipedia. In truth, they are probably just the people who do the most editing. From the post: 'For example, the largest portion of the Anaconda article was written by a user who only made 2 edits to it (and only 100 on the entire site). By contrast, the largest number of edits were made by a user who appears to have contributed no text to the final article (the edits were all deleting things and moving things around).'" Which ultimately means that Wikipedia in some ways much more closely mimics a real encyclopedia, with many contributors writing the bulk of the content, but a small group massaging that text to ensure standards compliance with the overall work. It's an interesting piece and worth your time, although the super-computer thing doesn't make a lot of sense to me. -
Summer Internships - The Good, and the Bad?
loquacious d asks: "This has been a spectacular summer for open-source student internships. Google funded a huge variety of open-source projects through the Summer of Code, including GCC-CIL and other improvements to Mono, new features and fixes for Gaim, and even new packages for Common Lisp. Joel Spolsky at Fog Creek hired four interns to produce a highly modified version of VNC called Fog Creek Copilot, and Paul Graham's new venture capital firm Y Combinator helped students create their own tech companies. What internships did people enjoy this summer, and which ones didn't work out so well? Which ones would you recommend to next year's applicants, and which should they avoid?" -
Post-copyright: Digital Cash and Compulsory Licensing?
gojomo writes "AaronSw offers a compelling idea: use anonymous transferable digital cash to allocate the monies collected for creators in a compulsory licensing scheme, to avoid some of the potential problems outlined by other compulsory critiques. LawMeme calls it a "Proto Whuffie" but expects fake artists to sign up for the loot. I might call it "voucher socialism" -- but that's not necessarily a bad thing." -
Content Syndication With RSS
Alex Moskalyuk writes "Ben Hammersley's Content Syndication with RSS is a step-by-step guide to implementing RSS. This standard is gaining popularity among the Web community, and some of your favorite sites might syndicate their content as RSS feeds. The new O'Reilly publication focuses on many aspects of this standard, and is of primary interest to developers, Web site designers, data architects and anyone interested in distributing their data around the Web." So if you have a steady stream of information for your customers, family, or fans, read on for the rest of Alex's review.
Content Syndication With RSS
author: Ben Hammersley
pages: 222
publisher: O'Reilly
rating: 8/10
reviewer: Alex Moskalyuk
ISBN: 0596003838
summary: Introduction and guide for RSS implementations
The first three chapters primarily discuss the multiplicity of RSS standards. While with some other technologies this might seem a bit excessive, remember that RSS is a forked project, with the forks at this moment bearing little resemblance to one another. Even the abbreviation has different expansions: RSS means Really Simple Syndication if you are using RSS 0.91 or RSS 0.92, which were developed by Dave Winer. RSS means RDF Site Summary if the version you're using is RSS 1.0. The development credits in this case go to the RSS DEV team. To confuse you even more, the RSS 2.0 standard is expanded as... correct, Really Simple Syndication again.
Hence chapter 4 discusses Winer's implementation (simplistic and user-friendly), while chapter 6 focuses on RSS 1.0 (RDF-compliant and data-architect-friendly), and chapter 8 talks about RSS 2.0 (improved RSS 0.9x). Chapter 4 is available online as a PDF file. Section 4.4 is recommended for those interested in promoting their RSS feeds, as it provides a pretty good reference to metadata.
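As a rough illustration of the fork described above, the two families can usually be told apart by the root element of the feed: the 0.9x/2.0 line uses an rss root element carrying a version attribute, while RSS 1.0 uses an rdf:RDF root element. The following is a minimal Python sketch, not taken from the book; the function name rss_flavor and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI of RDF, used by the rdf:RDF root element of RSS 1.0 feeds.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def rss_flavor(xml_text):
    """Guess the RSS family of a feed from its root element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if root.tag == "rss":
        # RSS 0.91, 0.92 and 2.0 declare their version in an attribute.
        return "RSS " + root.get("version", "unknown")
    if root.tag == "{%s}RDF" % RDF_NS:
        # RDF-based feeds; RSS 1.0 is the common case.
        return "RDF-based (e.g. RSS 1.0)"
    return "unknown"


# Hypothetical, heavily trimmed sample documents.
SAMPLE_2_0 = '<rss version="2.0"><channel><title>t</title></channel></rss>'
SAMPLE_1_0 = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns="http://purl.org/rss/1.0/"><channel/></rdf:RDF>'
)

if __name__ == "__main__":
    print(rss_flavor(SAMPLE_2_0))
    print(rss_flavor(SAMPLE_1_0))
```

A real aggregator would need more care (for example, the early RSS 0.90 format was also RDF-based), so treat the returned labels as a first guess only.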
Chapter 9 is perhaps of special interest to Web developers and administrators out there. It presents several code samples to properly parse RSS and present the result in readable HTML. The examples include (a) parsing with XML::Simple in Perl, (b) parsing with Perl regular expressions, (c) parsing with XML::Simple and sending the headlines to cell phones via WWW::SMS, (d) parsing via XSLT transformation. Python, PHP and ASP folks might feel left out due to the abundance of Perl examples, but if you got this far in the book, you can probably adapt the regular expressions example or search for appropriate support for the RSS format in your preferred language.
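For readers outside the Perl world, the parse-then-render idea can be sketched in Python with the standard library alone. This is an illustration under stated assumptions, not the book's code: the sample feed, its example.com URLs, and the helper names headlines and to_html are all hypothetical, and the sketch only handles the RSS 0.9x/2.0 layout (items nested inside channel).

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical RSS 2.0 feed used only for demonstration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>Hypothetical sample feed</description>
    <item>
      <title>First &amp; foremost</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def headlines(xml_text):
    """Return (title, link) pairs for each item in an RSS 0.9x/2.0 feed."""
    root = ET.fromstring(xml_text)
    pairs = []
    for item in root.findall("./channel/item"):
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="")
        pairs.append((title, link))
    return pairs


def to_html(pairs):
    """Render headline pairs as an HTML list, escaping text and URLs."""
    lines = ["<ul>"]
    for title, link in pairs:
        lines.append(
            '  <li><a href="%s">%s</a></li>'
            % (escape(link, quote=True), escape(title))
        )
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_html(headlines(SAMPLE_FEED)))
```

Using a real XML parser rather than regular expressions means entities such as &amp; are decoded on input, which is why the titles are escaped again before being written into HTML.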
Going beyond the standard itself, RSS directories, aggregators and readers are discussed. The author distinguishes between the last two by classifying Meerkat-like services as aggregators, and desktop or Web applications designed to present the information to the user as readers. The chapter also provides information about Syndic8 and its API, and describes the feed registration process. O'Reilly's Meerkat is also discussed in the chapter, together with a reference table for its API (you can make Meerkat generate HTML or RSS news headlines on a certain topic or using certain keywords by providing the right query to its Web interface).
The book is quite a smooth read for a text describing the details of a data specification. The chapters are informative and the book is not overloaded with useless information just to increase the page count. The tips are quite useful for someone who is new to the field, and they answer some questions not covered by the standards (e.g., how often should you request an RSS feed, what to do if you're being screen-scraped, etc.)
I like the way the author divided the chapters into RSS 0.9x/2.0 and RSS 1.0 and kept the two worlds apart. Most of the time you probably won't be interested in developing a feed to support both standards, but would like to focus on just one. The examples in Perl are fine with me, although for someone new to Perl or programming in general those examples with abundant regular expressions might look a bit convoluted. Kudos to the author for not padding the topic, as many do, by providing an example script for RSS manipulation in every possible language out there.
What's missing? I wish more pages were dedicated to desktop RSS readers. FeedReader, HotSheet, Syndirella, Beaver and SharpReader are excellent end-user applications currently gaining some popularity among those who'd prefer to browse their favorite headlines at a glance, instead of going to a dozen sites every morning. To be fair, there's a huge list of readers in the appendix, and some of the applications mentioned above only came around in the last few months, which was probably after the book went to press. Some sites also didn't make it into the book. I like DailyRotation and FreshNews, which borrow from Meerkat's versatility and provide their own feed portals.
Overall, the book is a pretty good developer's guide to the RSS standard. Accompanied by helpful illustrations and numerous tips, it's an excellent resource for those unfamiliar with RSS and a helpful reference for those who have been doing Web syndication for a while.
You can purchase Content Syndication With RSS from bn.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page. -
Curious Yellow, Superworm
jpmccord writes "Brandon Wiley's white paper, Curious Yellow, describes "a superworm -- a worm that coordinates its actions among infected hosts and launches a massive distributed denial of service attack on any hosts it can't infect using those it can" (via disLEXia, a weblog by Maximillian Dornseif). The "doomsday scenario" frightens "even us", says Dornseif. An accompanying discussion rebukes Wiley's article a bit. Aaron Swartz's light-hearted take is rather entertaining: "So go read it now and find out how you can take over the whole Internet. And if you're going to, could you give me 24 hours notice?"" -
Eldred Transcript, Bookmobile Experience
Patrick writes "The transcript of the oral arguments in Eldred v. Ashcroft is now online." Such exciting lines as: "CHIEF JUSTICE REHNQUIST: Well, but you want more than that. You want the right to copy verbatim other people's books, don't you?". See previous stories about the oral arguments and Lessig's thoughts on them. chromatic writes "The O'Reilly Network has just published Richard Koman's Lessons from the Internet Bookmobile about his travels with Brewster Kahle to Eldred v. Ashcroft. I particularly like how he describes the universal positive reception."