Ask About 10 Years of Free Web Publishing
This week's Slashdot questions go to Paul Jones, director of ibiblio.org (formerly MetaLab, before that SunSITE) since it first went live in August 1992. Ibiblio hosts the world's largest Linux archive (including the LDP), plenty of streamed and downloadable music, the world's longest-running Web cartoon (Dr. Fun), and thousands of texts on topics too numerous to list here. This is truly "the public's library and digital archive," 100% GPL, copyleft, and/or public domain, sponsored jointly by the Center for the Public Domain and UNC. Lots of people talk about free online publishing. Paul Jones just does it, day after day, year after year. Ask him whatever you want; we'll send 10 of the highest-moderated questions to him and post his answers as soon as we get them back.
DRM? Palladium?
What's your take on these two technologies?
Are you afraid they'll ultimately destroy what you have been working for, for the past 10 years? If not, why?
Optional question: What about the copyright extension we have seen?
Another optional question: Linux... or BSD? =)
Looking back on 10 years of doing this, what would you classify as your greatest success, and your greatest failure?
Personally I'd really like to know the bandwidth usage, hits, cost, and other boring logistical statistics the site produces...
*HOW* many gigs per day, HOW much cost per day, how many people download the latest Linux ISO on their cable/DSL just because they can?
Sunsite (as I'll forever call it) isn't just a measure of the pulse of Linux penetration, it's been the heart of it for me over the years. -_-()
What is the center's view on the publishing of material that might be considered "offensive" or "dangerous", and does the center make subjective judgements upon the importance of one piece of intellectual property over another on the basis of 'artistic worth', 'decency', etc.? With only limited resources available to promote the archiving of data, is there the risk that important fringe documents may be left by the wayside, or ignored due to political/social concerns?
What do you do for revenue? Most free hosting services are plagued with crappy, obtrusive ads and pop-up/under windows that annoy me to no end. I try to avoid these sites (i.e. GeoCities/Angelfire); however, you don't have much in the way of ads, so how do you have any capital? (And if you wouldn't mind telling the Slashdot editors, maybe they can remove some of the larger ads on the site...)
--fetch daddy's blue fright wig, i must be handsome when i release my rage
Which ones get the most traffic?
What I'm listening to now on Pandora...
One of the things that people frequently ask about sites like ibiblio.org
is "They are great. But how long will they be around?"
Do you see this as a concern (esp. after the LWN announcement) and do you have any
comments regarding this? Are there any good approaches you suggest (like augmenting
free usership with voluntary subscriptions, etc.) for such free sites in general?
Thanks.
DO NOT PANIC
In general, how supportive have you found the producers of such content to be of your services? Do many, if any, really believe that something like this will cause them to starve to death?
It's amazing how spiritual an elaborated beer commercial can be. -- Philip K. Dick
What's your backup strategy? I imagine it's hard to deal with both so much data as well as being under constant bombardment from clients around the world. How often is data archived? Have you had any major data loss incidents and, if so, how well were you able to deal with them?
First, you should know that we're in the midst of a big Web page redesign. We'll be moving our main pages from http://promo.net/pg to http://ibiblio.org/gutenberg (with virtual domains of course: gutenberg.net and friends). We'll be addressing many of your concerns. You heard it here first.
Second, I agree our "finding aids" (in library terms) are poor. It's my #1 priority to get this stuff working better, and in fact several people are working right now to put a new database-driven system into place.
Responses to your questions: ASCII over HTML: We take everything, but also try to make sure we have a plain ASCII file in addition to other formats. Most volunteers give us just text, since that's what comes from their OCR of books. In the near future, we will have automatic conversion on the fly into nearly ANY format, starting with Braille, then adding HTML, XML, PDF and others including PDA eBook formats... text too, of course.
small print: Since November 2001 the small print at the start is only 35 lines or so, including the title, author, pub date, etc. The long annoying legalese is at the end now. The automatic conversion process mentioned above will enable us to put the most recent header (with the short front part) on all the older content. As to "why do we need the legalese," read the small print itself; it's pretty clear.
server indirection: this is one of those finding aids problems we'll overcome. A cookie or other configuration would do the trick here...
bibliographic information: All the recent (last year or two) texts include this right up top. Even the older ones include a "release date" or something similar. The improved finding aids will let you search by publication date, by the way.
We're actively discussing this stuff on the Project Gutenberg Volunteer's Discussion List; see the mailing list subscription info for how to subscribe.
Dr. Gregory B. Newby
Chief Executive and Director
Project Gutenberg Literary Archive Foundation
http://gutenberg.net
A 501(c)(3) not-for-profit organization with EIN 64-6221541
I noticed that you are one of the founders of the American Open Technology Consortium and/or GeekPAC - the lobbying group that got a bit of fanfare a few months back when it was formed, but has been pretty quiet since then.
With Congress launching seemingly daily attacks on our technological freedom in order to support the revenue models of a few huge businesses, the need for a voice in Washington is growing urgent. Is the AOTC/GeekPAC working to get our voices heard? Is there a need for an umbrella group to tie together various groups like GeekPAC, Public Knowledge, Digital Consumer, etc.?
I've downloaded my share of things, and find that the 3 Mbps cap on my cable modem is almost always my bottleneck. So my question is fairly simple (albeit broad) -- can you describe your setup a bit, in terms of bandwidth (both what you have for an Internet connection, and how much traffic you actually use), servers, storage (I'd venture to guess it's to the tune of several terabytes?), etc.
________________________________________________
suwain_2
Over the past ten years, what has been the most personally rewarding part of your work?
iBiblio stands out as an excellent repository for a wide range of culturally valuable resources. As it and other sites grow in size, the importance of good searching and indexing becomes extremely relevant. Have you given any thought to how you might want to cope with this? Specifically, are there any metadata schemata that you are considering using? I would love to see iBiblio be used more like a content feed to research/cross-referencing applications.
I heard you talk at the Southern Presses conference last year about the use of trust metrics (like Slashdot's karma and Advogato's peer certification) as a possible alternative to the "top-down" means of filtering that scholarly and commercial publishers use, namely formal peer review and mass marketing, respectively. Are you more or less optimistic about the long-term viability of this model than you were then? (Especially in light of the powerful efforts to keep control of the gates we're seeing these days from Hollywood, the recording industry, and their political allies...)