Domain: archive.org
Stories and comments across the archive that link to archive.org.
Comments · 7,005
-
The Larrikin-Wowser Nexus
Australia has always had a tradition of repressive, authoritarian government and arbitrary authority. After all, it was a penal colony, and a military outpost of the British Empire, holding the line, and standards had to be enforced. Up until the 1960s or 1970s, a lot of things which would be OK in London or New York were strictly beyond the pale in the big cities of Australia. Australian puritanism (or "wowserism") doesn't have the evangelical, light-on-a-hill idealism of the American variety, but tends to be more of a what-will-the-neighbours-think conservatism.
Mind you, Australia also has an equally old opposite tradition of borderline contempt for authority and propriety, commonly called "larrikinism". This is a country where an armed robber is a national hero, an unofficial (and by far more popular) national anthem is about a sheep thief, and more recently, there were (unofficial) national moments of silence and memorials held for an Australian executed in Singapore for smuggling a huge quantity of heroin. The larrikin streak has made an impression on Australian culture in a number of areas, from an old and ongoing tradition of political mischief to highly-developed scenes for activities such as stencil graffiti and urban exploration.
The downside of the larrikin-wowser dynamic is that there is not much of a centre, and not much of a tradition of liberalism and civil society. Since the 1970s, Australia has become more liberal and cosmopolitan, though that was never enshrined into anything like a bill of rights. Consequently, now that a hard-right government has got into power, all the de facto institutions of liberalism are being swept away like so many sandcastles on a beach, and the old authoritarianism is showing through. -
LMA
Here's the URL: http://www.archive.org/audio/
-
Re:WikiLyrics
I hit up archive.org to see if they had indexed the site to any depth. No luck in my preliminary search, but I did find this:
http://web.archive.org/web/19990125090702/http://lyrics.ch/
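The date can be read straight out of the link: Wayback Machine capture URLs carry a 14-digit timestamp (YYYYMMDDhhmmss) ahead of the original address. A minimal sketch of pulling it apart, where the `parse_wayback` helper and its regular expression are illustrative names of my own rather than anything the Archive provides:

```python
import re
from datetime import datetime

# Capture URLs look like web.archive.org/web/<14 digits>/<original URL>.
WAYBACK_RE = re.compile(r"web\.archive\.org/web/(\d{14})/(.+)$")

def parse_wayback(url):
    """Return (capture_time, original_url) for a Wayback capture URL."""
    match = WAYBACK_RE.search(url)
    if match is None:
        raise ValueError("not a Wayback capture URL: " + url)
    stamp, original = match.groups()
    # The timestamp is plain digits: year, month, day, hour, minute, second.
    return datetime.strptime(stamp, "%Y%m%d%H%M%S"), original

captured, original = parse_wayback(
    "http://web.archive.org/web/19990125090702/http://lyrics.ch/"
)
print(captured.isoformat(), original)
```

Wildcard links such as /web/*/... have no timestamp, so the helper rejects them with a ValueError.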
Note the date. Jan 25, 1999. I can't believe it was that long ago. -
libtool
You might be interested in this:
http://www.advogato.org/article/85.html
which links to the open-source metrics:
http://orbiten.org/ofss/01.html
which is dead but is still on the archive:
http://orbiten.org/ofss/01.html
The link doesn't work!
Here is the first table.
Table 1: Top 10 authors ranked by contribution of code
Author                                       % of total
free software foundation, inc                11.231
sun microsystems, inc                         1.848
the regents of the university of california   1.359
gordon matzigkeit                             1.216
paul houle                                    1.042
thomas g. lane                                0.782
the massachusetts institute of technology     0.762
ulrich drepper                                0.559
lyle johnson                                  0.528
peter miller                                  0.525
-
Re:What?
Ahh, "Blockbuster Movie Syndrome": Everything put on film must be exciting.
Yeah, to me the formulaic blockbuster movies are boring as batshit. Some of the technical films in the Prelinger archives http://www.archive.org/details/prelinger are much more interesting. How can you go past a film like Personal Hygiene (Part I) - U.S. Army?
It's unintentionally hilarious, and there are thousands like it in the collection.
Military training drama showing how the residents of a barracks convince a sloppy soldier to clean up his act. With many folk songs on cleanliness. -
Copy of the javascript?
Does anyone have a copy of version 0.9.7 (I think that was the latest one, before it was taken down.) of http://www.ashotoforangejuice.com/jsrisk.js
I tried looking for it on http://web.archive.org/web/*/ashotoforangejuice.com/* , but obviously had no luck. Is there any chance that Google might be caching it? They certainly have the HTML (which is pretty much worthless) here http://216.239.51.104/search?q=cache:RrdLpS5Pm5IJ:ashotoforangejuice.com/gmrisk.html+site:ashotoforangejuice.com&hl=en&client=firefox-a , but I don't know whether Google caches javascript files.
By the way, I have 0.9.5 if anyone wants it. -
Re:The Dead == The Man
You might be swapping "David Gans and Friends" soundboards. But I'm not, like practically all of us. So none of the music we're exchanging was made by Gans. FWIW, the soundboards of Gans' bands are still up on Archive.org, because there's no market for them. Without the free distribution, Gans wouldn't have the same marketing or the same audience, and wouldn't make the same money off the stuff he can charge for.
-
Re:Be Like Mojo
There are many more. No Mojo Nixon though, maybe you should email him about this outlet for his music.
-
The Deadhead headache for the Internet Archive
From what I hear from some Internet Archive people, the Deadheads have become a headache. The Grateful Dead stuff is a tiny percentage of the Internet Archive, which has petabytes of data, including multiple copies of the whole World Wide Web. But the Deadheads are hogging the bandwidth, and because they hit the same stuff over and over, the Archive bogs down. The Archive was designed as a library, without a big caching front end to handle high traffic to a few files. So concentrated traffic in one area slows it down.
The Archive now offers files for streaming, which is a bandwidth hog for music files. People keep playing them again and again. (Especially Deadheads, who are notorious for listening to the same content repeatedly. Possibly due to drug-induced memory degradation.) This is interfering with other queries.
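For anyone wondering what a caching front end would buy here, this is a rough sketch of the idea: a small least-recently-used cache sitting in front of a slow backing store, so repeat requests for the same few files never reach the library machines. Everything in it (`HotFileCache`, `slow_fetch`, the show names) is made up for illustration and says nothing about how the Archive actually serves files.

```python
from collections import OrderedDict

class HotFileCache:
    """Least-recently-used cache in front of a slow fetch function."""

    def __init__(self, fetch, capacity):
        self.fetch = fetch            # called only on a cache miss
        self.capacity = capacity      # maximum number of files kept
        self.entries = OrderedDict()  # oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, name):
        if name in self.entries:
            # Mark as most recently used and serve from memory.
            self.entries.move_to_end(name)
            self.hits += 1
            return self.entries[name]
        self.misses += 1
        data = self.fetch(name)
        self.entries[name] = data
        if len(self.entries) > self.capacity:
            # Evict the least recently used file.
            self.entries.popitem(last=False)
        return data

def slow_fetch(name):
    # Stand-in for reading a file from the backing store.
    return b"audio bytes for " + name.encode()

cache = HotFileCache(slow_fetch, capacity=2)
for name in ["show-a", "show-a", "show-b", "show-a", "show-c", "show-a"]:
    cache.get(name)
print(cache.hits, cache.misses)
```

With traffic concentrated on one show, half of these six requests are served from the cache even though it only holds two files.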
-
bt.etree.org
Many taper-friendly bands choose not to allow their shows to be posted to the live music archive.
See the list of those that have opted out here (after the accepted and pending list):
http://www.archive.org/audio/etree-band-showall.php
Phish is a good example. They do allow fans to trade their recordings on bt.etree.org as well as other places. You can buy soundboards from their website. I don't think that makes them greedy or in the same class as Metallica and others.
That said...the Dead archive on etree is just amazing and I hope it stays. I encourage anyone who hasn't ever got the Dead to download some of the higher-rated shows and give them a chance. Great music to code to. -
Jerry wanted the music to be free...
"once we're done with [the music], you can have it." - Jerry Garcia
Bassist Phil Lesh echoed that sentiment--quoting Garcia in an interview with Charlie Rose on CBS's 60 Minutes in 2004: "Jerry put it the best, as he frequently did, 'Let 'em have it. When we play it, we're done with it.'"
from: http://www.archive.org/iathreads/post-view.php?id=49496
The Dead also released a disclaimer about their live music:
MP3 STATEMENT TO MP3 SITE OPERATORS
The Grateful Dead and our managing organizations have long encouraged the purely non-commercial exchange of music taped at our concerts and those of our individual members. That a new medium of distribution has arisen - digital audio files being traded over the Internet - does not change our policy in this regard.
Our stipulations regarding digital distribution are merely extensions of those long-standing principles and they are as follows:
No commercial gain may be sought by websites offering digital files of our music, whether through advertising, exploiting databases compiled from their traffic, or any other means.
All participants in such digital exchange acknowledge and respect the copyrights of the performers, writers and publishers of the music.
This notice should be clearly posted on all sites engaged in this activity.
We reserve the ability to withdraw our sanction of non-commercial digital music should circumstances arise that compromise our ability to protect and steward the integrity of our work.
Jerry Garcia did not care about people taping or downloading their music. He thought any live show could be shared and traded by anyone for their personal use, but not copied and sold for profit. I would think the rest of the band would respect his wishes. Long live Jerry.
http://www.people4peace.net/pix/people4peace/jerry-garcia.jpg -
Night of the Living Dead
Get it at archive.org:
http://www.archive.org/details/night_of_the_living_dead -
Grateful Dead no longer share-friendly
Recently Archive.org was asked to pull recordings of the Grateful Dead they had been hosting - all fan recordings.
Archive.org was allowed to resume hosting microphone recordings for shows - but the soundboard recordings (which were all made by fans, not the Dead) are now only allowed to be streamed.
This implies then that if you are sharing soundboard recordings you are doing so against the wishes of the Dead.
Read the spirited comments on the matter here. Some fans are thankful that they are allowed partial freedoms, others upset that all the effort that went into fan soundboard recordings is being withheld from the people that made it.
There is also a petition to sign here to let the band know how you feel about them going back on their principles. -
I looked just now
http://www.archive.org/audio/etreelisting-browse.php?collection=etree&cat=Grateful%20Dead
I noticed when the Grateful Dead shows went off but I didn't know why. Now it shows there are 1100 shows back there (still not all of them back). Maybe putting the other 1500 back on is why the archive has been running slowly today.
I'm having trouble clicking through the link now, so who knows what the deal is. -
Support share-friendly artists
The answer to the downloading conundrum is easy.
1. Go to http://www.archive.org/audio/etreelisting-browse.php . All the music is legal, live concert, artist permitted, and free. Download Grateful Dead, 311, G Love and Special Sauce, Cracker, Glen Phillips, Andrew Bird, and the Ditty Bops and so on to your heart's content.
2. Listen to commercial-free streaming audio via iTunes (Radio) and other internet media.
3. Reward the artists whose work you enjoy this way by going to their concerts. Reward any artists whose albums you can hear from front to back for free, like Nickel Creek on CMT.com and the Ditty Bops on dittybops.com. -
Re:Fear more than greed
This is about power. The record companies want to dictate how you use their product. They cannot get over the idea that once you purchase something it no longer belongs to them.
If this is true, then they just don't get music (as if they ever did or cared to).
Music is like language, it is a part of _our_ culture, not the record execs' power trip. Sure, a record company can produce a random artist that looks good and can produce a couple of bubble gum hits, but everybody over 15 knows that is not music, and it will only be a forgotten thing except for later releases like "Greatest hits of the '90s" and a memory on the Billboard list. If you don't believe me, go and look back at the "hits" from the 60s and see how many of them are songs that you know or if many of those songs are what you think of as 60s era music.
Music that lasts, lasts for a reason. Look at http://www.archive.org/audio/ for tons of music that is freely available. Look at some of the music trading sites on the net like http://www.dimeadozen.org/. We love music, and it has been a part of the human experience since the first guy beat 2 sticks together.
It's like the South Park episode that shows the poor starving record exec and his mansion and private plane or whatever they showed. That is not music. That is business. Both will survive, regardless of there being a "record business". -
Re:Slightly easier to build...
If you're interested, check out this Flecktones show with guest thereminist Pamela Kurstin.
-
Re:pixelfest
That is pretty cool. I did something of a similar bent a few years back, though with different goals. I wanted to see if people were capable of participating in an art project without being asshats.
http://web.archive.org/web/20021011144257/http://tru7h.org/society/
Short version is, they couldn't. There were some cool things a few people did (that link is one example), but it was always done by one person and some scripts, rather than a group.
Don't have it up anymore; the way I stored the data was pretty inefficient, and it was too expensive in terms of CPU time to keep available. -
Re:Control
I think you misspelled 'archive', chuckle.
Even after fixing that rather ...interesting... misspelling, the link still appears to be broken. Here's a substitute search link with results for the group.
-
Re:Extension of the Blogging Culture
Yes and no. Unlike blogs, podcasts are mostly one-way, with none of the commenting, tagging and cross-linking that characterizes blogging. Podcasting is another form of content syndication. And yes, the technology is so simple now (I use a Yamaha UW500, a USB audio/midi recorder) that anyone with a computer can record themselves doing all kinds of things and slap it out on the Internet for anyone to see. (A hint to save you some bandwidth: if what you're doing is distributable via a Creative Commons license, you can have the Internet Archive host it for you.)
Recording is easy. The tricky part is figuring out how to best build your feed. Besides the standard RSS tags, look at the iTunes extensions.
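To give a feel for what "building your feed" involves, here is a minimal sketch that writes an RSS 2.0 channel with one enclosure and one iTunes extension tag, using only the Python standard library. The `build_feed` helper, the example.com URLs and the episode data are all placeholders of mine; a real feed would want more of the iTunes tags than the single `itunes:author` shown.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by the iTunes podcast extensions.
ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"
ET.register_namespace("itunes", ITUNES_NS)

def build_feed(title, link, description, episodes):
    """Return an RSS 2.0 document (as a string) for the given episodes."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        # The enclosure is what podcast clients actually download.
        ET.SubElement(item, "enclosure", {
            "url": ep["url"],
            "length": str(ep["bytes"]),
            "type": "audio/mpeg",
        })
        # One example of an iTunes extension tag.
        ET.SubElement(item, "{%s}author" % ITUNES_NS).text = ep["author"]
    return ET.tostring(rss, encoding="unicode")

feed_xml = build_feed(
    "Example Podcast",                 # placeholder channel data
    "http://example.com/podcast/",
    "A placeholder show description.",
    [{
        "title": "Episode 1",
        "url": "http://example.com/podcast/ep1.mp3",
        "bytes": 12345678,
        "author": "Example Host",
    }],
)
print(feed_xml)
```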
Eric
Just put out my first (long!) podcast -
Re:Weak
Bram Cohen has in fact condoned piracy, at least until mid-2003. Check out this little piece, now removed from his website, but still accessible via wayback: http://web.archive.org/web/20030602145959/bitconjurer.org/a_technological_activists_agenda.html
"I build systems to disseminate information, commit digital piracy, synthesize drugs, maintain untrusted contacts, purchase anonymously, and secure machines and homes...I refuse to work on technology to track users, analyze usage patterns, watermark information, censor, detect drug use, or eavesdrop. I am not naive enough to think any of those technologies could enable a 'compromise'."
He was the last person I'd have expected to deal with the MPAA, given what his rhetoric used to be. -
Meta-Modding
Insightful (thanks for the reminder, temojen!)
-
Re:a work of love
Do they have anything to do with Archive.org's 78s archive? Because I'd love to see a unified archive, with a choice of whichever conservator's GUI I prefer.
-
Nice of y'all to join us
This is news? Was the poster not aware that Roedy's unmaintainable code doc has been growing for at least five years? http://web.archive.org/web/*/http://mindprod.com/unmain.html -
Re:Out of Touch with an Old Reality
>In another 2000 years...
Hmmmm, perhaps I should have said "relatively" perfect & enduring. If the half-life of an AOL disk is 20 years, there will still be several thousand of those buggers functional in 4006, bearing a usable but embarrassing browser.
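As a rough check on that half-life figure, assuming plain exponential decay and taking the 2000-year span and 20-year half-life at face value (the function name is mine):

```python
def surviving_fraction(years, half_life_years):
    """Fraction of items still working after `years` of exponential decay."""
    return 0.5 ** (years / half_life_years)

# 2000 years at a 20-year half-life is 100 halvings.
fraction = surviving_fraction(2000, 20)

# Starting population needed for about 1000 disks to remain.
disks_needed = 1000 / fraction

print(f"{fraction:.3e} {disks_needed:.3e}")
```

A hundred halvings leaves a fraction of roughly 8e-31, so "several thousand" survivors only works with a far longer effective half-life or an absurdly large starting pile.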
There is a fundamental difference between physical books and electronic media. In 2000 years, nearly all paper books will have cycled through the biosphere a dozen times, which destroys the information on them. In contrast, archived web pages will very likely still exist as information. Perhaps they will be readable only via old browsers but those browsers themselves are only information, similarly archived and available to researchers who care to figure them out.
There is no need to wikify or continually update works such as the Rubaiyat, which reached their final form long ago. Translations of course require continual update as language changes over the centuries, but the original text of Beowulf is the same today as it was 200 years ago (...plus or minus findings of new texts.)
-
Sounds like Cringely saw a Petabox
The Internet Archive's Petabox is a petabyte of storage in a shipping container. Each rack holds 100 terabytes, and power consumption is 6 kW per rack. Capricorn builds them for the Internet Archive.
Sounds like Google is trying that out.
There's nothing that exotic about this. The military builds racks of electronics into shipping containers all the time. It's mostly a cable management and maintenance access problem. You have to be able to do everything from the front of the rack, which requires some design work but isn't rocket science.
-
The Petabox?
This sounds just like the Petabox being designed by the Internet Archive folks. The projected specs are (ripped from the linked page):
- Low power-- 6kWatts per rack, and 60kWatts for the whole system
- High density-- 100 Terabytes per rack
- Local computing to process the data-- 800 low-end PC's
- Multi-OS possible, linux standard
- Colocation friendly-- requires our own rack to get 100TB/rack, or 50TB in a standard rack
- Shipping container friendly-- Able to be run in a 20' by 8' by 8' shipping container
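The listed numbers hang together if you work them through. A quick sketch, assuming the 60 kW total is all rack power and using decimal units (1 PB = 1000 TB); the variable names are mine:

```python
# Figures taken from the spec list above.
KW_PER_RACK = 6
KW_TOTAL = 60
TB_PER_RACK = 100
PCS_TOTAL = 800

# Assumption: the whole-system power budget is spent entirely on racks.
racks = KW_TOTAL // KW_PER_RACK

capacity_tb = racks * TB_PER_RACK      # total storage in terabytes
pcs_per_rack = PCS_TOTAL // racks      # machines in each rack
tb_per_pc = capacity_tb / PCS_TOTAL    # storage attached to each machine

print(racks, capacity_tb, pcs_per_rack, tb_per_pc)
```

That gives ten racks, one petabyte in total, 80 low-end PCs per rack and 1.25 TB of disk per PC.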
-
Re:Finally!
Not to mention SAP R/3 administration for dummies (which does, in fact, exist!) and Vertex Operator Algebras for dummies (which unfortunately doesn't).
-
Re:Riddled with errors and unsupported statements.
until recently it was entirely clear to the law. Things could have owners and ideas could not.
This is baloney. It's been quite a while since the constitution was written, and right there in Article 1 section 8 clause 8 is the statement by the framers that is the basis for our patent system. Ideas could be owned in 1789, and long before that as well, as England also had a patent system.
Patents (originally) were/are not monopolies on ideas, but on inventions. Those are not quite the same. And originally, all such "inventions" were limited to the physical world. It is only fairly recently that patent offices and courts have started extending what can be protected by patent to the immaterial world.
Even with the latest reform, the USPTO is still paying lip service to the original principle, by demanding a "Concrete, and Tangible Result". Of course, in practice it doesn't exclude much anymore (of course you always want to monopolise real-world actions in the end, and every innovation in the abstract can be applied to the real world if that includes things like "provide a commercial benefit").
And the main problem with these extensions is that they are not based on economic needs, but simply pushed by a small in-crowd who stand to gain from them.
Not to mention the fact that money is an idea, equitable servitudes are ideas, usufructs are ideas, loans are ideas, contracts are ideas, and, now this will really blow your mind --
options on options...
I think you're extending the term "idea" beyond the context in which the author used it. That's easy of course, since "idea" has no legal definition and can be interpreted quite broadly. My interpretation of the article is that the author used idea in a more abstract sense, as in "the idea of using money instead of property", "the idea of lending money" etc.
In this world, size is no protection. It just makes you a more succulent target for enemy lawyers.
I would just like to point out that both sides have lawyers -- this makes it sound like lawyers are the enemy. In fact, lawyers are just the guys that help their clients get what they deserve under the law.
But in general society is better off when fewer lawyers are needed. After all, (and please don't take this personally) all money that goes into lawyers is money which cannot be invested in useful things (like R&D). It's an overhead cost. And by creating more "rights" you automatically increase the number of lawsuits, license agreements etc.
I'm not saying that a world without rights or lawyers would be ideal, but on the other hand extending rights and adding more rights does increase the overhead and at a certain point starts reducing the overall "justice" and "efficiency" of the system.
People with more money have always been able to hire better lawyers in our legal system, and that problem has nothing to do with intellectual property.
It is an argument to balance the situations in which you may need a lawyer though.
The system is supposed to work this way. It incentivizes companies to research and patent things as fast as they can, pushing the limits of technology, and then disclosing them to the public.
That's the theory, but in practice it doesn't always work that way. Witness e.g. Machlup already saying in the fifties:
If one does not know whether a system "as a whole" (in contrast to certain features of it) is good or bad, the safest "policy conclusion" is to "muddl
-
Re:Plan 9?!
-
Some gems from Archive.Org.
http://www.archive.org/audio/audiolisting-browseartists.php?collection=78rpm
A lot of these are transfers from the flat Diamond Discs, not the cylinders dubbed from Diamond Discs. Some of those transfers are pretty freakin' amazing. Lots of history here. Hear Irving Berlin sing. Hear why people raved about Enrico Caruso...makes Pavarotti and Domingo sound like punters. Hear Fanny Brice do her schtick. A lot of what is referred to as "Jazz" is actually more like Ragtime. But that can be pretty amazing too.
I came here looking for cartoony music that had passed into the public domain for my upcoming podcast series The Cartoon Geeks. There's lots of it here. Here's the tune that's going to be the theme music. Yowza yowza. -
Re:Our style!
You are correct that it's a good rule of thumb to just never use identifiers that start with an underscore, but there are exceptions.
From http://web.archive.org/web/20040209031039/http://oakroadsystems.com/tech/c-predef.htm#Groups:
Respect that first entry in the table below: never make up any identifier that starts with an underscore.
(Actually, you can legally use an identifier that starts with an underscore if the second character is a lower-case letter or a digit, and the identifier is used inside a function or a function prototype or as a structure member or label. Easier just not to use leading underscores!)
The parent post uses them inside a function and the second character is lower-case.
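The rule as quoted is mechanical enough to write down. A small sketch that classifies a name using only the wording above; the `underscore_rule` function and its three labels are my own, and it is not a full statement of the C standard's reserved-identifier rules:

```python
def underscore_rule(identifier):
    """Classify a C identifier according to the rule quoted above.

    "ok"         - no leading underscore
    "local-only" - underscore then a lower-case letter or digit: fine
                   inside a function, prototype, struct member or label
    "reserved"   - any other leading underscore (treated conservatively)
    """
    if not identifier.startswith("_"):
        return "ok"
    second = identifier[1:2]  # empty string when the name is just "_"
    if second.islower() or second.isdigit():
        return "local-only"
    return "reserved"

for name in ["count", "_tmp", "_1st", "_Bool", "__init"]:
    print(name, underscore_rule(name))
```

A bare "_" falls into the conservative "reserved" bucket here, since the quoted text doesn't cover it.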
-
Free music
Ah.. there's too much free (legal!) music on the internet nowadays to warrant shelling out cash for 'pop' music from the big dogs.
http://freealbums.blogsome.com/
http://www.archive.org/audio/netlabels.php
http://www.magnatune.com/
Laters... -
What are they doing that is so expensive?
There are ways to cut down the costs: archive.org (the Internet Archive) will host any file, allow unlimited downloads, and mirror it internationally over reasonably fast connections for free. 6GB of transfer and 400MB of storage space can be had online for $12/month (and I'm guessing plenty of /. readers know better deals than that). This is certainly a lot of storage for some fixed (X)HTML+CSS and an RSS file. If one can reliably get free Internet access whenever one needs to upload files, one could make a nice site that is regularly updated and features an RSS feed for less than $80/month.
So, I'm not entirely convinced that one needs to have ads here. -
OCA and PG scratching each others' backs
The focuses of OCA and PG are really quite different: PG is most interested in preserving the essential information of a book (ie, its text), while OCA's interest is in preserving the form of the book (ie, its fonts, page format, coloration, even down to the yellowing of the pages). That having been said, there's a lot each can do for the other (and has!).
The Archive has archived most of PG's material, because even though the Books department of The Archive is focussed mostly on preserving books, The Archive as a whole is interested in preserving just about any information it can, and the PG data is definitely of interest.
When The Archive's Scribe software processes the book images into its various formats (jpg, djvu, pdf, flippy, et al), it OCR's the book's text. This text then becomes part of generating some of the other formats. It will be really trivial for PG to obtain this text for any book it wants to incorporate into their dataset.
qv: intlepisode00jamearch. The interesting files here are intlepisode00jamearch.txt which is just the OCR'd text, and intlepisode00jamearch_djvu.xml which is the OCR'd text with layout information (which has been useful to me in developing software which auto-corrects some OCR errors -- where the text is on the page often offers valuable hints for choosing the right heuristic for guessing the right text).
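To show what "OCR'd text with layout information" can look like to a program, here is a sketch that reads words and their bounding boxes out of a tiny hand-made sample. It assumes the file follows the usual DjVu XML hidden-text layout, with WORD elements carrying a comma-separated coords attribute; the sample content, the numbers in it and the `words_with_boxes` helper are invented for illustration and not taken from the actual file.

```python
import xml.etree.ElementTree as ET

# Invented sample in the assumed DjVu XML shape, not real Archive data.
SAMPLE = """<DjVuXML><BODY><OBJECT><HIDDENTEXT><PAGECOLUMN><REGION><PARAGRAPH>
<LINE><WORD coords="100,220,180,190">sample</WORD>
<WORD coords="190,220,260,190">text</WORD></LINE>
</PARAGRAPH></REGION></PAGECOLUMN></HIDDENTEXT></OBJECT></BODY></DjVuXML>"""

def words_with_boxes(xml_text):
    """Return a list of (word, [coordinate values]) pairs in document order."""
    root = ET.fromstring(xml_text)
    result = []
    for word in root.iter("WORD"):
        # Keep the raw numbers; their exact meaning is not interpreted here.
        coords = [int(value) for value in word.get("coords").split(",")]
        result.append((word.text, coords))
    return result

for text, box in words_with_boxes(SAMPLE):
    print(text, box)
```

Position-based heuristics, like the OCR auto-correction mentioned above, would start from pairs like these.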
A quick side note on the differences between Google's and OCA's efforts that I haven't seen talked about much -- Google's main advantages in their bookscanning efforts are their wealth and fame, while The Archive's main advantages are experience, familiarity, and scanning technology.
Traditional book-scanning technologies are expensive and slow (which makes doing a lot of books, fast, that much more expensive, because you have to hire more people to do more books in parallel), but Google has enough money to throw at the problem that this is less of an issue. Google's fame means they can bring powerful partners onboard with a smile and a handshake, including some of the most prestigious libraries in the nation.
The Archive has been involved in scanning books and making them available online for several years now (qv The Million Book Project). This experience has shaped the processes used in the acquisition and scanning of books, as well as the technology used in their storage, indexing, and presentation. Furthermore, libraries around the world have grown familiar with The Archive over the years. That, and The Archive's good track record, make it a powerful rallying point for partnerships and alliances, and have given it more experience in facilitating such relationships. Finally, partially due to the limits of existing book-scanning solutions, and partially due to The Archive's limited budget, it has facilitated the development of two independent low-cost, reliable, high-quality book-scanning systems: The Scribe (developed in-house at The Archive) and the Kirtas Robot (developed at Kirtas, a Canadian company).
Many of the books scanned for the Million Book Project using traditional scanning methods are really lousy, sometimes to the point of being unreadable. These new scanning systems dramatically improve the quality of the end product, while equally dramatically reducing the cost-per-page. This means that more scanning systems can be purchased for more libraries (avoiding the per-library capital outlay problem), and more books can be scanned more quickly within a given budget.
Obviously, Google and OCA can benefit from co-operation, as each has a lot to offer the other. I'd be surprised if Google didn't join the OCA, eventually, if for no other reason than to gain access to the books of the >100 OCA
-
OCA and PG scratching each others' backs
The focuses of OCA and PG are really quite different: PG is most interested in preserving the essential information of a book (ie, its text), while OCA's interest is in preserving the form of the book (ie, its fonts, pages format, coloration, even down to the yellowing of the pages). That having been said, there's a lot each can do for the other (and has!).
The Archive has archived most of PG's material, because even though the Books department of The Archive is focussed mostly on preserving books, The Archive as a whole is interested in preserving just about any information it can, and the PG data is definitely of interest.
When the The Archive's Scribe software processes the book images into its various format (jpg, djvu, pdf, flippy, et al), it OCR's the book's text. This text then becomes part of generating some of the other formats. It will be really trivial for PG to obtain this text for any book it wants to incorporate into their dataset.
qv: intlepisode00jamearch. The interesting files here are intlepisode00jamearch.txt which is just the OCR'd text, and intlepisode00jamearch_djvu.xml which is the OCR'd text with layout information (which has been useful to me in developing software which auto-corrects some OCR errors -- where the text is on the page often offers valuable hints for choosing the right heuristic for guessing the right text).
A quick side note on the differences between Google's and OCA's efforts that I haven't seen talked about much -- Google's main advantages in their bookscanning efforts are their wealth and fame, while The Archive's main advantages are experience, familiarity, and scanning technology.
Traditional book-scanning technologies are expensive and slow (which makes doing a lot of books, fast, that much more expensive, because you have to hire more people to do more books in parallel), but Google has enough money to throw at the problem that this is less of an issue. Google's fame means they can bring powerful partners onboard with a smile and a handshake, including some of the most prestigious libraries in the nation.
The Archive has been involved in scanning books and making them available online for several years now (qv The Million Books Project). This experience has shaped the processes used in the acquisition and scanning of books, as well as the technology used in their storage, indexing, and presentation. Furthermore, libraries around the world have grown familiar with The Archive over the years. That, and The Archive's good track record, make it a powerful rallying point for partnerships and alliances, and have given it more experience in facilitating such relationships. Finally, partially due to the limits of existing book-scanning solutions, and partially due to The Archive's limited budget, it has facilitated the development of two independent low-cost, reliable, high-quality book-scanning systems: The Scribe (developed in-house at The Archive) and the Kirtas Robot (developed at Kirtas, a Canadian company).
Many of the books scanned for the Million Book Project using traditional scanning methods are really lousy, sometimes to the point of being unreadable. These new scanning systems dramatically improve the quality of the end product, while equally dramatically reducing the cost-per-page. This means that more scanning systems can be purchased for more libraries (avoiding the per-library capital outlay problem), and more books can be scanned more quickly within a given budget.
Obviously, Google and OCA can benefit from co-operation, as each has a lot to offer the other. I'd be surprised if Google didn't join the OCA, eventually, if for no other reason that to gain access to the books of the >100 OCA
-
OCA and PG scratching each others' backs
The focuses of OCA and PG are really quite different: PG is most interested in preserving the essential information of a book (ie, its text), while OCA's interest is in preserving the form of the book (ie, its fonts, pages format, coloration, even down to the yellowing of the pages). That having been said, there's a lot each can do for the other (and has!).
The Archive has archived most of PG's material, because even though the Books department of The Archive is focussed mostly on preserving books, The Archive as a whole is interested in preserving just about any information it can, and the PG data is definitely of interest.
When The Archive's Scribe software processes the book images into its various formats (jpg, djvu, pdf, flippy, et al), it OCRs the book's text. This text is then used in generating some of the other formats. It will be really trivial for PG to obtain this text for any book it wants to incorporate into its dataset.
qv: intlepisode00jamearch. The interesting files here are intlepisode00jamearch.txt which is just the OCR'd text, and intlepisode00jamearch_djvu.xml which is the OCR'd text with layout information (which has been useful to me in developing software which auto-corrects some OCR errors -- where the text is on the page often offers valuable hints for choosing the right heuristic for guessing the right text).
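To give a rough idea of what "layout information" buys you, here is a minimal Python sketch. It is not the software mentioned above. It assumes the _djvu.xml file marks each word as a WORD element with a coords attribute of four comma-separated pixel values, and that y grows downward; the sample XML, the page height, the function names, and the 10% top-margin rule are all made up for illustration. The idea is just that a token sitting in the top margin is probably a page number, so letter-for-digit swaps are a reasonable guess there and a bad one in body text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the assumed shape of a _djvu.xml file.
SAMPLE_DJVU_XML = """<DjVuXML>
  <BODY>
    <OBJECT height="3000" width="2000">
      <HIDDENTEXT>
        <LINE>
          <WORD coords="1000,150,1060,110">l2</WORD>
        </LINE>
        <LINE>
          <WORD coords="300,900,420,860">long</WORD>
        </LINE>
      </HIDDENTEXT>
    </OBJECT>
  </BODY>
</DjVuXML>"""


def words_with_boxes(xml_text):
    """Return a list of (text, (left, top, right, bottom)) for every WORD."""
    root = ET.fromstring(xml_text)
    found = []
    for word in root.iter("WORD"):
        parts = word.get("coords", "").split(",")
        if len(parts) < 4:
            continue  # no usable layout information for this word
        xa, ya, xb, yb = (int(p) for p in parts[:4])
        # Normalise so the result does not depend on the corner order.
        box = (min(xa, xb), min(ya, yb), max(xa, xb), max(ya, yb))
        found.append(((word.text or "").strip(), box))
    return found


def in_top_margin(box, page_height, fraction=0.1):
    """True if the whole box sits in the top `fraction` of the page."""
    return box[3] <= page_height * fraction


def guess_page_number(token):
    """Swap commonly confused letters for digits; None if still not a number."""
    fixed = token.translate(str.maketrans({"l": "1", "I": "1", "O": "0", "o": "0"}))
    return fixed if fixed.isdigit() else None


if __name__ == "__main__":
    for text, box in words_with_boxes(SAMPLE_DJVU_XML):
        if in_top_margin(box, page_height=3000):
            print(text, "->", guess_page_number(text))
```

The position check is what keeps the digit-swapping heuristic from touching ordinary words like "long" in the body of the page.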
A quick side note on the differences between Google's and OCA's efforts that I haven't seen talked about much -- Google's main advantages in their bookscanning efforts are their wealth and fame, while The Archive's main advantages are experience, familiarity, and scanning technology.
Traditional book-scanning technologies are expensive and slow (which makes doing a lot of books, fast, that much more expensive, because you have to hire more people to do more books in parallel), but Google has enough money to throw at the problem that this is less of an issue. Google's fame means they can bring powerful partners onboard with a smile and a handshake, including some of the most prestigious libraries in the nation.
The Archive has been involved in scanning books and making them available online for several years now (qv The Million Books Project). This experience has shaped the processes used in the acquisition and scanning of books, as well as the technology used in their storage, indexing, and presentation. Furthermore, libraries around the world have grown familiar with The Archive over the years. That, and The Archive's good track record, make it a powerful rallying point for partnerships and alliances, and have given it more experience in facilitating such relationships. Finally, partially due to the limits of existing book-scanning solutions, and partially due to The Archive's limited budget, it has facilitated the development of two independent low-cost, reliable, high-quality book-scanning systems: The Scribe (developed in-house at The Archive) and the Kirtas Robot (developed at Kirtas, a Canadian company).
Many of the scans made for the Million Book Project using traditional scanning methods are really lousy, sometimes to the point of being unreadable. These new scanning systems dramatically improve the quality of the end product, while equally dramatically reducing the cost-per-page. This means that more scanning systems can be purchased for more libraries (avoiding the per-library capital outlay problem), and more books can be scanned more quickly within a given budget.
Obviously, Google and OCA can benefit from co-operation, as each has a lot to offer the other. I'd be surprised if Google didn't join the OCA, eventually, if for no other reason than to gain access to the books of the >100 OCA
-
SF already has free Wi-Fi
San Francisco has had free Wi-Fi for quite some time. I had the pleasure of meeting Ralf Muehlen, one of the primary contributors, when I donated equipment to the project last year.
What's interesting is that there's no reason why a lot of Internet access shouldn't be free. We don't pay a service charge for broadcast radio and television. There's an argument that Wi-Fi should be more like ham radio -- you buy your equipment and you're online. Developments in mesh networking, especially where it's possible to relay through multiple nodes, could help make this a reality. Of course we'd still need the wired backbone.
Of course there are a lot of special interests working against this. Not least, the FCC (backed by the current fee-based providers), which is adamant about keeping power limits extremely low for the unlicensed ISM spectrum. Of course the cell phone companies have no problem blasting at thousands of times more power than we can. But that's life in politics I guess.
It'll be interesting to see how this plays out in the next few years, especially with the advent of 802.16.
Please get in touch with someone from sflan if you can contribute bandwidth, equipment, or technical expertise. It's a really good cause. -
Re:Why not join the Gutenberg Project
Actually, Project Gutenberg can be reached from Archive.org's Main Text Page along with some other cool sub-collections. I particularly like the Canadian Libraries. Once you install the DjVu extension you can view the scanned books. All old and out of copyright. Some have very nice illustrations that are now public domain, so they can be copied and used for other projects.
-
Re:NAT is not the answer!
The first SIP implementation came out in Mar 1999. Here is a link to Linksys (via the Wayback Machine, Aug. 1999) stating "As an added bonus, the Fast Ethernet 10/100 Network in a Box now comes with a free copy of Virtual Motion's Internet LanBridge LAN-to-Internet connectivity software with Unlimited user licenses and 15 days of Internet access on the EarthLink Network -- share modems or ISDN connections and get networked today!" They didn't have any "routers" listed on the site.
It seems that people forget that for most of the early Internet days it was all SLIP/PPP. People didn't use NAT. They got a public IP when they dialed up. SIP has been around a long time. Its designers banked on address availability (or later IPv6). The widespread adoption of NAT was not a foregone conclusion at that point. NAT just gained momentum a lot faster. It is one solution to a shortage of IPs; I would just argue that it isn't the best.
-
Rubberhose
So what happens if you're running Rubberhose?
Even if they break out the rubber hoses and you give up a passkey to an aspect, they won't know how many other aspects are on the disk, or whether there are any at all.
P.S. Official site has been gone for some time, but it's still on archive.org -
Maybe it is time to consider netlabels?
-
Here you go, smiley.
Some friends of mine; there are a lot of other bands listed at Archive.org as well. FREE. Plus there are literally thousands of MP3s out there from good bands who want you to download them, FREE.
The problem isn't CDs, it's the major record companies. Buy your CDs from local bands.
-mcgrew
MRC="grapple" -
Old website with video from an earlier version
There was a website a few years ago from a group of students who did a musical version of the original Star Wars in high school. They had online video and audio clips and it looked pretty funny/good. It was also based on rewriting the lyrics for songs from existing musicals (e.g. Andrew Lloyd Webber, etc).
It was performed May 24 and 25 (in 1996) at the Palos Verdes Peninsula High School Performing Arts Center in Rolling Hills, California. Book and Lyrics by Kevin Bayuk, Garrin Hajeian, and John Zuckerman.
I still have an archived copy of the site, including media files.
Here's the wayback machine link: http://web.archive.org/web/19990218201534/newdream.net/StarWars/
I'm assuming this new version is unrelated.
-
Re:Virtual Property
You're missing the whole point of Slashdot. The reason to read the comments is to see what retarded bullshit the Slashdot lemmings (Slemmings???) spout off about the latest bullshit that the "editors" decide to post. It's like watching a train wreck - so horrible you just can't look away....
By the way, Slashdot always sucked. It wasn't any better back in the "good old days." Don't let anyone tell you any different.
(On a side note, the script confirmation word right now is "distress." How fitting.) -
Re: good for your computer
-
99% of the time....
The music that is sold isn't even worth listening to.
Want free lossless music?
-> http://www.archive.org/audio/etreelisting-browse.php?collection=etree/
Support trade friendly artists to "stick it to the man".
Go to their shows.
Buy their swag.
Have fun. -
Easy
"How do you then ensure that the music and player you buy today will not be incompatible with your player, online store or the OS?"
Easy, only buy music from people willing to let you listen to it. Places like emusic and magnatune sell completely unrestricted music files. And shit, archive.org gives away thousands of hours of music for free.
Vote with your wallet. If DRM is unacceptable, don't buy from people who would push it on you. There's plenty of music out there that's not DRM'd, and it's mostly better than the RIAA crap. Good musicians can afford to give music away; there's plenty more where that came from.
If you were treated the same way in a physical store that Apple or Napster treats you online, you'd storm out angrily and never shop there again. Why should online stores be any different?