Domain: dlib.org
Stories and comments across the archive that link to dlib.org.
Comments · 24
-
Re:What a flood of garbage
Things nobody has mentioned:
- The relevant search term is "digital preservation". The Library of Congress has an active project.
- One fundamental problem is what the underlying storage medium should be. Microfiche is a well-established choice, and lasts at least 20 years.
- Once you can store a bit-stream, the other fundamental problem is what format to use. You could use a lowest-common-denominator format, and include directions on how to decode it. A better option is to use whatever format the digital preservation community is standardizing around, since they are likely to maintain open-source decoders for those in the future. For images, uncompressed TIFF and JPEG 2000 are common choices.
-
Blah blah
"Actually, you are arrogant, ignorant, self-centric, and you talk through your behind."
No, actually he's right. It sucks for you, but it's true.
"There are lot more than just english-language web sites"
If by a "lot more" you mean "an insignificant amount as compared to" then yes.
http://www.dlib.org/dlib/april03/lavoie/04lavoie.html
The info is a few years old, but seeing as the trend was toward MORE english sites not fewer, I'd say it's probably pretty close.
The rest of your post was just a juvenile rant about your insecurity, which we could really do without.
-
Re:Second Life Hype
There are currently 36,511 users currently logged into SecondLife right now. Somewhat hard to call that 'no one'. That's quite a lot, but it still pales in comparison to many online games.
Soccer sim Hattrick http://www.hattrick.org/ usually has more than that (not right now, it has 13.000 because it's 2 am in Europe, but on weekends and Wednesdays it reaches 50.000 users simultaneously connected).
I don't play it, but according to this site, World of Warcraft reaches 200.000 simultaneous users!!! and for all I know it could reach millions... http://www.dlib.org/dlib/december05/kirriemuir/12kirriemuir.html
Other games I know of like Magic Online also have 10.000 to 15.000 users online at certain times, and googling showed EVE Online can have 50.000.
-
Two types of publishers
I forgot to mention something relevant that is known to all academics but apparently unknown to many /. readers.
There are two types of publishers: for-profit and non-profit. The for-profit publishers are commercial corporations like Elsevier. They, as is their duty to their shareholders, charge what the market will bear, do everything they can to jack up their prices, etc. One for-profit journal might cost an individual $200/year; while a library would pay $500-$1000/year or more. All numbers are approximate.
The non-profits are the professional societies like IEEE. In the US, a non-profit organization is allowed the privilege of being a non-profit in return for providing some benefit to society. IEEE's income is membership fees (I pay IEEE $200/year incl some journals), conference registration fees (perhaps $200/day), journal subscriptions ($40-$100/year), and misc. The professional societies set the prices just high enough to break even (and pay overhead). That's a totally different philosophy.
Even a narrow field may have 10 relevant journals. If your work is interdisciplinary, then there may be 30 journals (and many conferences) that occasionally publish something interesting. Everyone is starting new journals.
While both classes of journals are technically obsolete, only the for-profit journals' prices are breaking the libraries' budgets. First the smaller colleges like RPI got hit. (RPI cancelled most of its print journals and is now cancelling many of its online subscriptions.) However even Stanford, Cornell, etc, are now feeling the pinch. The current solution for the poorer libraries is to pay for any individual articles that researchers ask for. In response, the commercial publishers may now charge $20 for one article, and that's rising.
As a researcher, I feel it my moral duty to support the society journals whenever possible. However, sometimes the publishers' journals are excellent. There's a feedback loop here. A journal is defined to be good if papers (mostly in other journals) cite its papers. Therefore people want to publish there, so its editors get to select the best, etc. This is related to the concept that sometimes the best SW for a particular app is commercial.
More tidbits:
In at least two recent cases, the complete board of editors of a for-profit journal have gotten so angry at their own journal's price (set by the publisher) that they've quit en masse and formed a competing non-profit journal.
For-profit publishers can be sensitive on this topic. Around 1994, Gordon & Breach sued the American Institute of Physics and American Physical Society for publishing a survey showing that their journals were among the most expensive. AIP/APS won. See http://barschall.stanford.edu/index.html
There are many online stories about this. Librarians have been debating it for more than 10 years now. Here is one ref: http://www.dlib.org/dlib/march01/frazier/03frazier.html
My own feeling: It's time for a reorganization of the whole higher-ed and research system. Abstractly, things have never been better (in many fields of CS). I can do research on a laptop; I can learn what a researcher in Tasmania is doing from his website. However, the institutional system is more and more obsolete and irrelevant, and indeed, more and more, hindering progress.
-
Re:Google's got a long way to go . . .
Most libraries' collections are very similar to most other libraries' collections, and the greatest overlap occurs with the books that are the most important.
Because the original Google 5 libraries have their holdings entered into WorldCat, a statistical study was done that showed that those five libraries would account for 33% of the 32 million books in that database. It also showed that 61% of the books held by the Google 5 are uniquely held by only one library. Essentially, the holdings of libraries follow a common pattern of a short head followed by a very long tail. If, even with their long tails, these 5 major libraries account for only 1/3 of the books that libraries have entered into WorldCat, imagine how many libraries it will take to find and digitize the long tail of that one bibliographic database.
Less ephemeral works (the kind typically preserved in library collections a century later) generally all had their copyrights renewed in the U.S
The rate of copyright renewal was very low. According to Lessig ("Free Culture" p. 135) "In 1973, more than 85 percent of copyright owners failed to renew their copyright." I've seen estimates that about 90% of the books published between 1923 and 1978, when renewal was abolished, were never renewed. That means that there are MANY public domain books in that time frame, only we can't easily know which ones they are. You can look them up in the renewal database, but my impression is that the database is not considered to be complete, and therefore not entirely reliable. If you find the book in the database, it was renewed. If not...
-
Use Dspace.
For you I'd recommend a digital repository like Dspace.
-
Second part
here is the second part of the article
-
IEEE, already Green, considers going Gold
IEEE has already gone "Green" -- i.e., it is among the 78% of publishers (publishing 92% of the 8950 journals surveyed to date) who have already given each of their authors the green light to provide open access to their own articles, if they wish, by self-archiving them in their own institutional OA archives. IEEE is now contemplating also going "Gold" -- i.e., becoming one of the 5% of publishers that are open-access publishers, making all of their articles open-access (and many of them recovering their costs by charging the author-institutions for publication by the article instead of charging the user-institutions for access by the journal or article). Going Gold is not without an element of risk, so IEEE are to be highly commended if they actually decide to try it, but let us not forget that, being already green, IEEE are already on the side of the angels! It is the authors (and their institutions and funders) -- i.e., the research community itself, the very ones for whom the benefits of open access are being sought -- who are to blame for not yet going when the going is Green, by self-archiving their own articles so as to make them open access. Relief may be on the way there too, however, in the form of a proposed new recommendation to the 55 major research institutions worldwide who have signed the Berlin Declaration on Open Access that they should now implement an explicit Institutional Self-archiving Policy of providing open access to their own research article output. (A summary will appear in the March issue of D-Lib Magazine.) Two recent international surveys have found that whereas most authors do not yet self-archive, 79% will do so willingly, but only if and when they are required to do so by their employers and/or funders.
-
More About DRM
If you want all the details about DRM, you can find them here:
http://www.dlib.org/dlib/june01/iannella/06iannella.html
and
http://en.wikipedia.org/wiki/Digital_rights_management
-
predecessor: robust hyperlinks
There were two fellows at UC Berkeley (Phelps and Wilensky) who implemented the idea of "fingerprinting" web pages at least as far back as 2000. It was a non-trivial fingerprinting (i.e. not just MD5 hash of a web page).
As far as I know, they haven't done any more recent work on this and the software is only available via archive.org.
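For anyone wondering what a fingerprint that is "not just an MD5 hash" could look like, here is a minimal sketch. It assumes a lexical-signature style of fingerprint: pick the few words that are frequent in the page but rare in a reference collection, so the page can be found again even after it moves or is lightly edited. The function names, the toy corpus, and the scoring formula are made up for illustration; this is not the Phelps and Wilensky software.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(page, corpus, size=5):
    """Return the `size` words that best distinguish `page` from `corpus`.

    Hypothetical scoring: count of the word in the page, times the log of
    a smoothed inverse document frequency (a TF-IDF-style weight).
    """
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(doc_sets)
    counts = Counter(tokenize(page))

    def score(word):
        doc_freq = sum(1 for words in doc_sets if word in words)
        # Add-one smoothing keeps the ratio finite for unseen words.
        return counts[word] * math.log((n_docs + 1) / (doc_freq + 1))

    # Highest score first; ties are broken alphabetically so the
    # result is deterministic.
    ranked = sorted(counts, key=lambda w: (-score(w), w))
    return ranked[:size]


# Toy reference collection (an assumption; a real one would be far larger).
corpus = [
    "the library keeps the books on the shelf",
    "the web page links to the other page",
    "the search engine indexes the web",
]
page = ("robust hyperlinks keep working when the page moves "
        "because robust hyperlinks carry extra words")

signature = lexical_signature(page, corpus, size=3)
print(signature)
```

Unlike a hash, a signature like this survives small edits to the page, and the words could be handed to a search engine to relocate a page whose URL has died.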
A paper
I gather that the IBM effort is different in significant respects, but it certainly employs ideas from Phelps & Wilensky.
-
Here's one where you can simply sing into it.
-
MS-Passport is inherently insecure
MS Passport is inherently insecure and cannot be made secure, even in theory. To claim otherwise would be false advertising. Not to mention that in the terms of service you hand over any privacy you once had, see the FTC link above again for an example of abuse.
I'd be especially wary of sites locked into ASP or .NET, not just for the inherent security problems. PayPal, for example, is at potential risk, as it is owned by eBay. But read the changes to HotMail or other similarly MS-Passport encumbered services.
There are ways to do secure, platform independent, centralized authentication for web and other services, but MS-Passport isn't one of them. See Kerberos + LDAP instead. If you don't wish to experiment on *BSD or something else, all the major Linux distros include both clients and servers. There are even ways of scaling enormously. Universities and libraries with electronic subscriptions should be able to get the most mileage out of Kerberos.
-
A typical day on Slashdot..
Do as I post...
'tongue in cheek'
Haven't you heard? Information wants to be free.
'tongue out of cheek'
-
Re:libraries
Dear Robot,
You're missing a few book-business idiosyncrasies that complicate your analysis. Firstly, costs of print publication are not dropping dramatically. Process improvements have helped hold the line on costs, but up to 85% of print costs is the paper itself and paper as a commodity is highly volatile. 1995 saw a fourfold increase in paper costs for all printers and contributed mightily to the consolidation in the newspaper industry, for example. This necessitates maintaining high inventories relative to other manufacturing processes in order to hedge raw materials price swings.
Another aspect of the publishing market you may not be aware of is an anachronistic feature of book distribution contracts termed a buy-back clause. What this means is that publishers commit to repurchasing unsold inventory from distributors. This habit has some quirky historical background and is very uncommon in other industries. It lingers both because distributors are loath to give up this safety net and because of the high cost of inventory maintenance for print publications. The upshot of this is that the line between production and marketing costs is very fluid for most publishers. High print runs signal a publisher's commitment to a book and can be considered an important aspect of marketing, but they also build in a cost overhang. Among other things this inefficiency contributes to the durability of book genres and big names in print. New authors and untried formulas are highly risky. This also contributes to some egregious contractual terms for authors. Did you know that the vast majority of publication advances to authors are refundable? That is, if the book fails to sell sufficiently for the author's royalties to cover the advance, the difference is the author's liability. In fact, most new works of fiction fail this test and most fiction authors' first books are consequently their last.
As to the cost basis for libraries, I refer you to this article
I regret to say that the absurdities /. finds in the infrastructure of industrial music production and distribution were invented and refined by the book business for two centuries before the advent of vinyl. The book industry taught the music and film studios how to do the business of mass meme distribution and how to control their creative talent.
-
how can it be secure without drm ?
drm is an important technology that will save the world from Communism and crackers. The DOD needs security and according to world software maker Microsoft, drm is needed to provide better multimedia and security.
Someone please think about our children.
-
Re:isn't this done already?
Actually this stuff is strongly related to research in adaptive hypertext linking.
I know this group ran experiments with web sites that generated dynamic links according to user retrieval patterns in 1996 and before:
-
Heh
Where do I begin?
so we need to justify our posture and pricing.
I'm guessing that MS is going to justify their pricing and secure their posture by pushing DRM. (Another good DRM site, here)
He acknowledged there was more to Linux than free software--the main benefit of the open-source movement was the community developing software and sharing ideas. "Linux is not about free software, it is about community,"
Absolutely correct. Those who actually use the product get input into its future. Unlike most commercial software, where users are force fed a marketing department's idea of what is or isn't important.
Ballmer hits on an important issue: the Linux community. Here is a group of people that are as diverse as you can possibly get, yet share a single OS and philosophy. But, Ballmer completely misses the ideal behind community.
For nine years, the company has designated users with particular skills--usually seen by how often they intervene helpfully in newsgroups--as "most valued professionals". Currently there are about 1,200 MVPs, half of whom are in the United States.
The title is highly regarded, said Thomas Lee, a Windows 2000 MVP who specializes in directory issues, and has just been appointed as chief technologist at QA Training. "You are recognized by your peers, not by an exam that you can cheat in."
MS believes that they can create their own community, when in fact they will only succeed in alienating more people with their elitist attitude and the MVP award.
-
ip conservancy
Many companies "promise" that they file for patents for defensive purposes only. Please. Maybe Red Hat is really telling the truth, but in general one should never believe what a publicly traded company promises to do. One should assume that the publicly traded company will try to maximize profits for their shareholders.
If a company truly is filing for a patent for only defensive purposes then they would donate it to an intellectual property conservancy, like The Knowledge Conservancy run out of Yale. That way a company won't be tempted to try to cash in on their IP if they have a change of heart about their "promise" or if they get bought by someone else. Hopefully we can learn something from the CDDB debacle.
-
Re:Why not linux?
Windows = vendor to blame if stuff goes wrong.
You're forgetting that these are people in government here. Rule #1 in bureaucracy decision-making: CYA. Cover Your Ass.
As a side note, it is interesting to see that hardly anyone is complaining about how Compaq is being considered as the hardware vendor. There are a couple of posts suggesting Macs be chosen but in that case the differentiating factor is really the OS again.
Side note #2, this reminds me of a very similar project that was undertaken in France, about 20 years ago. I believe it was the 'Minitel' service. It involved the free distribution of about a million 'kiosques' in Paris. They were basically funky little TTY terminals that connected to 'Videotex' computerised French phone directories. A short history of the project is given here. The service is still going, in fact (see www.minitel.fr). The history is interesting in terms of the interaction between French 'dirigisme' and the encroachment of the Net.
-
I actually support DRM
You know what, the coming era of DRM is not really going to be that bad. Let me explain...
First of all, I want to get one thing straight. Stealing music is illegal. Whether you disagree with the compensation given to the musicians or whatever, when you steal an MP3, they get nothing. The general consensus on /. has been that the people here do not use MP3 (or whatever) file format for pirating music (whether they are lying or not, I don't know).
That said, I think that Digital Rights Management, properly implemented, could be a great thing. This article gives a good overview of how it might be implemented. Basically, it organizes information into a WORK, an EXPRESSION of that work, a MANIFESTATION of that expression, and an ITEM as a part of that manifestation. For example (the example they give on the page) the work could be The Name of The Rose by Umberto Eco. The expressions of that work could include the original, an English translation, etc. The manifestations, say, of the English translation expression could be the book and the book-on-tape. The items of the "book manifestation" could include an actual hardcover book or an e-version from some website.
When you buy something, you have digital rights for either the work or any of the sub-levels. Owning the rights to the expression (the english translation) would get you all of the manifestations and items below that. Of course, most people would only own rights to one or a couple "items".
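To make the levels concrete, here is a rough sketch of how rights could flow down that hierarchy. It assumes the simplest possible rule (owning a node gives you everything beneath it); the `Node` class and `has_rights` function are names I made up for illustration and do not come from the linked article.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """One level of the hierarchy: work, expression, manifestation, or item."""
    level: str
    name: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add(self, level: str, name: str) -> "Node":
        """Create a child one level down and link it to this node."""
        child = Node(level, name, parent=self)
        self.children.append(child)
        return child


def has_rights(owned: List[Node], target: Node) -> bool:
    """True if `target` is an owned node or sits below an owned node."""
    node: Optional[Node] = target
    while node is not None:
        if any(node is held for held in owned):
            return True
        node = node.parent  # walk up towards the work
    return False


# The example from the comment above.
work = Node("work", "The Name of the Rose")
original = work.add("expression", "original")
english = work.add("expression", "English translation")
book = english.add("manifestation", "book")
tape = english.add("manifestation", "book-on-tape")
hardcover = book.add("item", "hardcover book")
ebook = book.add("item", "e-version")

# Owning the English translation covers everything below it...
print(has_rights([english], hardcover))
print(has_rights([english], tape))
# ...but not the original, and owning one item does not cover another.
print(has_rights([english], original))
print(has_rights([hardcover], ebook))
```

A real system would also need rights types (play, copy, lend) and expiry, which this sketch leaves out entirely.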
Now, the main problem with this is that DRM-protected files won't work on legacy hardware. I agree that this is a big problem. (You hear! I agree!) But, I'm interested to hear, discounting this problem, would DRM really be so bad according to you all? If you bought a car-stereo, a portable stereo, a home stereo, and computer running LinuxDRM (or WindowsDRM), and they were all registered to you, you could buy "Metallica - Master of Puppets (Live with the San Francisco Orchestra).DRM.mp3" and it would run on all your DRM-registered items. If you sold one of those stereos, the new owner would want to change the registration in order to play his MP3s, and you could keep copies on everything you want. You could keep backups on every stinking computer in North America, as they would only work for things registered to you.
Now, I can see a couple of problems right away. Hackers would crack the DRM in about 20 seconds from the first one landing in St. Petersburg, and this would be much easier to implement with a central registration system (which in my opinion is unacceptable, but there are ways around it). Any other thoughts?
Wow, that went a lot longer than I thought it would. Note that all opinions are mine and I take responsibility for them.
-Cruz
-
Similar project
-
Lame Slashdot article
This is old news. Nor is it a big secret; IBM has a page on Madison. It's an application of IBM's Java-based Cryptolope technology; the content is delivered as a JAR file. The main technical paper on Cryptolopes describes the technology, and you can even download IBM's free player for Cryptolopes.
We've been seeing too many under-researched articles on Slashdot recently. Sloppy work at the Geek Compound. Go read some Journalism 101 books, guys. Linking to someone else's article is not journalism.
-
The Right to Link?
I've found this article to be quite interesting. I feel comfortable providing this link since Mr. Templeton specifically wrote "Certainly you can feel free to link to these pages!".
Since we're on the topic of "the right to link", I'd like to go off on a slight tangent that has nothing to do with MP3Board. What are the opinions here on framing another site's content without permission? When does framing (or any sort of hyperlinking) represent a derivative work? To clarify, let me pose three different purposes:
- Framing for content - One site frames the content of another, essentially co-opting the work, whether it's attributable or not. (i.e. Washington Post v. TotalNews...a case which settled without a court decision)
- Framing for persistence - One site or service frames links to external content, but the frame retains the brand and any advertising within shared space in the browser window. Does providing a drop-frame function mitigate the practice? (See Hotmail, AskJeeves, and About.Com for examples.)
- Framing for functionality - A less common rationale, but consider a tool like a Web-based proxy service. Some fetch pages, at the behest of the user, and render them within a frameset that includes a navigation function and, perhaps, advertising. What now?
-
Some background reading on Digital Watermarks
Try http://researchweb.watson.ibm.com/topics/popups/innovate/multimedia/html/dahow.html for one introduction. A more thorough background piece is available at http://www.dlib.org/dlib/december97/ibm/12lotspiech.html or http://www.jtap.ac.uk/reports/htm/jtap-034.html
I hope this helps anyone who likes reading research.
Stephen