Slashdot Mirror


User: ericleasemorgan

ericleasemorgan's activity in the archive.

Stories
0
Comments
10

Comments · 10

  1. Not a whole lot, but... on What Do You Do When the Cloud Shuts Down? · · Score: 1

    What do you do when the cloud shuts down? Nothing. Seriously. This is why you strongly weigh the advantages of cloud computing or relying on an outside network in general. I recently purchased a house that has no Internet connection and never will. While there, my computing changes, and I figure out ways to accomplish many of the same goals with a limited set of resources. I sort of look at the whole thing as an interesting computing problem to solve or work around.

  2. libraries are not about books on Gen Y Hits the Library the Most -- But Not For Books · · Score: 2, Insightful

    Libraries are not about books any more than carpenters are about hammers or surgeons are about scalpels.

    Instead, libraries are about the collection, organization, preservation, and dissemination of data and information for their respective audiences. Carpenters are about building things, and surgeons are about healing. For the longest time, information was primarily manifested in books. It is not about the books; it is about what is inside the books. Unfortunately, too many libraries have identified their tool with their trade (profession), and too many librarians have not learned how to exploit computers to change that image. Sigh. No, libraries are not indispensable, but they can save people time, preserve the historical record for future generations, provide a neutral space for people to interact in a community, and educate a population.

    The article outlines some of the ways libraries are trying to reinvent themselves and, at the same time, demonstrates how they are still about data and information for the acquisition and creation of knowledge.

  3. Re:Bad news for the libertarians on Government Makes NIH Research Open Access · · Score: 5, Insightful

    No, speaking as a librarian, this is not bad news at all. In fact, it is a boon. Instead of paying thousands of dollars a year for subscriptions, librarians can, under this legislation, freely collect, preserve, organize, and re-disseminate this research in a way that will benefit all (except the publishers).

  4. Re:About Time on Government Makes NIH Research Open Access · · Score: 1

    +2 - "what the fuck" = +1 # I agree

  5. kinosearch, swish-e, zebra, ht:/dig, etc. on Best Way to Build a Searchable Document Index? · · Score: 1

    There are many ways to skin this cat. I believe most of them have been mentioned, but I will outline my experiences anyway.

    Swish-e is a grand-daddy of an indexer. It can act as a robot, crawl your local file system, or get its input from STDIN. If indexing HTML, swish-e will index the document's meta tags and provide field searching against them. Swish-e comes with C, Perl, and PHP APIs. I don't think swish-e supports anything but ASCII very well.

    Kinosearch is my new favorite. Written in C but with a Perl API, this indexer works a lot like Lucene. Its resulting indexes (files) may be readable by Lucene. Kinosearch works by initializing a "document" with attributes, filling each attribute with values, and saving the document. Searching is fast and easy. It does not support wildcard searching, but uses extensive stemming instead. Kinosearch does not index files from your file system; you must parse your data and feed it to Kinosearch.
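
    The "initialize a document, fill its attributes, save it" workflow can be sketched roughly as follows. This is not Kinosearch's API; `TinyIndex`, `add_doc`, and `crude_stem` are hypothetical names for a toy in-memory index in Python, and the suffix stripping is only a stand-in for real stemming. As with Kinosearch, the caller parses the data and feeds it in.

```python
from collections import defaultdict


def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


class TinyIndex:
    """Hypothetical in-memory index: field -> stemmed term -> set of doc ids."""

    def __init__(self, fields):
        self.fields = tuple(fields)
        self.postings = {f: defaultdict(set) for f in self.fields}
        self.docs = []

    def add_doc(self, doc):
        """Store one already-parsed document (a dict of field -> text)."""
        doc_id = len(self.docs)
        self.docs.append(doc)
        for field in self.fields:
            for word in doc.get(field, "").lower().split():
                self.postings[field][crude_stem(word)].add(doc_id)
        return doc_id

    def search(self, field, word):
        """Return ids of documents whose field contains the stemmed word."""
        return sorted(self.postings[field].get(crude_stem(word.lower()), set()))


index = TinyIndex(fields=["title", "body"])
index.add_doc({"title": "Indexing libraries", "body": "Librarians index books"})
index.add_doc({"title": "Searching", "body": "Indexes make searching fast"})
hits = index.search("body", "indexing")  # matches "index" and "Indexes" via stemming
```

    Because both the indexed words and the query are stemmed the same way, "indexing" finds documents containing "index" or "Indexes" without any wildcard support.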

    ht://Dig is nice, but the last time I looked, it had no API. I found this to be too limiting. It indexes documents.

    The Google Appliance is cool (and kewl) but also very expensive. This black box (well, it is really gold or blue) does a lot of the work for you. Configuring its output is dependent on your ability to do XSLT. You can feed the Google Appliance database dumps and other streams of data. Nice. I still think the price is steep.

    There's Plucene, a Perl port of Lucene. Too slow, and seemingly unsupported.

    Lucene and its kin seem to be the Gold Standard these days. I appreciate that, but alas, I don't have any Java experience. Increasingly people swear by Solr, a Web Services-based interface to Lucene.

    Zebra is an unsung hero. It has been around for more than ten years, actively supported and used extensively in Library Land. (I'm a librarian.) This thing can index just about any kind of document. It supports every type of searching feature (stemming, wild card, fielded, Boolean logic, relevance ranked, etc.). It can read files or be fed things from STDIN. Fast!

    As an added bonus, I advocate readers explore abstracting their search interfaces with something like OpenSearch or Search/Retrieve via URL (SRU). These abstraction layers allow you to create user interfaces to your underlying indexers without worrying what those indexers are. In other words, these layers define the syntax for queries, the transport mechanism to the index, and the structure of the returned result. Given such a framework, you can write an OpenSearch or SRU interface to your index, and if you later decide that Lucene is not what you want to use anymore but Kinosearch is, you can change your indexer without changing your user interface. Very nice. OpenSearch is simpler to implement but is weak when it comes to expressive searches and search results. SRU is more robust but also more complicated.
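
    The idea of swapping indexers behind a fixed interface can be sketched in a few lines of Python. This is neither OpenSearch nor SRU, just an illustration of the same principle: `SearchBackend`, `SubstringBackend`, `WordSetBackend`, and `render_results` are all hypothetical names, and the two "indexers" are deliberately trivial.

```python
from typing import Dict, List, Protocol


class SearchBackend(Protocol):
    """The agreed contract: a query string in, a list of result dicts out."""

    def search(self, query: str) -> List[Dict[str, str]]: ...


class SubstringBackend:
    """Hypothetical indexer A: naive case-insensitive substring match."""

    def __init__(self, docs):
        self.docs = list(docs)

    def search(self, query):
        q = query.lower()
        return [{"title": d} for d in self.docs if q in d.lower()]


class WordSetBackend:
    """Hypothetical indexer B: whole-word match against prebuilt word sets."""

    def __init__(self, docs):
        self.entries = [(d, set(d.lower().split())) for d in docs]

    def search(self, query):
        q = query.lower()
        return [{"title": d} for d, words in self.entries if q in words]


def render_results(backend: SearchBackend, query: str) -> str:
    """The 'user interface': it relies only on the shared result structure."""
    hits = backend.search(query)
    lines = [f"{len(hits)} result(s) for '{query}'"]
    lines += [f"- {hit['title']}" for hit in hits]
    return "\n".join(lines)


docs = ["Zebra in Library Land", "Lucene and Solr", "Swish-e basics"]
page_a = render_results(SubstringBackend(docs), "lucene")
page_b = render_results(WordSetBackend(docs), "lucene")
```

    Here `render_results` never changes when the backend does, which is the property the abstraction layers are meant to give you.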

  6. the scholarly communications process is broken on Libraries Defend Open Access · · Score: 5, Informative

    The scholarly communications process is broken, and it has been this way for at least 15 years. I applaud the efforts of ARL and decry the lies and propaganda articulated by PRISM.

    Again, the process is broken, and there are three contributing factors, listed here in no particular order. First, librarians (and libraries) desire to preserve the historical record for future use. This means they (we) desire to collect and organize as much of humanity's intellectual output as possible in order to foster the growth of knowledge. Idealistic, I know, but it is true. Second, scholars (usually university faculty) have the natural desire for promotion and tenure. They want to be recognized by their peers and rewarded for achievements. This is often realized through publishing journal articles in sets of established venues. Third, publishers have the natural desire to earn as much money as possible. This is the nature of capitalism.

    This three-fold combination (buy everything for the sake of future generations, publish in established venues, and make as much money as possible) has driven the prices of scholarly journals through the roof. For example, just guess how much the average scholarly journal costs per year. If you guessed less than a few thousand dollars, then you were wrong. Twelve issues. Glossy paper. No ads. $3,000/year or more. Just about the worst offender is Brain Research, costing close to $15,000/year.

    Each of the three groups (librarians, scholars/researchers, and publishers) has the "right" to do what it is doing, but in the process I sincerely believe the public gets the short end of the stick. Because the journals are licensed (not purchased) from the publishers, a person needs to be a part of the licensee's membership group in order to read the articles. This excludes the general public, researchers from abroad, and people in third-world countries. How are these people supposed to benefit from the research if they can't have access to the content?

    Open access publishing is seen as one possible solution to these problems. It is very much akin to open source software. Research something. (Scratch an itch.) Write about it. (Document your software.) Deposit it in an archive and give it away. (Make it available for download.) Wait for comments. (Support your software.) Repeat, and enjoy the acknowledgement of your peers.

    Open access publishing is not the answer to everything, just as open source software is not the answer to everything. On the other hand, the public -- who has funded much of the research of scholars through tax-paid grants -- does have the right to access the materials they helped create. PRISM advocates that the commercial sector continue to have control over the distribution process. Such a perspective is a disservice to the nature of scholarship and the freedom of access to fundamental knowledge.

    --
    Eric Lease Morgan
    University Libraries of Notre Dame

  7. Re:The Old Way of Scientific Publishing Needs to G on Open Access For Research Gaining Steam · · Score: 1

    BayaWeaver++

  8. Re:That's great! on Free Global Virtual Scientific Library · · Score: 2, Interesting

    I concur! This idea is way overdue. Do y'all know how much these articles cost in the formally published form? Thousands of dollars a year. If libraries didn't feel compelled to purchase them (librarians are nice), then the journals would dry up and blow away. With the 'Net there is not nearly as much need for journals. Let open access become the norm, not the exception. --A librarian

  9. Re:Such a deal on Video on Demand From the Public Library · · Score: 1

    I concur.

    While I am not one to say, "Buy lots o' books," the fees a library will pay to provide this service will be high, and IMHO the money might be better spent in other places.

  10. Indexing public domain content on Reining in Google · · Score: 1

    I seriously don't understand what all the fuss is about. All that is happening is the creation of an index, a list of words associated with pointers to the words put into context. It is not like you can realistically download the entire book, and I sincerely believe the number of people who will go through the book and download each image will be far smaller than the number of people who will buy the book. These people who are making so much noise would be better off spending their time making more of their content available digitally.
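
    For readers unsure what "a list of words associated with pointers to the words put into context" looks like, here is a minimal Python sketch. It is not how Google builds its index; `build_index` and `in_context` are hypothetical helpers, and the "pointers" are simply character offsets into a sample string.

```python
import re
from collections import defaultdict


def build_index(text):
    """Map each lowercased word to the character offsets where it occurs."""
    index = defaultdict(list)
    for match in re.finditer(r"[A-Za-z']+", text):
        index[match.group().lower()].append(match.start())
    return index


def in_context(text, index, word, width=20):
    """Return a short snippet of surrounding text for each occurrence of word."""
    snippets = []
    for start in index.get(word.lower(), []):
        lo = max(0, start - width)
        hi = min(len(text), start + len(word) + width)
        snippets.append(text[lo:hi])
    return snippets


text = "The world is not coming to an end. The world is just changing."
index = build_index(text)
snippets = in_context(text, index, "world", width=8)
```

    The index stores only words and positions; a searcher sees a few characters on either side of each hit, not the whole text.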

    The world is not coming to an end, just changing.

    --
    Eric Lease Morgan, Librarian
    University Libraries of Notre Dame