Slashdot Mirror


Haystack: A More Compelling View Of Your Data

Peristaltic writes "MIT's Haystack project has released the source for its "Universal Information Client", Haystack. In their words: 'Haystack looks into the use of artificial intelligence techniques for analyzing unstructured information and providing more accurate retrieval.' Unlike some attempts I've seen in the past to pull it all together on my desktop, Haystack shows some promise -- One of its more useful features allows you to take the information you've been wallowing through, and have Haystack continually refine a 'dynamic hierarchy' until you get what you need. Haystack also performs some neat tricks such as combining Email, IM, web pages, etc. into a single inbox."

12 of 246 comments

  1. Runtime overhead by PureFiction · · Score: 4, Informative
    Beware the load on your system if you wish to try this out. It eats RAM and CPU with gleeful abandon.

    From the system requirements:

    - Pentium III 700mhz-based computer or better (Pentium 4 2ghz strongly recommended)
    - 12 megabytes of RAM (768 megabytes strongly recommended)

    s/strongly recommended/REQUIRED/
    1. Re:Runtime overhead by PureFiction · · Score: 4, Informative

      Arg, cut-n-paste errors. Should read 512M

      Please take note of the following system requirements for Haystack:

      * Pentium III 700mhz-based computer or better (Pentium 4 2ghz strongly recommended)
      * 512 megabytes of RAM (768 megabytes strongly recommended)
      * Windows 2000, Windows XP, or Linux (Linux build requires GTK+ 2.0 libraries)
      * At least 1 gigabyte of disk space (or more, as your repository grows)
      * Java 2 Development Kit (JDK) 1.4 or later (note that JDK 1.4.1 does not work with Haystack; use JDK 1.4.1_02 instead)

  2. Re:Awesome Mozilla effect. by ergonal · · Score: 2, Informative

    Happens in Opera 7.02 too.

  3. Re:Awesome Mozilla effect. by tuffy · · Score: 4, Informative
    This is the nifty bit of code that generates that effect:

    <div style="BACKGROUND-ATTACHMENT: fixed; BACKGROUND-IMAGE: url(images/cover.png); WIDTH: 520px; height:370px; BACKGROUND-REPEAT: no-repeat"></div>

    Fun with Cascading Style Sheets :) It might've been more effective, however, to stick the big image in an iframe so people can scroll around in it easier and have a look.

    --

    Ita erat quando hic adveni.

  4. Re:Awesome Mozilla effect. by GraZZ · · Score: 2, Informative

    I agree; the only way to see the left side of the image is to resize your browser narrower...

    I'm sure this isn't what the site's creator intended, as it makes it hard to look around such a pretty interface. :)

  5. Re:Awesome Mozilla effect. by OrangeGoo · · Score: 2, Informative

    Actually, it's IE that has it wrong, not Mozilla. IE has yet to do CSS properly (funny that they can take the time to invent their own CSS, but can't be bothered to implement the standardized stuff). IE also doesn't support the alpha channel on PNGs, which makes them all but useless from a web-design standpoint. Since IE dominates, we have to design to them... hooray... Nuts to IE.

  6. Re:I just want a relational filesystem... by Xerithane · · Score: 2, Informative

    Now, of course linux trolls will whine and whatnot, but SQL Server is a killer DBMS, and this filesystem will be cool. Imagine how fast apps will start when they don't have to scour a half dozen directories for .dll files, but instead "SELECT location FROM files WHERE filename = 'msvcrt.dll' AND version = '7.8.29'"

    A file system built on an SQL engine doesn't work... It's like putting a Viper engine in a Ford Focus. A simple meta-dbm attached to each node (and vice versa, an index on the meta-dbms... similar to how the iPod works) would work just fine. I've never seen the point of allowing an entire SQL engine on a filesystem.
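
    A minimal sketch of the "meta-dbm plus index" idea, assuming plain in-memory dictionaries stand in for the per-node metadata store. The class name, method names, and file paths are hypothetical and only for illustration; the filename and version values are borrowed from the query quoted above.

```python
from collections import defaultdict


class MetaStore:
    """Per-file key/value metadata plus a reverse index (all names hypothetical)."""

    def __init__(self):
        self.meta = {}                  # path -> {key: value}
        self.index = defaultdict(set)   # (key, value) -> set of paths

    def put(self, path, **attrs):
        # Drop stale index entries before overwriting a file's metadata.
        for item in self.meta.get(path, {}).items():
            self.index[item].discard(path)
        self.meta[path] = dict(attrs)
        for item in attrs.items():
            self.index[item].add(path)

    def find(self, **attrs):
        # Intersect the path sets for every requested key/value pair.
        sets = [self.index.get(item, set()) for item in attrs.items()]
        return sorted(set.intersection(*sets)) if sets else sorted(self.meta)


store = MetaStore()
store.put("/sys/a/msvcrt.dll", filename="msvcrt.dll", version="7.8.29")
store.put("/sys/b/msvcrt.dll", filename="msvcrt.dll", version="6.0.0")
print(store.find(filename="msvcrt.dll", version="7.8.29"))
```

    The lookup is a couple of dictionary hits and a set intersection, which is roughly the point being made: an exact-match query like the one quoted does not need a full SQL engine behind it.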

    Anyways, in a few decades someone will write a free-as-in-no-money version for lunix. So hold tight.

    That was really damn funny, thank you.

    --
    Dacels Jewelers can't be trusted.
  7. I was a usability tester... by Anonymous Coward · · Score: 3, Informative

    for Haystack at LCS recently, and was not that impressed. It is designed to do certain kinds of tasks very well (e.g., editing things that are embedded in other types of information - the tests given were things like "edit this picture that's a part of this entry in your Outlook address book"). Unfortunately, in making these tasks as close to one-click as possible, other things (versatility most of all, but also common-sense design) have suffered.

    I find it easy enough to edit information of the "My Documents" variety without worrying about how it is integrated into other information on my computer, and I'm sure other readers here do, as well.

    The best way to actually use this software would be in the case where John Q. has a specific task to do over and over again but isn't ready to tackle a batch process.

  8. Six Degrees by mblase · · Score: 5, Informative

    Six Degrees by Creo is another attempt to do this same sort of thing, except that it's commercial and it's been available for Mac OS X and Windows for several months.

  9. Scopeware Vision by www!!!1 · · Score: 2, Informative

    Scopeware Vision is similar to this, but better. It only requires 128 megs of RAM!

    Try the 30 day free trial. It rulz!

  10. Another Iteration of the OpenDoc by jonbrewer · · Score: 2, Informative

    From the Design Principles:

    "...provides a single, uniform interface for manipulation of e-mail, instant messages, addresses, web pages, documents, news, bibliographies, annotations, music, images, etc."

    "...attempts to match a user's own focus on objects in view and what can be done with them. An operation (such as spellchecking, sending an e-mail message, or rotating an image) can be invoked at any time on any object for which the operation "makes sense" (i.e. a blob of text, a person, or an image respectively)."

    Back in the heady days of the PPC 601 and the Newton, one of Apple's software groups was working on this problem exactly. While I don't think OpenDoc could organize your information, it was certainly a uniform interface for manipulating stuff, with the focus on the stuff, and not the application in use. At that point, about seven years ago, I naively believed that one day OpenDoc would provide an environment in which I could edit a web page and all elements (including raster and vector images) without having six applications loaded. Ha!

  11. Re:I just want a relational filesystem... by jorleif · · Score: 2, Informative

    In regards to plain text/personal information -- have you thought about looking at Bayesian filtering for a solution to that? I haven't (yet) but the idea is festering in my brain.

    Bayesian filtering is another of those words that mean a lot of things ;)
    Nowadays it is usually used in reference to the techniques used for spam-filtering, which is a very specific task. Classification: Spam / Not Spam. Basically everything that uses the Bayesian view on statistics can be considered a Bayesian method, without considering the underlying model. In other words, for many statistical models it is possible to derive Bayesian optimization schemes (or "learning rules").
    A widely used set of language models are the Hidden Markov Models. I'm planning to use them on an information extraction problem (populating database tables from free-text descriptions), and that's about the closest to the problem we're discussing that I've been. You could probably use them as a partial solution here as well, but I can't think of any really clever scheme at the moment.
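
    For the Spam / Not Spam case mentioned above, a rough sketch of a naive Bayes classifier with add-one smoothing might look like this. The training messages, the "ham" label, and the function names are made up for illustration; real spam filters tokenize and weight things far more carefully.

```python
import math
from collections import Counter


def train(docs):
    """docs: list of (text, label) pairs. Returns per-label word counts and doc counts."""
    words = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in docs:
        words[label].update(text.lower().split())
        totals[label] += 1
    return words, totals


def classify(text, words, totals):
    vocab = set(words["spam"]) | set(words["ham"])
    scores = {}
    for label in words:
        # Log prior plus summed log likelihoods, with add-one (Laplace) smoothing.
        score = math.log(totals[label] / sum(totals.values()))
        n = sum(words[label].values())
        for w in text.lower().split():
            score += math.log((words[label][w] + 1) / (n + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data (hypothetical).
training = [
    ("cheap pills buy now", "spam"),
    ("win money now", "spam"),
    ("meeting notes for the project", "ham"),
    ("lunch at noon tomorrow", "ham"),
]
words, totals = train(training)
print(classify("buy cheap money", words, totals))
print(classify("project meeting tomorrow", words, totals))
```

    The "naive" part is the assumption that words are independent given the label, which is what lets the score be a simple sum of per-word terms.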

    For personal information one would like to have something that clusters the data into different categories. There are lots of methods for this. One I'm familiar with is Self Organizing Maps (an example paper about them).
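
    To give a flavour of the clustering idea, here is a toy one-dimensional self-organizing map in plain Python. It is only a sketch under simplifying assumptions (two map units, units initialized from evenly spaced data points instead of randomly, a fixed neighbourhood width, a linearly decaying learning rate); the function names and sample points are hypothetical.

```python
import math


def best_unit(units, x):
    """Index of the unit whose weights are closest to x (squared Euclidean distance)."""
    return min(range(len(units)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(units[i], x)))


def train_som(data, n_units=2, epochs=50, lr=0.3, sigma=0.5):
    """Train a 1-D map and return the list of unit weight vectors."""
    # Assumption: start each unit on an evenly spaced data point (not random).
    units = [list(data[i * len(data) // n_units]) for i in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs   # shrink the learning rate over time
        for x in data:
            bmu = best_unit(units, x)
            for i, w in enumerate(units):
                # Neighbours on the map grid are pulled along, but less strongly.
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(len(x)):
                    w[d] += lr * decay * h * (x[d] - w[d])
    return units


# Two obvious groups of 2-D points (made-up data).
data = [(0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
        (0.9, 1.0), (1.0, 0.9), (0.9, 0.9)]
units = train_som(data)
print(best_unit(units, (0.05, 0.05)), best_unit(units, (0.95, 0.95)))
```

    Each input is assigned to its best-matching unit, so the unit index acts as the category; a real map would use a larger 2-D grid and a shrinking neighbourhood.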

    And finally, sorry to be boring, but I'm not currently working on anything that would create something like the system we've discussed =)