Slashdot Mirror


Haystack: A More Compelling View Of Your Data

Peristaltic writes "MIT's Haystack project has released the source for its "Universal Information Client", Haystack. In their words: 'Haystack looks into the use of artificial intelligence techniques for analyzing unstructured information and providing more accurate retrieval.' Unlike some attempts I've seen in the past to pull it all together on my desktop, Haystack shows some promise -- one of its more useful features allows you to take the information you've been wallowing through, and have Haystack continually refine a 'dynamic hierarchy' until you get what you need. Haystack also performs some neat tricks such as combining Email, IM, web pages, etc. into a single inbox."

24 of 246 comments

  1. That's great and all, but.. by notque · · Score: 4, Interesting

    Haystack also performs some neat tricks such as combining Email, IM, web pages, etc. into a single inbox

    It may just be me, but this is a feature I never want.

    I do not want 1 large program to run all of my applications. I do not want to get my email, from where I get my web pages, and my IM. I don't want any of this.

    I am quite happy with separate programs which I can use at my pleasure. I'm happy with the lack of bulk, and the fact I can change an email client without changing a web choice. (Although I only use pine anyway.)

    Is this just me? Do all of you want your programs shoved together in one large application?

    I didn't get any options on my cell phone (like text messaging) because I purchased a cell phone. I wanted a cell phone. To make calls. Nothing else.

    --
    http://use.perl.org
    1. Re:That's great and all, but.. by RevMike · · Score: 5, Funny

      So should I assume you don't want it embedded within Emacs.

    2. Re:That's great and all, but.. by Xerithane · · Score: 4, Insightful

      Is this just me? Do all of you want your programs shoved together in one large application?

      You mean like a Window Manager? That's how I see this thing... it's like a Window Manager with applications embedded inside of it (think of a forced dock type thing.) It just handles whatever data you present it with (or the computer presents it with) automatically.

      I didn't get any options on my cell phone (like text messaging) because I purchased a cell phone. I wanted a cell phone. To make calls. Nothing else.

      My cell-phone has Bluetooth, PDA functions, games, voice recording, voice dialing... that's the great thing about choice. Neither you nor I are the entire market.

      --
      Dacels Jewelers can't be trusted.
    3. Re:That's great and all, but.. by rdeadman · · Score: 4, Interesting

      I think what Haystack is trying to solve is the data management issue. For thirty years we have been living with application-centric computers. So much so that we think in terms of best-of-breed point-tools. Do we know where Mozilla stores our email folders? No, it's hidden by the application. (Okay, I do, but that is because I'm a bit geeky and share my Mozilla email folders from a File Server across my intranet...) How about Outlook, Netscape, Eclipse, etc.?

      In my inbox I have folders for home, each client project I am working on, future leads, charitable organizations I am involved in. A similar parallel hierarchy is repeated in my file system for documents. My IM tools have their own way of tracking contacts that is unrelated to my email or projects. I store my Eclipse projects in yet another place. Mozilla organizes my bookmarks in yet another hierarchy. It's all a real mess and makes working on a project a job of mentally mapping all the pieces together.

      Now, what would be real nice would be if Haystack could define a plugin API (a la Eclipse) so that my email client could be wrapped and plugged in to Haystack. Same for IM clients, web browsers, etc. The point tool then only has to worry about its job and hands off data persistence to haystack. Then I can choose the best app and let Haystack worry about tying the data together. As someone else mentioned, this sounds more like a replacement for the file system. But it could be more, if each plugin could define how it interacts with other plugins and defines its own responsibilities.
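      A minimal sketch of what such a plugin contract might look like. Everything here is hypothetical (the `Plugin`, `Store` and `Item` names are made up for illustration and are not Haystack's or Eclipse's actual API): each wrapped tool only reports its items, and the shared store ties them together by project.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """One piece of data handed to the shared store (hypothetical shape)."""
    kind: str      # e.g. "email", "im", "bookmark"
    title: str
    project: str   # the activity this item belongs to


class Plugin(ABC):
    """Hypothetical contract a wrapped point tool would implement."""

    @abstractmethod
    def fetch(self) -> List[Item]:
        """Return the items this tool currently knows about."""


@dataclass
class Store:
    """Stand-in for the shared persistence layer."""
    plugins: List[Plugin] = field(default_factory=list)

    def register(self, plugin: Plugin) -> None:
        self.plugins.append(plugin)

    def by_project(self, project: str) -> List[Item]:
        # Ask every plugin for its items and keep those tied to one project.
        return [item
                for plugin in self.plugins
                for item in plugin.fetch()
                if item.project == project]


class MailPlugin(Plugin):
    """Pretend email client with two hard-coded messages."""

    def fetch(self) -> List[Item]:
        return [Item("email", "Kickoff notes", "client-a"),
                Item("email", "Donation receipt", "charity")]


class BookmarkPlugin(Plugin):
    """Pretend browser with one hard-coded bookmark."""

    def fetch(self) -> List[Item]:
        return [Item("bookmark", "Client A wiki", "client-a")]


store = Store()
store.register(MailPlugin())
store.register(BookmarkPlugin())
titles = [item.title for item in store.by_project("client-a")]
print(titles)
```

      The point of the design is that swapping the mail client means swapping one plugin; the per-project view does not change.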

      I'm sure there is a lot of refinement needed, but it is an interesting new paradigm. Activity-centred desktop instead of a tool-centred desktop.

    4. Re:That's great and all, but.. by LostCluster · · Score: 4, Insightful

      I do not want 1 large program to run all of my applications. I do not want to get my email, from where I get my web pages, and my IM. I don't want any of this.

      So I take it you're not running Windows, Internet Explorer, and MSN Messenger?

      Well, even if you're running Linux, Mozilla, and AOL Instant Messenger, they're still running on the same physical hardware and using the same window manager software in order to keep the interface consistent and organized.

      And that's the point of this project and several other next-gen file systems in development now... Presenting users with a unified and organized interface that shows them their data in a way they can find it easily. From a user perspective, it makes more sense to store information as "messages that came in from Bonnie" rather than have a separate file storage device for e-mail, IMs, voicemails, etc.

      You might think it's simpler to have a physical device manage each communications protocol you use, and I'm sure product manufacturers will continue to support you with products based on that concept. However, most users would rather have their computers keep the difference between protocols to itself.

      It doesn't matter how the information gets to the computer as much as what the information is and which person or organization is credited as the author. That's the best way to present information to a user who doesn't care about tech stuff.

  2. Runtime overhead by PureFiction · · Score: 4, Informative
    Beware the load on your system if you wish to try this out. It eats RAM and CPU with gleeful abandon.

    From the system requirements:

    - Pentium III 700mhz-based computer or better (Pentium 4 2ghz strongly recommended)
    - 12 megabytes of RAM (768 megabytes strongly recommended)

    s/strongly recommended/REQUIRED/
    1. Re:Runtime overhead by 3.5+stripes · · Score: 3, Funny

      Big difference between 12 and 768, damn.

      --
      He tried to kill me with a forklift!
    2. Re:Runtime overhead by PureFiction · · Score: 4, Informative

      Arg, cut-n-paste errors. Should read 512M

      Please take note of the following system requirements for Haystack:

      * Pentium III 700mhz-based computer or better (Pentium 4 2ghz strongly recommended)
      * 512 megabytes of RAM (768 megabytes strongly recommended)
      * Windows 2000, Windows XP, or Linux (Linux build requires GTK+ 2.0 libraries)
      * At least 1 gigabyte of disk space (or more, as your repository grows)
      * Java 2 Development Kit (JDK) 1.4 or later (note that JDK 1.4.1 does not work with Haystack; use JDK 1.4.1_02 instead)

  3. Awesome Mozilla effect. by GraZZ · · Score: 3, Funny

    Wow. Looking at the Haystack site with Mozilla looks awesome! I don't know if it's my version (1.4rc1) or some weird image setting, but the main image on the page stays stationary as I scroll around, but the clipping of the image changes. It's really hard to describe, but looks awesome.

    Of course, IE just renders it properly. BOOOORING.

    1. Re:Awesome Mozilla effect. by tuffy · · Score: 4, Informative
      This is the nifty bit of code that generates that effect:

      <div style="BACKGROUND-ATTACHMENT: fixed; BACKGROUND-IMAGE: url(images/cover.png); WIDTH: 520px; height:370px; BACKGROUND-REPEAT: no-repeat"></div>

      Fun with Cascading Style Sheets :) It might've been more effective, however, to stick the big image in an iframe so people can scroll around in it easier and have a look.

      --
      Ita erat quando hic adveni.

    2. Re:Awesome Mozilla effect. by arvindn · · Score: 3, Offtopic
      No, mozilla renders it properly. The relevant code is this:

      <div style="background-attachment: fixed; background-image: url(http://haystack.lcs.mit.edu/images/cover.png); width: 520px; height: 370px; background-repeat: no-repeat;"></div>

      So it is supposed to be stationary. Also notice that you don't see the whole image in IE.

      Whoever designed the page must be really geeky if they don't care about it working correctly in MSIE :-)

  4. The real test. by Unknown+Poltroon · · Score: 3, Funny

    Can it organize 3 gigs of random pr0n?

    --
    All Troll + "offtopic" mods are meta moderated as "Unfair", because you abused the system.
    1. Re:The real test. by Rick.C · · Score: 3, Funny
      Can it organize 3 gigs of random pr0n?

      Yes, but that will require some optional hardware: eye-tracking camera and moisture-sensing drool-cup attachment.

      --
      You were 80% angel, 10% demon. The rest was hard to explain. - Over The Rhine
      "Math in a song is good."-Linford
  5. I just want a relational filesystem... by Xerithane · · Score: 5, Interesting

    Not too much to ask, it doesn't even need to be truly a filesystem. Just overload all the file access commands (At this point, probably easier to just write a new filesystem)...

    Group data by category, content, whatever. "Symlink" to the inodes, and you're off. We don't need AI for that and I think it would be a more complete solution. I don't see an AI engine that can correctly categorize my mp3's, I don't think I'd trust it for all of my data yet. Let's start small and get usable systems.
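    A rough sketch of the "symlink into categories" idea, done in user space rather than as a real filesystem. The helper name `link_into_categories` and the sample categories are made up for illustration; the categories are assigned by hand, so no AI is involved. It assumes a platform where unprivileged symlinks work (Linux, for instance).

```python
import os
import tempfile
from pathlib import Path


def link_into_categories(files, view_root):
    """Create view_root/<category>/<name> symlinks pointing at the real files.

    `files` maps a real file path to the list of categories it belongs to.
    The real files are never moved or copied.
    """
    view_root = Path(view_root)
    for path, categories in files.items():
        path = Path(path)
        for category in categories:
            category_dir = view_root / category
            category_dir.mkdir(parents=True, exist_ok=True)
            link = category_dir / path.name
            if not link.is_symlink():
                os.symlink(path.resolve(), link)
    return view_root


# Demo in a throwaway directory so nothing real is touched.
workdir = Path(tempfile.mkdtemp())
track = workdir / "track.mp3"
track.write_text("placeholder audio data")
view = link_into_categories({track: ["jazz", "live"]}, workdir / "by-category")
found = sorted(p.parent.name for p in view.rglob("track.mp3"))
print(found)
```

    The same file shows up under every category it was tagged with, while only one copy exists on disk.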

    Spiffy program though, wish it weren't in Java... wish it weren't 42MB... wish it ran smoothly under Linux. I'll stop complaining now.

    On a side note, did anybody else find that scrolling image annoying and mentally confusing? Er, I'll really stop complaining now.

    --
    Dacels Jewelers can't be trusted.
    1. Re:I just want a relational filesystem... by krb · · Score: 3, Insightful

      You say "Group data by category, content, whatever" and then say "we don't need AI for that". Well, you're almost right, but you need some intelligence in order to make decisions about what the content of file X really is. You could say, "well, yeah, that's me..." but the point of this and other Knowledge Management systems is that they take the responsibility of categorization off of the user, because we are often inconsistent, or, at least, incomplete. Let's say I have a document that pertains to two or more general topics, say, Pollution, Energy Use and Windmills. Let's also say that right now I'm using it for a school report on alternative energy, so I classify it, quite sensibly for now, by year, course number, and assignment. That's totally useless in a few years when I'm looking for the information. I *could* have been smarter and manually attached some metadata to the file describing the kinds of topics it relates to, but I may miss one, and plus, that's extra work for me. Projects like this use complicated (usually statistical) analysis to determine the content for you automatically, and maintain a persistent database of all files related to particular topics/content items, etc. Haystack and many others do this categorization with an ontology which predefines the topic groups or elements they care about. Some systems derive the content groups dynamically, and include fuzzy searching to allow you to find documents and files related to some keywords (or if they're real good, natural language query) you enter.
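      To make the "predefined topic groups" idea concrete, here is a deliberately crude sketch. It is plain keyword counting, far simpler than the statistical analysis real systems use, and the `ONTOLOGY` table and `topics_for` function are invented for illustration rather than taken from Haystack.

```python
import re
from collections import Counter

# Hypothetical ontology: each topic is predefined by a handful of keywords.
ONTOLOGY = {
    "pollution": {"pollution", "emissions", "smog"},
    "energy use": {"energy", "electricity", "consumption"},
    "windmills": {"windmill", "windmills", "turbine", "turbines"},
}


def topics_for(text, min_hits=1):
    """Return topics whose keywords occur at least `min_hits` times, most hits first."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        topic: sum(words[keyword] for keyword in keywords)
        for topic, keywords in ONTOLOGY.items()
    }
    # Sort by descending hit count, breaking ties alphabetically by topic name.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [topic for topic, hits in ranked if hits >= min_hits]


report = ("Windmills cut emissions because each turbine replaces "
          "electricity that would otherwise come from coal.")
labels = topics_for(report)
print(labels)
```

      The school report gets all three topic labels without the author tagging anything, which is the part a by-year, by-course folder scheme loses.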

      What you mentioned is not that different from what they're doing, except they're not making it transparent -- they're making into a workspace.

      I'll note also that categorization of text into topics or genres, while difficult, is easier than doing the same with music. The kinds of statistical analysis you can do on text don't lend themselves to Fourier decompositions. To properly categorize music (in my opinion at least, which admittedly counts for little) the best technique would be to separate and identify the individual instruments (voices) in the song. This makes categorization a bit easier because now you can get data for tempo, rhythm, sophistication of note progression, etc. on a per-instrument basis. I'm not sure it's possible though.

      My 57 yen.

  6. Isn't this named for the problem? by Anonymous Coward · · Score: 4, Funny

    Isn't haystack the problem that this tries to fix? I think this project should have been called 'needle' or possibly 'findy.'

  7. Screenshots by ergonal · · Score: 4, Funny

    There was only one measly screenshot in the overview section, and NO screenshots in the screenshot section, so here's another one.

  8. Hahhahaha suckers! by Anonymous Coward · · Score: 5, Funny

    Nothing like slashdotting MIT to make you feel like you've accomplished something! How's your precious class-A IP registry now?

    Sincerely

    Bunker Hill Community College

  9. Re:WTF? No Mac OS X version?!!! by Anonymous Coward · · Score: 3, Funny

    It's assumed that if you don't run Windows you are intelligent enough to organize your own info.

  10. Re:WTF? No Mac OS X version?!!! by Anonymous Coward · · Score: 3, Funny

    It's assumed that if you don't run Windows you are intelligent enough to organize your own info.

    That would be much funnier if it didn't run on Linux.

    Wait a minute .....

  11. I was a usability tester... by Anonymous Coward · · Score: 3, Informative

    for Haystack at LCS recently, and was not that impressed. It is designed to do certain kinds of tasks very well (e.g., editing things that are embedded in other types of information - the tests given were things like "edit this picture that's a part of this entry in your Outlook address book"). Unfortunately, at the expense of making these tasks as close to one-click as possible, other things (versatility the most, but also common sense design) have failed.

    I find it easy enough to edit information of the "My Documents" variety without worrying about how it is integrated into other information on my computer, and I'm sure other readers here do, as well.

    The best way to actually use this software would be in the case where John Q. has a specific task to do over and over again but isn't ready to tackle a batch process.

  12. Agents... by orn · · Score: 5, Interesting

    Haystack is an interesting idea, but I have a hard time distinguishing what it does from what, say, Lotus Notes does. And Lotus is _terrible_.

    I like the idea of bringing all my information together in one place. I don't like the idea of only having it in that one place. What I would like would be an application that can watch how I use the computer, then bring those applications together to make it more seamless.

    For example, I have about four different calendars in my life: the work calendar, the one on the cell phone that I use for stuff that I can't miss, the calendar that schedules airplane rentals, and (of course) my girlfriend's calendar. So how do I bring those all together, and yet still be making entries in them separately?
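    One way to picture the calendar case is a read-only merged view that leaves each source untouched. The sketch below is only an illustration under made-up assumptions: the entries, dates and the `merged_view` helper are hypothetical, and getting real calendar data out of each tool is the hard part it skips.

```python
import heapq
from datetime import datetime

# Hypothetical entries: (start time, source calendar, description).
work = [(datetime(2003, 7, 1, 9, 0), "work", "Status meeting"),
        (datetime(2003, 7, 3, 14, 0), "work", "Code review")]
phone = [(datetime(2003, 7, 2, 18, 0), "phone", "Dentist")]
rentals = [(datetime(2003, 7, 3, 8, 0), "rentals", "Plane booked")]


def merged_view(*calendars):
    """Return entries from all calendars in time order without modifying any of them."""
    # heapq.merge expects each input to be sorted, so sort a copy of each one.
    return list(heapq.merge(*(sorted(calendar) for calendar in calendars)))


agenda = [f"{when:%m-%d %H:%M} [{source}] {what}"
          for when, source, what in merged_view(work, phone, rentals)]
for line in agenda:
    print(line)
```

    Entries are still made in the separate calendars; only the combined agenda is computed.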

    The same is true for information. I have a primitive blogging system (really just a bunch of text files that are date coded), I have work documents that I use regularly, I have web pages that I monitor (sometimes a little too often) and I have textbooks that I'm reading (instrument flying at the moment). So how do I get all these forms of information - or at least an index into them - together in one place? But again, without changing the current organization scheme.

    This is the tool that will make the computer a lot more useful - an actual organizational tool.

    Rudy

    1. Re:Agents... by gobbo · · Score: 4, Interesting

      Wholehearted agreement with the parent. Lotus Notes is shoved into our laps at work, and it's been a struggle to part out its functionality into the proper parts: Mail.app, Safari/Camino, Address Book (waiting for proper LDAP support, grr), iCal, and other 'business' tools, on my machine. [Not that I'm an Apple Software Fanatic, but they work and fit into the budget.]

      L.Notes had a whole wing on the now-MIA Interface Hall of Shame. It reinvents the conventions found on other platforms (it tries to be a platform unto itself) and does so badly; it's buggy, slow, and designed for administration [decent encrypted document database scheme].

      Plus, it centralizes, for better or worse, all my information on servers controlled by I.T.

      Now I'd love to have a central app that takes feeds from my favourite info management apps, sorts/ranks/prioritizes/interrelates the items for me according to my usage and prefs, and lets me 'zoom in' to a task by switching to the preferred stand-alone app at will. Haystack has only part of the picture, the model is still gather-control, rather than sift-sort-go.

      One item I've found intriguing is StickyBrain, a sticky-on-steroids app, by Chronos LC, which takes info in many categories and allows for quick index searching, plus offers system-wide info-archiving services and some alarm and word-processing features. I had the same kind of thing running with BBEdit, a notes directory, and grep, but it was like hammering nails with a wrench.

      I want all my info hotlinked to lists of related items, dynamically: make every significant word a keyword, realtime. After all, what are multi-GHz and piles'o'RAM for, anyway, when not rendering?

  13. Six Degrees by mblase · · Score: 5, Informative

    Six Degrees by Creo is another attempt to do this same sort of thing, except that it's commercial and it's been available for Mac OS X and Windows for several months.