Slashdot Mirror


Haystack: A More Compelling View Of Your Data

Peristaltic writes "MIT's Haystack project has released the source for its "Universal Information Client", Haystack. In their words: 'Haystack looks into the use of artificial intelligence techniques for analyzing unstructured information and providing more accurate retrieval.' Unlike some attempts I've seen in the past to pull it all together on my desktop, Haystack shows some promise -- one of its more useful features allows you to take the information you've been wallowing through, and have Haystack continually refine a 'dynamic hierarchy' until you get what you need. Haystack also performs some neat tricks such as combining email, IM, web pages, etc. into a single inbox."

3 of 246 comments

  1. Re:That's great and all, but.. by Xerithane · · Score: 4, Insightful

    Is this just me? Do all of you want your programs shoved together in one large application?

    You mean like a Window Manager? That's how I see this thing... it's like a Window Manager with applications embedded inside of it (think of a forced dock type thing.) It just handles whatever data you present it with (or the computer presents it with) automatically.

    I didn't get any options on my cell phone (like text messaging) because I purchased a cell phone. I wanted a cell phone. To make calls. Nothing else.

    My cell phone has Bluetooth, PDA functions, games, voice recording, voice dialing... that's the great thing about choice. Neither you nor I are the entire market.

    --
    Dacels Jewelers can't be trusted.
  2. Re:That's great and all, but.. by LostCluster · · Score: 4, Insightful

    I do not want 1 large program to run all of my applications. I do not want to get my email, from where I get my web pages, and my IM. I don't want any of this.

    So I take it you're not running Windows, Internet Explorer, and MSN Messenger?

    Well, even if you're running Linux, Mozilla, and AOL Instant Messenger, they're still running on the same physical hardware and using the same window manager software in order to keep the interface consistent and organized.

    And that's the point of this project and several other next-gen file systems in development now... Presenting users with a unified and organized interface that shows them their data in a way that lets them find it easily. From a user perspective, it makes more sense to store information as "messages that came in from Bonnie" rather than have a separate file storage device for e-mail, IMs, voicemails, etc.

    You might think it's simpler to have a physical device manage each communications protocol you use, and I'm sure product manufacturers will continue to support you with products based on that concept. However, most users would rather have their computers keep the difference between protocols to themselves.

    It doesn't matter how the information gets to the computer as much as what the information is and which person or organization is credited as the author. That's the best way to present information to a user who doesn't care about tech stuff.
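A minimal sketch of the "messages that came in from Bonnie" idea, assuming nothing about how Haystack actually stores things: the `Message` type, the `group_by_author` helper, and the sample data (including "Clyde") are all made up for illustration. It only shows that once every message carries an author, the protocol becomes just another attribute.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical record: one message, whatever protocol delivered it."""
    author: str
    protocol: str  # e.g. "email", "im", "voicemail"
    body: str


def group_by_author(messages):
    """Index messages by who sent them, ignoring how they arrived."""
    inbox = defaultdict(list)
    for msg in messages:
        inbox[msg.author].append(msg)
    return dict(inbox)


# Made-up sample data.
messages = [
    Message("Bonnie", "email", "Lunch on Friday?"),
    Message("Bonnie", "im", "running late"),
    Message("Clyde", "voicemail", "Call me back."),
]

inbox = group_by_author(messages)
for author, msgs in inbox.items():
    # The protocol is still available, it just isn't the primary key.
    print(author, [m.protocol for m in msgs])
```

The user asks for "everything from Bonnie" and gets e-mail and IM side by side; the protocol only shows up if they go looking for it.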

  3. Re:I just want a relational filesystem... by krb · · Score: 3, Insightful

    You say "Group data by category, content, whatever" and then say "we don't need AI for that". Well, you're almost right, but you need some intelligence in order to make decisions about what the content of file X really is. You could say, "well, yeah, that's me...", but the point of this and other Knowledge Management systems is that they take the responsibility of categorization off the user, because we are often inconsistent or, at least, incomplete. Let's say I have a document that pertains to two or more general topics, let's say Pollution, Energy Use and Windmills. Let's also say that right now I'm using it for a school report on alternative energy, so I classify it, quite sensibly for now, by year, course number, and assignment. That's totally useless in a few years when I'm looking for the information. I *could* have been smarter and manually attached some metadata to the file describing the kinds of topics it relates to, but I may miss one, and besides, that's extra work for me.

    Projects like this use complicated (usually statistical) analysis to determine the content for you automatically, and maintain a persistent database of all files related to particular topics, content items, etc. Haystack and many others do this categorization with an ontology which predefines the topic groups or elements they care about. Some systems derive the content groups dynamically, and include fuzzy searching to allow you to find documents and files related to some keywords (or, if they're really good, a natural language query) you enter.
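To make the ontology idea concrete, here is a toy sketch, not Haystack's method: a hand-written `ONTOLOGY` dictionary stands in for the predefined topic groups, plain word counts stand in for the "complicated statistical analysis", and the standard library's `difflib.get_close_matches` stands in for fuzzy searching. The topic terms, the function names, and the sample sentence are all assumptions for illustration.

```python
import difflib
import re
from collections import Counter

# Hypothetical ontology: topic name -> words that signal the topic.
ONTOLOGY = {
    "Pollution": {"pollution", "emissions", "smog"},
    "Energy Use": {"energy", "electricity", "consumption"},
    "Windmills": {"windmill", "windmills", "turbine", "turbines"},
}


def tag_topics(text, min_hits=1):
    """Return every topic whose signal words occur at least min_hits times."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        topic: sum(words[term] for term in terms)
        for topic, terms in ONTOLOGY.items()
    }
    return sorted(topic for topic, score in scores.items() if score >= min_hits)


def fuzzy_topic(query):
    """Map a possibly misspelled query to the closest topic name, or None."""
    by_lower = {name.lower(): name for name in ONTOLOGY}
    hits = difflib.get_close_matches(query.lower(), list(by_lower), n=1, cutoff=0.6)
    return by_lower[hits[0]] if hits else None


report = "Windmills generate electricity without the emissions that cause smog."
print(tag_topics(report))       # one document, several topics
print(fuzzy_topic("windmils"))  # a typo still finds the topic
```

The point is that the document ends up filed under all three topics at once, without the user having to remember any of them when saving it for a school report.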

    What you mentioned is not that different from what they're doing, except they're not making it transparent -- they're making it into a workspace.

    I'll note also that categorization of text into topics or genres, while difficult, is easier than doing the same with music. The kinds of statistical analysis you can do on text don't lend themselves to Fourier decompositions. To properly categorize music (in my opinion at least, which admittedly counts for little) the best technique would be to separate and identify the individual instruments (voices) in the song. This makes categorization a bit easier because now you can get data for tempo, rhythm, sophistication of note progression, etc. on a per-instrument basis. I'm not sure it's possible though.
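For a sense of what a Fourier decomposition gives you on audio, here is a deliberately naive sketch. It does not separate instruments (the hard part above); it only finds the strongest frequency in a synthetic two-tone signal using a hand-rolled discrete Fourier transform. The `dominant_frequency` helper, the sample rate, and the 110 Hz / 220 Hz test tones are all made up for illustration.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest DFT bin below Nyquist."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):  # skip the DC bin at k = 0
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


# Synthetic "recording": a loud 110 Hz tone plus a quieter 220 Hz overtone.
sample_rate = 800  # samples per second (assumed)
n = 400            # half a second of audio, so each bin is 2 Hz wide
tone = [
    math.sin(2 * math.pi * 110 * i / sample_rate)
    + 0.3 * math.sin(2 * math.pi * 220 * i / sample_rate)
    for i in range(n)
]

print(dominant_frequency(tone, sample_rate))
```

Getting from "strongest frequency in a clean synthetic signal" to "which instrument is playing what" in a real mix is where it gets hard, which is the doubt expressed above.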

    My 57 yen.

    --