Slashdot Mirror


Haystack: A More Compelling View Of Your Data

Peristaltic writes "MIT's Haystack project has released the source for its "Universal Information Client", Haystack. In their words: 'Haystack looks into the use of artificial intelligence techniques for analyzing unstructured information and providing more accurate retrieval.' Unlike some attempts I've seen in the past to pull it all together on my desktop, Haystack shows some promise -- one of its more useful features allows you to take the information you've been wallowing through, and have Haystack continually refine a 'dynamic hierarchy' until you get what you need. Haystack also performs some neat tricks such as combining Email, IM, web pages, etc. into a single inbox."

19 of 246 comments

  1. That's great and all, but.. by notque · · Score: 4, Interesting

    Haystack also performs some neat tricks such as combining Email, IM, web pages, etc. into a single inbox

    It may just be me, but this is a feature I never want.

    I do not want 1 large program to run all of my applications. I do not want to get my email from the same place I get my web pages and my IM. I don't want any of this.

    I am quite happy with separate programs which I can use at my pleasure. I'm happy with the lack of bulk, and the fact I can change an email client without changing a web choice. (although I only use pine anyway.)

    Is this just me? Do all of you want your programs shoved together in one large application?

    I didn't get any options on my cell phone (like text messaging) because I purchased a cell phone. I wanted a cell phone. To make calls. Nothing else.

    --
    http://use.perl.org
    1. Re:That's great and all, but.. by Transient0 · · Score: 1, Interesting

      Yeah.. I have to say I'm not sure how radically different it is compared to just creating yourself a local web-site with a bunch of links to useful files and programs on it, particularly if you include cgi links to shell scripts and what not.

      Perhaps even more significantly... how does this really differ from a decent OS or Desktop-Env? I mean in GNOME I can have a Mozilla window open in a quarter of my screen, a terminal window running Mutt in another corner, OpenOffice in another part and nautilus file browser for the rest, etc. The only thing that it seems to include that isn't easily accomplished by an intelligent custom set-up of Linux (or Windows or Mac for that matter) is the arbitrary associations between files and you could do that with a few clever applications of symlinks.

      What I'm saying basically is that this package looks like it does some neat things, but it takes up a gig of space and a ton of memory when you could hack together something that works just as well for your individual purposes with almost no memory or storage overhead in a couple of hours if you know a scripting language.

    2. Re:That's great and all, but.. by rdeadman · · Score: 4, Interesting

      I think what Haystack is trying to solve is the data management issue. For thirty years we have been living with application-centric computers. So much so that we think in terms of best-of-breed point-tools. Do we know where Mozilla stores our email folders? No, it's hidden by the application. (Okay, I do, but that is because I'm a bit geeky and share my Mozilla email folders from a File Server across my intranet...) How about Outlook, Netscape, Eclipse, etc.?

      In my inbox I have folders for home, each client project I am working on, future leads, charitable organizations I am involved in. A similar parallel hierarchy is repeated in my file system for documents. My IM tools have their own way of tracking contacts that is unrelated to my email or projects. I store my Eclipse projects in yet another place. Mozilla organizes my bookmarks in yet another hierarchy. It's all a real mess and makes working on a project a job of mentally mapping all the pieces together.

      Now, what would be real nice would be if Haystack could define a plugin API (a la Eclipse) so that my email client could be wrapped and plugged in to Haystack. Same for IM clients, web browsers, etc. The point tool then only has to worry about its job and hands off data persistence to haystack. Then I can choose the best app and let Haystack worry about tying the data together. As someone else mentioned, this sounds more like a replacement for the file system. But it could be more, if each plugin could define how it interacts with other plugins and defines its own responsibilities.

      I'm sure there is a lot of refinement needed, but it is an interesting new paradigm. Activity-centred desktop instead of a tool-centred desktop.
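
      The plugin idea above can be sketched roughly as follows. This is only an illustration, not Haystack's actual API: the `Plugin` and `Store` classes, the `fetch` method, and the sample plugins and project names are all hypothetical. Each wrapped tool reports its items along with the project they belong to, and the shared store groups them by project instead of by application.

```python
from abc import ABC, abstractmethod


class Plugin(ABC):
    """Hypothetical interface a wrapped point tool would implement."""

    kind = "item"

    @abstractmethod
    def fetch(self):
        """Return an iterable of (title, project) pairs from the tool."""


class Store:
    """Shared store: plugins hand items over, the store ties them to projects."""

    def __init__(self):
        self.plugins = []
        self.by_project = {}

    def register(self, plugin):
        self.plugins.append(plugin)

    def sync(self):
        # Rebuild the project view from scratch on every sync.
        self.by_project.clear()
        for plugin in self.plugins:
            for title, project in plugin.fetch():
                self.by_project.setdefault(project, []).append((plugin.kind, title))


class MailPlugin(Plugin):
    """Stand-in for a wrapped email client (made-up data)."""

    kind = "email"

    def fetch(self):
        return [("Kickoff agenda", "client-a"), ("Invoice", "home")]


class BookmarkPlugin(Plugin):
    """Stand-in for a wrapped browser's bookmarks (made-up data)."""

    kind = "bookmark"

    def fetch(self):
        return [("API docs", "client-a")]
```

      After registering both plugins and calling `sync()`, `store.by_project["client-a"]` holds the email and the bookmark side by side, which is the activity-centred view described above.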

  2. nefarious purposes? by Anonymous Coward · · Score: 1, Interesting

    I'm worried that this could be used for data mining/viewing by the federal gov't. Having this software know all about you is a bad idea.

  3. I just want a relational filesystem... by Xerithane · · Score: 5, Interesting

    Not too much to ask, it doesn't even need to be truly a filesystem. Just overload all the file access commands (At this point, probably easier to just write a new filesystem)...

    Group data by category, content, whatever. "Symlink" to the inodes, and you're off. We don't need AI for that and I think it would be a more complete solution. I don't see an AI engine that can correctly categorize my mp3's, I don't think I'd trust it for all of my data yet. Let's start small and get usable systems.

    Spiffy program though, wish it weren't in Java... wish it weren't 42MB... wish it ran smoothly under Linux. I'll stop complaining now.

    On a side note, did anybody else find that scrolling image annoying and mentally confusing? Er, I'll really stop complaining now.

    --
    Dacels Jewelers can't be trusted.
    1. Re:I just want a relational filesystem... by jorleif · · Score: 2, Interesting

      The problem with the term AI is that usually, when something "AI" starts to work well, it gets called something else. Remember that compilers and information retrieval (google) used to be AI once. This begs the question: What did you mean when you said fuzzy logic, but not AI?

      Is AI as in symbolic AI (search-based etc.)? It seems to me that all sensible information categorization systems, even those built on top of fuzzy logic could reasonably be called AI. Google uses very sophisticated data-categorization algorithms, or at least it seems so based on my search results. Those are probably based on statistical classifiers and other such AI techniques.

    2. Re:I just want a relational filesystem... by Xerithane · · Score: 2, Interesting

      Is AI as in symbolic AI (search-based etc.)? It seems to me that all sensible information categorization systems, even those built on top of fuzzy logic could reasonably be called AI. Google uses very sophisticated data-categorization algorithms, or at least it seems so based on my search results. Those are probably based on statistical classifiers and other such AI techniques.

      I'm using Fuzzy Logic as just a way of branching true-false trees. Not so much a full-blown AI system, just (as you said) statistical qualifiers.

      I don't view AI as "AI" -- it's mostly types of AI. To me, AI is something that is entirely abstract enough to handle tasks (Think self-configuring Universal Turing Machine) -- otherwise it's just statistical programs over very broad data sets.

      I've done some work with neural networks (indirectly) and that was the general consensus there, as well... the mindset just stuck.

      --
      Dacels Jewelers can't be trusted.
    3. Re:I just want a relational filesystem... by jorleif · · Score: 2, Interesting

      I don't view AI as "AI" -- it's mostly types of AI. To me, AI is something that is entirely abstract enough to handle tasks (Think self-configuring Universal Turing Machine) -- otherwise it's just statistical programs over very broad data sets.

      I might misunderstand you, but does this not mean that the "AI"-methods used by Haystack would not be AI? I also have some Neural Network / Statistical processing background so I tend to share this same view that the intelligence is in the designer not in the program. However, the term AI is often used to describe both neural and statistical methods. Actually neural networks could also be perceived as statistical methods. I don't know about fuzzy logic but my intuition tends to hint that they aren't fundamentally very different either, just based on different mathematical theory.

      To return to your relational filesystem idea, I would prefer having both relational search capabilities and google-like utilization of implicit information. I don't know about you, but I certainly am not very good at being consistent, so search and selection methods allowing some kind of fuzziness criteria would certainly be nice. The basic OS could still work with files, but my personal information should be organized in some more practical way, and should preferably be available all the time from diverse clients (home computer, work computer, PDA, cellphone...).

  4. The Allinwonder Pro File System? by istartedi · · Score: 2, Interesting

    combining Email, IM, web pages, etc. into a single inbox

    Whatever happened to the "does one thing, and does it very well" philosophy? If I sorta remember that I got something in an e-mail, I look in my e-mail. What's the advantage of throwing away that piece of information (where it came from)?

    Yes, it's nice to use the computer to do grunt work for us, but there are some things that are better left to the user. Some of us like to come up with little "systems" for organizing things that are unique to us. We've all heard stories of the receptionist who files contacts under 'D' because new contacts are always invited for Drinks. An AI is not going to be any more rational than that, and the kooky system it devises won't be in our heads--it'll be in some obfuscated format that nobody will understand, not even the ditzy receptionist.

    --
    For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
    1. Re:The Allinwonder Pro File System? by jorleif · · Score: 2, Interesting

      The "does one thing, and does it very well" philosophy was based on systems where you could integrate these utilities easily with each other, the UNIX command-line. Haystack is actually similar after a fashion: it makes all information processable within the same framework. With desktop applications you have separated information into different applications that work on different information. What you would actually want to do is separate the different tasks into specialized interfaces optimized for that task, keeping the information processable by other programs when needed.

      Receptionists might be stupid and AIs are not that bright either, but you must admit that a local google probably would allow finding the contact people anyway just based on contextual information.

  5. So is this another search engine ? by Rosco P. Coltrane · · Score: 2, Interesting

    I'll believe in their AI when I can type "X free" as a search query and it returns a link to www.xfree86.org instead of a million links to pr0n sites. Does this AI learn what people usually search for? Is it able to determine over time that capital-"X" and "free" in my particular searches are about opensource graphical software, unlike the same query by the dirty old man next door?

    By the way Haystack people, when you use titles and phrases containing "universal", "seeks to bring [...] to the average user", "artificial intelligence", it trips my PR bullshit meter. I was about to bail out when I noticed the download link.

    --
    "A door is what a dog is perpetually on the wrong side of" - Ogden Nash
  6. Re:Performences... by IamTheRealMike · · Score: 2, Interesting
    I suspect they embed IE. We do that at work for one of our Java apps, it's very easy if you have the right tools (we use neva coroutine, see nevaobject.com).

    It does reduce your portability somewhat of course :) I've been getting our app to run using Wine. Internet Explorer in a JVM in Wine on Linux is a bit bizarre, but we haven't seen any major speed problems with it so far.....

  7. Agents... by orn · · Score: 5, Interesting

    Haystack is an interesting idea, but I have a hard time distinguishing what it does from what, say, Lotus Notes does. And Lotus is _terrible_.

    I like the idea of bringing all my information together in one place. I don't like the idea of only having it in that one place. What I would like would be an application that can watch how I use the computer, then bring those applications together to make it more seamless.

    For example, I have about four different calendars in my life: the work calendar, the one on the cell phone that I use for stuff that I can't miss, the calendar that schedules airplane rentals, and (of course) my girlfriend's calendar. So how do I bring those all together, and yet still be making entries in them separately?

    The same is true for information. I have a primitive blogging system (really just a bunch of text files that are date coded), I have work documents that I use regularly, I have web pages that I monitor (sometimes a little too often) and I have textbooks that I'm reading (instrument flying at the moment). So how do I get all these forms of information - or at least an index into them - together in one place? But again, without changing the current organization scheme.

    This is the tool that will make the computer a lot more useful - an actual organizational tool.
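
    A rough sketch of the calendar case above, assuming each calendar can be read as a list of (start, title) pairs. The calendar names, events, and the `merged_agenda` function are made up for illustration. The source calendars are only read, never rewritten, so entries would still be made in each one separately.

```python
import heapq
from datetime import datetime


def merged_agenda(calendars):
    """Return (start, calendar_name, title) tuples from all calendars in time order."""
    # Sort each calendar on its own, then merge the sorted streams.
    streams = (
        sorted((start, name, title) for start, title in events)
        for name, events in calendars.items()
    )
    return list(heapq.merge(*streams))


# Hypothetical sample data.
calendars = {
    "work": [(datetime(2003, 9, 1, 9, 0), "Standup")],
    "flying": [(datetime(2003, 9, 1, 7, 0), "Plane rental")],
    "phone": [(datetime(2003, 9, 2, 18, 0), "Dentist")],
}

for start, source, title in merged_agenda(calendars):
    print(start.isoformat(), source, title)
```

    Each merged entry keeps the name of the calendar it came from, so the combined view works as an index into the originals and does not replace them.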

    Rudy

    --
    1. 2.
    1. Re:Agents... by gobbo · · Score: 4, Interesting

      Wholehearted Agreement with the parent. Lotus Notes is shoved into our laps at work, and it's been a struggle to part out its functionality into the proper parts: Mail.app, Safari/Camino, Address Book (waiting for proper LDAP support, grr), iCal, and other 'business' tools, on my machine. [Not that I'm an Apple Software Fanatic, but they work and fit into the budget.]

      L.Notes had a whole wing on the now-MIA Interface Hall of Shame. It reinvents the conventions found on other platforms (it tries to be a platform unto itself) and does so badly; it's buggy, slow, and designed for administration [decent encrypted document database scheme].

      Plus, it centralizes, for better or worse, all my information on servers controlled by I.T..

      Now I'd love to have a central app that takes feeds from my favourite info management apps, sorts/ranks/prioritizes/interrelates the items for me according to my usage and prefs, and lets me 'zoom in' to a task by switching to the preferred stand-alone app at will. Haystack has only part of the picture, the model is still gather-control, rather than sift-sort-go.

      One item I've found intriguing is StickyBrain, a sticky-on-steroids app, by Chronos LC, which takes info in many categories and allows for quick index searching, plus offers system-wide info-archiving services and some alarm and word-processing features. I had the same kind of thing running with BBEdit, a notes directory, and grep, but it was like hammering nails with a wrench.

      I want all my info hotlinked to lists of related items, dynamically: make every significant word a keyword, realtime. After all, what are multi-GHz and piles'o'RAM for, anyway, when not rendering?
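
      The "every significant word a keyword" idea is essentially an inverted index. A minimal sketch, assuming items are plain strings keyed by name. The stopword list, the three-letter cutoff, and the function names are arbitrary choices for illustration.

```python
import re
from collections import defaultdict

# Illustrative stopword list only; a real one would be much longer.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from"}


def build_index(items):
    """Map each significant word to the set of item names containing it."""
    index = defaultdict(set)
    for name, text in items.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in STOPWORDS and len(word) > 2:
                index[word].add(name)
    return index


def related(index, name):
    """Rank other items by how many significant words they share with `name`."""
    scores = defaultdict(int)
    for names in index.values():
        if name in names:
            for other in names:
                if other != name:
                    scores[other] += 1
    # Most shared words first; ties broken alphabetically.
    return sorted(scores, key=lambda n: (-scores[n], n))
```

      Calling `related(index, some_item)` gives the list of hotlinks for that item. Rebuilding the index as items change is the part that would eat the GHz.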

  8. Re:I don't (and you probably don't either) by Xerithane · · Score: 2, Interesting

    A file system with the power and flexibility of a relational database ceases to be a file system. What are things like "cp" supposed to mean? How do you transfer "a row" through a serial connection? What kind of transactional guarantees is it going to make; if it's going to make DBMS guarantees, it's too slow for many file system applications, and if it's not going to do that, is it really a DBMS?

    I didn't say "relational database" -- I said "relational filesystem." As in, finding documents that are related to some other entity. I enjoy messing around in the Gimp. Sometimes I do work related images, other times it's just for fun. I'd like to put every image under $HOME/gimpwork. However, I like to find out which ones are for work and which job, for fun, etc.

    I'd like to be able to say "ls --category=work $HOME/gimpwork" and show only those files. This doesn't require a database, it requires a few meta flags.

    File copying is the same, ls is the same, everything is the same. Maybe just a wee bit slower.

    If you want a database, just use a database. MySQL and various embedded databases are widely available on Linux now; no need to clutter up the kernel.

    You wouldn't have to clutter the kernel. A system that I am envisioning could reside purely on top of any existing filesystem. It could have a DB backend (but that would be overkill).

    There are some logistics problems that would make it easier to be in a kernel module -- but assuming everybody would use the proper set of commands, it could keep everything in sync just fine without mucking in kernel space.
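
    To make the "few meta flags" idea concrete, here is a minimal userspace sketch. It assumes the flags live in a hypothetical sidecar file (`.tags.json`) in each directory. The `tag` and `ls` functions are illustrative stand-ins, not options of the real `ls`.

```python
import json
from pathlib import Path

TAG_FILE = ".tags.json"  # hypothetical sidecar file name, not part of any real tool


def load_tags(directory):
    """Return {filename: [categories]} for a directory, or {} if nothing is tagged."""
    index = Path(directory) / TAG_FILE
    if not index.exists():
        return {}
    return json.loads(index.read_text())


def tag(directory, filename, category):
    """Attach a category to a file by recording it in the sidecar index."""
    tags = load_tags(directory)
    categories = tags.setdefault(filename, [])
    if category not in categories:
        categories.append(category)
    (Path(directory) / TAG_FILE).write_text(json.dumps(tags, indent=2))


def ls(directory, category=None):
    """List regular files, optionally only those carrying `category`."""
    tags = load_tags(directory)
    names = sorted(
        p.name for p in Path(directory).iterdir()
        if p.is_file() and p.name != TAG_FILE
    )
    if category is None:
        return names
    return [n for n in names if category in tags.get(n, [])]
```

    With this, `ls(gimpwork, category="work")` plays the role of `ls --category=work $HOME/gimpwork`. It also shows the sync problem mentioned above: a plain `mv` or `rm` knows nothing about the sidecar file, so the flags go stale unless everybody uses the tag-aware commands.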

    --
    Dacels Jewelers can't be trusted.
  9. Have you tried it? by jamie(really) · · Score: 2, Interesting

    I've been waiting for this for a few weeks now. I've been looking for a PIM that has email, calendar, and tasks. Apart from Outlook, what product has that? I have recently tried:

    Outlook
    OSAF's Chandler PIM
    Haystack
    Pogomail (not a PIM)
    Eudora (not a PIM)
    Mozilla

    I am now using Mozilla because it has bayesian spam filtering built in and because it has a calendar plug in.

    I have decided not to use Haystack. It is simply not production ready, and I'm sure the guys at MIT won't mind me saying so. It crashes. It locks up. It doesn't have undo!!!! I can't tell you how many times I screwed up one of the panels and couldn't get it back. I also couldn't figure out how to delete spam. I get about 200 emails per day, of which 8 aren't spam. I could use a pop filter, but I have an IMAP client too.

    However, I am very impressed by this software and it is absolutely the way forward. I *want* my information integrated. I want my tasks to automatically reference the people I need to do them with and the web pages I used for reference and the dates in my calendar. I want my contacts to appear in many different categories, instead of as a different copy in each category all of which I'd have to update.

    I want email and calendar and tasks to be like a light switch or a tv. I want to just turn it on and it all be there. This software is fabulous and you would all benefit from giving it a test drive, even if you ultimately uninstall it.

  10. Just because it's from MIT... by Fly Ricky - The Wine · · Score: 2, Interesting

    doesn't mean it's actually a properly framed idea. The market pressures of usability have pretty much spelled out the answer long ago... different functionality calls for discrete apps in this instance. There's simply not enough synergy between IM and Email being in the same place to make it worthwhile... it's just cluttered. It would have happened long ago and been successful if it were useful because it's not technically very difficult to accomplish. Blah. Some ideas that come out of that place are pretty weak (and others rock.) Oh well.

  11. No, an arbitrary desktop menu by chriso11 · · Score: 2, Interesting

    What I think would be cool would be a multidesktop type of environment. No, I'm not talking about multiple virtual desktops either.

    You could have a different desktop for each project. You might have several emails for the given project, a few documents and spec sheets, some pictures, and some code. Keep the hierarchical file system underneath. Everything on the desktop is a link to something in the filesystem. Make it easy to copy, manipulate and navigate between different desktops. Basically, this would be an alternative hierarchy, independent of the filesystem hierarchy.
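
    A bare-bones version of this can be approximated with symlinks, assuming a filesystem that supports them. In the sketch below each "desktop" is a directory of links. The function names and layout are made up for illustration.

```python
import os
from pathlib import Path


def link_into_project(project_dir, target):
    """Create a symlink in project_dir pointing at target; return the link path."""
    project = Path(project_dir)
    project.mkdir(parents=True, exist_ok=True)
    target = Path(target).resolve()
    link = project / target.name
    if not link.is_symlink():
        os.symlink(target, link)
    return link


def project_items(project_dir):
    """Map each link name on a project desktop to the real file it points at."""
    return {
        p.name: p.resolve()
        for p in sorted(Path(project_dir).iterdir())
        if p.is_symlink()
    }
```

    The real files stay wherever the filesystem hierarchy puts them, and the same file can be linked onto several project desktops at once. What the sketch does not handle is a target being moved or renamed, which leaves a dangling link.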

    --
    No, I don't trust in god. He'll have to pay up front, like everybody else.
  12. Enfish Onespace by vivarin · · Score: 2, Interesting

    http://www.enfish.com

    Same thing -- hard to make it fast enough.