Haystack: A More Compelling View Of Your Data
Peristaltic writes "MIT's Haystack project has released the source for its "Universal Information Client", Haystack.
In their words: 'Haystack looks into the use of artificial intelligence techniques for analyzing unstructured information and providing more accurate retrieval.' Unlike some attempts I've seen in the past to pull it all together on my desktop, Haystack shows some promise -- one of its more useful features allows you to take the information you've been wallowing through, and have Haystack continually refine a 'dynamic hierarchy' until you get what you need. Haystack also performs some neat tricks such as combining email, IM, web pages, etc. into a single inbox."
The ultimate test for such a system is putting my inbox into the information stream. At the end of the day, 99% of it better be trashed automagically.
Is this just me? Do all of you want your programs shoved together in one large application?
You mean like a Window Manager? That's how I see this thing... it's like a Window Manager with applications embedded inside of it (think of a forced dock type thing.) It just handles whatever data you present it with (or the computer presents it with) automatically.
I didn't get any options on my cell phone (like text messaging) because I purchased a cell phone. I wanted a cell phone. To make calls. Nothing else.
My cell phone has Bluetooth, PDA functions, games, voice recording, voice dialing... that's the great thing about choice. Neither you nor I are the entire market.
The problem with having everything separate is that relevant communications cannot get to you fast enough. You may forget to check your email or your voicemail, or not log into AIM. I think you don't want it because you haven't seen it work well, or to your benefit. Oh yeah, is the girl that would be upset with you your mom?
And that's exactly my problem, the flexibility.
I can create and manage my own set of windows, programs, etc. for the functions I want to do.
That doesn't mean that others won't find this program useful, but I certainly wouldn't want to use it.
And even if the program was incredibly flexible, I still would rather do it myself. I don't need any more programs controlling all of my data in a centralized place. I do not own a tin foil hat, but I'd rather just manage it all on my own.
Good news! You're getting one: the successor to Windows XP will sport WinFS.
Yes, that is great news--the attempt will break so many things that it will seriously hurt Microsoft.
Anyways, in a few decades someone will write a free-as-in-no-money version for lunix. So hold tight.
DBMS-based file systems have been around for decades; there are good reasons why people aren't using them.
Linux has several file systems using database technologies (as well as change notification). However, what Linux doesn't have is a file system that lets you perform arbitrary relational operations. That's because such a "file system" would simply not conform to the interface and semantics expected of a file system, and lots of things would break.
But, of course, if you are Microsoft, you don't have to worry about standards, you just merrily break things and redefine APIs whenever you please.
I do not want one large program to run all of my applications. I do not want to get my email from the same place I get my web pages and my IM. I don't want any of this.
So I take it you're not running Windows, Internet Explorer, and MSN Messenger?
Well, even if you're running Linux, Mozilla, and AOL Instant Messenger, they're still running on the same physical hardware and using the same window manager software in order to keep the interface consistent and organized.
And that's the point of this project and several other next-gen file systems in development now... Presenting users with a unified and organized interface that shows them their data in a way they can find it easily. From a user perspective, it makes more sense to store information as "messages that came in from Bonnie" rather than have a separate file storage device for e-mail, IMs, voicemails, etc.
You might think it's simpler to have a physical device manage each communications protocol you use, and I'm sure product manufacturers will continue to support you with products based on that concept. However, most users would rather have their computers keep the difference between protocols to themselves.
It doesn't matter how the information gets to the computer as much as what the information is and which person or organization is credited as the author. That's the best way to present information to a user who doesn't care about tech stuff.
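To make the "messages that came in from Bonnie" idea concrete, here is a minimal sketch of a sender-first inbox. It is not how Haystack stores anything; the `Message` class, the `group_by_sender` function, and the sample data are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical record: one message, whatever protocol delivered it."""
    sender: str
    protocol: str  # e.g. "email", "im", "voicemail"
    body: str


def group_by_sender(messages):
    """Group messages by who sent them, ignoring how they arrived."""
    inbox = defaultdict(list)
    for message in messages:
        inbox[message.sender].append(message)
    return dict(inbox)


# Illustrative data only.
messages = [
    Message("Bonnie", "email", "Lunch on Friday?"),
    Message("Clyde", "im", "brb"),
    Message("Bonnie", "voicemail", "Call me back."),
]

unified = group_by_sender(messages)
for sender, items in unified.items():
    print(sender, [m.protocol for m in items])
```

The protocol is still recorded on each message, so nothing is lost; it just stops being the thing the user has to navigate by.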
You say "Group data by category, content, whatever" and then say "we don't need AI for that". Well, you're almost right, but you need some intelligence in order to make decisions about what the content of file X really is. You could say, "well, yeah, that's me..." but the point of this and other Knowledge Management systems is that it takes the responsibility of categorization off of the user, because we are often inconsistent, or, at least, incomplete. Let's say I have a document that pertains to two or more general topics, let's say Pollution, Energy Use and Windmills. Let's also say that right now I'm using it for a school report on alternative energy, so I classify it, quite sensibly for now, by year, course number, and assignment. That's totally useless in a few years when I'm looking for the information. I *could* have been smarter and manually attached some metadata to the file describing the kinds of topics it relates to, but I may miss one, and plus, that's extra work for me. Projects like this use complicated (usually statistical) analysis to determine the content for you automatically, and maintain a persistent database of all files related to particular topics/content items, etc. Haystack and many others do this categorization with an ontology which predefines the topic groups or elements they care about. Some systems derive the content groups dynamically, and include fuzzy searching to allow you to find documents and files related to some keywords (or, if they're real good, a natural language query) you enter.
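For anyone who wants to see the ontology idea in miniature, here is a rough sketch. It just counts term hits against a hand-written topic list, which is far cruder than the statistical analysis real systems use, and it is not Haystack's method. The `ONTOLOGY` contents, the `categorize` function, and the `min_hits` threshold are all assumptions invented for the example.

```python
import re
from collections import Counter

# Hypothetical ontology: topic name -> indicative terms (made up for illustration).
ONTOLOGY = {
    "pollution": {"pollution", "emissions", "smog", "waste"},
    "energy use": {"energy", "electricity", "consumption", "power"},
    "windmills": {"windmill", "windmills", "turbine", "turbines", "wind"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def categorize(text, ontology=ONTOLOGY, min_hits=2):
    """Return every topic whose terms occur at least min_hits times,
    strongest match first. A document can land in several topics."""
    counts = Counter(tokenize(text))
    scores = {
        topic: sum(counts[term] for term in terms)
        for topic, terms in ontology.items()
    }
    matched = [topic for topic, score in scores.items() if score >= min_hits]
    return sorted(matched, key=lambda topic: -scores[topic])


report = (
    "Wind turbines generate electricity without the emissions "
    "and smog of coal power, cutting energy consumption from the grid."
)
print(categorize(report))
```

The point is that the school report gets filed under all three topics at once, without anyone having to remember to tag it.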
What you mentioned is not that different from what they're doing, except they're not making it transparent -- they're making it into a workspace.
I'll note also that categorization of text into topics or genres, while difficult, is easier than doing the same with music. The kinds of statistical analysis you can do on text don't lend themselves to Fourier decompositions. To properly categorize music (in my opinion at least, which admittedly counts for little) the best technique would be to separate and identify the individual instruments (voices) in the song. This makes categorization a bit easier because now you can get data for tempo, rhythm, sophistication of note progression, etc. on a per-instrument basis. I'm not sure it's possible, though.
My 57 yen.
And the whole idea here is that you can come up with your own system for organizing - the website says user-defined predicates get treated just the same as the built-ins. All the self-organizing stuff is just to speed things up - you can make corrections (no harder than doing it yourself in the first place) and it gets better at doing things the way you want.
And even if the program was incredibly flexible, I still would rather do it myself. I don't need any more programs controlling all of my data in a centralized place.
I think you are overreacting a bit. This is a research project after all, so it's hardly perfect. What about your current OS, doesn't it "control all your data in a centralized place"? How is this different except for being more convenient? Do you actually care if the programs operating on your data are different processes, or plugin threads? Do you actually care what the underlying data representation is as long as it's fairly efficient and allows for convenient operations?
I personally think contextual right-click menus are very convenient because they group functionality not by application, but by task. E.g. when you manage files you might also want to rename one, so that can be put in a contextual menu. The way computers work today is not an expression of great freedom that projects like Haystack endanger, but the result of systems evolution. This means that some rough points aren't being fixed even though their underlying reasons went away with newer hardware ages ago. Haystack-like things just offer a different view; if they are any good they might survive, otherwise they will die without causing any harm.
Yes, I agree: everyone is able to use the genre field, and freely does (or doesn't), but what of cases where a person doesn't know which genre to assign?
As well, wouldn't it be better if there were tags for multiple or meta genres? Doesn't the depth, and consequently the power, of a system increase as you increase the number of meaningful and useful connections?
I think people forget that if you don't put much thought into it, you really shouldn't expect much intelligence out of it.
Words to men, as air to birds.