Microsoft's Search Engine Plans
prostoalex writes "Andy Beal from SearchEngineGuide.com interviews Robert Scoble from Microsoft. Scoble tells the audience what current search technologies Microsoft is working on as part of its Longhorn/WinFS development as well as in the field of Internet. Scoble also discusses current problems with local drive and Internet searching, such as absence of metadata for a lot of files out there: "When I take pictures off of my Nikon, they have some metadata (for instance, inside the file is the date it was taken, along with the exposure information) but that metadata isn't useful for most human searches. For instance, how about if I wanted to search for "my wedding photos?" Neither X1, nor Windows XP's built in search would find your wedding photos. Why? Because they have useless names like DSC0001.jpg and there's no metadata that says they are wedding photos.""
Make it "Microsoft Search powered by GOOGLE". Then "maybe" it might function well. Also, metadata needs to be created by the user, and I ain't gonna be entering data on a keypad on my camera for every photo.
"It's so convenient to have a system where everyone is a criminal" - A. Hitler
For instance, how about if I wanted to search for "my wedding photos?" Neither X1, nor Windows XP's built in search would find your wedding photos. Why? Because they have useless names like DSC0001.jpg and there's no metadata that says they are wedding photos."
That's why you can change filenames and organize things into directories.
I think that's what "organization" is for. You place files like "DSC0001.jpg" in things called "folders", and then name the folder "Wedding" or something.
I dunno.
Bowie J. Poag
What happened to thumbnails?
So instead of offering their official toolbar for IE only (the one for Mozilla is unofficial), start to slowly phase out the Google Toolbar and replace it with the Google Browser which would basically be a Google branded Mozilla Firebird. With all the features that make Firebird great like Tabbed Browsing, with the addition of the Google Toolbar features such as PageRank, etc. All on a cross platform basis.
If people get used to downloading better browsers now, then they won't even notice when the next release of IE starts to reject the Google Toolbar.
Let them know what you think
If users didn't suck so much, then descriptive dir names would easily solve the problem of trying to locate a wedding photograph on a hard drive.
So what, the image file is named "DSC0001.JPG" -- who cares. Put it in a folder named "my images" and there's no wonder you can't find it!! Put it in a folder named "wedding photos", and then you've got something there!
The best way to describe it to the average joe (non)user is that directories/folders are analogous to folders in a filing cabinet. Would you file telephone bills, for example, under "mortgage" or "telephone"?
Thanks Microsoft for "my photos", and "my documents", and the like. We appreciate it!
Skiers and Riders -- http://www.snowjournal.com
I hope the industry sees the opportunities that Longhorn's WinFS opens up. We can either work together and share data with each other, or we can be afraid and keep data to ourselves.
Share data? with whom? how can you share data that is in either proprietary format or "patented XML" ???
It is following the OpenStandard that will help in "working together and sharing data".
Consensus is good, but informed dictatorship is better
"Why? Because they have useless names like DSC0001.jpg and there's no metadata that says they are wedding photos."
You're talking about EXIF, and the list of data there is long. Why you took the picture isn't part of it, and if you want the camera to interpret which part of the subject matter is root (noses...faces...age...sex...background...trees...daylight...horizon?), you've got more issues than just locating a particular photo.
You mean to say you don't know the date you got married? You're in trouble.... iPhoto on OS X at least breaks them out into folders according to either last imported and/or month/year etc. You're responsible for breaking them down further, in which case you don't search the entire drive later, you simply open iPhoto and take a short trip to your wedding folder, just like having a folder in a drawer in a cabinet in your home.
It's not really that hard, now is it? If you're dropping any files onto your drive randomly, the issue is with your basic housekeeping, not that a top level search tool seems blind to your target.
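For the date-based filing mentioned here, a rough Python sketch, assuming the file's modification time is a good enough stand-in for the date the picture was taken. It does not read EXIF at all (that would need a third-party library), and the `dated_folder` and `file_by_date` names are made up for illustration:

```python
import os
import shutil
import tempfile
from datetime import datetime

def dated_folder(path):
    """Return a 'YYYY/MM' folder name from the file's modification time.

    Assumption: mtime approximates the shooting date; EXIF is not consulted.
    """
    stamp = datetime.fromtimestamp(os.path.getmtime(path))
    return os.path.join(f"{stamp.year:04d}", f"{stamp.month:02d}")

def file_by_date(src_dir, dest_root):
    """Move every .jpg in src_dir into dest_root/YYYY/MM/ and return the new paths."""
    moved = []
    for name in sorted(os.listdir(src_dir)):
        if not name.lower().endswith(".jpg"):
            continue  # leave anything that is not a JPEG where it is
        src = os.path.join(src_dir, name)
        target_dir = os.path.join(dest_root, dated_folder(src))
        os.makedirs(target_dir, exist_ok=True)
        moved.append(shutil.move(src, os.path.join(target_dir, name)))
    return moved

if __name__ == "__main__":
    # Demo on throwaway directories so no real photos are touched.
    with tempfile.TemporaryDirectory() as camera, tempfile.TemporaryDirectory() as library:
        with open(os.path.join(camera, "DSC0001.jpg"), "wb") as f:
            f.write(b"placeholder, not a real JPEG")
        for new_path in file_by_date(camera, library):
            print(new_path)
```

That only gets you as far as month and year. Naming the folder "wedding" is still a job for a human.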
Would you want to trust your private data, gathered from government departments, purchases and financial transactions etc., being accessed by such a system run by any old governmental or business agency?
How about your private correspondence on friends' and acquaintances' home computers?
Microsoft culled the URL name:password@ functionality from Internet Explorer because it claimed it could not create a secure enough fix, yet in the same month, it yet again proposes a privacy nightmare such as this? Madness.
WinFS sounds promising, but unless Microsoft makes the WinFS specs open and free, it'll be yet another lock-in technology, which would be very disappointing.
Adding metadata to all your files would require a lot of time and effort, and if it's a closed technology, it'd be yet another reason people wouldn't want to even attempt switching to another OS. I can almost hear it now...
"This other OS looks cool, but I've spent so much time adding metadata to all my files, and I can't export that metadata to this other OS because the format is proprietary and patented... I'd better stick with Windows, switching OS's would be too hard..."
Sorry, someone had to state the blatantly obvious. As usual, all promising technologies coming out of Microsoft are poisoned. And most people don't even realize it. Not even intelligent people. Most .NET developers don't even realize that .NET's so-called "standardization" via ECMA doesn't really make it an open standard (lots of the "standardized" .NET technology is encumbered by patents).
-Teckla
"Why? Because they have useless names like DSC0001.jpg and there's no metadata that says they are wedding photos."
Metadata is a stupid concept. It puts the cart before the horse. Files should not have to 'know' about themselves, they are not objects.
You have to treat files as just files, their names are nothing more than identifiers, their contents are nothing more than contents.
By all means it's possible to build a great search capability into a filesystem, but you need to build the 'meta' data _outside_ the file.
A system built on file metadata is doomed to be incompatible with anything but the latest datatypes designed for it.
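A minimal sketch of what keeping the 'meta' data outside the file could look like, assuming a small SQLite index that maps paths to free-form tags. The table layout, the `open_index`, `tag_file` and `find_by_tag` helpers, and the example paths are all made up for illustration. None of this is WinFS or any real product's design:

```python
import sqlite3

def open_index(db_path=":memory:"):
    """Create (or open) a tag index that lives apart from the files it describes."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        " path TEXT NOT NULL,"
        " tag TEXT NOT NULL,"
        " PRIMARY KEY (path, tag))"
    )
    return conn

def tag_file(conn, path, *tags):
    """Attach one or more tags to a path; the file itself is never modified."""
    conn.executemany(
        "INSERT OR IGNORE INTO tags (path, tag) VALUES (?, ?)",
        [(path, t.lower()) for t in tags],
    )
    conn.commit()

def find_by_tag(conn, tag):
    """Return every path carrying the tag, wherever it sits on disk."""
    rows = conn.execute(
        "SELECT path FROM tags WHERE tag = ? ORDER BY path", (tag.lower(),)
    )
    return [row[0] for row in rows]

if __name__ == "__main__":
    idx = open_index()
    tag_file(idx, "vacation/DSC0001.jpg", "wedding", "family")
    tag_file(idx, "staging/DSC0001.jpg", "wedding")
    tag_file(idx, "work/DSC0042.jpg", "touchup")
    print(find_by_tag(idx, "Wedding"))
```

Because the index only stores paths, it works the same for any file type. The obvious catch is that it goes stale whenever a file is moved or renamed behind its back.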
Used to be, if you wanted to find a file real fast under Windows, you'd hit WINDOWS+F to pull up the find window, enter your search query, and Go.
Now if you're in front of an XP machine and want to find, say, all the pictures on the system, you can't just enter "*.JPG" anymore. You have to read what some animated dog is asking you, click on one of the options before you get to the search query window, then enter the query. Doesn't sound like much of a hassle, but it IS an extra step.
$cat
If your file system supported true symbolic links, your problem could be largely mitigated by using them.
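A small Python sketch of the idea, assuming a platform where unprivileged symlink creation is allowed (on Windows it may need extra rights). The `link_into` helper and the folder names are hypothetical:

```python
import os
import tempfile

def link_into(original, folder):
    """Make `original` show up inside `folder` via a symbolic link.

    The photo stays in one place; each extra folder just points at it.
    """
    os.makedirs(folder, exist_ok=True)
    link = os.path.join(folder, os.path.basename(original))
    if not os.path.lexists(link):  # do nothing if the link is already there
        os.symlink(os.path.abspath(original), link)
    return link

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        photo = os.path.join(root, "camera", "DSC0001.jpg")
        os.makedirs(os.path.dirname(photo))
        with open(photo, "wb") as f:
            f.write(b"placeholder, not a real JPEG")
        # One file, visible under two differently named folders.
        for name in ("wedding", "staging"):
            link = link_into(photo, os.path.join(root, name))
            print(link, "->", os.readlink(link))
```

The folder names then act as the searchable labels, with no copies to keep in sync.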
"Give away the stone, let the oceans take and transmutate this cold and faded anchor." - Maynard James Keenan
"Neither X1, nor Windows XP's built in search would find your wedding photos. Why? Because they have useless names like DSC0001.jpg and there's no metadata that says they are wedding photos."
Metadata will NEVER improve searching in this way unless the things that generate the content FORCE you to put it in before they can snap pictures, etc...
Even if people were forced to put metadata into all their files there is a big chance that typos and other errors in entering the info would occur. This will make the metadata totally useless in a search!
It's saved my bacon more than once. As we move away from text, we become completely dependent on metadata to find things. Standards for metadata need to be settled soon, or Moore's law means our computers will become less and less useful.
--Mike--
"As we move away from text, we become completely dependent on metadata..."
Exactly what do you think metadata is? This system would require more text than the current one. At present, you can rename the files and put them in folders, which works quite well if you have any organizational ability. Metadata would require dozens of unrelated pieces of info to be input, and a more complex retrieval (search) process would be required. While metadata standards are important, it's only advanced users who will be using them. How many "typical" users do you know that are going to search for a photo by the F-value?
And for the record, I've never used the "containing text" search, because I name files in unambiguous ways.
G
If Dylan had a more common name, like, uhm...Mike... the value would go down. What would you do to include the names of the other 4 people in the photo? How do you link it to Dylan's other photos, etc?
--Mike--
This is more useful than it would seem. I've read a bunch of posts that talk about keeping the wedding pictures in a folder called "Wedding", and that's the extent of the organization.
Except it doesn't work that way. If I dig around a little, I see that I have the same images in several places: in the folder called "Vacation", another folder called "work" where I did some touchups, another folder called "staging" where I laid things out before putting them on the server, and again on the server, where my family can view them on the web.
If I follow the suggestion of putting them all into a single folder, then I've created a logistical headache. The _only_ thing I've gained is the ability to find all the files at once. Using metadata, I would no longer have that restriction - I could put files where they made the most sense, and still find all the files at once.
ACDsee, a well-known and, at one time, free, image viewing and organising app, supports metadata. It puts it in a "descript.ion" text file in each directory. This is an ancient DOS standard. It's still supported by a few Windows apps, notably the Far manager (a shareware clone of Norton Commander for Win) and ReGet, a downloader; both Russian.
In fact I find the "descript.ion" metadata so useful I stick with apps that use it. At my last job, a web news site, I organised our image library using ACDsee and this metadata to add notes. ACDsee also has a nice batch rename.
No need to invent a whole bloody new file system to find your wedding photos.
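To show how little machinery that takes, here is a rough Python sketch of reading and searching such a file. It assumes the commonly described layout of one entry per line, the filename first (in double quotes if it contains spaces), then whitespace, then the free-text description. Individual apps may add their own extensions, which this ignores, and the function names are made up:

```python
def parse_description_line(line):
    """Split one line into (filename, description), or return None if unusable."""
    line = line.rstrip("\r\n")
    if not line.strip():
        return None
    if line.startswith('"'):
        end = line.find('"', 1)
        if end == -1:
            return None  # unterminated quote; skip the line
        return line[1:end], line[end + 1:].strip()
    parts = line.split(None, 1)  # filename, then the rest of the line
    return parts[0], parts[1].strip() if len(parts) > 1 else ""

def load_descriptions(lines):
    """Build a filename -> description dict from an iterable of lines.

    Filenames are lower-cased on the assumption that lookups are case-insensitive.
    """
    notes = {}
    for line in lines:
        parsed = parse_description_line(line)
        if parsed:
            notes[parsed[0].lower()] = parsed[1]
    return notes

def search(notes, word):
    """Return filenames whose description mentions `word`, ignoring case."""
    word = word.lower()
    return sorted(name for name, desc in notes.items() if word in desc.lower())

if __name__ == "__main__":
    sample = [
        "DSC0001.jpg Wedding, cutting the cake\n",
        '"DSC 0002.jpg" Wedding, first dance\n',
        "DSC0003.jpg Beach holiday\n",
    ]
    print(search(load_descriptions(sample), "wedding"))
```

In practice you would pass an open file object for each directory's "descript.ion" to `load_descriptions` instead of the sample list.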
Imagine a world where we're all broadcasting identity. Say we've got RFID-enabled nametags at the wedding. Now picture a camera that has a (preferably directional) RFID reader. Suddenly all your photos have the names of all the subjects automagically added as metadata. Scary in some contexts, useful in others (like most technologies, I suppose).