Slashdot Mirror


Microsoft's Search Engine Plans

prostoalex writes "Andy Beal from SearchEngineGuide.com interviews Robert Scoble from Microsoft. Scoble tells the audience what current search technologies Microsoft is working on as part of its Longhorn/WinFS development as well as in the field of Internet. Scoble also discusses current problems with local drive and Internet searching, such as absence of metadata for a lot of files out there: "When I take pictures off of my Nikon, they have some metadata (for instance, inside the file is the date it was taken, along with the exposure information) but that metadata isn't useful for most human searches. For instance, how about if I wanted to search for "my wedding photos?" Neither X1, nor Windows XP's built in search would find your wedding photos. Why? Because they have useless names like DSC0001.jpg and there's no metadata that says they are wedding photos.""

24 of 407 comments

  1. Search by date by Saven+Marek · · Score: 5, Interesting

    I can get around searching for "wedding photos" because I remember the date. 3 special days, and hundreds of wedding photos appear.

    It's part of being human that we don't necessarily remember the phrase "wedding photos" but we may remember many other tiny pieces of data about a shoot that are unique to us, and the time and date are among those. I can be certain the post-9pm photos taken on those days are pretty embarrassing.

    Just concentrating on "Wedding Photos" is useful if someone else is searching my picture archive, but that's not useful to me.

    nude geekgrrls

    1. Re:Search by date by higgins · · Score: 3, Interesting

      David Gelernter built a system called Lifestreams that basically claimed that time-ordered series plus some simple search and organization operators was everything you needed. It always seemed like a pretty good idea to me.

      That said, if you do have metadata available, you can do a lot with it.

    2. Re:Search by date by alangmead · · Score: 4, Interesting

      What if a system's search could draw information from all the applications within the system? For example, if your electronic datebook had a day-long entry for "wedding" and the photo manager has photos taken on that date, then a search for "wedding photos" would be able to find out when the wedding was, match it up with the date the photos were taken, and come up with the appropriate set.

      To some extent Apple tried this with "Newton Intelligence" on the MessagePad. If you wrote "Thursday, Lunch with Bob at Redbones", it would (after you fixed all of its handwriting recognition mistakes) look up Bob and Redbones in the address book, look in the calendar for the next occurrence of a Thursday, and schedule a noontime appointment.

      Newton Intelligence really only amounted to a small set of interapplication tricks, but it was assumed that as the popularity of the units increased, the functionality would be extended. (which pretty much tells you what happened to it.)
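
The calendar cross-reference alangmead describes above can be sketched in a few lines. This is only an illustration under stated assumptions: the `calendar` dictionary, the `photos` records, and the `search_photos` helper are all hypothetical stand-ins, not any real datebook or photo-manager API.

```python
from datetime import date

# Hypothetical data: neither structure comes from a real application.
calendar = {
    date(2003, 6, 14): "wedding",
    date(2003, 8, 2): "lunch with Bob",
}

photos = [
    {"name": "DSC0001.jpg", "taken": date(2003, 6, 14)},
    {"name": "DSC0002.jpg", "taken": date(2003, 6, 14)},
    {"name": "DSC0003.jpg", "taken": date(2003, 8, 2)},
]


def search_photos(query, calendar, photos):
    """Return names of photos taken on days whose calendar entry shares a word with the query."""
    words = {w.lower() for w in query.split()}
    # Days whose calendar entry contains at least one of the query words.
    matching_days = {
        day for day, entry in calendar.items()
        if words & set(entry.lower().split())
    }
    # The file names stay useless; the date is the join key.
    return [p["name"] for p in photos if p["taken"] in matching_days]


print(search_photos("wedding photos", calendar, photos))
```

The photo files never need renaming or tagging here; the only metadata used is the capture date the camera already records.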

    3. Re:Search by date by Lord_Dweomer · · Score: 2, Interesting
      What if photo meta data could be saved steganographically (spelling?) inside the image? Oh wait.....what if the image got edited, whoops.

      --
      Buy Steampunk Clothing Online!
  2. Re:I have a suggestion for em.. by queen+of+everything · · Score: 2, Interesting

    I think they already are...

    For a long time my site wasn't ranked on Google, MSN, or Yahoo! search. Then one day I was on the first page for Google. Amazingly enough, I was in exactly the same place on MSN and Yahoo searches. They all supposedly have their own crawlers, so why was it not until I was listed on Google that I was listed on the rest? Just a theory I have...it probably means nothing.

    --
    "Wisdom is not a product of schooling but of the life-long attempt to acquire it." -Albert Einstein
  3. Re:I'm not buying it by Saven+Marek · · Score: 4, Interesting

    Even easier than putting into directories is using a portfolio type application, like Picasa (the original version of Apple's iPhoto btw) which allows simple drag and drop library creation. You can have pictures in multiple libraries; it just takes a few moments to drop photos into their correct places and they are sorted as need be. If you want wedding photos, look in there; if you want photos of Janine, Kate or Benson, look in their respective folders.

    It doesn't need to be a morass of embedded folder after folder either, as humans have mental acuity unlike a computer. You may have uncle Bob who is photographed a lot and auntie Beryl who isn't, but all the photos of Beryl you may know will contain Bob. We can store a surprising amount of information, and perhaps 5 to 10 libraries is all you will need for most people's collections.

    Special occasions get their own. It just takes moments after downloading the photos.

    nude geekgrrls

  4. Check out Phil Greenspun's similar idea by Speed+Racer · · Score: 4, Interesting

    Phil Greenspun has a similar idea and is looking for help on how to accomplish this on a personal level with the existing Windows XP filesystem. Check out his blog post for details. There's already an interesting discussion taking place in the comments for that post.

    --
    Free Mac Mini. Yes, I'm
  5. Re:Hmmmm... by jd142 · · Score: 2, Interesting

    And even better, many photo programs allow batch renames. So while you're putting them in the wedding folder, rename them all to wedding####.jpg and let the program automatically append numbers.

    Reminds me of Scotty's line, "The more they overthink the plumbing, the easier it is to stop up the drains." They've developed a complex solution for a simple problem that already had a simple solution.

    While a database driven file system with the ability to let users define their own metadata fields in the database sounds really, really cool, I won't be using Microsoft's first or second version for anything I value.

    So what's the status of the *nix version of a database file system?
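
The batch rename jd142 mentions above is easy to sketch. The snippet below is a minimal illustration, assuming a hypothetical helper named `plan_batch_rename`; it only computes the old-to-new name mapping and does not touch any files on disk.

```python
from pathlib import Path


def plan_batch_rename(filenames, prefix, digits=4):
    """Map each original name to prefix + zero-padded counter, keeping the extension."""
    plan = {}
    # Sort first so the numbering follows the camera's own sequence.
    for index, name in enumerate(sorted(filenames), start=1):
        suffix = Path(name).suffix.lower()
        plan[name] = f"{prefix}{index:0{digits}d}{suffix}"
    return plan


plan = plan_batch_rename(["DSC0002.JPG", "DSC0001.JPG"], "wedding")
for old, new in plan.items():
    print(old, "->", new)
```

Keeping the plan separate from the actual rename makes it possible to review the mapping before applying it, for instance with `Path.rename`.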

  6. Re:Ouch by Otter · · Score: 5, Interesting
    Yeah, I'm sure being a computer science visionary is harder than it looks. But from the outside, all they seem to ever do is to announce that computer use is difficult because software developers aren't as smart as them, and that what we really need is some way for everything to magically sort itself out. Details of implementation to be left to those of less rarified brilliance.

    The closest thing to a workable scheme is Gelernter's Lifestreams stuff -- where your system knows that you got married on a certain date (even if you have trouble remembering it) and that documents (JPEGs, Word files, GNUCash transactions) from that time probably pertain to it.

  7. Adding metadata is not the way by Flyboy+Connor · · Score: 5, Interesting
    Scoble's idea is that you will add metadata to your files. Can you imagine? You have literally tens of thousands of files you created (photos, documents, etc.) on your hard drive and you are going to add metadata to all of them? Does he really think people are going to do that? If they would be willing to do that, they would just rename those photo files from "DSC00001.JPG" to "MyWedding00001.JPG".

    Judging from the interview, the "innovative" Longhorn seems to allow you to add metadata in a slightly user-friendly way. But virtually nobody will use it, except maybe to mark a few important files which you have stored in a special place anyway.

    So what would be a better solution then? My idea is that metadata should be added automatically. For instance, a human will recognize most wedding photos for what they are. Getting a computer to recognize this is not trivial, but lots of research is currently invested in this. Already computers can easily recognize general categories ("groups of people", "nature", "animal", "portrait"). My guess is that it is already possible to implement a system that you can train to let the computer recognize your particular brand of photos.

    I don't expect Microsoft to try to go into this way of innovation. They will probably wait until an entrepreneur develops it and then copy it or buy them out.

  8. Re:I'm not buying it by Anonymous Coward · · Score: 0, Interesting

    There should be no way to just click the "OK" button without having entered something. Or you could make Photi come back every 5 minutes saying "Lizten man, if you don't giff me ze names right now, I'll notify the authorities!! We haff ways to make you talk!!!"

  9. Thumbnails don't scale! by ka9dgx · · Score: 2, Interesting
    The fact is that thumbnails need to be at least 200x200 pixels before you can really tell what's in the picture. Once you pass the first few thousand photos, it's no longer feasible to visually search through them... your brain starts to hurt!

    I store them by date photographed, using ThumbsPlus to view thumbnails and metadata stored in a database. So far, it's worked out for the 45GB of photos I've taken in the past 5 years.

    --Mike--

    PS: Yes, I'll chat with and give ideas to anyone who wants to make this better... even Microsoft.

  10. Slashdot luddites by Anonymous Coward · · Score: 1, Interesting

    Hmm... the story's been up for a few minutes, and of course the Slashdot luddites have piped up with comments saying "i just need to put them in a folder. stupid microsoft."

    The point is folders only allow a single hierarchy of data. Sure you can make a Wedding Photos folder. But what if you also want a folder with all the pictures of Uncle Bob from multiple events, a folder with 5-star photos from multiple events, a folder with night photos, a folder with wild partying photos, and a folder with photos of centerpieces.

    The Longhorn WinFS will allow you to make queries saying "show me all the photos with Uncle Bob (from my mom's side) and Aunt Jane (from my dad's) that were taken in daylight at formal special events in the last two years that I've rated with 4 stars or more." This cannot be done with modern file systems (unless you want to use some stupid non-standard awkward file naming system that you think covers every possibility), although it can be done with other software (e.g. Photoshop Album), assuming you maintain the metadata... which in Photoshop Album, for example, is a simple drag-and-drop operation.

    The trick is that incorporating it into the file system means you don't have to reinvent the wheel. The metadata technology used for the photos can be used when you're writing, say, a music cataloging application (artist, genre, rating, keywords, composer, publication date, length) or a document repository (client, project, document type, importance, length) or a cataloging application for the terabytes of video files we're all going to have one day.

    It is, needless to say, a good idea and where file systems are heading in the future. People who want to defeat Microsoft would be well advised to see the benefits instead of sticking their heads in the sand.
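
To make the kind of multi-attribute query described above concrete, here is a small sketch using Python's built-in `sqlite3` module. It is not WinFS and makes no claim about how WinFS works; the `photos` and `people` tables, their columns, and the `photos_with_all` helper are assumptions invented for illustration.

```python
import sqlite3

# In-memory database with a hypothetical schema: one row per photo,
# plus a many-to-many table recording who appears in each photo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, name TEXT, rating INTEGER, daylight INTEGER);
    CREATE TABLE people (photo_id INTEGER, person TEXT);
""")
conn.executemany("INSERT INTO photos VALUES (?, ?, ?, ?)", [
    (1, "DSC0001.jpg", 5, 1),
    (2, "DSC0002.jpg", 3, 1),
    (3, "DSC0003.jpg", 4, 0),
])
conn.executemany("INSERT INTO people VALUES (?, ?)", [
    (1, "Uncle Bob"), (1, "Aunt Jane"),
    (2, "Uncle Bob"), (2, "Aunt Jane"),
    (3, "Uncle Bob"),
])


def photos_with_all(conn, people, min_rating):
    """Daylight photos rated at least min_rating that contain every listed person."""
    placeholders = ", ".join("?" for _ in people)
    sql = f"""
        SELECT p.name FROM photos p
        JOIN people pe ON pe.photo_id = p.id
        WHERE pe.person IN ({placeholders})
          AND p.daylight = 1 AND p.rating >= ?
        GROUP BY p.id
        HAVING COUNT(DISTINCT pe.person) = ?
        ORDER BY p.name
    """
    rows = conn.execute(sql, (*people, min_rating, len(people)))
    return [name for (name,) in rows]


print(photos_with_all(conn, ["Uncle Bob", "Aunt Jane"], 4))
```

A single folder hierarchy can express only one of these groupings at a time; the separate `people` table is what lets one photo belong to several overlapping sets.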

  11. More feature creep by krygny · · Score: 3, Interesting

    How many people have trouble finding files on their hard drive using the most basic search criteria? People who are so unorganized as to lose files on their hard drive are probably not sophisticated enough to use advanced search methods successfully.

    --
    Research shows that 67% of those who use the term "research shows", are just making shit up.
  12. Great ! Just what we were missing... by catyoul · · Score: 2, Interesting
    But, contacts in Outlook can't be used by other applications (...) By putting a "contacts" file type into the OS itself, rather than forcing applications developers to come up with their own contacts methodology. What if ALL applications, not just Outlook, could use that new file type? What if we could associate that file type to social software services like Friendster, Tribe, Yahoo's personals, or Google's Orkut? Would that radically change how you would keep track of your contacts? Would that make contacts radically more useful?
    Would that make worms spreading even better ? ;)
  13. MS already does this and nobody uses it by Pointy_Hair · · Score: 3, Interesting

    Ever look at the properties page of an MS Office file? There are enough metadata tags in there to keep you busy for hours.

    Does anyone really fill those in? Rarely.

    Is there a method to search on them? Never looked.

    Sometimes it's interesting to browse the properties page to see who really created a spreadsheet or document. For example, people who shamelessly "borrow" templates from former employers and are either not smart enough or too lazy to do just a little cleanup. But that's about it.

  14. Define 'metadata' by inkswamp · · Score: 3, Interesting
    "Because they have useless names like DSC0001.jpg and there's no metadata that says they are wedding photos."

    Doesn't storing your photos in hierarchical folders labeled appropriately count as metadata? I know it's not very flexible or powerful, but it's metadata of a sort. Store your wedding photos in a wedding folder in a photos folder.

    Now, if you're talking about a database of metadata about files, then that's something else.

    --
    --Rick "If it isn't broken, take it apart and find out why."
  15. calendar based metadata by chocolatetrumpet · · Score: 2, Interesting

    Maybe the photo software could check with your calendar, see that a certain date/time was "my wedding," and assign that metadata to photos as they are downloaded. Most photos already have time/date metadata.

    --
    Spoon not. Fork, or fork not. There is no spoon.
  16. meta data by WhatsAProGingrass · · Score: 2, Interesting

    Yeah, I can't wait to download stuff from the internet full of its own metadata. Isn't it true that search engines are not using metadata as much because of false data? The OS having its own contacts list might seem like a good idea, but I can see many people trying to hack into it and mass mail all your friends.

    --
    Mark
  17. Apple's Solution by JohnsonWax · · Score: 3, Interesting

    Apple has a solution to this, which has trade-offs, but seems pretty functional.

    Essentially, each of their iLife apps is a replacement for the Finder. Do we really need music search integrated with file search? Or is it sufficient to build independent metadata (ID3) and filestructure (playlists) just for music? That's really the brilliance of iTunes in that it never takes you back to your HD filestructure. You can even ask it to maintain the HD filestructure to reflect the metadata structure, so it'll keep everything in an artist/album/song structure, naming things as needed.

    iPhoto is set up the same way, but it's pretty apparent that the iPhoto guys are the 'B' team, since they haven't gotten it nearly as slick as iTunes yet, but it also has the equivalent of content metadata, playlists, and smart playlists. So, yes, I can easily find my wedding photos. The trade-off is that you can't search for 'Wedding' in the Finder and get wedding photos, wedding songs, etc. Maybe that's upcoming, but I'm not totally convinced of the value.

    The iTunes organizational structure does carry into iPhoto, so if you want to select a song for a slideshow in iPhoto, you can see your iTunes playlists, and filter against metadata. It also carries into iMovie, etc.

    Other posters have clearly identified the problems with metadata. File organization is generally only useful if you are willing to symlink across all of your metadata, otherwise your photos of your mom and your wedding photos are disjoint, since some should be in both places. The single biggest problem with metadata is putting it in to begin with. iPhoto now allows you to do that during photo import, using a slide-show type UI.

    I think MS's tendency to do everything in one place is interesting, but tends to not come off so well. Having everything in SQL could eliminate one of the shortcomings in Apple's implementation, which is that they need to maintain an XML intermediate structure for music files, photos, etc. While somewhat handy, its main function is to join file metadata and the FS, which means that it is somewhat fragile.

  18. Re:Ouch by cybpunks3 · · Score: 2, Interesting

    --
    The closest thing to a workable scheme is Gelernter's Lifestreams stuff -- where your system knows that you got married on a certain date
    --

    That's fine for personal photos, but what about MP3s or other acquired media which has no direct association with personal life events?

  19. Put a GPS in the camera by Animats · · Score: 2, Interesting
    The right solution is to put a GPS receiver in the camera and tag photos with time, date, and location. No user action is required at picture-taking time. Ricoh is already selling such a camera in Japan. Kodak has a camera that plugs into an external GPS, but that's too clunky.

    Pros would love this; often you want to search some big image archive for pictures of a specific location. Tourists would find their photos self-organizing.

    Lookup can then be by address, or using a map or globe. Think MapQuest.

    This offers the possibility of a new (and totally legitimate) peer-to-peer application - location based picture-sharing. See the pictures others took of tourist locations.
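
The location lookup Animats proposes above can be sketched with a great-circle distance filter. This is a minimal illustration, not how any camera or mapping service does it: the photo records, the approximate landmark coordinates, and the `photos_near` helper are all assumptions made for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for a sketch


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical GPS-tagged photos; coordinates are approximate.
photos = [
    {"name": "DSC0001.jpg", "lat": 48.8584, "lon": 2.2945},    # near the Eiffel Tower
    {"name": "DSC0002.jpg", "lat": 48.8606, "lon": 2.3376},    # near the Louvre
    {"name": "DSC0003.jpg", "lat": 40.6892, "lon": -74.0445},  # near the Statue of Liberty
]


def photos_near(photos, lat, lon, radius_km):
    """Names of photos whose tagged position lies within radius_km of the given point."""
    return [
        p["name"] for p in photos
        if distance_km(p["lat"], p["lon"], lat, lon) <= radius_km
    ]


print(photos_near(photos, 48.8584, 2.2945, 1.0))
```

Widening the radius turns a "this landmark" search into a "this city" search without any user-entered metadata at all.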

  20. iPhoto - The Application Paradigm by Slur · · Score: 3, Interesting

    I think it's cool that Microsoft is taking cues from the iApps - interesting that they want to integrate it so much into the operating system. Whereas so far Apple is stressing an application-centered solution on top of a more general-purpose filesystem, Microsoft is getting deeper into the integration game, getting into file metadata a la BeOS, and tracking files according to thematic relevance a la relational databases.

    If the "smart desktop" idea catches on it will be interesting to see the response from developers on Mac OS X and Linux, as far as offering intelligent activity tracking. Somehow I see a twisty maze of documents and activities, all alike.

    Should operating systems do all the work of organizing users' files for them, concealing the filesystem behind a database veneer, or behind a purely task-oriented veneer? Should this kind of thing be left to application developers, like the maker of Path Finder?

    Wouldn't Windows be more useful if it was a truly modular system that could be configured simply by stripping away unwanted components? Isn't that what makes Darwin so healthy in the enterprise market today?

    --
    -- thinkyhead software and media
  21. PC based Google search application by Twister002 · · Score: 2, Interesting

    What I'd like to see come out of Google, is an add in that will categorize and search my local drives using the Google search algorithm. They have Google appliances that businesses can buy and use internally. I'd like to see a home based, and home priced, version of that application. Maybe have it search the internet as well, present the results separately. So if I'm looking for a file containing the words "efficient search keywords" (or something like that) it shows me files in my local system (including network shares maybe) as well as results on the internet.

    --
    "For a successful technology, honesty must take precedence over public relations for nature cannot be fooled." -Feynman