Google to Launch Mac Version of Google Desktop UPDATED
phaedo00 writes "Arstechnica is reporting that Google today announced that they are pursuing a Google Desktop for Apple's Mac OS X. Google chief executive Eric Schmidt said it had to be rebuilt from the ground up because of the fundamental differences between the Mac OS and Windows. 'We intend to do it,' Schmidt said." Update: 10/30 23:51 GMT by M : Seems like Reuters and others may have heard wrong about a potential Mac version.
I'm curious what improvements Google will make to the overall user experience of Mac OS X. Search is already a fundamental part of the Mac desktop experience: virtually every application features a search field in the upper-right hand corner of the window (lower-right-ish for some bizarre reason on iCal). The Google mantra of "search, don't sort" is at least partially alive on this platform today.
The current list of supported formats in addition to HTML includes TeX, DVI, PS, full text, mail, man pages, news, troff, WordPerfect, RTF, Microsoft Word/Excel, SGML, C sources and many more. Stubs for PDF support are included in Harvest and will use Xpdf or Acroread to process PDF files. Adding support for a new format is easy due to Harvest's modular design.
There are a few others, do your own homework if you want them
I'll do the stupid thing first and then you shy people follow...
The idea of desktop search is good, but I think the Google version is lacking in a few details.
.doc .ppt etc. formats.
You cannot define which directories to index, and it only indexes a single machine (understandable, since it's desktop search, not small-network search).
Google's search keeps its index of the data on the desktop hard drive. If you have lots of files, the index gets insanely large; some say nearly 2 GB when you have a large number of documents lying around.
It would be relatively easy to build something similar that works over administrative shares, using Samba crawlers with a defined administrative password for each machine, and you'd have control over which data it collects. Maybe NFS crawlers too. Plenty of both are freely available.
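To make the "you choose what gets crawled" idea concrete, here is a minimal sketch in Python. It assumes the Samba or NFS shares are already mounted as ordinary directories; the paths in `ROOTS` and the function names are made up for illustration, and only file names plus the contents of `.txt` files are indexed.

```python
import os
import re
from collections import defaultdict

# Hypothetical list of directories to index. On a real network these could be
# mount points of Samba or NFS shares (the paths here are placeholders).
ROOTS = ["/mnt/share-a", "/mnt/share-b"]

def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(roots, text_extensions=(".txt",)):
    """Map each word to the set of file paths whose name or text contains it."""
    index = defaultdict(set)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                words = set(tokenize(name))
                if name.lower().endswith(text_extensions):
                    try:
                        with open(path, encoding="utf-8", errors="ignore") as fh:
                            words.update(tokenize(fh.read()))
                    except OSError:
                        pass  # unreadable file: index the name only
                for word in words:
                    index[word].add(path)
    return index

def search(index, query):
    """Return the paths that match every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Calling `build_index(ROOTS)` once and then `search(index, "porsche 911")` would return only files under the directories you listed, which is exactly the control the current Google tool does not give you.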
Tricky part is to create the meta indexing of the containing
But more open development would allow other kinds of indexing, such as ID3 tags.
And perhaps you could add your own metadata to indexed files by file type, and enhance the search, for example for images only, with a meta description, something like: "meta this image has: cat vase window apple". Search for apple and it returns that picture. Crude, but it works at least partially.
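A rough sketch of that hand-written metadata idea, in Python. The `TagIndex` class and its method names are invented for this example; it simply splits the description into words and lets you restrict a search to certain file extensions.

```python
from collections import defaultdict

class TagIndex:
    """Hypothetical store for user-supplied descriptions attached to files."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add_meta(self, path, description):
        """Attach every word of the description to the file as a tag."""
        for tag in description.lower().split():
            self._by_tag[tag].add(path)

    def search(self, tag, extensions=None):
        """Return sorted paths carrying the tag, optionally filtered by extension."""
        hits = self._by_tag.get(tag.lower(), set())
        if extensions is not None:
            hits = {p for p in hits if p.lower().endswith(tuple(extensions))}
        return sorted(hits)
```

For instance, after `idx.add_meta("photos/still-life.jpg", "cat vase window apple")`, a call to `idx.search("apple", extensions=[".jpg"])` returns that picture and skips any text files that happen to mention apples.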
The problem with this kind of version is that you'd need a separate server for the searching; you could reuse some old machine for this (not a problem for most people here, since everyone has an extra box somewhere on the intranet).
Run the search on MySQL + Apache and it would be almost platform independent.
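As a sketch of what the database side might look like, here is a word-to-path table queried with plain SQL. Note the swap: this uses Python's built-in `sqlite3` module as a stand-in for MySQL so the example runs without a server, and the table and function names are assumptions of mine. The web front end (Apache in the suggestion above) is left out.

```python
import sqlite3

def open_index(db_path=":memory:"):
    """Open (or create) the index database with one word/path table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        " word TEXT NOT NULL,"
        " path TEXT NOT NULL,"
        " PRIMARY KEY (word, path))"
    )
    return conn

def add_document(conn, path, words):
    """Record that each word occurs in the file at path."""
    # INSERT OR IGNORE is SQLite syntax; MySQL spells this INSERT IGNORE.
    conn.executemany(
        "INSERT OR IGNORE INTO postings (word, path) VALUES (?, ?)",
        [(w.lower(), path) for w in words],
    )
    conn.commit()

def find(conn, word):
    """Return the paths containing the word, sorted by path."""
    rows = conn.execute(
        "SELECT path FROM postings WHERE word = ? ORDER BY path",
        (word.lower(),),
    )
    return [row[0] for row in rows]
```

Because every client only needs to send a query to the server holding this table, the machines being searched from can run any OS.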
There are no atheists when recovering from tape backup.
The competition is going to be tough on the Mac platform, with LaunchBar and Quicksilver already there, and do not forget Apple's upcoming Spotlight. Seems like another fine example of a function at which the Mac platform is ahead of its competition: "fast access to content".
Most importantly, this is not about APIs, this is about data. What this is all about is searching and indexing data files, and from this point of view the files on a typical Mac OS X machine and a Linux desktop will be quite different.
For instance, on Mac OS X some data files are actually bundles, i.e. a directory with a special bit telling the Finder to handle the folder as a single file. Keynote files are bundles with the extension .key that contain an XML manifest and the different files included in the presentation. Older Mac OS file types would store some metadata (icons, keywords) in the resource forks. Those things have, as far as I know, no equivalent in the Linux world.
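To illustrate why bundles matter to an indexer, here is a small Python sketch that walks a directory tree and reports each bundle as one item instead of descending into it. One loud assumption: bundles are recognised here purely by extension (`.key`, `.app`), not by reading the Finder bit described above, and the function name is made up.

```python
import os

# Assumption: bundles are recognised by extension only. The Finder flag
# mentioned in the text is not read by this sketch.
BUNDLE_EXTENSIONS = (".key", ".app")

def list_documents(root, bundle_extensions=BUNDLE_EXTENSIONS):
    """Yield paths under root, reporting each bundle directory as one item."""
    for dirpath, dirnames, filenames in os.walk(root):
        bundles = [d for d in dirnames if d.lower().endswith(bundle_extensions)]
        for name in bundles:
            yield os.path.join(dirpath, name)
        # Prune bundles in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in bundles]
        for name in filenames:
            yield os.path.join(dirpath, name)
```

A naive crawler without the pruning step would list the XML manifest and every embedded image of a presentation as separate search hits, which is not what the user thinks of as "the file".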
On the other hand, a Linux version would have to cope with the differences between distributions (what source code should be indexed on a Gentoo machine?), the different desktop managers (they might store interesting information), and different file formats (it would be nice if it could parse tgif files, for instance).
In the end, it is all about data, not about licences, APIs or anything else. The whole point of meta-data and searching, for me, is not about indexing my music collection (I keep it organised), but to be able to search my old files, which include Quickdraw 1 Picts and Word 4.0 (DOS) files.
In this case, though, what the non-Apple competition is going to be offering (at least in relation to Spotlight) is much less.
Disclaimer: I've used GDS beta on Windows, and I've used Spotlight on the Tiger WWDC preview. I'm sure what both companies will offer in successive versions will be more advanced.
GDS on Windows is a nice idea that's limited by the small number of data formats that it supports. The only file formats it understands are the ones specifically baked into it by Google. There is no way (at present) for a developer to add support for custom file formats, nor does it give you any way to exploit the metadata already present in many very common file formats (e.g. JPEG, PNG, MP3). In other words, if I had a 1024x768 picture of a Porsche 911 called "Porsche 911.jpg" on my HD, I could find it with GDS only by searching for "porsche" or "911" or ".jpg". On the plus side, the formats that Google already knows about (e.g. AIM logs, Outlook [gack] emails) are well-supported.
Spotlight, however, indexes the inbuilt metadata as well, so not only could I search on parts of the filename, as above, I could also search for "picture files that are 1024 x 768" or have "epson" in their EXIF tags. In addition, if I write a graphics app and use "marmoset's magnificent graphics format" (MMGF) as my native storage format, I can write a Spotlight plugin that tells the OS how to understand the "underpants gnome" tags I've embedded in the images.
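The plugin idea can be sketched generically in Python. To be clear, this is not Spotlight's importer API; the registry, the decorator, and the on-disk layout of the "MMGF" file (first line "width height", second line a comma-separated tag list) are all invented for illustration.

```python
import os

# Hypothetical registry: file extension -> function(path) returning a dict.
_IMPORTERS = {}

def importer(extension):
    """Decorator that registers a metadata importer for one extension."""
    def register(func):
        _IMPORTERS[extension.lower()] = func
        return func
    return register

def metadata_for(path):
    """Return the file name plus whatever a registered importer adds."""
    meta = {"name": os.path.basename(path)}
    ext = os.path.splitext(path)[1].lower()
    handler = _IMPORTERS.get(ext)
    if handler is not None:
        meta.update(handler(path))
    return meta

@importer(".mmgf")
def read_mmgf(path):
    # Made-up format: line 1 is "width height", line 2 is a tag list.
    with open(path, encoding="utf-8") as fh:
        width, height = fh.readline().split()
        tags = [t.strip() for t in fh.readline().split(",") if t.strip()]
    return {"width": int(width), "height": int(height), "tags": tags}

def matches(meta, **wanted):
    """True if each wanted value equals the attribute or is in its list."""
    for key, value in wanted.items():
        have = meta.get(key)
        if isinstance(have, list):
            if value not in have:
                return False
        elif have != value:
            return False
    return True
```

With an importer registered, a query such as `matches(metadata_for(path), width=1024, height=768)` works on the custom format, while a file with no importer still yields its name, which is roughly the filename-only situation described for GDS.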