Google Experiments With Local Filesystem Search
Teoti writes "No, Puffin is not the next name of your favorite email client, but, according to the New York Times (NSA reg. req.), the project codename for a new Google search application coming directly to your desktop that will let you search your local filesystem efficiently. This is different from, but complementary to, the Google DeskBar that already lets you search the Web. The article also gives a few words on the end of the standalone browser in Longhorn."
I certainly hope this isn't a Windows-only thing.
Honey, I shrunk the Cygwin
So, will I get ads based on my data?
I recently searched several hundred thousand files on my work machine. It took nearly 90 minutes to complete the search. I expect Google will be able to significantly improve upon that. They're one of the few companies that I really trust to do the right thing.
will we see 'adsense' words based on which file we are searching for?
there must be a motive for this, some sort of expected gain, or why?
for the most part, google's actions are benign, I believe the claim that gmail scans are automated and innocuous.
but what's the benefit to google for this one?
every day http://en.wikipedia.org/wiki/Special:Random
to go thru the wiki, jpg filenames+exif data, home directories, SQL database, etc. A Google type interface is what I'm looking for.
For those infants out there, Lotus Magellan was the greatest, it was Windows Explorer as it should have been done, it searched any spreadsheet, database, or word processor file.
Gawd, Linux needs this. I would pay ~$250.00 for an industrial strength business version.
[Google] going to reach a point where they stretch their resources too thin?
Google researchers are allotted 20% of their working time to do outside projects or to follow personal interests. Google News and Gmail were both results of work done during this "20%" time. So in short, no, I don't think Google has really stretched their resources any more so than before.
-- Kircle
I remember Alta Vista offered this sort of search-your-own-computer software back in *1998*. This seems to be the most recent version: http://siliconvalley.internet.com/news/article.ph
Since Microsoft considers Google a major competitor and has its target set on Google with Longhorn's capabilities, I think it would be a great idea if Google started distributing their own version of the Mozilla web browser. With Google's reputation, there would definitely be more people making the switch to Mozilla based browsers if Google were to do this. After all, Netscape is considered a failure now by the public and Mozilla to a casual observer lacks credibility no matter how great the product is.
"Right now, somewhere in this world, Scott Baio is plowing a woman he doesn't love," - Peter Griffin, *Family Guy*
You know, when I read the line about dispensing with the web browser as we know it in their next release, I find myself thinking.... there will never be tabbed browsing in any Microsoft "browser".
I can't imagine not having this feature and it floors me that Microsoft can't imagine anyone ever needing it.
Perhaps Google can fill this void in the pathetic Windows power tool-set ("Windows power tool-set" being close to an oxymoron).
But, despite my love for Google, in these more Orwellian times, I'm glad that I have the tools (not from MS) to monitor port activity.
Sigs are bad for your health.
This is a good idea, so long as it doesn't allow others to search my filesystem.
But what if they could? If google cached, online, the location of MP3s and MPEGs loaded on your system, then allowed others access (with your permission of course). Hmm... sounds like a P2P file sharing system...
-- If god wanted me to have a sig, he'd have given me a sense of humor.
Normally I just need to know file names, so I do something simple like du -ak / > /var/tmp/all so "all" is a catalog of all files.
/var/tmp/all for the files I need and do quick egrep's. Saves me time when I need .conf files that have the line I need, or .hidden files that I need to source or read.
:)
If I need to do text search, I have a little for sh script that will look for a prefix in
If I don't need to hit the FS for finding files, a catalog already speeds this up. I've started doing this in Cygwin to speed up searches also. (Gotta love having unix tools under Windows)
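For anyone who wants the same catalog trick without a unix toolchain, here is a rough Python sketch of the idea. The function names build_catalog and search_catalog are made up for illustration, and it assumes no file name contains a newline, since the catalog stores one path per line.

```python
import os
import re


def build_catalog(root, catalog_path):
    """Walk root once and write every path found to catalog_path, one per line.

    Returns the number of entries written. Hypothetical helper, not a real tool.
    """
    count = 0
    with open(catalog_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                out.write(os.path.join(dirpath, name) + "\n")
                count += 1
    return count


def search_catalog(catalog_path, pattern):
    """Return catalog entries matching a regex, without touching the filesystem."""
    regex = re.compile(pattern)
    with open(catalog_path, encoding="utf-8", errors="surrogateescape") as f:
        return [line.rstrip("\n") for line in f if regex.search(line)]
```

Something like search_catalog("/var/tmp/all", r"\.conf$") then plays the role of the quick egrep. Like the du version, the catalog goes stale as soon as files change, so it has to be rebuilt every so often.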
Call me crazy, but I actually just keep logically structured directories and make sure to save items into the appropriate location... It's much simpler to take 10 seconds to place a file in the appropriate directory at the start than to hunt for it later.
Even when a file crosses multiple logical groups, (picture, jpg, family, nephews, 2004) if my information categories are sensible, and I use a hierarchy that makes sense to me, I don't need search that often. In fact, I can't recall the last time I had to do a search of my drive to find a file. (I should probably mention that my work requires a lot of information mapping, so creating and maintaining such a structure is trivial for me)
Of course, since Windows search is so inefficient and (sometimes) problematic, I learned long ago not to rely on it.
bluez3
Interested in a Flash-based MAME front end? Visit mame.danzbb.com
Since Google's toolbar and deskbar don't work in Linux, this software probably won't either. What do you use for searching the contents of your files in Linux?
Google, well aware of this threat, hired a Microsoft product manager last year to oversee the Puffin project as part of its strategy to compete with Microsoft's incursion into its territory.
That's the first time that I've ever read of it going in a direction away from Microsoft. Usually, it's the other way around, Redmond sucking up the managers and staff if they can't buy or steal the technology.
After the Google appliance, this seems like an expected move. The desktop is certainly key from a marketing sense.
However I don't see a lot of overlap with web search. The major pieces won't work the same:
Crawling: People want fresh information, eg that marketing report that just went out five minutes ago. Many web sites are happy to be crawled once a month. Keeping up with user edits on a filesystem is going to be a lot harder, and users will probably not be happy with heavy reindexing cycles. The ultimate would be heavily integrated with the filesystem, keeping an eye on all file activity, and refreshing the index appropriately. I believe Longhorn's delays are related to this problem.
Indexing: Desktops have a lot of file types, and strange crypts like Outlook's. Certainly Google has some support in this area, but more may be needed. There are also other document units like email messages instead of files, or even database records.
Fetching: Granted, a simple search toolbar will work, but I've been more impressed with, for example, Apple's Sherlock protocol, which allows multiple search "channels", eg Web, News, Stocks, etc., some from third party providers. IIRC this is what Firefox uses.
Ranking: Pagerank is definitely not going to work, although that may not be such a handicap when hit counts are in the one or two-digit range. Still, it's not a competitive advantage.
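To make the crawling point concrete: without hooks into the filesystem, about the best a desktop indexer can do is rescan and reindex only what changed. Below is a minimal Python sketch of that polling approach, assuming modification time plus size is a good enough change signal. The names scan and diff are hypothetical, and this is not a claim about how Puffin or Longhorn actually work.

```python
import os


def scan(root):
    """Return {path: (mtime_ns, size)} for every file under root."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff(old, new):
    """Compare two scans and return (to_index, to_drop) as sorted path lists.

    to_index holds new or modified files; to_drop holds files that disappeared.
    """
    to_index = sorted(p for p, sig in new.items() if old.get(p) != sig)
    to_drop = sorted(p for p in old if p not in new)
    return to_index, to_drop
```

Each pass still walks the whole tree, which is exactly the heavy cycle users would complain about. It only saves the cost of re-reading unchanged files, and a report saved five minutes ago stays invisible until the next pass.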
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
I want Google search on my .pst files from Outlook. Searching for a keyword through 2+ years of email takes FOREVER with the built-in search feature in Outlook. We're talking 5 or 10 minutes here.
:-P
And if I had a nickel for every time I had to resend something to a co-worker because they were too goddamned lazy to just search their email for the message I sent them THE FIRST TIME, well, Google wouldn't need an IPO because I'd just buy them outright!
That being said, filesystem searches with Google would be damn nice too.
Mechanik
1. Netscape conquers the browser market...
2. Netscape IPOs and climbs to some insanely high value...
3. Microsoft integrates browser into OS...
4. Netscape crumbles...
- - - - fast forward - - - -
1. Google conquers the search market...
2. Google IPOs and climbs to some insanely high value... (coming soon)
3. Microsoft integrates search into OS... [Longhorn] (coming eventually)
Where do you think the rest of this goes?
Back in the late 1990s, I used the AltaVista Desktop personal search software. I used it on my Windows 98 computer back then. It was great: if I needed to find something on my computer I would use keywords and it would find all the matching documents instantly. It seemed to already have everything on my computer indexed, so it instantly knew where everything was.
Unfortunately, what I downloaded was only a demo version of the program that was only good for 90 days or something like that. When I decided to purchase the software I discovered that there really was not a reasonably priced version available for individual users. All that was available was extremely expensive versions intended for large companies. They did not even make an attempt to market it for users of home computers or small businesses. So even though I loved the software I had to stop using it. If I remember correctly it indexed not only text files but also MS Word documents, HTML, and my e-mail.
When searching for documents on my computer I always used the advanced search feature and did a boolean search using terms such as AND, OR, NOT and NEAR. It was very efficient. I now use Linux instead and have occasionally used grep, egrep, sed, awk and find, but would prefer to also have the option to use a search engine on my home computer. I hope whatever Google comes up with will be available for Linux or at least will run under WINE or CrossOver Office. Of course, I would only use it if it is implemented in a way that does not invade my personal privacy. By the way, when searching the Internet, AltaVista does not seem to be using the same powerful search engine with boolean operators that they once used, so I recently switched to Google instead.
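For the curious, boolean operators like these fall out fairly naturally once you have an index that remembers word positions. Here is a toy Python sketch. The helper names are invented, the default NEAR window of 10 words is an assumption rather than a documented AltaVista setting, and a real engine would need file-format parsers and on-disk storage on top of this.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to {doc_id: [positions where it occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def docs_with(index, word):
    """Set of documents containing the word."""
    return set(index.get(word.lower(), {}))


def near(index, a, b, distance=10):
    """Documents where a and b occur within `distance` words of each other."""
    hits = set()
    for doc_id in docs_with(index, a) & docs_with(index, b):
        pos_a = index[a.lower()][doc_id]
        pos_b = index[b.lower()][doc_id]
        if any(abs(x - y) <= distance for x in pos_a for y in pos_b):
            hits.add(doc_id)
    return hits
```

With sets as results, AND is `docs_with(idx, "x") & docs_with(idx, "y")`, OR is `|`, and NOT is `-`, while NEAR needs the stored positions. That is why the index keeps them instead of a plain word-to-file mapping.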
I also wonder how all this will compare with the new search engine that Microsoft is developing for WinFS under Longhorn. I hope that by then Linux will be offering equally good search capabilities. I seem to recall hearing that Hans Reiser is in some way working on upgrading the ability of the Linux ReiserFS file system to be searched. Is that correct?
Way back in the day Altavista had a personal search engine. It ran under win9x and basically brought the features of the search engine to your personal docs. It could index almost all office type docs (no not just MS Office but all three of the major suites), email (Outlook and any mbox application), etc. I kept it running under win2k by doing an in-place upgrade but unfortunately it would not install under 2k or above so when it came time to reformat I lost the ability to use it. The indexer ran on a schedule or could be run manually, it would not only index local files but also one or more websites so before RSS you could use it as a news aggregator. Overall it was very cool and I can't wait to see how Google implements the idea. Frankly it makes such a large productivity boost in your workflow that it's almost as big of an upgrade as from win9x->2k+ is.
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
slocate is a great little program to speed up the process of finding files on your *nix computer system, but it's not a full-text indexer. Finding the names of files like slocate does is not the same as finding words that appear within those files. It is a great replacement for "find / | grep $PATTERN" though.
Locate32 is a program that can replace your built-in Windows FIND function, including indexed searches.