Google Experiments With Local Filesystem Search
Teoti writes "No, Puffin is not the next name of your favorite email client, but, according to the New York Times (NSA reg. req.), the project codename for a new Google search application coming directly to your desktop, that will let you search your local filesystem efficiently. This is different from, but complementary to, the Google DeskBar that already lets you search the Web. The article also gives a few words on the end of the stand-alone browser in Longhorn."
I certainly hope this isn't a Windows-only thing.
Honey, I shrunk the Cygwin
Will Google's search application functions feature Clippy? Or that damned animated XP Dog?
...exactly what "local filesystem image search" will return.
Finally, a way to effectively search through my gigabytes of pr0n!
FS searching has absolutely sucked until this. Find By Content from Apple was a step forward, but it never worked too well. Here's hoping this search will make it into OS X!
Quid festinatio swallonis est aetherfuga inonusti?
Africus aut Europaeus?
Google will also be able to catalogue the contents of your refrigerator, medicine cabinet, and be able to tell you your car keys are between the couch cushions.
Click here for CNET version.
Hmmm.
Perhaps I do not realize the full potential of the Find utility in Windows, but MAN does it suck.
Do you love freedom??? Do you love freedom!!! DO YOU LOVE FREEDOM!!!!!!!!
Wonder whether they'll start serving me ads based on my hard drive contents...
So, will I get ads based on my data?
Is it me, or has Google decided to go off in many different directions recently? I know they have been growing very strongly, but are they going to reach a point where they stretch their resources too thin?
30% Troll, 50% Underrated, 10% Interesting
Score:5, Troll
I recently searched several hundred thousand files on my work machine. It took nearly 90 minutes to complete the search. I expect Google will be able to significantly improve upon that. They're one of the few companies that I really trust to do the right thing.
I'll bet it still can't compete with slocate and find.
I'm selling my K5 acct.
X1 seems to be the most popular one out there.
DiskMeta, they had this project in beta for a while; the Windows product went into release just last week, the site says.
DT Search, I remember their ads in a bunch of computer magazines, although I have never used them myself.
EFS, found it on download.com, supports MS Office and PDF as well as other formats.
No-Reg Link
Your hair look like poop, Bob! - Wanker.
No
If you have followed Microsoft developments around Longhorn you might have noticed that search is one of the top priority features that Microsoft is going to integrate directly into the operating system. So once Longhorn is released Microsoft would become the biggest competitor to Google's search applications on the web as well as the desktop (with this application).
Search is the next big thing on which a lot of players are concentrating and Microsoft entering the field has skewed the competition towards the desktop and everyone including Google is preparing for the battle.
They aren't competing with Microsoft today. They are competing with Microsoft 2 years from now when Longhorn is, potentially, supposed to be released. As the article states, Microsoft is looking towards more of a natural language (i.e., "Where are my car pictures?") approach rather than simple search terms. It could be a pretty good battle between them, but I think Google might have a bit of an edge.
Hmmm.
The company who puts a cookie on your computer that doesn't expire until 2038, has the ability to see lots of personal information about you, and who is interested in storing and indexing all of your email correspondence until the end of time, now wants to index my hard drive for me?
Call me paranoid, and mod me down because I'm sharing a negative opinion of Google, but I don't think I'm going to be giving this same company the ability to sift through my entire hard drive.
Seems to be like a rehash of the AltaVista Desktop search ...
:-)
I keep looking at Google and thinking "wow, this is just like AltaVista, without the death spiral!"
HERE
a search of localhost
weird huh?
Their insurgence into all aspects of our technology is scaring me. Then again it would be nice to have an index of everything so we could do a verbal search for common everyday items:
"Google, find my car keys."
"Thank you sir,
Google World has located them at:
right where you left them when you came home smashed at 2:30am last night from the titty bars."
I remember Alta Vista offered this sort of search-your-own-computer software back in *1998*. This seems to be the most recent version: http://siliconvalley.internet.com/news/article.ph
Well, first this idea is part of Microsoft's WinFS plans. The idea with WinFS was partially born when Microsoft developers realized that major parts of the web can be searched faster than a user's hard drive. It will be interesting to see how this application will collide with Microsoft's plans, that's for sure. It's basically fast searches and enhanced metadata support that are the key parts of WinFS, which is in turn a key part of Longhorn.
Second, an indexing software that does the same thing is already available today and worked very well when I tried it out. It's actually almost perfect, except for the fact that it causes occasional hard drive thrashing as it tries to keep the index up-to-date. This is unfortunately a rather major downside, but if you can bear with this, you'll get literally instant file searches on your entire hard drive -- it narrows down the possible matches as you type each letter. It even indexes file contents for small files. I'm talking about X1.
Beware: In C++, your friends can see your privates!
all those utilities take a long time when searching on a 200G partition. I'd love to have something blazingly fast. Is that too much to ask for?
Google should ask Microsoft for the information it has to provide according to the antitrust settlement so that Google's own program can interoperate with Windows as well as Microsoft's!
Is it me or are they just trying to really dumb down computers?
I have 100 gigs on my server and I can find shit I put in there 5 years ago in about 2 minutes or less. I guess some people just aren't organized ;-)
Is this next? http://ergopod.ca/images/googlekeys1.jpg
This image was on fark but I can't find it now. See how long before my server gets /.'d
I'm not anti-social, I'm anti-idiot.
Now, when Google can tell me where I put my keys, then I'll be impressed.
Locate takes a while to build its database, but after that locate is very quick.
"Not my manner of thinking but the manner of thinking of others has been the source of my unhappiness." - M
As a developer trapped in Windows I find this little tool incredibly useful.
"I can not bring myself to believe that if knowledge presents danger, the solution is ignorance" - Isaac Asimov
(Warning: lack of cynicism ahead)
Seeing as they've built an empire on goodwill, a high-quality free search service, and word-of-mouth name recognition, I'm tempted to guess that their big benefit is continued goodwill and good karma from their userbase.
Yes, this is a novel concept in a business world where most companies look at customers and see numbers. Thing is, it's goodwill and a user-centric business plan that have made Google the great company it is.
It could be that the 'catch' you're looking for is that Puffin will further solidify their already strong user relationship.
Obliteracy: Words with explosions
Since Microsoft considers Google a major competitor and has its target set on Google with Longhorn's capabilities, I think it would be a great idea if Google started distributing their own version of the Mozilla web browser. With Google's reputation, there would definitely be more people making the switch to Mozilla based browsers if Google were to do this. After all, Netscape is considered a failure now by the public and Mozilla to a casual observer lacks credibility no matter how great the product is.
"Right now, somewhere in this world, Scott Baio is plowing a woman he doesn't love," - Peter Griffin, *Family Guy*
R&D is what keeps a company from becoming stagnant, and having to try to find new ways to squeeze money out of what it has. [For those companies that sell a tangible, especially a tangible disposable product, it's not as big of a deal].
... for short term, you focus on advertising, to try to convince everyone that you have a superior product, as opposed to actually making a superior product, and waiting for people to come to you]
But to remain profitable in the long term, you diversify -- so you're not as likely to take a massive downfall from a single competing company. And you try to find new products and solutions, to improve what offerings you have (that whole concept of innovation).
Google's got their IPO coming, so they'll have a nice little bit of cash to work with to improve their chances of continuing their current rate of growth. [however, they're looking at long term growth, not short term
Any company with a big R&D section would have some form of review process for projects -- if things change, you might shelve a project, and reassign people, because you're not sure if it's going to be as profitable as you originally thought. Depending on the field, you might have some board meeting every 3-12 months to review the current projects, and reassign resources, to make sure you don't stretch your resources too thin, and to identify which projects could benefit from extra funding.
Build it, and they will come^Hplain.
Perhaps Google can fill this void in the pathetic Windows power tool-set ("Windows power tool-set" being close to an oxymoron).
But, despite my love for Google, in these more Orwellian times, I'm glad that I have the tools (not from MS) to monitor port activity.
Sigs are bad for your health.
This is a good idea, so long as it doesn't allow others to search my filesystem.
But what if they could? If google cached, online, the location of MP3s and MPEGs loaded on your system, then allowed others access (with your permission of course). Hmm... sounds like a P2P file sharing system...
-- If god wanted me to have a sig, he'd have given me a sense of humor.
Normally I just need to know file names, so I do something simple like du -ak / > /var/tmp/all so "all" is a catalog of all files.
/var/tmp/all for the files I need and do quick egrep's. Saves me time when I need .conf files that have the line I need, or .hidden files that I need to source or read.
:)
If I need to do text search, I have a little for sh script that will look for a prefix in
If I don't need to hit the FS for finding files, a catalog already speeds this up. I've started doing this in cygwin to speed up searches also. (Gotta love having unix tools under windows)
You are thinking from the point of view of trying to find just one single file. Searching is useful for dynamically pulling together all the files you have that are related to a specific subject. It's good to have them organized on your filesystem, but you can only organize a filesystem one way.
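For anyone who wants the same catalog trick without du and egrep, here is a minimal Python sketch of the idea: walk the tree once, save every path to a flat file, then run regex searches against that file instead of the filesystem. The function names build_catalog and search_catalog are made up for illustration, and the sketch assumes no file name contains a newline.

```python
import os
import re


def build_catalog(root, catalog_path):
    """Write every file path under root to catalog_path, one per line.

    Returns the number of paths written. Assumes paths contain no newlines.
    """
    count = 0
    with open(catalog_path, "w", encoding="utf-8", errors="surrogateescape") as out:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                out.write(os.path.join(dirpath, name) + "\n")
                count += 1
    return count


def search_catalog(catalog_path, pattern):
    """Return catalog lines matching the regex, reading only the catalog file."""
    regex = re.compile(pattern)
    with open(catalog_path, encoding="utf-8", errors="surrogateescape") as f:
        return [line.rstrip("\n") for line in f if regex.search(line)]
```

Something like search_catalog("/var/tmp/all", r"\.conf$") would then list the .conf files recorded at the time the catalog was built. Like the du version, the catalog goes stale until it is rebuilt.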
Call me crazy, but I actually just keep logically structured directories and make sure to save items into the appropriate location... It's much simpler to take 10 seconds to place a file in the appropriate directory at the start than to hunt for it later.
Even when a file crosses multiple logical groups (picture, jpg, family, nephews, 2004), if my information categories are sensible, and I use a hierarchy that makes sense to me, I don't need search that often. In fact, I can't recall the last time I had to do a search of my drive to find a file. (I should probably mention that my work requires a lot of information mapping, so creating and maintaining such a structure is trivial for me)
Of course, since Windows search is so inefficient and (sometimes) problematic, I learned long ago not to rely on it.
bluez3
Interested in a Flash-based MAME front end? Visit mame.danzbb.com
b) have fun!
Sunny Dubey
Per the article's comments about Longhorn and the "end of the browser" and how MS is planning to integrate network access with local services and applications to the point where a browser won't be necessary.:
Did I miss something? I thought Microsoft integrated the net with the local pc back in 1997 when they released IE4 and Windows 98 with desktop integration. Hrmmph... Go figure.
Ok, I'm being facetious.
Still, I'm not so certain this is a feature I want. In fact, until someone can demonstrate an example of why it would be useful, I'm certain I don't. I like having the local PC as a distinct domain separate from the net! I like that I have to open a program to access information that isn't stored locally! What am I missing about this -- is their focus group testing indicating that using a browser is just too confusing?
You know what's confusing? Windows HELP -- and not just how you use it, but THAT IT EVEN EXISTS AT ALL! My lusers come up to me all the time with questions that could easily be answered with good ole' F1.
What bothers me is that all of the work going on at Microsoft is pointed at new ways to annoy me. You want to make me a happy user? Get your lousy freaking vendor partners to stop auto-running useless programs in my system tray; cancel ActiveX (*without* adding the TDMA crap I don't want) and get rid of the Windows registry. My main concern whenever I hear about these new thingamabobbers they're cooking up is "Eeek! How hard is it going to be to turn *that* off? I sure hope R&D cancels it before Longhorn gets out of beta." I honestly think it's time they consider forking the project, or XP is my last version of Windows. Period.
There's just no joy in Windows anymore, you know what I mean?
Sincerely,
Eagerly awaiting Debian Sarge going stable in Ohio.
"Lawyers are for sucks."
- Doug McKenzie
From the article:
Microsoft believes that Longhorn users will no longer think about where information is stored; they will instead see a unified view of documents stored on both the Internet and on the desktop.
I don't like this idea. At all.
The main problem from my point of view has to do with ownership and control. Generally speaking, what's physically on my machine(s) is *mine*, that is subject to my total control (we'll leave aside intellectual property issues). I can add, change, delete, etc.
Still generally speaking, what's on some machine I access over the net is *not mine* in the sense that my control is reduced. Usually other people can do something with that information (again, add, change, delete) and if the machine is taken offline, I have no access and no control at all.
As a simple example, consider a web page. In one case I make a local copy of it on my machine. In the other case I just have a bookmark. The difference in control is fairly obvious...
Now, what happens if we make users believe there's no difference between their local hard drive and Internet? That we drill into their heads that they are the same?
Well, you still have no control over information stored on the 'net. Thus, if you were trained to think that the local drive and the 'net are basically the same, then you would expect to have no control over information stored on your hard drive.
Note that by an amazing coincidence, that's also the goal of DRM -- that you have no control over information (that they call content) stored on your hard drive.
Also note that the flip side of the coin -- making your hard drive irrelevant by switching to a subscription service for everything, from OS to applications to content, is also a highly popular idea in Redmond and elsewhere.
So color me highly suspicious with regard to that idea...
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
From the article: "The project was started, in part, to prepare Google for competing with Windows Longhorn, which according to industry analysts will dispense with the need for a stand-alone browser."
Yeah, because IE is such a compelling product today that I have little need for an alternative.
Ryosen
One man's "Troll, +1" is another man's "Insightful, +1".
The Google Booty-bar, which searches your address book late at night and lists women's numbers that are interested in getting together.
Clearly, you don't use your computer that seriously. I have thousands of files, with many GB of data, accumulated over years, at home. At work, there is a ton of stuff to manage. And guess what? I sometimes have to find something in someone else's files, or they in mine, because the owner is busy. We don't all think alike, after all.
Let's see... then there's project data collections where lots of people are putting things. Employees leave. Some folks just aren't organized. Some people get sent lots of stuff they have to save but not read right then, but which eventually becomes important.
There are lots of reasons that make this a good idea. Yeah, I have homegrown solutions on Linux, but a good, fast tool on any platform is a good idea. We all use Linux at home, but there's no way my wife is going to use grep, find, etc. She hates computers. If she can click on a button, type a word or phrase and get a list, just like any web-based search engine, she'll use that. And I know quite a few folks like that - on every platform with more than a few thousand users.
I have hundreds of word documents, PDF files, text files, e-mails in two different systems, etc.
I purchased Find from Enfish (http://www.enfish.com) and it saves me several minutes every day. They have fancier products, but $50 for the Find application is all that I needed.
No, Puffin is not the next name of your favorite email client
But how do we know it's not the next name of my favorite web browser?
- Neil Wehneman
My legal education, in nifty podcast format
Google will win this battle.
1. Microsoft doesn't understand that people LOVE Google. Nobody particularly LOVES Microsoft anymore. Product activation, high prices, and security flaws are causing too many headaches.
2. Google is more innovative. What has Microsoft innovated in the past few years? Their products keep changing their look, but what about user behavior? AD changed admin behavior, but how has IE or Word gotten easier to use? Google has all kinds of creative stuff in the pipe. The Google toolbar has not only changed the way many of my users search, but it prevents a lot of popup related spyware installations as well.
3. Google is clean. If I see that damn dog show up one more time I'll kill myself. When I search my file system I don't want to hide the stupid mutt, change my options so that subfolders are searched, then click through three screens to say I want to search my file system. Google will cut through this nonsense because they believe in simple/clean interfaces.
4. The technology Microsoft seeks doesn't exist. Nobody can create a search engine based on current technology that takes plain speech user input and magically transforms it into accurate search results. Everyone I've seen that's tried this has failed to an extent. You can't just try your best to fuzzy match and pass it off as good results.
"Never tell me the odds"
Since Google's toolbar and deskbar don't work in Linux, this software probably won't either. What do you use for searching the contents of your files in your filesystem in Linux?
Google, well aware of this threat, hired a Microsoft product manager last year to oversee the Puffin project as part of its strategy to compete with Microsoft's incursion into its territory.
That's the first time that I've ever read of it going in a direction away from Microsoft. Usually, it's the other way around, Redmond sucking up the managers and staff if they can't buy or steal the technology.
This sounds like a great place for Jakarta Lucene.
Lucene is Java and Open Source, so an app written to search a workstation should be able to run on any OS with a Java VM, and you can be sure it's not reporting any personal information to anyone.
I'd love to see it on my task bar. And, heck, it could probably be ready before Puffin
Just RFID tag everything from now on, and have well-placed readers in your house.
Pay for software?
Obviously a new guy.
I was under the impression that recent versions of Windows had fairly good fine grained access controls. Sure, windows 98 doesn't offer a whole lot in terms of security, but 2000 and XP aren't so bad. So I guess I'd have to disagree... Windows does have such a (working) thing. Why do you say it doesn't?
...10,000 Linux systems connected to your local system and it will all run snappy ;)
E.
Never rub another man's rhubarb - The Joker
I hope not... That could get embarrassing!
OTOH, I might finally get word about those wild lesbian orgies in my area that I've heretofore only found out about after the fact.
Who did what now?
After the Google appliance, this seems like an expected move. The desktop is certainly key from a marketing sense.
However I don't see a lot of overlap with web search. The major pieces won't work the same:
Crawling: People want fresh information, eg that marketing report that just went out five minutes ago. Many web sites are happy to be crawled once a month. Keeping up with user edits on a filesystem is going to be a lot harder, and users will probably not be happy with heavy reindexing cycles. The ultimate would be heavily integrated with the filesystem, keeping an eye on all file activity, and refreshing the index appropriately. I believe Longhorn's delays are related to this problem.
Indexing: Desktops have a lot of file types, and strange crypts like the Outlook. Certainly Google has some support in this area, but more may be needed. There are also other document units like email messages instead of files, or even database records.
Fetching: Granted, a simple search toolbar will work, but I've been more impressed with, for example, Apple's Sherlock protocol, which allows multiple search "channels", eg Web, News, Stocks, etc., some from third party providers. IIRC this is what Firefox uses.
Ranking: Pagerank is definitely not going to work, although that may not be such a handicap when hit counts are in the one or two-digit range. Still, it's not a competitive advantage.
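To make the crawling point concrete, here is a rough Python sketch of the cheap alternative to filesystem integration: take periodic snapshots of modification times and reindex only what changed. This is just one possible approach, not what Google or Longhorn actually does, and scan and diff_snapshots are hypothetical names. It still has to walk the whole tree each time, which is exactly the cost a filesystem-level hook would avoid.

```python
import os


def scan(root):
    """Return {path: mtime in nanoseconds} for every file under root."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                snapshot[path] = os.stat(path).st_mtime_ns
            except OSError:
                continue  # file vanished between listing and stat
    return snapshot


def diff_snapshots(old, new):
    """Compare two snapshots.

    Returns (changed, removed): paths that are new or have a different
    mtime and so need (re)indexing, and paths to drop from the index.
    """
    changed = {path for path, mtime in new.items() if old.get(path) != mtime}
    removed = set(old) - set(new)
    return changed, removed
```

An indexer would keep the last snapshot around, call scan again on a timer, and feed only the changed set back through its indexing step.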
---- "If we have to go on with these damned quantum jumps, then I'm sorry that I ever got involved" - Erwin Schrodinger
Would people be willing to live with ads sprinkled throughout their search items ?
I loved this product, and I'm pleased to see that Google's going to try a similar product. With 200+GB hard drives commonplace, this can be very useful.
Best Buy can have you arrested
Will Google scan my text files and display relevant text ads? Gasp!
I want Google search on my .pst files from Outlook. Searching for a keyword through 2+ years of email takes FOREVER with the built-in search feature in Outlook. We're talking 5 or 10 minutes here.
:-P
And if I had a nickel for every time I had to resend something to a co-worker because they were too goddamned lazy to just search their email for the message I sent them THE FIRST TIME, well, Google wouldn't need an IPO because I'd just buy them outright!
That being said, filesystem searches with Google would be damn nice too.
Mechanik
microsoft's index server (a service on most installations of win2000/ winxp) does what this google product purports to do, but has a limited and clunky sdk, and i've found it to crap out and delay indexing new pages too much if i try to throttle its resource use
i had a client who chose an implementation of index server i set up to do searches on his public website, but i have doubts about my solution's resource use
i replaced a guy who wanted to make a complicated mysql/ spidering solution, simply because my solution, apart from the aesthetics of the search page, was largely quick and easy, and it was fairly trivial to demo to the client a rudimentary solution for him using microsoft's index server while the other guy was still in the starting gate
what would be interesting is if google builds an sdk into their local file system search that is more robust than microsoft's index service, and if maybe it can somehow "talk" to google on the web, really leveraging their intarweb leadership position to enhance any possible iis-linked implementation of this new product
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
It's sweet. Some features include...
Find the highest quality and most relevant documents; Google factors in more than 100 variables for each query.
Search for secure information and view only those documents to which you have access; results are returned securely for documents protected by either NTLM or basic HTTP authentication.
Judge relevance of results more easily via dynamically generated snippets showing your query in the context of the page.
Navigate search results easily and clearly using intelligent grouping of documents residing in the same narrow subdirectories.
Avoid missing results through typos or misspellings as Google automatically suggests corrections with startling accuracy, even on company-specific words and phrases.
View search results even when the sites are down via cached copies of pages included in the search results.
Quickly find the most relevant section of a document via highlighted query terms displayed on cached documents.
Glimpse documents without needing the original client application of the file format via automatic reformatting of over 220 file types into HTML.
Access time-sensitive information first via date sorting.
Perform complex and sophisticated queries with over 10 special query terms, including Boolean AND, OR, and NOT searches.
More details are available at the appliance page on Google.
#2 above probably won't show up in the personal desktop version of the search, though it really is handy for the appliance -- even if you manage a modest sized office.
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
1. Netscape conquers the browser market...
2. Netscape IPOs and climbs to some insanely high value...
3. Microsoft integrates browser into OS...
4. Netscape crumbles...
- - - - fast forward - - - -
1. Google conquers the search market...
2. Google IPOs and climbs to some insanely high value... (coming soon)
3. Microsoft integrates search into OS... [Longhorn] (coming eventually)
Where do you think the rest of this goes?
slocate is a great little program to speed up the process of finding files on your *nix computer system, but it's not a full-text indexer. Finding the names of files like slocate does is not the same as finding words that appear within those files. It is a great replacement for "find / | grep $PATTERN" though.
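To illustrate the difference, here is a toy Python sketch. It is not how slocate or any real indexer is implemented; the function names and the in-memory dict of "documents" are assumptions made for the example. A full-text indexer builds an inverted index from words to the files containing them, while a locate-style lookup only ever sees the names.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")


def build_index(documents):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in WORD.findall(text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return names of documents containing every word in the query (AND)."""
    words = WORD.findall(query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def name_search(documents, pattern):
    """Filename-only substring match, roughly what a locate-style tool offers."""
    return {name for name in documents if pattern in name}
```

With a file named keys.txt whose text mentions the couch, name_search finds it for "keys" but not for "couch", while search on the index finds it for "couch" because the word appears in the content.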
Locate32 is a program that can replace your built in Windows FIND function, including indexed searches.
It may not be my favorite e-mail client, but puffin is definitely my favorite past-time.
(Weed you fools)