Google Experiments With Local Filesystem Search
Teoti writes "No, Puffin is not the next name of your favorite email client, but, according to the New York Times (NSA reg. req.), the project codename for a new Google search application coming directly to your desktop that will let you search your local filesystem efficiently. This is different from, but complementary to, the Google DeskBar that already lets you search the Web. The article also gives a few words on the end of the stand-alone browser in Longhorn."
FS searching has absolutely sucked until this. Find By Content from Apple was a step forward, but it never worked too well. Here's hoping this search will make it into OS X!
Quid festinatio swallonis est aetherfuga inonusti?
Africus aut Europaeus?
Wonder whether they'll start serving me ads based on my hard drive contents...
Is it me, or has Google decided to go off in many different directions recently? I know they have been growing very strongly, but are they going to reach a point where they stretch their resources too thin?
30% Troll, 50% Underrated, 10% Interesting
Score:5, Troll
I'll bet it still can't compete with slocate and find.
I'm selling my K5 acct.
They aren't competing with Microsoft today. They are competing with Microsoft 2 years from now, when Longhorn is, potentially, supposed to be released. As the article states, Microsoft is looking towards more of a natural-language approach (i.e., "Where are my car pictures?") rather than simple search terms. It could be a pretty good battle between them, but I think Google might have a bit of an edge.
Hmmm.
The company that puts a cookie on your computer that doesn't expire until 2038, has the ability to see lots of personal information about you, and is interested in storing and indexing all of your email correspondence until the end of time, now wants to index my hard drive for me?
Call me paranoid, and mod me down because I'm sharing a negative opinion of Google, but I don't think I'm going to be giving this same company the ability to sift through my entire hard drive.
Yes. Google will help you ogle at your pr0n.
Then why would this system be useful at all? I mean, after all, Windows users could just use the file-hunting animated dog thing...
The Google folks are smart. Surely they've developed something that is more capable than merely find and grep, or file-hunting-dog, or Sherlock...
Honey, I shrunk the Cygwin
Wouldn't the speed of the search be influenced mostly by the capabilities of your own computer?
I haven't seen the code for either the client or the windows find utility, however I would expect that not too much can be done about your problems in there.
That is to say, Google's utility won't cut your search time to 20 minutes just because they have better code.
Then again, you never know with Microsoft...maybe the code is just that bad.
I doubt it though.
Google is a smart company. They're not going to go out of their way and spend resources on an OS that captures a whopping 1-5% of the desktop market. They're growing, profitable, and they make great products. Thus, they wouldn't make such a stupid business move. My guess is definitely: Windows only.
Why grep not working for ya?
Grep and find don't pre-index the files. So searching my machine takes me longer than searching the entire web. Google has indexing and caching down to a science. I can't wait for this to be on the market.
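To make the pre-indexing point concrete, here's a minimal sketch of an inverted index. This is only an illustration of the general idea, not anything Google has described for Puffin; the names `build_index` and `search` and the sample file names are made up. The cost of reading every file is paid once when the index is built, and each query afterwards is a few dictionary lookups instead of a full scan like grep does.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document names containing it.

    `docs` is a dict of {name: text}; in a real tool the text would be
    read from disk once, at indexing time.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data standing in for files on disk.
docs = {
    "notes.txt": "Car pictures from the road trip",
    "todo.txt": "Wash the car, buy milk",
    "recipe.txt": "Milk, flour, eggs",
}
index = build_index(docs)
print(sorted(search(index, "car")))
print(sorted(search(index, "car milk")))
```

The trade-off is the usual one: the index costs disk space and has to be kept up to date as files change.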
--
Lessons from Microsoft
(Warning: lack of cynicism ahead)
Seeing as they've built an empire on goodwill, a high-quality free search service, and word-of-mouth name recognition, I'm tempted to guess that their big benefit is continued goodwill and good karma from their userbase.
Yes, this is a novel concept in a business world where most companies look at customers and see numbers. Thing is, it's goodwill and a user-centric business plan that have made Google the great company it is.
It could be that the 'catch' you're looking for is that Puffin will further solidify their already strong user relationship.
Obliteracy: Words with explosions
R&D is what keeps a company from becoming stagnant, and having to try to find new ways to squeeze money out of what it has. [For those companies that sell a tangible, especially a tangible disposable product, it's not as big of a deal].
[... for short term, you focus on advertising, to try to convince everyone that you have a superior product, as opposed to actually making a superior product, and waiting for people to come to you]
But to remain profitable in the long term, you diversify -- so you're not as likely to take a massive downfall from a single competing company. And you try to find new products and solutions, to improve what offerings you have (that whole concept of innovation).
Google's got their IPO coming, so they'll have a nice little bit of cash to work with to improve their chances of continuing their current rate of growth. [however, they're looking at long term growth, not short term]
Any company with a big R&D section would have some form of review process for projects -- if things change, you might shelve a project, and reassign people, because you're not sure if it's going to be as profitable as you originally thought. Depending on the field, you might have some board meeting every 3-12 months to review the current projects, and reassign resources, to make sure you don't stretch your resources too thin, and to identify which projects could benefit from extra funding.
Build it, and they will come^Hplain.
And the first thing most people turn off in existing Windows installs is the indexing service.
They might be windows only, but there is a chance they'll decide to please the rest of us, too.
Allowing an object to be both a file and a directory is one of the features necessary to compose the functionality present in streams and attributes using files and directories.
To implement a regular unix file with all of its metadata, we use a file plugin for the body of the file, a directory plugin for finding file plugins for each of the metadata, and particular file plugins for each of the metadata. We use a unix_file file plugin to access the body of the file, and a unix_file_dir directory plugin to resolve the names of its metadata to particular file plugins for particular metadata. These particular file plugins for unix file metadata (owner, permissions, etc.) are implemented to allow the metadata normally used by unix files to be quite compactly stored.
f.ex
Searching metadata should be very fast
You are thinking from the point of view of trying to find just one single file. Searching is useful for dynamically pulling together all the files you have that are related to a specific subject. It's good to have them organized on your filesystem, but you can only organize a filesystem one way.
Per the article's comments about Longhorn and the "end of the browser", and how MS is planning to integrate network access with local services and applications to the point where a browser won't be necessary:
Did I miss something? I thought Microsoft integrated the net with the local PC back in 1997 when they released IE4 and Windows 98 with desktop integration. Hrmmph... Go figure.
Ok, I'm being facetious.
Still, I'm not so certain this is a feature I want. In fact, until someone can demonstrate an example of why it would be useful, I'm certain I don't. I like having the local PC as a distinct domain separate from the net! I like that I have to open a program to access information that isn't stored locally! What am I missing about this -- is their focus group testing indicating that using a browser is just too confusing?
You know what's confusing? Windows HELP -- and not just how you use it, but THAT IT EVEN EXISTS AT ALL! My lusers come up to me all the time with questions that could easily be answered with good ole' F1.
What bothers me is that all of the work going on at Microsoft is pointed at new ways to annoy me. You want to make me a happy user? Get your lousy freaking vendor partners to stop auto-running useless programs in my system tray; cancel ActiveX (*without* adding the TDMA crap I don't want) and get rid of the Windows registry. My main concern whenever I hear about these new thingamabobbers they're cooking up is "Eeek! How hard is it going to be to turn *that* off? I sure hope R&D cancels it before Longhorn gets out of beta." I honestly think it's time they consider forking the project, or XP is my last version of Windows. Period.
There's just no joy in Windows anymore, you know what I mean?
Sincerely,
Eagerly awaiting Debian Sarge going stable in Ohio.
"Lawyers are for sucks."
- Doug McKenzie
From the article:
Microsoft believes that Longhorn users will no longer think about where information is stored; they will instead see a unified view of documents stored on both the Internet and on the desktop.
I don't like this idea. At all.
The main problem from my point of view has to do with ownership and control. Generally speaking, what's physically on my machine(s) is *mine*, that is subject to my total control (we'll leave aside intellectual property issues). I can add, change, delete, etc.
Still generally speaking, what's on some machine I access over the net is *not mine* in the sense that my control is reduced. Usually other people can do something with that information (again, add, change, delete) and if the machine is taken offline, I have no access and no control at all.
As a simple example, consider a web page. In one case I make a local copy of it on my machine. In the other case I just have a bookmark. The difference in control is fairly obvious...
Now, what happens if we make users believe there's no difference between their local hard drive and Internet? That we drill into their heads that they are the same?
Well, you still have no control over information stored on the 'net. Thus, if you were trained to think that the local drive and the 'net are basically the same, then you would expect to have no control over information stored on your hard drive.
Note that by an amazing coincidence, that's also the goal of DRM -- that you have no control over information (that they call content) stored on your hard drive.
Also note that the flip side of the coin -- making your hard drive irrelevant by switching to a subscription service for everything, from OS to applications to content, is also a highly popular idea in Redmond and elsewhere.
So color me highly suspicious with regard to that idea...
Kaa
Kaa's Law: In any sufficiently large group of people most are idiots.
From the article: "The project was started, in part, to prepare Google for competing with Windows Longhorn, which according to industry analysts will dispense with the need for a stand-alone browser."
Yeah, because IE is such a compelling product today that I have little need for an alternative.
Ryosen
One man's "Troll, +1" is another man's "Insightful, +1".
"Thus, they wouldn't make such a stupid business move. My guess is definitely: Windows only."
Exactly which part of a filesystem search differs enough between windows and linux that you can't use the same program for both?
Ask anyone who writes file or text utilities for both systems -- if you're unlucky, you might need to change three lines to convert a windows util to a linux one.
If searching is such a critical problem, why does MS keep making their local file search utility less and less useful? Windows 98 had it just right for me -- maybe move the "containing text" box to the front tab, but otherwise perfect. Win2K made it worse by making the "search subdirs", "hidden" and "system files" options non-sticky and hidden. WinXP?! Too much damn clicking, waiting and NON-DOINGSTUFF! Let's just say "thank heaven for TweakUI" or someone in Redmond would have gotten a VERY unpleasant letter and a flaming pile of dog poo from me.
"Lawyers are for sucks."
- Doug McKenzie
The only reason that the Google toolbar is IE-only is that pretty much every other browser already has a built-in Google search box. They aren't making money off of it, so why bother duplicating effort? There are other reasons that this will probably be Windows-only, but this isn't one of them.
Clearly, you don't use your computer that seriously. I have thousands of files, with many GB of data, accumulated over years, at home. At work, there is a ton of stuff to manage. And guess what? I sometimes have to find something in someone else's files, or they in mine, because the owner is busy. We don't all think alike, after all.
Let's see... then there's project data collections where lots of people are putting things. Employees leave. Some folks just aren't organized. Some people get sent lots of stuff they have to save but not read right then, but which eventually becomes important.
There are lots of reasons that make this a good idea. Yeah, I have homegrown solutions on Linux, but a good, fast tool on any platform is a good idea. We all use Linux at home, but there's no way my wife is going to use grep, find, etc. She hates computers. If she can click on a button, type a word or phrase and get a list, just like any web-based search engine, she'll use that. And I know quite a few folks like that - on every platform with more than a few thousand users.
This is one of the silliest notions I've ever heard. If they make no distinction between local files (in user's control) and files "on the internet" (beyond user's control), what kind of crap are we going to have to put up with when people start saying "hey, where's that document I was looking at yesterday?" because they never knew it was on someone else's hard drive and got erased.
If a job's not worth doing, it's not worth doing right.
Google will win this battle.
1. Microsoft doesn't understand that people LOVE Google. Nobody particularly LOVES Microsoft anymore. Product activation, high prices, and security flaws are causing too many headaches.
2. Google is more innovative. What has Microsoft innovated in the past few years? Their products keep changing their look, but what about user behavior? AD changed admin behavior, but how has IE or Word gotten easier to use? Google has all kinds of creative stuff in the pipe. The Google toolbar has not only changed the way many of my users search, but it prevents a lot of popup related spyware installations as well.
3. Google is clean. If I see that damn dog show up one more time I'll kill myself. When I search my file system I don't want to hide the stupid mutt, change my options so that subfolders are searched, then click through three screens to say I want to search my file system. Google will cut through this nonsense because they believe in simple/clean interfaces.
4. The technology Microsoft seeks doesn't exist. Nobody can create a search engine based on current technology that takes plain speech user input and magically transforms it into accurate search results. Everyone I've seen that's tried this has failed to an extent. You can't just try your best to fuzzy match and pass it off as good results.
"Never tell me the odds"
Google has a vested interest in trying to help diminish Microsoft's desktop market share. Doing so increases the market value of Google's products relative to Microsoft's products.
To help drive a wedge between Microsoft and their current desktop customers, Google will almost certainly port this kind of tool to other OSes. They would then get into various "enterprise" partnerships with IT solution providers to push pre-canned non-Windows desktops into corporate accounts. This product in particular would help to sell alternative desktops against Longhorn's alleged new filesystem features.
If this strategy were successful, Google would stand to pick up a good bit of revenue and mindshare at Microsoft's expense. My guess is definitely: Cross platform.
The windows indexing service always leaves me feeling like I somehow missed a critical page in the documentation which would make it work just the way I expect it to.
I can tell it's got a lot of power, and being a part of the OS, it's seamless.. but I just can't seem to make it useful to me.
Google would have a winner on its hands if it would let me organize (and ensure I have a backup of) all the documents on the five computers in my house. I've got probably 6 GB of family pictures, but no good way to organize them by where they were taken, who is in them, etc. I was in a full-blown panic when I accidentally wiped the only copy of that directory, and had to restore it from a DVD backup, copies given to relatives, sent mail, and so on. That's worth money to me, but it really needs to be transparent.
I don't think that's a good comparison. It's a lot easier to write a cross-platform website than it is to write cross-platform applications. Sure, some of the underlying code can be reused. But a lot of the code (particularly for interacting with the file system and the GUI bits) will be platform-specific.
Google knows from prior experience that when it comes to new tech, geek acceptance is the first step to general acceptance. They're not going to alienate the knowledgeable early adopters (i.e., pretty much the /. crowd, with the exception of certain Microsoft-shill porn site operators) by making YAWOP (Yet Another Windows-Only Product).
The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
Would people be willing to live with ads sprinkled throughout their search results?
You got it backwards. The toolbar came first, and it was for Windows/IE only. Mozilla, Opera, etc. duplicated the features because they gave up waiting for Google to provide one for other OSes or browsers.
What floors me is that even in the face of never-ending attacks on their products from the Legion of Blackhats, Microsoft still wants to believe that the internet is a big, happy neighborhood where everybody just gets along. One might think that issuing security patches weekly would have disabused them of this notion. Apparently, one would be wrong.
Mail? Put "slashdot" in the subject to pass the spam filters.
Of course it is, it's just that most people can't do it. I'm sure you've seen "disaster areas" in the real world, where the owner can still pull out anything in specific you ask for. Mental map, baby.
Once you move past that limit, as many do (it's mostly just a matter of magnitude) I find it incredibly limiting to have a hierarchical structure. It's no problem if I know the search key (e.g. name if I do an alphabetical sort).
In particular, I find it difficult to sort well when it comes to "dual-purpose" documents. E.g., I'd like to have folders "Project X", "Project Y", etc., but I'd also like to have a "Promotional material" folder, which is a collection of documents from many projects.
Which sounds trivial enough to split. But then you have a logo made for both. Or something made for a commercial, can we use that in the project / documentation itself? Suddenly you wish they could be in both places at once (which they in theory can with symlinks and shortcuts, but still). It'd be a lot easier to have one file with metadata.
I'm really looking forward to it because it allows me to build an index structure similar to my mental one, rather than force it into something else. After all, my computer is there to help me, not for me to conform to its limitations. Not because I so desperately need it to hold my hand, but because I want to sort it my way (which is not 2D).
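A rough sketch of what I mean by "one file with metadata", assuming a simple tag store kept beside the files. Everything here (the `TagIndex` class, the tag names, the file names) is hypothetical and only meant to show the idea: the same file shows up under every label you give it, with no symlinks or copies.

```python
from collections import defaultdict


class TagIndex:
    """Toy metadata store: each file path can carry any number of tags."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def tag(self, path, *tags):
        """Attach one or more tags to a file path."""
        for t in tags:
            self._by_tag[t.lower()].add(path)

    def find(self, *tags):
        """Return the paths that carry all of the given tags."""
        sets = [self._by_tag.get(t.lower(), set()) for t in tags]
        return set.intersection(*sets) if sets else set()


idx = TagIndex()
# One logo, used by two projects and by the promotional collection.
idx.tag("logo.png", "Project X", "Project Y", "Promotional material")
idx.tag("spec.doc", "Project X")
idx.tag("flyer.pdf", "Project Y", "Promotional material")

print(sorted(idx.find("Project X")))
print(sorted(idx.find("Project Y", "Promotional material")))
```

A hierarchy forces me to pick one of those tags as "the" folder; with the tag lookup, each of them is an equally valid way in.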
Kjella
Live today, because you never know what tomorrow brings
Right... like their toolbar and their deskbar? And Google Compute?
Has Google distributed something that you can install on your Linux or Mac OS computer? Ever?
There are no trails. There are no trees out here.
A search for Operating system produces 11 *nix hits before getting around to Windows. Interesting.
have a once-a-day cron job that runs updatedb, but then you'll just get annoyed at the way it causes your disks to churn for several minutes.
Any tool from Google or Microsoft or anyone else would need some functional equivalent to updatedb to run at regular intervals. The index has to be made some way or another. Maybe an updatedb type of process that runs whenever there are idle cpu cycles?
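For what it's worth, here's a minimal sketch of how such a refresh could avoid redoing all the work on every run. This is not how updatedb actually works internally; `scan` and `diff` are names I made up. The idea is just to compare modification time and size against the previous snapshot, so that only new or changed files need their contents re-read. The directory walk itself still touches the disk, so this cuts the work per run rather than eliminating the churn.

```python
import os


def scan(root):
    """Return {path: (mtime, size)} for every regular file under root."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            snapshot[path] = (st.st_mtime, st.st_size)
    return snapshot


def diff(old, new):
    """Compare two snapshots; return sorted (added, removed, changed) paths."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, changed
```

A scheduler (cron, or an idle-time hook) would call `scan`, `diff` the result against the snapshot saved last time, and re-index only the `added` and `changed` paths.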
You make the mistake of thinking you can educate the fundamental stupidity out of people. You can't.
I'm thoroughly tired of logging onto a network file share, and then having to fumble around with the hierarchical store on many computers to find something - especially given the 5-to-10-second delay in changing folder views, even on a fast network, via WinXP. It's maddening.
Instead, network computers should maintain a low-bandwidth stream of file contents of local computers. When you connect to a network, your computer should auto-spider to locate resources - at very low bandwidth, maybe 5kps. We're really just pulling filenames/sizes/dates, after all. And if you select a particular computer, your spider should immediately map all of its network resources. You can then use a standard search window to find "jethro tull" or whatever.
If every computer offers a low-bandwidth stream like this, even a large network would barely feel the overhead. And it would make finding resources terrifically painless... especially for wireless connections.
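Here's a small sketch of the kind of listing I have in mind, under my own assumptions: one tab-separated line per file (relative path, size, modification time), which the remote side can generate lazily and the local side can search without ever touching file contents. The line format and the function names are invented for illustration; no existing protocol is implied, and file names containing newlines aren't handled.

```python
import os


def listing(root):
    """Yield one 'relative_path<TAB>size<TAB>mtime' line per file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files that disappear mid-walk
            rel = os.path.relpath(path, root)
            yield f"{rel}\t{st.st_size}\t{int(st.st_mtime)}"


def search_listing(lines, term):
    """Return (path, size, mtime) for listing lines whose path contains term."""
    term = term.lower()
    hits = []
    for line in lines:
        # Split from the right so a tab inside the path does not break parsing.
        rel, size, mtime = line.rsplit("\t", 2)
        if term in rel.lower():
            hits.append((rel, int(size), int(mtime)))
    return hits
```

Since `listing` is a generator, the sender can trickle lines out as slowly as it likes, which is where the bandwidth cap would go.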
C'mon, Microsoft - build useful things like this into Longhorn, not that WinFS relational-database bullshit.
- David Stein
Computer over. Virus = very yes.