Google Experiments With Local Filesystem Search
Teoti writes "No, Puffin is not the next name of your favorite email client, but, according to the New York Times (NSA reg. req.), the project codename for a new Google search application coming directly to your desktop that will let you search your local filesystem efficiently. This is different from, but complementary to, the Google DeskBar that already lets you search the Web. The article also gives a few words on the end of the standalone browser in Longhorn."
Click here for CNET version.
Hmmm.
X1 seems to be the most popular one out there.
DiskMeta: they had this project in beta for a while, and the Windows product went into release just last week, the site says.
DT Search: I remember their ads in a bunch of computer magazines, although I have never used them myself.
EFS, found it on download.com, supports MS Office and PDF as well as other formats.
No-Reg Link
Your hair look like poop, Bob! - Wanker.
If you have followed Microsoft developments around Longhorn you might have noticed that search is one of the top priority features that Microsoft is going to integrate directly into the operating system. So once Longhorn is released, Microsoft would become the biggest competitor to Google's search applications on the web as well as on the desktop (with this application).
Search is the next big thing on which a lot of players are concentrating and Microsoft entering the field has skewed the competition towards the desktop and everyone including Google is preparing for the battle.
HERE
It works a lot better when you enable indexing.
Or so I'm told. My personal experience with allowing the Windows Indexing service to run in the background has been that it's more trouble than it's worth. Yes, on the rare occasion that it's actually -not- indexing when I search, the search is blazingly fast (compared to a non-indexed search).
But if the index is currently being modified, then the Windows search feature can't use it. Period. So when you search, you get the text "Windows is currently building an index of the files on drive C:" and it falls back to the regular, non-indexed search. In addition, the indexer consumes massive amounts of RAM while indexing, so a search run when the index is being modified ends up being about two times slower than usual.
It also doesn't seem to be able to tell when the user is idle. No amount of tweaking seems to fix this, without leaving you with a days-old index. If the index is complete, but you've saved a file since it was completed, that file will not show up in the search at all. I've had it kick on while in the middle of working on something else so often that I finally just turned it off entirely and have resigned myself to slow(er) searches in Windows.
In the interest of fairness I will say that the search seems to work quite well when searching a remote server that is running the indexing service. But running it locally is just a pain.
End of lesson. You may press the button.
Well, first this idea is part of Microsoft's WinFS plans. The idea with WinFS was partially born when Microsoft developers realized that major parts of the web can be searched faster than a user's hard drive. It will be interesting to see how this application will collide with Microsoft's plans, that's for sure. It's basically fast searches and enhanced metadata support that are the key parts of WinFS, which is in turn a key part of Longhorn.
Second, indexing software that does the same thing is already available today and worked very well when I tried it out. It's actually almost perfect, except for the fact that it causes occasional hard drive thrashing as it tries to keep the index up-to-date. This is unfortunately a rather major downside, but if you can bear with this, you'll get literally instant file searches on your entire hard drive -- it narrows down the possible matches as you type each letter. It even indexes file contents for small files. I'm talking about X1.
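The "narrows down the possible matches as you type each letter" behavior can be sketched roughly as below. This is only an illustration, not how X1 actually works: it assumes prefix matching on lowercased file names kept in a sorted list, and the names `FilenameIndex` and `narrow_as_typed` are made up for the example.

```python
from bisect import bisect_left


class FilenameIndex:
    """Hypothetical index: a sorted list of lowercased file names."""

    def __init__(self, names):
        self._names = sorted(n.lower() for n in names)

    def with_prefix(self, prefix):
        """Return all indexed names starting with prefix (case-insensitive)."""
        prefix = prefix.lower()
        # Binary search for the start and end of the run of matching names.
        lo = bisect_left(self._names, prefix)
        hi = bisect_left(self._names, prefix + "\uffff")
        return self._names[lo:hi]


def narrow_as_typed(index, typed):
    """Yield (prefix, matches) after each simulated keystroke."""
    for i in range(1, len(typed) + 1):
        prefix = typed[:i]
        yield prefix, index.with_prefix(prefix)


if __name__ == "__main__":
    idx = FilenameIndex(["report.doc", "readme.txt", "Resume.pdf", "notes.txt"])
    for prefix, matches in narrow_as_typed(idx, "rep"):
        print(prefix, matches)
```

Because the list is sorted once up front, each keystroke costs two binary searches rather than a walk over the whole drive, which is where the "instant" feel would come from in a design like this.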
Beware: In C++, your friends can see your privates!
all those utilities take a long time when searching on a 200G partition. I'd love to have something blazingly fast. Is that too much to ask for?
Locate takes a while to build its database, but after that locate is very quick.
"Not my manner of thinking but the manner of thinking of others has been the source of my unhappiness." - M
This is one of the things that makes Google great: they allow (expect?) their employees to spend 20% of their time working on projects that are unrelated to their main job. Basically, this 20% just needs to be focused on stuff that can benefit Google.
See this article
As a developer trapped in Windows I find this little tool incredibly useful.
"I can not bring myself to believe that if knowledge presents danger, the solution is ignorance" - Isaac Asimov
"My Documents"...
(Not really mine)
Grep and find don't pre-index the files.
:-/
"locate" does, but the index is never up to date.
Javascript + Nintendo DSi = DSiCade
find and grep are orders of magnitude slower than the inverted text index techniques that Google uses.
See Lucene for a good open source inverted text index search engine.
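For anyone who hasn't seen one, here is a minimal sketch of an inverted text index. It is not Lucene's API or Google's implementation, just the basic idea under simple assumptions: documents are plain strings, terms are lowercased alphanumeric runs, and a multi-word query means AND. The function names are invented for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercased alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document names containing it.

    docs is a dict of document name -> document text.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for term in tokenize(text):
            index[term].add(name)
    return index


def search(index, query):
    """Return names of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    docs = {
        "a.txt": "Google desktop search",
        "b.txt": "grep and find search slowly",
        "c.txt": "Lucene index",
    }
    idx = build_index(docs)
    print(sorted(search(idx, "search")))
    print(sorted(search(idx, "google search")))
```

The point of the comparison with grep: grep reads every file on every query, while the index is built once and each query is just a few dictionary lookups and set intersections.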
Have you tried ZoneAlarm? It has this basic functionality.
"The natural progress of things is for liberty to yield and government to gain ground." - Thomas Jefferson
If you are worried about your privacy, don't accept these cookies, or regularly clean out your cookies. Maybe Google is being invasive but that doesn't keep you from looking out for yourself.
Yawn.
I have hundreds of word documents, PDF files, text files, e-mails in two different systems, etc.
I purchased Find from Enfish (http://www.enfish.com) and it saves me several minutes every day. They have fancier products, but $50 for the Find application is all that I needed.
This sounds like a great place for Jakarta Lucene.
Lucene is Java and Open Source, so an app written to search a workstation should be able to run on any OS with a Java VM, and you can be sure it's not reporting any personal information to anyone.
I'd love to see it on my task bar. And, heck, it could probably be ready before Puffin.
I was under the impression that recent versions of Windows had fairly good fine-grained access controls. Sure, Windows 98 doesn't offer a whole lot in terms of security, but 2000 and XP aren't so bad. So I guess I'd have to disagree... Windows does have such a (working) thing. Why do you say it doesn't?
BTW, I'm not sure why you'd want to collect that much porn. I know for a fact that a lot of it he's never seen before, and what I've seen of it suffers from porn's usual problem, a lot of repetitiveness.
Not to mention that if he ever gets raided I am *sure* there has to be at least a few child pr0n photos in there (even accidentally).
I decided long ago that keeping around lots of pr0n is just a bad idea. Binge and purge! That's my new motto!
> It also seems to work with Safari (minus the keyboard shortcuts)
This is a popular misunderstanding. Keyboard shortcuts work with Safari. Try the ctrl key instead of alt.
Ultimately, yes, but there's searching and then there's searching. For example, searching a hashed index is much faster than just searching through files in a filesystem. You could generate an index of data and metadata for all files on the system and incrementally update it during idle times, for example, or do certain kinds of updates on an as-needed basis.
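A rough sketch of the "incrementally update it during idle times" part, assuming a file counts as changed when its modification time or size differs from the last scan. The helper names are hypothetical, and a real indexer would more likely hook filesystem change notifications than rescan the tree.

```python
import os


def scan(root):
    """Map each file path under root to its (mtime_ns, size)."""
    snapshot = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            snapshot[path] = (st.st_mtime_ns, st.st_size)
    return snapshot


def diff(old, new):
    """Compare two snapshots and return (added, changed, removed) paths."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and old[p] != new[p])
    return added, changed, removed


if __name__ == "__main__":
    before = scan(".")
    # ... later, during idle time ...
    after = scan(".")
    added, changed, removed = diff(before, after)
    print(len(added), len(changed), len(removed))
```

Only the paths in `added` and `changed` need to be re-read and re-indexed, and the ones in `removed` dropped, so the expensive work is proportional to what actually changed rather than to the size of the disk.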
GNOME used to have something like this, called Medusa. I think it was dropped because the existing implementation had performance problems (and possibly security issues?). However, it seems to be under redevelopment, and it looks like it will be quite useful when it gets a bit further along.
The scalloped tatters of the King in Yellow must cover
Yhtill forever. (R. W. Chambers, the King in Yellow)
I dunno, I think SWISH++ does a pretty good job ...
...
... sweet!
I've had it running now for a while, and I can't say how much better it feels to have a local, powerful search engine at my beck and call, personally
Plus, it solved the 'endless bookmark menu' problem too, since instead of bookmarking, I get the site spidered by SWISH++, and all my future searches give me what I need
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
I loved this product, and I'm pleased to see that Google's going to try a similar product. With 200+GB hard drives commonplace, this can be very useful.
Best Buy can have you arrested
I wish I could beat the creator of google-watch.org and every person who ever linked to it with a gigantic clue stick.
First of all, the creator of google-watch.org has a really big axe to grind with Google.
Second, HTTP is a stateless protocol. If you want a user's preferences to persist within a session you need to use cookies or attach a lot of state information to each GET/POST request. If you want the preferences to persist after you close and re-open your browser you have to have the user log in every time and store the prefs on the server or store the prefs on the client side in a cookie like Google does. This simple fact seems to fly right over the head of google-watch.org and their ridiculous cookie conspiracy theories.
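To make the client-side option concrete, here is a small sketch using Python's standard `http.cookies` module. The cookie name `prefs` and the `key:value|key:value` encoding are invented for illustration; this is not Google's actual cookie format.

```python
from http.cookies import SimpleCookie

ONE_YEAR = 60 * 60 * 24 * 365  # seconds


def make_set_cookie(prefs):
    """Build the value of a Set-Cookie header that stores prefs client-side."""
    cookie = SimpleCookie()
    # Hypothetical encoding: key:value pairs joined with "|".
    cookie["prefs"] = "|".join(f"{k}:{v}" for k, v in sorted(prefs.items()))
    cookie["prefs"]["path"] = "/"
    cookie["prefs"]["max-age"] = ONE_YEAR  # survive browser restarts
    return cookie["prefs"].OutputString()


def read_prefs(cookie_header):
    """Recover the prefs dict from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "prefs" not in cookie:
        return {}
    pairs = cookie["prefs"].value.split("|")
    return dict(pair.split(":", 1) for pair in pairs if ":" in pair)


if __name__ == "__main__":
    print(make_set_cookie({"lang": "en", "results": "50"}))
    print(read_prefs("prefs=lang:en|results:50"))
```

The server keeps nothing between requests here: the browser sends the cookie back each time, and the preferences are rebuilt from it.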
But hey, we've been over this in every Google story since the anti-Google FUD crowd started coming out of the woodwork. Here's a thought: if you really need a tinfoil hat then disable cookies, don't use Orkut and sleep better at night. But please stop subjecting people to google-watch.org FUD.
Do you even know anything about perl? -- AC Replying to Tom Christiansen post.
Create HKEY_CURRENT_USER\Software\Microsoft\Windows\Curr
You'll have the old windows 2000 search dialogue.
microsoft's index server (a service on most installations of win2000/ winxp) does what this google product purports to do, but has a limited and clunky sdk, and i've found it to crap out and delay indexing new pages too much if i try to throttle its resource use
i had a client who chose an implementation of index server i set up to do searches on his public website, but i have doubts about my solution's resource use
i replaced a guy who wanted to make a complicated mysql/ spidering solution, simply because my solution, apart from the aesthetics of the search page, was largely quick and easy, and it was fairly trivial to demo to the client a rudimentary solution for him using microsoft's index server while the other guy was still in the starting gate
what would be interesting is if google builds an sdk into their local file system search that is more robust than microsoft's index service, and if maybe it can somehow "talk" to google on the web, really leveraging their intarweb leadership position to enhance any possible iis-linked implementation of this new product
intellectual property law is philosophically incoherent. it is your moral duty to ignore it or sabotage it
It's sweet. Some features include...
1. Find the highest quality and most relevant documents; Google factors in more than 100 variables for each query.
2. Search for secure information and view only those documents to which you have access; results are returned securely for documents protected by either NTLM or basic HTTP authentication.
3. Judge relevance of results more easily via dynamically generated snippets showing your query in the context of the page.
4. Navigate search results easily and clearly using intelligent grouping of documents residing in the same narrow subdirectories.
5. Avoid missing results through typos or misspellings as Google automatically suggests corrections with startling accuracy, even on company-specific words and phrases.
6. View search results even when the sites are down via cached copies of pages included in the search results.
7. Quickly find the most relevant section of a document via highlighted query terms displayed on cached documents.
8. Glimpse documents without needing the original client application of the file format via automatic reformatting of over 220 file types into HTML.
9. Access time-sensitive information first via date sorting.
10. Perform complex and sophisticated queries with over 10 special query terms, including Boolean AND, OR, and NOT searches.
More details are available at the appliance page on Google.
#2 above probably won't show up in the personal desktop version of the search, though it really is handy for the appliance -- even if you manage a modest-sized office.
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.