Best Way to Build a Searchable Document Index?
Blinocac writes "I am organizing the IT documentation for the agency I work for, and we would like to make a searchable document index that would render results based on meta tags placed in the documents, which include everything from Word files, HTML, Excel, and Access to PDFs." What methods or tools have others seen that work? Anything to avoid?
We use Google apps at the place I work and it's great. Gmail, search, maps, etc.
:P
Of course, I work AT Google...
I agree with the above. This is what communities are for. I've been a programmer for 8 years, and forums, newsgroups, and the web have helped me and every other programmer I've ever met a lot over the years. For the guy with the "double major" stick in his butt: If you weren't such a pompous ass, someone might have hired you already, or better yet, you might still be working at your old job with those "excellent references" of yours. If you had read the question, you'd have understood that the guy works in an agency. This means he needs to be a jack of all trades. Agencies ask the impossible from programmers on ridiculously short deadlines. You have to get the job done now. You don't have the luxury of taking months for planning, years for coding and months for debugging. You can't know everything about everything. You need to know how to search, and if you can't find what you are looking for, you have to know how to ask questions. Like the guy before me said, he asked a legit question about a legit problem. If you don't know the answer, don't waste everyone's time. There is always someone who knows better than you. If you understood that, you might still be working. I pray to God that the original poster ends up being your boss one day so he can fire your ass.
F-ing Tool....
I almost forgot about the question (lol sorry, idiots piss me off): Lucene is the way to go. There's a small learning curve, but it's robust and stable. After you've used it once, you'll find it easy to reuse on other projects that come along.
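For anyone wondering what a meta-tag index boils down to, here is a rough sketch in plain Python. To be clear, this is not Lucene's API, just the inverted-index idea that tools like it are built on. It assumes the tags have already been pulled out of the Word/PDF/HTML files by some other step, and the file names, field names, and the build_index/search helpers are all made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased (field, value) pair to the set of document paths carrying it."""
    index = defaultdict(set)
    for path, meta in docs.items():
        for field, values in meta.items():
            for value in values:
                index[(field.lower(), value.lower())].add(path)
    return index


def search(index, **criteria):
    """Return the sorted paths of documents that match every field=value criterion."""
    result = None
    for field, value in criteria.items():
        hits = index.get((field.lower(), value.lower()), set())
        # Intersect so that all criteria must hold (AND semantics).
        result = set(hits) if result is None else result & hits
    return sorted(result or [])


# Hypothetical metadata, as if already extracted from each file's meta tags.
docs = {
    "vpn-setup.doc": {"topic": ["vpn", "network"], "owner": ["helpdesk"]},
    "backup-policy.pdf": {"topic": ["backup"], "owner": ["ops"]},
    "firewall-rules.xls": {"topic": ["network", "firewall"], "owner": ["ops"]},
}

index = build_index(docs)
```

Calling search(index, topic="network", owner="ops") returns only the documents tagged with both values. A real setup would probably want a proper engine for ranking, stemming, and full-text search on top of the tags, but the tag lookup itself is about this simple.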