I know this will get me flamed, but: you might try Oracle Text (also part of Oracle XE).
It supports 140 document formats, has a lot of options, and works via SQL. It can build indexes for documents stored in the database or in the file system. You can even join the search hits from the documents with the database records where your application stores its metadata. I found that very helpful in similar projects. And it's free.
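To give an idea of what "works via SQL" looks like, here is a rough sketch. The table and column names (docs, projects, content and so on) and the search terms are made up for illustration; the index type CTXSYS.CONTEXT, the CONTAINS and SCORE operators, and CTX_DDL.SYNC_INDEX are the Oracle Text pieces. It assumes the documents sit in a BLOB column and that your user is allowed to execute CTX_DDL, so treat it as a starting point rather than a finished setup.

```sql
-- Hypothetical metadata table maintained by the application.
CREATE TABLE projects (
  project_id  NUMBER PRIMARY KEY,
  name        VARCHAR2(100)
);

-- Hypothetical document table: the file itself is stored in a BLOB.
CREATE TABLE docs (
  doc_id      NUMBER PRIMARY KEY,
  project_id  NUMBER REFERENCES projects (project_id),
  filename    VARCHAR2(255),
  content     BLOB
);

-- Full-text index on the document column.
-- Oracle Text extracts the text from binary formats while indexing.
CREATE INDEX docs_content_idx ON docs (content)
  INDEXTYPE IS CTXSYS.CONTEXT;

-- By default a CONTEXT index is not refreshed on commit,
-- so synchronize it after loading or changing documents.
BEGIN
  CTX_DDL.SYNC_INDEX('docs_content_idx');
END;
/

-- Full-text condition joined with the metadata records.
-- The label 1 ties SCORE(1) to this CONTAINS call.
SELECT p.name, d.filename, SCORE(1) AS relevance
FROM   docs d
JOIN   projects p ON p.project_id = d.project_id
WHERE  CONTAINS(d.content, 'invoice AND overdue', 1) > 0
ORDER  BY SCORE(1) DESC;
```

Since the search is just another predicate in the WHERE clause, you can mix it with whatever filters and joins your application already uses.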
... the data points the author criticizes are not data points but line decorations for black and white readability.