Platform Independent, Searchable Info On CDROM?
Knuckles asks: "A friend of mine, who is an ethnologist, needs to author a CD-ROM with ethnographic source material (4,500 printed pages) on the indigenous population of Mexico. The CD-ROM should provide a platform independent way to retrieve the information with a simple interface and be fully searchable.
He is computer literate but doesn't know anything about programming.
What solution would you propose? I thought of HTML, but am lost on the question of searchability. Macromedia Director or similar?
He would prefer free software, but would use proprietary if better fit for his goals."
Sherlock and Sherlock 2, the Mac's built-in search feature from OS 8.5 onward, can index the content of many types of files, flat text being one of them, and then let you search that content on your hard drive. FYI.
www.jackasscritics.com
That's exactly right. In fact, Acrobat has indexing features that make searching docs quick and easy. The full Acrobat costs around $250 U.S., and Acrobat Reader is free to use and distribute.
Take care
JL
BeOS has Perl support; search BeBits.
Best Regards, Ben Abbitt
"AOL, CIA, NSA, whatever, they all collect information, and they are all out to screw the american public"
Er...let's try that again.
There. Sorry for any confusion.
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
If I remember correctly the Intel Developer Insight CD set has a few searchable documents; the entire CD set (a mirror of http://developer.intel.com) is in HTML and the search scripts are in Javascript. I think.
However, you can use a Perl script to generate a full listing and index of the keywords and write it out as HTML files. HTML files are accessible to most computer users. It's also good to include ASCII text files along with PDF files.
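A rough sketch of the keyword-index idea, written in Python rather than Perl purely for illustration. The function names and the sample page names are made up, and a real run would read the actual HTML files from disk instead of a small dictionary:

```python
import html
import re
from collections import defaultdict

# \w+ also matches accented letters, which matters for Spanish-language material.
WORD_RE = re.compile(r"\w+")

def build_index(pages):
    """Map each lowercase keyword to the sorted list of page names containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in WORD_RE.findall(text.lower()):
            index[word].add(name)
    return {word: sorted(names) for word, names in index.items()}

def render_index_html(index):
    """Render the keyword index as one static HTML page of links."""
    lines = ["<html><body><h1>Keyword index</h1>", "<dl>"]
    for word in sorted(index):
        links = ", ".join(
            '<a href="{0}">{0}</a>'.format(html.escape(name, quote=True))
            for name in index[word]
        )
        lines.append("<dt>{}</dt><dd>{}</dd>".format(html.escape(word), links))
    lines.append("</dl></body></html>")
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical sample pages standing in for the real source material.
    sample = {
        "maya.html": "Maya weaving traditions",
        "nahua.html": "Nahua weaving and pottery",
    }
    print(render_index_html(build_index(sample)))
```

The output is an ordinary HTML file, so it can sit on the CD next to the content and needs nothing but a browser to use.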
One drawback of using only ASCII text files is that if you have a lot of pictures or diagrams in this document and expect the users to print them, you don't really want them to open each image in GIMP or Photoshop and do the printing.
Also, you can try writing CGI scripts and putting them on the CD-ROM along with the HTML files. A CGI script can do the text searching for you.
As an aside, you may want to try Perlfect Search 3.08.
Their web page description:"Perlfect Search is a sophisticated, powerful, versatile, customizable and effective site indexing/searching suite available under an open source licence. It comes as a pair of disctinct scripts. The indexer, that automatically, scans and indexes a web site, and the search engine, a cgi script that serves search queries for keywords over the index, and displays results pages in html, in a standard format including title, description and relevance ranking for each matching document. "
Another possibility is PDF format. You can also try PostScript files. But I'm not sure whether you can get any script to search text in PDF or PS format.
To make PDF files, you can buy Adobe Acrobat, or try HTMLDOC from Easy Software Products. It can convert HTML files into PDF and PS files, and it's available for Linux, Windows, UNIX, IRIX, NT and Solaris.
Both HTMLDOC 1.8.8 and Perlfect Search 3.08 are licensed under the GPL.
Hope this helps.
============
Mathematics will always come back to hunt you down, in so many ways
The CDs called Foo in a Nutshell, Deluxe Edition (Foo = {Webmaster, Java}) used JHLsearch, available from a fellow named (I think) John Leach who lives in Italy. I can't find a URL right now.
The subsequent CDs used ASTAware's NetResults. I wasn't really happy with their engine or their grasp of Web standards. Just before I left, we were starting to look into JObjects, which I'd had good recommendations for, but I don't know what became of that.
In short though, if you provide HTML content, the users will be able to use technology of their choice to search the CD. Most users will expect you to provide the tools, but that either means platform-proprietary tools or something based in Java. And even with Java, you'll probably need to provide a VM for the most common platforms, just in case.
...and the associated costs.
Please refer to FAQ
M$: "We're #2!"
I know it sounds ugly, but I believe that you can use Javascript to set up some sort of searching in pages.
Maybe even having a flatfile with keywords to index the search...
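One way the flat keyword file could be prepared, sketched in Python as an assumption about how it might work. The names `keyword_table`, `to_script` and `KEYWORDS` are invented here. The idea is to write the table out as a small script file that a page can load with a script tag, so the in-page lookup has no need to read files itself:

```python
import json
import re

def keyword_table(pages, min_length=3):
    """Map each keyword of at least min_length characters to the pages using it."""
    table = {}
    # Visiting pages in sorted order keeps every page list sorted as well.
    for name, text in sorted(pages.items()):
        for word in set(re.findall(r"\w+", text.lower())):
            if len(word) >= min_length:
                table.setdefault(word, []).append(name)
    return table

def to_script(table, variable="KEYWORDS"):
    """Serialize the table as a one-line script assigning it to a variable."""
    return "var {} = {};\n".format(variable, json.dumps(table, sort_keys=True))

if __name__ == "__main__":
    # Hypothetical pages; the real ones would be read from the CD image.
    pages = {"a.html": "corn", "b.html": "corn and beans"}
    print(to_script(keyword_table(pages)), end="")
```

The search page itself would then only need to look the typed word up in that table and print the matching links.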
-I just work here... how am I supposed to know?
You're kidding, right? I haven't been able to get the search facility to work on any of the platforms I've tried (Netscape on Linux and Solaris, and IE on Windows). Java is enabled for all of them, but the browser either does nothing, or hangs when you select the search option :-(
"The invisible and the non-existent look very much alike." -- Delos B. McKown
Unfortunately, the search plugin isn't present on all OSes (like Linux, last I checked).
When I had to do this, I needed some dynamic functionality beyond simple search - I wound up using Apache+PHP, and simply pointing the browser to http://localhost:4711/. Surprisingly, we never had a tech support question related to the actual technology (although we got a few newbie "can I run this cdrom if all I have is a DVD drive?" type queries).
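To give a feel for the local-server approach, here is a minimal sketch that swaps Apache+PHP for Python's standard `http.server` module. This is not the setup described above. The `/search` path, the `q` parameter and the helper names are assumptions made for the example, and only port 4711 comes from the post:

```python
import functools
import html
import os
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, quote, urlparse

def find_matches(root, query):
    """Return relative paths of HTML/text files under root that contain query."""
    needle = query.lower()
    hits = []
    if not needle:
        return hits
    for folder, _dirs, files in os.walk(root):
        for filename in files:
            if not filename.lower().endswith((".html", ".htm", ".txt")):
                continue
            path = os.path.join(folder, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                if needle in handle.read().lower():
                    hits.append(os.path.relpath(path, root).replace(os.sep, "/"))
    return sorted(hits)

class CDHandler(SimpleHTTPRequestHandler):
    """Serves static files and answers /search?q=term with a list of links."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/search":
            return super().do_GET()  # ordinary static file request
        query = parse_qs(url.query).get("q", [""])[0]
        items = "".join(
            '<li><a href="/{}">{}</a></li>'.format(quote(hit), html.escape(hit))
            for hit in find_matches(self.directory, query)
        )
        body = "<html><body><ul>{}</ul></body></html>".format(items).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def make_server(root, port=4711):
    """Build a server bound to localhost only, serving the files under root."""
    handler = functools.partial(CDHandler, directory=root)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server("/path/to/cd").serve_forever()` and pointing the browser at http://localhost:4711/ would serve the pages. This version rescans every file on each query, which is slow on a CD drive, so a prebuilt index would be the next step.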
Of course, if it's just documents, Acrobat has a variety of really slick solutions that are *very* platform independent.
--
Evan
"$30 for the One True Ring. $10 each additional ring!" -- JRR "Bob" Tolkien
Acrobat is also pretty slick. I am pretty sure that if you buy the full version of Acrobat Creator from Adobe, it has all sorts of slick search functionality and indexing built in. And although proprietary, free (beer) viewers exist for nearly every platform.
A wealthy eccentric who marches to the beat of a different drum. But you may call me "Noodle Noggin."
Quando Omni Flunkus Moritati
ColdFusion used to have a Java applet that would allow you to index and search the HTML manuals. You could write him a little applet like that.
/*
HTML is the way to go.
*Not a Sermon, Just a Thought
*/
For network/Internet access, this isn't the answer, but for sending out CDs, it's almost ideal. There is some setup involved, so it's not a no-brainer but it is close.
A firewall can not protect you from yourself. Turn off what you do not need. Do not use the firewall to do your work.
HTML would probably be the ultimate solution because the WYSIWYG editors on the market today make it very easy for non-computer-literate people to build a simple interface to their content. However, if you had more skills I'd suggest Tcl or Perl with the Tk toolkit to do a better job of it.... If he wants to dish out a small amount of cash I'd be more than happy to write up some scripts :-)
Who's the black private dick, who's a sex machine for all the chicks?
HTML is standard (at least if you test it with more than one type of browser). You can try to use HTML for everything, but that will make your search facility more like the index of a book.
If you want dynamic content, you might want to ship the CD with a web server such as Apache (as binaries for the most popular platforms, and maybe source too).
Perl as a language is available for all platforms, but you will have to provide binaries for the popular platforms on the CD. (Of course, other languages such as Python, Java, or Apache+PHP could do the job as well.)
Alternatively, you could use Java or Python without a browser to write platform-independent applications. That might look better, but it is more work, and a Java virtual machine is not installed on every platform.
For the data, you can use either a tabular format (columns separated by tabs, with a header row naming the columns at the top) or an SQL database dump. Database dumps sometimes contain extra information or idioms that are hard to read into other databases, but you can avoid that with a little caution. Tabular format can also be imported into a lot of databases with moderate effort, or none.
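As an illustration of the tab-separated layout with a header row, here is a small Python sketch using the standard `csv` module. The column names and the sample record are invented for the example and say nothing about the real material:

```python
import csv
import io

def write_tsv(rows, columns):
    """Serialize rows (dicts) as tab-separated text with a header row on top."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def read_tsv(text):
    """Parse tab-separated text back into a list of dicts keyed by the header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))

if __name__ == "__main__":
    # Made-up record: page number, group name, topic.
    rows = [{"page": "12", "group": "Huichol", "topic": "ritual"}]
    text = write_tsv(rows, ["page", "group", "topic"])
    print(text, end="")
    print(read_tsv(text))
```

Because the header names the columns, most database import tools and spreadsheets should be able to pick the file up with little or no extra work.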
I'm still trying to figure out what people mean by 'social skills' here.
From your description, it sounds as though the content is plain text. Thus, I would keep it in generic ASCII.
As for the search interface, I'd use whatever the operating system provides. For Windows, that would be Start->Find->Files or Folders, which allows text-based searching. On *nix, you could use grep. I'm not sure what the Mac choice would be.
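For a platform with no such tool at hand, a grep-like search over plain text files is short enough to ship alongside the data. The sketch below is in Python and assumes an interpreter is available on the user's machine, which is not a given. The function name and the `.txt` filter are choices made for the example:

```python
import os

def grep_tree(root, term):
    """Yield (relative_path, line_number, line) for lines containing term, ignoring case."""
    needle = term.lower()
    for folder, dirs, files in os.walk(root):
        dirs.sort()  # visit subdirectories in a stable order
        for filename in sorted(files):
            if not filename.lower().endswith(".txt"):
                continue
            path = os.path.join(folder, filename)
            # errors="replace" keeps stray non-ASCII bytes from aborting the scan.
            with open(path, encoding="ascii", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line.lower():
                        yield os.path.relpath(path, root), number, line.rstrip("\n")
```

Looping over `grep_tree("/path/to/cd", "maize")` and printing each tuple gives roughly what a case-insensitive recursive grep would show.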
The less dense and more format neutral information is, the more likely it is to be useable in the future.
Keep it simple.
InitZero