(I'm in the document imaging / conversion industry)
The term paperless office is considered a joke, and the funny part of it is this: as soon as someone looks up a document in their doc management system they just print it. Even if just to glance at! Copier/printer companies are thrilled!
There are megatons of paper and microfilm out there left to ocr and process. It's considered a pretty fast growing industry, although stunted recently after the bomb and more by the economy.
Having OCR'd images is very handy. Here's an open secret though: image + hidden text PDF. It's searchable, you have the original doc just as it looked, and the OCR errors don't make such an impact. It's easy to throw into a search engine, the prints look great, and the files are small (for b+w use TIFF Group IV, and JPEG for color; JBIG is not quite mature yet and only a few apps from CVision do a great job at it).
Anyway, since people just hit print as soon as they find their doc in a system, those file cabinets we tried so hard to empty and organize refill magically.
Also, scanning and setting up an EDMS (electronic doc management system) is considered a luxury. Businesses move slowly on luxury items and usually get to reap the benefits of more mature software and systems (but this is NOT always true!).
Many other slow-tech-adoption businesses are just discovering scanning, OCR and doc management. Litigation is a great example. Xerox was doing quite a few TV ads recently touting that stuff.
The state of OCR itself is strange. There has been a sort of plague of 'weird innovation' in that industry for years, with many buyouts and companies changing the focus of their OCR product to another industry (like web or XML). Even the small office versions ($500 range) are not geared for any sort of reasonable volume or speed without crashing and burning, and are usually designed to be babysat. Using these apps leaves the user with a really bad experience. For those not familiar, the process goes something like this for a 200 page b+w document:
==
Scan (or import, but import is usually crippled)
gaze at loads of memory hogging eye candy (this is usually what your upgrade bought you)
wait
correct skew (wait for crappy tools)
(possibly reboot from crash)
recognize page - slower with each new version, even though hardware is so much faster every year. Some recognition is improved in some packages. Some of the latest I've tested take over 15 sec and sometimes over 45 sec per page!
(crash!?)
Correct errors / tune learning engine. (sometimes I swear this effort of teaching goes straight to NULL)
repeat 199 times
Now, since you're locked to your desk and finished scanning, it's time to export! (like I didn't know what formats I wanted before I sat down.)
So it chews and chews and maybe crashes, causing you to repeat all the above steps. Also note that most of these apps keep all the pages pretty much uncompressed in memory, then create a copy of them in memory for your desired output format. (crash)
2 days of work gone.
====
Most users walk away with the feeling of 'Yikes! All I wanted was a Word doc of this. I'll just do something else.'
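To put rough numbers on the "all pages uncompressed in memory, plus a copy" problem, here is a back-of-the-envelope sketch. The page size (US Letter), the 300 dpi resolution and the bit depths are my assumptions for illustration, not figures from any particular OCR package, and the function names are made up.

```python
# Rough estimate of RAM needed to hold scanned pages uncompressed.
# Assumptions (not from any real product): US Letter pages, no padding,
# and `copies` full copies of every page held at once.

LETTER_INCHES = (8.5, 11.0)


def page_bytes(dpi: int, bits_per_pixel: int, size_in=LETTER_INCHES) -> int:
    """Bytes for one uncompressed page at the given resolution and depth."""
    width_px = round(size_in[0] * dpi)
    height_px = round(size_in[1] * dpi)
    bits = width_px * height_px * bits_per_pixel
    return (bits + 7) // 8  # round up to whole bytes


def job_megabytes(pages: int, dpi: int, bits_per_pixel: int, copies: int = 2) -> float:
    """Total megabytes (10^6 bytes) for a job with `copies` in-memory copies."""
    return pages * copies * page_bytes(dpi, bits_per_pixel) / 1_000_000


if __name__ == "__main__":
    # 200-page b+w job: true 1-bit storage vs. an app that expands to 8-bit.
    print(job_megabytes(200, 300, 1))
    print(job_megabytes(200, 300, 8))
```

If an app expands 1-bit scans to 8 bits per pixel internally (an assumption, but a plausible one), the 200 page job lands in the multi-gigabyte range, which would go a long way toward explaining the crashes.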
For the home and small biz market, here are some of the 'weird innovations'--
===========
TypeReader 5 -- pretty good app! Doesn't do image+hidden text PDF though. Pity. Has a batch file import and is reasonably priced in the $100 range. Nice and fast with good results.
TypeReader 6 and up -- file import feature moved to the industrial version, lots of eye candy, less stable, minor improvement in recognition, a bunch of other silly limits, and slow.
OmniPage -- same thing, only it's never been great for over 50 pages. Horrid workflow and crashes like crazy. Very unpredictable! OmniPage version 3 was better in many ways than OmniPage 14. (lightning fast on today's equipment too :)
ABBYY FineReader -- very slow but great recognition, more stable, but lame workflow.
These services do exist and some places I've worked with have used them. The usual problem is a poor understanding of the 'gotchas' in English. Document structure and names get mangled the most.
My gf convinced me to go. She was at a novice level of computer knowledge but saw a blurb on peekabooty in a newspaper. I retired from security hacking in the 80's, back when it was nearly effortless to hack systems. I was apprehensive about going because I felt like I'd be looked down on for having almost newbie status on the topics at hand, and assumed she would be in the same boat.
She told me you only live once, so what if they look down on us, if they do, they suck, and we probably won't be the only ones that are rusty or new. She was totally right.
We saw--
Excellent people, a whole newbie series of talks on 'what is a hacker' and 'dmca', and this little talk by Dmitry Sklyarov about ebook encryption, and plenty more. We even attended the high end stuff and found it to be very accessible. There were plenty of admitted novices there, and plenty more posers that quickly admitted to being posers. Other girls (gasp!) were there; some were gf tag-alongs and many weren't.
Also, your gf will get a taste of being at a place with a few thousand people that are probably similar to you!
She'll see other people with a similar approach to things, mannerisms, dress, music, even annoyances (see! it's not my fault. i was born this way!)
It's always great being at a big place with a bunch of really creative (and somewhat eccentric)people.
Afterwards she might think twice about buying that leather Trinity jacket / corset / O'Reilly t-shirt :)
On top of that, regardless of her age, there's always everything else in Vegas. A lot of the defcon folk stay at the Hard Rock Hotel down the street for ease of gambling & wi-fi discovery.
I would advise against her hooking up her laptop (if she brings one) to ethernet of any kind w/o supervision. The script kiddies are quick and the pros are really fast.
Judging from the replies so far to this article it seems it piqued the ire of a facet of slashdot that always posts some sort of "why would anyone want to do/use/make/create something like that?!"
[well- why not?]
I wish there was a mod -5 Curmudgeon feature.
I have my own curmudgeon chunk too, which said "ooo! an ipod zealot text file. they'll feel so special. If i could only come up with something for cat worshipers that ran on the ipod i'd be rich!"
I just moved to San Francisco from the midwest and I've been noting a bunch of societal quirks that make this idea not so bad (at least for San Francisco):
Public transport is big here, but I have yet to see someone whip out a laptop on it. Playing cell phone games, a gameboy, or a walkman/ipod is ok, but a palm pilot is quite rare. Go figure. Riders seem to feel pretty secure and comfortable most of the time on the transport here (compared to New York and Chicago), and you can't swing a cat around here without hitting someone wearing an ipod (also the theoretical cat would hit at least 3 dentist offices, 2 optical places and 1 walgreens per revolution).
being able to look up free wi-fi on a device i'm already carrying and using would be nice.
introducing ipod wi-fi starbucks junkies to a new place with free wi-fi and a better atmosphere can only be a good thing.
what the hell. i'll go repair my ipod sometime and load up the list and see how it goes.
i'll probably print the list too.
and build a free wi-fi enabled roller coaster in my apartment
Can't remember whether it was PopSci, SciAm or BYTE, but 10-15 yrs ago I remember an article about IBM researchers exchanging business card and phone number info between people using a handshake, or having several people's devices 'synched' at once using a banister / handrail.
Knowing IBM i'm pretty sure they paid a visit to the patent office.
It was great for crime photos, surveying, construction, etc. IIRC they had a snap on module later.
http://www.kodak.com/global/en/service/professional/tib/tib7061.jhtml?id=0.1.14.34.5.110&lc=en
They seem to have abandoned it. Silly to do for such a simple and useful feature. Hope the new project takes off.
I relied faithfully on 3Com's 100Mb products.
Their lower cost rack mount gig E 6 port switches blew, and blew up! I had 5 out of 8 die just out of warranty. Also they didn't handle large frame TCP.
Personally I wasn't thrilled with the Broadcom/3Com gig E NICs either, and 3Com support took 2 months to respond to a few simple questions, after repeated requests.
I'll let everyone else test this batch of routers.
Geese & peacocks are good choices (peacocks are really noisy through the day). Goats can be territorial and convincing, and ostriches... they'll rip off your face with their foot if you screw with their territory.
Cactus. Lots and lots of mean cactus.
http://www.hi-vel.com/Catalog__17/Perimeter_Alarm
Like the pull string firework poppers, only with siren shells or 'kick'. Not for use inside a car without adequate shielding.
What I want is one of the talking alarms that count down until the alarm. Then during a countdown from 15 once it reaches 12 or something would fire off a stun grenade or something insanely loud.
They handle many nasty smell situations at beef packing and rendering facilities. Solution would probably include some enzymes to chew up most of the stuff and chlorine dioxide to kill off the bacteria, etc. They probably handle the odor control systems for stink exiting the plant too so there might be a tech at your plant every week or so who'd help you.
There are several other companies that handle this type of situation as well.
http://www.ashchem.com/ascc/drewind/
We know already that SID doesn't comply in spirit with the internet we know and love.
We know spammers are already lined up and using SID, so the system is already polluted. "ya want validated spam with that?"
MS doesn't want OSS/Linux/etc. They have made that quite clear. Right now they need us to support this or the whole thing fails- or they start an apache war or something. MS has enough control already. IMHO they should have no say-so about my email.
Some persons at ms are getting *paid* to deploy this successfully & quickly and they will try very hard to do so. This includes convincing everyone else to support it. (for free?) Hold the ropes boys and girls.
Why would the OSS community care about supporting something that is IP encumbered by ms and in litigation, broken, bastardized, and infested with spammers already? err.. and it's by our trustworthy future thinking pal microsoft.
So IIRC if they flick the switch on this thing hotmail and msn will be crippled and only work with SID friendly systems. Boo Hoo. maybe hotmail users will complain to ms since they won't be able to complain to me!
Look-- Every time ms does something like this (eg: tcp/ip, kerberos, iis, ie, outlook, etc.) it's a train wreck of decaying squid parts. Learn from the mistakes. If they need support for SID, stall them:
Tell them you'll put it on an Action List or you'll do it as soon as 'counsel gives you the green light'. Tell them you use drugs and therefore cannot be trusted with such things until rehab! Or just lie! They'll never expect it! Better yet, make them believe it will soon be supported!
Anyway I hereby claim my disgust and lack of support for sender id and beg all the developers working so hard on interesting things being bothered to support this to not waste their time and keep on inventing.
Thank you.
Have all forms online, and submittable online.
try and collect as many state forms too.
A Q&A forum, which could also be built into a series of (or a single) FAQs, authoritatively answered by the appropriate dept. Also identify the dept., to relieve the main switchboards, and give contact info for said dept. in the answer in case more info is needed. Answers could be signed by their authors, giving your noxious weed dept. a more personable image.
I'm sure every department has a lot of time tied up in repeating themselves with the same questions from citizens, especially seasonal questions (lawn watering, licenses, xmas tree disposal or whatever).
That's the whole beauty of wiki. You don't need to trust the latest version, and poisoning an article from the beginning rarely (or never?) occurs.
Does no one remember that they have 747's prepped to transfer the shuttles??!!!?
http://images.google.com/images?q=shuttle%20747&hl=en&lr=&ie=UTF-8&c2coff=1&sa=N&tab=wi
These used to fly over my house with the shuttles.
I'm suspicious of the timing of this article-
Right after a bad jobs report also considered to be timed just before a bunch of republican national convention hype.
It's hard to doubt that some cases of trickle down will happen but mostly I see something warm yellow and smelly trickling down on the face of the lower tech sector.
Yep. Saw it. Appears to me she's honest.
As a public service to you:
The 18yr 36-24-36 blonde h0tt3 you've been telling all your fantasies to lately is not very honest.
I don't think a
She's still cool in my book, cromag.
Please return to AOL now. This is not the place for you.
Answer the damn question or go read maxim in the bathroom.
She's cool in my book.
http://starbase.globalpc.net/~vanessa/c128tower/index.html
OMG a nerd on slashdot who's female and not the likely winner of a hotornot contest with a legit question.
Does she deserve a bunch of sexist heckling and brush off answers? Nope.
I'd like to see more apps and OSes with user skill level UI extensions designed in. If you aren't a great UI designer, no problem! Let someone else do the simpler UI skins/templates later. Let others build more usable interfaces and use your great app as an engine behind different UI designs, but keep the interfaces inside the same app if at all possible.
This allows users to advance from an easy mode to a more featured but probably more difficult mode with a few clicks.
geekgod - everything you can ask for: cli, a bit cryptic, automatable, no delete confirmation, no hand holding, little thought given to uncluttered prettiness. Just the way you like it.
Advanced/power user - better usability starts to take shape and most features are available. A little more friendly setup that might not require reading all of the code, manuals and forums. Some help available in the app, etc.
Normal user - much more simplified but still usable. Think of macintosh apps geared for normal users. Companies, managers, PHBs, and the rest of the less techie lot love these! Need more? Click User skill -> Advanced! Now the user *feels* like they have learned and advanced their knowledge and skill. It's a powerful thing!
Novice user - pretty, sparkly, very visual but insanely basic. Just the core functions. As much as KPT apps, power point & flash stuff annoy me, the less technical love it and understand it. Furthermore they're impressed by it! That is worth so much.
So my fantasy world looks something like this-
All the debate about Openoffice.org chameleoning ms office is replaced by 'MS Office mode' and 'Interesting and way better mode', and 'I think in TEX mode'.
When teaching novices or the elderly how to use email I set the app to "novice mode" and it holds their hand the whole way through every time, giving them advice, etc.
Anyway the benefits are endless.
Let all budding UI designers take a crack at making your app usable by the masses, the advanced and the geeks.
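A minimal sketch of how the skill-level idea could be wired into one app: each feature declares the lowest level that gets to see it, and the UI just filters on the user's current setting. The feature names and the `Skill` / `Feature` types are made up for illustration; this isn't how any existing app does it.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Skill(IntEnum):
    """User skill levels, ordered from least to most exposed."""
    NOVICE = 1
    NORMAL = 2
    ADVANCED = 3
    GEEKGOD = 4


@dataclass(frozen=True)
class Feature:
    name: str
    min_skill: Skill  # lowest level at which the feature is shown


# Hypothetical feature list for an email client.
FEATURES = [
    Feature("send_mail", Skill.NOVICE),
    Feature("filters", Skill.NORMAL),
    Feature("scripting", Skill.ADVANCED),
    Feature("raw_cli", Skill.GEEKGOD),
]


def visible_features(level: Skill) -> List[str]:
    """Names of the features shown at the given skill level."""
    return [f.name for f in FEATURES if f.min_skill <= level]


def needs_confirmation(level: Skill) -> bool:
    """Everyone except geekgod gets a delete confirmation."""
    return level < Skill.GEEKGOD


if __name__ == "__main__":
    for level in Skill:
        print(level.name, visible_features(level), needs_confirmation(level))
```

Moving from "User skill -> Normal" to "Advanced" is then just a change of one setting, and the same engine sits behind every mode.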
That site has a load of goodies! Thanks!
It's slightly off topic but seemed appropriate.
Here are some quick tips/nuggets of crispy wisdom.
The art of OCR is like working with autistics. Give them what they expect. The more surprises, the more episodes.
Don't believe the hype.
Scan black & white to TIFF Group IV. OCR systems are optimized for this. Color is new and pretty wacky still. BMP even freaks out in black and white on some packages.
Make sure your background is white and clean, not speckled. Despeckling tools can be overused and kill OCR results.
3 hole punches regularly show up as o O 0 D
staples: ~ .. // c d
Deskew all images to a line of text, not the page
Scan at 200-300 dpi but not higher than 600 or most apps will choke and produce bad results.
Make a custom dictionary if you can. if you're doing automotive related stuff, look up auto terms and make a dictionary out of it.
To process tiny text (concordances etc.) scan at 800 dpi and then fool the OCR by scaling the image to 300. Sounds nuts, right? OK, try it the logical way first and then come back and try this technique.
Shaded text is a new thing in documents, as are inverted text blocks (thanks Word... you make my job hell). You must remove the shading with something like ScanFix by TMS Sequoia, a good tool for small doc cleanup before OCR. It requires practice and trial and error. The interface needs some work though.
Dot matrix prints should be scanned and some blur added to join the dots (unless you are using something expressly made for DMPs). As always, your mileage may vary.
Turn off auto rotate (mangle) features. They are not very smart and often have monkeyvision. Just review your images beforehand and rotate accordingly.
If you're scanning something poster size, or engineering drawing size (not recommended for most ocr) cut it into smaller images. ideally regions of interest not larger than 8.5x11
Remember 99% accurate means 1 character per hundred will be screwed up.
Table of contents pages are an interesting test for OCR, especially if they use periods to lead to page numbers. How many identical characters can occur before the OCR system misreads? Often quite telling.
OCRing a spreadsheet and using the data without verifying every character? May the monkeygods help you.
The above applies to processing screenshots, 17th century print, tabloid print, multi-column layouts, shaded backgrounds, handwriting w/o special software, modern magazines, etc.
OCR does not like non seriffed fonts much.
The post office spends millions on theirs and they have a nice address DB list to verify against.
Over 90% of banks scan your checks and microfilm them (and there is some really cool signature verification software out there for forgery detection). The MICR font at the bottom helps them immensely.
Breaking down your document into areas can be useful. changing fonts and sizes sometimes throw it off . an example would be computer lit with code snippets interspersed.
Do yourself a favor if it applies and use image+hidden text pdf. raw ocr is almost always yucky and all those claims of preserving document layout and format are just that--claims.
If you do use i+HT pdf, or for a larger job for that matter, do it in small chunks so your app doesn't crash. for pdf, join the small documents together in acrobat later or use other tools to do so.
For fun and science, take an old apple newton 100 and trace over some of the text on your page and compare its results to your ocr package.
Anyway i hope that helps someone avoid a few landmines and there are many more tips out there. these are from my experience and off the cuff.
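A few of the tips above boil down to simple arithmetic, so here is a rough sketch of three of them: what "99% accurate" costs you per page, cutting an oversized scan into pieces no larger than 8.5x11, and feeding a big job through in small chunks. The function names, the 2000 characters per page figure and the chunk size of 25 are my own assumptions. The tiling is a plain grid, which can slice through a line of text, so treat it as a starting point for picking real regions of interest.

```python
def expected_errors(chars: int, accuracy_pct: int) -> float:
    """Expected misread characters at a given percent accuracy."""
    return chars * (100 - accuracy_pct) / 100


def tile_boxes(width_px: int, height_px: int, dpi: int, max_in=(8.5, 11.0)):
    """Grid of (left, top, right, bottom) boxes, each at most max_in inches."""
    tile_w = int(max_in[0] * dpi)
    tile_h = int(max_in[1] * dpi)
    boxes = []
    for top in range(0, height_px, tile_h):
        for left in range(0, width_px, tile_w):
            right = min(left + tile_w, width_px)
            bottom = min(top + tile_h, height_px)
            boxes.append((left, top, right, bottom))
    return boxes


def batches(pages: list, size: int) -> list:
    """Split a page list into chunks of at most `size` pages."""
    return [pages[i:i + size] for i in range(0, len(pages), size)]


if __name__ == "__main__":
    # Assumed 2000 characters on a dense page.
    print(expected_errors(2000, 99))
    # A 24x36 inch drawing scanned at 300 dpi is 7200x10800 pixels.
    print(len(tile_boxes(7200, 10800, 300)))
    # A 200 page job in chunks of 25 pages.
    print(len(batches(list(range(1, 201)), 25)))
```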
Image + hidden text PDF is now in most lower end OCR products. Also in Acrobat 4 & 5 (iirc). Sometimes you have to dig to find the feature though.
going from image ->ocr'd text -> text only pdf or text pdf with image snippets usually looks quite awful.
I+HT PDFs though will let you search for and highlight the text behind the image. It isn't perfectly accurate on placement but it's usually adequate.
You're quite right that most of the time 'scan to pdf' is just a bitmap. It's the usual default for most OCR or scan apps. Fujitsu has a small cheap desktop scanner that ONLY scans to PDF (eek!). I can't recall if it allows OCR or not. Much high end scanner software will allow PDF but with no OCR (Kodak, Fujitsu and Kofax based products come to mind). With the high speed scanners they usually don't want to include anything that makes their scanner look slow. This thinking is sort of reasonable, because if you have a 160 page per minute scanner and have to wait 15 secs per side of the page to OCR it, the whole purpose of high speed is defeated. Not that that sort of problem can't be solved.
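For a sense of scale on the 160 pages per minute vs. 15 seconds per side mismatch, a quick sketch. It assumes OCR jobs can be farmed out in parallel with no overhead, which is my simplification; the function name is made up.

```python
import math


def ocr_workers_needed(pages_per_min: float, ocr_secs_per_side: float,
                       sides_per_page: int = 1) -> int:
    """Parallel OCR workers needed to keep pace with the scanner.

    Assumes perfectly parallel OCR with no queueing or startup overhead.
    """
    ocr_seconds_per_minute = pages_per_min * sides_per_page * ocr_secs_per_side
    return math.ceil(ocr_seconds_per_minute / 60)


if __name__ == "__main__":
    print(ocr_workers_needed(160, 15))      # single-sided
    print(ocr_workers_needed(160, 15, 2))   # double-sided
```

So one minute of single-sided scanning generates about 40 minutes of OCR work on one machine, which is why the scanner vendors would rather leave it out of the box.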
Thanks. My pleasure.
i'm probably just thick headed tonight but whacha mean by
"and for ++ S/N(/.)" ?
Thanks!
every time you don't print trillions of electrons are forced into slavery.
(I'm in the document imaging / conversion industry)
:)
The term paperless office is considered a joke, and the funny part of it is this: as soon as someone looks up a document in their doc management system they just print it. Even if just to glance at! Copier/printer companies are thrilled!
There are megatons of paper and microfilm out there left to ocr and process. It's considered a pretty fast growing industry, although stunted recently after the bomb and more by the economy.
Having ocr'd images is very handy. Here's an open secret though-- Image+ hidden text pdf.
--Searchable, you have the original doc just as it looked, and the ocr errors don't make such an impact. It's easy to throw into a search engine and the prints look great, and small (b+w use tiff group iv, and jpeg for color jbig is not quite mature yet and only a few apps from cvision do a great job at it)
Anyway, since people just hit print as soon as they find their doc in a system those file cabinets we tried so hard to empty and organize re fill magically.
Also, scanning and setting up an edms (electonic doc managemnt system) is considered a luxury. business move slow with luxury items and usually get to reap the benefits of more mature software and systems (but this is NOT always true!).
Many other slow tech adoption business are just discovering scanning ocr and doc management. Litigation is a great example. xerox was doing quite a few tv ads recently touting that stuff.
The state of ocr itself is strange. For years that industry has had a sort of plague of 'weird innovation', with many buyouts and companies changing the focus of their ocr product to another industry (like web or xml). Even the small office versions ($500 range) are not geared for any sort of reasonable volume or speed without crashing and burning, and are usually designed to be babysat. Using these apps leaves the user with a really bad experience. For those not familiar, the process goes something like this for a 200 page b+w document:
==
Scan (or import, but import is usually crippled)
gaze at loads of memory hogging eye candy (this is usually what your upgrade bought you)
wait
correct skew (wait for crappy tools)
(possibly reboot from crash)
recognize page - slower with each new version, even though hardware is so much faster every year. Recognition is improved in some packages. Some of the latest I've tested take over 15 sec and sometimes over 45 sec per page!
(crash!?)
Correct errors / tune learning engine. (sometimes I swear this effort of teaching goes straight to NULL)
repeat 199 times
Now that you've been locked to your desk and finished scanning, it's time to export! (like I didn't know what formats I wanted before I sat down.)
So it chews and chews and maybe crashes, causing you to repeat all the above steps. Also note that most of these apps keep all the pages pretty much uncompressed in memory, then create a copy of them in memory for your desired output format. (crash)
2 days of work gone.
====
Most users walk away with the feeling of 'Yikes! all I wanted was a word doc of this. I'll just do something else'
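On the memory point in the walkthrough above, a rough estimate shows why the export step falls over. The page size (US letter), 300 dpi, and the two bit depths below are my assumptions, not figures from any particular ocr package; only the 200 page count and the 'second copy for export' come from the description.

```python
# Rough estimate only. Letter-size pages at 300 dpi are assumed; real apps
# may hold pages at other resolutions or bit depths.

def raster_bytes(width_in: float, height_in: float, dpi: int,
                 bits_per_pixel: int) -> int:
    """Bytes for one uncompressed page, with each row padded to a whole byte."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px

PAGES = 200    # document size from the walkthrough
COPIES = 2     # working set plus the in-memory copy made for export

# Case 1: pages held as true 1-bit b+w bitmaps.
bilevel_total = raster_bytes(8.5, 11, 300, 1) * PAGES * COPIES
# Case 2: pages expanded to 8-bit grayscale internally (assumed, but common).
gray_total = raster_bytes(8.5, 11, 300, 8) * PAGES * COPIES

print(f"1-bit: {bilevel_total / 1e6:.0f} MB, 8-bit: {gray_total / 1e6:.0f} MB")
```

Even the optimistic 1-bit case is several hundred MB for one document, and if the app expands pages to 8 bits per pixel internally it runs to a few GB, which would explain the crash at export.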
For the home and small biz market, here are some of the 'weird innovations':
===========
TypeReader 5 -- pretty good app! Doesn't do image+hidden text pdf though. Pity. Has a batch file import and is reasonably priced, in the $100 range. Nice and fast with good results.
TypeReader 6 and up -- file import feature moved to the industrial version, lots of eye candy, less stable, minor improvement in recog, a bunch of other silly limits, & slow.
OmniPage -- same thing, only it's never been great for over 50 pages. Horrid workflow and crashes like crazy. Very unpredictable!
OmniPage version 3 was better in many ways than OmniPage 14 (lightning fast on today's equipment too).
ABBYY FineReader -- very slow but great recognition, more stable but lame workflow-
These services do exist, and some places I've worked with have used them. The usual problem is a poor understanding of the 'gotchas' in English.
Document structure and names get mangled the most.
http://www.laservideo.com/
Do it!
:)
My gf convinced me to go. She was at a novice
level of computer knowledge but saw a blurb on
peekabooty in a newspaper. I retired my security
hacking in the 80's back when it was nearly effortless
to hack systems. I was apprehensive about going because
I felt like I'd be looked down on for having
near-newbie status on the topics at hand, and
assumed she would be in the same boat.
She told me you only live once, so what if they
look down on us, if they do, they suck, and
we probably won't be the only ones that are rusty
or new. She was totally right.
We saw:
Excellent people, a whole newbie series of
talks on 'what is a hacker' and 'dmca', and this
little talk by Dmitry Sklyarov about ebook
encryption, and plenty more.
We even attended the high end stuff and found it
to be very accessible. There were plenty
of admitted novices there, and
plenty more posers who quickly admitted
to being posers. Other girls (gasp!) were there;
some were gf tag-alongs and many weren't.
Also, your gf will get a taste of being at a place
with a few thousand people who are probably similar to you!
She'll see other people with a
similar approach to things, mannerisms, dress,
music, even annoyances (see! it's not my fault.
I was born this way!)
It's always great being at a big place
with a bunch of really creative (and somewhat eccentric) people.
Afterwards she might think twice about buying
that leather Trinity jacket / corset / O'Reilly t-shirt.
On top of that, regardless of her age there's always
everything else in Vegas. A lot of the defcon folk
stay at the Hard Rock Hotel down the street for
ease of gambling & wi-fi discovery.
I would advise against her hooking her laptop up to ethernet
of any kind w/o supervision, if she brings one.
The script kiddies are quick and the pros are really fast.
So go! It'll be great. She'll have a great time.
Judging from the replies so far to this article
it seems it piqued the ire of a facet of slashdot
that always posts some sort of "why would anyone
want to do/use/make/create something like that?!"
[well- why not?]
I wish there was a mod -5 Curmudgeon feature.
I have my own curmudgeon chunk too, which said
"ooo! an ipod zealot text file. they'll feel so special. If I could only come up with something
for cat worshipers that ran on the ipod I'd be rich!"
I just moved to San Francisco from the Midwest
and I've been noting a bunch of societal quirks that make this idea not so bad (at least for San Francisco):
Public transport is big here, but I have yet to see someone whip out a laptop on it. Playing cell phone games, a gameboy, or a walkman/ipod is ok, but a palm pilot is quite rare. Go figure.
Riders seem to feel pretty secure and comfortable most of the time on the transport here (compared
to New York and Chicago) and you can't swing a cat
around here without hitting someone wearing an ipod (also the theoretical cat would hit at least 3 dentist offices, 2 optical places and 1 walgreens per revolution).
Being able to look up free wi-fi on a device
I'm already carrying and using would be nice.
Introducing ipod wi-fi starbucks junkies to a new
place with free wi-fi and a better atmosphere can
only be a good thing.
What the hell. I'll go repair my ipod sometime and load up the list and see how it goes.
I'll probably print the list too.
And build a free wi-fi enabled roller coaster in
my apartment
because it would be fun.
MoZuki???
Can't remember whether it was PopSci, SciAm or BYTE,
but 10-15 yrs ago I remember an article about IBM
researchers exchanging business card and phone number
info between people using a handshake, or
having several people's devices 'synched' at once
using a banister / handrail.
Knowing IBM, I'm pretty sure they paid a visit
to the patent office.