Aussie Government Gives PDF the Thumbs Down
littlekorea writes "The central IT office of the Australian Government has advised its agencies to offer alternatives to Adobe's Portable Document Format to ensure folks with impaired vision are able to consume information on the Web. A Government-funded study found that PDFs can present themselves as image-only files to screen readers, rendering the information contained within them unreadable for the vision impaired."
Couldn't they have just required that the text portions of a PDF file are actually text?
A thumbs down in the southern hemisphere is the same as a thumbs up in the northern hemisphere, as long as you name the file bruce.pdf. It saves confusion.
So can a webpage, or a word document.
I suppose a pure text file cannot, but at the expense of other meta-data. Why not require PDFs to have word position OCR done (part of Acrobat Pro, so hardly a chore), and keep info like page number and position on page for scans. For non-scans it would take effort to destroy the text data.
Hell, even in ASCII I could use something like figlets to generate large letters (for easy reading), and destroy accessibility.
This sounds like some bozo official had a scanned hard copy in PDF, ran into trouble, and blamed the format (even though it offers a good built-in way to handle the situation) rather than the other bozo who scanned it and didn't use the built-in OCR function. I'm pretty sure these people would do the same with HTML, OOXML or ODF; it's not the format's fault.
Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
Other than plain text, are there really many alternatives that don't come with difficulties of their own? The only other options I can see out there at the moment are ePub, simplified HTML or RTF, but of course they all fall short of the possibly desired 'fancy formatting'.
As someone will likely also mention, why not just mandate that the PDF contents are actually text, as opposed to images (which is annoying to anyone!).
That is the case with badly done PDFs where pages are rendered as images. PDFs produced via the Office plugin, OpenOffice, or any other proper authoring package at the default settings have the text present and the fonts embedded, so they should work fine as far as accessibility goes.
How about enforcing some computer literacy on document publishers instead?
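A rough way to tell the two kinds of PDF apart, as a minimal sketch: the function name `looks_image_only` is made up for illustration, and the check is a crude byte-level heuristic, not a real PDF parser. It assumes the font and image dictionaries are stored uncompressed in the file; PDFs that pack their objects into compressed object streams will fool it, and a proper check would use a PDF library to extract text page by page.

```python
import re

def looks_image_only(pdf_bytes: bytes) -> bool:
    """Crude guess: True if the file declares image XObjects but no fonts.

    Heuristic only. It scans the raw bytes, so objects hidden inside
    compressed object streams are invisible to it.
    """
    # "/Font" as a whole name; "/FontDescriptor" alone does not count.
    has_font = re.search(rb"/Font\b", pdf_bytes) is not None
    # Scanned pages are normally stored as image XObjects.
    has_image = re.search(rb"/Subtype\s*/Image\b", pdf_bytes) is not None
    return has_image and not has_font

# Synthetic fragments standing in for real files (assumed, not real PDFs).
scanned = b"%PDF-1.4\n1 0 obj << /Type /XObject /Subtype /Image /Width 1700 >> endobj\n"
authored = b"%PDF-1.4\n1 0 obj << /Type /Font /Subtype /Type1 /BaseFont /Helvetica >> endobj\n"

print(looks_image_only(scanned))   # scan without OCR
print(looks_image_only(authored))  # exported from an authoring package
```

A scan that has been through OCR carries both images and fonts, so this sketch would not flag it.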
Baker's Law: Misery no longer loves company. Nowadays it insists on it
http://www.sigsegv.cx/
As there are a bunch of tools that can convert a PDF to an audio file, I find it hard to believe that there are no screen readers that can handle a PDF file.
Morons. Regardless of the destination file format, if you scan something it will be inaccessible unless some special processing is done (OCR).
Acrobat sucks, but PDF is actually a decent format (if you have a decent reader)
And it helps if the PDF authors aren't incompetent.
The file format is not to blame. Morons who scan text-based documents into PDF files, saving each page as an image are to blame. Even in 1995 or so, when I was first exposed to OCR technology, it worked "fairly well." Anyone converting text to PDF by scanning pages in as images these days is a complete moron, and a huge variety of applications now support exporting text-based documents directly to PDF format with full text search and indexing capabilities intact, along with fancy formatting like gasp italics, bold script, superscript, subscript, numbers, fairly complex mathematical expressions, etc. Hell, images can even be embedded in PDF docs that are largely textual content (holy wow, the technology!), along with alternate text and hyperlinks. In other words, "WTFMATE."
Possibly not a bad thing given the vast number of security flaws and exploits that PDF has been hit with, especially over the last few years.
I really like PDF's ability to retain the font and display of the document without worrying about fonts and the application.
Since I have to distribute documents that are read on a variety of systems, including Linux, OSX, iPhone/Pad and Windows, PDF really beats all other alternatives in compatibility.
Adobe should really work on creating a text/image-only version of PDF without their fancy password protecting features and what-not.
If they don't, perhaps an open source group can take on the challenge.
Look at this page. It's for a local police department in a city that has lots of blind people because of the presence of the California School for the Blind. This is the first page that Google lists for the site. I can't imagine that a screen reader can make anything of the front page and there are no navigation buttons.
The real "Libtards" are the Libertarians!
that they made a PDF version of the report.
Missing from the statement is what the preferred format is.
I would expect a Microsoft format from our illustrious leaders.
Reads like a fairly dumb statement, which is what I always expect from our government.
Sounds like a lead-up to them locking themselves (us) into using a proprietary, expensive, unusable system.
Who , me , negative ,
yep
Go well
And the rest of us say "Get rid of it". We do not access government documents to be blown away by their totally rad page style. We access them for information, and extracting the information from the glumph that encases it is sometimes hard for the best of us.
HTML all the way. Any formatting you cannot fit in a simple stylesheet can get left out.
Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
The key problem that the majority seem to be overlooking here is that the people affected by this are disabled (mostly the blind). 1) You’re either blind or mostly blind; this is pretty bad, life has already given you the short end. Screens are designed to be read, this is a fact. 2) You’re blind and thus probably not the most computer-savvy person; you’re probably getting your friend’s son to fix this up for you by installing software meant to fix this. 3) The tools made to help these people are not very well made; most just provide magnification or do text-to-speech. 4) The office person who takes a written document, scans it on the office scanner, and then puts the result on the web is not thinking about the poor blind bastard who can’t see it; it’s not in their job description.
Finally! Hopefully now we won't have to use those hideous Interactive PDFs that the Electoral Commissions force us to use for digital submissions.
I'm pretty sure that 90% of all documents on the internet need nothing more fancy than RTF encoding or even a very simple set of BBCode tags to be usable. I know PDFs are supposed to have tons of features but why not just be simple and stick with ASCII?
Hire me...
What does it matter that they can't read the text? PDFs aren't about content, they are about preserving the layout. At least that is what it seems like to me when I am foolish enough to try and read PDFs on a device with a different number of pixels than the person who made the PDF file.
If the content matters at all, someone should invent a technology that allows text to be tagged somehow with indicators of the MEANING of that portion of text, like 'this is a title', and let the display device render the text according to how the reader can best view it. It sounds crazy, and it may take a few decades to do, but think of the benefits.
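For what it's worth, the idea above is roughly what semantic markup already does. Here is a minimal sketch of it: the `(role, text)` document model, the role names, and both renderers are invented for illustration and are not taken from any real format. The point is only that one tagged source can be rendered differently for a terminal and for a text-to-speech script.

```python
import textwrap

# Hypothetical document model: each piece of text carries a role, not a look.
DOC = [
    ("title", "Annual Report"),
    ("paragraph", "Spending rose slightly this year."),
    ("emphasis", "Submissions close Friday."),
]

def render_terminal(doc, width=40):
    """Render for a sighted reader on a plain text terminal."""
    out = []
    for role, text in doc:
        if role == "title":
            # Titles become upper case with an underline of the same length.
            out += [text.upper(), "=" * len(text)]
        elif role == "emphasis":
            out.append("*" + text + "*")
        else:
            # Everything else is wrapped to the reader's chosen width.
            out.append(textwrap.fill(text, width))
    return "\n".join(out)

def render_speech(doc):
    """Render as a script that a text-to-speech engine could read aloud."""
    prefix = {"title": "Heading: ", "emphasis": "Important: "}
    return "\n".join(prefix.get(role, "") + text for role, text in doc)

print(render_terminal(DOC))
print(render_speech(DOC))
```

The reader's device picks the renderer; the author only says what each piece of text is.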
They whose government reduces their essential liberties for temporary security, receive neither liberty nor security.
The Aussie government failed to recommend a standard that supplants PDF in such a way that it handles all the cases one would expect to handle. So what's the point of this exercise by the Oz gov't, other than basically saying, without quite saying it, 'we should publish everything in XML documents since at least those can be parsed to some degree'?
You know, there should be an industry-standard sheet of paper (Letter/A4) that meets the JAWS difficulty test, much in the same way there are test HTML pages that test web browser compliance with HTML 1.1/5.0.
Needless to say, blind people already have solutions for reading printed text that is not braille. Print the PDF and then scan it back into OCR-to-speech software. I'm sure someone by now has invented the OCR-capable print driver that eliminates the need to print to paper to reach the step of reading scanned paper.
Create a PDF document that has radially-printed text, "The green fox slept and fellated the brown dog." printed in a straight line, then printed in a spiral, and then printed upside down.
Then for Hebrew and Arabic (RTL languages), the same type of sentence... printed in RTL in various configurations.
Then the newsprint column layout, etc. etc. etc.
Point JAWS at the PDF, or use the PDF reader's built-in speech interpretation, and let PDF vendors apply for certified compliance from the accessibility software industry.
Problem solved.
Who writes these idiot gloom-and-doom headlines? I truly hate misleading BS like this!!!!
Remember the Sydney Olympic Games website being non-readable?
Did they learn anything? Nooooo.
And many .gov.au sites still depend on IE6: they are frozen to a defunct standard, with applications standardized around a 17-inch LCD monitor resolution.
The Australian AG's office password-protects and bitmaps nearly all its correspondence to its clients, for the sole reason of making things harder. Brain dead.
And that is before we get to all the very real and stark security holes associated with PDFs and Adobe.
Now some have gone a step further and sharepointed things.
The ANAO (Audit Office) should simply go around and give departments an 'F' for disability considerations and substandard policy setting.
That would just shift the burden from blind people not being able to read the document (which is bad) to even more people becoming blind by reading through grotty XML (which might be considered as worse).
PDF's main goal is to make sure that a document always *looks* the same (if you have eyes that can look). But what's the point of that? Who cares about the precise graphic layout? Most PDFs that we encounter could have served their purpose better by being HTML documents. For gov documents, it's highly unlikely that they contain complex math equations that require careful layout.
Well, now here's a rich story. A story about lack of accessibility...on Slashdot. Surely this site is highly qualified to criticize others.
Shutting down free speech with violence isn't fighting fascism. It IS fascism!
Portable Document Format... Format.
It's not the format that's wrong, it's the users. We in Poland have the same problem with government documents. Those morons write documents in MS Word, then print them, then scan the printed document and embed the scanned image in a PDF. PDF *can* contain and preserve the content as text, with format and layout. The user who chooses to misuse it is the problem.
The authors of the report say as much in their summary:
And while both the article summary and the report itself stress the need to provide alternate formats alongside (or in place of) PDF, the full report is scant on details or comparative tests of other formats. HTML and RTF seem decent options, as they permit some text formatting options (but are not wedded to them) and are platform-independent. But when you start adding graphics to the mix (as sometimes must happen) their portability tanks. They also cannot prevent the same problem that plagues PDFs: when some dipshit just scans a document and spits out an image-only file.
(PS - would it have killed the submitter and editors to link to the main report page, rather than only to a second-hand link from ITNews Australia?)
So basically they are saying that *because* it is possible to produce a shoddy PDF file which is basically an image dump, that this is reason enough not to use the format?
By this same reckoning, you could produce a really shoddy HTML page which also consists of images and no text... Virtually any format could be misused in this way.
So what's the alternative? That we all revert back to ASCII text since it's incapable of holding graphics?
Personally I hate seeing poorly designed websites or PDF files as I described here, where the text is actually an embedded image (or worse - a Flash file) and there is no clickable index etc.
We should probably start naming and shaming pdf creation software, and those who use (or misuse) such tools.
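The shoddy all-images HTML page described above is easy to spot mechanically. A minimal sketch using only Python's standard `html.parser`: the class name `TextAudit` and the helper `audit` are made up for illustration, and the rule "empty alt counts as missing" is an assumption that will over-report purely decorative images.

```python
from html.parser import HTMLParser

class TextAudit(HTMLParser):
    """Counts readable text characters and images lacking alt text."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.images = 0
        self.images_missing_alt = 0
        self._skip = 0  # depth inside <script>/<style>, whose content is not prose

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "img":
            self.images += 1
            alt = dict(attrs).get("alt")
            # Assumption: an absent or blank alt attribute counts as missing.
            if not alt or not alt.strip():
                self.images_missing_alt += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.text_chars += len(data.strip())

def audit(html: str) -> dict:
    """Return simple counts a reviewer could use to flag image-only pages."""
    parser = TextAudit()
    parser.feed(html)
    parser.close()
    return {
        "text_chars": parser.text_chars,
        "images": parser.images,
        "images_missing_alt": parser.images_missing_alt,
    }

scanned_page = '<html><body><img src="page1.png"><img src="page2.png"></body></html>'
print(audit(scanned_page))
```

A page with images, no text, and no alt attributes is the HTML equivalent of the scanned PDF.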
Actually, $(SUBJECT) says it all.
Nevertheless, I'm sick of the heavy pandemic of XMLitis our craft is going through.
______
(( ____ \-^ChTrSrFr
(( _____
((_____
((____ ----
/ /
(_((
Thumbs DOWN!
Seriously why do governments feel they need to do this B.S. It sounds like the Aussie government is just as retarded as my U.S. one. I am sure Austrailians feel better now that adobe is required to design a system whereby blind people can see.
The desire to create a world where everybody has equal access to everything is a pipe dream. Blind people will never have equal access to the world in which we live. The reason has something to do with the fact that they are BLIND!. You see blind people can't see. This is probably the quintessential point to being BLIND. I figured this out on my own without a government salary. I also did it being an U.S.ian, and you all know how stupid we are. So does this mean stupid usians are smarter the aussies?
Seriously when can we get julian assage to investigate the Austrailian Government. I can see that it is probably rife with stupidity and undoubtable has some deep dark secrets.
-P.S. I no i mispelled Australian, but i don't care because i am a greedy usian imperialist, and it would go against my nature to not try and commandeer the english language.
That's the one they choose? It wasn't the gaping security holes, the incessant patch requests (that are never even 6 steps behind the security holes) or the laborious installation/upgrade process? I'm sorry, I know blind people have it tough on the internet, but this is really the dumbest of the reasons I could imagine you would switch away from a nearly universally accepted format.
someone should invent a technology that allows text to be tagged somehow with indicators of the MEANING of that portion of text, like 'this is a title', and let the display device render the text according to how the reader can best view it.
Sorry, but SGML already fell by the wayside because it was "too hard" to implement.
Now all we have are pre-printed flash cards (HTML) that do not allow you to indicate most of the possible meanings of a particular portion of text; it would be nice to have the alphabet (SGML) back.
Or, to put it in computer-nerd terms, it's as if we replaced our Turing-complete language with one that's not Turing-complete.
why not just have the great google automagically OCR any images it finds in PDFs and generate a vision-impaired-friendly version of the PDF?
It then can append a footer to each page stating "the creator of this PDF is a google-certified nimrod".
(I've always found it a bit galling that some paper catalog companies I've dealt with thought it reasonable to create a web presence by posting PDFs with scans of each page of their physical catalog. Good luck searching through that!)
the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff
This sounds like a problem that the text browser should be solving, since it appears that whatever browser is being used incorrectly identifies a PDF file as an image and fails to convert it to readable text. It's not like there aren't FOSS tools to do that. Perhaps the only mandate should be for government files to dispense with fancy formatting so the information is easily read.
The government's role should be to 1) make the information easily read as well as easily available and 2) file a bug report with the browser maintainer.
Vanilla HTML is a much better answer. Let the reader control the format - separate the markup from the content, let the reader control the fonts, how emphasis displays, even link colors. Or move one step forward and use (basic!) CSS. PDF is overweight, slow, seriously buggy, can lock content, and is not available for all platforms. HTML readers are ubiquitous, fast, highly compressible and wide open. Heck, I can display and edit a basic HTML file, formatted nicely according to the HTML, on my 1970's-era 64k 6809 machine using a text-based terminal. Now that is good compatibility! And it didn't take long to write, either. Try supporting PDF in a 64k environment. Good luck.
I have never allowed PDF to be used as any form of outgoing documentation for our products; and I've never regretted that decision.
I've fallen off your lawn, and I can't get up.
There isn't one good reason in the entire world to make sure a document "looks the same everywhere."
What we need is that the document is (1) readable, (2) orderly, and (3) conforms to the reader's needs.
When you have someone with poor vision, you don't want some tiny font used for anything, and zooming the page blows the context right out the window. The reader needs to be able to set the font, and the color(s), and the link colors, if any, and the document width, and quite a few other things.
PDF is unfriendly and the very idea that the author has to set the absolute look of the document reeks of elitism, misplaced "artistic" intent at the expense of readability and usability.
And then there is editing -- a document you can't edit and/or annotate is crippled -- and PDF encourages this unfriendly behavior.
The ideal solution at this point in time is, has been, and is likely to remain, HTML, which resolves every one of those critical problems.
It's hard to read when you're wasted, right Australia?
Isn't the web a visual medium and not suited to the blind? What is going to happen to audio files for the deaf? What about big words for those without education? This is a very precarious and slippery slope. Once you bend over backwards for one minority you have set a precedent. Don't think that this won't be used to help destroy the net as we know it today. It won't be too long before your blog won't be allowed until it has been thoroughly scanned by software to determine its PC friendliness. Your birdwatching blog won't be allowed as it isn't available to the blind, or maybe your heavy metal appreciation blog either as it isn't deaf friendly. If you haven't worked out that Net Neutrality isn't what you should be worried about then you aren't paying attention. It's the PC police you need to worry about.
The new right fascists are bilingual. They speak English and Bullshit.