Slashdot Mirror


Aussie Government Gives PDF the Thumbs Down

littlekorea writes "The central IT office of the Australian Government has advised its agencies to offer alternatives to Adobe's Portable Document Format to ensure folks with impaired vision are able to consume information on the Web. A Government-funded study found that PDFs can present themselves as image-only files to screen readers, rendering the information contained within them unreadable for the vision impaired."

179 comments

  1. A subset of PDF files? by 0olong · · Score: 1

    Couldn't they have just required that the text portions of PDF files are actually text?

    1. Re:A subset of PDF files? by sjames · · Score: 4, Insightful

      Given the number of times government officials around the world have failed to understand the difference between removing text in a PDF and replacing it with black and just covering the text over with black, they'd probably get it wrong about half the time even with best intentions.

    2. Re:A subset of PDF files? by Yvan256 · · Score: 0

      Yeah, that's like asking for alternatives to websites because the text could be inside a graphic file.

    3. Re:A subset of PDF files? by c0lo · · Score: 3, Insightful

      Yes, but what easier way for a bureaucrat than: printing the document, inserting it into a scanner (err.. document center) and ... voila, job done.

      Learn how to operate another program? Spend from the budget for another set of licenses? (the horror)... start to use Open Office or the like?

      --
      Questions raise, answers kill. Raise questions to stay alive.
    4. Re:A subset of PDF files? by wiredlogic · · Score: 4, Informative

      ISO already has created the standardized PDF/X subsets used widely in the publishing industry. They lack support for extra features like scripting and other extensions.

      The main problem with PDF for document archives is that it is a presentation format and doesn't adequately preserve text structure since everything is broken down into lines of text or individually placed glyphs. Analysis of a page layout can only bring back so much. There are better ways to store data that offer more versatility.

      --
      I am becoming gerund, destroyer of verbs.
    5. Re:A subset of PDF files? by Anonymous Coward · · Score: 1, Informative

      Adobe does not use the operating system functions to render the text, I guess that's the root of the problem.

    6. Re:A subset of PDF files? by davester666 · · Score: 2, Funny

      XML to the rescue!

      --
      Sleep your way to a whiter smile...date a dentist!
    7. Re:A subset of PDF files? by Anonymous Coward · · Score: 0

      This is the dumbest government ever. If there's text, there is TEXT. It's about people who MAKE the content.

    8. Re:A subset of PDF files? by Wingit · · Score: 1

      Yes! I don't always want the slow rendering of a pdf. I want content. One format of convenience for the publisher should work for those of us that want the content. I avoid pdf files on a regular basis because a simple html file will work for many things. The overhead and long pause for a less-than-usable pdf is silly much of the time.

      --
      We win together or suffer without.
    9. Re:A subset of PDF files? by martin-boundary · · Score: 1

      Nope, it's more like telling your webmasters not to put graphical text banners on the company website.

    10. Re:A subset of PDF files? by Kizor · · Score: 3, Insightful

      I expect they could require that all they wanted, and it still wouldn't happen.

      If my usability manuals are to be believed, people have neglected the safeties of nuclear reactors because those things are a chore and do nothing anyway. If you don't want your users to do something, then you design your system so that they never get the option.

    11. Re:A subset of PDF files? by PhunkySchtuff · · Score: 1

      There is also the PDF/A standard, which is designed for exactly this purpose. It's a subset of the PDF spec for long term archiving of documents and it disallows a lot of things like scripting, similar to PDF/X.

      http://en.wikipedia.org/wiki/PDF/A

    12. Re:A subset of PDF files? by TCDown · · Score: 2, Insightful

      I don't understand the comparison between websites and PDFs. Graphical text banners, or images that contain text, are perfectly acceptable under WCAG, as long as alt text or long descriptions are used correctly. And if a PDF is correctly created then text can easily be read by a screen reader.

    13. Re:A subset of PDF files? by Bert64 · · Score: 1

      The problem is as usual, one of incompetence and ignorance.

      Incompetent users create websites without appropriate alt tags, and those same users create PDF files which are also incorrectly created...

      Ignorant users then view these files and don't notice, or don't care, that they have not been created correctly.

      Because only a very small minority of users actually do bother to check, they simply get ignored.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    14. Re:A subset of PDF files? by sourcerror · · Score: 1, Informative

      Last time I checked Adobe reader had built-in OCR and text-to-speech even in the free Acrobat Reader. The IT director was just plain lazy, or there's some lobbying.

    15. Re:A subset of PDF files? by node_chomsky · · Score: 1

      Since when did anyone described as a 'government official' find a graceful solution to a problem?

    16. Re:A subset of PDF files? by Anonymous Coward · · Score: 0

      Last time I checked Adobe reader had built-in OCR and text-to-speech even in the free Acrobat Reader. The IT director was just plain lazy, or there's some lobbying.

      Who exactly is to use the OCR or text-to-speech? The bureaucrat (/IT director) or the vision-impaired person?

    17. Re:A subset of PDF files? by SanityInAnarchy · · Score: 2, Insightful

      No, that would be analogous to allowing PDF, but requiring the text portions actually be text.

      And that would actually be reasonable.

      --
      Don't thank God, thank a doctor!
    18. Re:A subset of PDF files? by SanityInAnarchy · · Score: 1

      Which is not at all a reason to not allow PDFs.

      After all, if people aren't going to follow the rules anyway, what makes them think banning PDFs will prevent government agencies from using PDFs? If the rules will actually be enforced, why not simply add a rule that the PDF in question be accessible?

      --
      Don't thank God, thank a doctor!
    19. Re:A subset of PDF files? by tixxit · · Score: 2, Insightful

      Working as a web developer for the Canadian gov't, we had some similar rules for content. Mainly, you always had to provide it in the most accessible form possible. This usually meant HTML > PDF > Office Document. However, it was always on a best effort/convenience basis. So, if you posted PowerPoint slides, you also had to post the PDF versions, since making a PDF version was dead simple. However, we certainly weren't required to go all out and make a usable HTML version as well.

      We also offered many things (eg. transcription or translation) on an "as requested" basis, since technically we were supposed to offer them, but realistically we didn't have the budget to do it for everything. This worked well.

      I think just flat out banning PDFs is stupid. Require accessibility (best-effort), but allow for wiggle room. Yeah, it would be great if all PDFs had real text in them, but if the choice for some gov't agency is to either post an inaccessible version of the document or post nothing at all (because the time/cost required to make it accessible is too high), then they should be able to post the inaccessible version.

    20. Re:A subset of PDF files? by wvmarle · · Score: 1

      Is that a problem of the pdf format or a problem of one specific pdf reader?

    21. Re:A subset of PDF files? by u17 · · Score: 1

      You might think this is funny, but I encourage you to try out strings file.pdf | less on a couple of pdf files. Turns out there actually is xml embedded inside some pdf files:

      <</Subtype/XML/Length 3643/Type/Metadata>>stream
      <?xpacket begin="
      " id="W5M0MpCehiHzreSzNTczkc9d"?>
      <x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 4.0-c316 44.253921, Sun Oct 01 2006 17:08:23">
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about=""
      xmlns:xap="http://ns.adobe.com/xap/1.0/">
      [...]
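
      A minimal Python sketch of the same check, for anyone without strings handy. It assumes the metadata stream is stored uncompressed (a compressed stream would not show up in a raw byte scan), and the function name and sample bytes are made up for illustration:

```python
import re

# An XMP packet is wrapped in <?xpacket begin=...?> ... <?xpacket end=...?> markers.
XMP_PACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)


def find_xmp_packets(pdf_bytes):
    """Return the body of each uncompressed XMP packet found in the raw bytes."""
    return [m.group(1).strip() for m in XMP_PACKET.finditer(pdf_bytes)]


# Hypothetical, heavily shortened PDF fragment used only to exercise the function.
sample = (
    b"%PDF-1.4\n1 0 obj\n<</Subtype/XML/Type/Metadata>>stream\n"
    b'<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/"></x:xmpmeta>\n'
    b'<?xpacket end="w"?>\nendstream\nendobj\n'
)
packets = find_xmp_packets(sample)
```

      On a real file you would pass in open("file.pdf", "rb").read() instead of the sample bytes.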

    22. Re:A subset of PDF files? by juasko · · Score: 0

      The problem is the Microsoft way of displaying fonts.

      I don't like the way Adobe normally has its software made. I can't stand Flash or their Reader, nor is the CS suite any good. But on this issue they do something good.

      You might not like how Apple displays text on its screens; it feels blurry, maybe. However, the fonts are far more correct and give you close to WYSIWYG. Here the Adobe Reader actually portrays text well, on a Windows machine too.

      The Apple way I prefer, but it will not be good until screens have a ppi similar to the Apple iPhone 4. Today's 126 ppi is way too low.

      But for the screens to get better, we need a resolution-independent OS. Not sure who is closest to making one, Apple with iOS? But if screen resolution got closer to 300 ppi, then the Apple way of displaying fonts would be really great. And Microsoft could then easily adopt the same algorithms, as they are simpler than the ones Microsoft uses, but still true to the shape of the characters, which Microsoft's way of displaying text isn't.

      Adobe makes a good compromise here: close to real WYSIWYG, but still quite sharp text.

    23. Re:A subset of PDF files? by Anonymous Coward · · Score: 0

      FTFY:

      Require safety procedures, but allow for wiggle room. Yeah, it would be great if all drilling rigs followed the safety procedures, but if the choice for some company is to either drill in the Gulf without the safety procedures or don't drill for that oil (because the time/cost required to do it safely is too high), then they should be able to drill in the Gulf without safety procedures.

      Now do you see the fallacy in your argument?

      If you "can't afford" to do it safely then we're not going to let you do it. If something actually needs to be done, then there is always the time & cost to do it safely/accessibly; you just have to accept that something else which is less important won't get done at all.

    24. Re:A subset of PDF files? by sribe · · Score: 1

      There are better ways to store data that offer more versatility.

      True. But few ways that offer absolute fidelity to an original paper document. And sometimes, thanks to opportunistic lawyers, an absolutely accurate rendering is more important than text structure. Try defending a medical malpractice lawsuit, and claiming that "well the text most certainly is the same and has not been tampered with, even though the document in our system looks a little bit different than the one in your hand, so it really is the same document"...

      Sad, but true.

    25. Re:A subset of PDF files? by bjourne · · Score: 3, Informative

      To make the documents accessible, they will need to create them in such a way that the screen reader can read the text for the blind person. Believe it or not, extracting the text contents from a pdf file is actually a very non-trivial problem. Mostly the problems are caused by pdf authoring tools that render each glyph separately. The text extractor then has no idea about which characters belong to each line and has to guess based on the baseline of the character. Another problem is non-ascii characters and how the authoring tool decides to render them. The venerable free software tool pdflatex uses composite characters (basically it renders multiple glyphs on top of each other) which makes it impossible to accurately extract the text.

      So no, it is not about stupidity or bad Microsoft software. PDF just is unsuitable for accessible documents.
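
      To give a feel for the baseline guessing, here is a rough Python sketch. It is not how any particular extractor works; the glyph tuples, the function name and the tolerance value are all assumptions made for illustration:

```python
def group_into_lines(glyphs, tolerance=2.0):
    """Group (x, baseline_y, char) glyphs into lines of text.

    Glyphs whose baseline lies within `tolerance` units of the first
    glyph of the current line are treated as belonging to that line.
    """
    lines = []  # each entry: (baseline of first glyph, list of glyphs)
    for glyph in sorted(glyphs, key=lambda g: g[1]):
        if lines and abs(glyph[1] - lines[-1][0]) <= tolerance:
            lines[-1][1].append(glyph)
        else:
            lines.append((glyph[1], [glyph]))
    # Within a line, read glyphs left to right by x position.
    return ["".join(ch for _, _, ch in sorted(gs)) for _, gs in lines]


# Hypothetical glyphs, individually placed and out of order, with slightly
# jittered baselines as an authoring tool might emit them.
glyphs = [(20, 100.4, "i"), (10, 100.0, "H"), (10, 115.0, "o"), (20, 114.6, "k")]
lines = group_into_lines(glyphs)
```

      Even this toy version has to pick a tolerance out of thin air, and it says nothing about superscripts, rotated text or stacked composite glyphs, which is where the guessing goes wrong.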

    26. Re:A subset of PDF files? by N+Monkey · · Score: 1

      Last time I checked Adobe reader had built-in OCR and text-to-speech even in the free Acrobat Reader. The IT director was just plain lazy, or there's some lobbying.

      Built-in OCR? I can see the "read out loud" option in version 8.2.5 but I'll be damned if I can see anything like OCR.

      The only "free" (note quotes) OCR package I've ever got to work reliably is the one that is built-in to Microsoft's "Document Imaging" (.mdi) application.

      [disclaimer]It's "free" if you already have access to Microsoft Office. I don't think it's widely known that there is built-in OCR functionality. Now if only there was a free mdi to pdf converter that keeps the ocr information. (sigh) [/disclaimer]

    27. Re:A subset of PDF files? by Inda · · Score: 1

      You need the pro version, not the vanilla reader.

      And the OCR engine is the worst I've ever used. It takes me back to the days of 1998.

      --
      This post contains benzene, nitrosamines, formaldehyde and hydrogen cyanide.
    28. Re:A subset of PDF files? by tixxit · · Score: 1

      That is a silly analogy. In your case, we are talking about a VERY high cost for not following the safety procedures. In mine, we are talking about, at worst, a no-cost scenario, where a disabled user cannot access a document that would not have been available to him otherwise. He has lost nothing, but the vast majority of users have gained.

      Let me offer you an analogy. Someone offers you a lottery ticket for free. If you take it and lose the lottery, you are exactly where you started. If you win, you'll have a bunch of money. There is NO reason not to take the lottery ticket. The cost to you in either case is 0, but one (improbable) case has a very high payout.

      Now, suppose this same guy hands you a revolver w/ 1 bullet and says that if you survive, you'll get $1000000. However, if you lose you die. Even though there is a high chance you'll win, there is still a case where you may pay a VERY high cost. Suddenly you are in the position where you have to make a risk assessment. If you value your life, then the cost will probably be too high to take the chance.

    29. Re:A subset of PDF files? by fyngyrz · · Score: 1

      Of course, if you had simply used HTML as your only target, no one anywhere would have a problem, and the job would be enormously simplified. You should be asking yourself, "Why is PDF even used?" And the answer is, "No good reason." All PDF does is lock the display format and offer the opportunity to make the document read-only and less accessible, all of which are *entirely* bad things.

      --
      I've fallen off your lawn, and I can't get up.
    30. Re:A subset of PDF files? by fyngyrz · · Score: 1

      So use paper. Don't expose the rest of us to a shitty format in order to excuse further legal excess. I find my sympathy level for doctors, litigants, judges and lawyers is approximately equal: zero. That green stuff you see when you look up? That's pond scum.

      --
      I've fallen off your lawn, and I can't get up.
    31. Re:A subset of PDF files? by sribe · · Score: 1

      So use paper.

      Oh yeah, that's a most excellent way to store medical records. For sure. Thanks for the helpful suggestion. Not.

    32. Re:A subset of PDF files? by fyngyrz · · Score: 1

      Look, medical records should be fields in a database. Not on paper, not in PDF, not in text or HTML files. If there is text, it should be in a text or HTML field. Images embedded in the database should be JPEG or lossless PNG. If you don't put the medical records in a database, you have enormously compromised their primary utility: to be employed for the health of the patient (and others with similar problems.) To insist that PDF is required for medical records is to insist that things be the absolute least functional they could be. Don't go there. If you need a legal document, you print something from the database. If that's not acceptable, change the rules -- don't screw up the database to accommodate litigation, for crying out loud. That's a case of putting the cart miles before the horse. Medicine is for the patients. Not the lawyers.

      --
      I've fallen off your lawn, and I can't get up.
    33. Re:A subset of PDF files? by sribe · · Score: 1

      To insist that PDF is required for medical records is to insist that things be the absolute least functional they could be. Don't go there.

      Wow, just wow.

      Yes, much of the data should be fields in a database.

      However, there is this notion of a "report" from a "specialist" which might gather together much information, some of it data, some of it image, and some of it narrative, in order to be sent to a "referring physician", either "primary care" in many cases, or even to several different "specialties" in more complex cases. Such a "report" may contain, in addition to the data and/or narrative description of the particulars of the patient condition, explanations of prognosis in the patient's specific case, recommendations regarding tolerance for treatments, description of social/familial situation that may support or limit the types of care that would be appropriate...

      So no, you can't stuff everything into a structured field. Yes, you can stuff it into a bunch of text fields. But for some things that is far less useful to the providers than a well-structured and well-formatted letter. And yes, if you send it to the patient's other doctors, you damn well better be able to produce that exact document later. And no, store it on paper is not an acceptable answer.

      If you need a legal document, you print something from the database. If that's not acceptable, change the rules.

      I don't know what country/planet you're in, but here in the U.S. that answer is laugh-out-loud silly. I wish.

    34. Re:A subset of PDF files? by tixxit · · Score: 1

      1) 95% of the people in the agency couldn't write HTML.
      2) The majority of stuff we posted was not made by us.
      3) A large number of documents were made before personal computers existed and are scanned in.
      4) HTML does not make a good presentation format. Especially for the presenter who doesn't know that a world exists beyond what is on his desktop.
      5) Read-only isn't necessarily bad, especially in regards to #2. Keep in mind this content also includes video and audio (although a large number of these are transcribed).
      6) MS Word and other office applications are easy to use and effective.
      7) Most of these documents have to be shared/edited amongst a dozen or more people (many who are not gov't employees). Working in Word or something similar is a necessity. Though some folks have started using Google Docs, there are all sorts of issues in Canada with having certain kinds of information on servers outside Canada.
      8) Having a web developer convert all your documents costs a LOT of money.

      All that said, a surprisingly large amount of content is in HTML or other highly accessible formats (they do take this very seriously). However, if we are talking about a dry, 100 page, outdated and obsolete report from 80 years ago that no one reads, then no, probably not worth it to worry about.

    35. Re:A subset of PDF files? by peterbye · · Score: 1

      You should be asking yourself, "Why is PDF even used?" And the answer is, "No good reason."

      And my answer is, "No good reason."
      Fixed that for you

    36. Re:A subset of PDF files? by uninformedLuddite · · Score: 1

      How's the weather up there?

      --
      The new right fascists are bilingual. They speak English and Bullshit.
    37. Re:A subset of PDF files? by bandmassa · · Score: 1

      The same numpties who rule that Australian Govt services and web sites shall only support Windows and MSIE, so when you have a Mac at home you HAVE to run a Bootcamp partition simply to run the most recent version of MSIE in order to register for certain government services.

      It stems from agreements signed by John Howard and Bill Gates in the late 90s.

      --
      "I hope you like Guinness, Sir. I find it a refreshing substitute for, er... food." Col. Jack O'Neil, SG-1
    38. Re:A subset of PDF files? by icebraining · · Score: 1

      1) They can't use Word or OOWriter? Because both can save as HTML. And there are many more WYSIWYG editors for HTML out there.
      2 & 3) You should put notices on the pages of such documents to request people's help to re-type and convert them to HTML. The Australian national library has had immense success with crowdsourcing.
      4) Are you kidding? The same HTML file can be used on anything from a mobile phone to a projector. Automatic reflowing on text zoom, dynamic CSS rules, it's much better than PDF. There's a reason why plenty of people prefer ePub to PDF for ereaders.
      5) If I really want the document, it won't work. If I can see it, I can copy it. Copying and editing a few words in an image isn't difficult at all, while reading an image in JAWS is impossible.
      6) They can export to HTML
      7) You can work in Word and then export to HTML.
      8) You don't need a web developer

  2. southern hemisphere note! by Tumbleweed · · Score: 3, Funny

    A thumbs down in the southern hemisphere is the same as a thumbs up in the northern hemisphere, as long as you name the file bruce.pdf. It saves confusion.

    1. Re:southern hemisphere note! by Frosty+Piss · · Score: 1

      as long as you name the file bruce.pdf. It saves confusion

      This one?

      --
      If you want news from today, you have to come back tomorrow.
    2. Re:southern hemisphere note! by ForresterInc · · Score: 1

      It's a Monty Python reference... http://www.youtube.com/watch?v=_f_p0CgPeyA

  3. So can any format by AvitarX · · Score: 1, Troll

    So can a webpage, or a word document.

    I suppose a pure text file cannot, but at the expense of other meta-data. Why not require PDFs to have word position OCR done (part of Acrobat Pro, so hardly a chore), and keep info like page number and position on page for scans. For non-scans it would take effort to destroy the text data.

    Hell, even in ASCII I could use something like figlet to generate large letters (for easy reading), and destroy accessibility.

    This sounds like a bozo official had a scanned hard-copy in PDF, ran into trouble, and blamed the format (even though it would offer a good built-in way to handle the situation) rather than the other bozo that scanned it and didn't use the built-in OCR function. I'm pretty sure these people would do the same with HTML, OOXML or ODF; it's not the format's fault.

    --
    Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    1. Re:So can any format by nedlohs · · Score: 3, Informative

      No it doesn't sound like a bozo official since that style of pdf was specifically excluded from the user study they ran.

      You could of course skim the report and know that, but I guess that would mean you couldn't launch into meaningless rants.

      Of course if you did that you'd know the report is available in PDF format which I guess would just launch you on a different meaningless rant.

    2. Re:So can any format by Barny · · Score: 3, Informative

      You do know that in Australia it is law that a company make their website accessible for vision impaired if at all possible.

      --
      ...
      /me sighs
    3. Re:So can any format by AvitarX · · Score: 1

      OK, so the report is not bad, even though the summary and article made it sound like that. Additionally, the number one problem is bozos setting up files, number two is screen reader software, and number three is PEBKAC, according to the study (from the summary):

      Importantly, the Study also highlighted that the issues contributing to the inaccessibility of PDF files, when used with assistive technologies, are not in general directly attributable to the Portable Document Format itself. The issues that result in an inaccessible PDF file are, in order of impact:
        the design of the PDF file by the document author to incorporate the correct presentation, structure, tags and elements that maximise accessibility;
        the technical ability of the assistive technology to interact with the PDF file (via the relevant PDF Reader); and
        the skill of the user and their familiarity with using their assistive technology to interact with a PDF file.

      The article specifically mentions the scanned image issue (well, it calls it image-only, but that's the only time I've seen that). I would argue (as I did) that for scanned documents PDF is a good format, with Acrobat offering tools to OCR somewhat (it's pretty good in my experience) while keeping the image intact. I don't know what the wonderful alternative is for providing things that have no value until scanned (signed ordinances in my city, for example, which also has unsigned copies as PDFs generated from the native application).

      I will go as far as to say screen reader software, and not Adobe or PDF, is at fault for bullet two (and, as the article notes, the issues at the very least in general are not the fault of the format). Maybe my skimming was inferior to yours, and you are correct: the study does not appear to be rooted in the bozodom that the summary and linked article implied.

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    4. Re:So can any format by AvitarX · · Score: 1

      Are the PDFs on the website? It sounds like efforts should be made to make them accessible either way. It's not the format's fault that people are making image-only PDFs without OCR text. The format itself makes an accessible scanned document easier than others that I know of off the top of my head (in HTML, for example, one would need to write an application that OCRd the scanned image and added it as an alt property to the image; if one were to use a Word document for such a purpose I don't even know that it is possible).

      --
      Wow, sent an e-mail as suggested when clicking on "use classic" banner, and got a fast response that addressed my msg
    5. Re:So can any format by afidel · · Score: 1

      It will be soon in the US as well; the new ADA rules for websites go into effect March 15th 2011. Of course the ADA website has most everything rules-related in both HTML and PDF format, so they obviously don't have a problem with PDF =)

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    6. Re:So can any format by vegiVamp · · Score: 1

      The report saying PDF is no good is provided in PDF ?

      *snort*

      --
      What a depressingly stupid machine.
  4. Plain text? by inflex · · Score: 2, Interesting

    Other than plain text, are there really many alternatives that don't bring some level of difficulty? The only other options I can see out there at the moment are ePub, simplified HTML or RTF, but of course they all fall short of the possibly desired 'fancy formatting'.

    As someone will likely also mention, why not just mandate that the PDF contents are actually text, as opposed to images (which is annoying to anyone!).

    1. Re:Plain text? by Anonymous Coward · · Score: 0

      As someone will likely also mention, why not just mandate that the PDF contents are actually text, as opposed to images (which is annoying to anyone!).

      I work at a government agency where the people in charge of the effort to scan millions of pages of records don't even know what OCR is. Every single damn document in the whole system is just a PDF JPG. It's fucking awesome looking through two or three million records for something with no ability to search text.

  5. Throwing out the baby with the bath water by arivanov · · Score: 2, Interesting

    That is the case with badly done PDFs where pages are rendered as images. PDFs done via the Office plugin or OpenOffice or any other proper authoring package at the default settings have the text present and the fonts embedded instead, so they should work fine as far as accessibility goes.

    How about enforcing some computer literacy on document publishers instead?

    --
    Baker's Law: Misery no longer loves company. Nowadays it insists on it
    http://www.sigsegv.cx/
    1. Re:Throwing out the baby with the bath water by robbak · · Score: 4, Informative

      Not necessarily. PDF does not preserve text flow. It breaks up paragraphs into lines (or less if kerning has been altered), and places them accurately on the page. If you have a multi-column layout, then a pdf-to-text algorithm (first step in screen reading) is likely to put column-2-line-1 between column-1-lines-{1 and 2}. Best of luck sorting that out.
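
      A toy Python illustration of that interleaving, under the assumption that the extractor only sees positioned line fragments. The coordinates, function names and the column split position are all invented for the example:

```python
# Each fragment is (x, y, text); y grows downward. Hypothetical two-column page.
fragments = [
    (50, 100, "col1 line1"),
    (300, 100, "col2 line1"),
    (50, 120, "col1 line2"),
    (300, 120, "col2 line2"),
]


def naive_order(frags):
    """Read strictly top to bottom, then left to right: interleaves columns."""
    return [text for _, _, text in sorted(frags, key=lambda f: (f[1], f[0]))]


def column_order(frags, split_x):
    """Read the left column fully, then the right one, given a known split."""
    left = sorted((f for f in frags if f[0] < split_x), key=lambda f: f[1])
    right = sorted((f for f in frags if f[0] >= split_x), key=lambda f: f[1])
    return [text for _, _, text in left + right]


naive = naive_order(fragments)
fixed = column_order(fragments, split_x=200)
```

      The second function only works because split_x is handed to it. Working out where the columns actually are, on an arbitrary page, is the hard part.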

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    2. Re:Throwing out the baby with the bath water by peppepz · · Score: 4, Informative

      Not necessarily. PDF does not preserve text flow. It breaks up paragraphs into lines (or less if kerning has been altered), and places them accurately on the page.

      This is not true. PDF is capable of preserving text flow if the document contains such information. See this as an example: if you open it in acrobat reader and move the text cursor using the down arrow, you'll see it travel correctly among columns and paragraphs.
      No page description format will help if the page has been generated in a broken way: for instance, try extracting text from the tables of an html page generated by javascript.

      If you have a multi-column layout, then a pdf-to-text algorithm (first step in screen reading) is likely to put column-2-line-1 between column-1-lines-{1 and 2}. Best of luck sorting that out.

      In this case it is the pdf-to-text algorithm to be broken, and should be fixed.

    3. Re:Throwing out the baby with the bath water by Taxman415a · · Score: 3, Insightful

      This is not true. PDF is capable of preserving text flow if the document contains such information.

      Yes, this can be done, but it is almost universally not done. Of all the pdfs out there, almost all of them that have anything but single column text flow incorrectly. The answer is of course to include this information every time, but I don't see how you can mandate that if the standard doesn't include it and most or all current software creates pdfs that don't have it.

      If you have a multi-column layout, then a pdf-to-text algorithm (first step in screen reading) is likely to put column-2-line-1 between column-1-lines-{1 and 2}. Best of luck sorting that out.

      In this case it is the pdf-to-text algorithm to be broken, and should be fixed.

      I'm not sure that you can always figure out the text flow correctly a posteriori. Once the correct text flow information hasn't been encoded in the document, it's a bit of a crap shoot in some cases to figure out what was intended. Where should that floating box go? Many pdfs have text flow broken up so badly that they appear to read randomly. A few bits from one sentence, then a few words or parts from the middle of another paragraph. Literally the best option for some pdfs is to export them as images and import those to an ocr program.

    4. Re:Throwing out the baby with the bath water by Anonymous Coward · · Score: 0

      Why is parent marked as flamebait?

    5. Re:Throwing out the baby with the bath water by StuartHankins · · Score: 1

      PDF supports streams, which -- in the context of text as opposed to audiovisual or other binary streams -- can be individual lines of text or entire paragraphs / columns / pages. The fact that a stream is usually a line of content is a problem in the PDF generation software, not the format per se.

    6. Re:Throwing out the baby with the bath water by clone52431 · · Score: 1

      PDF is capable of preserving text flow if the document contains such information. ... move the text cursor using the down arrow, you'll see it travel correctly among columns and paragraphs.

      Minor nit-pick here, but that’s not really the complete preservation of text flow. It’s just the positioning of the lines of text within the document. Yes, the relative position of each word between its neighbouring words is preserved, but paragraphs are not preserved.

      I call it a minor nit-pick because paragraphs don’t really affect the way a screen-reader scans the text.

      However it does mean that (for example) there’s no way to jump to the 5th paragraph of the document. The information isn’t there. You could try to re-create the paragraph structure by judging the absolute position of each line relative to its neighbours, but that’s a hackish work-around.
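
      For what it's worth, the hackish work-around might look something like the sketch below. It assumes the lines arrive as (y, text) pairs sorted top to bottom and that the normal line height is known; `group_paragraphs`, `line_height`, and `gap_factor` are illustrative names, not part of any PDF toolkit.

```python
def group_paragraphs(lines, line_height, gap_factor=1.5):
    """Guess paragraph breaks from vertical gaps between consecutive lines.

    lines: list of (y, text) pairs sorted top to bottom, y in points.
    A gap larger than gap_factor * line_height starts a new paragraph.
    """
    paragraphs = []
    prev_y = None
    for y, text in lines:
        if prev_y is None or (y - prev_y) > gap_factor * line_height:
            paragraphs.append([text])      # start a new paragraph
        else:
            paragraphs[-1].append(text)    # continue the current one
        prev_y = y
    return [" ".join(p) for p in paragraphs]


# Hypothetical positions: 14pt leading, with one 28pt gap.
lines = [
    (100, "First paragraph,"),
    (114, "second line."),
    (142, "Second paragraph."),
]

print(group_paragraphs(lines, line_height=14))
```

      It guesses wrong as soon as a document marks paragraphs with indents instead of extra spacing, which is why it stays a hack.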

      --
      Distributed Denial of APK: It takes 15 seconds to reply to him anonymously, but wastes tons of his time if we all do it.
    7. Re:Throwing out the baby with the bath water by peppepz · · Score: 1

      paragraphs are not preserved.

      Yes, if you want you can preserve those and much more. See for yourself, there are standard tags to describe most of the document structure.

    8. Re:Throwing out the baby with the bath water by Anonymous Coward · · Score: 0

      Open that PDF document with Okular or Evince and you will notice that the text flow does not work.

      Okular is widely used as a replacement for the insecure Adobe Reader. For example, German embassies all around the world use it.

      Evince does a better job, but it has problems as well. With some columns, all of them get selected at once.

      But still there is a problem.

  6. Really? by Anonymous+Squonk · · Score: 0

    As there are a bunch of tools that can convert a PDF to an audio file, I find it hard to believe that there are no screen readers that can handle a PDF file.

    1. Re:Really? by Anonymous Coward · · Score: 0

      Perhaps if you actually read the article you would understand. The problem is not reading PDF files in general; it is reading ones that have been produced as image-only. Please point to which of your tools will read a picture to the person.

    2. Re:Really? by robbak · · Score: 3, Interesting

      Also consider pdfs with complex page layouts. Deciphering the text flow from them is often hard for eyeballs, let alone computers.
      2 columns is enough to throw out many screen readers.

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    3. Re:Really? by Anonymous Coward · · Score: 0

      The PDF format has the ability to store the logical structure of the document. Called a "Tagged PDF" it contains tags specifying the structure of the document eg paragraphs, headings, tables, columns, lists etc. It allows the document to be reflowed to fit different size screens, and better support accessibility software. It may be the case that text to speech software does not have good support for tagged PDF or that most of the PDFs on the Internet are untagged. But the PDF format itself is not the problem.
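
      A crude way to see whether a given file claims to be tagged is to look for the two catalog entries that Tagged PDF relies on, `/StructTreeRoot` and a `/MarkInfo` dictionary with `/Marked true`. The sketch below just scans raw bytes, so treat it as a heuristic: it assumes the catalog is stored uncompressed with `/MarkInfo` written inline, and it will miss files that keep these objects in compressed object streams. `looks_tagged` is a made-up name.

```python
import re


def looks_tagged(pdf_bytes: bytes) -> bool:
    """Heuristic check for the markers of a Tagged PDF in raw file bytes."""
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    marked = re.search(
        rb"/MarkInfo\s*<<[^>]*/Marked\s+true", pdf_bytes
    ) is not None
    return has_struct_tree and marked


# Hypothetical catalog fragments, not taken from real files.
tagged = b"<< /Type /Catalog /MarkInfo << /Marked true >> /StructTreeRoot 5 0 R >>"
untagged = b"<< /Type /Catalog /Pages 1 0 R >>"

print(looks_tagged(tagged), looks_tagged(untagged))
```

      A proper check would parse the document catalog with a real PDF library instead of pattern matching. Even a positive result only says the tags exist, not that they are any good.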

    4. Re:Really? by Barny · · Score: 2, Insightful

      Yes it is, these shouldn't be features, it should be simple for a text-speech program to follow without having some tacked on standard that you now have to expect everyone to follow.

      The layout should complement the data, not vice versa. If you have to think for one second "will my document be accessible to the vision impaired?" then that is one second more than it should be. If you type three columns of text in a continuous flow, it should be possible to read it back as such without having to go over it later and mark it up.

      --
      ...
      /me sighs
  7. OCR by Anonymous Coward · · Score: 0

    Morons. Regardless of the destination file format, if you scan something it will be inaccessible unless some special processing is done (OCR).

    Acrobat sucks, but PDF is actually a decent format (if you have a decent reader)
    And it helps if the PDF authors aren't incompetent.
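
    The difference between a scanned page and a real text page shows up in the page's content stream: a scan typically just paints an image, while a text page uses text-showing operators (`Tj`, `TJ`, `'`, `"`). Here is a rough sketch of a check, assuming the stream has already been decompressed; `shows_text` and the sample streams are made up for illustration.

```python
import re

# PDF operators that paint text onto the page.
TEXT_SHOW_OPS = {b"Tj", b"TJ", b"'", b'"'}


def shows_text(content_stream: bytes) -> bool:
    """Return True if a decoded content stream uses a text-showing operator."""
    # Drop literal strings "(...)" so their contents are not read as operators.
    # Nested parentheses are not handled; this is only a sketch.
    without_strings = re.sub(
        rb"\((?:\\.|[^\\()])*\)", b" ", content_stream, flags=re.S
    )
    # Names such as /F1 keep their leading slash and so never match an operator.
    tokens = re.findall(rb"/?[^\s\[\]<>/]+", without_strings)
    return any(tok in TEXT_SHOW_OPS for tok in tokens)


scanned_page = b"q 612 0 0 792 0 0 cm /Im0 Do Q"
typed_page = b"BT /F1 12 Tf 72 720 Td (Hello world) Tj ET"

print(shows_text(scanned_page), shows_text(typed_page))
```

    A scan that has been through OCR usually carries an invisible text layer, so it would pass this check, which is the "special processing" making it accessible.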

  8. This is ridiculous. by palegray.net · · Score: 0, Troll

    The file format is not to blame. Morons who scan text-based documents into PDF files, saving each page as an image are to blame. Even in 1995 or so, when I was first exposed to OCR technology, it worked "fairly well." Anyone converting text to PDF by scanning pages in as images these days is a complete moron, and a huge variety of applications now support exporting text-based documents directly to PDF format with full text search and indexing capabilities intact, along with fancy formatting like gasp italics, bold script, superscript, subscript, numbers, fairly complex mathematical expressions, etc. Hell, images can even be embedded in PDF docs that are largely textual content (holy wow, the technology!), along with alternate text and hyperlinks. In other words, "WTFMATE."

    1. Re:This is ridiculous. by nedlohs · · Score: 1

      Why not try looking at the study before jumping to your conclusion?

    2. Re:This is ridiculous. by palegray.net · · Score: 1

      Why are you assuming I didn't review the study? I did, and again, the conclusions are deeply flawed. The appropriate course of action would be to instantiate improved policies for the production of documents that appear in PDF format for general consumption. Once again, the file format itself is not the problem.

    3. Re:This is ridiculous. by nedlohs · · Score: 1

      Because the case you stated was the one they explicitly excluded, so either you didn't review it or you are just trying to confuse things on purpose.

  9. Security by atomicstrawberry · · Score: 1

    Possibly not a bad thing given the vast number of security flaws and exploits that PDF has been hit with, especially over the last few years.

    1. Re:Security by Anonymous Coward · · Score: 0

      Paragraphs, please.

    2. Re:Security by palegray.net · · Score: 1

      Wow, the GP response occupies maybe 6x2 inches on my MBP display, and that's with the browser window occupying perhaps only 3/5 of the horizontal space available on the LCD. I think perhaps the issue lies with your particular parsing of the content.

  10. PDF has its merits by tsj5j · · Score: 1

    I really like PDF's ability to retain the font and display of the document without worrying about fonts and the application.
    Since I have to distribute documents that are read on a variety of systems, including Linux, OSX, iPhone/Pad and Windows, PDF really beats all other alternatives in compatibility.

    Adobe should really work on creating a text/image-only version of PDF without their fancy password protecting features and what-not.
    If they don't, perhaps an open source group can take on the challenge.

    1. Re:PDF has its merits by palegray.net · · Score: 1

      Many applications can already export directly to PDF on exactly the terms you've described, and there are things like CutePDF that will allow you to "print" from any application to a PDF file with a couple of clicks under Windows. On Mac OS X and Linux platforms, you can typically just save any document as a PDF file, at least from most native apps. The capabilities you're describing are already in place, and there's no need to worry about strictly text and image-based docs you've created falling prey to any sort of vulnerability, at least not in the scope you've described.

  11. What about Flash? Check out this site: by whoever57 · · Score: 5, Interesting

    Look at this page. It's for a local police department in a city that has lots of blind people because of the presence of the California School for the Blind. This is the first page that Google lists for the site. I can't imagine that a screen reader can make anything of the front page and there are no navigation buttons.

    --
    The real "Libtards" are the Libertarians!
    1. Re:What about Flash? Check out this site: by Yvan256 · · Score: 1

      No navigation buttons? It's worse than that. Without plug-ins, all you get is a gradient in the background of an otherwise empty page.

    2. Re:What about Flash? Check out this site: by noidentity · · Score: 2, Funny

      Nonsense! When I visit that site, I see a HUGE button and some normal, selectable text ("Click here to get the plug-in"). A screenreader would do fine with that. Oh, wait...

    3. Re:What about Flash? Check out this site: by Anonymous Coward · · Score: 0

      The screen reader skipped right past that to this page: http://fremontpolice.org/index.html
      All the image buttons have hover-over text that is read and it is all simple HTML.

    4. Re:What about Flash? Check out this site: by Anonymous Coward · · Score: 0

      epic fail from my home town! woot!

    5. Re:What about Flash? Check out this site: by Anonymous Coward · · Score: 0

      a city that has lots of blind people because of the presence of the California School for the Blind

      Remember, correlation does not imply causation!

    6. Re:What about Flash? Check out this site: by Anonymous Coward · · Score: 0

      No alt-text on the main page either. Did they hire a high school kid to make this site or something?

    7. Re:What about Flash? Check out this site: by clone52431 · · Score: 1

      No alt-text on the main page [fremontpolice.org] either.

      Um... what?

      <a href="records/faq_answers/2010_holidays.pdf" target="_blank"><img src="images/lobby.gif" width="390" height="98" alt="New Lobby Hours" border="0"></a>

      <a href="policereports/start_report.html"><img src="images/reporting.gif" width="390" height="98" border="0" alt="Filing an Online Report"></a>

      <a href="http://www.crimereports.com/map/index/?search=+Fremont+CA" target="_blank"><img src="images/CrimeReports.gif" width="390" height="98" alt="Crime Reports" border="0" align=""></a>

      <a href="press/press_nixle.html"><img src="press/images/BNpress.jpg" alt="Police Blotter - NIXLE" height="98" width="390" border="0"></a>

      <a href="jail/jail.html"><img src="images/livescan.jpg" width="390" height="98" alt="New LiveScan Services!" border="0"></a>

      <img src="images/IA.gif" width="390" height="98" alt="Office of Professional Standards and Accountability" border="0"></a>
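
      Checking this by hand gets old quickly. Below is a small sketch that uses Python's standard `html.parser` to list every `img` with its `alt` text. `AltAudit` and `audit` are made-up names, and the second image in the sample (`images/spacer.gif`) is a hypothetical one added to show what a missing `alt` looks like; it is not from the Fremont page.

```python
from html.parser import HTMLParser


class AltAudit(HTMLParser):
    """Collect (src, alt) for every <img>; alt is None when it is missing."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            self.images.append((attributes.get("src"), attributes.get("alt")))


def audit(html):
    parser = AltAudit()
    parser.feed(html)
    parser.close()
    return parser.images


sample = (
    '<a href="jail/jail.html"><img src="images/livescan.jpg" width="390" '
    'height="98" alt="New LiveScan Services!" border="0"></a>'
    '<img src="images/spacer.gif">'  # hypothetical image without alt text
)

for src, alt in audit(sample):
    print(src, "->", alt if alt is not None else "MISSING ALT")
```

      An `alt` attribute being present says nothing about whether the text is useful, so this only catches the outright omissions.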

      --
      Distributed Denial of APK: It takes 15 seconds to reply to him anonymously, but wastes tons of his time if we all do it.
  12. It's nice of them by Anonymous Coward · · Score: 0

    that they made a PDF version of the report.

  13. What format by bigtreeman · · Score: 2, Insightful

    Missing from the statement is what the preferred format is.

    I would expect a Microsoft format from our illustrious leaders.

    Reads like a fairly dumb statement which is what I always
    expect from our government.

    Sounds like a lead up to them locking themselves (us) into
    using a proprietary, expensive, unusable system.

    Who , me , negative ,
    yep

    --
    Go well
    1. Re:What format by headLITE · · Score: 1

      Of course a Word document is better suited. So is anything else that preserves the text itself, as opposed to preserving its rendered form. HTML is pretty good for this too. With PDF it can be hard to even figure out where the next word in a sentence is. It doesn't have anything to do with proprietary or not, there are enough free or open formats that work, it's just that PDF is not one of them.

    2. Re:What format by Daniel+Dvorkin · · Score: 3, Insightful

      I would expect a Microsoft format from our illustrious leaders.

      Bingo. Anyone who doesn't see Microsoft's hand in this is hopelessly naive.

      --
      The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
    3. Re:What format by Anonymous Coward · · Score: 0

      Bingo. Anyone who doesn't see Microsoft's hand in this is hopelessly naive.

      it is lucifer fool. dont you see his big burning claw?

  14. So the problem is fancy formatting. by robbak · · Score: 5, Insightful

    And the rest of us say "Get rid of it". We do not access government documents to be blown away by their totally rad page style. We access them for information, and extracting the information from the glumph that encases it is sometimes hard for the best of us.

    html all the way. Any formatting you cannot fit in a simple stylesheet can get left out.

    --
    Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    1. Re:So the problem is fancy formatting. by drinkypoo · · Score: 1, Insightful

      Congratulations, you have just declared that you do not wish to be able to download forms over the internet.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    2. Re:So the problem is fancy formatting. by Anonymous Coward · · Score: 0

      And the rest of us say "Get rid of it". We do not access government documents to be blown away by their totally rad page style. We access them for information, and extracting the information from the glumph that encases it is sometimes hard for the best of us.

      html all the way. Any formatting you cannot fit in a simple stylesheet can get left out.

      Whoa, does your government somehow not use standardized printed forms? Mine seems to exist primarily to churn those out!

      What government is this, and how can I move there?

    3. Re:So the problem is fancy formatting. by Abcd1234 · · Score: 1

      Congratulations, you have just declared that you do not wish to be able to download forms over the internet.

      Woah woah! When did it become impossible to print HTML documents???

    4. Re:So the problem is fancy formatting. by StuartHankins · · Score: 2, Insightful

      Most contracts and many forms require rendering with specific type sizes, specific layouts etc. That isn't currently possible with CSS / HTML, which is why PDF is such an important format to many industries where legal compliance with a national agency, standards body, regulated industry body, or governmental standard is necessary.

    5. Re:So the problem is fancy formatting. by mcgrew · · Score: 0

      Why should a form have fancy formatting? If you're accessing it over the internet, why do you need a form at all? And you can download a web page, you know.

    6. Re:So the problem is fancy formatting. by drinkypoo · · Score: 1

      Why should a form have fancy formatting?

      For machine scanning, of course. Ideally we'd do away with them and move to digital signatures, but there continue to be issues to work out.

      If you're accessing it over the internet, why do you need a form at all?

      Because governments resist change until they can figure out how to make it work for them.

      And you can download a web page, you know.

      Let's not be overly obtuse here. PDF exists for a reason, and that reason is to permit the precise position of each element in a way that is not possible even with CSS.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    7. Re:So the problem is fancy formatting. by istartedi · · Score: 1

      Why should a form have fancy formatting?

      I already feel like hanging myself when I do my taxes. I shouldn't feel like gouging my eyes out too.

      --
      For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
    8. Re:So the problem is fancy formatting. by Anonymous Coward · · Score: 0

      Forms are the previous-century way of going about this. Of course I don't want to download forms, I want to fill them in online!

      Most old-school forms suck balls - hard to read, hard to understand, hard to fill in. Good riddance when the last one is dead.

    9. Re:So the problem is fancy formatting. by fyngyrz · · Score: 1

      Most contracts and many forms require rendering with specific type sizes, specific layouts etc.

      Instead of just stating this, you should be asking, "Why?" Because there is no good reason for it. If you need to reference a particular section, use section numbers. If you need to reference a particular sentence, then you could even number the sentences if you truly believe they are uncountable otherwise. Likewise the illustrations, tables, etc. You see, as it turns out, there is no good reason for rigid document formatting, and a whole bunch of reasons to avoid it.

      Somewhere, someone who cannot think their way out of a paper bag said "documents for our purpose must be..." you just need to find that person and crack them over the head with a clue-bat.

      --
      I've fallen off your lawn, and I can't get up.
    10. Re:So the problem is fancy formatting. by fyngyrz · · Score: 1

      PDF exists for a reason, and that reason is to permit the precise position of each element in a way that is not possible even with CSS.

      That's not a reason. That's a mistake.

      --
      I've fallen off your lawn, and I can't get up.
    11. Re:So the problem is fancy formatting. by fyngyrz · · Score: 1

      Again, wrong problem, wrong solution. Proper solution: fairtax. No tax forms. No filling out tax forms. No more regressive abuse of the low income folks. And no PDFs.

      --
      I've fallen off your lawn, and I can't get up.
    12. Re:So the problem is fancy formatting. by StuartHankins · · Score: 1

      In the insurance industry, contracts and forms must be approved through various agencies. There are laws regarding minimum font sizes and each form must be re-approved each time it is changed. This is done because of regulatory controls and laws.

      The insurance industry is not alone in this regard. In the legal sector, specific page and font formatting is required for many court documents as well as correspondence. You need it to appear as intended, not changed based on the reader's capabilities.

      There are many other examples. Word wrapping, font substitution, document reflow, or resizing of contents is not acceptable or appropriate in most cases.

      Why? Does it really matter? You're not going to overturn the independent decisions of multiple industries (and all the regulatory agencies) without a *very* good reason, and even then you would have to drag them kicking and screaming into doing something differently.

      Some examples I can think of offhand for these rules: people lost context or documents lost meaning because of reflow (i.e. the signatories weren't on every page), or people couldn't read the documents (in many cases where you have large numbers of documents submitted, they must be formatted in a standardized way or you would have chaos), or maybe because someone decided that having different text sizes and fonts made the document easier to read (i.e. footnotes, diagram captions, etc).

      If it's content made for human consumption, formatting is a priority. If it's content made for machine consumption, almost any data format will work. Most of these things are targeted at humans, not computers.

    13. Re:So the problem is fancy formatting. by drinkypoo · · Score: 1

      PDF exists for a reason, and that reason is to permit the precise position of each element in a way that is not possible even with CSS.

      That's not a reason. That's a mistake.

      I'd ask you to explain what the hell you're talking about, but I suspect I'm going to learn that you're batshit crazy no matter what the answer is.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    14. Re:So the problem is fancy formatting. by istartedi · · Score: 1

      That's outside the scope of the problem.

      We're not living in fantasyland. We can't snap our fingers and make everything fair and efficient. For the foreseeable future, we have forms. As long as we have forms, they should be easy on the eyes.

      --
      For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
    15. Re:So the problem is fancy formatting. by drinkypoo · · Score: 1

      Most contracts and many forms require rendering with specific type sizes, specific layouts etc.

      Instead of just stating this, you should be asking, "Why?" Because there is no good reason for it.

      Once you understand that there's no reason to ever have anyone fill out a form for anything ever but greed then questions like this become a big wankoff jerkfest of mental masturbation. The simple truth is that this kind of requirement exists, and software like Adobe Acrobat and formats like PDF have cropped up to fill it. If you want to go live in a dumpster someplace and let someone else meet project requirements, that's cool.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    16. Re:So the problem is fancy formatting. by mcgrew · · Score: 1

      I don't know, without all the formatting my tax forms would be more legible.

    17. Re:So the problem is fancy formatting. by mcgrew · · Score: 1

      The problem with "fairtax" is it's not fair. Those who receive more benefits from government (the rich) should pay a higher percentage of taxes than those who receive nothing at all from government (those barely above poverty).

      I do agree that the tax system is fucked. The Capital Gains Tax should be repealed, and the people "earning" their living gambling on the stock market shouldn't be paying lower taxes than someone earning the same amount of money in a construction business.

      There shouldn't be loopholes and deductions and credits. Why should someone who's buying a house which he'll eventually own get a tax break, while someone who can't buy a house and has to rent doesn't?

      Why should a married couple with no children whose joint income is the same as a single mother pay lower taxes just for being married?

      Rest assured, if your fair tax ever becomes reality, the rich and corporates will wind up fucking us worse than they do now.

    18. Re:So the problem is fancy formatting. by drinkypoo · · Score: 1

      A fair tax system would tax all income at a fixed level. Make more, pay more. But is that what we really want? Most people seem to want a system where you make more, and then pay STILL more. Which mind you, would be okay with me, and which worked fine in this country until the feds got out of control.

      --
      "You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
    19. Re:So the problem is fancy formatting. by vegiVamp · · Score: 1

      Please define "simple".

      --
      What a depressingly stupid machine.
    20. Re:So the problem is fancy formatting. by uninformedLuddite · · Score: 1

      And just how is a blind person meant to fill out a form online? They will need to print it out and then... oh wait. I propose a law for all website forms to be in braille using that new lumpy technology some dipshit patented recently.

      --
      The new right fascists are bilingual. They speak English and Bullshit.
  15. think of the blind people by __aaeuwj6541 · · Score: 1

    The key problem that the majority seem to be overlooking here is that the people affected by this are disabled (mostly the blind). 1. You're either blind or mostly blind; this is pretty bad, and life has already given you the short end. Screens are designed to be read; this is a fact. 2. You're blind and thus probably not the most computer-savvy person, so you're probably getting your friend's son to fix this up for you by installing software meant to fix this. 3. The tools made to help these people are not very well made; most just provide magnification or do text-to-speech. 4. The office person who takes a written document, scans it on the office scanner and then puts the result on the web is not thinking about the poor blind bastard who can't see it; it's not in their job description.

    1. Re:think of the blind people by Ol+Olsoc · · Score: 1
      Yup, PDF is an issue for blind people - Well put, Mordie.

      On my web pages, I use pdf, and a lot of it. It's an Amateur radio contest site. I do have some blind folks that use the pages.

      After some email where a blind Amateur told me about the problem, and then a phone call to iron out details, we fixed the problem.

      Now I make available PDF, Doc, and html. The person just uses the format that works for them, blind or not. It really wasn't that hard a solution, and I have the fellow looking out for any other problems I might make. Forms are a little tougher, but can still be made accessible. I have to give up on form math functions, but this is about making something available, not that everyone has to use PDFs. And last time I checked, PDFs are always made from something else.

      --
      Why is this even on SlashDot?... Why is this even on Slashdot?...Why is this even on Slashdot?
  16. Death be to Government PDFs by Anonymous Coward · · Score: 0

    Finally! Hopefully now we won't have to use those hideous Interactive PDFs that the Electoral Commissions force us to use for digital submissions.

  17. Old School by jeremiahstanley · · Score: 1

    I'm pretty sure that 90% of all documents on the internet need nothing more fancy than RTF encoding or even a very simple set of BBCode tags to be usable. I know PDFs are supposed to have tons of features but why not just be simple and stick with ASCII?

    1. Re:Old School by Anonymous Coward · · Score: 0

      Most documents need tables, this is where BBCode fails.

      PDFs preserve page layout (this precious page layout, manually adjusted with spaces and line breaks). Before reintroducing anything other than MS Word or PDF, one should recondition the army of clerks to care about the meaning, not the looks, of documents.

  18. WTF? by zmollusc · · Score: 3, Funny

    What does it matter that they can't read the text? PDFs aren't about content, they are about preserving the layout. At least that is what it seems like to me when I am foolish enough to try and read PDFs on a device with a different number of pixels than the person who made the PDF file.
    If the content matters at all, someone should invent a technology that allows text to be tagged somehow with indicators of the MEANING of that portion of text, like 'this is a title', and let the display device render the text according to how the reader can best view it. It sounds crazy, and it may take a few decades to do, but think of the benefits.

    --
    They whose government reduces their essential liberties for temporary security, receive neither liberty nor security.
    1. Re:WTF? by adamofgreyskull · · Score: 1

      If the content matters at all, someone should invent a technology that allows text to be tagged somehow with indicators of the MEANING of that portion of text, like 'this is a title', and let the display device render the text according to how the reader can best view it. It sounds crazy, and it may take a few decades to do, but think of the benefits.

      Yes, everyone, this is possibly the richest seam of sarcasm ever discovered on /.

    2. Re:WTF? by Anonymous Coward · · Score: 0

      I, too, format all my documents in WTF, and support your suggestion of it as a standard.

    3. Re:WTF? by Bigjeff5 · · Score: 1

      What does it matter that they can't read the text? PDFs aren't about content, they are about preserving the layout.

      Just a guess here, but that is probably the exact reason they don't want government agencies to use PDFs for all their forms.

      --
      Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
    4. Re:WTF? by Bigjeff5 · · Score: 1

      Rich my ass, it's a mediocre jab with an unclear target followed by a solution the target is almost certainly considering.

      Is he making fun of Adobe? Or Australia? If it's Adobe, the initial jab is not sarcasm, it's simply accurate. If the target is the Aussie govt., it's not effectively making fun of their decision, because they are doing exactly what the sarcastic remark suggests before the remark is made, and thus the remark makes no sense at all.

      If by some chance he's actually mocking the decision to move away from PDFs (it really doesn't sound like it, the second half would make no sense), then he's an idiot.

      It would be wonderful sarcasm if the Aussie govt. had just decided to use PDF exclusively in spite of poor accessibility for the blind - the target would be clear, the remark would be scathing, and the feigned ignorance of a solution would be funny, but it doesn't work at all in this case.

      Seriously sarcasm seems like a lost art these days.

      --
      Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
    5. Re:WTF? by afidel · · Score: 1

      Selecting Text when viewing a PDF on my Blackberry works ~90% of the time though complex tables like a bill of sale can get munged.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    6. Re:WTF? by Ol+Olsoc · · Score: 1
      Let's try this again.

      People sometimes get the process mixed up with the goal.

      To me, that was the gist of the joke. And it's one that pulls you in for the first sentence, then hits you with what s/he wanted to make as the point. I know people who would agree that preserving the layout is job one. Probably a temperament thing. I personally like seeing all the info on my PDF forms showing up at exactly the right place. I really like PDFs. But that isn't the most important thing.

      The communications are the goal, not the transport mechanism. And since I have to make the PDF from something, it's not all that hard to make multiple formats available. PDF, doc, html, plain text. no biggie. Probably take an extra ten minutes for my whole site.

      The idea that there is only one standard, and that's what people have to use, well it's pretty difficult to implement. Put a number of choices up, and let the users sort it out.

      --
      Why is this even on SlashDot?... Why is this even on Slashdot?...Why is this even on Slashdot?
  19. OZ gov't is a bunch of whiners by christoofar · · Score: 1

    The Aussie government failed to recommend a standard that supplants PDF in such a way that it handles all the cases one would expect to handle. So what's the point of this exercise that the OZ gov't did other than basically say without words... 'we should publish everything in XML documents since at least those can be parsed to some degree'?

    You know, there should be an industry-standard sheet of paper (Letter/A4) that meets the JAWS difficulty test, much in the same way there are test HTML pages that test web browser compliance with HTML 1.1/5.0.

    Needless to say, blind people already have solutions for reading printed text that is not braille. Print the PDF and then scan it back into OCR-to-speech software. I'm sure someone by now has invented the OCR-capable print driver that eliminates the need to print to paper to reach the step of reading scanned paper.

    Create a PDF document that has radially-printed text, "The green fox slept and fellated the brown dog." printed in a straight line, then printed in a spiral, and then printed upside down.

    Then for Hebrew and Arabic (RTL languages), the same type of sentence... printed in RTL in various configurations.

    Then the newsprint column layout, etc. etc. etc.

    Point JAWS at the PDF, or use the PDF reader's built-in speech interpretation, and let PDF vendors attain certified compliance from the accessibility software industry.

    Problem solved.

    1. Re:OZ gov't is a bunch of whiners by Anonymous Coward · · Score: 0

      As a vision-impaired Australian, I am pleased to see this ruling come out on their departments not being allowed to use PDF.

      I don't wish to wait 5 years until the torture test you mention is done, I want to be able to read what is on the government sites now.

  20. Stupid headline - sorry by BudAaron · · Score: 1

    Who writes these idiot gloom and doom headlines? I truly hate misleading BS like this!!!!

  21. Aussies IT Directors Retarded by Anonymous Coward · · Score: 1, Interesting

    Remember the Sydney Olympic Games website being non-readable?
    Did they learn anything? Nooooo.

    And many .gov.au sites still depend on IE6 - they are frozen to a defunct standard, with applications standardized around 17" LCD monitor resolution.

    The Australian AG's office password-protects and bitmaps nearly all its correspondence to its clients,
    for the sole purpose of making things harder. Brain dead.

    This is forgetting all the very real and stark security holes associated with PDFs and Adobe.

    Now some have gone a step further and sharepointed things.

    The ANAO (Audit Office) should simply go around and give Dept's 'F' for disability considerations, and substandard policy setting.

    1. Re:Aussies IT Directors Retarded by Bert64 · · Score: 1

      PDF is not in itself a security hole... Adobe's reader on the other hand has many, and the problem is made worse by the apparent monoculture - many people think PDF is a proprietary format and that only the Adobe tools are capable of reading it... I have even seen Mac users download and install Adobe Reader because they think it's required; had they simply attempted to open the PDF file in the first place, they would have found that OSX ships with a much better PDF reader out of the box.

      When anything has 90%+ marketshare it becomes a target for hackers. IE has been beaten down and is far less attractive, so now people target Flash and Acrobat.

      We really need user education, get enough people using alternative PDF readers and no single program will have enough marketshare to attract so much hostile attention.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    2. Re:Aussies IT Directors Retarded by ynohoo · · Score: 1

      So where is the free PDF editor? Never mind, there isn't one.

      Who wants dead tree format anyway? Most times I follow a link and discover the content is PDF, I give it a pass. If you want to publish on the web, use HTML.

    3. Re:Aussies IT Directors Retarded by Anonymous Coward · · Score: 0

      Which part of the word "reader" in the GP's post did you not understand? Where did you get the "free" part from? There are lots of alternatives to Acrobat, even for editing, but they are either free or easy.

    4. Re:Aussies IT Directors Retarded by snugge · · Score: 0

      With current laptop resolutions (usually so called "HD"), a 5 year old 17" monitor is not that bad....

    5. Re:Aussies IT Directors Retarded by Anonymous Coward · · Score: 0

      OpenOffice.org (or LibreOffice for that matter) can output PDF, as can pdfLaTeX and even free "Desktop Publishing" editors like Scribus. Research before making silly claims.

    6. Re:Aussies IT Directors Retarded by Bert64 · · Score: 2, Insightful

      Are you talking about modifying existing pdf files, or simply creating new ones?

      OpenOffice/LibreOffice has a PDF Import extension which does a pretty good job of editing. I also found, via a very quick Google search, a pdfedit program on SourceForge - http://sourceforge.net/projects/pdfedit/

      As for creating pdf files, there are countless programs for doing that, openoffice, pdflatex, virtually anything that can print to postscript combined with ps2pdf etc etc etc.

      Sure, HTML is preferable to PDF for web content, but PDF is a pretty good format when used appropriately.

      --
      http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    7. Re:Aussies IT Directors Retarded by VolciMaster · · Score: 2, Insightful

      Most times I follow a link and discover the content is PDF, I give it a pass. If you want to publish on the web, use HTML.

      And if you *truly* want to ensure it *always* looks the same *everywhere*, you use PDF

    8. Re:Aussies IT Directors Retarded by juasko · · Score: 0

      Ever used OSX?

    9. Re:Aussies IT Directors Retarded by ibennetch · · Score: 1

      OSX ships with a much better PDF reader out of the box

      I used to think this, then within a week or two I had the following problems

      • Created a PDF from OS X which someone couldn't open (not sure what reader she was using but as it was a corporate PC and she does a lot of PDF work, I assume it was the full Adobe suite).
      • Edited a lengthy interactive PDF, saved it with my input, and sent it back to someone else, who saw a blank form (I ended up using my PC which has Foxit Reader to re-enter everything).
      • Tried to open an interactive PDF that had form fields filled in, but the fields were just blank.

      That's when I decided that Preview is okay for quickly viewing simple PDFs but that I really need to find a replacement program for anything serious.

    10. Re:Aussies IT Directors Retarded by ajrs · · Score: 1

      there is a perl library for that

    11. Re:Aussies IT Directors Retarded by SirGeek · · Score: 1

      So where is the free PDF editor? Never mind, there isn't one.

      You mean Open Office?

    12. Re:Aussies IT Directors Retarded by Obfiscator · · Score: 1

      I agree with this. I generally use Skim on OSX instead of Preview or Adobe, because I found it fit my needs better. Maybe it'll work for you.

      --
      "Nothing shocks me. I'm a scientist." -Indiana Jones
  22. Oh, noes! by Anonymous Coward · · Score: 0

    That would just shift the burden from blind people not being able to read the document (which is bad) to even more people becoming blind by reading through grotty XML (which might be considered as worse).

    1. Re:Oh, noes! by Marble1972 · · Score: 1

      I like ;)

    2. Re:Oh, noes! by sourcerror · · Score: 1

      Yeah, it's not like regular browsers could display XHTML.

  23. Death To PDF by z-j-y · · Score: 1

    PDF's main goal is to make sure that a document always *looks* the same (if you have eyes that can look). But what's the point of that? Who cares about the precise graphic layout? Most PDFs that we encounter could have served their purpose better by being HTML documents. For gov documents, it's highly unlikely that they contain complex math equations that require careful layout.

    1. Re:Death To PDF by ChunderDownunder · · Score: 1

      Even if documents don't include mathematical equations, there's an obvious plain text solution - LaTeX.
      It outputs to PDF, and I'm sure there exist browser extensions to render it to HTML on the fly.
      Though Gmail does PDF-to-HTML conversion automatically when viewing attachments.

    2. Re:Death To PDF by afidel · · Score: 1

      Most of the stuff I've ever downloaded from my government are standardized forms that need to be consistent.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
  24. Isn't it ironic by DNS-and-BIND · · Score: 1

    Well, now here's a rich story. A story about lack of accessibility...on Slashdot. Surely this site is highly qualified to criticize others.

    --
    Shutting down free speech with violence isn't fighting fascism. It IS fascism!
    1. Re:Isn't it ironic by joelito_pr · · Score: 0

      More qualified than a graphic inside a PDF. And don't call me Shirley.

  25. PDF Format is like ATM Machine and PIN Number by PhunkySchtuff · · Score: 1

    Portable Document Format... Format.

    1. Re:PDF Format is like ATM Machine and PIN Number by nedlohs · · Score: 1

      Yes, welcome to English.

    2. Re:PDF Format is like ATM Machine and PIN Number by Bigjeff5 · · Score: 1

      It is appropriate for any acronym for which the underlying meaning is no longer clearly understood by people who use the acronym.

      Most people don't know that ATM means Automated Teller Machine; ATM has become the name of the machine, not just an acronym for the name, so it's appropriate to call it a machine even though it also calls itself a machine. Same with PIN, most people don't recognize it as Personal Identification Number, so it's appropriate to call it a PIN number even though it also calls itself a number.

      Same with PDF, PDF is the name of the format - it's not a PD formatted document, it's a PDF formatted document.

      Saying PDF format is appropriate.

      The same is true for many acronyms.

      --
      Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
  26. Not the format by ruf10 · · Score: 1

    It's not the format that's wrong, it's the users. We in Poland have the same problem with gov's documents. Those morons write documents in MS Word, then print them, then scan the printed document and embed the scanned image in a PDF. PDF *can* contain and preserve the content as text, with format and layout. The user who chooses to misuse it is the problem.

    1. Re:Not the format by fbobraga · · Score: 1

      Those morons write documents in ms word, then print them, then scan the printed document and embed scanned image in PDF.

      It isn't exclusive to Poland: I see it all day long here, in the Brazilian gov.

    2. Re:Not the format by mjwalshe · · Score: 1

      lol did they miss the plugin that acrobat provides for word

    3. Re:Not the format by jonbryce · · Score: 1

      Yes. It costs money, whereas their photocopier already has a scan to pdf facility.

    4. Re:Not the format by ruf10 · · Score: 1

      MS Word costs money as well.

      Unlike OO.o, which incidentally has export to PDF.

    5. Re:Not the format by clone52431 · · Score: 1

      Not like there aren’t alternatives. I have a free PDF printer. The OpenOffice text editor has PDF creation built-in.

      --
      Distributed Denial of APK: It takes 15 seconds to reply to him anonymously, but wastes tons of his time if we all do it.
  27. Not a problem with format by necro81 · · Score: 1
    My reading of this is not so much that there is something inherently wrong with the PDF format itself, but rather with how it is used. If you are a government agency, producing documents for public consumption, you better know how the hell to produce a PDF with searchable, readable text, and not sequester it to image-only. If you can't get that single concept into your head, it won't matter what fucking format you use. You would think bureaucrats, with their stickler for regulation and procedure, would be able to understand that not every PDF is created equal: some are produced much better than others.

    The authors of the report say as much in their summary:

    while accessibility of the Portable Document Format is improving, like most tools, it cannot compensate for poor design. Content authors need to design accessibility into their documents from the outset.

    And while both the article summary and the report itself stress the need to provide alternate formats alongside (or in place of) PDF, the full report is scant on details or comparative tests of other formats. HTML and RTF seem decent options, as they permit some text formatting options (but are not wedded to them) and are platform-independent. But when you start adding graphics to the mix (as sometimes must happen) their portability tanks. They also cannot prevent the same problem that plagues PDFs: when some dipshit just scans a document and spits out an image-only file.
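
    To make the "image-only" problem concrete, here is a rough sketch of how one might flag a suspect file. It is only a byte-level heuristic, not a real PDF parser: the function name looks_image_only is made up for illustration, and it assumes the PDF's object dictionaries are stored uncompressed (newer files that use compressed object streams can hide them, and a scan with an OCR text layer will carry fonts and so pass).

```python
import re

# Assumption: object dictionaries are visible as plain bytes in the file.
IMAGE_RE = re.compile(rb"/Subtype\s*/Image\b")  # an embedded raster image
FONT_RE = re.compile(rb"/Font\b")               # a font object or resource


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Return True if the file embeds images but declares no fonts.

    A page with no fonts cannot draw real text, so a screen reader
    has nothing to read. This is a hint, not a verdict.
    """
    has_image = IMAGE_RE.search(pdf_bytes) is not None
    has_font = FONT_RE.search(pdf_bytes) is not None
    return has_image and not has_font
```

    A file that trips this check would still need a human (or a proper accessibility checker) to confirm; a file that passes it is not thereby accessible.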

    (PS - would it have killed the submitter and editors to link to the main report page, rather than only to a second-hand link from ITNews Australia?)

    1. Re: Not a problem with format by ChipMonk · · Score: 1

      you would think bureaucrats, with their stickler for regulation and procedure, would be able to understand that not every PDF is created equal

      You answer your own implied question: being sticklers for regulation and procedure means they don't have to, you know, think about what they're doing.

  28. Poorly created PDF files by Bert64 · · Score: 3, Insightful

    So basically they are saying that *because* it is possible to produce a shoddy PDF file which is basically an image dump, that this is reason enough not to use the format?
    By this same reckoning, you could produce a really shoddy HTML page which also consists of images and no text... Virtually any format could be misused in this way.

    So what's the alternative? That we all revert back to ASCII text since it's incapable of holding graphics?

    Personally I hate seeing poorly designed websites or PDF files as I described here, where the text is actually an embedded image (or worse - a Flash file) and there is no clickable index etc.
    We should probably start naming and shaming pdf creation software, and those who use (or misuse) such tools.

    --
    http://spamdecoy.net - free throwaway anonymous email - avoid spam!
    1. Re:Poorly created PDF files by JohnFen · · Score: 1

      So basically they are saying that *because* it is possible to produce a shoddy PDF file which is basically an image dump, that this is reason enough not to use the format?

      I think it's more a case of saying that PDFs shouldn't be used inappropriately. If you're producing something which really has to be viewed and/or printed in a visually consistent way analogous to a magazine page, it's hard to beat PDFs. If you're producing something that is to be used in any other way, PDFs blow.

      This has long been my beef with PDFs, this inappropriate use. If the document is intended as a reference, or is text-heavy and intended to be read more than viewed, PDFs are the second-worst choice possible (Flash would be the worst). It doesn't adapt to displays of varying geometries, it can't be searched easily from outside the PDF reader, etc. Text files are still king for this kind of document.

      So what's the alternative? That we all revert back to ASCII text since it's incapable of holding graphics?

      Yes! If the graphics are really that critical to the document, then use PDF (or, generally better, HTML -- at least that way you aren't stuck on a screen geometry and can still grep). Even better, make the graphics less central in the document design, so you can just include a jpeg or two with the document.

      Don't get me wrong, PDFs have their place. The main problem I have with them is that easily 90% of the PDF documents I see are made less useful to me by being PDFs.

    2. Re:Poorly created PDF files by Bigjeff5 · · Score: 1

      Exactly, it's a presentation format - it should be used for presentations.

      It shouldn't be used for documents that need to be used for anything other than presentation.

      It would be nice if PDF were a more all-around document format, but it wasn't designed that way and changing that is difficult at best.

      --
      Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
    3. Re:Poorly created PDF files by splerdu · · Score: 1

      Ascii art to the rescue!

  29. Wooosh! by Anonymous Coward · · Score: 0

    Actually, $(SUBJECT) says it all.

    Nevertheless, I'm sick of the heavy pandemia of XMLitis our craft is going through.

  30. This Just In: Adobe gives the Aussie Govt a big by harrytuttle777 · · Score: 0

        ______
      (( ____ \-^ChTrSrFr
      (( _____
      ((_____
      ((____ ----
                / /
              (_((
    Thumbs DOWN!

    Seriously why do governments feel they need to do this B.S. It sounds like the Aussie government is just as retarded as my U.S. one. I am sure Austrailians feel better now that adobe is required to design a system whereby blind people can see.

    The desire to create a world where everybody has equal access to everything is a pipe dream. Blind people will never have equal access to the world in which we live. The reason has something to do with the fact that they are BLIND!. You see blind people can't see. This is probably the quintessential point to being BLIND. I figured this out on my own without a government salary. I also did it being an U.S.ian, and you all know how stupid we are. So does this mean stupid usians are smarter than the aussies?

    Seriously when can we get Julian Assange to investigate the Austrailian Government. I can see that it is probably rife with stupidity and undoubtedly has some deep dark secrets.

    -P.S. I no i mispelled Australian, but i don't care because i am a greedy usian imperialist, and it would go against my nature to not try and commandeer the english language.

  31. Of all the reasons to hate Adobe by VoiceInTheDesert · · Score: 1

    That's the one they choose? It wasn't the gaping security holes, the incessant patch requests (that are never even 6 steps behind the security holes) or the laborious installation/upgrade process? I'm sorry, I know blind people have it tough on the internet, but this is really the dumbest of the reasons I could imagine you would switch away from a nearly universally accepted format.

  32. not Turing-complete by Anonymous Coward · · Score: 0

    someone should invent a technology that allows text to be tagged somehow with indicators of the MEANING of that portion of text, like 'this is a title', and let the display device render the text according to how the reader can best view it.

    Sorry, but SGML already fell by the wayside because it was "too hard" to implement.

    Now all we have are pre-printed flash cards (HTML) that do not allow you to indicate most of the possible meanings of a particular portion of text; it would be nice to have the alphabet (SGML) back.

    Or, to put it in computer-nerd terms, it's as if we replaced our Turing-complete language with one that's not Turing-complete.
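
    For what it's worth, here is a small sketch of what "tagging text with its meaning" buys you, using the limited vocabulary HTML does have. It uses Python's standard html.parser module; the OutlineParser class and outline function are made-up names for illustration. Because headings are marked as headings rather than as "big bold text", a program (or a screen reader) can pull out the document's structure without guessing from the layout.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect (tag, text) pairs for every heading element."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []     # finished headings, in document order
        self._current = None  # heading tag we are inside, if any
        self._buf = []        # text fragments of the open heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._current = tag
            self._buf = []

    def handle_data(self, data):
        if self._current is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.outline.append((tag, "".join(self._buf).strip()))
            self._current = None


def outline(html: str):
    """Return the heading outline of an HTML string."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline
```

    Doing the same for a PDF that stores only positioned glyphs means reverse-engineering the layout, which is the gap being complained about here.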

  33. simple solution, totally wrong... by Thud457 · · Score: 1

    why not just have the great google automagically OCR any images it finds in PDFs and generate a vision-impaired-friendly version of the PDF?

    It then can append a footer to each page stating "the creator of this PDF is a google-certified nimrod".

    (I've always found it a bit galling that some paper catalog companies I've dealt with thought it reasonable to create a web presence by posting PDFs with scans of each page of their physical catalog. Good luck searching through that!)

    --

    the preceding comment is my own and in no way reflects the opinion of the Joint Chiefs of Staff

    1. Re:simple solution, totally wrong... by Barny · · Score: 1

      Because relying on an external company to do that, a company that has in the past stated it has no presence in Australia at all (see recent info about our information minister asking them to edit their sites), would not be considered a good solution.

      --
      ...
      /me sighs
  34. sounds like by Anonymous Coward · · Score: 0

    This sounds like a problem that the text browser should be solving, since it appears that whatever browser is being used incorrectly identifies a PDF file as an image and fails to convert it to readable text. It's not like there aren't FOSS tools to do that. Perhaps the only mandate should be for government files to dispense with fancy formatting so the information is easily read.

    The government's role should be to 1) make the information easily read as well as easily available and 2) file a bug report with the browser maintainer.

  35. Still... by fyngyrz · · Score: 1

    Vanilla HTML is a much better answer. Let the reader control the format - separate the markup from the content, let the reader control the fonts, how emphasis displays, even link colors. Or move one step forward and use (basic!) CSS. PDF is overweight, slow, seriously buggy, can lock content, and is not available for all platforms. HTML readers are ubiquitous, fast, highly compressible and wide open. Heck, I can display and edit a basic HTML file, formatted nicely according to the HTML, on my 1970's-era 64k 6809 machine using a text-based terminal. Now that is good compatibility! And it didn't take long to write, either. Try supporting PDF in a 64k environment. Good luck.

    I have never allowed PDF to be used as any form of outgoing documentation for our products; and I've never regretted that decision.

    --
    I've fallen off your lawn, and I can't get up.
  36. PDF is total fail. by fyngyrz · · Score: 1

    And if you *truly* want to ensure it *always* looks the same *everywhere*, you use PDF

    There isn't one good reason in the entire world to make sure a document "looks the same everywhere."

    What we need is that the document is (1) readable, (2) orderly, and (3) conforms to the reader's needs.

    When you have someone with poor vision, you don't want some tiny font used for anything, and zooming the page blows the context right out the window. The reader needs to be able to set the font, and the color(s), and the link colors, if any, and the document width, and quite a few other things.

    PDF is unfriendly and the very idea that the author has to set the absolute look of the document reeks of elitism, misplaced "artistic" intent at the expense of readability and usability.

    And then there is editing -- a document you can't edit and/or annotate is crippled -- and PDF encourages this unfriendly behavior.

    The ideal solution at this point in time is, has been, and is likely to remain, HTML, which resolves every one of those critical problems.

    --
    I've fallen off your lawn, and I can't get up.
    1. Re:PDF is total fail. by theArtificial · · Score: 1

      There isn't one good reason in the entire world to make sure a document "looks the same everywhere."

      Printing.

      What we need is that the document is (1) readable, (2) orderly, and (3) conforms to the reader's needs.

      Define we. What my clients need is a file to look exactly like they sent. Like it or hate it in the real world PDF solves this quite well.

      PDF is unfriendly and the very idea that the author has to set the absolute look of the document reeks of elitism, misplaced "artistic" intent at the expense of readability and usability.

      Elitism!? "How dare you print something for me that doesn't look right!" It doesn't need to be friendly, it needs to perform.

      And then there is editing -- a document you can't edit and/or annotate is crippled -- and PDF encourages this unfriendly behavior.

      Not every tool is ideal for every task. I liken WMV to PDF, ideal for final presentation, which is why they are sending it to me.

      The ideal solution at this point in time is, has been, and is likely to remain, HTML, which resolves every one of those critical problems

      Indeed. Enjoy your random note looking documents!

      --
      Man blir trött av att gå och göra ingenting.
    2. Re:PDF is total fail. by fyngyrz · · Score: 1

      Printing.

      HTML documents are printable. Fail.

      Define we

      Everyone.

      What my clients need

      Here, let me fix that for you: What your clients want... because they are misdirected... is a document that only looks one way. You should disabuse them of that notion, really. That's what I do. Failing that, I tell them to go away.

      Indeed. Enjoy your random note looking documents!

      I'm not an interior decorator. I'm not concerned about matching the writing tone to the fonts, and I don't flap my wrists in false artistic frustration when a sentence moves down an extra line or wraps to another page. I'm concerned that the document be readable, editable, and flexible. I want to be able to set a larger font for readability, to control what emphasis means, and be able to edit the document if I find that useful. PDF doesn't provide this, so PDF is no good. And, although it's not a problem of mine, PDF isn't an accessible format, so that's just more suck.

      --
      I've fallen off your lawn, and I can't get up.
    3. Re:PDF is total fail. by theArtificial · · Score: 1

      HTML documents are printable. Fail.

      I think you're confusing "printing" with the narrow definition of printer, I am referring to publishing. You can take the bus instead of owning a car but buses don't nearly provide the same level of accessibility or convenience of a car.

      Everyone.

      Obviously.

      I'm not an interior decorator.

      Some people are and in many professions precision matters. Contracts are an excellent example of this. It appears you're unfamiliar with corporate branding requirements or trademark usage requirements. Everyone is not you. It's not about making you happy - it's about using the right tool for the job. HTML is not the right tool for publishing documents and expecting them to look the same in different environments.

      PDF doesn't provide this, so PDF is no good

      HTML doesn't provide a method for ensuring consistent display on multiple computers. HTML is not good. See how that sounds? Thank god we have different tools! If the internet was built by people like yourself we wouldn't have any options regardless of technical merit.

      --
      Man blir trött av att gå och göra ingenting.
    4. Re:PDF is total fail. by elettronik · · Score: 1

      Managing a small IT lab for a college, I can say PDF is the *simple* solution to get a document printed properly, even if it was edited with OpenOffice or Word 2007. Different machines give different output from the same HTML, RTF, or doc source, so users blame me, saying "I used a different font, the layout was different". Tell me how HTML can look exactly the same on different OSes, or simply at a different resolution, without going back to the mid '90s (Oh sorry... I remember the browser wars... write once, test everywhere...). So for most people, for whom layout and font are important matters, PDF is the solution. On the other hand, if you need accessibility, you can distribute a plain text copy of the document, so the user can *choose*: accessibility vs. a well-formatted document.

  37. b.a.c. by jtrainmf · · Score: 1

    It's hard to read when you're wasted, right Australia?

  38. ummmm by uninformedLuddite · · Score: 1

    Isn't the web a visual medium and not suited to the blind? What is going to happen to audio files for the deaf? What about big words for those without education? This is a very precarious and slippery slope. Once you bend over backwards for one minority you have set a precedent. Don't think that this won't be used to help destroy the net as we know it today. It won't be too long before your blog won't be allowed until it has been thoroughly scanned by software to determine its PC friendliness. Your birdwatching blog won't be allowed as it isn't available to the blind, or maybe your heavy metal appreciation blog either, as it isn't deaf friendly. If you haven't worked out that Net Neutrality isn't what you should be worried about then you aren't paying attention. It's the PC police you need to worry about.

    --
    The new right fascists are bilingual. They speak English and Bullshit.