When I first tell someone that blind people can read PDF files, I often get a slightly puzzled expression. Sighted people sometimes appear to assume that the blind can’t read; that maybe somehow blind professors and programmers and lawyers are born, not taught.
I digress. The fact is that blind and other disabled individuals can read not only PDF files but also Word files, web pages and other electronic content, using screen readers or other types of “assistive” software or hardware, depending on their disability.
For all but the simplest documents to be considered “accessible”, their contents must be structured so that tables, images, footnotes and so on are correctly identified to the user. Without structure, documents are just a heap of words – or letters, if you prefer to assume away the structure that binds the letters and words together into paragraphs.
Sighted users get to “cheat” by gaining clues about tables, lists and so on from the page layout. Those who must use alternative reading methods to read the same document often can’t access this presentational information. They are dependent on either the structure (or lack thereof) in the document, or else the capacity of their reading software to correctly infer the structure based on a programmatic examination of the page.
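To make the difference concrete, here is a minimal sketch in Python. It is a toy model, not how any particular screen reader or PDF library works: the `Node` class, the `announce` function and the sample document are all hypothetical, although the tag names (H1, P, Table, TR, TH, TD, Figure) are borrowed from the standard structure types used in tagged PDF. The same content is rendered twice, once as the “heap of words” and once with its structure announced.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one element of a document's structure tree."""
    tag: str                 # structure type, e.g. "H1", "P", "Table", "Figure"
    text: str = ""           # text carried directly by this element
    alt: str = ""            # alternate description, used here for figures
    children: list = field(default_factory=list)


def flat_text(node):
    """What is left when structure is ignored: words in reading order."""
    parts = [node.text] if node.text else []
    for child in node.children:
        parts.append(flat_text(child))
    return " ".join(p for p in parts if p)


def announce(node):
    """Roughly what assistive software could say when structure is present."""
    lines = []
    if node.tag == "H1":
        lines.append(f"Heading level 1: {node.text}")
    elif node.tag == "P":
        lines.append(node.text)
    elif node.tag == "Figure":
        lines.append(f"Image: {node.alt or 'no description available'}")
    elif node.tag == "Table":
        rows = [r for r in node.children if r.tag == "TR"]
        cols = max((len(r.children) for r in rows), default=0)
        lines.append(f"Table with {len(rows)} rows and {cols} columns")
        for row in rows:
            for cell in row.children:
                kind = "Column header" if cell.tag == "TH" else "Cell"
                lines.append(f"{kind}: {cell.text}")
        return lines  # table cells are already handled above
    for child in node.children:
        lines.extend(announce(child))
    return lines


# Invented sample content, for illustration only.
doc = Node("Document", children=[
    Node("H1", "Quarterly Report"),
    Node("P", "Sales rose in every region."),
    Node("Table", children=[
        Node("TR", children=[Node("TH", "Region"), Node("TH", "Sales")]),
        Node("TR", children=[Node("TD", "East"), Node("TD", "120")]),
    ]),
    Node("Figure", alt="Bar chart of sales by region"),
])

if __name__ == "__main__":
    print(flat_text(doc))
    print("\n".join(announce(doc)))
```

In the flat version the heading, the table cells and the paragraph run together, and the chart vanishes entirely because it carries no text of its own. The structured version can tell the reader that a table is coming, which cells are headers, and what the image shows.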
Section 508 requires that some minimal amount of structure information be present for a document to be considered compliant with the regulation.
Although PDF content is a staple on virtually all government and corporate websites, actual Section 508 compliance assessment and subsequent correction of PDF content remain poor in federal and state government, to say nothing of the corporate world. There are three principal reasons for this.
1. Generally speaking, authors of PDF documents (who aren’t web-content managers) have no idea whatsoever about ensuring their document is accessible. Web-content managers, on the other hand, usually know something about accessibility, although they generally focus only on HTML.
2. With existing technology, PDF content is far less amenable to automated or even semi-automated accessibility evaluation and correction than content that’s delivered as tagged text (HTML, XML… all the “MLs”). There are a variety of reasons for this, but trust me, it’s true.
3. While PDFs are unambiguously “web-content”, web-content managers nonetheless tend to disregard PDFs, simply because “all they ever do” with a PDF is link to it, or facilitate the creation or emailing of a PDF. Regarding the contents of the PDF and the accessibility thereof, they generally don’t have a clue. PDF is thus the “blind spot” for web-content managers.
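As a rough illustration of why automated evaluation of PDF is hard (reason 2 above), here is about the most that a trivial script can check. A tagged PDF normally declares a `/StructTreeRoot` entry and a `/MarkInfo` dictionary with `/Marked true` in its document catalog. The sketch below, in which the function name `looks_tagged` and the sample bytes are my own invention, simply searches the raw file bytes for those markers. It is a heuristic under stated assumptions: it can miss the markers when the catalog is stored in a compressed object stream, and finding them says nothing about whether the tags are correct or useful.

```python
import re


def looks_tagged(pdf_bytes):
    """Crude heuristic: do the raw bytes contain the usual tagged-PDF markers?

    Assumes the document catalog is stored uncompressed. A positive result
    only means tags are declared, not that they are accurate.
    """
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    is_marked = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return {
        "struct_tree_root": has_struct_tree,
        "marked": is_marked,
        "likely_tagged": has_struct_tree and is_marked,
    }


# Hand-written catalog fragments for illustration; not complete PDF files.
tagged_sample = (
    b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R "
    b"/MarkInfo << /Marked true >> /StructTreeRoot 5 0 R >>\nendobj\n"
)
untagged_sample = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

if __name__ == "__main__":
    print(looks_tagged(tagged_sample))
    print(looks_tagged(untagged_sample))
```

With a real file you would pass in the result of `open(path, "rb").read()`. Everything that matters for accessibility, such as whether the reading order makes sense or whether images have meaningful descriptions, lies beyond what a check like this can see.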
Without a doubt, the tools for evaluating and correcting PDF tagging to ensure Section 508 compliance need more work. Strong third-party options for PDF tagging such as NetCentric’s CommonLook Acrobat plugin are emerging. Adobe’s Acrobat Professional 8.0 itself got a rather modest upgrade in the accessibility department – more on that in some other post. ABBYY’s FineReader and Nuance’s OmniPage OCR software are providing new tagging options for PDFs that begin life in a scanner.
The real problem at this juncture is not the developers – after all, they respond to the demand, and demand is coming, thanks in part to the new Target lawsuit. For the moment, we can safely note that it is the web-content managers who have yet to take on board what accessibility standards really mean to them, whether Section 508 or any other mandate. Section 508 doesn’t exactly set a high bar, being somewhere south of WCAG 1.0.
Reports, documentation, manuals, presentations… content needs to be tagged with semantic structure to comply with Section 508. It’s not just HTML and web pages; this applies to Word, Excel, PowerPoint… and of course, to PDF as well. Content tagging should go well beyond current Section 508 standards if a document is to be considered authentically accessible to those who must use assistive technology to read. End-users MUST learn how to structure documents correctly from the outset.
Launching the revolution in structured content is the next frontier. Let’s get on it!
by Duff Johnson