I was on vacation when Adobe announced that the latest 1.7 edition of the venerable PDF Reference was to be shepherded towards an ISO Standard via AIIM.
I checked Adobe’s own FAQ for the move to ISO; it is certainly forthright. Perhaps like most others in this business, I next thought: “great idea, just what all the stick-in-the-muds have been waiting for.” Some worthy contributions have appeared in the blogosphere, especially from RedMonk, and the comments on Scobleizer and Duane’s World are also interesting. Overall, the news has been well received.
My participation in AIIM’s PDF/UA (Universal Accessibility) Standards Committee for the past two years has provided some perspective on PDF in real-world standards development terms. Like anyone else, software developers like to set lofty goals for their more considered efforts, while committees tend to punt on the harder stuff. Developer Committees, on the other hand, are notably forthright in their respect for the concerns of unabashed and articulate consumers of the eventual end products. The trick is getting everyone together in the same room – and conceptually, on the same page. Most end-users don’t talk developer-ese, and software standards, of necessity, are developer-oriented documents.
Moving PDF to ISO means that Adobe Systems will submit the format to industry control, allowing third parties to contribute to the future development of the PDF specification as equals of Adobe itself. One “interested party”, one vote; something of democracy brought to software development.
Why would Adobe do this? First, the PDF Reference, which describes the guts of PDF in programming terms, is already a freely available document, and has been since 1993. The decision to “go for ISO” rests in very large part on Adobe’s success with its PDF products. At the same time, Adobe’s publishing the Reference in theory allowed others to make equally capable software for creating, changing or viewing PDFs. Indeed, the decision to release the PDF Reference was one of the few key moves by which PDF became the de facto standard for electronic documents in the first place.
If it hasn’t always appeared to act consistently with the ideal, Adobe knows that a world awash in PDF is very likely to be a world where Adobe Systems remains a large and important company, not merely a vendor of high-end desktop software to niche markets. PDF remains key to this plan, and releasing PDF to ISO is Adobe’s best possible move in taking that plan to the next level.
At the standards level, regulatory and competitive pressures blend. For many of the same reasons as Adobe, Microsoft has driven its own Standards agenda for OOXML, and it is in a maze of lobbying, legal and technical races with ODF, PDF and other Standards contenders to satisfy various industry bodies and government regulators as to how genuinely open and practical it really is.
Of course, the lowly customer just wants everything to harmonize such that (for example) an authoring document’s structure is entirely preserved in the final-form version. These various standards efforts aren’t structured to complement each other — that would be far too useful. There is competition in standards, as in software, and electronic document technology is a major battleground.
At the end of the day, after the years of hard work that everyone acknowledges are required to move PDF from Adobe’s PDF Reference 1.7 to a full-blown ISO Standard, developers will gain clear and reliable guideposts for their own PDF application development, and their customers will gain new confidence in their technology investments. The result will be more competition: lower prices, better products and more choices for end-product consumers. In the face of OOXML, XPS and ODF, this was a vital move for PDF, and it comes none too soon.
Turning PDF over to the formal open standards process will have one counterintuitive effect, and that is to raise the cost of entry into PDF software development, starting with the fact that ISO documents cost money to read, and the PDF Reference has always been free. ISO’s PDF will set a higher bar for developers than exists today. If consumers demand “ISO PDF”, then third-party toolkits will have to be thoroughly revised and updated and PDF developers will have to hit the books. It will become harder to slap together a “make-do” PDF file. The market may compel software developers to support the Standard, and thereby raise their game. These are well-understood and properly aligned incentives: the reason one develops software standards in the first place.
Apart from the massive investment in PDF technologies as demonstrated in their Acrobat and LiveCycle products, Adobe retains an ace: Reader. With the ubiquitous PDF viewer safely in its hands, Adobe retains the capacity to activate capabilities in Reader using keys stored in an ISO-specification PDF file. Others may try to wrest away Adobe’s position as the provider of the best (free) Reader for PDF, but it won’t be an easy task.
One theory has it that going the standards route will allow Adobe to lock in advantageous market positioning for a decade or more. One option would be to eventually turn Reader into Validator, beginning with carefully worded warnings about “pre-ISO” files and eventually simply refusing to open shoddy PDFs just because they have a .pdf extension. It’s the PDF Standard, not the Reader Standard, you dig?
Let’s set the context for PDF as an ISO standard by reviewing the existing standards work on PDF technology.
PDF/X and PDF/A, the PDF standards work-products that have made it “out of Committee” thus far, are exclusionary by nature. For the most part, they tell you what you may NOT do in your PDF files: PDF/A for archival and PDF/X for high-end print purposes. It is, therefore, relatively easy to write software to check and, if necessary, correct PDFs to comply with PDF/X or PDF/A-1b. Engineers LIKE these sorts of standards, for they grapple with very specific, programmatically addressable concerns.
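To make the “exclusionary” idea concrete, here is a deliberately naive sketch in Python. It assumes a hypothetical helper, `find_prohibited`, and a small hand-picked list of features that PDF/A-1 disallows (encryption, JavaScript, LZW compression). It only scans raw bytes for telltale names; a real checker would parse the file’s object structure and test far more rules, so treat this as an illustration of the principle, not a validator.

```python
# Hypothetical illustration only: a real PDF/A checker parses the PDF object
# structure instead of scanning raw bytes, and tests many more rules.
PROHIBITED_MARKERS = {
    b"/Encrypt": "encryption is not allowed in PDF/A-1",
    b"/JavaScript": "JavaScript actions are not allowed in PDF/A-1",
    b"/LZWDecode": "LZW compression is not allowed in PDF/A-1",
}


def find_prohibited(pdf_bytes: bytes) -> list[str]:
    """Return one message for each prohibited marker found in the raw bytes."""
    return [
        message
        for marker, message in PROHIBITED_MARKERS.items()
        if marker in pdf_bytes
    ]


# A made-up fragment standing in for a real file's contents.
sample = b"%PDF-1.4\n1 0 obj\n<< /Filter /LZWDecode >>\nendobj\n"
for problem in find_prohibited(sample):
    print(problem)
```

The point of the sketch is the shape of the rule: each requirement is a “thou shalt not” that software can test for mechanically, which is a large part of why engineers find such standards tractable.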
In PDF/UA (Universal Accessibility), we are concerned with what happens when users must employ screen-readers and other sorts of “assistive technologies” to make use of a document. The most obvious example is blind users, who must use screen-reader or Braille transcription software to read.
It is with this focus that PDF/UA confronts directly an issue that also menaces PDF/A-1a and will haunt the PDF Standard as well. This is, in brief, “semantics”, or the logical structure of document content, as opposed to the physical representation of objects on a page. A properly structured document allows viewing software to accurately deliver the author’s intent in the reading method preferred by the user. A poorly structured document, however intelligible to standard-issue eyes when printed, may deliver gibberish when the contents are abstracted for any other purpose.
Quite apart from government’s ability to insist on addressing these concerns, there is a notable business case for preserving content semantics as well. Besides accessibility to disabled users, other benefits for highly accessible documents abound, including improved navigation and search functionality, as well as content repurposing for mobile and other devices and enhanced interoperability generally. Ensuring that document semantics are (a) accurate and (b) persist into the final form is indisputably good long-term policy for documents.
Whether and how the PDF Standard will come to address these questions, or will leave them entirely to PDF/UA and PDF/A-1a, remains to be seen. For the moment, it appears that Adobe and AIIM want all Standards efforts to proceed in their given tracks, which certainly seems like the right course for the present.
According to their FAQ, Adobe would like to see the 1.7 Reference become an ISO standard within 12-30 months. Can it be done? If done, what would it mean? There’s a significant gap between the language and terminology of the current PDF Reference and the characteristic language of software standards. As one expert put it to me: “The Reference gives you the words, but doesn’t teach you how to write.” From requirements vs. recommendations to validation tools and implementation notes, there’s a lot of work to be done.
Even without ISO recognition, PDF stands as today’s de facto standard for final-form electronic documents. Trust is a precious commodity, and Adobe’s PDF has earned the trust of computer users everywhere for consistency, reliability, ubiquity and relative safety.
Like Microsoft’s Office, PDF and the Adobe Reader are deeply embedded in the fabric of the desktop computing experience. Also like Office, PDF’s present-day dominance generates concerns over proprietary technology. Adobe accurately regards such concerns as retarding official recognition and adoption of PDF, and at a sensitive time. Just in December 2006, Microsoft obtained ECMA approval for their massive and much-criticized OOXML specification for Microsoft’s Office applications (and little else), now headed for ISO.
As I pointed out earlier, Adobe’s early choice to publish the PDF Reference was a major factor in the format’s climb to this justifiably proud status. The notable downside was early industry adoption of an imprecise, incomplete Reference document. This spawned a major problem (which continues to this day): the lack of any real conception (programmatic or otherwise) of what constitutes “valid” PDF.
Millions of PDF files are created around the world every day. The results from a large fraction of these PDF creation events aren’t going to meet any likely ISO-PDF Standard anytime soon, yet most of these same files represent “good faith” executions of the PDF Reference, one way or another.
This article reflects some conversations on these matters with a few third-party industry heavyweights, the sort of people who will sit on AIIM’s PDF committee alongside Adobe’s own representatives.
Let’s get one thing clear up-front: I work with, listen to and learn from a wide variety of software developers, but I’m not one myself. I’m a customer. I use, complain about and (occasionally) commend the tools developers build. Some of my best friends (as the saying goes) are software developers, very good ones, in fact. Many are specialists with PDF.
When asked, they agreed that the style and structure of the current PDF Reference simply isn’t ready for ISO as-is. As noted earlier, today’s PDF Reference isn’t written in the typical language of ISO Standards. The Reference makes no differentiation between normative requirements, recommendations and statements of current practice in Adobe’s Acrobat tool. It’s hard to see how one gets PDF to an ISO Standard without undertaking quite a bit of this rather heavy lifting.
The current ISO PDF Standards, PDF/X and PDF/A, are both under 30 pages. Both address highly focused subsets of the 1,300-page PDF Reference 1.7. The ISO PDF project will dwarf those “miniStandards”.
The necessary task of editing the PDF Reference in the timeframe Adobe envisions (12-30 months) seems likely to mean full-time work for a small group of highly talented people. Will Adobe dedicate these resources? If it does, will anyone else be able to dedicate the resources necessary to keep up with them? But perhaps these are boring practical considerations. Let’s look at the hummer.
The earlier iterations of the Reference were (to be plain) short and loose. Even the current 1.7 version Adobe is taking to ISO contains many ambiguities. Yet since it published the PDF Reference for anyone to use, Adobe bound itself to respect as many as possible of the ways a PDF could be (mal)formed. And they are legion.
In the very, very beginning, Adobe charged USD$50 for the Acrobat Reader instead of giving it away. Had it retained that strategy, the company wouldn’t exist on the business desktop today. The decision to make Reader both free and freely distributable has played perhaps the single most significant role in the success of the PDF format, and of Adobe Systems itself.
Apart from being free, Adobe deliberately made Reader rugged enough to open almost any PDF file, regardless of how corrupt. It had to. Remember, the Reference gave (gives) few rules for building PDF files. Mainly, it offers just the “pdf building” vocabulary itself. When Adobe released the Reference it set itself up for a tidal wave of dubious PDFs, and not only from third parties. Modern document authoring software offers users so many possibilities that Adobe’s own PDF creation software itself often doesn’t know exactly what to do.
In an important sense, then, it is wrong to say that PDF is the de facto standard. “The de facto standard for PDF validation is Adobe Reader,” says Appligent’s CTO, Mark Gavin. “Unfortunately, Adobe Reader is not, and never was intended to be, a validation tool.”
Adobe makes Reader do the software equivalent of a triple back-flip to ensure that the application will open pretty much anything it’s asked to open. That is why Reader is a relatively large and resource-intensive application compared to “alternative” free PDF viewers such as Foxit. Those applications don’t even try to open any heap of bytes with a .pdf extension, but Reader does.
Adobe had realized that compatibility is far more important than formal compliance. Since in the early days the company still needed to evolve the Reference, Adobe made compatibility a matter for the Reader, NOT a matter of the Reference. As a result, the company now finds itself propping up a legacy of rock-solid support for old and vague documentation. Adobe consequently maintains a zero-revenue Reader that chugs, while the rest of the industry enjoys a free lunch and looks forward to a free dinner: Adobe’s continued expenditure on developing new features for the PDF Reference.
Adobe has never released a “PDF Validator”, a tool to document compliance or deviation from an “ideal” PDF. If a PDF Standard is to mean anything, it would mean that a validation tool could be built by someone other than Adobe Systems. But it wouldn’t be welcomed everywhere.
Martin Bailey, CTO of Global Graphics, says: “If [Adobe] brought out a genuine validation tool today the howls of protest from third-party vendors, their customers (and probably Adobe’s CS team!) would be extremely painful, even if it strengthened PDF and its associated ecosystem in the longer term.”
I wonder. How better to get the painful moves accomplished than by tossing the issue into a Committee? For Adobe, the trauma to third parties and their customers will all be defensible. After all, the Committee is responsible for the Standard that begets the Validation Tool, not Adobe. Moreover, the Committee’s work is very much in the long-term interest of consumers. It’s the classic “price of progress”.
Stephan Jaeggi, co-author of PDF/X-3 and Technical Officer of the Ghent PDF Workgroup, welcomes Adobe’s intention to submit PDF to ISO. “This gives all existing and developing standards based on PDF like PDF/A, PDF/X, PDF/E and PDF/UA a more stable base,” Jaeggi says. “On the other hand it will certainly slow down the development of PDF.”
Martin Bailey agrees, and concluded our discussion with the following thought: “There are clear reasons why Adobe has to take PDF into the standards arena, and clear benefits for users and vendors alike. On the other hand I’ll be interested to see how the time-scales of iterative standards development interact with development of new versions of Adobe’s PDF-based products; that’s going to be a fascinating planning process for all of us.”
Sarah Rosenbaum, Adobe Systems’ Director of Product Management, graciously agreed to answer a few questions on short notice.
Q: Is Adobe basically “done” with the PDF Reference?
A: Adobe will continue to innovate and grow the PDF file format, with the added benefit of stakeholder input via the standards organization.
Acrobat and LiveCycle are important businesses for Adobe, and PDF will continue to evolve through our ongoing investment in innovation as demonstrated by PDF 1.7. Adobe stewarded the PDF specification since 1993, evolving the PDF file format based on customer needs and Adobe will continue to do so in the future. The PDF Reference will continue to be available as it is now until the full PDF 1.7 specification has been fully ratified by ISO. At that time, ISO and the organizations working towards the PDF standard effort will determine where and how the PDF Reference will be accessible and amended.
Q: Adobe’s FAQ states: “New features and changes to PDF…will be developed within ISO.” How will Adobe handle updates to PDF outside of ISO, if any? Will there be an ISO PDF and an Adobe PDF, or an ISO PDF with Adobe Extensions?
A: Once ratified as an international standard, all updates to PDF will be stewarded entirely by ISO. Adobe plans to participate on the technical committee. Just as Adobe does with other standards, the company evaluates new features and includes them in its products as it makes sense according to the objectives established by our product teams. Adobe may include features in upcoming products that are not in the PDF standard.
Q: Does Adobe envision a day when the “mainstream” Reader will only support ISO Specification PDFs? What about Standard or Professional?
A: One of the great features of Adobe Reader and Acrobat is that they are backwards compatible with regards to opening PDF files. You can open PDF files created with the latest Acrobat 8 software and PDF files created with the first version of Acrobat, as well as those created by third-party vendors. We don’t envision a time that there will be a version of Reader and Acrobat that only open the ISO standard of PDF.
Similar to how Reader and Acrobat can now open PDF/A and PDF/X (ISO standards), the proposed ISO standards PDF/E and PDF/UA, as well as PDF files that aren’t ratified standards and those from third-party vendors.
Q: Will Adobe release, co-develop or otherwise sanction an ISO PDF Validation Tool, and release a set of validated test files for the first draft of the ISO Specification?
A: Adobe will work in conjunction with AIIM and the ISO working group on the standards process. All tools and test files will be coordinated within the guidelines set forth by the standards committee.
Q: Adobe’s FAQ states that Adobe will submit the “full PDF 1.7 Specification” to AIIM. Are there any exceptions, for example, 3D?
A: The PDF 1.7 specification that is posted to Adobe.com and was referenced in the press release is what is being submitted to AIIM. Everything in that document is included in what is being submitted.
There is certain functionality referenced by the PDF file format that isn’t being released to ISO for standardization. Additionally, there are several other specifications that PDF employs, such as XML Forms Architecture (XFA) and JPEG (image file format) that are under the control of other standards organizations.
Unsolicited advice is rarely well-received, just as few good deeds go unpunished. Offering one’s precious technology to a Standards Committee is one surefire way of becoming a punching bag, as Bob Sutor has ably demonstrated to Microsoft’s grief. Offering considered advice to a major corporation (and occasional consulting client) in public is a good way to get frozen out! Nonetheless, the announcement was made, so this seems like the time. If Adobe were to ask my opinion today, this is what I would say:
- It appears some work is required before 1.7 can realistically become a Standard. Accept that, then dedicate substantial writing, deep technical and other resources to supporting and expediting a rigorous Standards process. The whole industry, regulators and customers will applaud.
- Commit to synchronizing Adobe products to the ISO PDF Specification. Is it desirable or even healthy to think in terms of “getting in front of” the Standard? After all, Adobe still has Reader with which to leverage revenue-generating products. Making this commitment will clarify and reinforce the future of PDF.
- Focus on products! There is so much more to be done with Acrobat and related software, even if PDF itself stayed at 1.7 for years to come.
By Duff Johnson