Sunday, February 20, 2011

An analysis of EPUB3 (and, uh, a bit more)

[I swear when I’m frustrated. That makes this post obscene even by Chris Rock’s standards. Proceed with caution. Also, this was (and is) supposed to be about EPUB3, but as I kept writing it, it kept growing. Fuck it, I’ll just post it as it is.]

The IDPF published the current draft of the new EPUB3 spec a few days ago. Time to see if it was worth the wait. Note that this will be a long post.

I’ve read all the sub-specs of EPUB3, and my general feeling about them is one of… “meh”. That’s the best way I can describe it. “meh” leaning towards “not good”. I jotted down notes in my notebook while I was reading them[1], and what follows is a digested summary of my views and sentiments.

Assume I agree with everything in the specs I don’t explicitly disagree with here[2]. Also, while I’ll take this opportunity to mostly rant about the bad parts of EPUB3, keep in mind that there are quite a few good parts as well.

All hail the mighty iPad

I’ll start this section by saying how I have absolutely nothing against JavaScript on the web. That would be stupid. I mean, without JS all of this web app business would not exist, and while arguments against “cloudification” of applications have weight, no matter where you stand on the debate, you can’t say that Gmail isn’t useful.

But we’re talking about books here. In EPUB2, JavaScript execution was under RFC2119 “SHOULD NOT”; for all intents and purposes, this means forbidden. It’s not a “MUST NOT”, but still.

In EPUB3, JS support is now optional. This means you can start using JS in your epub books, yippee! You can now go all Web 2.0 on your e-books. I’ll talk about why this is bad in a moment, but first I’d like to give credit where credit is due and note that the spec text explicitly mentions that content creators should avoid using JS if at all possible. Here’s a quote:

Scripting consequently should be used only when essential to the User experience, as it greatly increases the likelihood that content will not be portable across all Reading Systems and creates barriers to accessibility and content reusability.

Sadly, no one will listen to this. But at least the IDPF has this warning, even though it won't do shit. Now that JS support has moved from "should not" to "optional", people will go out of their way to redefine "essential to the user experience" so that it includes JS. This will break horribly. We'll get epub books created solely for iBooks and all other Reading Systems can go to hell. Progressive enhancement? We will never see it. The people who create epub books are not web developers, they work in publishing. They have no idea what writing code for the web looks like (or writing code at all for that matter), so we'll see hacks upon hacks that work on iBooks pretty much by accident and on nothing else. I've always said that the day that the epub specs start mandating JS support is the day those same specs jump the shark. We're not quite there yet, but the gates of Hell are now slightly ajar.

This is what will happen:

  1. EPUB3 brings "optional" JS support.
  2. Publishers start adding crappy JS to their books hoping it will make them "stand out", "embrace the future", "fuck goats" or whatever.
  3. We now have thousands of books with JS scripts in them that are absolutely useless but whose execution is nevertheless required, otherwise reading the book is impossible. You know, things like special navigation menus, buttons to expand example source code, footnotes in "tooltip" style windows or similar "brilliant design ideas" that stop working when you don't run the book's JS.
  4. EPUB4 now demands JS support. I mean really, you can't expect publishers to go over all those crappy epubs and rework them with progressive enhancement in mind, can you? No, no, no. They'll just lobby the IDPF to make JS support mandatory, and they'll succeed.
  5. Welcome to the web circa 2000! Ah, what a fun place that was.

But I don't blame the IDPF for moving JS support from "should not" to "optional". Actually, that's a lie. Of course I blame them. But I understand they had little choice when a lot of the people who make up the EPUB Working Group are the same people who have abused the term "HTML5" to the point where it doesn't mean anything anymore. Quite literally, nothing. It’s become a whizz-bang-pow marketing term hyped into oblivion by a fruit company. This “HTML5 is love, sex and the future of human civilization” nonsense has even pushed the WHATWG into renaming the spec to just “HTML” (even though they won’t admit the reason publicly). That’s right, the term “HTML5” now officially stands for nothing. Here’s a funny link about this “HTML5 means everything”. Bruce Lawson is specifically calling out the CSS3 == HTML5 == JavaScript idiocy, but you get the point.

Interactivity in books? My God, how ever did books survive the last five thousand years without JavaScript, <video>, <audio> and <canvas>? It boggles the mind.

Publishers: "HTML5 BOOKS? MOAR! LOOK MA IM IN TEH FUTUAR! We're totally not going to go extinct now!"

JavaScript, <video>, <audio>, <canvas> in books == "This book needs more cowbell."

I know I’m being a cynic, but I can’t help myself. The iPad came along, was declared “the savior of the publishing industry” and now everyone seems to be losing their mind.

Again, “HTML5”? Great for the web. Actually, fucking awesome for the web. For e-books? I don’t remember the last time I thought “this book really needs some video”. In fact, 99.99% of all epubs would be far better off with only the most basic HTML and maybe a few lines of CSS.

I know it’s not the IDPF’s fault this is all going to be so shamefully abused, but I still think it should have all stayed at the “SHOULD NOT” level. You want interactivity in e-books? That’s not an e-book, that’s an app. Go make one. You are not going to be able to write an interactive book and expect it to run the same on all Reading Systems (or at all). That will not happen. Save yourself a lot of trouble and just make an app.

For those with the brilliant ideas of tooltip windows, custom navigation menus and the like which books would be far better without, just don’t do it. No, it does not look “sharp”. Or “hip”. Not even “trendy”. What it is, is stupid. You’re making a book, not a website. Please bear that in mind.

I’m sure that there are valid use cases for all of these technologies in books. A smart person using them appropriately can truly make something wondrous. Sadly, most people think they’re the smart ones. They’re usually not.

I remember how all this got started. Back in the old days, when EPUB was just an idea, this was the train of thought: “How are we going to represent electronic books? Raw, custom XML like in DocBook? Huh… maybe it would be better to use web tech like HTML. It is widely understood and there are ready-made components that will make it easier to build Reading Systems.”

So web tech was used because it lowered the barrier to entry. Instead of using DocBook’s <para>, why not use HTML’s <p>? We get free styling with CSS too.

But this changed. Now it’s not “we’re using web tech to make e-books”, it’s “we’re using e-books to package web tech”. It’s not about making books anymore, it’s about using web tech offline. You think I’m exaggerating? Do you know what term was used to “succinctly describe EPUB” during development of EPUB3? Here it comes: “website in a box”. I’m not kidding. It was used in the IDPF meetings and was even in the November 12, 2010 draft of the EPUB Overview document. Here’s a direct quote:

An EPUB publication can be thought of as a "website in a box".

No. No it shouldn’t be. Never ever.

No required glyph coverage

Honestly, the worst problem with EPUB2 was that there was no required Unicode glyph coverage. Let me explain what that means. On the surface, EPUB is all Unicode. Everything has to be either in UTF-8 or in UTF-16. “That’s great! This means I can use any letter I want!” Not quite. While you can specify any letter you want, Reading Systems aren’t required to display that letter. It would certainly be unreasonable to mandate that all RS’s have glyphs for the entire Unicode range, but there is no minimum coverage specified either, and specifying one would be a good thing. I was hoping this would be fixed in EPUB3, but no such luck.

With the way things are, RS’s can just support ASCII and be done with it. Some support more than that, some support only that. Yes, you can get around this problem by embedding fonts with the required glyphs in your epubs, but most people don’t know they have to. See this FAQ entry for the most popular question about Sigil. I couldn’t even begin to describe the number of people who say infinitely moronic things like “Sigil doesn’t support Unicode” because the book they saved displays as a bunch of question marks in ADE and in all the hardware readers that use Adobe’s RMSDK.

It’s not just Sigil’s problem, it’s everyone’s. People have made epubs that were tested only on the iPad, and since iBooks has fonts with wider glyph coverage than ADE, some characters in those books end up as question marks over there.
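To make the problem concrete, here's a minimal Python sketch of the kind of check an epub creator could run before shipping: list every character in the text that falls outside whatever range you assume the target RS can display. The `ASCII_ONLY` set and the `uncovered_characters` and `describe` functions are names I made up for illustration. The ASCII-only coverage is a worst-case assumption, since a real RS's coverage depends on the fonts it ships with and the spec doesn't tell you what those are (which is the whole problem).

```python
import unicodedata

# Assumption: a worst-case Reading System whose fonts cover printable ASCII only.
ASCII_ONLY = {chr(cp) for cp in range(0x20, 0x7F)}


def uncovered_characters(text, covered=ASCII_ONLY):
    """Return the sorted, de-duplicated characters in `text` that are not in `covered`.

    Whitespace and characters in Unicode category C (control, format,
    unassigned and the like) are skipped because they need no glyph.
    """
    missing = set()
    for ch in text:
        if ch.isspace() or unicodedata.category(ch).startswith("C"):
            continue
        if ch not in covered:
            missing.add(ch)
    return sorted(missing)


def describe(chars):
    """Format characters as 'U+XXXX NAME' lines for a human-readable report."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}" for ch in chars]


if __name__ == "__main__":
    sample = "Čitao sam “Rat i mir” na čitaču."
    for line in describe(uncovered_characters(sample)):
        print(line)
```

Anything this reports is a character you'd have to cover with an embedded font if you want it to survive on an RS that really is that limited.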

There should be some minimum coverage specified. One might ask "but where do we draw the line at mandated coverage? Should CJK support be mandatory? Where is the line?" You're right, those are tough questions. That's why we have a Working Group, to answer them. Too hard to draw a line somewhere? Ok, how about adding one of those shiny RFC2119 "SHOULD" statements asking for greater coverage? It wouldn't do shit, but hey, it just might.

The problem is that nobody at the IDPF seems to give a crap about this problem. That's what we get when the vast majority of Working Group members live in ASCII land, I know, but these guys are making an international standard. Show some breadth of understanding. This lack of mandated coverage is a far bigger problem with EPUB2 than "well damn I can't put video in my books". Trust me.

Living in fairyland

There are plenty of things in these new specs that are wonderful or interesting on the surface, but will never see the light of day. Things like "container-constrained" for JavaScript (great idea!) or the “epub:trigger” element (silly idea, people will just use JS). But they will never be supported on the various Reading Systems, and even if they are, no one will use them. People who make RS's are by-and-large hacks (exceptions do exist though) who slap some custom controls onto WebKit and call it a day. They won't modify WebKit to support epub-specific elements. That's "too hard". Am I the only one who remembers EPUB2's custom "switch" and "case" elements? Or inline XML islands? Or the whole of DTBook? The only RS that supported those was ADE (hats off to Adobe for trying, I really mean that). Everyone else just pretended they didn't exist. And not even Adobe implemented support for “oeb-page-foot” and “oeb-page-head”, and those were damn useful (on paper at least).

History has shown that wherever the EPUB specs went beyond what popular browser engines implemented, the specs were actively ignored. It's just "too hard". It's not, of course. It "merely" requires two things: competent developers and people who give a shit. Both of these are very, very rare. Combined? Good luck. Oh yes, it requires one more thing: actual fucking work. Not just taking an existing browser engine, making it display XHTML in pages and calling that an RS. No, actual software development is required, and the most difficult kind of all: working with a huge, foreign code base. That's too much for most.

As an example, I have to work with HTML Tidy since it's an internal component of Sigil. I can't tell you how happy it makes me to know I'll have to implement an HTML5 parser for it because of EPUB3. I'm truly ecstatic about this prospect. I fucking love the very idea of it. But I'll do it because it has to be done. And for the love of God bear in mind the difference in the quality of code between something like WebKit and Tidy.

Tidy could easily be the world's most horrible code base. It's 40k lines of straight-up C, written in the most god-awful way. 800 line functions; cryptic, single-letter variable names; hacks upon hacks that step on other hacks; source comments that are either out of date, worthless[3] or usually just plain wrong. Just... absolute, worthless junk abandoned by the original devs (and those that followed them) many years ago. Nobody works on Tidy anymore, at least not with the official project.

And yet I work with it because I know I have to. WebKit source is worked on and maintained by hundreds of people and it's extremely well written. RS developers, get off your damn asses!

To tell you the truth, I've been thinking about implementing an open-source RS for both the desktop and memory- and power-constrained devices "just to show 'em how it's done". I have some sweet, sweet ideas for it. But I can bloody barely find the time to work on Sigil and FlightCrew. A third project? I can't put the gun in my mouth fast enough.

And don’t even get me started on the “quality” of the Reading Systems out there. I remember the day when we used to complain about ADE. Today, ADE is pretty much the best RS available. Do note that I said “the best available”, not “great”. Today, I’d give it a C. Everyone else gets an F on a good day, and a kick in the balls otherwise.

The worst is certainly iBooks, as any epub creator will tell you. Ask one about their opinion of iBooks, and you can rest assured that the response will be filled to the brim with “fuck”s, “shit”s, “cunt”s, “motherfucker”s and “asshole”s. Apple loves to boast about support for open standards and how they’re important. As long as we’re talking about killing Flash. The EPUB specs can go fuck themselves. It’s not that they’re lazy, incompetent or don’t feel like investing the resources to improve their support for EPUB. It’s not about “missing” functionality. They intentionally went out and broke things in Mobile WebKit to further their agenda. If they ever tried something similar in Safari, there would be a pitchfork-mob in Cupertino.

But the number of people who make EPUB books (or work in publishing in general) compared to the number of people who develop for the web is… somewhat small. Not to mention that we as an industry are too busy sucking Apple’s dick to notice what’s going on. They can safely brick in our mouth. Oh no, Apple demands 30% off all subscriptions, in-app purchases and a lowest price? That really is the last straw. We’re now going to start sucking that dick very, very slowly in protest. That will teach ‘em!

You’d think I have something against what Apple is doing. Not at all. If someone lets you exploit them, by all means, go right ahead. Apple is screwing us only so much as we as an industry let them screw us. And now that people are starting to come around, we’re all like “OMG we have fifteen inches of Apple’s cock up our ass! What the hell happened?!”. We let it happen, that’s what. Inch by inch they kept shoving it, and we let it slide (yes, pun intended). Now we’d like them to back up a bit. Well guess what, when you have fifteen inches of cock up your ass, it’s hard to negotiate. The cock is the one setting the terms.

Weren’t we talking about EPUB3?

You’re right, I forgot. Here are some bullet points since I’m tired:

  • Greater emphasis on accessibility in the specs. Good that someone realized that the e-book movement is a godsend to people with poor vision. There was some support for this before, true, but now it's more front-and-center.
  • "xml-stylesheet" support is required? Interesting and unexpected. I doubt any RS's will actually support it though.
  • "This schema is normative. In case of conflicts between the specification prose and this schema, the schema shall be considered definitive." Hell yeah! At least now we know which is considered definitive. Trust me, you'd encounter this problem if you ever tried to implement a validator. But NVDL, RELAX NG and Schematron? May as well say "you have to use Jing". Some of us don't want to. How about providing an XML Schema schema? It's standard practice. Great, now I'll have to write my own... again...
  • Supplementary resources with <link> in <metadata>? Fancy.
  • DCMES metadata elements are being replaced by DCTERMS properties. This really is a good idea, the new system should be much more flexible. The transition period will be ugly, yes, but it's necessary. Good call on both the replacement and the transition.
  • "Although the EPUB Navigation Document is required in EPUB Publications, it is optional to include it in the spine." Yes! This will eliminate the need for those ugly "inline TOCs" people like to build where they would basically end up with two different files describing the TOC. Now the NCX is basically an XHTML document that can be styled, and if you really want it in the reading order, go right ahead and include it in the <spine>. Very nice.
  • "page-spread-left" and "page-spread-right" on <spine> <itemref>s. Nice, but how many books use two-page spreads?
  • Embedded MathML support is great. Nobody will care about the restrictions the IDPF has placed on its use. When RS's support MathML, it will be because the browser engine they use internally supports it, and that engine couldn't care less about the IDPF's MathML "restricted subset".
  • page-list nav gives support for cross-referencing an epub with the page numbers of a printed edition. This is important and as such will be used by publishers and (should be) supported by RS's… eventually.
  • landmarks nav replaces the OPF <guide>. This is also very good.
  • Media Overlays: feel free to ignore the existence of the entirety of the Media Overlays sub-spec. I really mean that, you don't even have to read it. Just pretend it's not there since nobody will ever implement it. To add insult to injury, support for it is officially optional, so nobody even needs to implement it. It's dead on arrival, much like DTBook was as a valid EPUB2 OPS syntax.

    Don't get me wrong, it would be great if RS's supported this. But they won't. Nobody ever made crazy money by catering to the visually impaired.
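Going back to the Navigation Document bullet for a second: whether the nav doc is part of the reading order is something you can check mechanically from the OPF. Here's a rough Python sketch. It assumes the nav doc is the manifest <item> carrying "nav" in its "properties" attribute and that the OPF uses the usual http://www.idpf.org/2007/opf namespace; the `nav_in_spine` function name is mine, not anything from the spec.

```python
import xml.etree.ElementTree as ET

# Assumption: the package document lives in the standard OPF namespace.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def nav_in_spine(opf_xml):
    """Return True if a manifest item flagged properties="nav" is referenced from the spine."""
    package = ET.fromstring(opf_xml)

    # IDs of manifest items whose space-separated properties include "nav".
    nav_ids = {
        item.get("id")
        for item in package.findall("opf:manifest/opf:item", OPF_NS)
        if "nav" in item.get("properties", "").split()
    }

    # IDs the spine actually puts in the reading order.
    spine_refs = {
        ref.get("idref")
        for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
    }

    return bool(nav_ids & spine_refs)


if __name__ == "__main__":
    sample = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0">'
        '<manifest>'
        '<item id="toc" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>'
        '<item id="ch1" href="ch1.xhtml" media-type="application/xhtml+xml"/>'
        '</manifest>'
        '<spine><itemref idref="ch1"/></spine>'
        '</package>'
    )
    print(nav_in_spine(sample))
```

The sample package leaves the nav doc out of the spine, which is perfectly valid; add an <itemref idref="toc"/> and the same file shows up in the reading order, no second "inline TOC" file required.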

Canonical Fragment Identifiers

This deserves a separate section.

Canonical Fragment Identifiers are ridiculous, at least the scheme presented. They're complex to the point of absurdity. Even-numbered indices so as "not to be sensitive to XML parser handling of whitespace, entity references, and CDATA sections."? This is ridiculously over-engineered. It has support for not only pointing at elements and their textual content (worthy goal), but at pixels in raster images, logical units in vector images, temporal locations in audio and video and if I understand the exclamation mark rules, even support for crossing documents.
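For the curious, here's the even-index business as I read the draft, in a small Python sketch. It handles only the element-stepping part: the n-th child element gets index 2n, and the odd numbers are left for the text in between, which is why stray whitespace doesn't shift the numbering. The `element_steps` function is my own illustration and ignores everything else in the scheme (ID assertions, character offsets, the exclamation mark indirection, images, audio and so on).

```python
import xml.etree.ElementTree as ET


def element_steps(root, target):
    """Return a CFI-style step path such as '/4/2' from `root` down to `target`.

    Only element steps are handled: the n-th child element (1-based) gets
    index 2*n. Returns '' when `target` is `root` itself and None when
    `target` is not found under `root`.
    """
    if root is target:
        return ""
    for position, child in enumerate(list(root), start=1):
        tail = element_steps(child, target)
        if tail is not None:
            return f"/{2 * position}{tail}"
    return None


if __name__ == "__main__":
    compact = ET.fromstring(
        "<html><head/><body><p>one</p><p>two</p></body></html>"
    )
    indented = ET.fromstring(
        "<html>\n  <head/>\n  <body>\n    <p>one</p>\n    <p>two</p>\n  </body>\n</html>"
    )
    for doc in (compact, indented):
        second_p = doc.find("body")[1]
        print(element_steps(doc, second_p))
```

Both documents print /4/4 for the second paragraph: <body> is the second child element of <html>, and the target is the second <p> inside it. That's the one part of the scheme that's simple; everything layered on top of it is where it goes off the rails.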

Look, you can't cram all that into a single scheme. You just can't. The WG should have just stopped at trying to point at textual content. The CFI scheme as written is silly and reminds me of the crap in SVG 1.2. Most people don't know that SVG requires support for things like raw sockets and file uploading. There's this desire in specification working groups to support every single use case imaginable and then some. Common sense goes out the door, and nobody is either willing or able to just say "NO!" to some of the requests. This is exactly the kind of thing people think of when they say “design by committee”.

This CFI scheme is absurd. The way it's designed, no one will support it. I know I certainly won't.

Suma summarum

That’s it. I’m all out. EPUB3 is nice, but most of it will be either a) misused or b) ignored. Most of that isn’t really the IDPF’s fault, but some of it is. The parts that people will support are nice and shiny, like page-list nav instead of the NCX or DCTERMS instead of DCMES.

What do I know. Ignore me.


[1] I ended up writing more than five pages, but most of the notes relate to low-level understanding of the changes from EPUB2, possible contradictions and implementation problems and the like. You know, the things I’ll need to pay attention to whilst working on Sigil and FlightCrew, but usually not things content creators need to care about.

[2] That’s probably not accurate, but let’s pretend it is. There are some things that are just not worth complaining about.

[3] "Thanks to X.Y. for reporting a problem with this function!" What kind of developer would actually write that above a function?