Slashdot Mirror


Is Free Software Ready For E-publishing?

johanneswilm writes "For more than 3 years I have been writing my PhD thesis on the politics of Nicaragua. I went with LaTeX, as it is the most professional system for PDF generation, and, to make the text accessible for the editors, I used the LyX editor. Now that the publication date is near, I have found that I had to spend considerable time creating a script to convert the manuscript to formats such as Epub, as none of the available tools were quite ready to do it automatically. Is LaTeX only good for writers in the natural sciences? Is the open source community boycotting ebook formats, as Richard Stallman has proposed? Are there better tools to do the same?"

6 of 221 comments

  1. Re:You should had compared by TheRaven64 · · Score: 5, Informative

    My fourth book (Go Phrasebook) is due to be published soon. I send 3 copies to the publisher:

    • Print, PDF, generated by pdflatex. Black and white with crop marks.
    • eBook PDF, generated by pdflatex, with cross-referencing hyperlinks and colour for the syntax highlighting.
    • XHTML, generated by some code I wrote, with hyperlinks and cross references and semantic markup in the code listings generated by clang for [Objective-]C[++].

    The publisher can then just tweak the CSS for the ePub (XHTML) version. A C code listing has lots of span tags marking words as keywords, typedefs, macro uses, variables, and so on. How these are presented is controlled from the CSS, as is all of the rest of the styling.
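
    A minimal sketch of that kind of listing markup, assuming a tiny hard-coded keyword set in place of the clang-based classification described above. The class names (keyword, identifier, number) and the function name are made up for illustration; the point is only that the XHTML carries what each token is, and the CSS decides how it looks.

```python
import html
import re

# Hypothetical, deliberately tiny keyword set. A real tool (the comment
# mentions clang) would classify tokens from the parsed program instead.
C_KEYWORDS = {"int", "return", "if", "else", "for", "while", "void", "char"}

# Identifiers, integer literals, runs of whitespace, or any single character.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\s+|.")


def mark_up_listing(source: str) -> str:
    """Wrap the tokens of a C snippet in spans that carry a semantic class."""
    parts = []
    for token in TOKEN.findall(source):
        escaped = html.escape(token)
        if token in C_KEYWORDS:
            parts.append(f'<span class="keyword">{escaped}</span>')
        elif token[0].isalpha() or token[0] == "_":
            parts.append(f'<span class="identifier">{escaped}</span>')
        elif token.isdigit():
            parts.append(f'<span class="number">{escaped}</span>')
        else:
            # Whitespace and punctuation stay unwrapped but are still escaped.
            parts.append(escaped)
    return '<pre class="listing">' + "".join(parts) + "</pre>"


if __name__ == "__main__":
    print(mark_up_listing("int x = 1;"))
```

    A stylesheet rule such as span.keyword { font-weight: bold; } is then all the publisher has to touch to restyle every listing.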

    The important thing is to make sure you separate content from presentation. If you use a lot of TeX markup in your chapters, then it's hard to use anything other than [La]TeX to typeset it. If you use simple semantic markup with all of the macros defined in a document class, then you can parse the same markup easily with something else and then transform it into some other format.
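
    As a rough illustration of parsing the same markup with something else, here is a sketch that turns a handful of simple, non-nested semantic macros into XHTML. The macro names and the mapping are assumptions for the example, not taken from any actual document class, and real LaTeX (nested braces, environments, maths) needs a proper parser.

```python
import html
import re

# Hypothetical mapping from semantic macros to XHTML elements.
BLOCK_MACROS = {"chapter": "h1", "section": "h2"}
INLINE_MACROS = {"emph": "em", "keyword": "code"}

# Matches \name{argument} where the argument contains no nested braces.
MACRO = re.compile(r"\\([A-Za-z]+)\{([^{}]*)\}")


def convert_inline(text: str) -> str:
    """Escape plain text and replace known inline macros with elements."""
    out, pos = [], 0
    for match in MACRO.finditer(text):
        out.append(html.escape(text[pos:match.start()]))
        name, arg = match.group(1), match.group(2)
        tag = INLINE_MACROS.get(name)
        if tag is None:
            # Fail loudly rather than silently dropping markup.
            raise ValueError(f"unsupported macro: \\{name}")
        out.append(f"<{tag}>{html.escape(arg)}</{tag}>")
        pos = match.end()
    out.append(html.escape(text[pos:]))
    return "".join(out)


def convert(source: str) -> str:
    """Convert blank-line-separated blocks into headings and paragraphs."""
    blocks = []
    for block in re.split(r"\n\s*\n", source.strip()):
        block = " ".join(block.split())  # collapse line breaks inside a block
        match = MACRO.fullmatch(block)
        if match and match.group(1) in BLOCK_MACROS:
            tag = BLOCK_MACROS[match.group(1)]
            blocks.append(f"<{tag}>{html.escape(match.group(2))}</{tag}>")
        else:
            blocks.append(f"<p>{convert_inline(block)}</p>")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(convert("\\chapter{Intro}\n\nUse \\keyword{go} & \\emph{enjoy}."))
```

    The fewer raw TeX commands appear in the chapters, the smaller the two macro tables need to be, which is the practical payoff of keeping presentation in the document class.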

    You could use some sort of XML and generate TeX from it, but typing XML is horrible. I like to work in vim, and with a couple of macros entering LaTeX is really easy.

    --
    I am TheRaven on Soylent News
  2. Re:Easy solution by TheRaven64 · · Score: 5, Insightful

    Going through PDF is horrible. LaTeX contains a lot of semantic markup. ePub is XHTML, which is a form of semantic markup. PDF is a presentation format. So, you start with semantic markup, discard it all, and then try to generate it again by magic.

    You end up with something that looks vaguely like the PDF, but loses most of the semantic information (e.g. section / chapter breaks). Worse, you often don't want the ePub version to look like the PDF - they're aimed at different form factors.

    --
    I am TheRaven on Soylent News
  3. Boycott? I Think the Tools Merely Lack Maturity by eldavojohn · · Score: 5, Insightful

    Others have told me that the financial gain of publishing an academic book may be up to 700 USD. In comparison to current Scandinavian wages that really means very little, so I don’t think that earning another 700 USD should be a motive to restrict the access to one’s thoughts.

    First of all I would like to commend you and thank you for this sentiment.

    Is the open source community boycotting ebook formats, as Richard Stallman has proposed?

    I don't understand. Stallman decries e-book formats that aren't open, and there are many open e-book formats, including ePub. Granted, there are tools out there (like Calibre) that allow you, with varying degrees of success, to crack and convert to these formats, but why bother? As you can see in that table, most everyone supports PDF. You are misunderstanding Stallman's gripe. It's not that we are boycotting e-books; it's that e-book makers are trying to carve out their own proprietary section of the electronic market, readers and creators included. So let them take their ball and play elsewhere. As you noted in your blog, this isn't the only problem:

    Most ebook-readers out there do not implement the Epub-standard perfectly. That means that although one has an Epub that follows all the standards, one can be quite sure that it will not display properly on all the readers. Kovid Goyal, the creator of the Calibre ebook management software, has done a good job in creating conversion scripts that create Epubs for all the different readers. Unfortunately they do this by breaking compatibility with the standard, and many distribution sites will only check whether your Epub complies with the standards and not whether the book will actually look good in the reader.

    Most readers handle PDF, so I would just stick to the output of LaTeX. I might suggest that your expectations are misdirected at the open source community and might be better directed at the makers of readers that apparently force you to break standards. It's the IE6 conundrum all over again.

    Stallman didn't suggest boycotting ebook formats, just the DRM associated with them (big surprise there). The problem you are experiencing is that sometimes it's difficult to go from one open standard to another. The tools are lacking in maturity, and I'm guessing that since my Android phone can easily display PDFs for me, there aren't a lot of people demanding this ePub support, which apparently needs multiple flavors for each device (and Calibre helps you with this). The tools exist but they'll only get you so far, and I think the really special stuff that LaTeX does well is what you'll find yourself needing to fine-tune in the end product. Look at how long it's taken LaTeX to get that beautiful and I think you'll discover that making a magical cure-all converter to ${random format} can be a non-trivial task.

    If you start a kickstarter and get your university to donate hosting to making an open free market for any academic papers in any open format, I'd definitely throw in $20 (I've spent about $200 on kickstarter in the past two years). Either that or maybe throw your lot in with arxiv and work with them to fund more format support?

    --
    My work here is dung.
  4. Re:...PROFIT!! by Khan+Fused · · Score: 5, Insightful

    1. Realise no scripts exist for problem
        1.1 Realize that someone writing a thesis on Nicaraguan politics may not know how to program
        1.2 Begin learning to program
        1.3 Spend more time learning to program
    2. Write scripts
        2.1 Divert time from PhD thesis to write scripts
        2.2 Spend more time (diverted from PhD program) learning to program sufficiently to write workable scripts to solve stated issue
    3. Release scripts as open source
        3.1 Fail to complete PhD thesis in time due to time spent programming

    --
    This mind intentionally left blank.
  5. Re:Easy solution by digitig · · Score: 5, Insightful

    The trouble is, PDF is a pretty rotten format for e-readers, because it's all page-layout oriented and so produces output that doesn't scale well for different screen formats and text sizes. It's the wrong format for the job. And DVI has pretty much the same problems. The problem isn't that free software isn't ready for ePublishing -- Calibre and Sigil do the job well. The problem is that there's a disconnect between the assumptions LaTeX makes about a document and the assumptions that are valid for ePublishing. Sorry if it's restating the blindingly obvious, but you didn't want the best system for PDF generation, you wanted the best system for PDF and EPUB generation, and that probably isn't LaTeX.

    --
    Quidnam Latine loqui modo coepi?
  6. RMS not boycotting e-books by spf13 · · Score: 5, Informative

    While he states "We must reject e-books until they respect our freedom," he also outlines 7 things Amazon's e-books do that violate this freedom. Fortunately epub is the most widely accepted e-book format and it has none of these 7.

    1. Available anonymously.
    2. Standard ownership applies.
    3. License determined by vendor, but many have very liberal licenses including CC and public domain.
    4. Open format based on HTML.
    5. Lending rules same as physical book.
    6. No inherent DRM (though Adobe has a version compatible with DRM).
    7. No one can remotely delete it any more than any other file on your computer.

    RMS isn't against e-books. He's against Amazon's approach to e-books.
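
    To make point 4 in the list concrete: an ePub is an ordinary zip archive of XHTML files, so it can be assembled with nothing but a standard library. The sketch below writes only the container layer (the uncompressed mimetype entry that has to come first, plus META-INF/container.xml). The function name, the OEBPS/ directory and the content.opf path are choices made for this example, and the package document is supplied by the caller, so this alone is not guaranteed to pass a validator.

```python
import io
import zipfile

# Tells a reading system where the package document lives.
# The OEBPS/content.opf path is a convention chosen for this sketch.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def write_container(chapters: dict, package_opf: str) -> bytes:
    """Zip XHTML chapters and a caller-supplied package document together.

    `chapters` maps file names (e.g. "ch1.xhtml") to XHTML source text.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must be the first file and must be stored
        # without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("OEBPS/content.opf", package_opf,
                         compress_type=zipfile.ZIP_DEFLATED)
        for name, body in chapters.items():
            archive.writestr(f"OEBPS/{name}", body,
                             compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


if __name__ == "__main__":
    data = write_container({"ch1.xhtml": "<html/>"}, "<package/>")
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(archive.namelist())
```

    Because everything inside is plain text in a zip, any unzip tool and text editor can inspect or fix the result.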