Slashdot Mirror


Converting TeX to Microsoft Word?

belmolis asks: "For many years I've done almost all of my writing in TeX. This has increasingly caused problems with publishing in journals. For a long time, many journals reset what you sent them, so they didn't care what program you used. More and more, I find, they do, and in most cases, what they want is MS Word. Is there any good way to convert TeX to Word?" "I've seen some advertised. Some only work with LaTeX, which doesn't help. One claims to use a full-scale TeX interpreter, but my queries as to whether it can handle home-brew Metafont fonts, PIC graphics etc. have gone unanswered. These products also all seem to be plugins for MS Word. I don't use MS Windows or any other MS products, and hate WYSIWYG word processors (I hated Bravo before it was reincarnated as Word) so a Word plugin is not a great solution, even if it works.

Furthermore, I wonder what exactly these programs do. If they interpret the TeX and then generate very low level Word, that may result in a document that looks similar, but a journal editor probably won't be able to edit it the way he wants to. In some cases the editor can be persuaded to accept a camera-ready PDF, since it turns out that the publishers often want PDF and the reason the editor wants Word is so he can edit the text, but when the editor can't or won't budge, is there any alternative to reformatting the document entirely in Word or a clone?

The larger question this raises is, where are we going? Even if formats are open, translation is difficult if they are only commensurable at a very low level. Is the solution to write in something very abstract like DocBook? And if so, will the market go this way?"

5 of 89 comments

  1. Keep it simple. by MrHanky · · Score: 2, Interesting

    You're not going to get as good output from Word as from TeX, so just forget about keeping the document ready for print. The journals will change the layout anyway. You need only keep the basic structure: paragraphs, chapters, lists, figures, etc. And footnotes.

    I would try converting to HTML instead of Word (and maybe to Word from HTML). There are several command-line tools that claim to do this. Since YMMV and all that, I can only suggest that you try it yourself. It shouldn't be too time-consuming.
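
    To make "keep only the basic structure" concrete, here is a minimal sketch in Python. It is not one of the command-line tools mentioned above and not a real TeX converter: it assumes LaTeX-style input and handles only \section{...}, blank-line paragraph breaks, and \footnote{...} without nested braces. The function name latex_to_html and the "footnote" class are made up for illustration.

```python
import html
import re


def latex_to_html(source):
    """Convert a tiny, assumed subset of LaTeX to structural HTML."""
    # Paragraphs are separated by one or more blank lines.
    blocks = [b.strip() for b in re.split(r"\n\s*\n", source) if b.strip()]
    out = []
    for block in blocks:
        # A block consisting solely of \section{...} becomes a heading.
        heading = re.fullmatch(r"\\section\{([^{}]*)\}", block)
        if heading:
            out.append("<h2>" + html.escape(heading.group(1)) + "</h2>")
            continue
        # Collapse line breaks inside the paragraph, then escape &, <, >.
        text = html.escape(" ".join(block.split()))
        # Footnotes without nested braces become marked spans.
        text = re.sub(r"\\footnote\{([^{}]*)\}",
                      r'<span class="footnote">\1</span>', text)
        out.append("<p>" + text + "</p>")
    return "\n".join(out)


SAMPLE = r"""\section{Results}

Tone spreads left\footnote{See Smith & Lee.} in most cases.
"""

if __name__ == "__main__":
    print(latex_to_html(SAMPLE))
```

    Anything beyond this subset (macros, math, fonts, graphics) passes through untouched, which is exactly where a real conversion gets hard.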

  2. Re:Grow Up? Is that an option? by BinLadenMyHero · · Score: 4, Interesting

    Write what? It's not that Word is a bad WYSIWYG, it's that WYSIWYG is bad per se. It's not a matter of taste. LaTeX is MUCH more productive, gives better results, and lets you concentrate on content rather than fighting with Word about format details. Fighting, because Word keeps changing the breaks, formatting, and stuff.

  3. Re:Let them know. by Geoffreyerffoeg · · Score: 3, Interesting

    but my queries as to whether it can handle home-brew Metafont fonts,

    Yeah--good luck with that. Metafont->TTF conversion is very tricky. Furthermore, the journals don't really like weird fonts (once they get the DOCs, they often strip ALL formatting). You can go Metafont->PostScript image->WMF/EMF. It is far from ideal.


    Let me ask...why do you need (or even have) custom fonts if you're publishing in a journal which will want its own house style anyway? If you're using them for text (in any language) or common symbols, use the journal's font, not yours. If you're using them for obscure symbols or non-text hacks with fonts, just render it into a picture and be done with it.

    And by saying TeX but not LaTeX, are you implying you're doing something in pure TeX? What can you do in there that can't be done in LaTeX and won't make an editor want to reformat it and can be reasonably exported to Word without losing the reason for it being in TeX?

  4. Stop obsessing and get back to writing. by planetfinder · · Score: 3, Interesting

    Compromise a little, use LaTeX.
    You can probably live with the crushing limitations relative to using TeX :-)

    And, if there's no other way then use MS Word, it's character building (bad pun intended). I'd say that it won't kill you but if you have a lot of equations it might. After about 15 pages of equation-intensive stuff you end up using the find function instead of scrolling because it gets so bogged down. It also regularly decides that your equation-laden document won't fit on the XX or so gigabytes of free space on your hard drive. It has a long-standing bug that causes it to miscalculate the size of some formulas so that no matter how much space you have left on your drive it won't save your document until you remove the offending equation segment. Hilarious, I know. I'd send a document with the problem in it to MS so that they could see the bug but then I can't save the document to send it to them. Chuckle chuckle. Those funny guys at MS have such a great sense of humor. They're worth every hundred dollar bill I send them for their fine products (sarcasm intended). What's really over the top is that people look me straight in the eye and tell me that they never have a problem using Word. Since all my friends are completely honest about anything regarding their computer use (oh dear, more sarcasm, must be past my bedtime) you can probably safely ignore my ranting.

    I've started using Publicon by WRI. Interesting product. A little bit beta. If you feel like just saying f&$k the editors then this is something that you might like to dink around with even though you say you don't like WYSIWYG. Given your other proclivities I'd suggest taking Publicon for a spin around a document or two. It also claims to export TeX or LaTeX or both and it uses a bibliography database and a bunch of other nice stuff. It has a Mathematica front end so it's a nice outlining tool too. The cell thing takes a little getting used to but I've come to really like it.

  5. Have you considered the X* technologies? by Anonymous Brave Guy · · Score: 2, Interesting

    I've been looking over your comments in this discussion, and also comparing this to what my girlfriend deals with (she's working on a linguistics PhD, and uses LaTeX for much of her work for similar reasons to you). I get the impression that you strongly prefer a "programmatic" approach to WYSIWYG, and ultimately you mostly produce plain-text-ish files with a wide range of characters, some limited formatting, and various custom diagrams. You also sound pretty technically competent generally. Is that about right?

    If that's the case, then have you considered going the XML/XSLT route? I don't say this to be buzzwordy; I actually designed and maintain a fairly large web site that uses a custom XML schema to define the content (easily editable by our non-technical people so certainly possible for you) and then XSLT to do various clever tricks with it. We generate HTML output, but you could apply many of the same tools and techniques we use to generate a mostly-plain-text format that could be conveniently imported into any word processing package instead, Unicode glyphs and such included.

    If you're willing to invest a few days of effort to develop the system, I can't see why you couldn't write a fairly simple customised mark-up language for yourself. You could use character entities or tags to access the Unicode glyphs for all your linguistic symbols, so instead of \phoneticsymbol, you now just need &phoneticsymbol; or <phoneticsymbol/>, depending on how clever/context-sensitive you need the interpretation to be. You can mark up document structure in much the same way as you would with TeX-based macros. Potentially, you could even define shorthand ways to represent common types of diagram as well: SVG plays nicely with XML, is rapidly becoming a viable graphics format in its own right, and might provide a convenient intermediate format to convert your diagrams into any common format required by the journal staff.
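
    As a rough illustration of the custom mark-up idea, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree in place of XSLT, purely to keep the example self-contained. The tag names (article, title, para, schwa, esh, eng) and the GLYPHS table are hypothetical; a real schema would be whatever suits your documents.

```python
import xml.etree.ElementTree as ET

# Hypothetical symbol tags mapped to Unicode glyphs (IPA examples).
GLYPHS = {
    "schwa": "\u0259",  # LATIN SMALL LETTER SCHWA
    "esh": "\u0283",    # LATIN SMALL LETTER ESH
    "eng": "\u014b",    # LATIN SMALL LETTER ENG
}


def inline_text(elem):
    """Flatten an element to text, replacing symbol tags with glyphs."""
    if elem.tag in GLYPHS:
        return GLYPHS[elem.tag]
    parts = [elem.text or ""]
    for child in elem:
        parts.append(inline_text(child))
        # Text following a child element is stored in its tail.
        parts.append(child.tail or "")
    return "".join(parts)


def to_plain_text(xml_source):
    """Render each top-level block as one paragraph of plain text."""
    root = ET.fromstring(xml_source)
    blocks = [inline_text(block).strip() for block in root]
    return "\n\n".join(blocks)


SAMPLE = (
    "<article>"
    "<title>On <schwa/></title>"
    "<para>The sounds <esh/> and <eng/> differ.</para>"
    "</article>"
)

if __name__ == "__main__":
    print(to_plain_text(SAMPLE))
```

    The output is mostly-plain Unicode text with one paragraph per block, which any word processor should be able to import.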

    There are apparently some quite decent editing tools available to work with XML-based documents, but it sounds like you'd have about as much time for them as me and would probably prefer to work directly with the underlying mark-up. Converting your existing TeX-based documents could probably be mostly automated if you wanted, and using a structured, text-based format to represent your document has the advantage that you can support different output formats relatively easily in the future, so you wouldn't have to do all this again in five or ten years' time.

    The only non-trivial work to be done in any specific word processor would then be applying the WP's heading styles, footnotes, etc. as required by the particular journal you're contributing to. You could deal with this by including a little processed mark-up in the output from your XSLT, and writing some trivial macros in any modern word processor to search for that, and apply whatever functions needed doing to that bit of text.
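
    The "processed mark-up in the output" step might look something like the following sketch, again in Python rather than XSLT for brevity. The marker strings (@@H1@@, @@FN{...}@@) and the tag names are invented for illustration; the only requirement is that they never occur in real text, so a word processor macro can search for them and apply the heading style or footnote function.

```python
import xml.etree.ElementTree as ET

# Hypothetical block-level markers for a word-processor macro to find.
BLOCK_MARKERS = {
    "title": "@@H1@@ ",
    "section": "@@H2@@ ",
    "para": "",
}


def render_inline(elem):
    """Flatten an element, wrapping footnotes in a searchable marker."""
    parts = [elem.text or ""]
    for child in elem:
        inner = render_inline(child)
        if child.tag == "footnote":
            inner = "@@FN{" + inner + "}@@"
        parts.append(inner)
        # Text following a child element is stored in its tail.
        parts.append(child.tail or "")
    return "".join(parts)


def emit_marked_text(xml_source):
    """Emit plain text with one marker-prefixed paragraph per block."""
    root = ET.fromstring(xml_source)
    lines = []
    for block in root:
        prefix = BLOCK_MARKERS.get(block.tag, "")
        lines.append(prefix + render_inline(block).strip())
    return "\n\n".join(lines)


SAMPLE = (
    "<article>"
    "<title>Vowel Harmony</title>"
    "<para>It spreads<footnote>See the appendix.</footnote> leftward.</para>"
    "</article>"
)

if __name__ == "__main__":
    print(emit_marked_text(SAMPLE))
```

    A macro would then replace each @@H1@@ paragraph with the journal's heading style and turn each @@FN{...}@@ span into a real footnote.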

    Without knowing more about the kind of documents you produce, it's hard to know whether this idea would be useful to you, but there it is for whatever it's worth. Good luck.

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.