Converting TeX to Microsoft Word?
belmolis asks: "For many years I've done almost all of my writing in TeX. This has increasingly caused problems with publishing in journals. For a long time, many journals reset what you sent them, so they didn't care what program you used. More and more, I find, they do, and in most cases, what they want is MS Word. Is there any good way to convert TeX to Word?"
"I've seen some advertised. Some only work with LaTeX, which doesn't help. One claims to use a full-scale TeX interpreter, but my queries as to whether it can handle home-brew Metafont fonts, PIC graphics etc. have gone unanswered. These products also all seem to be plugins for MS Word. I don't use MS Windows or any other MS products, and hate WYSIWYG word processors (I hated Bravo before it was reincarnated as Word) so a Word plugin is not a great solution, even if it works.
Furthermore, I wonder what exactly these programs do. If they interpret the TeX and then generate very low level Word, that may result in a document that looks similar, but a journal editor probably won't be able to edit it the way he wants to. In some cases the editor can be persuaded to accept a camera-ready PDF, since it turns out that the publishers often want PDF and the reason the editor wants Word is so he can edit the text, but when the editor can't or won't budge, is there any alternative to reformatting the document entirely in Word or a clone?
The larger question this raises is, where are we going? Even if formats are open, translation is difficult if they are only commensurable at a very low level. Is the solution to write in something very abstract like DocBook? And if so, will the market go this way?"
The F/OSS LaTeX2rtf is probably your best bet. It converts cross-references, turns EPS pictures into JPEG or PNG (pdflatex users will be happy to know RTF supports JPEG and PNG), renders equations as either an EQ field or a bitmap picture, and does tables right. It isn't perfect, but it is good.
Most journals I've worked with accept TeX/LaTeX or PDF files, given that you use the journal's .sty file (which they supply). I've never seen a scientific journal which doesn't accept LaTeX output. Some don't accept MS Word.
If it's only a few journals, I guess no respectable researcher would submit to those, so just submit to better journals.
You're not going to get as good output from Word as from TeX, so just forget about keeping the document ready for print. The journals will change the layout anyway. You need only keep the basic structure: paragraphs, chapters, lists, figures, etc. And footnotes.
I would try converting to HTML instead of Word (and maybe to Word from HTML). There are several command line tools that claim to do this. Since YMMV and all that, I can only suggest that you try it yourself. It shouldn't be too time consuming.
2. Eat printed code.
3. Wait 12-24 hours.
4. Collect the word docs at "the other end".
--
"we live in a post-ideological world..." - Billy Bragg.
Write what? It's not that Word is a bad WYSIWYG, it's that WYSIWYG is bad per se. It's not a matter of taste. LaTeX is MUCH more productive, gives better results, and lets you concentrate on content rather than fighting with Word about format details. Fighting, because Word keeps changing the breaks, formatting, and stuff.
- The document format is application specific.
- Although you can use styles, few people know this, leading to unstructured documents.
- Even if you use styles, the format is still a bastard between page layout and structured layout, leading to unstructured documents.
This leads to a lot of extra work for the designer. For instance, if you use Quark, all italics have a tendency to get lost when you import the text. If you use Unicode, it often gets fubar'ed. All the habitual errors from the user (very few people know how to use Word properly) that Word hides because it's a bastard show up again when you do the page layout, and have to be fixed.
So why do journals insist on Word documents? Because InDesign and those other apps have to support Word in some way, and do. But don't expect that turtlenecked designer to know how to handle TeX. So yeah, we should all accept that the world revolves around Microsoft, not around sound technical decisions (or aesthetic ones, for that matter).
I have a large application written in Common Lisp. It makes heavy use of macros and is written in a functional paradigm. Also, it uses a sophisticated code-walker macro to optimize the code and convert it to CPS, and includes a full Java JVM written in Lisp to ease training new hires, as well as a type inference engine. About 50% uses CLOS multimethods and "around" methods.
However, my new manager only knows Visual Basic on Windows 95. How can I translate? I'm pretty sure it's not a "1-to-1" port. For instance, how do I do continuations in VB? Thanks!
If your journal is telling you that they won't accept LaTeX, tell them you won't submit your articles anymore, thank you very much.
In physics we have it good due to the existence of the arXiv, where we put our articles first. Therefore journals are already limited by the fact that your article is already published on the web, and they have to accept the consequences of that. For example, they cannot have too draconian copyright terms. I know in many disciplines the situation with journals is much worse. But remember, journals are totally dependent on us, the scientists, and not the other way around. With the advent of the web and email we can disseminate our work to our colleagues and perform peer review all without the intervention of a journal.
The physics community accepts LaTeX as the standard, and people are (rightfully) suspicious of articles which appear on the arXiv in only .doc or .pdf format.
So, I suggest you keep using LaTeX, investigate adding a section to the arXiv for your specialty, and tell your journal that they will accept LaTeX or be replaced.
-- Bob
1^2=1; (-1)^2=1; 1^2=(-1)^2; 1=-1; 1=0.
In fact, here are some of his papers: http://www.billposer.org/papers.html
In soviet russia, You ask not what country do for you, but what you do for country!
Oh wait...
Linguistics doesn't get the same kind of funding as the natural sciences and engineering, so no, we often don't have assistants to handle this kind of thing. Anyhow, I tried to hire a grad student to do the conversions and didn't get a single response. I guess they're better off financially than when I was a student.
Compromise a little, use LaTeX. :-)
You can probably live with the crushing limitations relative to using TeX.
And if there's no other way, then use MS Word; it's character building (bad pun intended). I'd say that it won't kill you, but if you have a lot of equations it might. After about 15 pages of equation-intensive stuff you end up using the find function instead of scrolling because it gets so bogged down. It also regularly decides that your equation-laden document won't fit on the XX or so gigabytes of free space on your hard drive. It has a long-standing bug that causes it to miscalculate the size of some formulas, so that no matter how much space you have left on your drive it won't save your document until you remove the offending equation segment. Hilarious, I know. I'd send a document with the problem in it to MS so that they could see the bug, but then I can't save the document to send it to them. Chuckle chuckle. Those funny guys at MS have such a great sense of humor. They're worth every hundred dollar bill I send them for their fine products (sarcasm intended). What's really over the top is that people look me straight in the eye and tell me that they never have a problem using Word. Since all my friends are completely honest about anything regarding their computer use (oh dear, more sarcasm, must be past my bedtime) you can probably safely ignore my ranting.
I've started using Publicon by WRI. Interesting product. A little bit beta. If you feel like just saying f&$k the editors then this is something that you might like to dink around with even though you say you don't like WYSIWYG. Given your other proclivities I'd suggest taking Publicon for a spin around a document or two. It also claims to export TeX or LaTeX or both, and it uses a bibliography database and a bunch of other nice stuff. It has a Mathematica front end so it's a nice outlining tool too. The cell thing takes a little getting used to but I've come to really like it.
Unfortunately, most of the converters will do only a subset of the markup languages & so few (if any) will work well with custom macros.
The Chikrii TeX2Word MIGHT do it. TeX4ht may also be worth a try (->HTML/XML, which can easily become other formats). Can't comment on TeXPort. Those are really your only options. If worst comes to worst, you can also look for PS/PDF->Word solutions, but those are just as bad as (La)TeX->Word.
The first key to productivity is that you are comfortable in the environment. Additional keys are that it is expressive, doesn't force you through tedium & allows you to script away as much tedium as possible. Certain people ARE more comfortable with LaTeX & know it well enough (and use the right tools) such that it isn't tedious. The most tedious parts about LaTeX are not knowing how to do something (which is combated by knowledge or good tools or good code to steal) and compilation errors (which are combated by knowing the syntax well, by using editors that prevent/fix/point out errors, and by compiling frequently, sometimes in the background). LaTeX is CERTAINLY more scriptable than Word & automating references & formatting can be quite trivial. An example I recently used was a solution to placing a series of dozens of figures & captions. It is easy to generate the plain text code to do this. It is less easy to write a VBA script in Word. LaTeX is also more reusable & versioning CAN be better. In short, people CAN BE PRODUCTIVE in LaTeX.
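To make the figures-and-captions point concrete, here is a minimal sketch of that kind of generator in Python. It is not the script I actually used; the file names, captions, the 0.8\textwidth sizing and the fig: label prefix are all made up for illustration, and it assumes the document loads the graphicx package.

```python
from pathlib import Path


def figure_block(path: str, caption: str) -> str:
    """Return one LaTeX figure environment for the given image file."""
    # Hypothetical convention: the label is the file name without extension.
    label = Path(path).stem
    return "\n".join([
        r"\begin{figure}[htbp]",
        r"  \centering",
        r"  \includegraphics[width=0.8\textwidth]{" + path + "}",
        r"  \caption{" + caption + "}",
        r"  \label{fig:" + label + "}",
        r"\end{figure}",
    ])


def figure_series(figures) -> str:
    """Join figure environments for an iterable of (path, caption) pairs."""
    return "\n\n".join(figure_block(path, caption) for path, caption in figures)


if __name__ == "__main__":
    # Illustrative input only: three made-up plots.
    figures = [(f"plots/run{i:02d}.pdf", f"Results for run {i}.") for i in range(1, 4)]
    print(figure_series(figures))
```

Redirect the output to a file and \input it from the main document; changing the layout of every figure is then a one-line change in the script.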
Products with shallow learning curves have simple interfaces. It is true that Word has an easier-to-understand GUI than many of the LaTeX GUIs. More importantly, it is (whether we like it or not) omnipresent & most administrative assistants already have some experience with (or at least knowledge of) it. Shallow learning curves do mean increased productivity for the novice. They don't translate to increased productivity for ALL users or ALL applications.
Other journals accept or even require PDF -- it cuts down on the MS virus problem and guarantees correct rendering, unlike what you get with the diverse MS Word formats.
I've been looking over your comments in this discussion, and also comparing this to what my girlfriend deals with (she's working on a linguistics PhD, and uses LaTeX for much of her work for similar reasons to you). I get the impression that you strongly prefer a "programmatic" approach to WYSIWYG, and ultimately you mostly produce plain-text-ish files with a wide range of characters, some limited formatting, and various custom diagrams. You also sound pretty technically competent generally. Is that about right?
If that's the case, then have you considered going the XML/XSLT route? I don't say this to be buzzwordy; I actually designed and maintain a fairly large web site that uses a custom XML schema to define the content (easily editable by our non-technical people so certainly possible for you) and then XSLT to do various clever tricks with it. We generate HTML output, but you could apply many of the same tools and techniques we use to generate a mostly-plain-text format that could be conveniently imported into any word processing package instead, Unicode glyphs and such included.
If you're willing to invest a few days of effort to develop the system, I can't see why you couldn't write a fairly simple customised mark-up language for yourself. You could use character entities or tags to access the Unicode glyphs for all your linguistic symbols, so instead of \phoneticsymbol, you now just need &phoneticsymbol; or <phoneticsymbol/>, depending on how clever/context-sensitive you need the interpretation to be. You can mark up document structure in much the same way as you would with TeX-based macros. Potentially, you could even define shorthand ways to represent common types of diagram as well: SVG plays nicely with XML, is rapidly becoming a viable graphics format in its own right, and might provide a convenient intermediate format to convert your diagrams into any common format required by the journal staff.
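As a rough illustration of the tag-to-glyph idea, here is a small sketch. It uses Python's standard xml.etree module rather than XSLT, purely to keep the example short and runnable. The tag names (schwa, eng, esh, para, ex) are hypothetical; you would choose your own vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical symbol tags mapped to the Unicode glyphs they stand for.
GLYPHS = {
    "schwa": "\u0259",  # LATIN SMALL LETTER SCHWA
    "eng": "\u014b",    # LATIN SMALL LETTER ENG
    "esh": "\u0283",    # LATIN SMALL LETTER ESH
}


def render(elem) -> str:
    """Flatten an element to text, replacing known symbol tags with glyphs."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag in GLYPHS:
            parts.append(GLYPHS[child.tag])
        else:
            # Unknown tags are treated as plain containers and recursed into.
            parts.append(render(child))
        parts.append(child.tail or "")
    return "".join(parts)


def to_plain_text(xml_source: str) -> str:
    """Parse an XML string and return its text with symbol tags expanded."""
    return render(ET.fromstring(xml_source))


if __name__ == "__main__":
    print(to_plain_text("<para>The word <ex>s<schwa/>fa</ex> ends.</para>"))
```

A real system would do something other than flatten the structural tags, but the lookup-table approach to symbols carries over to an XSLT stylesheet more or less directly.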
There are apparently some quite decent editing tools available to work with XML-based documents, but it sounds like you'd have about as much time for them as me and would probably prefer to work directly with the underlying mark-up. Converting your existing TeX-based documents could probably be mostly automated if you wanted, and using a structured, text-based format to represent your document has the advantage that you can support different output formats relatively easily in the future, so you wouldn't have to do all this again in five or ten years' time.
The only non-trivial work to be done in any specific word processor would then be applying the WP's heading styles, footnotes, etc. as required by the particular journal you're contributing to. You could deal with this by including a little processed mark-up in the output from your XSLT, and writing some trivial macros in any modern word processor to search for that, and apply whatever functions needed doing to that bit of text.
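The "processed mark-up" idea could look something like the sketch below, again in Python instead of XSLT for brevity. The element names (doc, title, section, para, footnote) and the [[H1]]-style markers are invented for this example; the point is only that each structural element becomes a line with a searchable prefix that a word processor macro can find and replace with the journal's style.

```python
import xml.etree.ElementTree as ET

# Hypothetical markers for a word processor macro to search for.
MARKERS = {
    "title": "[[H1]]",
    "section": "[[H2]]",
    "footnote": "[[FN]]",
}


def emit_marked_text(xml_source: str) -> str:
    """Turn each top-level element into one line of text, prefixed with a
    marker when the element type needs word processor styling."""
    root = ET.fromstring(xml_source)
    lines = []
    for elem in root:
        text = "".join(elem.itertext()).strip()
        marker = MARKERS.get(elem.tag)
        lines.append(f"{marker} {text}" if marker else text)
    return "\n".join(lines)


if __name__ == "__main__":
    sample = (
        "<doc><title>Tone</title>"
        "<para>Body text.</para>"
        "<footnote>A note.</footnote></doc>"
    )
    print(emit_marked_text(sample))
```

The macro on the word processor side then only has to search for each marker, apply the matching heading or footnote function, and delete the marker.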
Without knowing more about the kind of documents you produce, it's hard to know whether this idea would be useful to you, but there it is for whatever it's worth. Good luck.