Abiword, wvWare And KWord Authors To Collaborate
An anonymous reader writes: "One important aspect of Free software is open collaboration and the pooling of efforts. There are several open source word processors available and they all need to import and export the ubiquitous MS Word format. To try and avoid duplicating efforts, developers from the Abiword, wvWare and KWord projects have been talking with regard to pooling their efforts in writing filters."
But seriously, what good does .doc format do for _anyone_? Take away the fact that g00ns all over the planet use Word, and you're left with nothing.
.doc filters are a technical solution for a social problem.
IMHO,
Personally, I use XEmacs to write all my papers in various SGML DTDs, and I couldn't be happier.
This just in! Open Source developers realize they're after a common goal, and decide to cooperate and combine efforts!
:)
(it's about bloody time)
hawk
Am I the only Slashdotter here who knows that KWord has been using XML as its native format since the beginning? Honestly, you can try this yourself.
1. Create a file.kwd in KWord. Make it complex and add pictures and stuff.
2. Rename it to file.tgz
3. Uncompress and untar it and voilà, you have an XML document and a bunch of picture files etc...
The rest of KOffice works this way. Negotiations are still on to get all the Free office suites on Linux to unite on a single file format. I like the KOffice scheme because it inherently produces small files (already compressed). Others have favorites.
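If you would rather script the three steps above than rename files by hand, here is a minimal Python sketch. It assumes the .kwd file really is a gzipped tar archive as described above; the function name `list_kwd_members` is made up for illustration and is not part of KOffice.

```python
import tarfile


def list_kwd_members(path):
    """Return the names of the files stored inside a KWord document.

    Assumption: the document is a gzipped tar archive, as the steps
    above describe. If it is not, tarfile raises an error.
    """
    with tarfile.open(path, mode="r:gz") as archive:
        return archive.getnames()
```

Calling `list_kwd_members("file.kwd")` on a document saved that way should list the XML file and the pictures without renaming anything.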
As for filters, I think we should have a separate program for importing the dreaded *.doc files and have all the office suites call this program for that task. Why should they all waste time redoing the same function that we would prefer not be needed at all? (I.e., if only MSWord were not so cumbersome and convoluted in its document formats.)
--= Isn't it surprising how badly I spell ?
IMHO, antiword is by far the best Word-to-ASCII converter out there. It even renders footnotes, can be used in pipes and is much faster than wvWare. The program is GPL and comes for a variety of OS platforms. As the moderator of a mailing list, I regularly use it to convert *.doc attachments. (One should patch majordomo so that it automatically filters *.doc attachments through antiword.) It has worked flawlessly for me for more than a year. It surprises me quite a lot that such a superb program is so little known in the Free Software community.
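Since antiword writes the converted text to standard output, calling it from a script is short. This is only a sketch under assumptions: it expects an `antiword` binary on the PATH, and the helper name `doc_to_text` and its `converter` parameter are invented here so that another command-line converter could be swapped in.

```python
import subprocess


def doc_to_text(doc_path, converter=("antiword",)):
    """Run a converter command on doc_path and return what it prints.

    Assumption: the converter takes the file name as its last argument
    and writes plain text to stdout, which is how antiword is invoked
    from a shell. A non-zero exit status raises CalledProcessError.
    """
    result = subprocess.run(
        [*converter, doc_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

A mailing-list filter could call `doc_to_text("attachment.doc")` and post the returned text in place of the attachment.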
gopher://cramer.plaintext.cc http://cramer.plaintext.cc:70
But why not also see if they can enlist Open Office? Well, that is if they will play nice.
Now the other 100.000 Open Source / GPL projects should do the same and finally produce a production quality application suite...
"I love my job, but I hate talking to people like you" (Freddie Mercury)
So there would be 3 different efforts still, but they would share knowledge with each other.
So what will they do when MS.net is up and open and people are using that?
Just imagine MS having access to your internal internet....
I don't want a lot, I just want it all!
Flame away, I have a hose!
Only 'flamers' flame!
How can the Slashdot editors criticize web-porn filters and Napster filters for blocking the wrong people when they do it themselves?
Dump the braindead heuristics. If you really want to curb AC abuse, make it so that AC posts don't appear on the main page until a logged-in user "adopts" the post and any karmic moderation that gets done to it.
--
Mod up a post Rob doesn't like and you'll never mod again
All of them already have some form of import/export. The problem is: they all suck.
But imagine the threat to Microsoft if any of them -- much less all of them -- could import and export MS Word documents perfectly.
What a world it could be, Microsoft-free.
---
I, for one, welcome our new Antichrist overlord.
Yeah, imagine what a horrible world it would be if everyone used the same format and we could interchange documents without any problems.
Uh, only if we're all using the same very latest version of Word on the same very latest version of Windows, on the same Microsoft-approved Intel-supplied hardware -- and then we get to play a big game of Simon Says -- "Microsoft says: okay everybody, time to upgrade, please enter your credit card number here."
I think what a lot of people fail to realize is that Microsoft has just as much right as anyone else to set standards.
The problem is that their "standards" follow the form of "here's our magic new standard format, it'll sorta do most of what you need, but only if you use it with our software. Don't bother trying to figure out the details of the format, because we'll change it at our whim, every so often, just to make sure that no one else's software will work with it. Even older versions of our own software won't work with the latest format, so everyone in your company will have to upgrade."
Microsoft doesn't have standards, they have proprietary formats. They don't want to promote and use open standards, they want to own the "standards". If they were willing/able to play well with others, they wouldn't be as hated as they are today.
Or did I miss something ?
While it's great to see collaboration done for importing and exporting Word documents, if they really want interoperability, they should agree on a unified document format. That is, when the different word processors from the different desktop environments save, they should save to the same file format.
The reason why Word's DOC format is so important is that it's the de facto standard in the Windows world. I'm hoping we're not looking to make it the standard in the *nix world, too.
So, it just makes sense that all the developers get together and agree on a standard format so whether or not my coworkers and I are using Gnome or KDE or whatever, we don't have to go through yet ANOTHER set of filters.
http://www.talknerdy.org
Except that XML does seem to be an actual up-and-coming standard.
Ok this is probably way off topic, well it is, but I'll put down some of my strong points in my arguments over XML, which are strongly opinionated (as is everyone's). One of the biggest problems I've seen with XML is that many have already created massive content in existing languages, whether it's XML, Python, Perl, HTML, and many have invested a large amount of money into the already existing languages.
For a company to feasibly make the move from $INSERT_LANGUAGE_HERE over to XML, their programmers would have to know it, meaning it would cost them more to pay for their education in it (even though they could learn online, please hear me out) or to hire someone familiar with XML.
Looking at the current scenario, many companies have done well without it, not to say it shouldn't be used, but just to give everyone a reminder on it. It's always going to be an extremely opinionated argument, and points/counterpoints could run on for years. The same arguments go for JAVA and others: you don't necessarily need them, for one, and just because someone uses X or X becomes a pseudo standard should not mean that programmers should focus on X and forget the core basics of it all.
UML, XML, HTML, CSS, COOL, JAVA, it all boils down to needs, and XML is not really a necessity, and soon there'll be another acronym touting the same claims as the existing ones, "The Next Best (overhyped) Thing"
Sorry if I sound like a troll; I'm trying to be as sincere as possible about my thoughts on it, without sounding anti-anything (XML, or other), just my notes on it. I think the programmers should stick with the basics without getting all fancy.
Want Root?
You're assuming things will move over to XML, and everyone is going to use it. Let us not forget about the standings when it comes to creating a so-called standard: shtml, WML, and all those other acronyms I care not to type.
Want Root?
tex exporting is already supported in a few koffice apps, including KWord. they beat you to it =)
Most commonly used to parse (unambiguous computer) languages, but a word file is a lot less complicated than a language I can assure you :)
Free Techno/Jazz/DNB/MI Music by guys obsessed with monkeys!
Why should you need it? Word has always been able to read an RTF. So if you write a document, export in RTF and send it to a Word-addicted coworker, he should be able to import it into Word with no problems.
The problem is that then he will want to send you back the modified document. If he used full-power Word (e.g. using the change bars to highlight the changes), even if he is willing to convert the doc back to RTF, lots of formatting info will be lost.
Ciao
----
FB
Which shows that there is a silver lining in black clouds, after all.
Ciao
----
FB
Exporting to TeX is straightforward. Importing TeX is very tough, because TeX is a programming language, not a representation. It's hard to do anything with TeX except run it, which renders output. This loses the document structure. The same is true of PostScript.
The OpenDWG effort is laudable, but last I checked, the public won't get source to the library. Apart from the library not being available for the platform I use, it's not very sustainable: what if they fold? What if you upgrade and the libraries are no longer compatible with your new OS?
Bert Driehuis -- All I asked was a friggin' rotatin' chair. Throw me a bone here, people.
XML is better to use than SGML. SGML is very hard to deal with and parse. XML is more strict, and less likely to have interoperability problems.
Just because it CAN be done, doesn't mean it should!
The announcement linked didn't mention XML but I agree with you--this seems like the right thing to do. For almost anything that MS Word formats you could duplicate it exactly using html+css1, and I think this should be a priority. The thing is, this would make an excellent independent project; you don't need the gurus of free office suites to muck around with this. You don't even need to know anything about their particular software at all.
I wonder if the recent propaganda assault by Microsoft is drawing the open source/free software community closer? There have been a spate of these "new cooperation" stories lately. Perhaps differences in philosophy and direction start to seem pretty minor when Microsoft conspicuously brings its ion cannons to bear...
Only reason I picked XML is that it seems to be the best choice for an intermediate language anyway.
:-)
What is needed, though, is documentation on the current MSOffice formats. (Reverse engineering for interoperability...) Probably using OpenOffice's formats would be better, but ymmv.
And you're right about not needing to be involved in the projects; all you need is a way of forcing them to use your intermediate language at gunpoint
/Brian
That might be a start...
Another point -- are we talking filtering one way or both? I'm thinking the cleanest way to go back is RTF export (which presumably already exists on all platforms) but where can you get an rtf->Word filter (probably to Word 97?)?
/Brian
And monkeys *might* fly out of my butt...
If Microsoft had any interest at all in interoperability there'd be a .doc file standard on the shelf next to Adobe's PDF definition. This is like Samba -- Andrew Tridgell wrote the original using a packet-sniffer on a DEC Pathworks server, as I recall. That's reverse-engineering for you.
/brian
And what -- sit by and never be able to handle Word documents? Unfortunately, there are still a good number of people who want to see, for example, resumes in Word format. (Even tech HR people sometimes insist on that, though I'm inclined to write them off as clueless...)
It's like being a Mac user or, I don't know, a non-American. Your average Mac can read a PC disk, but it doesn't usually go the other way. Meanwhile, your average USian speaks English and *maybe* Spanish, which means the rest of the world has to learn English to communicate with us. Good, bad, it's the reality -- it's great that Sun eats its own dogfood by using StarOffice internally, but file exchange is pretty important, and MSWord is the number one format to translate.
/brian
This actually makes a lot of sense, especially when file formats are starting to move to XML-based formats (see OpenOffice) -- just translate the Word format to XML (or whatever) as an intermediate format.
Come to think of it, this would make a great project; anyone know what would be needed to write msw2xml(1)? My Perl skills are becoming a bit rusty...
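To make the intermediate-format idea a little more concrete, here is a rough Python sketch of only the easy half: turning paragraphs that some other tool has already pulled out of a Word file into a small XML document. It does not read the .doc format at all, and the element names `document` and `para` are invented for illustration, not taken from any existing office suite.

```python
import xml.etree.ElementTree as ET


def paragraphs_to_xml(paragraphs):
    """Wrap already-extracted paragraphs in a tiny intermediate XML format.

    Assumption: `paragraphs` is a list of plain strings. The element
    names are hypothetical placeholders for whatever schema the
    word processors would agree on.
    """
    root = ET.Element("document")
    for text in paragraphs:
        para = ET.SubElement(root, "para")
        para.text = text  # ElementTree escapes <, > and & on output
    return ET.tostring(root, encoding="unicode")
```

Each word processor would then only need an importer for the one XML format, while a single shared tool deals with Word itself.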
/Brian
There are several open source word processors available and they all need to import and export the ubiquitous MS Word format
What about WordPerfect's *.wpd format? Yes, I know -- WordPerfect is available for Linux, and for free. But a lightweight, open source word processor along the lines of AbiWord or KWord would be real nice if it supported wpd files.
---
DOOR!!
I pledge allegiance to the flag...
of the Corporate States of America...
I don't know what the fuss is about. I use KMail for e-mail, and it already has a filter for dealing with .DOC attachments. It's activated via the 'Delete' button...
I love AbiWord for reading MS Word documents and writing quick letters, etc. I think it's a great program, and it reads the Microsoft .doc format quite nicely. But one thing that all open source word processors have omitted, including Open Office, is WordPerfect document support! Sure, I can get WordPerfect for Linux, but isn't the point of Open Source that you shouldn't need to be tied to a single proprietary piece of software? Isn't that what the freedom is all about?
For one reason or another, I can't get WordPerfect 8, the personal edition available for download, to install on my Linux box. Perhaps it doesn't like Mandrake 8, maybe it's my own ineptitude (I've been running Linux as my primary OS for about 4 months now), but it just won't cooperate. I wouldn't mind purchasing WP Office 2000 for Linux, but if I can't get WP8 to install, that tells me that WP2000 might suffer from the same problems. Given the average return policy of most software stores (i.e., no returns once it's open), I'm extremely hesitant to spend upwards of $100 on software that may or may not work on my machine. But I've been using WordPerfect for over 12 years now, and need WP file support. Right now, the only way I can get it is by booting my Windows partition and using WP2k for Windows.
So developers, if you're listening, Word support is great, but don't forget about those of us who haven't used Microsoft (at least for word processing) for a long time!
- Stealth Dave
--
Evil is as eval("does");
- WordPerfect, because that's the word processor I like, and it prints well for hard copies.
- html, because it's everywhere, and even M$ Word lusers can read it.
When I email my resume to someone, I attach the html version, and if the want ad specified a Word format, then I politely explain to them that I can't provide it as a [...]
Oh, and I know someone is going to protest by saying that WordPerfect can save to .doc or .rtf, but it really destroys the formatting, which to me is half the battle of getting potential employers to actually do more than glance at a resume. If they see something with the indenting trashed and different font sizes from one page to the next, all they are going to do is toss the resume in the round file.
Need a Linux consultant in New Orleans?
Cool, I hope they'll also start supporting importing and exporting to TeX. Maybe then the stuff will start looking professional*. Kidding aside, it's a shame that the word-processing crowd is ignoring the best type-setting system around. WYSIWYG documents just don't cut it compared with a doc prepared in LaTeX.
* professional as in 'professional publisher', not as in 'professional marketeer'
There are compiler-writing tools. There are GUI-building tools. There are class frameworks out there for just about everything. Maybe we need file-format interpreting Meta-Tools and some codified domain-specific knowledge for this problem.
This collaboration is a good start, if they concentrate on not only coming up with filters, but discovering HOW to come up with a good filter.
--
Libertarianism is rich wolves and poor sheep playing gambler's ruin for dinner.
Sure. If things keep going at this rate, Open Office could write a Word importer and exporter and finish it about the time MS is releasing the NEXT version of Office. Playing catch-up doesn't help set standards or even acquire market share.
I didn't know Word sucked at importing WP so much. Luckily I didn't have to import anything from WP to Word on my Mac when I bought it, because the only .WPD files I had were stuff from when I was like 5 years old messing around on the 286. Whenever I have to use a .DOC file on my parents' PC I just open the copy of Word that Compaq was so nice to install for us...
--Volrath50
What KWord and the like need are filters for other formats, particularly WordPerfect 6-10. The thing about WordPerfect is that once you get WP6's format working you can open WP7, WP8, WP9 and WP10 because Corel never changes the format, unlike MS.
Without this my Dad can't switch to KWord or anything else (doubt he would want to though, he likes WP8 too much) because he is an Auto Teacher and he has about 10 years' worth of tests and stuff in WP format dating back to WP 5.1 on a 286 and DOS 5.1. (I remember that 286. Orange and black monitor. Those were the good old days. :-) And I know WP runs on Linux but everyone that I know hates WP for Linux.
While it would be possible to convert them all to RTF or something, he has hundreds and hundreds of files, so it won't be easy or fast.
What RedHat and others should focus on is telling the Average Consumer that Windows XP is violating their privacy, among other things. Every few days I tell my dad about Windows XP's evil features (such as the Hardware ID stuff) and he considers switching to Macs or Linux more and more. But again the biggest thing keeping him back is the lack of ANY WordPerfect format compatibility (minus WP itself). The biggest thing keeping me from switching from my Mac and Word is the lack of a good consistent GUI.
I should stop rambling on and sum my post up: WordPerfect compatibility is important too!
--Volrath50
"what good does .doc format do for _anyone_? "
I agree; however, over 90% of the market uses this format. Though not the best, it is the leader and we must recognize or fight that. We can't pretend it does not exist.
I'm really happy to see this type of collaboration. It is only good for projects. I feel that KWord could benefit the most, as Abiword seems to do the .doc "thang" better for me. Glad to hear this is happening, and I hope to see more of this example.