Domain: wotsit.org
Stories and comments across the archive that link to wotsit.org.
Comments · 33
-
Re:Microsoft cant do that
The reason - they don't have any documents describing the formats.
Except, they do. They've released specs for at least Word97, RTF, and PowerPoint's file formats, the OLE container format, and the Excel chart format. The docs were hosted on MSDN for a few years, even. I'm not saying that these docs are perfect or anything (they're far from it), but they're a decent start. I say this as someone who has used the docs to implement popular F/OSS tools that read and write these formats.
http://www.wotsit.org/list.asp?fc=10
http://www.wotsit.org/list.asp?fc=6 -
Re:Ummmm why?
If you click the 'I agree' it takes you to download some file that ends in ".DOC" - since I couldn't find any specifications for *that* file, I wasn't able to read them.
B.S. The docs were available on MSDN for years, and are now available elsewhere as well. How'd you think that OOo, AbiWord, KWord, and the like (largely) got their DOC support to where it is today?
http://wvware.sourceforge.net/wvInfo.html
http://www.wotsit.org/search.asp?s=text
Don't spread FUD. You don't know what you're talking about enough to do an effective job of it. -
Re:Vector Graphics
The WMF format is very well understood. Microsoft isn't hiding anything about it. Check out wotsit.org for 3 good documents about WMF (including one from MS).
-
Computer embroidery
It happens that my job is writing software for controlling these machines, and for making the designs, as well. Unfortunately for your purposes, I am mostly familiar with the industrial versions of the machines, which don't use the same file formats as the home machines.
The first thing you need to know is that the process of converting from an image file to an embroidery file is a lot more complicated than you would think, and requires a human being to do a lot of it if you want anything like a good result. Most conversion programs you will find will convert between the embroidery formats, but are fairly useless for conversions to and from image files.
Ann The Gran is a site that is oriented toward the home market you seem to be in, and it has a fair amount of useful information and programs you can buy.
Amazing Designs also has software that may be of use to you, including auto-digitizing software, which sounds like the feature you need. There are other sites as well, I just mentioned these because they sell some of the software I work on, so I don't really have an unbiased opinion of them.
If you are serious about trying to generate your own files, Wotsit has a partial description of the format for PES files, which is what you mentioned you have. That description is not sufficient for you to do anything with, but if you look further up the page, the description of Melco files, one of the industrial formats I am more familiar with, is somewhat more useful. The Melco description is also incomplete, but does contain enough information to create a complete functional embroidery design, and there are certainly programs available which will allow you to convert from Melco to PES and back.
I don't really recommend this approach, because it is a LOT harder than it looks, and even after you understand how the files work, you have not even begun to understand the best way to actually generate a design to sew on the machine. I've been working full time on this type of software for over ten years and have only learned the rudiments of 'punching', which is the term for creating embroidery designs. On the other hand, I've never really been that interested in punching, either. Creating software which allows others to 'punch' has kept me busy enough...
The term 'punching', by the way, is a leftover from the days when the embroidery machines read their stitch data directly from a paper tape, and the designs were created by punching the holes directly on the paper tape. I'm one of the few people I know who can read the tapes by eye... It's not a particularly useful skill now. -
Re:Conversion Filter?
RTF is open standard...
No, it's a proprietary format. RTF files generated by Word are much like native Word files but with text-based tags instead of binary ones. They are a little more portable than native Word files. Microsoft does publish a specification (489K zipped Word document) for RTF, but I doubt that it covers all the details. -
Re:PDF = open format, won't go away
And here is the URL:
Portable Document Format Reference Manual Version 1.3
(near the bottom of the page) -
wotsit
wotsit regarding file formats
And I'm drunk -
Good site for file formats
A little off topic, but http://www.wotsit.org/ is a good resource for file format documentation.
-
.3DS and .DXF
It may be my bias as a long-time 3DStudio user (dating back to R2 for DOS), but my opinion is that the .3DS file format (not .MAX) seems to be widely supported (e.g., Lightwave, Rhino, Multigen Creator, Sense8 WCT, ElectricImage, etc.) as an import/export option.
DXF is supported by more packages, but isn't as feature-rich (no textures, hierarchies, smoothing groups).
k. -
Re:All of them should be
the specs for quark, MS Word (.doc), framemaker, flash, shockwave, etc. are not.
As far as I'm aware, all of those have been documented to a greater or lesser extent. MS published the file formats for Word 6 and Word 8 on MSDN (see http://www.wotsit.org for details). Equally, Macromedia published the Flash 5 SWF file format, Adobe published the FrameMaker MIF format (can't find it online, but it's in the printed docs), and I believe the Quark file format is also documented. These are far from complete (no Word 2k, no Flash 6, no Framemaker native format, etc.), but at least the basics are there.
-
Re:Insightful or useless banter?
you'd imagine they would use an open document format.
Care to expand on how PDF isn't an open format? It's fully documented by Adobe in the book "PDF Reference" (ISBN: 0201615886 for the current 1.3 version, or 0201758393 for the soon to be released 1.4 version). It's also available online in various places, for example, http://wotsit.org. Furthermore, several independent implementations of PDF encoders and viewers exist, such as xpdf and ghostscript. Yes, many PDFs include LZW compressed data, but that's a problem with Unisys, not Adobe, and there are non-patent-infringing ways of uncompressing the data anyway. Plus, modern PDFs are compressed with the patent-free deflate algorithm. So exactly how much more open do you want PDF to be?
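To illustrate the deflate point: stream data that a PDF marks as FlateDecode is zlib/deflate data, so a stock zlib library can unpack it. A minimal Python sketch follows; the function name and the sample bytes are made up for illustration, and the sample is a stand-in compressed here, not a stream pulled from a real PDF.

```python
import zlib


def inflate_flate_stream(raw: bytes) -> bytes:
    """Decompress the bytes of a FlateDecode stream (zlib-wrapped deflate)."""
    return zlib.decompress(raw)


# Stand-in for stream data: compress a tiny page-content fragment ourselves,
# then recover it with the same freely available algorithm.
content = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET"
stream_bytes = zlib.compress(content)
recovered = inflate_flate_stream(stream_bytes)
```

A real reader would first have to locate the stream inside the file and honor any extra filter parameters, which this sketch does not attempt.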
-
Some funky googling...
PBE, huh?
It sounded familiar, and still does - so I started thinking it might stand for "Portable Bitmap Exchange" - some googling on that started yielding things - so do some more googling on "Portable Bitmap" and the PBM format.
It could be that PBE files are nothing more than renamed PBMs, perhaps. There does seem to be software out there that will do conversion, if it is a PBM (pbm2gif, etc). Also, if it is PBM, or you suspect it, the format spec can be found here.
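One quick way to test the renamed-PBM guess before hunting for converters is to look at the first few bytes of a file. PBM files start with the magic number "P1" (plain text) or "P4" (raw bits), followed by the width and height, with "#" comments allowed in the header. A rough Python sketch, where the function name and the 512-byte header limit are arbitrary choices of this example:

```python
import re
from typing import Optional, Tuple


def sniff_pbm(data: bytes) -> Optional[Tuple[str, int, int]]:
    """Return (variant, width, height) if data looks like a PBM, else None."""
    magic = data[:2]
    if magic not in (b"P1", b"P4"):
        return None
    # Blank out '#' comments in the header region, then read width and height.
    header = re.sub(rb"#[^\r\n]*", b" ", data[2:512])
    match = re.match(rb"\s+(\d+)\s+(\d+)\s", header)
    if match is None:
        return None
    variant = "plain" if magic == b"P1" else "raw"
    return variant, int(match.group(1)), int(match.group(2))


# A tiny hand-written plain PBM, 3 pixels wide and 2 pixels high.
sample = b"P1\n# comment\n3 2\n0 1 0\n1 0 1\n"
result = sniff_pbm(sample)
```

To try it on a scanner file, read the file in binary mode and pass the bytes in; a None result means the file is not a PBM under another name.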
Also, in response to the poster who couldn't find anything on these scanning stations by googling on "KV-F510" - google on "KVF510" - and you will get a few hits (but still close to none). However, to the original poster, maybe these "sellers" of the equipment might be able to give you some help (manuals, contact phone numbers, etc).
I am surprised that the manual or info you have with the equipment doesn't list the specs for the format, unless it is considered a dire secret, or what. Most manufacturers of the larger equipment give pretty detailed interfacing specs, or at least provide them to interested parties - you might contact Panasonic and ask. If they refuse you, ask them what the format is, or such. State your needs clearly. Maybe even offer them a copy of the package back in exchange?
Above all, be persistent - you will find the answer eventually... -
Re:It's a hard battle
1: The documentation is late, so of course filters for old versions can be done, but new versions are not publicly documented, yet.
No. The documentation has been around for a while (years). You can see here: http://www.wotsit.org/search.asp?s=text that there are references to the Word 6 format as well.
2: The documentation has some sort of licensing provisions that are unacceptable, therefore is effectively useless for building a competitive product.
No. There are no license restrictions to writing filters for MS Word file formats that I know of.
3: The only good programmers work for Microsoft. So even with documentation, nobody else can make import/export filters that work well.
Well, good programmers don't necessarily work for MS but it's a big format and it's not a task for a hobbyist coder. But I think the main problem is that there is a somewhat inappropriate focus on rendering the output. IMO an internal representation should be chosen such that it can be traversed like a tree and output in any format. Writing a converter is then a matter of interpreting the attributes of a node in the tree (a paragraph, an image, a sequence of characters) and generating the appropriate output, whether it be ps, html, or most importantly another internal representation of a document used by another office package such as star office. -
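The tree idea from the comment above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the Node class, its three node kinds and the two writer functions are invented for this example and do not come from any existing Word filter.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List


@dataclass
class Node:
    kind: str                      # "document", "paragraph" or "text"
    text: str = ""                 # only used by "text" nodes
    bold: bool = False             # a sample character attribute
    children: List["Node"] = field(default_factory=list)


def to_html(node: Node) -> str:
    """Walk the tree and emit HTML."""
    if node.kind == "text":
        body = escape(node.text)
        return f"<b>{body}</b>" if node.bold else body
    inner = "".join(to_html(child) for child in node.children)
    if node.kind == "paragraph":
        return f"<p>{inner}</p>"
    return "\n".join(to_html(child) for child in node.children)


def to_plain(node: Node) -> str:
    """Walk the same tree and emit plain text, ignoring attributes."""
    if node.kind == "text":
        return node.text
    separator = "\n\n" if node.kind == "document" else ""
    return separator.join(to_plain(child) for child in node.children)


doc = Node("document", children=[
    Node("paragraph", children=[
        Node("text", "Hello "),
        Node("text", "world", bold=True),
    ]),
])
html_output = to_html(doc)
plain_output = to_plain(doc)
```

The importer only has to build the tree once; each additional output format is then one more function that walks it.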
Information on File Formats
Apparently information on the Word file formats either is or was available on MSDN. I pulled up a 560K HTML file spec from Wotsit's Format, a file format information site.
Also of interest may be LAOLA, which is "a collection of documentations and perl programs dealing with binary file formats of Windows program documents." The link to that came from Wotsit's as well.
-
Re:MS Word format
M$ posted the specs to Word documents for version 8 (aka Word97) on the MSDN site and on the MSDN CDs. The documentation has since been pulled from the web site, but is now available from other sites such as wotsit. The program wvWare is an open source program that can read "word 2000, 97, 95 and 6 file formats." I assume that the ability to read Word2000 files is based on reverse engineering.
-
Re:one possible good result of this:
Apart from the occasional CV I never use the filthy format myself, but the specs for this and most other Office file formats are freely available.
You can get them on msdn (membership is free, apparently) if you're that way inclined or better still head over to binary Valhalla Wotsit.
You only have to glance at the specs to see why they're now moving to XML. -
Check out wotsit.org
wotsit.org has info on lots of file formats, including JPEG.
Hey everybody, check out The MACE project! -
The Wotsit File Format Collection
The Wotsit File Format Collection has specs for almost any format you could want.
Using the search on their page I pulled up 6 documents that contain all the specs and other information on the JPEG.
-----
If my facts are wrong then tell me. I don't mind. -
Re:Use a jar!
Wotsit's Format has the JAR file format specifications. Java supports class file storage in ZIP archives also, but they're not the same thing.
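For what it's worth, a JAR is a ZIP archive that typically also carries a META-INF/MANIFEST.MF entry, so the difference can be checked with an ordinary ZIP library. A rough Python sketch; the helper names are invented here, and since the manifest is optional in practice, treat the check as a heuristic only.

```python
import io
import zipfile
from typing import Dict

MANIFEST = "META-INF/MANIFEST.MF"


def make_zip(entries: Dict[str, bytes]) -> bytes:
    """Build a small in-memory ZIP archive for demonstration purposes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, payload in entries.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def looks_like_jar(blob: bytes) -> bool:
    """Heuristic: a valid ZIP archive that contains a JAR manifest entry."""
    buffer = io.BytesIO(blob)
    if not zipfile.is_zipfile(buffer):
        return False
    with zipfile.ZipFile(buffer) as archive:
        return MANIFEST in archive.namelist()


plain_zip = make_zip({"Hello.class": b"\xca\xfe\xba\xbe"})
jar_like = make_zip({
    MANIFEST: b"Manifest-Version: 1.0\n",
    "Hello.class": b"\xca\xfe\xba\xbe",
})
```

Both archives open fine with any ZIP tool; only the second one passes looks_like_jar.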
-
Re:Another font editor
This is a Motif based BDF (Bi Directional Font?)
BDF is the Adobe Bitmap Distribution Format for fonts. You can get a copy of the spec at http://www.wotsit.org.
-
Its bloody well pretty much done...
Listen again and again this comes up, and again and again I make the point that my wv does read .doc format. Abiword uses this for their .doc import. KWord uses a munged copy of it too. It is not perfect, but it does support versions 6, 95, 97 and should handle 2000 as well.
Its GPLed, granted it needs work. So scoot onto the abiword mailing list and cvs down the latest version, get hacking on it and sort it out.
ole2 is fully sorted out with libole2, excel is being handled by gnumeric.
What is not handled by wv is not by lack of documentation or design, its simply a matter of spending some time at it. Easy peasy. Info on the MSDN docs can be got from here. They can be gotten off the MSDN 1998 July cd, or you can get some of them from wotsit.org. I even wrote ivt2html for you to convert the office.ivt file into html. Like what else do you need.
90% of all the hard work has been done, wv can parse fast and simple with no bother to it, which was a nightmare to do, it can construct the correct PAP (paragraph properties) and CHP (character properties) for a given run of text. Feed you the correct characters and charset and font, the TAP (table properties), graphic properties and handle to graphics. The correct OLE handle for embedded objects. Document properties etc. There is an example html conversion program included for reference (wvHtml).
I put together libwmf to convert wmf files into something useful as well. There's a half done implementation of an Escher (the graphics for Office) importer floating around in there as well.
There's also an implementation of a Summary Stream displayer for all ole2 documents.
I even bust my ass and dragged together the right bunch of motivated people to help implement the decryption module for word 97, 95 and 6, and that was not fun at all to say the least.
The hard work is done, if you want something improved you have a very very solid base to work from. Yes the spec is confusing, yes it's not a great format, yeah it sort of moves over time, but in a fairly rational way that can be supported with some work. There are any number of equally crap formats with weak documentation supported in various tools.
There is just this false myth that the Microsoft formats are impenetrable and/or not available. Just download wv, fair enough there might be problem documents, if there are, just debug wv and get onto the abiword list and work it out with them. If something fails it can be fixed and improved, it's not a case of "ah well, its a MS format, nothing can be done". If you truly want to handle Microsoft formats there are a number of people working on it that you can help.
So its right there for the right bunch of motivated people to work on. C.
-
MS Word File Format is here
At Wotsit. Microsoft Word 6.0, 8.0, Word 97, and Palm Pilot doc files were all reverse engineered.
-
A Site About File Formats
-
Re: CAB File Format
A working implementation of the CAB inflater and details of its format can be gotten from the freeware DUMPCAB program. It was quite trivial to convert the decompression routine to cross platform code. I did this for my ivt2html utility to convert those pesky proprietary Microsoft InfoViewer .ivt files to html. .ivt uses the exact same MSZIP compression mechanism as CAB.
MSDN is not really a realistic resource for useful data for interoperability with windows. There are a few nuggets spread thinly about the site but it is awesomely hard work to track them down, links are forever moving around and the search engine sucks. Formats are usually described in terms of their windows api interfaces and MS always invents new terms for existing standards and mechanisms. Concise and complete descriptions are hard to find.
On the other hand people are very quick to assume that a format is secret or not documented, this is not always true so it is a very good idea to check msdn before simply lying back and saying "ack its proprietary, we can never support it". There are a lot of MS formats which could be supported right now from working with the available documentation. Simple examples which I did some fiddling with include the wmf format, emf format, pe and ne executable formats. In addition windows and dos programmers have often made source available to parse some of the undocumented formats already and just need some massaging to make that source crossplatform. And note that there's ole2 stream support for linux as well, so that's no barrier.
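As a small taste of how approachable some of these formats are, several of them can be told apart from their first bytes alone. The Python sketch below checks three well-known signatures: the 8-byte OLE2 compound file header, the placeable WMF key, and the "MZ" executable stub whose field at offset 0x3C points to a "PE" or "NE" header. The function name and return labels are invented for this example, and it identifies files only; it parses nothing.

```python
import struct

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")   # OLE2 compound document
WMF_PLACEABLE_MAGIC = bytes.fromhex("D7CDC69A")  # placeable WMF key


def sniff(data: bytes) -> str:
    """Guess a Windows file type from its leading bytes."""
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    if data.startswith(WMF_PLACEABLE_MAGIC):
        return "wmf-placeable"
    if data[:2] == b"MZ":
        if len(data) >= 0x40:
            # Offset 0x3C of the DOS stub holds the offset of the new header.
            (offset,) = struct.unpack_from("<I", data, 0x3C)
            if data[offset:offset + 4] == b"PE\0\0":
                return "pe"
            if data[offset:offset + 2] == b"NE":
                return "ne"
        return "mz"
    return "unknown"


# Hand-built stand-ins, not real files: a 64-byte DOS stub followed by a header.
stub = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40)
fake_pe = stub + b"PE\0\0"
fake_ne = stub + b"NE"
```

Pointing sniff at the first few hundred bytes of a .doc or .xls file from Office 97 should report "ole2", since those files are OLE2 containers.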
Wander over to wotsit.org and take an unsupported windows format and write a linux converter today.
C.
-
RE: MS doesn't publish their specs, WRONG
I am blue in the face from repeating that they have published their specs, you can get them on the July 1998 MSDN cd, they had them on their website for over a year. They can now be got from wotsit.org. These are the Office97 formats, in addition wotsit also has the word 6 spec.
Also my wv project has a passable word reader that abiword is using as a word importer, and gnumeric has quite a good excel importer
C.
-
Re: Microsoft File Formats, *are* available
Indeed they are already documented, and I'm blue in the face from repeating it. No one wants to hear it, and very few, despite their constant gripes about the lack of support for them, see fit to aid the various projects that are importing them, namely:
Right now, gnumeric can import excel, and could do with more help
abiword uses wv to import word documents, kword also uses wv though they have seen fit to branch off their own version to do so. It too could do with more help
These specifications are also available on the July 1998 MSDN Microsoft Developer CD. They are stored on this CD in the proprietary .ivt format, but nonetheless I've even implemented an .ivt to .html converter for you to read those files under linux.
Get ivt2html and convert that office.ivt on disk 3.
Alternatively wotsit.org has versions of many of them as well, including the word 6 file format which wv can handle as well
Now if someone wanted to do something constructive but wants to start small, then rather than sitting around on their arse blathering uselessly they could take a look at the public specs for mathtype and put together a linux equation edit file format to mathml converter which both abi and kword might use as an importer for equations.
Or they could help enhance libwmf to convert wmf files into svg format.
Its not just the office formats that are the problem, its the fact that they all embed or are based upon, or otherwise require the ability to convert all the secondary windows formats as well, so theres loads to see and do
C.
-
Re:Can't wait to get my hands on the Windows sourc
I would hope that all file formats would get released 100% into the public domain, so that all software instantly becomes Office compatible (okay, give them a few weeks). That will be a major problem for M$, as long as they are forced to keep their file formats open.
Same with their APIs
Just a little hint...
The Office 97 file format (which is the same as the Office 2000 file format, by the way) is available on:
MSDN Library January 1999.
It's also on http://www.wotsit.org.
Just thought I'd let you know... because it hasn't seemed to be any kind of major problem for M$ so far...
Simon -
Re:What about Dead Formats?
Is anyone aware of a repository for "current" file formats
Perhaps something like Wotsit, but without the offsite links?
-
Re: Cookie Format
Go to www.wotsit.org and search for cookie. It's explained there.
C.
-
Re:AutoCAD may be too entrenched...
...dwg file format compatibility would be needed, but that's probably the most un-open file format there is.
Did you check wotsit? (search for DWG)
That said, I think the primary file format should be zipped XML. DXF hails from the days of visicalc, i.e., DIF begat DXF - it's hard to find file formats that suck more. I wouldn't expect DWG to be much better, though I haven't looked at it. Generally, when you go spelunking through these 1980's era PC file formats you'd better bring your barf bag.