Slashdot Mirror


Help/Opinions on Parsing OFX Files?

innerweb asks: "I am looking for help and advice on using and parsing the OFX (Open Financial Exchange) file spec using C/C++ and/or Perl. I have read the standards, downloaded the DTD (OFX version 2), and tried to parse several files from different banks. They have all failed in my normal parsers (commercial and OSS), yet they load fine in Microsoft Money. The format is not so complicated that I cannot hand-roll my own parser, and I have much of it working that way as a proof of concept, but I would rather stick with something that is standards-based, as this is a standard that in my opinion ought to work with standards-based tools. Am I missing something here, or is this truly a file format that is broken as a feature?"

"I know the files are malformed when they come down, as they are missing the normal XML and SGML file headers (<?xml or <!DOCTYPE) that define the DTD to use to parse the file. I know that the document is not 'well formed' as I understand it, as most of the tags in the data file are not closed (open tag, but no corresponding closing tag). When I fix these errors, the files seem to parse. Yet, from what I have seen, MS Money takes in the same raw data and parses it. Microsoft lists the OFX file format as XML in some places and SGML in others. The OFX website seems to be saying this is SGML, not XML (XML is a subset of SGML in most cases, but the way it is *used*, sometimes it is not really SGML at all).

I have been reading like mad for a few weeks on OFX file formats and usage, but I am not getting much useful information. I have worked with both SGML and XML in the past, so I am at least familiar with these *conventions*. I need to be pointed in the right direction and/or told what I am doing wrong or overlooking. I know it is probably something obvious, but somehow I am not getting it.

Thanks in advance for any help that you can throw my way."
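The repair described in the question (skip whatever precedes the markup, close the unclosed tags, then hand the result to a standard parser) can be sketched in Python. This is a minimal sketch under stated assumptions, not a full OFX reader: the function name `close_leaf_tags` is hypothetical, the root element is assumed to be a literal `<OFX>` tag, each unclosed element is assumed to carry its value on the same line as its opening tag, and values are assumed to contain no bare `<` or `&`.

```python
import re
import xml.etree.ElementTree as ET

# An opening tag followed by inline text, plus its closing tag if one is present.
_LEAF = re.compile(r"<([A-Za-z0-9_.]+)>([^<\r\n]+)(?:</\1>)?")


def close_leaf_tags(raw):
    """Return well-formed markup for an OFX-like document (hypothetical helper).

    Assumes everything before the first <OFX> tag is a non-XML header
    and that unclosed elements hold their value on a single line.
    """
    start = raw.find("<OFX>")
    if start == -1:
        raise ValueError("no <OFX> root element found")
    body = raw[start:]

    def fix(match):
        value = match.group(2).strip()
        if not value:
            # Only whitespace after the tag: a container element, leave it alone.
            return match.group(0)
        tag = match.group(1)
        return "<{0}>{1}</{0}>".format(tag, value)

    return _LEAF.sub(fix, body)


def parse_ofx_like(raw):
    """Repair the markup, then parse it with the standard-library XML parser."""
    return ET.fromstring(close_leaf_tags(raw))
```

Something like `parse_ofx_like(text).iter("STMTTRN")` would then walk the repaired tree with ordinary standards-based tooling; the element name there is only an illustration.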

3 of 49 comments

  1. Isn't it obvious? by Pan T. Hose · Score: 2, Insightful

    If you have written code that works better than the open source code you have tried, but you'd rather use said open source code, isn't it obvious that you should send some patches? That's how open source works, you know.

    --
    Sincerely,
    Pan Tarhei Hosé, PhD.
    "Homo sum et cogito ergo odi profanum vulgus et libido."
  2. Re: I think you found your answer by brunes69 · Score: 2, Insightful

    Microsoft's parser is probably broken in a way that it doesn't look for closing tags

    Or, you could say that Microsoft's parser is much more robust in how it deals with malformed documents.

    Seriously, you can't blame MS for writing a better XML parser. Just because a parser knows that

    <foo>this<bar>that</foo>
    ...is not valid XML, does not mean it needs to choke and die. MSXML can easily validate a document as well as parse invalid ones.
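The kind of recovery described in this comment can be illustrated with a small sketch. It does not reflect how MSXML or Microsoft Money works internally; it swaps in Python's standard `html.parser.HTMLParser`, which tokenizes malformed markup without raising, and adds a hypothetical `LenientTreeBuilder` that implicitly closes any elements still open when an outer closing tag arrives. Note that `HTMLParser` lowercases tag names, so this is a demonstration of the idea rather than an OFX parser.

```python
from html.parser import HTMLParser


class LenientTreeBuilder(HTMLParser):
    """Builds (tag, children) tuples, tolerating missing closing tags."""

    def __init__(self):
        super().__init__()
        self.root = ("#root", [])
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = (tag, [])
        self.stack[-1][1].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Find the nearest open element with this name and close it,
        # implicitly closing every element opened after it.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                return
        # A closing tag with no matching open element is ignored.

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.stack[-1][1].append(text)


def parse_leniently(text):
    builder = LenientTreeBuilder()
    builder.feed(text)
    builder.close()
    return builder.root[1]
```

Fed the example from the comment, `parse_leniently("<foo>this<bar>that</foo>")` yields a `foo` element containing the text `this` and a `bar` child holding `that`, instead of an error.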

  3. Re: I think you found your answer by Knights who say 'INT · Score: 3, Insightful

    Tsk. It's called the "be tolerant in what you accept and strict in what you send" rule.