Firstly, XML is not a standard of any kind: it's a Recommendation of the W3C, which is rather different.
Secondly, it's based on SGML, which is a standard, and is very stable, so questions of volatility simply don't arise. You can quite happily base a project on XML: while it may one day be supplanted, it can easily be transformed into a new format, unlike Java :-)
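As a rough illustration of how little it takes to move XML into some other format, here is a minimal Python sketch using only the standard library. The sample order document, the function names, and the choice of JSON as the target are all assumptions made up for the example; it also ignores any text that follows a child element, so it only suits "data" type XML, not documents.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented purely for this illustration.
SAMPLE = '<order id="42"><item sku="A1">Widget</item><item sku="B2">Gadget</item></order>'


def element_to_dict(elem):
    """Convert one element (and its descendants) to a plain dict.

    Keeps the element name, attributes, leading text, and children in order.
    Text after a child element (the "tail") is deliberately ignored.
    """
    return {
        "name": elem.tag,
        "attributes": dict(elem.attrib),
        "text": (elem.text or "").strip(),
        "children": [element_to_dict(child) for child in elem],
    }


def xml_to_json(xml_text):
    """Parse an XML string and return the same information as a JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps(element_to_dict(root), indent=2)


if __name__ == "__main__":
    print(xml_to_json(SAMPLE))
```

In practice an XSLT stylesheet would be the more usual tool for this kind of job; the point is only that the markup carries enough structure to be reprocessed mechanically.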
What software you want to use depends on what you want to do. If you're doing "data" type XML for e-commerce, then XML Spy or InfoPath will be good. If on the other hand you're doing traditional "document" publishing, you need a proper text editor like XMetaL (Corel), EPIC (Arbortext), epcEdit (www.epcedit.com), WordPerfect XML (Corel again), or even at a pinch, the new StarOffice 6.1b or Word-11 when it arrives. And of course, Emacs.
Whether you like Emacs or not, it is unquestionably the most reliable and simplest to use. The psgml-mode and xml-mode or xxml-mode provide a hugely functional XML editor, with context-sensitive markup menus, colorized tags, and (with SP/nsgmls) full validation. The tdtd-mode provides DTD editing, and xslide-mode is a complete IDE for XSLT stylesheets. It won't cost you anything and it runs on all modern platforms.
XML is also a bandwagon: many companies are grotesquely misusing it because they haven't bothered to find out what it's really for, and many other companies are failing to use it where they ought to be, for the same reason. But that's life.
Emacs is the faithful standby and has been since the days of SGML. When the chips are down and other tools fail, Emacs with psgml-mode and xxml-mode will see you through.
But what you describe is grotesquely suboptimal. Do use psgml at least: inserting separate start-tags and end-tags is the way to madness. C-c C-e inserts whole elements, and there's a load of other keystrokes to make life easy (and menus too, for those who prefer them).
All true, although I continue to use OO (and I even bought SO in the hope that it might encourage Sun) simply because my main machines are all Linux. I use Abiword some of the time to open simple Word docs people send me.
A small example of crassness is that the OO presentation software doesn't come with any sample schemes for new slides like SO does. Fsck knows why: I reported it (bug 13608) and the response simply acknowledged that yes, we have no bananas.
There are some good people working on both SO and OO, and I'm downloading SO 6.1b to see if it's any better at XML than Word-11, but absolutely no-one seems to be looking at fixing the real interface problems and the real functionality problems. Yet.
What they've missed is that for "Enterprise" work they need a single-user workstation configuration suitable for desktops and laptops that isn't classified by their sales droids as "for hackers and hobbyists". Server OS configs are fine for servers, but your CEO will look pretty silly sitting on the train waiting 40 mins for her laptop to boot because it can't find the network resources it expects.
The mass market is on the desktop and if RH is ever to make any inroads (and there are lots of people who think that Linux should not be aimed at the desktop at all) then it must produce a business-capable desktop/laptop version. RH9 simply doesn't cut it. Maybe they just don't want to sell in bulk to enterprises.
I don't know, but if it had been a university project, written in university time, using university resources, and hosted on a university server, I suspect the RIAA would have had a more difficult time.
As it is, it appears to have been hosted on a commercial server as a private business, which implies to me that it was a personal project. This doesn't justify the RIAA in persecuting the guy, but it may explain any lack of institutional support.
Re: What's that other Internet Explorer thing again (on Mozilla 1.4 RC1)
If this release is Firebird I don't want it: I tried
out Firebird a couple of weeks ago and it's like
regular Moz but without any of the useful features
(and no support for antialiased fonts, either).
And I'll bet you the new release still won't print
any font other than Times under Linux...
Yeah, and the Linux version of Moz still can't print
any font except Times. Moz is trying hard, far
harder than Microsloth, but they're starting to add
frills instead of fixing basic functionality. Moz
has improved massively over the last 2-3 years but
not being able to print what's on the screen
according to the stylesheet is a big killer.
>I can understand that a company doesn't want to support more than one or two OSs...
These companies have utterly missed the point. They don't have to support an OS libre: all they have to do is support their own hardware. The user community of an OS libre will support the OS.
All we need to do is have a coordinated approach to the companies which says something along the lines of:
"We will write the drivers and any other software if you will let us have the documentation. We'll guarantee only one person will contact you with queries, and in return you get the software and an additional x% market for your gizmo."
Sadly the idea of effortless and cost-free extra market share is too difficult for most corporate suits to grasp. Their mind-set is very limited, and they already make so much money from their existing market that they really are not interested in increasing their revenue any more.
When there is no type string, a null-pointer is used.
There's the bug. When TYPE is absent, the default is the value "TEXT". This is in the HTML spec, and in the DTD, but as I said earlier, browser makers don't read doc. It should only compare the value to HIDDEN if a value has been specified.
Handling default values is something most 12-year-old programmers can master. Why do some browser makers fail to do it right?
This is a good example of the problem from the programmer's viewpoint. "type" is not a "directive", it's an attribute (HTML doesn't have "directives"). And it doesn't have to be followed immediately by the equals sign: white-space is permitted, provided the equals sign comes next.
Don't let programmers anywhere near HTML unless they have read the doc. It's a markup language, not a programming language. Treating it like a programming language is a recipe for failure. Sadly, Microsoft hasn't learned this any more than the Mosaic programmers did, all those years ago. They're not stupid, they just don't know enough about HTML.
And it's not buffer underflow checking either (although that may be what ultimately causes the crash), it's simply not parsing the HTML properly. A decent parser will merely note that "type" is not a valid value for any of the token groups allowed in an input element, and skip over it as garbage. Ditto for "crash". This is not rocket science.
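A minimal sketch of the default handling described above, in Python with the standard-library html.parser. The function and class names are hypothetical, and the set of permitted values is the one given for TYPE in the HTML 4.01 DTD; this is an illustration of the principle, not how any particular browser does it.

```python
from html.parser import HTMLParser

# Values permitted for TYPE on INPUT in the HTML 4.01 DTD (%InputType;).
INPUT_TYPES = {
    "text", "password", "checkbox", "radio", "submit",
    "reset", "file", "hidden", "image", "button",
}
DEFAULT_INPUT_TYPE = "text"  # the DTD default when TYPE is absent


def effective_input_type(attrs):
    """Return the input type to use, given (name, value) attribute pairs.

    A missing, valueless, or unrecognised TYPE falls back to the default
    instead of being compared against anything (hypothetical helper).
    """
    for name, value in attrs:
        if name == "type" and value is not None:
            candidate = value.strip().lower()
            if candidate in INPUT_TYPES:
                return candidate
    return DEFAULT_INPUT_TYPE


class InputCollector(HTMLParser):
    """Collect the effective type of every input element seen."""

    def __init__(self):
        super().__init__()
        self.types = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            self.types.append(effective_input_type(attrs))


def input_types(markup):
    """Parse markup and return the effective type of each input element."""
    parser = InputCollector()
    parser.feed(markup)
    parser.close()
    return parser.types
```

For example, `input_types("<input type crash>")` treats both stray tokens as garbage and falls back to the default, while `input_types('<input type = "hidden">')` accepts the white-space around the equals sign.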
Use the link in the article. It's at Amazon UK
(don't ask me why they can't pool their ISBN data,
it's prolly some crazy publishers' market-protection
kick).
Good goddess, is that the best they can do?
Parturiunt montes. Not only is it a boring, ugly
typeface, it's a sans-serif, and the last thing we
need is more boring ugly sans typefaces. Given the
effort they are supposed to have put into this, I
would have thought the least they could have done is
come up with something original.
Wrong. This is precisely what XML was intended for. Go and read the Spec.
Where we went wrong was in using XML for spreadsheet/database-style rectangular data, for which it was never designed, and for which it is grotesquely unsuited.
My T68i is fine, if a little sluggish in the menus (is it written in Java?).
But the big turn-off for the new one is that it's plug-ugly compared with the T68i. Camera? Got one as accessory to the T68i. 16-bit color? Don't care. Polyphonic tones? Just sad. GPRS? At 60/month I won't be using it. Bluetooth? Haven't found a use for it yet. I got the T68i because it's small, light, can exchange VCs with my Zaurus via IR, do a simple GSM dial-up, and work most places I go.
Maybe they just wanted the new one to compete with the traditional brick-like US cellphones.
The way I treated our very first laser printer
was much simpler. It was an early HP LJ II and
as we extracted it from its tightly-wedged foam
packaging, I managed to drop it four feet onto
a concrete floor.
Apart from a slightly bent pressure tongue in the
paper tray, easily fixed, and a scuffed crack on the plastic housing, it seemed undamaged, so we plugged it in and it worked without problems for the next 8 years or so.
HP's software support may suck but their hardware seems robust enough.
Prior art is easy on this one. Cringely makes the comment that the "patent examiners and Ameritech's patent attorneys just missed or ignored them" -- more than likely either way given the patent examiners' ignorance of technology and the lawyers' lack of incentive.
So I'll step forward, Bob. I implemented the layout for the individual document format for the CELT project (formerly CURIA) web site in 1995. We generated (and still do) some 500 documents from SGML masters in Old Irish, Latin, and Old French using TEI into HTML via an Omnimark script. Click on the link labeled "HTML" next to any document listed in
http://celt.ucc.ie/publishd.html to see it.
(It's simplistic to the point of being crude, but we specifically wanted to keep the Table of Contents on view all the time, but let the user change the document panel display when needed, which is pretty much the point of the patent, if I've understood you. Despite my dislike of the navigational problems of frames, that was how we did it.)
Any of the hundreds of scholars who have visited the site since then will be able to attest to this, and I presented papers about what we were planning to do as long before as 1992 and 1993. The site has been extensively publicized in the academic field (it was originally the 9th Web server in the world) although we never specifically shouted about the technique of what we did, as it seemed too simple and obvious :-)
But it's easy to go back further. I think this method was used in one of the original SGML offline browsers, perhaps the first: the IETM (ebook) system called DynaBook, at that time (late 1980s) from EBT (Providence, RI), later Inso Corp; until recently it was still being marketed by Enigma.
Fonts or typefaces?
There is a big difference: a text face conventionally has roman, bold, italic, and bold-italic fonts, maybe more, maybe fewer; a display face usually has one; so...
How many of them are text faces and how many are display faces?
Are they TrueType, Type 1, OpenType, or some other format?
Are they rip-offs of well-known existing faces, or really new?
I was at the launch presentation of Office-11 by Jean Paoli at XML 2002 in Baltimore MD last week, and I'm also a late sign-up to MS's extended beta list for the product (now closed).
To clear up some points people have commented on (based on a very preliminary inspection plus a lot of discussion at the conference):
- The default save format is still .doc (i.e. you have to go the extra click to save in XML format)
- If you do choose XML, the default XML format is MS's own office-document vocabulary, which retains all the formatting, held in attributes. Hairy but processable, and they will be shipping their schema for it so people can reprocess it externally. But this format will (of course) only represent the appearance, not any structure.
- It will also let you specify your own schema (or an industry standard one) and let you supply a binding of named styles to your element types, so you can edit using what look like styles but actually get represented in the saved file as XML markup. There is some debate as to whether this constitutes "being an XML editor" or just "being a wordprocessor that saves data in XML" (my money is on the latter).
- It will not support DTDs, so you're stuck with W3C Schemas whether you like them or not*
- The discussion over a [more?] suitable schema/DTD for handling office documents (wordprocessing, spreadsheet, presentation) continues at the OASIS TC on Open Office XML Formats**
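To make the idea of binding named styles to element types a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the style names, the element names, and the function are invented for illustration, and it says nothing about how Office-11 actually implements the binding or what its saved files look like.

```python
import xml.etree.ElementTree as ET

# Hypothetical binding of wordprocessor style names to element types
# from the user's own schema.
STYLE_BINDING = {
    "Heading 1": "title",
    "Body Text": "para",
    "Quote": "blockquote",
}


def styled_paragraphs_to_xml(paragraphs, binding, root_name="document"):
    """Serialise (style, text) pairs as XML, one element per paragraph.

    The element name is looked up from the style name; a style with no
    bound element type is rejected rather than silently dropped.
    """
    root = ET.Element(root_name)
    for style, text in paragraphs:
        element_name = binding.get(style)
        if element_name is None:
            raise ValueError(f"no element type bound to style {style!r}")
        ET.SubElement(root, element_name).text = text
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = [("Heading 1", "Intro"), ("Body Text", "Hello & welcome")]
    print(styled_paragraphs_to_xml(sample, STYLE_BINDING))
```

The author still works with what look like styles; only the saved form uses the element names, which is why this arguably amounts to a wordprocessor that saves in XML and not an XML editor.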
With Office-11, Microsoft has nearly caught up with Corel's WordPerfect (which has had a fully-fledged SGML and XML editor built-in for years) and XMetaL (which Corel took over from SoftQuad earlier this year). MS still has a long way to go to match industrial-strength applications like ArborText's EPIC or even Emacs with psgml-mode et al, but Office-11 will be a solution for the masses who believe the Word interface to be more desirable, or the Microsoft licensing régime to be more attractive, or the software to be more stable.
* [Bias note] I think W3C schemas were a big mistake; provision for data content typing and validation, namespaces, and extended grouping could have been achieved by extending DTD syntax; and wimpy programmers who moan about having two syntaxes to handle should get a life - it's not a big deal, the code is free and has been in use for 15 years:-)
** Sun has donated the OpenOffice (aka StarOffice) XML file formats to the public domain. It's worth remembering that {Star|Open}Office has been saving in XML as its native format for some time now, and has a lot more experience at this than MS.
There have been some electronic editions of medieval texts, notably the sole remaining manuscript of the poem Beowulf, which was written down around the year 1000. Alas, the edition is proprietary, and you have to pay a rather large sum to the British Library if you want a copy.
As distinct from the CELT project, which is making electronic transcriptions of Early Irish manuscripts (all SGML, moving to XML) which are all freely available to anyone who wants (some with translations!).
Oh, look, there are some pigs flying past the window...
--
"The best cure for seasickness is to go and sit under a tree" --Spike Milligan
I don't want to buy a goddamn bulldozer from Gung-Ho Province.
Everyone who is anyone knows we are on the back of four elephants, riding on the back of a great turtle...
The phrase in parentheses was meant to refer to GE's *purchase* of a UNIVAC in 1954.
>You don't keep xml data in your xsl stylesheets do you?
Exactly. This is what is supposed to happen. I'm baffled as to why the OP is surprised.
The thought of Phillippa Forrester naked was probably too much for them :-)