There are a few tags whose implementation is not spelled out and which exist for legacy purposes only, such as supporting defunct features found in Word 95.
These formats have absolutely nothing to do with the .DOC format. .DOC was literally a memory dump of the data structures. The XML files are well structured. Style and content are highly separated. They are quite easy to read and understand.
Sorry mate, but bullshit. Yes, the DOC format was an object serialization of the in-memory format. But OOXML is no saint by any measure. Not only does it include references to Word 95, but also Word 6.0, Word 5.0, Word 97, Word 2002, and WordPerfect 6.x. It also references several Word/Office versions on the Macintosh, because heaven forbid MS make a cross-platform application that works the same on both Windows and Mac. It even references East Asian font rendering in a specific version of Word. And note I say "references", because that's all the standard does. Finding out what all those different versions of MS Office did on both Windows and Macintosh, and possibly also for different languages or regions of the world, is left up to anyone trying to implement Microsoft's "Open" Office XML format. Even though the documentation for OOXML is huge compared to ODF, these details are still not included.
So please tell me, what do these few tags/attributes do?
lineWrapLikeWord6
mwSmallCaps
shapeLayoutLikeWW8
truncateFontHeightsLikeWP6
useWord2002TableStyleRules
useWord97LineBreakRules
wpJustification
shapeLayoutLikeWW8
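To make the complaint concrete, here is a minimal sketch of what an implementer actually gets from a document. It assumes these flags appear as empty child elements of `w:compat` in the `word/settings.xml` part; the XML fragment is hand-written for illustration, not taken from a real file. All a parser can recover is the flag name. The behaviour behind the name is not in the file.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical, hand-written fragment of a word/settings.xml part.
SAMPLE_SETTINGS = f"""<w:settings xmlns:w="{W_NS}">
  <w:compat>
    <w:lineWrapLikeWord6/>
    <w:useWord97LineBreakRules/>
    <w:truncateFontHeightsLikeWP6/>
  </w:compat>
</w:settings>"""


def compat_flags(settings_xml):
    """Return the local names of all compatibility flags that are present."""
    root = ET.fromstring(settings_xml)
    compat = root.find(f"{{{W_NS}}}compat")
    if compat is None:
        return []
    # Strip the "{namespace}" prefix ElementTree puts on every tag.
    return [child.tag.split("}", 1)[1] for child in compat]


flags = compat_flags(SAMPLE_SETTINGS)
print(flags)
```

Reading the flags is the easy part. Knowing how Word 6 wrapped lines or how WordPerfect 6 truncated font heights is the part the parser cannot help with.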
Anyone claiming OOXML is in any way comparable to ODF is either misinformed or a shill. As we can see with this story, MS has a lot of money and influence to throw around for the purpose of muddying the waters and making OOXML look like a viable "standard".
They move to live streams (although at higher resolution than most non-porn streams seem to offer), to make it more difficult and less interesting to copy content.
Honestly, how would that help? Doesn't anyone know about downscaling? A lot of porn video clips still seem to be 320x240 (or at least less than 640x480) in either MPEG-1 or WMV. So all anyone has to do is capture the stream, downscale it to a more reasonable picture size, re-encode it and sell it on their site. You also don't need the massive amounts of bandwidth or storage that these guys need. Realistically, do you really need HD video to watch a woman getting screwed by three hung guys?
So it's barely better than the old 3DFX Voodoo I/II? IIRC, they were just texturing hardware. The CPU/driver had to rasterize the polygons, passing scan-lines off to the card to be drawn. That's why the original SLI setup was so easy - just pass odd-numbered lines to the first card, even-numbered to the second.
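A rough sketch of the odd/even split described above, as I remember it. The function name and the 1-based line numbering are my own assumptions for illustration; this is not driver code.

```python
def split_scanlines(height):
    """Assign scan-lines 1..height to two cards by parity."""
    lines = range(1, height + 1)
    first_card = [y for y in lines if y % 2 == 1]   # odd-numbered lines
    second_card = [y for y in lines if y % 2 == 0]  # even-numbered lines
    return first_card, second_card


first, second = split_scanlines(6)
print(first, second)
```

No line depends on another card's output, which is why the split needs no coordination beyond agreeing on parity.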
Real smart. Like everyone isn't going to think that IO stands for Input/Output.
That's still better than everyone mistaking the 'II' for a Roman numeral and thinking it's "SATA 2.0". You know, like USB 2.0.
There is no "SATA II" (on "eSATA Connectors")
SATA II is the old name of the organisation that created the SATA standard (although I can't find what the acronym used to stand for). It has since changed its name to SATA-IO ("International Organisation") because everyone mistook the two I's for Roman numerals and assumed the newly created SATA 3Gb/s standard was "version 2" of SATA. It's not. It's just a new signalling rate, and other features like NCQ are separate.
"redrum" would appear to be Daniel Eran, the owner of roughlydrafted.com. The people over on digg.com have accused him of spamming Digg with his articles and then using sockpuppet accounts to 'digg' his stories (and only his stories) to get them on the frontpage (or however it works on Digg). When this was found out, he was banned from Digg and he took this personally. In his deluded mind this is a conspiracy against Apple by pro-Microsoft minions. He even has people email Apple asking them to set up a "pro-Apple" competitor to Digg. Daniel Eran is a sycophantic Apple fanboy of the worst kind.
Wow, you're really good at spewing alarmist bullshit aren't you?
I don't know how often archive.org scans a web page, but Google averages 1 month between full indexes (admittedly, spread out over the month). I can't imagine archive.org doing it much faster. So the chance of archive.org picking up a document that was visible for a few hours is pretty slim. Instead, hundreds or even thousands of ordinary visitors could have viewed the same information and saved it, sent it off to the press, created their own mirror, etc. You don't need to repeat the "oops, something was posted when it shouldn't have been" scenario three times.
The Internet Archive respects the robots.txt file and will remove/not cache content that is disallowed to the ia bot. There are also procedures for removing content from the archive when robots.txt is not enough.
...it's about the millions of people publishing information on the Internet [...] [who] may be damaged if this sort of issue isn't dealt with
Just what sort of "damage" are you claiming here? If someone is just putting up "my first homepage" on their ISP-supplied web space, they aren't likely to have much that someone else would want to "steal". But if someone does have some valuable content they want to publish, even if it is just on their ISP's web space, one would assume they would research the issue and possible precautions to take. I believe that's called due diligence, although IANAL so I don't know if that applies in this case.
...for example, [ISP caches] damage web hosts by messing up server statistics
Go read up on caching in HTTP and learn how to work with web caches instead of against them. Do it properly and you save bandwidth and server load while getting the non-cacheable requests you need/want.
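For anyone who wants a starting point, here is a minimal sketch of "working with" caches. The function names and the one-day `max_age` default are my own assumptions. The idea: static content gets a long `max-age` so caches can serve it without bothering you, while pages you want to count on every hit get `no-cache` plus an `ETag`, so the cache must revalidate with your server each time but usually gets back a tiny 304 instead of the full body.

```python
import hashlib


def cache_headers(body, count_every_hit, max_age=86400):
    """Build response headers for a cacheable or revalidate-always resource."""
    # Content-derived validator; any stable identifier for the body would do.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if count_every_hit:
        # Caches may store the response but must revalidate before reuse.
        cache_control = "no-cache"
    else:
        # Any cache may reuse the response for max_age seconds.
        cache_control = f"public, max-age={max_age}"
    return {"Cache-Control": cache_control, "ETag": etag}


def is_not_modified(request_headers, etag):
    """True when the client's cached copy is current, so a 304 is enough."""
    return request_headers.get("If-None-Match") == etag


page = b"<html>front page</html>"
counted = cache_headers(page, count_every_hit=True)
static = cache_headers(page, count_every_hit=False)
print(counted, static)
```

With the first variant every request still shows up in your logs, and you only pay for headers when nothing has changed.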
Upon request, archive.org can and most probably will remove the copy from their site.
Oh they do, unfortunately. A few months ago I was editing the Rob Enderle article on Wikipedia. I was looking for a link to his keynote speech at the SCO Forum 2004. You know, the one where he describes in detail all the heroic things he's done, why Bill Gates and Microsoft are so great, and why IBM is evil. Basically, he outed himself as a raving lunatic with a feeble grasp on reality and an even feebler grasp on logic. Anyway, I had an old link to the speech on the sco.com website, but they had since removed his speech. So, off to archive.org I went. No show. Apparently SCO's robots.txt denied access to that part of their site so archive.org hadn't cached it. Or if it was ever cached, it had been removed because of new rules in their robots.txt.
So, yes, archive.org does remove cached content when the robots.txt file denies access.
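If you want to see how such a rule would be evaluated, Python's standard `urllib.robotparser` will do it. This is only a sketch: the robots.txt content, the example.com URL, and the `ia_archiver` user-agent string are my assumptions for illustration, not SCO's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one crawler from one directory, allow the rest.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /forum/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://example.com/forum/keynote.html"
archive_allowed = parser.can_fetch("ia_archiver", url)
other_allowed = parser.can_fetch("SomeOtherBot", url)
print(archive_allowed, other_allowed)
```

Two lines in a text file are enough to keep a well-behaved crawler out of a directory, which is all it would have taken here.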
I doubt that there are 1,000,000 people in the world who know about robots.txt
That would be 1,820,000 pages at least. Just because you don't know about it doesn't mean it's not common knowledge in the applied field. Seriously, if you look up information about publishing web pages, and especially about search engines, you're going to run into info about robots.txt pretty soon. It's an accepted standard that's been around for well over 10 years now.
Why does it matter if OOXML is an ISO standard or not? Microsoft is already using it in Office 2007 and people are creating OOXML files. MS just wants it to be a standard to lend the format an air of officialdom and to dazzle clueless managers.
Ghostscript had to deal with this problem years ago, because PostScript is actually a programming language, not a page description language.
Ghostscript had to deal with what problem? Yes, PostScript is a programming language with built-in graphics primitives. What does that have to do with search engines? It doesn't have to recognise certain outlines as being text (i.e. text drawn without using the PostScript primitive for drawing text), it just draws it. Ghostscript is just another implementation of a language otherwise.
OCR is also an option. Because of the lack of serious font support in HTML, most business names are in images.
Is OCR really necessary? Odds are the business name is also in the domain name and at least the front page as text, if not included in the title and/or copyright footer of every page. Except for damn all-flash web sites, the business name is unlikely to be hidden away from a search engine.
If nothing else, I'd worry that they would be less likely to accept patches, fix bugs that were important to me, interact on mailing lists, and in general provide support.
I acknowledge that is a valid concern, but I don't know if it's really happened. It depends on the project though. Most probably don't really care. I bet most people wouldn't notice on a mailing list or wherever. A lot of people have email addresses that end with ".com" because of their ISP, so you'd probably just blend in. The FSF might be a little different for their GNU projects though, especially when accepting patches. I know they require some sort of release form for anyone contributing more than a certain amount (12 lines?) to any of their projects. And since the whole SCOG vs IBM mess, Linus and the other Linux kernel guys have begun keeping better records of who contributes patches. Not that any copyright infringement has actually happened, they just want an easy record for any future claims, instead of having to sift through a dozen years of the kernel mailing list and patches.
And - no offense intended - 'judge the code on its merits' is not practiced by the FSF community (if you'll permit me to generalize from RMS), who believe that, for example, schools should use exclusively free software, regardless of any technically superior proprietary alternatives. For RMS, too, procedural issues dominate the technical issues.
Touché. But note that one of the reasons given is "Free software permits students to learn how software works". Is that not a merit in this case (education)?
If you were in the business of selling software, would you feel comfortable using code written by folks who are philosophically opposed to the existence of your business?
Richard Stallman is against the existence of commercial software, and he gives very good reasons for why he thinks it is bad for society. But I'm not sure he is against the existence of commercial software businesses. He probably just sees them as a waste of time, i.e. spending time and money to produce something that will ultimately be restricted and kept behind closed doors. Why not instead work on something that can be not only used by everyone, but even (potentially) modified and improved by everyone? But anyway, what does that have to do with being "comfortable" about using GNU code? Judge the code on its merits.
...you also have to acknowledge the impact of the open source shills
Open Source shills?
From Wikipedia:
A shill is an associate of a person selling goods or services who pretends no association to the seller and assumes the air of an enthusiastic customer.
So how can someone be a shill for a product which isn't being sold and is developed by a community?
The word you should have used is either advocate or zealot, depending on how and why a person promotes Free/Open Source Software. I'm usually an advocate, although I can cross into zealot territory sometimes. I try to avoid it because I know it is often counter-productive.
Now, you could say, the open-sourced firmware was never proprietary to begin with somehow, but that's just semantics
How is that semantics? I thought that was the whole point - PHBs are afraid of having to release all or part of their precious proprietary software. But that's not what happened with Linksys/Cisco and the WRT54G routers. It was a stripped-down Linux distro. OK, they had to put it together, perhaps write some shell scripts. I'm not sure where the web interface came from. But did they have to release any super-secret proprietary source code? I doubt it.
So really, have there been any actual cases of a manager's worst nightmare, the scenario that Microsoft has been FUD'ing us with for years - having to "open source" their internally developed software because a developer in some way used Open Source Software? That's what I'm after. And I don't believe it's ever happened. It's just FUD but the managers don't know any better.
If people are wondering why managers are scared of Free/Open Source Software, just look at Rob Enderle's recent story posted here on Slashdot yesterday. Managers are the targets of these shill reporters (Enderle, O'Gara, Lyons) and their efforts are clearly working. We might not fall for their FUD, but managers and other non-techies do. And that's why they get paid.
So, GPL was used to wrestle a few vendors into releasing their own code.
I'm sorry. What did you just write? Give me one example of a company being forced to release previously proprietary software under the GNU GPL. One. I dare you.
I also have a feeling he's wrong about the pseudonyms part as well. I'd bet the majority of kernel contributions come from people who are identifiable.
And you'd win that bet. Just check out the CREDITS file in the root of the Linux kernel source tree to find most people's full names, addresses, phone numbers, etc. The kernel contributors are far from anonymous.
Using Intel's Linux driver - This required a kernel version of 2.6.8 or greater. openSUSE 10.2's kernel is 2.6.16 or something.
Um, the kernel version number is three numbers that happen to be separated by periods. It's not a fraction. Version two point six point sixteen is indeed greater than two point six point eight.
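In case it helps, here is the comparison spelled out as a small sketch. The helper name is made up, and it assumes plain dotted numeric versions with no suffixes like "-rc1". Each component is compared as a whole number, left to right.

```python
def parse_version(text):
    """Turn "2.6.16" into (2, 6, 16) so components compare as integers."""
    return tuple(int(part) for part in text.split("."))


required = parse_version("2.6.8")
installed = parse_version("2.6.16")

meets_requirement = installed >= required
# Comparing the raw strings character by character gets it wrong,
# because "1" sorts before "8".
string_compare_says_older = "2.6.16" < "2.6.8"
print(meets_requirement, string_compare_says_older)
```

The string comparison makes the same mistake as reading the version as a decimal fraction, so it is worth splitting on the dots first.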
Thanks for that information. Wikipedia has a little more at Aspartame controversy#Conflict of interests in the FDA approval process. Wow, not only is George W. Bush the worst president ever (and hopefully is for a while to come), but most of the people around him are pretty shady too - Cheney (Halliburton), Rumsfeld (many administrations, see "The Power of Nightmares"), Wolfowitz (Team B), etc. It's amazing to look back at recent historical events in the U.S., even in different fields like this, and find the same small group of people popping up again and again in key positions. Just amazing.
How long does that take? I know it'd take me about five minutes...
No, it might take five minutes for someone to create an RPM and provide it to the hordes of Fedora users on rpmfind.net. But Debian takes its time. Debian has rules about what goes into a package and how it operates. Does it install properly for a bunch of situations? Does it uninstall properly? Upgrades? And not just from the immediately previous version. Do the dependencies really pull in everything that is needed? Does it properly conflict with the necessary packages? How about 'provides' and 'replaces' as well? And are they versioned properly?
If you've ever used Debian you would have noticed that software is often split into multiple packages: one for a server, one for a client, one for documentation, one for development headers, etc. That way you don't need to install stuff you don't need. There might even be different builds of the same software, e.g. GTK, GNOME, QT, KDE, or WX interfaces.
Making a proper Debian package is more than just slapping something together in five minutes. A lot of care is taken to make a cohesive repository of software, and that is what makes Debian so good. Besides, there are already the sun-java6-* packages in non-free.
Iran has openly stated its desire to wipe Israel off the map
Please stop spreading lies. Mahmoud did not say he wanted to "wipe Israel off the map"; he said he wanted to eliminate the Zionist regime of Israel from the pages of history. European, US, and of course Israeli politicians and media have blown this mis-translation out of proportion, claiming he wants to nuke Israel and using this as a justification to attack Iran. Mahmoud may be a raving idiot (and reportedly many Iranians are not happy with his domestic performance), but please don't fall for the anti-Iranian propaganda and allow another unjustified war to be started by the Bush administration.
The last time I saw him he had a pretty mean goatee.
Anonymous Coward said:
Translation:
Thanks for spreading the problem, idiot.