I wouldn't be so sure (that it is not the general case).
A regular user won't be inventing his own ontologies, just as he is not inventing a new RSS format. There is a set of well-defined ontologies that you can use to describe your data. And a regular user won't be hand-crafting RDF data either. Instead, RDF data will be exported from his applications the same way RSS and Atom are exported from his weblog software, or the way Word saves a user's files.
RDF data will still merge together, provided there are "crystallisation points" that are common to data from different sources.
Regarding the Luc Steels research you mention - could you give some pointers to his work?
To illustrate one of the key differences in how RDF is supposed to work and why it may even be fun (and so as not to post the same info again), check out this comment: Tutorial on the Semantic Web.
A simple and concrete example: when a conference publishes its delegate list in RDF, it suddenly becomes very easy for services that use this data to appear. By combining this data with Google Maps we get a FOAFMap of the participants: the application extracts machine-readable data from the RDF, uses geo-information to put the participants on the map, and retrieves more data from their FOAF RDF profiles.
That's a simple mashup, but it shows how machine-readable data can help new and sometimes unexpected use cases spring up. I don't understand why all conferences do not provide data in RDF yet. ;)
Pay attention to slide #22, which shows how data from different sources can be merged together. This is one of the key differences between XML and RDF: to merge XML data from a number of different schemas, one would need to create an application that processes data in these schemas and generates merged data (possibly inventing a new schema to represent the merged information).
In RDF that happens "magically": in order to merge heterogeneous data you don't need to do *anything* - just put all the information in an RDF store and it merges. If the data to be merged changes, no modifications to the store are necessary - it is like a bag that can hold anything.
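A minimal sketch of that merge, in plain Python rather than a real RDF store: each source is modelled as a set of (subject, predicate, object) triples, and merging is just set union. The example.org URIs, the `conf:attends` property and the sample people are invented for illustration; only the `foaf:` and `geo:` property names are borrowed from real vocabularies.

```python
# Each data source is a set of (subject, predicate, object) triples.
# All URIs and values below are invented for illustration.
delegates = {
    ("http://example.org/people/alice", "conf:attends", "http://example.org/conf/2006"),
    ("http://example.org/people/bob", "conf:attends", "http://example.org/conf/2006"),
}
foaf_profiles = {
    ("http://example.org/people/alice", "foaf:name", "Alice"),
    ("http://example.org/people/alice", "geo:lat", "53.27"),
    ("http://example.org/people/alice", "geo:long", "-9.05"),
    ("http://example.org/people/bob", "foaf:name", "Bob"),
}


def merge(*sources):
    """Merging is plain set union; no schema-mapping step is required."""
    merged = set()
    for source in sources:
        merged |= source
    return merged


def describe(store, subject):
    """Collect what the merged store says about one subject.

    Simplification: if a predicate had several values, only one is kept.
    """
    return {p: o for s, p, o in store if s == subject}


store = merge(delegates, foaf_profiles)
alice = describe(store, "http://example.org/people/alice")
```

The shared subject URI plays the role of the "crystallisation point": because both sources use the same identifier for Alice, her conference attendance and her FOAF details end up attached to the same node without any extra code.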
As I pointed out in the previous comment, authoring data on the Semantic Web is no more difficult than authoring RSS or XML.
Yes, figuring out for the first time how to represent your data in RDF (or XML for that matter) can be difficult. Imagine if everyone were trying to come up with an RSS standard on his own instead of using the RSS export functionality of his content management tool. That's why we need good guidelines on how to publish information on the Semantic Web, and RDF export functionality (plugins) similar to what RSS plugins provide.
As for opportunities for spammers and mischief - I don't think so.
Why? If you look at the Semantic Web "layer cake" you will notice that technologies such as digital signatures, encryption and trust are part of the scheme. They make it possible to identify the author of data and ensure he is who he claims to be. There is nothing wrong with your application accepting only signed and trusted data. And there is nothing preventing authors of the data from signing the contents. Since the Semantic Web is a new technology and we already know about the problems that spam and misuse can present, it is more, not less, prepared to fight spam.
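As a rough sketch of an application that "only accepts signed and trusted data": the code below uses an HMAC with a shared secret from the Python standard library as a stand-in for the public-key signatures the layer cake has in mind. The author names, the key table and the sample payload are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical table of authors this application trusts, with shared secrets.
# A real deployment would use public-key signatures instead of shared secrets.
TRUSTED_KEYS = {"alice": b"alice-secret"}


def sign(payload, key):
    """Produce a hex signature over the payload (HMAC-SHA256 stand-in)."""
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def accept(author, payload, signature):
    """Accept data only if the author is trusted and the signature verifies."""
    key = TRUSTED_KEYS.get(author)
    if key is None:
        return False  # unknown author: reject regardless of signature
    expected = sign(payload, key)
    return hmac.compare_digest(expected, signature)


payload = '<http://example.org/post/1> dc:title "Hello" .'
good_signature = sign(payload, TRUSTED_KEYS["alice"])
```

With this in place, `accept("alice", payload, good_signature)` passes, while tampered data or data from an author outside the trusted table is dropped before it ever reaches the store.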
Note 1: The Semantic Web should be viewed as an integral part of the existing web, not its opposite. It might even provide an additional layer that will help to combat spam and the other problems you mention here. Who knows.
Note 2: Spammers will always try to come up with new exploits. We all have to be prepared for this and think about how to close the holes they are using. But saying that something newer (a further development of the existing web) necessarily means more opportunities for spammers is wrong.
The problem with users (authors) is valid when we consider individual authors creating data (RDF, HTML, ...) "by hand". TimBL has referred to the Semantic Web as a global database of knowledge (as compared to the current web of text content). The problem of incompetent users goes away, and data of higher value is obtained, when already existing content and databases are exposed on the Semantic Web. Think of sites like Slashdot, wordpress.com, amazon.com, NY Times, ...
Authoring RDF data is not so different from authoring XML or RSS. This means that the cost of putting your site on the Semantic Web is quite low. The benefit is global reuse of information.
For example: it is easy to install the WordPress SIOC plugin to export RDF from any WordPress-based weblog. Individual users don't have to care what RDF is or looks like. And the data about all posts and comments is now machine-readable and can be reused in a number of ways, e.g., to create a TimeLine of your posts.
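To make the export-then-reuse idea concrete, here is a small sketch in plain Python. It is not the SIOC plugin's actual output or code: the post data is invented, and the triples are simple tuples using prefixed names in the style of SIOC and Dublin Core.

```python
# Hypothetical weblog content; a real exporter would read it from the database.
posts = [
    {"uri": "http://example.org/blog/2", "title": "Second post", "created": "2006-03-02"},
    {"uri": "http://example.org/blog/1", "title": "First post", "created": "2006-01-15"},
]


def export_triples(posts):
    """Turn each post into (subject, predicate, object) triples."""
    triples = []
    for post in posts:
        triples.append((post["uri"], "rdf:type", "sioc:Post"))
        triples.append((post["uri"], "dc:title", post["title"]))
        triples.append((post["uri"], "dcterms:created", post["created"]))
    return triples


def timeline(triples):
    """A consumer that knows nothing about the weblog software:
    it only reads the triples and orders posts by creation date."""
    created = {s: o for s, p, o in triples if p == "dcterms:created"}
    titles = {s: o for s, p, o in triples if p == "dc:title"}
    ordered = sorted(created, key=created.get)  # ISO dates sort as strings
    return [(created[uri], titles.get(uri, uri)) for uri in ordered]


entries = timeline(export_triples(posts))
```

The point of the split is that `timeline` never touches the `posts` structure; any other site exporting the same properties could feed the same consumer.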
If we take this approach and expose data from existing sites in RDF, the task of authoring quality data can be accomplished. The problem of spam referred to in the article can be dealt with by signing the information: since the Semantic Web is still young, the problems of misuse can be addressed in the architecture right from the beginning.
I would like to draw your attention to another important area - consumers of Semantic Web data. There is and will be quality data out there. What is interesting now is to find new and useful ways to use this information and add value over what can be done with simple web pages.
So long as it's just blocking fast-forwarding on ABC shows and not other channels, let me be the first to say that I have absolutely no problem with this.
Indeed.
But in order for the executives of ABC to feel what it would be like, we could suggest that they remove the 'skip' and 'fast forward' buttons from their DVD players at home. 'Play', 'pause' or 'stop' should be enough for everyone! :)
There are already many sites like that out there. Where they could differ is in providing some metadata about these software titles in machine-readable form.
It would be much more fun to have machine-readable links between different titles that forked from one another, etc. Uses could range from the "maps" of software evolution mentioned above to other uses yet to be imagined. (Note: I do not know if DOAP allows such parent-child relationships between software projects to be described, but if such a property is needed I am sure someone will invent it.)
P.S. Having information about abandonware would also be useful - but mainly if they'd also provide downloads and source code (where available). Although I doubt anyone will go to such lengths to preserve abandonware.
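A sketch of what the machine-readable fork links suggested above could enable, in plain Python. The `ex:forkedFrom` property is made up for this example and is not claimed to be part of DOAP; the project names are placeholders.

```python
# "ex:forkedFrom" is a hypothetical property, not taken from DOAP.
# Project names are placeholders.
triples = [
    ("ex:ProjectB", "ex:forkedFrom", "ex:ProjectA"),
    ("ex:ProjectC", "ex:forkedFrom", "ex:ProjectA"),
    ("ex:ProjectD", "ex:forkedFrom", "ex:ProjectB"),
]


def fork_tree(triples):
    """Map each parent project to the sorted list of its direct forks."""
    children = {}
    for child, predicate, parent in triples:
        if predicate == "ex:forkedFrom":
            children.setdefault(parent, []).append(child)
    return {parent: sorted(kids) for parent, kids in children.items()}


def lineage(triples, project):
    """Walk parent links from a project back to its original ancestor."""
    parents = {c: p for c, pred, p in triples if pred == "ex:forkedFrom"}
    chain = [project]
    # Stop at the root, or if the data accidentally contains a cycle.
    while chain[-1] in parents and parents[chain[-1]] not in chain:
        chain.append(parents[chain[-1]])
    return chain
```

`fork_tree` gives the raw material for a "map" of software evolution, and `lineage(triples, "ex:ProjectD")` traces one title back through the projects it descends from.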
There were comments in the /. post "On Software Patent Lawsuits Against OSS" suggesting the possibility of an underground (anonymous) OSS development model emerging if patent lawsuits made (a lot of) OSS illegal.
While responses to that comment claimed it is highly unlikely to happen (lots of OSS development is done by big companies, or people would just be unwilling to do it if they might be sued), it is an interesting idea of a trend, which has some similarities with the anonymous publishing mentioned here.
Sorry - my mistake - it should have been RSS 2.0 instead of 1.0. Though it does not really matter, as it is a plain-text version of RSS, simplified to the point of absurdity. :)
The author seems to have just found out that RSS 3.0 already exists :)
The funniest thing, though, is that RSS 3 apparently exists, here. I canvassed the web quite thoroughly, or so I thought, before starting this. I didn't find a thing. Well, luckily enough, that dialect of RSS has been around for 3 years and still no takers. And that's because it's an entirely new format, text based rather than XML based. (I wonder if I'll find this funny if the author demands that I change the name of this site...)
A pity he does not even see the irony in why that version of RSS 1.0 was created.
I do not see a reason to patch up RSS 2.0. (No, really!) It works as is, and there are formats that will replace it.
One of them is RSS 1.0 - although it is often perceived as the predecessor of RSS 2.0, it is a different branch of evolution, based on W3C standards. What RSS 1.0 allows is combining and mixing different kinds of information - rich information about the author of a post, etc.
Sadly, it has been perceived as too difficult (but now they say even RSS 2.0 is too complex...), and only recently have the benefits of an RDF-based feed format really started to be used.
And then there is Atom 1.0, which has just been released.
Unless it is a viral marketing stunt made by the publisher. Maybe all there is is just the news story and stuff.
If the can is open, we should see the "worms" soon. If we don't, well, maybe there is no can [open].
Citing nacturation: "If someone really wanted to, they could give the book to a friend in the US where they're free to publish all the plot details. Let's see the BC Supreme Court enforce its rights-bashing injunction on a US citizen."
It would be a very good thing if somebody who saw it did publish a review of the book - to show we still have some freedom.
But I have to disappoint you about publishing plot details in the US.
According to this BBC article, the US had issued a preliminary injunction against disclosure of the book even before the leak:
BBC: "Publisher Bloomsbury has also taken out a "John Doe" injunction - a legal order against an as yet unnamed defendant, routinely used in the US - to stop anyone disclosing information about its contents."
A good point about the million monkeys. :)
;)
Here is a Tutorial on the Semantic Web.
A way to extend Opera's user interface the way Firefox extensions can.
There is now DOAP (Description of a Project) - a vocabulary / schema that makes it possible to mark up such information.
Now if they only had a Linux version... I don't want another application forcing me to stick with Windows.
RSS 3.0 Cease & Desist Notice
The style of the notice and its healthy attitude well deserve the time spent reading it.
PS Agree about RSS 1.0.
The "leak news" just made it even more visible - even to those who do not care about the book itself so much. E.g., they just got
-- CaptSolo Weblog