I work at one of the few places that crawls billions of URLs each month, and I observed exactly the same thing as Peter. There just isn't that much XML/RDF/DAML/OWL on the web. At the point when we had crawled 6 billion URLs, I found only 180,000 URLs (about 0.003%) whose MIME type or extension indicated that they were machine-readable metadata.
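To make the kind of check concrete, here is a minimal sketch of classifying a crawled URL by MIME type or extension. The MIME types, extensions, and function names below are assumptions chosen for illustration, not the rules the crawl actually used.

```python
import posixpath
from urllib.parse import urlparse

# Assumed lists, for illustration only; the real crawl's rules are not known.
METADATA_MIME_TYPES = {"application/rdf+xml", "application/xml", "text/xml"}
METADATA_EXTENSIONS = {".xml", ".rdf", ".daml", ".owl"}


def looks_like_metadata(url, content_type=None):
    """Return True if the MIME type or the URL extension suggests metadata."""
    if content_type:
        # Drop parameters such as "; charset=utf-8" before comparing.
        mime = content_type.split(";", 1)[0].strip().lower()
        if mime in METADATA_MIME_TYPES:
            return True
    # Fall back to the file extension of the URL path.
    extension = posixpath.splitext(urlparse(url).path)[1].lower()
    return extension in METADATA_EXTENSIONS


def metadata_share(metadata_urls, total_urls):
    """Fraction of crawled URLs that were classified as metadata."""
    return metadata_urls / total_urls


# The figures quoted above: 180,000 out of 6 billion URLs.
print(f"{metadata_share(180_000, 6_000_000_000):.4%}")
```

A check like this is generous, since plenty of XML is not semantic metadata at all, so the real share is probably lower still.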
The reason is something that people in the semantic web community are loath to talk about: there isn't enough incentive for people to create metadata and publish it for others to read. When we write web pages or blogs, we get to express ourselves to other humans, but when we put out data there is no clear incentive (economic or otherwise) to justify the effort. This is probably why so little metadata is being published.
If you wish to dispute the small amount of data, feel free to put up a web server showing a million URLs of metadata created by others.
What company with a $150B market cap uses MySQL to store their mission critical data? One that doesn't take their advice from IBM...