Web Redesigned With Hindsight
Randy Sparks writes "Tim Berners-Lee has been speaking about his vision for the Web. He proposed the Semantic Web six years ago and it's taken that long for the W3C to ratify his plans for Resource Description Framework (RDF) and the OWL Web Ontology Language (OWL). Effectively, the Semantic Web is the Web as we know it put into database form and with added metadata. You can read more about it over on MacWorld and see a Semantic Web proof-of-concept at the Web Archive."
The web is popular because it's easy to create web pages. The semantic web stuff strikes me as something that only someone with a PhD in semantics could love. IMO it violates the KISS principle.
Have you read my blog lately?
The MacWorld article isn't very informative to someone who has never heard of this "next generation" web, but it seems like they want to add it on top of the existing WWW.
Why can't someone just invent a new, similar, improved web that is separated from the current WWW, with its own specific browser, and implement the various ins, outs and whathaveyous to keep the riffraff from exploiting it in very annoying ways?
This kind of thing goes to show how much difference can be made by getting the initial trajectory right.
A few small changes at the start can lead to BIG consequences later as the inertia of the whole mess gets going.
Anyone else out there with a really great idea? Do us all a favor and think as far ahead as you can before you release it on the world. Even then, it will still eventually not be going in the optimal direction.
"Provided by the management for your protection."
- Intelligent search engines that produce much better results than Google etc. because they can index the meaning of documents, not the words they contain.
- Agent technology that can retrieve information for you, price compare items you are shopping for and automate a number of interesting processes.
- Automatic clustering of websites around subjects of interest to create much richer knowledge-oriented navigation.
But the Semantic Web project can't succeed as it is currently specified. It is working towards standards for storing and managing the meta-content required for this Brave New World but doesn't tackle the much harder problem of how to create meta-content that is consistent and pervasive. At present this is left to individual web page authors with no mechanism to ensure consistency. Without consistency, the Semantic Web is doomed. If I tag a web page as being about "software engineering" and another person uses the tag "computer programming", the Semantic Web can't tell they are about the same thing. In a world where an estimated 70% of web pages don't even have a title, isn't it rather unrealistic to expect most web page authors will learn a complex new representation like RDF and consistently tag their pages with it?
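To make the tagging problem concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the page names, the `find_by_subject` helper, and the `SAME_AS` table are assumptions, not part of RDF or any W3C spec. It only shows that exact-match lookup misses pages tagged with a synonym unless someone supplies a mapping between the two vocabularies.

```python
# Hypothetical annotations: each author picks a free-form subject tag.
pages = {
    "page_a.html": "software engineering",
    "page_b.html": "computer programming",
}

def find_by_subject(pages, subject):
    """Return pages whose tag matches the query string exactly."""
    return sorted(url for url, tag in pages.items() if tag == subject)

# Hypothetical hand-maintained equivalence table. Who writes and
# maintains a table like this is exactly the unsolved part.
SAME_AS = {
    "computer programming": "software engineering",
}

def canonical(tag):
    """Map a tag to its agreed-upon form, or leave it unchanged."""
    return SAME_AS.get(tag, tag)

def find_by_subject_mapped(pages, subject):
    """Like find_by_subject, but compares canonical forms of the tags."""
    target = canonical(subject)
    return sorted(url for url, tag in pages.items() if canonical(tag) == target)

exact = find_by_subject(pages, "software engineering")
mapped = find_by_subject_mapped(pages, "software engineering")
```

Without the table, `exact` holds only `page_a.html`; with it, `mapped` holds both pages. The code is trivial; getting millions of authors to agree on the table is not.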
Clay Shirky has a very good article on this. I recommend reading it before you get too excited about the Semantic Web.
Sailing over the event horizon
The semantic web does keep it simple. It's supplemental to current web pages and is optional. It simply adds more data for computers to read. It's something very basic that leaves the opportunity for much more complex things later. Anyone who can't understand a triple - a subject, verb, and object - probably failed second grade English.
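For anyone who wants to see how small a triple is, here is a toy sketch in Python. It is not RDF and uses no RDF library: the `Triple` type, the `match` helper, and the sample data are all invented for illustration. (RDF calls the "verb" a predicate, so the field is named that way here.)

```python
from typing import List, NamedTuple, Optional

class Triple(NamedTuple):
    subject: str
    predicate: str  # the "verb" of the statement
    obj: str

# Hypothetical statements about two hypothetical pages.
triples = [
    Triple("page_a.html", "hasTopic", "software engineering"),
    Triple("page_a.html", "hasAuthor", "alice"),
    Triple("page_b.html", "hasTopic", "computer programming"),
]

def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching every field that is not None (None is a wildcard)."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]

topics = match(triples, predicate="hasTopic")
author = match(triples, subject="page_a.html", predicate="hasAuthor")
```

Here `topics` comes back with two statements and `author` with one. Real RDF identifies subjects and predicates with URIs rather than bare strings, but the shape of the data is the same.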
Developers: We can use your help.
Excuse me, but can they stop overdesigning HTML? It's a freaking pseudo-layout language. The whole beauty of it is that complete newbs can learn to text-edit it. Now, with all the crufty front matter, it's impossible to hand-write HTML that will pass a verifier. Many of the more useful layout features that don't have anything to do with style classes are being put into CSS instead of HTML proper. HTML is a dead simple concept, and as such should be a newbie tool. Instead, it's just getting increasingly baroque. It really doesn't need more crap.
Now, the HTTP system itself - that could do with some upgrades. More support for "push" content is what it needs - like Slashdot telling _me_ when there is new news so my browser can refresh, and sending me a diff instead of the full new page. Or support for distributed file hosting. Or some way to receive HTTP requests from behind a NAT (even if it requires an external name server to help you along) without forwarding ports to yourself (if that's at all possible). My knowledge of network topology is limited at best, but if I can get ICQ messages while behind a NAT, why can't I serve HTML? It's still just receiving unrequested data - messages in one case, requests for content in the other.
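The "send me a diff" part is easy to picture, at least on the server side. Below is a rough Python sketch using the standard library's `difflib`. The page contents and the `make_diff` helper are hypothetical, and nothing here is an HTTP feature: it only shows what a line-level delta between two versions of a page could look like. Applying the delta on the client is not shown.

```python
import difflib

# Hypothetical old and new versions of a page, one line per list entry.
old_page = [
    "<h1>News</h1>",
    "<p>Story one</p>",
    "<p>Story two</p>",
]
new_page = [
    "<h1>News</h1>",
    "<p>Story zero</p>",
    "<p>Story one</p>",
    "<p>Story two</p>",
]

def make_diff(old, new):
    """Return a unified diff between two versions as a list of lines."""
    return list(
        difflib.unified_diff(old, new, fromfile="old", tofile="new", lineterm="")
    )

delta = make_diff(old_page, new_page)
unchanged = make_diff(old_page, old_page)

for line in delta:
    print(line)
```

When the page has not changed, `unchanged` is an empty list, so there would be nothing to send at all. On a tiny page like this the diff is no smaller than the page itself; the saving only shows up when a long page changes in a few places.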
Having access to tons of annotated data is a wonderful dream. I could see academic institutions going for this, but not corporations for the most part.
You see, corporations don't WANT you to be able to access data easily. One of the major driving factors of the current web is advertising. Basically, this is something none of us want to see, but with web pages it's easy to try and force us to see it. Properly annotated data would kill advertising as we know it, something the corporations will not let happen.
Corporations do not want us to be able to easily compare data either. Take prices for instance. Many stores have promises like "we'll match any price". This works on the basis that it's hard and tedious to go check other prices and people will think "well, hey, if they are making this promise surely they already have the lowest price otherwise everyone would be calling them on it". Well, no, most people will not go check for lower prices, and if they do and end up finding lower prices elsewhere, they will often buy elsewhere. Easy price comparisons are not something online stores want to allow.
Ultimately, most sites want to force you to look at data they want you to look at (ads). I doubt we'll ever see all web data in a nice annotated form allowing us to view only what we are interested in.
Yes, it is beautiful. Why? 'Cause it was written by a twelve year old who read a three page handout her teacher gave her on "how to make a webpage", and she's been learning by tinkering since then.
People are not coders. People are users. Users want to just use things - not muck around with research, not have to learn whole new lexicons for each task, just get stuff done. HTML is practically the only pure-text system they seem to do that in anymore - everything else is covered in complex GUIs. To many people, HTML is the bridge to programming. With that bridge lost, they might never want to use anything that's not pure WYSIWYG, and there aren't many programming languages like that.
Like it or not, HTML has become the learning ground for many budding computer users.
My CSS complaints came out wrong. What I meant was that originally, everything that could be done in CSS could be done in HTML as well. You could write proper, stripped HTML and use robust CSS, or you could just do the whole damn thing in ugly, ugly HTML, and still have access to the whole feature set. Now there are features that exist only in CSS, beyond simply defining classes of things that already occur in HTML. So newb HTML-only users end up with an incomplete feature set. If CSS were more intuitive this wouldn't be a problem, but currently it is far too cryptic to push onto an uninformed user. As a result, learning users stick to pure HTML, and thus are stuck with half a feature set.