Untangling Web Information
Ostracus writes "The next big stage in the evolution of the Internet, according to many experts and luminaries, will be the advent of the Semantic Web — that is, technologies that let computers process the meaning of Web pages instead of simply downloading or serving them up blindly. Microsoft's acquisition of the semantic search engine Powerset earlier this year shows faith in this vision. But thus far, little Semantic Web technology has been available to the general public. That's why many eyes will be on Twine, a Web organizer based on semantic technology that launches publicly today."
First, before anything even really started, The Semantic Web was merely a pipe dream.
.... twice. And we were happy.
... the Semantic Web went mainstream and started getting real.
...
But that was the long long ago, so let's fast forward a few years. When its future looked most bleak, Sir Tim (who can summon fire and explosions at will) told us what to expect
Then a few years passed and nothing.
Until the 2006 World Wide Web conference made us suspicious of the Semantic Web. We spread rumors about the Semantic Web and told all the cooler technologies that the Semantic Web was just out to rape our privacy. So we challenged the Semantic Web. And claimed it would fail.
Just when I was expecting Sir Tim to get underneath a blanket & release a sobbing YouTube video of everyone being bastards for attacking The Semantic Web right when she was going through really tough times and that we should all just leave her alone
I've got no problem with people pushing technologies, but this one sounds more like a soap opera than anything. Has the Semantic Web changed anything for anyone on Slashdot? I haven't seen anything directly if it has.
My work here is dung.
will include a digital rights management compliant
cloud based on a service oriented architecture
that will empower my workgroup over the new semantic web 2.0
insert license fee here.
Good people go to bed earlier.
when will slashdot move to a semantic web model?
Do you even lift?
These aren't the 'roids you're looking for.
Semantics for search engines? You mean like "hunt" engines, "inquest" engines, or "inquiry" engines? Come see the new and exciting functions inherent in the brand new Microsoft "find stuff" engine!
"Creationists make it sound as though a 'theory' is something you dreamt up after being drunk all night." -Asimov
The advertisers and search engine optimizers have already shown that they have absolutely ZERO qualms about feeding false or misleading information to search engine robots in the form of page cloaking, hidden frames, false meta tags, etc., so what makes anyone believe that they will not play the same games, possibly with even greater results, against the semantic web? There is money to be made by gaming the system, and as long as website operators can describe themselves on the semantic web, they will describe themselves in whatever way drives traffic to their sites and gets ad hits, truth be damned.
as an "early adopter" all i can say is this is the most overhyped and pathetic bookmarking site i've seen in a while.
all it does is let you bookmark URLs (via the amazing tech of "bookmarklet"), and then print them URLs embedded in a lot of tags (awww, yeah, RDF, semantics-schemantics). if that is what the semantic web ought to be, thanks, but how about no.
i tried to upload a picture via their e-mail system from my phone. it was a jpeg with embedded location data. guess what I got -- I got an "item" classified as "attachment".
so, again, twine? how about no.
Looking for "Penn State" returned two "tweens". One for the Golden State Warriors and the other for State Cell Phone Driving Laws. How relevant.
Oh, and here's to hoping "tween" doesn't catch on as a buzzword... ugh.
Twine seems to be just a generic contextual search engine, as opposed to a pure keyword search engine. While it's a step, it's a very tiny step.
What I want to see is more about the correlation between topics. For example, if I'm looking into PHP templating and search twine, I get a few people's bookmarks on the topic. Nothing especially useful, and definitely nothing I couldn't find elsewhere. With real semantics I'd want to see a list of various templating engines, pro and con articles grouped for each, and maybe other sections on related design patterns and frameworks.
In other words, I want to see semantics. Context search isn't going to make anyone turn their head.
Developers: We can use your help.
...how are they supposed to teach a machine to infer meaning better than they're able to?
I'm seriously wanting to know.
This concept of a semantic web is pretty new to me as a general "Web 2.0" type buzzword. There's no question that in general we want our computing experiences to be more thorough and intelligent. But if we're talking about computers analyzing the web, what we are really looking for, it seems, is true artificial intelligence. We want the type of AI that Tony Stark has. And I think 25 years from now we may start getting systems that come close to that goal, but there still appears to be a lot of work required before we realize those dreams.
Part of the hardcore faithful who believed in Apple long before it was cool again to do so
Until then, I'm sticking with Lynx!
You should try hakia.com.
In a nutshell, the goal of the Semantic Web is to bring knowledge representation to the Web (using graphs, networks, binary predicates, however you want to call it).
I've been trying to apply data from the Semantic Web for a few years now.
I can see two roadblocks to mainstream adoption:
* Web data is immensely scruffy. If thousands of people contribute to a dataset without any restrictions, you get a mess (e.g. multiple URIs used to denote the same class or individual, which results in fractured data). Having said that, I can see some convergence happening on reusing URIs (for classes that has been happening for a while now; for instances it is getting better every day).
* Without proper data, it's hard to show the benefit of having a web-wide knowledge base. Right now, my marketing pitch for our semantic web search engine is to go "from documents to objects", i.e. you want to locate objects (the person CmdrTaco) rather than documents matching keywords.
Once you have achieved a web-wide knowledge base of decent quality, you can start thinking about how to navigate that information space to actually answer questions (and I don't mean natural language understanding, but a point-and-click, menu-based interface). CmdrTaco's phone number, people he knows, blog posts he's written, and so on.
The chicken-and-egg circle is slowly breaking up. For a demo, our system is online at http://swse.deri.org/.
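To make the "fractured data" and "from documents to objects" points above concrete, here is a minimal sketch in plain Python. It is not how SWSE works; every URI, property name, and value is made up for illustration, and "sameAs" stands in for a real equivalence property such as owl:sameAs. The idea is to collapse URIs that denote the same individual onto one representative, then group all statements by that object.

```python
from collections import defaultdict

# Hypothetical triples (subject, property, value); nothing here is real data.
TRIPLES = [
    ("http://example.org/people#CmdrTaco", "name", "CmdrTaco"),
    ("http://example.org/people#CmdrTaco", "knows", "http://example.org/people#Hemos"),
    ("http://example.net/id/rob-malda", "wrote", "http://example.net/posts/1"),
    ("http://example.net/id/rob-malda", "sameAs", "http://example.org/people#CmdrTaco"),
]


def canonical_map(triples):
    """Return a function mapping any URI to the representative of its 'sameAs' group."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps chains short
            uri = parent[uri]
        return uri

    for s, p, o in triples:
        if p == "sameAs":
            a, b = find(s), find(o)
            if a != b:
                # Pick the lexicographically smaller URI so the result is deterministic.
                parent[max(a, b)] = min(a, b)
    return find


def describe(triples):
    """Group every non-'sameAs' statement under the canonical URI of its subject."""
    find = canonical_map(triples)
    objects = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        if p != "sameAs":
            objects[find(s)][p].add(find(o))
    return objects


if __name__ == "__main__":
    for uri, props in describe(TRIPLES).items():
        print(uri)
        for prop, values in sorted(props.items()):
            print("  ", prop, sorted(values))
```

Without the "sameAs" statement the two URIs would stay separate objects, which is the fracturing described in the first bullet.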
I browsed to the Twine page mentioned in the summary and every single link on the page when clicked takes you to a blank white page. So much for a launch.. :)
I think that we will see something like the semantic web, but it will likely develop from grass-roots efforts rather than top-down driven standards.
That said, I have a chapter on the SW in my latest book (hopefully about to be printed in a few weeks) and my next book project will be about the commercial AllegroGraph SW kit.
Good open source and commercial tools exist for RDF repositories, SPARQL queries and inference, etc. The problem is that the current round of applications don't really excite me (yet).
The really big win for RDF/RDFS is the ability to use multiple sources of information without any explicit data conversion: that is (at least partially) what RDFS is for.
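A rough sketch of that last point, in plain Python rather than any real RDF toolkit: two sources use different property names, nobody converts either dataset, and a couple of rdfs:subPropertyOf statements let one query cover both. The prefixes a:, b:, and ex: and all the data are invented for the example; only the subPropertyOf rule itself comes from RDFS.

```python
# Two hypothetical sources with their own vocabularies (all names are made up).
SOURCE_A = {("ex:book1", "a:title", "Some Book")}
SOURCE_B = {("ex:book2", "b:name", "Another Book")}

# Schema statements that relate both vocabularies to a shared property.
SCHEMA = {
    ("a:title", "rdfs:subPropertyOf", "dc:title"),
    ("b:name", "rdfs:subPropertyOf", "dc:title"),
}


def merge(*graphs):
    """Merging sets of triples is just set union; no conversion step is needed."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def apply_subproperty(graph):
    """Apply one RDFS rule until nothing new appears:
    if (p subPropertyOf q) and (s p o), then also (s q o)."""
    inferred = set(graph)
    changed = True
    while changed:
        sub = {(s, o) for s, p, o in inferred if p == "rdfs:subPropertyOf"}
        new = {(s, q, o) for s, p, o in inferred for p2, q in sub if p == p2}
        changed = not new <= inferred
        inferred |= new
    return inferred


if __name__ == "__main__":
    graph = apply_subproperty(merge(SOURCE_A, SOURCE_B, SCHEMA))
    print(sorted(o for s, p, o in graph if p == "dc:title"))
```

The original a:title and b:name statements stay in the graph untouched; the shared dc:title statements are added alongside them.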
Buncha anti-semantists on this site.
Life needs more saving throws.
All the semantic web gives you is the ability to layer a logical design over data. It's like a database design, except it's "open world", meaning there can be many different designs, it's up to the agent to pick the one it trusts, and it can't really make assumptions based on what it doesn't know.
The only inferences made are those that have been imagined by some human designer. And they might be very wrong, if the designer was wrong.
The "kinds" of inferences available are also pretty limited, like hierarchy or transitivity, or set membership. Useful, yes, but stepping stones...
-Stu
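For anyone who wants to see what "hierarchy or transitivity" plus "open world" amounts to, here is a small sketch in plain Python. The class names and the individual "rex" are invented, and this is not a real reasoner; it only follows a designer-supplied subclass hierarchy and answers "unknown" (None) instead of "false" when nothing is entailed.

```python
def superclasses(cls, subclass_of):
    """Collect every class reachable from cls by following subclass links (transitive)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(individual, cls, types, subclass_of):
    """Return True if membership is entailed, None if it is simply not known.

    Open world: missing information is never treated as a 'no'.
    """
    for asserted in types.get(individual, ()):
        if asserted == cls or cls in superclasses(asserted, subclass_of):
            return True
    return None


if __name__ == "__main__":
    # Hypothetical design: someone decided this hierarchy; the code only follows it.
    hierarchy = {"Poodle": ["Dog"], "Dog": ["Mammal"], "Mammal": ["Animal"]}
    types = {"rex": ["Poodle"]}
    print(is_a("rex", "Animal", types, hierarchy))
    print(is_a("rex", "Cat", types, hierarchy))
```

If the designer had wrongly put "Dog" under "Reptile", the same code would happily infer that too, which is the point made above.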
serving up web pages blindly IS what the 'web' is supposed to do. More than that and you have a new application. I'm always amazed at how many even technical people discuss the web as if it WAS the Internet.
I think people will find that the Semantic Web is a chicken and egg problem, much like the beginnings of the Internet were. For either to be useful, more people had to be using it, and for people to use it there had to be something there that people wanted to use.
There's one issue to think about when discussing the new web, be it social sites or the semantic web: the quantity and quality of information required for both to work properly versus privacy issues.
Shai Schticks:"You don't make peace with friends, you make peace with enemies"
Search engines pretty much ignore meta tags, because spammers used them to misrepresent their pages and get more hits, so why do these "experts" expect anything different from tags which try to represent "meaning" in the semantic Web?
Reduce, reuse, cycle
...technologies that let computers process the meaning of Web pages...
Assuming it is buggy enough, it could serve as an automated summary-generator for slashdot.
I worked at a big tech company doing SemWeb, where my experience was exactly the same. Everyone was scratching their head.
Now I've moved into a Healthcare IT environment, where SemWeb makes perfect sense. It's like the best tool for the job.
The essential difference is which end of the stick you are picking up. The tech folks who are trying to shoe-horn RDF/OWL onto anything and everything (e.g. search) are failing. On the other hand, for Healthcare/Life science folks who have to work with heavy knowledge-intensive stuff, it's working like a charm.
The SemWeb story is quite similar to Amazon Kindle.. wherein the tech folks are hating it whereas real users are all over it.. So it might seem like a failure to all you tech bozos.. but the domain experts are lovin' it.
... or how to launch a site using slashdot and a poorly written summary of vague buzzwords.
"Violence is the last refuge of the competent, and, generally, the first refuge of the incompetent" - Thing_1
I have a dozen web sites, and the couple of dozen search engines trying to index the sites grew unsitely. So I now block all but the top six.
Here's a list of excluded ones:
#
# blocked UAs
#
regexp -nocase {^Mozilla/4.0$|CorenSearchBot|ActiveX|iPhone|nutch|NaverBot|attributor|rarest|spider|DBLBot|Robot|Indy Library|Yandex| obot|ISC Systems|OOZBOT|WebDataCentreBot|Twiceler|discobot|SnapPreviewBot|Snapbot|Szukaj|BecomeBot|oder so|proximic|scoutjet|mrcarlito|Transcoder|Opera Mini|SuperBot|WebAlta} $::env(HTTP_USER_AGENT)
There are other SE blockages, based on other criteria, such as if someone built yet another SE from amazonaws.com and it comes in from there.
And this crap:
unknown HTTP_USER_AGENT 'mcdpvnh6j drlpfxvrnfgIyy a Ivpns'
unknown HTTP_USER_AGENT 'mrckvowekp Jl sjclikwhe ih'
unknown HTTP_USER_AGENT 'xiuesr9bkdff9j 9oawotxtds orhdkh'
I did find one trying to figure out music choice suggestions by examining a music site; this was described by their web site as building a semantic search.
Their beta SE was faulty and kept reindexing us at a high rate.
They got blocked too.
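For anyone not running Tcl, the user-agent blocklist above could be sketched in Python roughly like this. It is only an illustration: it uses a handful of the tokens from the list rather than all of them, the function name is_blocked is made up, and unlike the original pattern it escapes the dot in "Mozilla/4.0".

```python
import re

# A subset of the tokens from the blocklist above, matched case-insensitively.
# The first alternative only matches a user agent that is exactly "Mozilla/4.0".
BLOCKED_UA = re.compile(
    r"^Mozilla/4\.0$|nutch|NaverBot|spider|Yandex|Twiceler|discobot|BecomeBot",
    re.IGNORECASE,
)


def is_blocked(user_agent):
    """Return True if the user agent contains any blocked token."""
    return BLOCKED_UA.search(user_agent) is not None


if __name__ == "__main__":
    for ua in (
        "Mozilla/4.0",
        "Mozilla/4.0 (compatible; MSIE 6.0)",
        "Mozilla/5.0 (compatible; YandexBot/3.0)",
    ):
        print(ua, "->", "blocked" if is_blocked(ua) else "allowed")
```

Substring matching like this is blunt: a token such as "spider" also blocks any legitimate client that happens to include it in its user agent.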
http://harvey-mars.com/
I typed in "XUL" and it surprised me with results that Google will have trouble surpassing. Just go to Google and compare.
(I tried even more keywords at once, but I was simplifying it more and more... so I ended up with one keyword to get the best results).
Well, I've got to get back to work. When I stop rowing, the slave ship just goes in circles.