Detecting Conflict-Of-Interest on the Semantic Web

Posted by ryuzaki0 on Friday December 8, 2006 @07:00AM from the smoothing-things-out dept.

CexpTretical writes "At the 15th International WWW Conference in Edinburgh Scotland, Refereed Track on Semantic Web accepted many thorough and interesting academic papers on semantic web research on subjects related to where the Web is in the Semantic Web? One such paper nominated for Best Paper Award, Semantic Analytics on Social Networks: Experiences in Addressing the Problem of Conflict of Interest Detection hits on the whole subject of validation and/or verification in the brave world of so called "Web 3.0" topologies/frameworks/architectures. The paper describes a "Semantic Web application that detects Conflict of Interest (COI) relationships"."

34 comments

Slashdot? by DigiShaman · 2006-12-08 07:02 · Score: 0, Troll

With all the "Slashvertisements", Slashdot sure could use one of these servers.

--
Life is not for the lazy.
"Web 3.0" topologies/frameworks/architectures by delirium of disorder · 2006-12-08 07:10 · Score: 2, Funny

WTF is up with this web reversioning trend?

--
------ Take away the right to say fuck and you take away the right to say fuck the government.
Conflict of interest? by jlowery · 2006-12-08 07:10 · Score: 4, Funny

Perhaps the harder problem is detecting any interest in the Semantic Web.

--
If you post it, they will read.
1. Re:Conflict of interest? by silentounce · 2006-12-08 08:57 · Score: 1
  
  Shut up you anti-Semant!
  
  --
  There are many tongues to talk, and but few heads to think. -Victor Hugo
I'd like to see this applied to politicians and by zappepcs · 2006-12-08 07:20 · Score: 1, Interesting

their campaign contributions, former colleagues etc.
That would make for an interesting web application and an interesting election year...

--
Support NYCountryLawyer RIAA vs People
It's all semantics by Ranger · 2006-12-08 07:20 · Score: 3, Funny

Calling it a conflict-of-interest is really a matter of semantics. The conflict arises when people see the words semantic web. They are somewhat interested but are conflicted in not admitting they don't know what the word semantics means and are too embarrassed to look it up.

--
"You'll get nothing, and you'll like it!"
1. Re:It's all semantics by eln · 2006-12-08 07:55 · Score: 1
  
  The Internet means never having to feel embarrassed about looking things up. For example, let's say I answered your post by saying, "Semantics is the study of the relationships between various signs and symbols and what they represent." Clearly, you would have no way of knowing whether I went to dictionary.com and pulled a random definition out, or if I just knew that off the top of my head.
2. Re:It's all semantics by Saikik · 2006-12-08 13:04 · Score: 1
  
  I didn't RTFA, but I think he was talking about Anti-Virus
  
  This web thing sounds like the perfect way to catch a lot of nasty bugs.
3. Re:It's all semantics by Anonymous Coward · 2006-12-08 14:41 · Score: 0
  
  Semantics, you know, it means, what does it mean?
Semantic Web Explanation? by Anonymous Coward · 2006-12-08 07:25 · Score: 0

Let's hear an example of what this semantic web will look like to me, J. Random Hacker. In nethack dialogue preferably.

I'll start:

f - a blessed web browser
af
Suddenly, your blessed web browser starts up! --more--
You see a...
"No-fly-list" for Conflict of Interest by radtea · 2006-12-08 07:29 · Score: 4, Informative

This is an excellent paper that highlights many of the issues that will be encountered as the naive realists promoting the semantic web hit the hard fact that data quality is poor and identification is hard. From the paper's conclusions:

The goal of full/complete automation is some years away. Currently, quality and availability of data is often a key challenge given the limited number of high quality and useful data sources. Significant work is required in certain tasks, such as entity disambiguation.

As a practical tool the Semantic Web has all of the problems that no-fly lists have. People share names with each other and one individual may appear under multiple names. Datasets are radically incomplete, and an awareness of the possible uses to which data may be put will encourage the less scrupulous amongst us to deliberately devalue datasets by including misleading or incomplete information.
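
To make the naming problem concrete, here is a minimal Python sketch. The records and the normalize_name helper are invented for illustration and are not taken from the paper; the assumption is that records 1, 2 and 4 are one person and record 3 is somebody else.

```python
from collections import defaultdict

# Hypothetical records: (record_id, name as written, affiliation).
# Assumption: 1, 2 and 4 are the same person; 3 is a different John Smith.
RECORDS = [
    (1, "J. Smith", "Univ. A"),
    (2, "John Smith", "Univ. A"),
    (3, "John Smith", "Univ. B"),
    (4, "Smith, John", "Univ. A"),
]

def normalize_name(name):
    """Crude normalization: lowercase, drop dots, turn 'Last, First' into 'first last'."""
    name = name.replace(".", "").strip().lower()
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    return " ".join(name.split())

def group_by_name(records):
    """Group record ids by normalized name only (the naive approach)."""
    groups = defaultdict(list)
    for record_id, name, _affiliation in records:
        groups[normalize_name(name)].append(record_id)
    return dict(groups)

groups = group_by_name(RECORDS)
```

Matching on the name alone fails in both directions here: record 1 is split off from the person it belongs to, and record 3 is merged with a stranger. Adding the affiliation would separate record 3, but only until someone changes jobs.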

Even without deliberate poisoning of the data, it is doubtful that standard vocabularies will be used in sufficiently consistent ways by various institutions and individuals to create homogeneous (and therefore useful) datasets. For example, people who do multi-centre cancer trials expend an enormous amount of energy on data curation and auditing, which includes actual site visits to institutions and periodic audits of data, as well as centralized control of what gets into the final database. And this is for data collected by cancer centers and cancer docs who are nominally committed to following precise protocols and have been given training in what the fields in the various forms are supposed to mean. Yet centres can and do get delisted from studies due to lack of compliance.

The same thing can be seen in nominally standardized data formats like MAGE-ML and its cousins: industry-standard XML-based languages for marking up genomic datasets. There are specific elements that are intended for particular pieces of data, but a depressing amount of the time companies decide to put the really important stuff in a catch-all element, because "it's easier" than understanding the well-documented and clearly defined format.

Likewise, medical images created in DICOM format by major equipment manufacturers not infrequently have clear and blatant violations of the DICOM standard, despite over a decade of effort to ensure a reasonable level of compliance. And these are not subtle violations, but missing required fields, or incorrect data in required fields ("because all our images are 512x512 why should we have to fill in the width and height all the time? It's easier to just leave them zero.")
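
The kind of check a receiving system ends up writing looks roughly like the sketch below. It is only an illustration: the header is a plain Python dict standing in for whatever a real DICOM reader would return, and header_problems is a made-up helper. Rows and Columns are the DICOM attributes that carry image height and width.

```python
# Attributes that must be present and hold a positive integer.
# Rows = image height, Columns = image width.
REQUIRED_POSITIVE = ("Rows", "Columns")

def header_problems(header):
    """Return a list of human-readable problems found in a parsed header dict."""
    problems = []
    for field in REQUIRED_POSITIVE:
        if field not in header:
            problems.append(f"{field}: missing")
        elif not isinstance(header[field], int) or header[field] <= 0:
            problems.append(f"{field}: invalid value {header[field]!r}")
    return problems

# A compliant header and the "just leave them zero" kind.
good = {"Rows": 512, "Columns": 512}
lazy = {"Rows": 0, "Columns": 0}
```

Calling header_problems(good) yields an empty list, while header_problems(lazy) flags both fields. The check is trivial; the hard part is that every consumer has to write it because producers skip the step.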

People are stupid and lazy. I know I am. And we use the same words to mean different things, and different words to mean the same thing. The Semantic Web requires people to be smart and hardworking, and to use standardized vocabularies in standardized ways. Decades of failed or at best partially successful data exchange protocols strongly suggest that these requirements will not be fulfilled.

--
Blasphemy is a human right. Blasphemophobia kills.
1. Re:"No-fly-list" for Conflict of Interest by SpaceCadetTrav · 2006-12-08 07:39 · Score: 1
  
  Allow me to sum up your post: Semantic Web - Sounds great. Good luck with all that.
  
  --
  Life in Orange County
2. Re:"No-fly-list" for Conflict of Interest by maxume · 2006-12-08 07:49 · Score: 1
  
  The upshot is that it more or less inspired 'Web 2.0'; people realized that some data was better than no data, and that correctness didn't matter as much as a few people (rdf) wanted. So along came delicious and flickr and so forth, and there was the brief period of excitement-chasing where 'folksonomies' replaced ontologies and blah blah blah, but people pretty much said, 'No, I just like to be able to look at all my vacation pictures at once' and things died down, but flickr really is a better photo management site, and delicious really is better than rolling your own link log, etc.
  
  --
  Nerd rage is the funniest rage.
3. Re:"No-fly-list" for Conflict of Interest by fistfullast33l · 2006-12-08 10:01 · Score: 2, Insightful
  
  So there's 9 authors on this paper. Which one are you and which one is the submitter of the article?
4. Re:"No-fly-list" for Conflict of Interest by Anonymous Coward · 2006-12-09 08:27 · Score: 0
  
  Allow me to sum up your post: Semantic Web - Sounds great. Good luck with all that.
  
  That and the core problem with the semantic web is that you're trusting producers of the data to be altruistic.
  
  You can see this very easily even at the filenames and meta data in files used within a normal office environment. Unless there is a mandate and continued reinforcement from above, users will *not* fill out meta data fields and will use the shortest, laziest folder/file names possible.
  
  In short, users will generally not expend any amount of effort for long-term gain that will not benefit themselves in the short term. It's human nature.
Wow... by Duncan3 · 2006-12-08 07:30 · Score: 1

People that know the most about something have an interest in it *gasp* Smart people in X like to hang around other smart people in X, sometimes even go to conferences *whoa*

This is old news in academia, and yet things still work pretty darn well there. That is because reputation is important, and as soon as you do something unethical or even just stupid, you're toast.

If a field gets too "inbred" as far as their research/reviewing goes (e.g. a group always present at workshops at 3rd rate conferences, and the program committee is always the same 20 people), the community just ignores them.

The web on the other hand, you just whip up a new identity and keep spamming, but that's why most people love it, zero accountability, which of course leads exactly where you expect it to.

--
- Adam L. Beberg - The Cosm Project - http://www.mithral.com/
He Said, She Said... by Jeremiah Cornelius · 2006-12-08 07:32 · Score: 1

I said who put all the things in your head
Things that make me feel that I'm mad
And you're making me feel like I've never been born

She said you don't understand what I said
I said no, no, no you're wrong
When I was a boy, everything was right
Everything was right

I said even though you know what you know
I know that I'm ready to leave
'Cause you're making me feel like I've never been born

--
"Flyin' in just a sweet place,
Never been known to fail..."
1. Re:He Said, She Said... by Jeremiah Cornelius · 2006-12-08 07:35 · Score: 3, Funny
  
  That comment was pure Web 4.0. The oblique referential web.
  
  --
  "Flyin' in just a sweet place,
  Never been known to fail..."
Conflict of Interest or "common interest"? by RhettLivingston · 2006-12-08 07:33 · Score: 1

It is far more widely useful to view this as a means of finding people who have common interest and may actually be collaborating. Imagine building a database of relationships, appointments, investments, etc. that concentrates on the rich and famous, businessmen, politicians, and others with power and then running these algorithms on those relationships. Imagine examining those relationships in the context of subjects (relationship strengths should differ depending on the subject through which the relationship is being judged). The world of politics could be turned on its head if a useful map of relationships could be created for each political subject and easily traversed by strength. Surely though, this isn't new. In particular, I've seen many calls for research into this aspect of data mining from the intelligence community, a community that has also been investing in the creation of database technologies that can hold and semantically access many petabytes of data.
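
A toy version of the per-subject relationship map might look like the Python sketch below. Everything in it is hypothetical: the names, the subjects, the strength scores and the strongest_ties helper are made up for illustration and have nothing to do with the paper's actual algorithm.

```python
# Hypothetical relationship records: (person_a, person_b, subject, strength in 0..1).
# The same pair can have different strengths depending on the subject.
RELATIONSHIPS = [
    ("alice", "bob", "energy", 0.9),
    ("alice", "carol", "energy", 0.2),
    ("alice", "bob", "health", 0.1),
    ("bob", "dave", "energy", 0.6),
]

def strongest_ties(relationships, person, subject):
    """List a person's ties for one subject, strongest first."""
    ties = []
    for a, b, subj, strength in relationships:
        if subj != subject:
            continue
        if a == person:
            ties.append((b, strength))
        elif b == person:
            ties.append((a, strength))
    return sorted(ties, key=lambda tie: tie[1], reverse=True)

bob_energy = strongest_ties(RELATIONSHIPS, "bob", "energy")
alice_health = strongest_ties(RELATIONSHIPS, "alice", "health")
```

Here bob's closest tie on energy is alice, while on health the same pair barely registers. The traversal is the easy part; collecting trustworthy strengths is where the real work would be.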
1. Re:Conflict of Interest or "common interest"? by ebers · 2006-12-08 07:44 · Score: 1
  
  You might start with the notable names database, http://www.nndb.com/
Detecting horrible grammar. by Anonymous Coward · 2006-12-08 07:44 · Score: 0, Troll

At the 15th International WWW Conference in Edinburgh Scotland, Refereed Track on Semantic Web accepted many thorough and interesting academic papers on semantic web research on subjects related to where the Web is in the Semantic Web?

Uhh...wtf? Does that make sense to anyone? How is that sentence a question?

For fucks sake Zonk, EDIT! DO YOUR F'IN JOB !

And now back to mine...
Semantic Web by Anonymous Coward · 2006-12-08 07:54 · Score: 0

I think anyone who uses the term "Semantic Web" deserves a cock-punch. Who's with me?
1. Re:Semantic Web by joshetc · 2006-12-08 08:18 · Score: 1, Funny
  
  I think anyone who uses the word "web" followed or succeeded by anything other than "world-wide" or "spider" deserves a cock-punch.
Well duh! :-) by Anonymous Coward · 2006-12-08 08:12 · Score: 0

> Calling it a conflict-of-interest is really a matter of semantics.

Well, of course it is! This IS the Semantic Web they're talking about. I think everything on it is a matter of semantics :-)

At least it's not the Symantec Web, though. That'd probably make high end PCs slow to a crawl and ask you "Are you sure you want to open index.html? It might contain a virus or something, although we're not really sure. We're just asking to make it absolutely clear that it's your fault if it breaks something."
1. Re:Well duh! :-) by xmedar · 2006-12-08 12:48 · Score: 1
  
  I think everything on it is a matter of semantics :-)
  
  Except the syntactic sugar ;-)
  
  --
  Any sufficiently advanced man is indistinguishable from God
wikipedia standardized vocabulary and semantic web by free2 · 2006-12-08 08:26 · Score: 2, Informative

People are stupid and lazy. I know I am. And we use the same words to mean different things, and different words to mean the same thing. The Semantic Web requires people to be smart and hardworking, and to use standardized vocabularies in standardized ways. Decades of failed or at best partially successful data exchange protocols strongly suggest that these requirements will not be fulfilled.

A quite standardized vocabulary actually exists in Wikipedia (markup language, templates, categories).
Here is a list of links that try to combine wikipedia and the semantic web:

http://wiki.ontoworld.org/index.php/Semantic_Wiki_State_Of_The_Art
http://wiki.ontoworld.org/wiki/Sites_using_Semantic_MediaWiki
http://en.wikipedia.org/wiki/Wikipedia_talk:Semantic_Wikipedia
http://www2006.org/programme/item.php?id=4039
http://meta.wikimedia.org/wiki/Semantic_MediaWiki
UH OK by SydBarrett · 2006-12-08 09:05 · Score: 1

Even though I looked at the paper, I still have no fucking idea what this is about.

I vote that next year the Best Paper Award go to "Looseleaf Paper" because it has both holes AND lines.
Meh. by PHAEDRU5 · 2006-12-08 09:06 · Score: 1

This was interesting enough, I guess, as a really high-level description of a process that could be used to build a semantic web.

That said, the researchers picked the domain very carefully - to guarantee a positive-looking result, I guess - but I don't see how this could scale to the web in general, a place where, well, nobody knows you're a dog.

--
668: Neighbour of the Beast
Yes, slightly offtopic.... by UncleTogie · 2006-12-08 09:11 · Score: 1

...but in the article, did you notice their example, "Swoogle"?
Being just curious enough to add a .com after that, I then wondered if they chose that site on purpose. Doubtful, but positive reinforcement DOES work at times...

--
Don't tell me to get a life. I'm a gamer; I have LOTS of lives!
Semantic Web - the new FIPA by DocDJ · 2006-12-08 09:28 · Score: 2, Informative

It's all very well these hucksters peddling the semantic web to funding bodies who don't know any better, as long as they don't start pretending it's anything other than the new FIPA - a collection of committees generating specifications that the world will continue to ignore.
Conflict of interest: Nobody is interested by oren · 2006-12-08 10:20 · Score: 1

The semantic web assumes everyone in the world will play nice and publish his data using standard schemas.

This is estimated to happen soon after Microsoft will switch to a POSIX standard operating system, the RIAA will support buying music in Ogg Vorbis format, and Sony and Microsoft will agree on a common Blu-DVD format, and airline companies will really tell you how they compute their ticket prices. And the Rapture.

Seriously... the idea is beautiful in theory, but in practice people do not want their data to be available. The business case for the semantic web seems to be "let's all cut our profit margins to nothing!". Small wonder it took off like a lead balloon.

Here is a trivial example: product prices. If vendors had wanted to make it easy for everyone on the internet to be able to view their catalog and compare prices, all it would take is a "standard" using  and . There is a reason this doesn't happen. The internet vendors hate pricegrabber and froogle and their kind. They want you the customer to log in to their site to look prices up, thank you very much.
1. Re:Conflict of interest: Nobody is interested by Anonymous Coward · 2006-12-08 13:17 · Score: 1, Interesting
  
  They've done a good job shutting pricegrabber and froogle down, haven't they? No, wait ...
  
  Seriously, some people do want their data to be available, and if that makes them more competitive, why not?
I didn't know that Jewish people by Quintios · 2006-12-08 10:48 · Score: 1

had their own Interwebs. Good for them!

--
Anonymous Cowards are at -6...
Mel Gibson on the Internet by Anonymous Coward · 2006-12-08 12:26 · Score: 0

When I read this I thought "Detecting Conflict-of-Interest on the Semetic Web"

Damn Zionists! :)