Nepomuk Brings Semantic Web To the Desktop, Instead
An anonymous reader writes "Technology Review has a story looking at Nepomuk — the semantic tool that is bundled with the latest version of KDE. It seems that some Semantic Web researchers believe the tool will prove a breakthrough for semantic technology. By encouraging people to add semantic meta-data to the information stored on their machines they hope it could succeed where other semantic tools have failed."
I've tried Symantec products in the past, and they are worse than actually having a virus. They slow your PC to a crawl, get their claws into every part of your computer, and are extremely difficult to purge when you finally give up on them.
What exactly is semantic web, and why haven't I ever heard of it?
--- "When you gotta do something wrong. You gotta do it right. (Fighter)"
...without all the fucking ads and stuff:
http://www.technologyreview.com/printer_friendly_article.aspx?id=21840&channel=computing&section=
Please, start patenting things. And "granting permission" to open source communities.
And refuse it to companies.
Would that work?
I've tried out Nepomuk and, while I have to say that it's promising, it's got miles to go before it's even near ready. The main problem is application support. Sure, you can rate and tag and describe your files in the Dolphin file browser. So what? You can do the same in Vista. This doesn't mean anything if applications don't hook into this and make use of it. Of the apps I've used, Gwenview (a photo viewer) has Nepomuk partially implemented but it's buggy and you need to compile it yourself with it explicitly enabled (this will apparently change in KDE 4.2). Digikam, which allows you to rate, tag, and describe photos already, says that they have no plans to integrate with Nepomuk anytime soon. Amarok 2 has work towards a Nepomuk collection, but the devs say that this will always run alongside the main, MySQL-based collection and it's nowhere near ready yet. My email is in the cloud so I can't even begin to talk about KDE-PIM's support or lack thereof.
The other problem at the moment is a lack of ability to query your semantic data. Can I get anything to show all photos with my wife in them that I've rated four or above? Not at the moment. Hopefully this is coming in KDE 4.2, but as it stands at the moment it makes Nepomuk a case of write-only memory.
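For what it's worth, the query asked for above (photos tagged with a given person and rated four or above) is not hard to express once the data is reachable. A minimal sketch in Python, assuming nothing about Nepomuk's real API: the `Photo` record, the tag strings, and `find_photos` are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    # Hypothetical record; not a Nepomuk or KDE data structure.
    path: str
    rating: int = 0
    tags: set = field(default_factory=set)


def find_photos(photos, required_tag, min_rating):
    """Return paths of photos that carry required_tag and are rated >= min_rating."""
    return [p.path for p in photos
            if required_tag in p.tags and p.rating >= min_rating]


# Made-up sample data standing in for a tagged photo collection.
photos = [
    Photo("beach.jpg", rating=5, tags={"wife", "holiday"}),
    Photo("office.jpg", rating=2, tags={"wife"}),
    Photo("dog.jpg", rating=5, tags={"dog"}),
]

matches = find_photos(photos, "wife", 4)
print(matches)
```

The hard part is not the filter itself but getting every application to write its ratings and tags somewhere a query like this can reach.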
So, maybe something to get excited about in the future, but not quite yet.
Sorry about that.
NepoMUCK? Anything ending in "MUCK" doesn't sound like a good product. The concept is very interesting but the name isn't the best I've seen.
I'm glad that they don't prefix everything with K though.
Yes, I know that Nepomuk means "Networked Environment for Personalized, Ontology-based Management of Unified Knowledge" as stated in the article.
You are not entitled to your opinion. You are entitled to your informed opinion. -- Harlan Ellison
[original research]
http://www.xkcd.com/123/
The Tao of math: The numbers you can count are not the real numbers.
It's not as bad as GIMP :)
It probably depends on the performance of your computer. Mine still handles the script within the browser's script timeout limit, but the script is taking noticeably longer and it's getting annoying enough that I am considering turning off Javascript on slashdot.org if this issue isn't fixed soon.
I love how Mr. Spivack says: "Nepomuk is designed for real people and developers"
I've been experimenting with metadata and blogs, and specifically the cluster analysis of those conversations on the web - so far so good ( http://www.wallcloud.net/ ). I'm really interested in seeing how our desktops change as our information starts "clumping" together for us - our contacts, files, work items, etc. arranging themselves on screen. I'd love to have a dev tool that would allow me to right click and jump to the SQL table I'm hovering over in the code, and maybe gesture to bring up jobs that intersect with those tables. I think our work will one day be more like origami - unfold and turn to find related information...
meh
Everyone is just arguing about semantics.
And I'll tell you why.
The Nepomuk Web site makes me want to chew my own arm off. Now, I'm familiar with the Semantic Web, I'm excited by the idea of semantic organisation. But this site is the epitome of grim, lifeless European research-ese. It completely fails to convey the technological approach, how it works, or why you should give a damn. I get the impression that the team was more interested in the EC funding than actually developing a disruptive technology.
Why can't researchers spend 15 minutes thinking about how to convey the importance and excitement of what they are trying to do in terms of practical examples?
I'm afraid you'll probably have to wait for some enterprising 3rd party to grab the source and build some of the technology into a different product.
Semantic shmemantic. It's so 2008, let it go.
Let's have a new buzzword for 2009. I nominate "emotional".
Confucius say, "Find worm in apple - bad. Find half a worm - worse."
All information is semantic. This slashdot post is information encoded using English semantics. Unfortunately for the machines, the English semantics are way too complicated for them to understand. So they need a simpler set of grammar rules to be able to parse it. But why would anyone want to waste time marking it up just for the benefit of machine readability when Google basically can accomplish the same thing without all that metadata markup cruft?
Football Odds
Dear friends,
I will summarize, for all of you, the semantic web in a few lines getting inspiration from this tool (which is, if we want, the semantic web narrowed to your workstations).
The only scope of this semantic tool should be that of:
1) Tagging pictures/whatever with tags like "my wife", "woman", "person" etc
2) Searching pictures using deduction (given an ontology, let's say a SCHEMA, that states something like: my wife (IS-A woman AND IS-A person))
then searching, e.g., for women's pictures, you will also find those of your wife and so on..
(definitions could be, of course, more complex).
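The tag-and-deduce idea in the two points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `IS_A` table is a hypothetical hand-written ontology (with an extra woman IS-A person link added to show chaining), and `expand` and `search` are made-up helper names, not part of any semantic web toolkit.

```python
# Hypothetical ontology: each tag maps to the broader tags it IS-A.
IS_A = {
    "my wife": {"woman", "person"},
    "woman": {"person"},
}


def expand(tag, ontology=IS_A):
    """Return the tag plus every broader tag reachable through IS-A links."""
    seen = set()
    stack = [tag]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ontology.get(current, ()))
    return seen


def search(pictures, wanted):
    """Return sorted names of pictures whose tags imply the wanted tag."""
    return sorted(name for name, tags in pictures.items()
                  if any(wanted in expand(t) for t in tags))


# Made-up sample data: picture name -> tags the user applied by hand.
pictures = {
    "anniversary.jpg": {"my wife"},
    "conference.jpg": {"person"},
    "sunset.jpg": {"landscape"},
}

print(search(pictures, "woman"))
print(search(pictures, "person"))
```

Searching for "woman" finds the picture tagged only "my wife", which is the whole trick; everything else is bookkeeping.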
Semantic web is all about this and any average joe can easily guess the following LITTLE (ahahah) problems:
1) Everything must be tagged.
2) Information must be TRUE (otherwise you will get bad deductions).
3) Ontologies, that is schemas stating what IT IS, should be shared (please don't die laughing)
4) Not every "SCHEMA" allows deduction (the complexity of what you state is a huge COMPUTATIONAL problem).
finally: watch out .. there's a lot of hype around semantic web.. researchers have been trying to turn this s...t into a business for *ONLY - ahhaha* 8/9 years
There's a good rant from Cory Doctorow about this. I think the best phrase that summarizes people's high hopes for the semantic web is "nerd hubris".
Even if one is using a proper reference frame (not a rotating one), there is still an outward force in the system, namely the reaction force to the centripetal force. Said reaction force could legitimately be called a centrifugal force, but it is a force applied to the central object, not to the outer object, which distinguishes it from what people usually mean when they say centrifugal force.
But this is really quite off-topic.
Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
You got that exactly backwards.
The WWW was an earlier doomed attempt at semantic markup, and up until the summer of '93 or so it looked like it might work. That's when the early rants about people using the tags to control layout instead of to convey meta information (e.g. using em to get italics in a bibliography, dt/dd to make roman numeral lists, etc.) started--or at least when I first became aware of them. In fact, pretty much the entire history of HTML has been a tension between the language's designers and purists, who want users to care about what markup means, even if it does nothing, and the vast majority of users, who only care about what it does regardless of the "meaning" that may be ascribed to it. Once you can get your head around both perspectives some of the goofier things in the whole tawdry history (the Table Wars, XML, CSS) make a lot more sense.
Ok, a little more sense. But only if you already knew what people are like.
--MarkusQ
Everybody and his uncle tries to make systems that will index every piece of crap on your PC and it invariably results in a useless and horrible waste of resources. The biggest annoyance is trying to figure out how to turn these damn things off. Considering that the average user only searches for something once in several years, an on-demand search system makes far more sense.
Excuse me, but please get off my Pennisetum Clandestinum, eh!
There's actually a pretty good introduction to the semantic web in this month's Communications of the ACM. You're right when you say that the semantic web is, as yet, mostly unrealized. But it has huge potential.
Relational databases were in the same position in the late 60's/early 70's. We needed ways to combine and extract information automatically with a simple and expressive language. Relational database management systems, combined with SQL were the result of that, and they were a smashing success. They are now a standard business tool. The key to that success is essentially the role that the database's ontology plays in an RDBMS.
Having spent a lot of time professionally and academically working with and studying database technologies, I've found that most of the work is in understanding your data. Specifically, building a data model. A well-built data model is essentially an ontology. There are various techniques used to make sure that your data can be handled automatically, mainly normalization. This requires a tremendous amount of work on the part of the database designer, but the end result is that the end-user can query this data in fairly simple terms and get an enormous richness of data, sometimes in ways that even the database designer did not foresee. I think the success of database systems is what is driving a lot of the work in building the semantic web.
So you can see-- the big problem with the web is not just that data is unstructured, but that there are no standardized ontologies out there. RDF is an attempt to solve some of these problems simply, because you can embed your ontology, but it may be well off. On the other hand, if new tools make structuring data very easy or natural, people may be motivated to do the extra work because they'll personally benefit from it. For example, many people annotate or organize their photo collections naturally, so that they can share them with others. A smart photo gallery software writer may be able to come along and take advantage of that behavior to further enhance the meaning of that data.
> ... the semantic web never did, and never will take off without significant AI involvement.
I understand that the point of Nepomuk is to allow for automated tagging by the standard tools of the KDE desktop. For instance, say you receive a picture from an IM contact who KDE also knows (through the address book framework, Akonadi) lives in Europe.
Then Nepomuk would allow you to make search queries as "Bring up all the pictures that people living in Europe sent me last week". Well, that's the theoretical goal anyway; we will see if they ever get there.
There's one nifty application already: you can create a Folder View plasmoid on your desktop, and instead of making it display ~/Desktop/ as usual, you can make it display the result of a query through the Nepomuk KIO slave. See here how it works.
-- B.
This sig does in fact not have the property it claims not to have.
I wonder if there's an application that will do as you suggest in even more structured environments where such things really ought to be easily possible.
I'd give a minor digit if my Usenet newsreader would tag every download with where I got it from, when, who posted it, and a few other items that should be easily and consistently retrievable from the message headers (that supposedly conform to a defined format). I'd also love it if my web browser would tag every right-click/downloaded picture with the URL it came from and maybe a few other data elements.
Alas, I suppose it's too late for me. Terabytes of unlabeled, unsorted content will probably remain on my computers until long past my demise.
Since this is on topic...and Nepomuk uses strigi components:
My problem: http://forums.gentoo.org/viewtopic-t-710966.html
I've tried contacting both strigi developers; one doesn't respond and the other says "ask the other guy".
Anyway, I've got about 10000+ JPEGs off my digicams, all of them commented - in the JPEG's internal comment field. When reading about strigi and other desktop search tools, I was thrilled - I could just search for stuff instead of my old standby jhead *.jpg | grep Comment | grep .
However, at least the KDE 4.1 implementation seems to be based on some crappy database with a proprietary format and no chance to import the metadata from elsewhere...and when using stand-alone strigi the whole thing doesn't seem to work. From all that I've read, I SHOULD be able to search e.g. for all images that were taken with ISO >800 or for whatever is in the comment field (although there is apparently some confusion whether the comment is the JPEG comment or the EXIF comment).
Only problem is that it doesn't work.
Anyway, I hate the idea of some separate "metadata-database". DB can be used for CACHING, but all the metadata should really be integral to the file itself. EXIF tags for images, ID3 tags for MP3s, and so on - that way if you copy/move the file all the attached information goes with it and requires no specific "transfer metadata too" support from the copy operation.
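To illustrate metadata living in the file itself, here is a rough Python sketch that pulls the JPEG comment (COM, marker 0xFFFE) segments straight out of the image bytes, with no database involved. It is not how strigi or jhead do it; `jpeg_comments` is a made-up helper, it only walks the header segments before the image data, and it ignores EXIF comments entirely. The Latin-1 decoding is also an assumption, since the COM segment carries no declared encoding.

```python
import struct


def jpeg_comments(data: bytes):
    """Return the text of all COM segments found before the image data."""
    if data[:2] != b"\xff\xd8":          # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")
    comments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:            # lost sync; stop rather than guess
            break
        marker = data[pos + 1]
        if marker == 0xFF:               # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):       # start of scan or end of image
            break
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xFE:               # COM segment
            payload = data[pos + 4:pos + 2 + length]
            comments.append(payload.decode("latin-1"))
        pos += 2 + length
    return comments


# Tiny hand-built stream (SOI + one COM segment + EOI), not a viewable image.
text = b"ISO 800, beach"
sample = b"\xff\xd8" + b"\xff\xfe" + struct.pack(">H", 2 + len(text)) + text + b"\xff\xd9"
print(jpeg_comments(sample))
```

On real files you would call it as `jpeg_comments(open(path, "rb").read())` and grep the result, and a database would only ever be a cache of what this returns.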
Anyway, has anyone on /. actually gotten strigi to work with image files/photos?
...or a parfait...whatever.
It has layers. At the outermost layer, on the scale of the WWW, "conveying meaning" as you describe is indeed futile. Cory Doctorow's "Metacrap" essay sums it up nicely (linked to in this discussion thread): People are dishonest, lazy and stupid when it comes to metadata...and when they aren't there is no way to impose standards that more than one person would agree to insofar as imparting meaning on data.
However even Doctorow admits metadata, at some level and taken in context, can be useful. "Laziness" may be one of the hallmarks of a good programmer, but it is "false laziness" in information management to throw out ANY idea of semantic markup.
There are good examples of useful, easily implemented semantic markup or meta-data. "Self-documenting code" is a prime example. Effective use of comments helps maintainability immensely. It also helps to give variables and functions names more useful than i, j, k (if not used as simple loop counters) or doSomething(). Does it help make your code more machine readable? Not at all--it doesn't make your compiler produce tighter binary code or whatever, but it is essential for maintainability, and it IS semantic markup.
In terms of WWW development, semantic markup need not delve deep into the true meaning of content. Web documents would be IMMENSELY more useful if they simply provided "semantic structure". There is very limited utility in a web page that is one big table element consisting of a JPEG Jigsaw of image links, or worse one big embedded flash object. Merely using (X)HTML properly can provide useful semantic structure: Stop using tables for layout, unless you are actually presenting a TABLE of data, and for all you CSS freaks out there, tables are your friends, STOP fashioning tabular data out of DIVs and SPANs--a calendar is a TABLE of dates--it's OK to use TABLEs! If you are making an ordered or unordered list then use OL or UL--that's what they're there for!
So, without resorting to meta tags--completely within the confines of a globally accepted standard--you now have a document with "structural semantics". You can now perform searches for a "table containing a column named 'x'" or an "image named 'y'" or "a heading containing the word 'z'" and you've done nothing but properly mark up your document with HTML.
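As a rough illustration of the "heading containing the word 'z'" search, here is a sketch using only Python's standard `html.parser`. The `HeadingFinder` class, the `headings_containing` helper, and the sample page are made up for this example; a real crawler would need to cope with far messier markup.

```python
from html.parser import HTMLParser


class HeadingFinder(HTMLParser):
    """Collect the text of h1-h6 headings from an HTML document."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._current = None  # text pieces of the heading being read, or None

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._current is not None:
            self.headings.append("".join(self._current).strip())
            self._current = None


def headings_containing(html, word):
    """Return headings whose text contains word, ignoring case."""
    finder = HeadingFinder()
    finder.feed(html)
    finder.close()
    return [h for h in finder.headings if word.lower() in h.lower()]


# Made-up page: only one heading mentions the search word.
page = """
<h1>Semantic markup</h1>
<p>The word semantic also appears in this paragraph.</p>
<h2>Tables for tabular data</h2>
<div class="bigredbold">Not a heading about semantic anything</div>
"""

result = headings_containing(page, "semantic")
print(result)
```

The paragraph and the styled DIV both mention the word but are ignored, which only works because the headings were marked up as headings in the first place.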
Then you can start working your way to the outer layers to the point where enough people can come to a common ground. Once you've got PROPERLY marked up content with real structural semantics you can use class and rel and other attributes to provide more meaningful semantics. For example, don't use class="bigredbold" in a span simply for the purpose of applying CSS styles. Instead, use class="criticalerror". This is not part of any standard and doesn't help the machine parse any better, but the output of that parsed information can then be interpreted by humans much more easily. There need not be rigid standards for this to be useful semantics.
THOSE are the layers where semantic markup can be a powerful tool. Simple semantic content can evolve into a general consensus and even a standard: Microformats are a prime example. Facebook doesn't apply rigid semantic web standards to its photo album for example, but being able to simply tag people, places and events is very useful.
The biggest limiting factors of the "semantic web" revolve around "leaky abstractions" and the simple fact that the more people involved in interpretation of content the less they can completely agree on its meaning. This will ALWAYS limit how close you can get to the "outer layers" of semantics in content on the internet. However, at the desktop (or small workgroup) level, there is but one person that imparts meaning on personal data, and a limited audience of consumers. If a solid structure is provided to allow the user to apply their own meaning then the user can have their own personal semantic standards at the outer layers.
I think that is where something like Nepomuk could s
no, i get that in both ff3 and 3.1 on a 2ghz machine
I've tried out Nepomuk and, while I have to say that it's promising, it's got miles to go before it's even near ready.
Unfortunately the same can also be said of KDE 4 in general...
"Nine times out of ten, starting a fire is not the best way to solve the problem." - my wife
I assumed it was KumOpen (come open) backwards. I think the real acronym is even stupider than that.
The official acronym is very contrived so I'm sure it is a "backronym". I also doubt a group of tall-foreheads would deliberately come up with a project name with a suggestive reference like that either.
Google and Wikipedia provide the most likely possibilities for the origin:
Nepomuk is a town in the Czech Republic, in the "kraj" (province or region) called "Pilsen". Given this fact, here are some possibilities to explain the name:
* Nepomuk is the birthplace of St. John of Nepomuk, who is considered "the protector from floods". Nepomuk (the project) is intended to aid users in dealing with "a flood" of information.
* Nepomuk was a Bohemian town before the establishment of the Czech Republic. Perhaps they named the project Nepomuk as an indirect reference to anti-establishment viewpoints and free exchange of information/property/etc. associated with Bohemian culture.
* Nepomuk is in a region bordering Germany, and this project is headed by a German group. Perhaps a German project lead was born or raised in nearby Nepomuk and named the project after his home town.
* Pilsen is where Pilsner-style beer was invented. Engineers like beer. Code-names are often named after objects of affection. Mmmm...beeer.
Salient phrase in article: "they hope"
By the way, while we are on this topic, I am STILL waiting for my pony for Christmas.
I am anarch of all I survey.
But I don't have any problems here, with IceWeasel 3.1 and a 800MHz machine.
Dilbert RSS feed
or it will never be used. When I download photos, I want my browser to tag where it came from (website) or perhaps which keywords I typed in to find it. I don't want to add this all manually.
The amount allowed per file needs to be limited (perhaps 100 keywords) and managed so the useless ones get weeded out. It will probably be an art to itself, but anything less than filesystem support just won't work.
And yes, in Vista you can tag things. But it's tedious and the OS level tools for users aren't there. Something as trivial as tagging an entire set of pictures isn't simply achieved with the default tools, as I take it.
WARNING! Your toddlers might violate a Patent!
Never fear - this patent has been badly drafted. You can easily circumvent it by not swinging from a tree, use a wooden frame. Or use ropes for your swing. Or use a different number of chains, one looped over the tree for example ...
Are these people just daft! I mean NEPOMUK as in "Networked Environment for Personalized, Ontology-based Management of Unified Knowledge"!
I mean WTF! Can the Linux community pull its head out of its ass long enough to see that names like this drive people AWAY!
Hey KID! Yeah you, get the fuck off my lawn!