Ask Slashdot: Tags and Tagging, What Is the Best Way Forward?
siliconbits writes "The debate about tagging has been going for nearly a decade. Slashdot has covered it a number of times.
But it seems that nobody has yet come up with a foolproof solution to tagging. Even luminaries like Engadget, The Verge, Gizmodo and Slashdot all have different tagging schemes. Commontag, a venture launched in 2009 to tackle tagging, has proved to be all but a failure despite the backing of heavyweights like Freebase, Yahoo and Zemanta. Even Google gave up and purchased Freebase in July 2010. Somehow I remain convinced that a unified, semantically-based solution, using a mix of folksonomy and taxonomy, is the Graal of tagging. I'd like to hear from fellow Slashdotters as to how they tackle the issue of creating and maintaining a tagging solution, regardless of the platform and the technologies being used in the backend." A good time to note: there may be no pretty way to get at them, but finding stories with a particular tag on Slashdot is simple, at least one at a time: Just fill in a tag you'd like to explore after "slashdot.org/tag/", as in "slashdot.org/tag/bizarro."
that is all
Tagging isn't anything. It's a construct within a semantic web design; a common-language-everywhere issue. Essentially, you want everyone to agree to a tagging vocabulary, or morph things into it using automation. Why not just ask everyone to speak Esperanto?
My questions for OP...
why use words of any language?
why doesn't everything online (including video, images, sound) simply act like a tag with "search the web with this input"?
isn't the best database of tags the web itself? in that case, isn't our best query a search engine?
* Put CCTV cameras up near common targets
* Restrict sales of spraypaint to adults
* Beat patrols
See? Tagging isn't so hard to solve.
There's no -1 for "I don't get it."
surely if "tagging" things on the internet was popular they would have figured out something...
wait...
Hyperlinks
Are we talking about labeling, tagging in the version control sense, egocentric graffiti? Can't figure it out from the summary.
My tag "firetheeditors", to catalogue the poor editing jobs and dupes of Slashdot, has yet to catch on...
The best thing about UDP jokes is I don't care if you get them or not
I do not think "luminaries" means what you think it means.
Also, WTF is Graal?
"I don't know, therefore Aliens" Wafflebox1
... or some other language where every word has one and only one meaning.
"Somehow I remain convinced that a unified, semantically-based solution, using a mix of folksonomy and taxonomy, is the Graal of tagging."
So basically you want everyone to agree on what to call everything. HA! Will never happen. Words mean different things in different contexts. A word that's overly-general in one context will be overly-specific in another. Also, fun fact: not everyone on the planet speaks the same language. Hell, even time changes words. 10 seconds ago, I learned that "Graal" was a word: "Holy Grail, or 'Graal' in older forms." If you want a good tagging solution, start by not trying to be so cute and showing off how smart you are and use words that are used today -- call it "the grail" like everyone else in this century. People like you are what breaks tagging systems. :-)
We'll probably solve the problem of how to identify people before we come up with a unified way to name things.
Dear Slashdot: next time you want to mess with the site, add a rich-text editor for comments.
Tags are random stuff about what people are thinking of at any given time.
So if I tag something as #anyhoo #whatever and #squork -- that's what I felt like tagging it as, and in the process I might want to make tags which aren't there or make up new ones.
If tags are meant to be a measure of the zeitgeist and what people are thinking, they're not going to do it according to some taxonomy.
Besides, some bastard will just want to come along and monetize tags and be the canonical source -- #screwem #taxonomyneednotapply
Having a "unified, semantically-based solution, using a mix of folksonomy and taxonomy" is someone trying to impose structure on something which is inherently not structured, and people will never conform to it.
I can see why in corporate contexts you'd want a taxonomy, but for the rest of the world this sounds like a solution in search of a problem. The world isn't something for librarians and archivists to tell us how we should categorize things.
Lost at C:>. Found at C.
Every article on slashdot gets the default tag "story".
Fucking useless.
(Full disclosure: I work for Primal) Have you considered a technology like what primal.com offers? We build taxonomies on the fly based on sparse inputs, and output JSON as a result, so it's easy to work with.
Could replace a tagging system by automatically "understanding" the posted content and using the terms that are synthesized from our process to act as the human-curated tags.
...you have too much time on your hands. Get a dog, a girlfriend, or anything else with demands on your attention and your worries about tagging will happily drift away.
Yeah, I agree with mugnyte: there is no problem here. Move along.
Can you (siliconbits, or anyone) define the problem space better? What's wrong with the way they work now? Twitter Hashtags annoy some but work great for twitter. Everyone you listed has a different solution in place for tagging so... what's the issue? Why does there have to be only one solution?
Do you want a common HTML/RSS/W3C/whatever standard to define tags? Do you want centralized curated lists of tags that people must choose from? Do you want to make it somehow easier (than just typing "#", or typing a word in a box) to tag?
If you really look at good semantic web implementations -- such as Semantic wiki, you'll see some good ideas around a more "complete" semantic mechanism than tagging, but the two are basically mutually exclusive. What basic tags allow that a full semantic implementation does not is hyper-fast user-entered semantic content. This is not a shortcoming of tags, but their primary feature. It's one of the things that makes twitter so valuable (although one could argue it would still work without tagging)... people actually create and use tags all over the place.
So yeah... what, exactly, is the problem again?
http://xkcd.com/927/
In what I like to call "the real world" (i.e., the place where no one has heard of commontag, Freebase, or Zemanta, and maybe not even gizmodo), the #tag is the closest you're ever going to get. People use it on twitter and instagram, and advertisers have embraced it. Do any of these giant companies want their users going to other sites? Hell no. Facebook brought back the walled garden, and open systems are going to suffer.
Now that we've realized it's unlikely to happen, would you even want it if it did? If you add an ubuntu link on pinboard, would you want to instantly see all the old ubuntu stories on slashdot? Tag a flickr picture with "hotdog" and see all the tweets about hot dogs? Or take a picture with some app that adds its own tag (#vsco or some such) and see all the other pictures taken with that app? Some of these things actually work, but why? I could see doing something like subscribing to only slashdot/bizarro or gizmodo/tv in your RSS reader, but take a look at the RSS market and no one really gives a shit about that either.
I think wide-area tagging is quasi-useless. Even in closed silos (twitter, instagram), it's a messy sea of miscategorization and gamification. If it helps out the site's search engine, great. If it helps your own organization in whatever tool, great. It may even be good in workgroups - I'm interested to see how it pans out in OS X Mavericks.
Simply put, tags are a way of reducing the complexity of a piece of information so that people of similar mindset can identify it as data supporting their personal opinions. Note that tagging is not exclusive, so multiple tags can be assigned to the same information, summarizing it in diametrically opposite ways.
One thing file system directory trees have shown me is that hierarchy is lousy for categorizing. Convenient for file systems, bad for people. The example I like to use is 2 applications organized into binary and data files. Should the files be put in these directories: /app1/bin, /app1/data, /app2/bin, /app2/data ? Or in these directories: /bin/app1, /bin/app2, /data/app1, /data/app2 ? Or should we use some kind of directory linking, so we can sort of have it both ways? This leads to a question about OOP. If hierarchical organizations are bad for files, maybe they're also bad for classes?
Whatever else tags do, they dispense with hierarchy. A file system that truly did away with the hierarchical directory structure and used tags would be interesting. The problem in the above example would vanish, with the files in question merely being tagged as app1 or app2, and as bin or data. Ask for a directory listing of all files tagged as bin, and get all the files tagged as app1 and bin, and app2 and bin. Strips the ordering out of the problem, leaving categorization, which is still a tough problem.
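To make the app1/app2, bin/data example concrete, here is a minimal sketch of a flat tag index in Python. The class name `TagIndex`, its methods, and the file names are made up for illustration; this is not how any real file system stores tags, just one way the "listing by tag" idea could look.

```python
from collections import defaultdict

class TagIndex:
    """Flat, hierarchy-free store: each tag maps to the set of files carrying it."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def tag(self, filename, *tags):
        """Attach one or more tags to a file; order of tags is irrelevant."""
        for t in tags:
            self._files_by_tag[t].add(filename)

    def list(self, *tags):
        """Return the files that carry every given tag, sorted for stable output."""
        if not tags:
            return []
        sets = [self._files_by_tag.get(t, set()) for t in tags]
        return sorted(set.intersection(*sets))

# Hypothetical files standing in for the two applications in the example.
index = TagIndex()
index.tag("app1_main", "app1", "bin")
index.tag("app1_settings", "app1", "data")
index.tag("app2_main", "app2", "bin")
index.tag("app2_settings", "app2", "data")

print(index.list("bin"))           # both binaries, whichever app they belong to
print(index.list("app1"))          # everything belonging to app1
print(index.list("app2", "data"))  # only app2's data file
```

Asking for ("app2", "data") or ("data", "app2") gives the same answer, which is the point: the ordering question from the directory version never comes up.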
I ran into this tagging problem when thinking about an app to sort images. The idea was to compare 2 images, and come up with a percentage value of how similar they were to each other, with 100% being identical, and 0% being totally different. But, on what criteria should images be compared? I saw that it was much too simplistic to boil down a comparison of such intricate data to just one number.
Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
Slashdot Items Tagged "futanari"
No objects tagged "futanari"
And we're all thankful for that.
I would say the biggest problem is when someone tagged claims that they were not actually tagged because the person who is 'it' "didn't get me". Although this can sometimes be an honest mistake, especially in cold climates where heavy clothing may prevent the tagged person from detecting the tag, more frequently it is just some asshole who doesn't want to admit they were tagged.
Commontag, a venture launched in 2009 to tackle tagging, has proved to be all but a failure ...
Apparently, your best bet is with this company.
If Pandora's box is destined to be opened, *I* want to be the one to open it.
http://slashdot.org/tag/gps
Now if only timothy would train the other monkeys.
There will be no tagging system that matters. After AI can determine meaning, you won't need a tagging system.
Please do not read this sig. Thank you.
You're assuming that each item only has one natural parent -- which may be true in most taxonomies, but more complex systems (thesauri*, ontologies) allow for more complex parent-type relationships.
What you're dealing with is even simpler -- facets. You have a bunch of items with two attributes (application, type of file), and each attribute has a limited set of mutually exclusive options. Some file systems can store extended attributes, but they're not always that efficient (as it's not something in high demand). BFS was the only file system that I know of that really pushed it as a main feature.
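For what it's worth, a rough Python sketch of the facet idea, as opposed to free-form tags: each item gets exactly one value per attribute, picked from a closed set. The names `App`, `Kind`, `FileEntry` and `group_by` are invented for this example and don't come from BFS or any other file system.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum

class App(Enum):
    """Facet 1: which application the file belongs to (mutually exclusive)."""
    APP1 = "app1"
    APP2 = "app2"

class Kind(Enum):
    """Facet 2: what type of file it is (mutually exclusive)."""
    BIN = "bin"
    DATA = "data"

@dataclass(frozen=True)
class FileEntry:
    name: str
    app: App    # exactly one value per facet, unlike free-form tags
    kind: Kind

def group_by(entries, facet):
    """Group file names by the value of one facet ('app' or 'kind')."""
    groups = defaultdict(list)
    for entry in entries:
        groups[getattr(entry, facet).value].append(entry.name)
    return dict(groups)

# Hypothetical entries mirroring the two-application example.
entries = [
    FileEntry("app1_main", App.APP1, Kind.BIN),
    FileEntry("app1_settings", App.APP1, Kind.DATA),
    FileEntry("app2_main", App.APP2, Kind.BIN),
    FileEntry("app2_settings", App.APP2, Kind.DATA),
]

print(group_by(entries, "kind"))  # the /bin, /data view
print(group_by(entries, "app"))   # the /app1, /app2 view
```

Both directory layouts from the parent post fall out as two views of the same records, and a typo like Kind("bni") raises an error instead of quietly creating a new category.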
* Roget's Thesaurus is a synonym ring, not a thesaurus.
Build it, and they will come^Hplain.
The "problem" is that someone is likely needing help to hype some useless new tagging system so they can be bought out by Google.
xanadu.com
Forget tagging, relations is where it's at
#tagging No really, people will self organize on tags all on their own. The simplest and best way to "tag" the internet is to agree on a standard format a la twitter ("the #") and just let it run from there. Parse out the results.
A classic book on the ontology of categories by George Lakoff. The tagging problem, in a nutshell, is that different cultures (and different individuals) create different category systems. The Tower of Babel on the semantic level.
One interesting cross-domain tagging system, which I use extensively, is Fluidinfo. It allows users to attach tags, which can have typed values, to arbitrary objects identified by any unicode string (or by a UUID). There's a query language that lets you find things based on your own tags and, subject to permissions, other people's tags. It was discussed previously on /., but now has more interesting public data in it, such as most of the books from the British Library's catalogue, e.g. Animal Farm and that old /. favourite Pride & Prejudice.
Another recent development that could be significant for tagging is the announcement by Apple that OS X Mavericks will have more extensive support for tags on files both in the OS and in iCloud. Since tags look like being the only way Apple will offer to organize files in iCloud, it is possible these will catch on in a big way, and this could lead to a broader interest in tagging as a general alternative/addition to hierarchical organization.
Who cares about tagging? I don't care about it, and that should be enough for you too.
Can someone please explain...
Ees already got wun you see?
When I was using Flickr, I had one killer app for tags and now I don't use Flickr so I don't use tags. Tags on Flickr were a nice lazy way to organize photos and show people "all the pictures related to #blah" without going through the hassle of creating a set.
I see the tags on Slashdot articles and I'm like... "that's nice"; but I don't use them for anything. If they're useful to you for some reason, fantastic. Come up with your own taxonomy and have a ball. Quit trying to come up with the Ultimate Living Room Organization Scheme (TM), because it's not gonna happen. We all want to put the TV someplace different. Deal with it.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
The real problem of tags is that there's usually fuck all useful semantics associated with them. There's only a benefit to using tags in the first place if many people use the same tagging system and consistently assign the same meaning to the tag as each other. Having just a tag is a bit like just having a scent marker on the information: not much use for saying more than "big primate was here, urinating on this data". There have been clear phases when slashdot tags were exactly on this level. (Does anyone remember when every last post was being tagged with "itsatrap"? It amused me to watch it unfurl, but it was less use than a chocolate bath plug.)
But where there's something more than that, a way to get and debate the shared definition of the tag, to see what's been tagged, to be notified when something new receives the tag... that's when the tag acquires real value. There's an advantage to the tagger in using the tag "correctly" and so a fair chance that they will do that. The various stackexchange sites do quite a good job here.
Of course, there's a whole level of tagging above and beyond, with formal semantic tagging via RDF to build a Semantic Web. It would be ever so powerful, except it's really a PITA to work with and needs far more curation to be really useful than web content actually normally has. The very richness enabled by the advanced model they have with formal descriptions of the tags and so on renders it all far less useful precisely because it is so much less commonly used; I suspect a less formal system that has lots of actual data wins out as the semantics are more readily derived from network analysis rather than direct declaration. (I suspect not all my colleagues would agree...)
"Little does he know, but there is no 'I' in 'Idiot'!"
Make slashdot.org/tag the index page for the list of tags. http://slashdot.org/tag/$tag isn't cutting it. Put more than five seconds of effort into its format. Put a link to it in the left column menu, or next to the toe tag icon. Sorted. Optionals: On the tag search page put a top 10 list of "related" tags - tags which most commonly occur in conjunction with this tag in a story. This provides a "conceptual web of themes" or meme map. Allow searching for tag1+tag2-tag3... and so on. Normalize the tag database: in the index list of tags will be some misspellings, synonyms and such - hunt those down with search and replace to get rid of redundant and obvious error tags to get the length of the tag list down to something comprehensible. I would suggest some more, but that's a lot of work already.
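Two of those optionals are cheap enough to sketch in Python: the "related tags" top 10 and the tag1+tag2-tag3 search. Everything here is an assumption for illustration: the `STORIES` sample data, the function names, and the rule that tag names never contain "+" or "-". It says nothing about how Slashdot actually stores tags.

```python
from collections import Counter

# Hypothetical sample data: story id -> set of tags on that story.
STORIES = {
    1: {"gps", "privacy", "android"},
    2: {"gps", "privacy"},
    3: {"gps", "maps"},
    4: {"privacy", "nsa"},
}

def related_tags(stories, tag, limit=10):
    """Tags that most often appear on the same story as `tag`."""
    counts = Counter()
    for tags in stories.values():
        if tag in tags:
            counts.update(tags - {tag})
    # Highest count first; ties broken alphabetically so output is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]

def parse_query(query):
    """Split 'tag1+tag2-tag3' into (required, excluded) sets of tags."""
    required, excluded = set(), set()
    sign, current = "+", ""
    for ch in query + "+":  # trailing sentinel flushes the last tag
        if ch in "+-":
            if current:
                (required if sign == "+" else excluded).add(current)
            sign, current = ch, ""
        else:
            current += ch
    return required, excluded

def search(stories, query):
    """Ids of stories that have every required tag and no excluded tag."""
    required, excluded = parse_query(query)
    return sorted(
        story_id
        for story_id, tags in stories.items()
        if required <= tags and not (excluded & tags)
    )

print(related_tags(STORIES, "gps"))
print(search(STORIES, "gps+privacy-android"))
```

The normalization step (misspellings, synonyms) is the part that needs a human, so it is left out here.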
Help stamp out iliturcy.
I am not a computational linguist, but I do think one could help.
Here is a solution: DBpedia - A community effort to extract structured information from Wikipedia resulting in a semantically-based solution (ontology) http://dbpedia.org/About http://en.wikipedia.org/wiki/DBpedia
#REDUNDANT
This concerns more file "tagging", but a while ago I grew frustrated with the lack of real solutions for file organization (the oft-discussed but surprisingly absent-in-implementation semantic file system), so I decided to start writing my own. It can best be described as a multidimensional hierarchical abstract file system that is implemented on top of regular POSIX file systems using hard links and a handful of scripts and FUSE. It's still not feature-complete as I want it, but the basic tagging framework is done. Here's the repository for anyone interested: https://github.com/darkfeline/dantalian
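For anyone wondering what "tagging with hard links" can look like in general, here is a minimal Python sketch. To be clear, this is not dantalian's code or its on-disk layout; the one-directory-per-tag scheme and the names `tag_file` and `files_with_tag` are assumptions made up for the example. It needs a file system that supports hard links.

```python
import os
import tempfile

def tag_file(root, path, tag):
    """Hard-link `path` into the directory root/<tag> and return the link path."""
    tag_dir = os.path.join(root, tag)
    os.makedirs(tag_dir, exist_ok=True)
    link = os.path.join(tag_dir, os.path.basename(path))
    if not os.path.exists(link):
        os.link(path, link)  # a second name for the same file, not a copy
    return link

def files_with_tag(root, tag):
    """List the names of files linked under root/<tag>."""
    tag_dir = os.path.join(root, tag)
    return sorted(os.listdir(tag_dir)) if os.path.isdir(tag_dir) else []

with tempfile.TemporaryDirectory() as root:
    original = os.path.join(root, "photo.jpg")
    with open(original, "w") as f:
        f.write("not really a photo")

    tag_file(root, original, "holiday")
    tag_file(root, original, "family")

    holiday_listing = files_with_tag(root, "holiday")
    # The original name plus one link per tag all point at the same inode.
    link_count = os.stat(original).st_nlink
    same_file = os.path.samefile(
        original, os.path.join(root, "family", "photo.jpg")
    )
    print(holiday_listing, link_count, same_file)
```

Since every tag directory holds a real name for the file, ordinary tools (ls, cp, any file manager) work on tagged files without knowing tags exist.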
Google seems to do pretty well at locating pages, despite many fine pages lacking meta tags (and despite many poor spam articles trying to abuse meta tags.)
I believe the SEO types have been saying for years that meta keyword tags are useless because they were too easy to game, so Google and other search engines basically ignore them now.
Q: What does the "B." in Benoit B. Mandelbrot stand for? A: Benoit B. Mandelbrot
I'm hoping for more tags, so that I don't have to read TFA or TFS. I'll just look at the tags and comments and be done.
The G
If you are: A. In a technical field B. At all competent at your job Understanding basic kinds of metadata like tags, links, and keys is an incredibly basic part of your job.
Sure, and if you have such an understanding—and any real-world experience at all—you will also comprehend that the chances of a useful result being achieved by random people "tagging" an unspecified universe of data objects with a nonstandard meta-data vocabulary are nil.
A tremendous amount of organized effort has been put into creating meta-data structures that can be used to make documents more useful (in one sense or another) over the last 60 years or so. These efforts have certainly not all resulted in failure, but insofar as they have been useful, they have also caused a great amount of pain. I can't prove it any more than I can prove the sun will rise tomorrow, but I'm certain that metadata will always be hard, and that it will involve pain on the part of its creators and users. Based on my knowledge and experience, I say that it's not bloody likely that somebody is going to invent a metadata scheme that is as easy as "tagging" and is also more than marginally useful. I don't say "impossible". I never say that word. But "not bloody likely" is pretty damn close.
Not technical enough for you? Not feeling the pain yet? Let us depart for a short historic stroll down meta-data lane; if you are so inclined, you may follow along, dear reader.
In the days of yore just after the invention of fire, someone came up with the idea that we could make documents more useful by marking them up with standardized generic tags that would help authors structure what they write, and help readers search documents and maybe even do some automated processing. (This involved rapidly riffling stacks of cards with holes punched in them until you could see moving images.) And behold! SGML actually worked fine within certain niches in highly structured environments (read IBM and The Government). If you were ever so unfortunate as to have to work with SGML, you know that these benefits were purchased by the infliction of acute pain—but these organizations have a very high agonic tolerance, so long as the pain is inflicted only on those who do the actual work. But SGML was a standard generalized markup language sort of like the way the Holy Roman Empire was holy, Roman, and an empire. It wasn't just SGML's extreme lawyerly complexity that prevented a rush to public adoption, but that the vocabularies used by it (as specified in the various Document Type Definitions) were arcane, narrow, and specific to a given set of documents, purposes, and institutions. In short, making it useful involved too much bloody pain to tolerate unless you were someone like Caligula, possessing unlimited freedom to impose suffering on others.
Thus matters stood until something came along that intruded into the cozy conferences of the Document Standards Community and caused its members to feel a cold wind up their shorts: HTML. Hypertext Markup Language was simple, and everyone was starting to use it as the new thing called the World Wide Web came out of nowhere and took over. Maybe I'm making this up, but I saw a genuine fear on the part of the advocates of document metadata standards that everyone would just settle on HTML as the universal markup language. That would, of course, have been an Abomination and a Dreadful Mistake, for HTML wasn't invented by a committee. The galloping of the Four Horsemen could be heard in the distance.
To be fair, there were better reasons for alarm. While HTML sort of looked like a document markup language, it wasn't terribly meaningful as far as document metadata goes. (For some reason, I feel a shudder of revulsion whenever I am tempted to use the word "semantic", so I don't, but it's probably not out of place around about here.) HTML was meta-data on whi
Great men are almost always bad men--Lord Acton's Corollary
'Nuff said.
Please tell me. To eliminate diversity of thought? To make it easier for advertisers and others to colonize our lives? What's the GOAL here?