Domain: uni-weimar.de
Stories and comments across the archive that link to uni-weimar.de.
Comments · 17
-
Re:Rules can only get so much
It looks like the winning entry uses all of those attributes plus a bunch more. From pages 3-4 of the paper.
- Anonymous -- Whether the editor is anonymous or not.
Vandals are likely to be anonymous. This feature is used in one way or another in most antivandalism working bots such as ClueBot and AVBOT. In the PAN-WVC-10 training set (Potthast, 2010) anonymous edits represent 29% of the regular edits and 87% of vandalism edits.
- Comment length -- Length in characters of the edit summary.
Long comments might indicate regular editing and short or blank ones might suggest vandalism; however, this feature is quite weak, since leaving an empty comment in regular editing is a common practice.
- Upper to lower ratio -- Uppercase to lowercase letters ratio.
Vandals often do not follow capitalization rules, writing everything in lowercase or in uppercase.
- Upper to all ratio -- Uppercase letters to all letters ratio.
- Digit ratio -- Digit to all characters ratio.
This feature helps to spot minor edits that only change numbers, which might help to find some cases of subtle vandalism where the vandal arbitrarily changes a date or a number to introduce misinformation.
- Non-alphanumeric ratio -- Non-alphanumeric to all characters ratio.
An excess of non-alphanumeric characters in short texts might indicate excessive use of exclamation marks or emoticons.
- Character diversity -- Measure of different characters compared to the length of inserted text.
This feature helps to spot random keyboard hits and other nonsense. It should take into account QWERTY keyboard layout in the future.
- Character distribution -- Kullback-Leibler divergence of the character distribution of the inserted text with respect to the expectation.
Useful to detect nonsense.
- Compressibility -- Compression rate of inserted text using the LZW algorithm.
Useful to detect nonsense, repetitions of the same character or words, etc.
- Size increment -- Absolute increment of size, i.e., |new| - |old|.
The value of this feature is already well-established. ClueBot uses various thresholds of size increment for its heuristics, e.g., a big size decrement is considered an indicator of blanking.
- Size ratio -- Size of the new revision relative to the old revision.
Complements size increment.
- Average term frequency -- Average relative frequency of inserted words in the new revision.
In long and well-established articles, too many words that do not appear in the rest of the article indicate that the edit might be including nonsense or non-related content.
- Longest word -- Length of the longest word in inserted text.
Useful to detect nonsense.
- Longest character sequence -- Longest consecutive sequence of the same character in the inserted text.
Long sequences of the same character are frequent in vandalism (e.g. aaggggghhhhhhh!!!!!, soooooo huge).
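For anyone who wants to play with these, here is a rough Python sketch of a few of the character-level features listed above. It is not the paper's code. The function names, the add-one smoothing in the ratios, and the smoothed reference distribution used for the Kullback-Leibler feature are all assumptions of mine, and the LZW rate simply counts emitted codes per input byte rather than measuring real compressed size.

```python
import math
from collections import Counter
from itertools import groupby


def upper_to_lower_ratio(text: str) -> float:
    """Uppercase-to-lowercase ratio; the +1 terms are an assumed smoothing."""
    upper = sum(ch.isupper() for ch in text)
    lower = sum(ch.islower() for ch in text)
    return (1 + upper) / (1 + lower)


def digit_ratio(text: str) -> float:
    """Digits relative to all characters, with the same assumed smoothing."""
    digits = sum(ch.isdigit() for ch in text)
    return (1 + digits) / (1 + len(text))


def longest_char_run(text: str) -> int:
    """Length of the longest run of one repeated character."""
    return max((len(list(group)) for _, group in groupby(text)), default=0)


def lzw_compression_rate(text: str) -> float:
    """Emitted LZW codes per input byte (lower means more repetitive)."""
    data = text.encode("utf-8")
    if not data:
        return 1.0
    phrases = {bytes([i]) for i in range(256)}  # initial single-byte phrases
    current = b""
    codes = 0
    for value in data:
        candidate = current + bytes([value])
        if candidate in phrases:
            current = candidate
        else:
            codes += 1  # emit the code for `current`
            phrases.add(candidate)
            current = bytes([value])
    codes += 1  # flush the final phrase
    return codes / len(data)


def char_kl_divergence(inserted: str, reference: str) -> float:
    """KL divergence of inserted-text characters against a reference text.

    Both distributions use add-one smoothing over the shared alphabet
    (an assumption, so that no probability is zero).
    """
    p_counts, q_counts = Counter(inserted), Counter(reference)
    alphabet = set(p_counts) | set(q_counts)
    p_total = len(inserted) + len(alphabet)
    q_total = len(reference) + len(alphabet)
    divergence = 0.0
    for ch in alphabet:
        p = (p_counts[ch] + 1) / p_total
        q = (q_counts[ch] + 1) / q_total
        divergence += p * math.log(p / q)
    return divergence
```

A string of eight identical letters needs only four LZW codes here, while eight distinct letters need eight, which is the kind of gap the compressibility feature is meant to pick up.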
Along with analyzing those basic stats, the winning entry also examines categories of words.
- Vulgarisms -- Vulgar and offensive words, e.g., fuck, suck, stupid.
- Pronouns -- First and second person pronouns, including slang spellings, e.g., I, you, ya.
- Biased -- Colloquial words with high bias, e.g., coolest, huge.
- Sex -- Non-vulgar sex-related words, e.g., sex, penis, nipple.
- Bad -- Hodgepodge category for colloquial contractions (e.g. wanna, gotcha), typos (e.g. dosent), etc.
- All -- A meta-category, containing vulgarisms, pronouns, biased, sex-related and bad words.
- Good -- Words rarely used by vandals, mainly wiki-syntax elements (e.g. __TOC_
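The word-category idea can be sketched in a few lines of Python. The tiny word lists below only reuse some of the examples quoted above; the real lists are surely much longer, and the tokenizer, the `WORD_LISTS` name, and the decision to report raw counts are my own assumptions.

```python
import re

# Toy word lists built only from examples quoted above (assumed, not complete).
WORD_LISTS = {
    "pronouns": {"i", "you", "ya"},
    "biased": {"coolest", "huge"},
    "bad": {"wanna", "gotcha", "dosent"},
}


def category_counts(inserted_text: str) -> dict:
    """Count how many inserted words fall into each category."""
    words = re.findall(r"[a-z']+", inserted_text.lower())
    counts = {
        name: sum(word in vocabulary for word in words)
        for name, vocabulary in WORD_LISTS.items()
    }
    # "all" is the meta-category: a word counts once if it is in any list.
    every_word = set().union(*WORD_LISTS.values())
    counts["all"] = sum(word in every_word for word in words)
    return counts
```

For the input "You wanna see the coolest huge thing" this gives one pronoun, two biased words, one bad word, and four hits overall.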
-
Re:Existing
We have studied the accuracy of ClueBot, and found that (on a small corpus) it has very good precision (low false positive rate), but a very low recall (low true positive rate). (see: http://www.uni-weimar.de/medien/webis/publications/downloads/papers/stein_2008c.pdf) But the picture might look quite different on a large scale.
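To make the precision/recall distinction concrete, here is a minimal Python sketch. The function name and the made-up labels in the usage note are illustrative assumptions, not data from the study.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two equal-length lists of booleans.

    predicted[i] is True when the bot flags edit i as vandalism;
    actual[i] is True when edit i really is vandalism.
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall
```

A bot that flags only one of four vandalism edits, and never flags a regular edit, scores a precision of 1.0 and a recall of 0.25: everything it reverts deserves it, but most vandalism slips through.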
-
Does anyone refer to previous articles....
Posted on Tuesday was a perfect complement to this story: Stereoscopic Viewing. THAT article talked about independent stereoscopic viewing without the use of glasses. Further, if you read the article and then view the albeit large, but nice, 40 MB clip from it, you would see that at one point they are projecting an image onto a wall (or any surface would work, I would imagine) and the image was completely 3D and adjusted to the viewing angle of the person watching. Now, I couldn't explain exactly how they are doing this, but the demo shows a couple of people who look like they are in a basement using what look like standard LCD projectors (I could be wrong). Bottom line, all the gripes about 3D everyone is talking about are being addressed, and with some progress to show for it. Taking into account the previous article and the industry's new attitude towards 3D, we might actually have something here!
-
I see they are getting ready for the porn
Already are making prototype naked hologram chicks
http://www.uni-weimar.de/~bimber/gfx/research15_.jpg
-
Camera
Add a camera to it and I'm sold. Really, a photo browser without a camera is kind of useless. A camera would make it a perfect mobile augmented reality platform.
-
How to do it at home
Well, you can implement a cheaper and less robust hand-tracking camera system with little coding using the open-source augmented reality system ARToolkit. Put small ARToolkit markers on the gloves as described in this article (photo), and implement some gesture recognition (for example, that one).
-
Augmented reality
If you are wondering how the finger positions are tracked by the camera, pay attention to the small black and white squares on the ends of the fingers. Those are square-shaped markers used by ARToolkit, an open-source, multiplatform augmented reality library. ARToolkit is easy to use, and with a camera connected to a PC and a camera SDK you can easily write your own augmented reality application. There are augmented reality libraries for cellular phones and Pocket PC in development.
-
Re:Second laser
You're absolutely right about the second laser.
Projecting on non-flat and not uniformly bright surfaces is possible with 'smart projectors', which use camera feedback and pixel shaders to adjust the projected image.
-
Scroll down pages for other weird research...
"The Extended Virtual Table"
Yeah right...
-
Virtual Table
You guys get a good look at "The Extended Virtual Table" on that page? Pretty Interesting.
-
Re:Yes, Finally!
A lot of people say he built the first computer. He did accomplish quite a bit, considering he did it on his own (except for the help from some friends who had to cut metal when they came to visit).
But you can't give him a trophy; he died a couple of years back. My uni made him an honorary professor while he was still alive, though, and a building is named after him.
-
Let's go explore the universe
This is what I waited for!
Ok, now that these machines can really reassemble themselves, let's give 'em the possibility to collect and produce their own resources. Construct an initial seed of nanobots, put them into a small rocket, and send them to any planet that seems to be inhabitable for humans. There the bots would reproduce themselves with the materials they find on that particular planet. Based on their initial "gene-code" they would be programmed to make architectural facilities for humans. Due to their evolutionary design they could adapt to regional specialities (such as gravitational and climatic issues), e.g. make very thick walls where radiation is high. Just like techno-termites using their own body as building material for the anthill. You could fire some thousand seeds into space now, wait a few hundred years until the technoparasites have made up a small colony for you somewhere, and all you need to do is move in. (Do not forget to bring your coffee mug, linuxbox, plant etc.)
A very humble simulation of this can be seen here (German)
Translation by Google here
Conceptual Flash-movie here (click first link)
-
Re:The Achilles' Heal of OSS
Funny enough, I know of a project that might revolutionize code development, and it is being developed at everyone's favorite software company, with research help from the Oxford University programming tools group. Yep, you guessed it - Microsoft!
Intentional Programming: the official M$ research site has severely limited info on the project, some research papers, etc., but does not present the true picture of the potential. Another good site has some more info. The project recently moved to the development stage, and the transfer order sent to every VP has a very good description of some of the features. I had a chance to look at the beta version, but the NDA doesn't permit me to talk much about it... My guess is, M$ will try to develop and use this thing for their internal development to gain the edge on the rest of the dev community. There is really no other reason they should hide all the demos and descriptions of it.
The brain behind the project is Charles Simonyi, M$ chief technology guy, who was one of the researchers working on WYSIWYG at Xerox, and formalized Hungarian Notation. He has written some papers on the topic of Intentional Programming.
This thing is by far superior to code parsers and other on-the-fly code generators. A very good presentation is here - you will need a free PowerPoint viewer. Code can be written in any language, with any syntax. It can assist with every structured text, even HTML and XML. Internally everything is stored in a form similar to a compiler-generated parse tree. Every item is a reference. Identifiers can be renamed at any moment, without complicated Search and Replace. Code can easily be moved between projects because it simply re-computes the references. Every chunk of code can be made into a function with a stroke of a key. At a higher level, methods for performing tasks can be formalized in such a way as to have no concrete data types, only abstractions (somewhat akin to STL, but taken to the language level and extended). I do not like some of the things M$ does, but cannot ignore a good idea when I see one.
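To illustrate why "every item is a reference" makes renaming trivial, here is a toy Python sketch. It is only my guess at the general idea and has nothing to do with the actual Microsoft implementation; the `Declaration`, `Ref`, `Call`, and `render` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Declaration:
    """The single place where an identifier's name is stored."""
    name: str


@dataclass
class Ref:
    """A use of an identifier: points at the declaration, holds no name."""
    target: Declaration


@dataclass
class Call:
    """A function call node in the stored tree."""
    func: Ref
    args: list


def render(node) -> str:
    """Turn the stored tree back into source text."""
    if isinstance(node, Ref):
        return node.target.name
    if isinstance(node, Call):
        rendered_args = ", ".join(render(arg) for arg in node.args)
        return f"{render(node.func)}({rendered_args})"
    raise TypeError(f"unknown node type: {type(node).__name__}")
```

Build `Call(Ref(show), [Ref(total), Ref(total)])` from `show = Declaration("show")` and `total = Declaration("total")`, and it renders as `show(total, total)`. Set `total.name = "running_sum"` and the same tree renders as `show(running_sum, running_sum)`, with no search and replace involved.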
-
/. effect should fight for Truth!
I was reading the mass-storage Ask Slashdot thread this morning; someone posted this link there. Apparently, someone submitted the link back in as a news article. Here's a link with good debunking material from that previous thread. So is that all well and good? We know now we won't be buying into their 90GiB chip, much less a new video card from these clowns, right?
But! I spent some time reading their "forum" section. This is a truly frightening place; there seem to be three or four posts daily asking for corroborative links, which are responded to by "avatars" flaming the bejeezus out of the querent. I'm bothered by this; I'm so used to the freewheeling, the-ones-that-know-tell-everyone-else-what-the-real-deal-is nature of Slashdot forums. The consensus of this /. forum is to dismiss it; this is a joke or publicity stunt. In fact it isn't. These guys take themselves very seriously, and are openly hostile to any and all references to actual (peer-reviewed) research.
Ask Ed Gehrman what he's experienced with this site. He's posted several comments on their site, but then gets childishly (and publicly) ridiculed by the maintainers of the forum, not on the merit of his posts, but the size of his genitalia, literacy, family, etc. This from the supposed CS/EE's, makers of Tommorow's Tommorrow's Technology who can't even spell "teraherz" or "dialectrics" (sic).
I sent Ed a link to Third Voice, and did a touch of debunking myself. If we all went to the site & tore apart their claims, perhaps we could rescue the idiots who're listening to their claims (and sending $$ and equipment to further research, believe it or not. I saw the posts on the forum today).
Anyway, that's my perfect scenario, now that this snake oil operation has once again resurfaced on /.: what if 1000's of /.ers descend on their little party armed with facts and reason... "what a wonderful world it would be..."
So go forth, my fellow Knights of Reason and Heroines of Truth (or vice/versa :) ). Take up your expertise, your passion, your wit, and take these goons to task! Yield no quarter, take no prisoners, kick ass, forget names, and have fun with it!
jaz 'guevera'