Unicode 6.1 Released
An anonymous reader writes "The latest version of the Unicode standard (v. 6.1.0) was officially released January 31. The latest version includes 732 new characters, including seven brand new scripts. It also adds support for distinguishing emoji-style and text-style symbols and emoticons with variation selectors, updates to the line-breaking algorithm to more accurately reflect Japanese and Hebrew texts, and updates other algorithms and technical notes to reflect new characters and newly documented text behaviors."
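For anyone wondering what the variation-selector part looks like in practice, here is a minimal sketch. It assumes the two selectors involved are U+FE0E (text style) and U+FE0F (emoji style). The helper name `with_style` and the choice of U+2614 as the base character are only for illustration, and whether a renderer honors the request depends on the font and platform.

```python
import unicodedata

TEXT_STYLE = "\uFE0E"   # VARIATION SELECTOR-15: request text presentation
EMOJI_STYLE = "\uFE0F"  # VARIATION SELECTOR-16: request emoji presentation


def with_style(base: str, emoji: bool) -> str:
    """Append a variation selector to a single base character (hypothetical helper)."""
    return base + (EMOJI_STYLE if emoji else TEXT_STYLE)


umbrella = "\u2614"  # UMBRELLA WITH RAIN DROPS, used here only as an example base
text_form = with_style(umbrella, emoji=False)
emoji_form = with_style(umbrella, emoji=True)

# Each form is two code points: the base plus an invisible selector.
for form in (text_form, emoji_form):
    print([f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in form])
```

The base code point is the same in both strings; only the trailing selector differs, which is why the distinction survives copy and paste.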
Take a good look at U+27CB, aka \diagup, part of Miscellaneous Mathematical Symbols-A. People are gonna try embedding that in HTML now. Can't wait.
"Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
has got to be the Love Hotel.
Does anyone know why this is even there?
Before anyone chimes in complaining that Slashdot doesn't even support an old version of Unicode, this is for several reasons. For one thing, there was once a fad of posting pornographic ASCII art on Slashdot, so it appears Slashdot disallows any character that would be more useful for glyph art than for English text. For another, there was once a fad of using bidirectionality override control characters for turning text backwards, which would break the layout and allow spoofing a comment's moderation score.
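To make the bidi trick concrete, here is a rough sketch of the kind of filter a comment system might apply. This is not Slashdot's actual code: the function name `strip_bidi_controls` is made up, and the set below covers only the embedding and override controls U+202A through U+202E.

```python
# Bidirectional embedding/override controls (an assumed, minimal blocklist).
BIDI_CONTROLS = {
    "\u202A",  # LEFT-TO-RIGHT EMBEDDING
    "\u202B",  # RIGHT-TO-LEFT EMBEDDING
    "\u202C",  # POP DIRECTIONAL FORMATTING
    "\u202D",  # LEFT-TO-RIGHT OVERRIDE
    "\u202E",  # RIGHT-TO-LEFT OVERRIDE
}


def strip_bidi_controls(text: str) -> str:
    """Return text with the directional controls above removed (hypothetical helper)."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


# U+202E makes everything after it display right to left.
comment = "Nice post \u202E)lufthgisnI ,5:erocS("
cleaned = strip_bidi_controls(comment)
print(repr(cleaned))
```

With the override removed, the trailing text displays in its stored order and no longer reads as a fake score.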
Yeah but can you write a pile of poo in ASCII?
http://www.fileformat.info/info/unicode/char/1f4a9/index.htm
Yes, lolcats are a standard now.
Seriously, emoticons? Who ever thought it a good idea to include those in a standard? Should we have an encoding for hearts as dots over lower case i as well? And little horseys, too? And y with a big tail that wraps around to the front of the word?
Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
I believe you mean to say that lolcats are in ur standardz, occupyin ur code-points; but not necessarily prescribing ur particular choice of glyph...
£ is Shift+3, what are you on about?
It pays to be obvious, especially if you have a reputation for being subtle.
...filling pages with sexually explicit ASCII art, such as Goatse, male masturbation, and birds perched on a penis...
Yeah, the way they are going they might actually *have* these characters in the set now...
all the Tetris pieces
The polyominoes up to five squares can be composed from U+2580 (upper half block), U+2584 (lower half block), and U+2588 (full block) characters. Unicode tends not to introduce precomposed ligatures except when needed for round-tripping with pre-Unicode encodings.
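A quick sketch of that composition, assuming each text line packs two rows of squares so the half blocks do the work. The `render` function and the `#`/`.` grid notation are invented for this example.

```python
UPPER, LOWER, FULL = "\u2580", "\u2584", "\u2588"

# Maps (top cell filled, bottom cell filled) to the block character to draw.
GLYPH = {
    (True, True): FULL,
    (True, False): UPPER,
    (False, True): LOWER,
    (False, False): " ",
}


def render(grid):
    """Render equal-length rows of '#' (filled) and '.' (empty), two grid rows per text line."""
    rows = list(grid)
    if len(rows) % 2:
        rows.append("." * len(rows[0]))  # pad to an even number of rows
    lines = []
    for top, bottom in zip(rows[0::2], rows[1::2]):
        lines.append("".join(GLYPH[(t == "#", b == "#")] for t, b in zip(top, bottom)))
    return "\n".join(lines)


T_TETROMINO = ["###", ".#."]
L_TROMINO = ["#.", "##"]
print(render(T_TETROMINO))
print(render(L_TROMINO))
```

Every piece comes out of the same three characters plus a space, so no per-piece code points are needed.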
glyphs of game pieces of all well known games
A lot of well-known pre-1923 tabletop games' game pieces already exist in Unicode. Chess is U+2654 through U+265F, and Checkers is U+26C0 through U+26C3. A lot of game pieces are simple enough in form that the Geometric Shapes (U+25A0 through U+25FF) represent them just fine. For example, Othello is U+25CB and U+25CF, as is Connect Four. Even the enemy in Fast Eddie for Atari 2600 is in Miscellaneous Technical (U+237E) as is home plate in Baseball (U+2302).
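If you want to check those code points yourself, Python's `unicodedata` module will report the official character names. A small sketch follows; the dictionary keys are just my own labels for the examples above.

```python
import unicodedata

# Labels are informal; the code points are the ones cited in the comment above.
GAME_PIECES = {
    "chess, first": "\u2654",
    "chess, last": "\u265F",
    "checkers, first": "\u26C0",
    "checkers, last": "\u26C3",
    "Othello light disc": "\u25CB",
    "Othello dark disc": "\u25CF",
    "Fast Eddie enemy": "\u237E",
    "home plate": "\u2302",
}

names = {label: unicodedata.name(ch) for label, ch in GAME_PIECES.items()}
for label, ch in GAME_PIECES.items():
    print(f"U+{ord(ch):04X} {ch} {names[label]} ({label})")
```

Note that the official names for the last two have nothing to do with games, which is rather the point: the shapes are reused, not purpose-built.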
heck, instead of just the suit symbols why not 52 glyphs for a standard deck of cards
Those can already be composed from a Basic Latin letter or number and a suit symbol. Unicode tends not to introduce precomposed ligatures except when needed for round-tripping with pre-Unicode encodings.
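As a sketch of that composition, the snippet below builds a 52-card deck from plain rank strings and the four black suit symbols (U+2660, U+2665, U+2666, U+2663). The ordering and the use of "10" rather than "T" are arbitrary choices made for the example.

```python
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["\u2660", "\u2665", "\u2666", "\u2663"]  # spade, heart, diamond, club

# Each card is a rank written in Basic Latin followed by a suit symbol.
deck = [rank + suit for suit in SUITS for rank in RANKS]

print(len(deck))
print(" ".join(deck[:13]))
```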
throw the Major Arcana tarot cards in there too
I don't know about Tarot, but all twelve signs of the zodiac are in Miscellaneous Symbols, even the "69" looking sign of Cancer (U+264B).
gang symbols
The symbol of "Folk Nation" gangs is similar to that of Judaism: a Star of David (U+2721). The symbol of "People Nation" gangs is similar to that of Islam: a 5-point star and crescent (U+262A).
If they wrote a brilliant paragraph a day ago, then deleted it in the morning, they can view the document as it existed yesterday, copy the paragraph back out, and be done with it.
For one thing, an application that saves (and sends) a document's undo history along with the document can disclose things that the document's author did not want to disclose. I seem to vaguely remember scandals with Word's AutoRecover being used to recover redacted parts of a document. For another, how much of the limited space on the drive should be dedicated to saving a document's undo history since creation, especially when the document is a large layered picture or multitrack audio project?
And that's because people forget to save - why not have the OS do it for them?
I agree, but how often should the OS spin up the hard drive to do so?
ASCII leaves off a lot of English punctuation, and accents that are, in fact, used in English (sure, in words of foreign origin, but they are still used.)
English also has the second-worst spelling system on the planet (only outdone by Japanese).
??? WTF are _YOU_ on about? English does not have the worst spelling system on the planet, and Japanese certainly doesn't qualify as the worst. "But they have three different scripts: two syllabaries, and an ideographic set" but...
Look, perhaps I'd better just demonstrate to you what a really bad spelling system looks like; go look at Irish.
WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
ASCII leaves off a lot of English punctuation, and accents that are, in fact, used in English (sure, in words of foreign origin, but they are still used.)
Some that aren't foreign as well. "Coöperate" is an archaic spelling. Basically, any prefix that ends in "o" that is attached to a word that starts with an "o" can archaically be spelled with a diaeresis, in the French/Dutch method of "this vowel should be pronounced separately, and not as part of a diphthong".
ASCII is just 128 characters.
Write boring code, not shiny code!
This is Slashdot, I'm sure you can find any number of examples of people who've written a pile of poo in ASCII.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
Yeah but can you write a pile of poo in ASCII?
As far as I know, Windows was originally written in ASCII... :)
They're only "easy" if you have your system configured for ISO-8859-1. Those of us who use UTF-8 get this result: à é.
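The mismatch is easy to reproduce. Here is a minimal sketch of what happens when bytes written in one of these encodings are read as the other; the sample string is just the two accented letters from the comment above.

```python
text = "\u00e0 \u00e9"  # a-grave, space, e-acute

# UTF-8 bytes misread as ISO-8859-1: each accented letter becomes two characters.
utf8_bytes = text.encode("utf-8")
misread_as_latin1 = utf8_bytes.decode("iso-8859-1")

# ISO-8859-1 bytes misread as UTF-8: the lone high bytes are invalid sequences,
# so a lenient decoder substitutes U+FFFD REPLACEMENT CHARACTER.
latin1_bytes = text.encode("iso-8859-1")
misread_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(utf8_bytes, repr(misread_as_latin1))
print(latin1_bytes, repr(misread_as_utf8))
```

Decoding with the same encoding that was used to write the bytes round-trips cleanly in both cases; the damage only appears when the two sides disagree.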
I'm pretty sure that in HTML5, as in HTML4, the document is considered to be made up of Unicode characters, and other charsets are treated as encodings of Unicode. Of course, the HTML5 spec doesn't list every Unicode character explicitly; that would be insane.
Note: I'm known as plugwash most places, but I screwed up registering that here somehow in the past and now can't register.
You know that this is the exact situation that Unicode AVOIDED, don't you?
Now we have one standard with 3 different representations. Those replaced literally thousands of standards. Yep, sometimes making a new standard works.
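Assuming the three representations meant here are UTF-8, UTF-16 and UTF-32, this short sketch shows one code point (the pile of poo, U+1F4A9, from earlier in the thread) in each of them. The big-endian variants are used so that no byte order mark is emitted.

```python
poo = "\U0001F4A9"  # PILE OF POO

encodings = ("utf-8", "utf-16-be", "utf-32-be")
encoded = {name: poo.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:10} {data.hex(' ')}")

# All three decode back to the same single code point.
decoded = {name: data.decode(name) for name, data in encoded.items()}
```

The byte sequences differ, but the character they stand for does not, which is the difference between three encodings of one standard and three competing standards.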