Slashdot Mirror


Unicode Consortium Releases Unicode 8.0.0

An anonymous reader writes: The newest version of the Unicode standard adds 7,716 new characters to the existing 21,499 – that's more than 35% growth! Most of them are Chinese, Japanese and Korean ideographs, but among those changes Unicode adds support for new languages like Ik, used in Uganda.

4 of 164 comments

  1. Re: I'm going back to ASCII by OrangeTide · · Score: 2, Insightful

    Sorry, why do we need multiple languages again?

    --
    “Common sense is not so common.” — Voltaire
  2. Re: I'm going back to ASCII by Anonymous Coward · · Score: 0, Insightful

    The world would do well with a single global language instead of the literally thousands of languages and dialects we have now. Communication, or rather the inability to communicate with one another, is the key reason (religion being a close second) for conflict and strife in our society.

  3. Re:Seems like it, but doesn't by Dutch Gun · · Score: 4, Insightful

    Well, no, it's not, they're still working on that bit meaning that you get to keep upgrading all your programs to use newer and ever bigger libraries supporting more complex rules regularly. It's not stable.

    Nonsense. The Unicode encoding formats are stable, and have been for a very long time. New characters are added all the time, but the underlying OS and its fonts are typically upgraded to support these, so most programs need to do absolutely nothing once their Unicode support is in place. The vast, vast majority of applications that support Unicode don't actually need to use the "official" Unicode libraries (which are monstrously complex), because all modern operating systems provide most of the support they need. For simple conversions, you can use your language's standard library (many languages have one), the OS-specific APIs, or any of a number of excellent, easy-to-use free and open-source libraries.

    If you're concerned about size, just use UTF-8. There's no need to "switch encodings on the fly", because that's what variable-width encodings already do for you. And the vast majority of commonly used characters, even in Asian languages, have code points that fit in 16 bits, not 24 or 32. The issue of inefficient text size with Asian languages is greatly exaggerated, and is becoming less and less relevant anyhow now that our machines have gigabytes of RAM and processors efficient enough to compress and decompress text on the fly. BTW, you can use UTF-8 just fine even in Microsoft and Apple environments. It just means you need to transcode between UTF-8 and UTF-16 at any API boundary that takes text, and this is fairly simple to do. I've written my own cross-platform code this way because UTF-8 is a much easier encoding to work with internally IMO.

    I don't think anyone would try to argue that Unicode is a perfect solution, but it's a damn sight better than what we used to have. Your comparison to USB is pretty good, in fact. Ask just about any PC user what they'd prefer - modern USB devices or the old system of parallel, serial, PS/2, and joystick ports. Whatever faults USB has, it's a hell of an improvement over the old system.

    --
    Irony: Agile development has too much inertia to be abandoned now.
  4. Re:I thought by KiloByte · · Score: 4, Insightful

    That slashdot didn't support unicode

    It does. It's actually fully Unicode-compliant

    No, Slashdot's database works in ISO-8859-1. You're confusing Slashcode, which can do Unicode, with Slashdot, which still hasn't deployed it.

    --
    The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
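
For anyone who wants to check the size and transcoding points from comment 3, and the ISO-8859-1 limitation from comment 4, here is a minimal Python sketch. It is not anyone's production code: the sample strings and the helper names (encoded_sizes, utf8_to_utf16, utf16_to_utf8, fits_latin1) are made up for illustration, and it relies only on Python's built-in str.encode and bytes.decode.

```python
# Illustrative sample strings (assumptions, not taken from the thread).
samples = {
    "ascii": "hello",
    "latin": "caf\u00e9",               # "cafe" with an acute accent on the e
    "cjk": "\u65e5\u672c\u8a9e",        # three common CJK ideographs
    "emoji": "\U0001F600",              # one code point outside the 16-bit range
}


def encoded_sizes(text):
    """Return the byte length of text in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


def utf8_to_utf16(data):
    """Transcode UTF-8 bytes to little-endian UTF-16 bytes, e.g. before an OS call."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data):
    """Transcode little-endian UTF-16 bytes back to UTF-8 bytes."""
    return data.decode("utf-16-le").encode("utf-8")


def fits_latin1(text):
    """Return True if every character in text can be stored as ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for name, text in samples.items():
        print(name, encoded_sizes(text), "latin-1 ok:", fits_latin1(text))
```

With these samples, the CJK string takes 9 bytes in UTF-8 against 6 in UTF-16, while the ASCII string takes 5 against 10, so which encoding is smaller depends on the text. The round trip through utf8_to_utf16 and utf16_to_utf8 is lossless for valid input, which is why converting at API boundaries is cheap to get right. Only the first two samples survive an ISO-8859-1 store.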