Slashdot Mirror


Getting Unicode Character Codes in JavaScript?

jargonCCNA asks: "I've searched high and low across the web, but I can't seem to be able to find any code snippets or even anything that'll help me out here. I'm trying to get a Unicode character code from a data stream in JavaScript and there doesn't seem to be anything out there to help me; JavaScript itself only has onboard support for ISO-Latin_1, or something. I tried hacking my own converter code, but it's rife with errors. Anybody know of some code that I can include in a GPL project?"

"Here's the buggy code, if you're interested:

function unicode2hex( unicode )
{
    var hexString = "";
    for( var i = 0x0000; i <= 0xFFFF; i++ )
    {
        test = eval( "\\u" + i );
        if ( unicode == test )
        {
            hexString += i / 4096;
            hexString += i / 256;
            hexString += i / 16;
            hexString += i % 16;
            hexString += "";

            return hexString;
        }
    }

    return false;
}
"Mozilla's JavaScript console lets me know that '\u0' is an illegal character. I think this would work if I could make it use the string "0000" instead of the number 0 for i.

Just for reference -- I've seen a lot of people get nailed on Ask /. because they didn't do the proper research before asking their question. Google has failed me; I've been trying to figure this out on my own for about a month. I hope someone can shed some light on my situation."
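
The '\u0' error follows from how the escape is built: concatenating the number i appends its decimal digits with no padding, while a \u escape needs exactly four hex digits. Below is a minimal sketch of that padding step. The helper name toHex4 is hypothetical and not from the original post. It also uses String.fromCharCode to build a character from its code, which avoids eval altogether.

```javascript
// Hypothetical helper: format a code unit (0..0xFFFF) as four uppercase hex digits.
function toHex4(n) {
  var hex = n.toString(16).toUpperCase();
  while (hex.length < 4) {
    hex = "0" + hex; // left-pad so that 0 becomes "0000"
  }
  return hex;
}

// String.fromCharCode builds the character directly, so no eval is needed.
var ch = String.fromCharCode(0x00A9);

// The text of the escape sequence: a backslash, a "u", and four hex digits.
var escapeText = "\\u" + toHex4(0x00A9);
```

With this, toHex4(0) yields "0000", which is the string the poster wanted in place of the bare number 0.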

26 comments

  1. One question by Henry+V+.009 · · Score: 5, Funny

    How did this story get past the lameness filter?

    1. Re:One question by Real+World+Stuff · · Score: 2

      Desperation for Quality content.

      --
      If we don't fight for ourselves no one will.
  2. Ask the Experts by Tux2000 · · Score: 1

    Ask the Experts at http://selfforum.teamone.de. It's a German forum, but most people there can read and write English as well. The SelfForum is related to the famous SelfHTML (at least here in Germany, it is famous). Just copy and paste your question there.

    --
    Denken hilft.
  3. Isn't this a question for developer.net? by displague · · Score: 2

    What's the deal? Cliff must have hit the "Accept" instead of the "Reject" button by accident.

    Try asking your question in IRC before hitting up "Ask Slashdot."

    A search on google for unicode and javascript brings back a lot of positive looking results without actually delving into them. It seems like JS1.5 has support for this (from the Google summaries).

    --
    Marques Johansson
    1. Re:Isn't this a question for developer.net? by jargonCCNA · · Score: 1

      A search on google for unicode and javascript brings back a lot of positive looking results without actually delving into them.

      Yeah, positive looking. That's the thing. Looks are exceedingly deceiving on a search engine. Try actually delving in; I can almost guarantee that it won't convert Unicode characters to their character codes.

      --
      Matthew G P Coe
      http://mgpcoe.blogspot.com/
  4. Now for the answer.. by displague · · Score: 1

    Ok, I got my "Second Post" in.. Now here's the good answer.

    document.write("\u00A9 Netscape Communications" );

    I just did that in Galeon and it works fine...

    See - http://developer.netscape.com/docs/manuals/js/core/jsguide15/ident.html#1009690

    --
    Marques Johansson
    1. Re:Now for the answer.. by Lazarus+Short · · Score: 1

      That's great, except that it does the opposite of what he wants. He seems to want a function that'll turn the copyright sign to "00A9".

      --
      The most valuable commodity I know of is information. - Michael Douglas as Gordon Gekko, Wall Street
    2. Re:Now for the answer.. by displague · · Score: 1

      Ahhh.. You're right, I'm wrong... But I'll repeat the truly correct answer as I have already lured someone down the wrong path:

      document.write("\u00A9".charCodeAt(0));

      That provides the decimal, then you just have to convert to hex.

      function Dec2Hex (Dec) { var hexChars="0123456789ABCDEF"; var a=Dec % 16; var b=(Dec - a)/16; return "" + hexChars.charAt(b) + hexChars.charAt(a); } // yields two hex digits, so Dec must be 0 to 255

      Blatantly ripped off from here

      --
      Marques Johansson
  5. cliff cliff cliff.... by Anonymous Coward · · Score: 0

    Why the hell did you let someone place this story under the topic JAVA?? JAVA != JAVASCRIPT. They're two completely different things. this story is a flat out troll.

  6. Straight to the source! by yancey · · Score: 1


    Why don't you ask the Mozilla developers that are working on JavaScript 2.0?

    --
    Ouch! The truth hurts!
  7. Did you try looking at the docs? by Lazarus+Short · · Score: 5, Informative

    No offense, but I haven't used JS in years, and I found this in a matter of minutes.

    document.write("\u00A9 is ");
    document.write("\u00A9".charCodeAt(0));

    That will give you the answer in decimal. I trust you can convert to hex yourself.

    (Note: Requires JavaScript 1.3; previous versions used ISO-Latin-1 rather than Unicode, and I don't know what they'd do with a character higher than 255.)
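
For the decimal-to-hex step left to the reader above, one possible sketch combines charCodeAt with Number's toString(16). The helper name charToHex is hypothetical, and it assumes a non-empty string whose first character fits in one UTF-16 code unit; characters above U+FFFF would need separate handling.

```javascript
// Hypothetical helper: return the four-digit hex code of the first
// UTF-16 code unit of a non-empty string.
function charToHex(ch) {
  var hex = ch.charCodeAt(0).toString(16).toUpperCase();
  // Prepend zeros, then keep the last four characters.
  return ("0000" + hex).slice(-4);
}

var copyrightHex = charToHex("\u00A9"); // "00A9"
var euroHex = charToHex("\u20AC");      // "20AC", a character outside Latin-1
```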

    --
    The most valuable commodity I know of is information. - Michael Douglas as Gordon Gekko, Wall Street
    1. Re:Did you try looking at the docs? by jargonCCNA · · Score: 1

      All right, you're officially The Most Helpful Person On Slashdot now.

      I looked through all the documentation I could find; the only thing I found about charCodeAt() was that it uses ISO-Latin... But I think they also said they were JavaScript 1.2-specific.

      Any idea what version of JavaScript IE6 emulates, and Mozilla actually uses?

      --
      Matthew G P Coe
      http://mgpcoe.blogspot.com/
    2. Re:Did you try looking at the docs? by Lazarus+Short · · Score: 1

      Well, the example I used works as expected in IE 5.0, NS 4.7, and Moz 1.1a.

      (Similar code with characters outside the range of Latin-1 also works on both, though the browsers sometimes display the "no glyph for that" glyph: an open box for IE, "?" for NS/Moz.)

      Couldn't tell you what JS versions each browser actually uses, though.

      --
      The most valuable commodity I know of is information. - Michael Douglas as Gordon Gekko, Wall Street
    3. Re:Did you try looking at the docs? by Karma+Farmer · · Score: 1

      I have no idea who decides what is officially JavaScript. I'm imagining an oracle sitting on a subway platform somewhere, eating a corndog and spouting off ziggyisms to anyone who will listen.

      But, I'm assuming that IE will just use whatever version of JScript you happen to have installed on your machine. And, as far as I know, JScript really does follow the ECMAScript specification, which is a real spec, with standards bodies and the whole works, unlike "JavaScript", whatever that is, exactly.

      Anyhow, take a look here to get a look at some of the features of the JScript interpreter hosted in some of your favorite applications.

    4. Re:Did you try looking at the docs? by Karma+Farmer · · Score: 1

      Any idea what version of JavaScript IE6 emulates, and Mozilla actually uses?

      IE6 doesn't emulate JavaScript. It uses JScript, which is Microsoft's implementation of the ECMA-262 Edition 3 language standard (ECMAScript). Similarly, JavaScript is Netscape's implementation of the same standard. Neither is "emulating" anything.

      You can find the ECMAScript standard here: ECMA-262v3. You can discover what your favorite vendor has actually implemented by consulting the Mozilla and Microsoft documentation for each vendor's implementation.

    5. Re:Did you try looking at the docs? by Anonymous Coward · · Score: 0

      If I were you I would feel incredibly stupid. You research something for a month and in under an hour get back someone who did nothing more than browse the documentation for "a few minutes", and who, let me add, hasn't used the technology in years. admit it, you haven't felt this dumb in ages....

    6. Re:Did you try looking at the docs? by josepha48 · · Score: 2
      Here is something that will convert:
      function tounicode(instr) {
        // Concatenates the decimal character codes, last character first.
        var len = instr.length;
        switch (len) {
          case 1:
            return instr.charCodeAt(0);
          case 2:
            return String(instr.charCodeAt(1)) + String(instr.charCodeAt(0));
          case 3:
            return String(instr.charCodeAt(2)) + String(instr.charCodeAt(1)) + String(instr.charCodeAt(0));
          case 4:
            return String(instr.charCodeAt(3)) + String(instr.charCodeAt(2)) + String(instr.charCodeAt(1)) + String(instr.charCodeAt(0));
        }
        return "";
      }

      document.write(tounicode("\u002d") + " " + tounicode("-") + "<br>");

      With this you can take a string like "fooo" and get its Unicode equivalent.
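
For strings of any length, a loop over charCodeAt may be simpler than a switch on the length. This is only a sketch; the helper name stringToHexCodes is hypothetical, and it reports one entry per UTF-16 code unit in the original character order.

```javascript
// Hypothetical helper: list the hex code of every UTF-16 code unit in a string.
function stringToHexCodes(str) {
  var codes = [];
  for (var i = 0; i < str.length; i++) {
    codes.push(str.charCodeAt(i).toString(16).toUpperCase());
  }
  return codes;
}

var codes = stringToHexCodes("fooo"); // ["66", "6F", "6F", "6F"]
```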

      --

      Only 'flamers' flame!

    7. Re:Did you try looking at the docs? by jargonCCNA · · Score: 1

      If I were you, I'd not only use better grammar, but I'd identify myself. So somebody the results. Good for him. A lucky search.

      --
      Matthew G P Coe
      http://mgpcoe.blogspot.com/
    8. Re:Did you try looking at the docs? by jargonCCNA · · Score: 1

      Oh, okay... The way I've understood it for years was that JScript was a sorta-cheap knockoff of JavaScript.. D'oops!

      --
      Matthew G P Coe
      http://mgpcoe.blogspot.com/
    9. Re:Did you try looking at the docs? by Anonymous Coward · · Score: 0
      So somebody the results.
      This sentance no verb. Anyway, he right, you just too indignant admit it. All these sentances no verb, for those too shortsited notice.
    10. Re:Did you try looking at the docs? by jargonCCNA · · Score: 1

      Hilarious. It's called a typographic error. Your satire would have been perfect had you spelled "sentence" and "shortsighted" correctly.

      --
      Matthew G P Coe
      http://mgpcoe.blogspot.com/
    11. Re:Did you try looking at the docs? by Anonymous Coward · · Score: 0
      No, his satire would have been perfect if he'd quoted a bit more creatively:
      I were you, I'd [...] use better grammar. So somebody the results. Good for him.
  8. Re:Wrong topic Cliff, you cockfoster by jargonCCNA · · Score: 1

    Too bad there's no JavaScript topic, eh there chico?

    Lick your own.

    --
    Matthew G P Coe
    http://mgpcoe.blogspot.com/
  9. Hey, Cliff... by Anonymous Coward · · Score: 0

    Slashdot is a nerd website. We know better than to think JavaScript is at all Java. Change that Coffee Cup Graphic, bud.

  10. Story submissions by yerricde · · Score: 1

    How did this story get past the lameness filter?

    Stories are probably not subject to the lameness filter (or at least they have looser filters) because an editor must approve each story by hand.

    That said, I have a possible (untested) solution: Try changing each += in the inner loop to a +=""+ to force the strings to be concatenated rather than treated as numbers.
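
Forcing string concatenation may not be enough by itself, since divisions such as i / 16 return fractions in JavaScript (169 / 16 is 10.5625). A sketch of the digit-building step that floors each quotient and looks the digit up in a table follows. The function name codeToHexDigits and the digits table are hypothetical, not part of the original code.

```javascript
// Hypothetical rewrite of the digit-building step from the question.
// Expects an integer in the range 0..0xFFFF.
function codeToHexDigits(i) {
  var digits = "0123456789ABCDEF";
  var hexString = "";
  // Floor each quotient, then keep only the low four bits as one hex digit.
  hexString += digits.charAt(Math.floor(i / 4096) % 16);
  hexString += digits.charAt(Math.floor(i / 256) % 16);
  hexString += digits.charAt(Math.floor(i / 16) % 16);
  hexString += digits.charAt(i % 16);
  return hexString;
}

var copyrightDigits = codeToHexDigits(0x00A9); // "00A9"
```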

    --
    Will I retire or break 10K?