Slashdot Mirror


Google Releases An Open Source Font That Supports 800 Languages (googleblog.com)

An anonymous Slashdot reader quotes Hot Hardware: It's been working on the project over the past five years in collaboration with Monotype in hopes of eradicating so-called "tofu" -- the blank boxes you see when a PC or website can't display a particular text -- from the web. Noto, or No more tofu, is Google's answer, and it's available now to download...

"We are thrilled to have played such an important role in what has become one of the most significant type projects of all time," said Scott Landers, president and CEO of Monotype... Monotype played the biggest role, though Google also collaborated with Adobe and had a network of volunteer reviewers. As far as Monotype is concerned, Noto is one of the most expansive typography projects ever undertaken.

There are 110,000 characters, and Google says the project "required design and technical testing in hundreds of languages."

175 comments

  1. Keeping up with the emojis by Megane · · Score: 1

    Isn't a lot of this due to all the new stuff that Unicode keeps adding? I still have a Bitstream Cyberbit font somewhere from... was it back in the late '90s? This is the same thing all over again, just up to date.

    --
    #naabhaprzrag, #sverubfr-000, #agi-fcbafberq, negvpyr[pynff*=' negvpyr-ary-'] { qvfcynl: abar !vzcbegnag; }
    1. Re:Keeping up with the emojis by Anonymous Coward · · Score: 1, Funny

      Isn't a lot of this due to all the new stuff that Unicode keeps adding? I still have a Bitstream Cyberbit font somewhere from... was it back in the late '90s? This is the same thing all over again, just up to date.

      The whole rest of the world just needs to learn fuckin' English!

      Signed,

      Provincial Americans Everywhere (by "everywhere" I mean the USA -- clearly there is no "where" else to be! So what if Canadians and other foreigners understand our politics better than we do? That just shows our awesomeness!)

    2. Re: Keeping up with the emojis by Anonymous Coward · · Score: 2, Funny

      I just need the Klingon word for mocking condescension to belittle you with.

    3. Re:Keeping up with the emojis by dmoen · · Score: 5, Informative

      Bitstream Cyberbit was closed source, and had a license incompatible with GPL. Noto is free and open source. The source files for the fonts, and the build tools, are all open.

      Noto is an ongoing open source project that will continue to track the Unicode standard, while Cyberbit implemented Unicode 1.0.1 and then just stopped.

      Noto has Sans and Serif variants in a range of weights and styles, unlike Cyberbit, which had only a single style and weight (serif).

      So that's more than just "the same thing all over again".

      --
      I have written a truly remarkable program which this sig is too small to contain.
    4. Re: Keeping up with the emojis by Anonymous Coward · · Score: 0

      toDSaH.

    5. Re:Keeping up with the emojis by Anonymous Coward · · Score: 4, Interesting

      Hate to say it but I consider the conversion of all emojis to tofu a feature, not a bug. The tofu neatly summarises the vacuousness of the original abomination... I mean, message.

    6. Re:Keeping up with the emojis by mcswell · · Score: 1

      Why not tell the rest of the world (including you) to learn Chinese, or Spanish, or Bangla? That's easy, right?

    7. Re:Keeping up with the emojis by DraconPern · · Score: 2, Insightful

      I think it's more that this is all the glyphs in one font, whereas before you had Chinese, Arabic, etc. all in separate fonts. The other half of the problem Google had was that they didn't have good font rendering in Android, i.e. how you actually render the font. Microsoft, Apple, and Adobe had it figured out a long time ago and all that knowledge is part of the OS. So Google is basically just playing catch-up and open sourcing the data part. Also... do we really want to load that large a font when most people only use a fraction of the data?

    8. Re:Keeping up with the emojis by Michael+Woodhams · · Score: 1

      So make your own branch of Noto called NoEmo, in which all emoji are rendered the same (possibly blank, possibly some generic 'this is an emoji' symbol.) It is open, so there is nothing to stop you.

      --
      Quattuor res in hoc mundo sanctae sunt: libri, liberi, libertas et liberalitas.
    9. Re: Keeping up with the emojis by Guppy · · Score: 4, Funny

      toDSaH

      Wow, Klingons have a word for everything. They're like Space Germans.

    10. Re: Keeping up with the emojis by Ash-Fox · · Score: 2

      You just wrote it in English though, your point is invalid.

      --
      Change is certain; progress is not obligatory.
    11. Re:Keeping up with the emojis by Anonymous Coward · · Score: 1

      I don't think it's a good font though, except for the "fallback if no other font knows how to render it".
      I know people disagree a lot about this, but DejaVu Sans Mono is still the only one that works for me in a terminal (I was hopeful about Hack, but on trying it I actually found it horrible, even for code).
      And for reading, I found that Bitter is the most readable font I've ever seen by leaps and bounds. I never expected that experience... Hopefully it will make its way into distributions at some point.

    12. Re:Keeping up with the emojis by michelcolman · · Score: 1

      Then why does the download page have "Noto Naskh Arabic", "Noto Sans Armenian", "Noto Sans Avestan", "Noto Sans Balinese" and about a hundred or so more?

      I went to the download page hoping to be able to download just one font, or maybe a few for serif, sans, monospaced. But no, a gazillion different fonts. I thought we were past this with Unicode.

    13. Re:Keeping up with the emojis by AmiMoJo · · Score: 3, Informative

      There are still multiple font files for different languages, because you can't have a unified "all language" font with Unicode. It's impossible to support Chinese, Japanese and Korean in the same font, for example.

      Android's font rendering is excellent, has been for years. It also helps that many Android phones, even mid range ones from a few years back, have 1080p or better displays that start to rival print for DPI (400-500 PPI on the screen, 3x that horizontally with sub-pixel rendering, vs. 600 DPI for prints).

      Google just want consistency everywhere and the ability to ship one font that covers all possible languages. You still need hacks because of the Unicode flaw mentioned above, but it's a big step nonetheless. AFAIK the only other open source font that tries to do this is GNU Unifont, but it's more functional than pretty.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    14. Re: Keeping up with the emojis by Anonymous Coward · · Score: 0

      patock cuplaw! I believe that's what Lieutenant Worf would say!

    15. Re:Keeping up with the emojis by Anonymous Coward · · Score: 0

      It is probably split into subsets of unicode for convenience in some systems.

      The terminology can be confusing, though. The "font" can refer to a "type family" which is the abstract idea and the collection of all the font files that can be included in the "font". Or it can refer to an individual file. See http://graphicdesign.stackexchange.com/questions/35619/difference-between-font-face-typeface-font-in-the-context-of-typography

    16. Re:Keeping up with the emojis by Anonymous Coward · · Score: 0

      If you download the BIG files, the 420+ MByte files, you get all scripts. But if you are outfitting a flip-phone or other small resources device, you may only be able to devote a few MB to fonts. So for those uses, they have unrolled specific regional glyph sets. Maybe all you need is Urdu, Arabic & Latin for your small device. On my desktop or laptop of course I want the full unicode set, why not? But on my smartphone with a paltry 32GB, do I want the full CJK set that I can't read in any case? I'd rather devote that space to cute videos of my dog!
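
      A rough Python sketch of the "only ship the scripts you need" idea above: guess which scripts a piece of text uses and look up a per-script font family for each. The script guess (first word of the Unicode character name) is a crude heuristic, and the FONT_FOR_SCRIPT table is a hypothetical mapping that borrows family names from the download page. It is an illustration under those assumptions, not how Noto or any OS actually picks fonts.

```python
import unicodedata

# Hypothetical table: script keyword -> font family to bundle on the device.
FONT_FOR_SCRIPT = {
    "LATIN": "Noto Sans",
    "ARABIC": "Noto Naskh Arabic",
    "ARMENIAN": "Noto Sans Armenian",
}


def scripts_in(text):
    """Crude script guess: the first word of each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                scripts.add(name.split()[0])
    return scripts


def fonts_needed(text):
    """Return (font families to bundle, scripts with no font in the table)."""
    found = scripts_in(text)
    fonts = {FONT_FOR_SCRIPT[s] for s in found if s in FONT_FOR_SCRIPT}
    missing = {s for s in found if s not in FONT_FOR_SCRIPT}
    return fonts, missing


# Latin + Arabic + Armenian sample, written with escapes to stay ASCII-only.
sample = "Hello \u0645\u0631\u062d\u0628\u0627 \u0540\u0561\u0575"
fonts, missing = fonts_needed(sample)
print(sorted(fonts))
print(sorted(missing))
```

      Anything that lands in the second set is text that would still render as tofu with the fonts in the table.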

    17. Re: Keeping up with the emojis by Anonymous Coward · · Score: 0

      Wèishéme bù gàosù shìjiè qít dìq (bokuò n) xuéxí zhngguó huò xbnyá y huò mèngjil? Zhè hn jindn, bùshì ma?

    18. Re: Keeping up with the emojis by Ash-Fox · · Score: 1

      Oh look, one of those randomized spam messages.

      --
      Change is certain; progress is not obligatory.
    19. Re: Keeping up with the emojis by Anonymous Coward · · Score: 0

      What the hell kind of gibberish is this? It almost looks like Pinyin, but some of it is nonsense. And learn the characters if you really want to learn Chinese :)

    20. Re:Keeping up with the emojis by Anonymous Coward · · Score: 0

      Ah, working in the terminal. Where my Chinese fonts get cut off, so I had to change the terminal app (Konsole) to English. I don't know whose problem it is (KDE? Konsole? the font?) but this is a really annoying bug.

    21. Re:Keeping up with the emojis by Anonymous Coward · · Score: 0

      practically speaking, English is the current lingua franca worldwide
      it's pretty much the only language you can be sure to find somebody who speaks it, no matter where you are

      (btw check out the youtube clips of entire stadiums of chinese learning english, pretty amazing)

    22. Re: Keeping up with the emojis by stephenmac7 · · Score: 1

      It's slashdot. No Unicode support, so he can't write with Hanzi.

      --
      "No man's life, liberty, or property are safe while the legislature is in session." -- Judge Gideon J. Tucker
    23. Re:Keeping up with the emojis by nullchar · · Score: 1

      This single file font, arialuni.ttf, supports a ton of languages and includes glyphs for many characters:

      Font Specifications and Notes

      Source: Developed by Microsoft Corporation and supplied with the latest versions of Microsoft Office (2000, XP, and 2003). Also available with Microsoft's FrontPage 2000 and Publisher 2002.

      Stats: Version 1.00 has 50,377 glyphs and no kerning pairs.

      Support: This large font includes support for the following languages: Arabic script (Arabic including some dialect-specific letters, Balochi, Persian, Punjabi Shahmukhi, Urdu), Armenian, Cyrillic (all or most of range), Devanagari, Georgian (Mkhedruli & Asomtavruli), Greek (including polytonic and Coptic characters), Gurmukhi, Hebrew, IPA, Japanese (Hiragana, Katakana, Kanji/Han Ideographs), Kannada, Korean (Hangul only), Latin, Tamil, Thai, Vietnamese.

      OpenType Layout Tables: Arabic (default, Farsi, Urdu), Devanagari, Gujarati, Gurmukhi, Han Ideographic (default, Japanese, Chinese simplified, Chinese traditional), Kana (default, Japanese), Kannada, Korean, Tamil.

    24. Re:Keeping up with the emojis by Anonymous Coward · · Score: 0

      Actually, the whole point of unicode is to allow a single font file for all language glyphs on Earth. So you CAN have a unified "all language" font with Unicode. If you download the big (nearly half a gigabyte) files you will get the all-in-one package. But for smaller devices such as flip phones, there may not be storage space for the big files. Some folks with limited devices may be happy with only, say, Cyrillic, Latin, and Kazakh glyphs. Even the languages such as Chinese, Korean, and Japanese may have use cases where they don't want to bother with nearly half a gigabyte of glyphs.

    25. Re: Keeping up with the emojis by Hognoxious · · Score: 1

      I'd say that's more of a feature than a bug.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    26. Re:Keeping up with the emojis by Hognoxious · · Score: 1

      English is the current lingua franca

      Je vois ce que vous avez fait là.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    27. Re:Keeping up with the emojis by HideyoshiJP · · Score: 1

      Yeah, I was curious how they were going to handle language dependent characters that occupy the same unicode space.

    28. Re:Keeping up with the emojis by Thanatiel · · Score: 1

      There is no real French equivalent of "I see what you did there" (i.e. a single expression that acknowledges a witty remark).
      Perhaps one could say "joli" (nice), or "bien dit" (well said) maybe with a smiley next to it ... but it does not feel natural and they do not require more than 7 bits per character.
      Most French-speaking kids would probably end up using the English expression without a clue how to write it or what exactly it means.

      --
      Irrelevant news and morons using moderation to mod down what they disagree on. 2018 resolution: so long.
    29. Re: Keeping up with the emojis by paiute · · Score: 1

      Ziplock!

      --
      If Slashdot were chemistry it would look like this:Cadaverine
    30. Re:Keeping up with the emojis by greenfruitsalad · · Score: 1

      > it's pretty much the only language you can be sure to find somebody who speaks it, no matter where you are

      this is why i'm learning spanish.

    31. Re: Keeping up with the emojis by WallyL · · Score: 1

      You misspelled p'takh!

    32. Re:Keeping up with the emojis by unixisc · · Score: 1

      Isn't a lot of this due to all the new stuff that Unicode keeps adding? I still have a Bitstream Cyberbit font somewhere from... was it back in the late '90s? This is the same thing all over again, just up to date.

      Did Noto need to support emojis?

    33. Re:Keeping up with the emojis by Anonymous Coward · · Score: 0

      I don't think it's a good font though, except for the "fallback if no other font knows how to render it".
      I know people disagree a lot about this, but DejaVu Sans Mono is still the only one that works for me in a terminal (I was hopeful about Hack, but on trying it I actually found it horrible, even for code).
      And for reading, I found that Bitter is the most readable font I've ever seen by leaps and bounds. I never expected that experience... Hopefully it will make its way into distributions at some point.

      DejaVu Sans Mono 10.5 for LIFE!! I can't believe in all this time a better terminal font still hasn't come around.

    34. Re:Keeping up with the emojis by thegarbz · · Score: 1

      Also... do we really want to load that large of a font when most people only use a fraction of the data?

      The problem with this argument is that people only use a fraction of the data right up until the point where they don't, and then everything breaks. I don't speak a word of Japanese but I have Japanese fonts on my computer. Why? Because at some point something important was embedded in a PDF which had some Japanese in it and it refused to render. Up until that point I would have agreed with you, but really the ability to see things how they are supposed to be trumps having a broken view that could construe meaning, and since then I have always questioned why someone would do the insane thing of not shipping fonts for all languages by default.

    35. Re:Keeping up with the emojis by Anonymous Coward · · Score: 0

      Android's font rendering is excellent, has been for years.

      The supplied font is dull as hell though. Roboto or something?

    36. Re:Keeping up with the emojis by Anonymous Coward · · Score: 0

      Ah, I was thinking of Droid Sans.

      But Roboto did not look all that great either (it looks a bit better now):

      http://typographica.org/on-typography/roboto-typeface-is-a-four-headed-frankenstein/

  2. A tool that must serve multiple purposes . . . . by Anonymous Coward · · Score: 0

    Cannot be the best at just one thing.

  3. Bring on the Glyphs by Anonymous Coward · · Score: 0

    I'm so tired of missing Unicode characters.

    1. Re:Bring on the Glyphs by Anonymous Coward · · Score: 0

      lol how about you talk the REAL reason?

    2. Re:Bring on the Glyphs by Anonymous Coward · · Score: 0

      This might help Google developers, but it won't help me as a reader who doesn't know all 800 languages. A phrase displayed in Hindi is just as impenetrable to me as those little boxes.

    3. Re: Bring on the Glyphs by Anonymous Coward · · Score: 0

      At least you know it's another language entirely and not just a rendering problem

  4. "Now available to download" link by aneroid · · Score: 4, Informative

    https://www.google.com/get/not... You're welcome

    Came across this a few days ago when I borked my Slackware upgrade. Everything went fine except GUI login; X kept crashing because I deleted the fonts it was trying to use. One of the Google search results was Noto.

    All fonts = 472.6 MB.

    1. Re:"Now available to download" link by aneroid · · Score: 0

      Forgot to mention - this still doesn't solve the tofu problem since you need to have the font installed to not see tofu. In which case Google Web Fonts is still the way to go. You just pick a font which supports your content/language. Or one of the Noto fonts.

    2. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      472.6 MB? Holy shit, someone at Google needs to be fired immediately.

    3. Re:"Now available to download" link by jonwil · · Score: 0

      By far the vast majority of that download size is taken up by the fonts for the 1000s of characters in Japanese, Korean, Simplified Chinese and Traditional Chinese.

      All the other fonts only total about 10 MB or so.

    4. Re:"Now available to download" link by aneroid · · Score: 4, Informative

      1. In the emoji fonts there's "Raised Hand With Part Between Middle And Ring Fingers" - WhyTF is that not called "live long and prosper"? Some emoji are described by how they look while others are described by what they mean. A bit inconsistent but I guess that's more of a Unicode Consortium issue.

      2. Some of the hand emoji like "White Left Pointing Backhand Index" are all called "white..." even though they've clearly done the race/skin tone colour spectrum a la WhatsApp.

      2b. The colours are a second Unicode code point (an emoji modifier, forming an emoji modifier sequence) placed after the emoji, ranging from U+1F3FB (white/pale) to U+1F3FF (black/dark). (Btw, that's counterintuitive to programmers since RGB colour codes have "#00" being dark and "#FF" being light.) P.S. I haven't decided if the skin colour aspect of emoji is racist or not. There may be some people who found the default yellow emoji racist.

      Answer to #2:
       

      Names of symbols such as BLACK MEDIUM SQUARE or WHITE MEDIUM SQUARE are not meant to indicate that the corresponding character must be presented in black or white, respectively; rather, the use of “black” and “white” in the names is generally just to contrast filled versus outline shapes, or a darker color fill versus a lighter color fill. Similarly, in other symbols such as the hands U+261A BLACK LEFT POINTING INDEX and U+261C WHITE LEFT POINTING INDEX, the words “white” and “black” also refer to outlined versus filled, and do not indicate skin color.

      and

      General-purpose emoji for people and body parts should also not be given overly specific images: the general recommendation is to be as neutral as possible regarding race, ethnicity, and gender. Thus for the character U+1F477 CONSTRUCTION WORKER, the recommendation is to use a neutral graphic (with an orange skin tone) instead of an overly specific image (with a light skin tone). This includes the emoji modifier base characters listed in Sample Emoji Modifier Bases. The emoji modifiers allow for variations in skin tone to be expressed.
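
      For anyone who wants to poke at this, here is a small Python sketch of what points 1 and 2b describe: the character names come straight from the Unicode database that ships with Python, and a skin-tone variant is just the base emoji followed by one of the five modifier code points. The helper name with_skin_tone is made up for illustration, and whether the pair draws as a single glyph depends on the font and renderer.

```python
import unicodedata

# The five emoji modifiers occupy U+1F3FB..U+1F3FF (light to dark).
MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F3FF + 1)]


def with_skin_tone(base, modifier):
    """Build an emoji modifier sequence: base character, then modifier."""
    return base + modifier


vulcan = "\U0001F596"  # the "live long and prosper" hand

# Names describe appearance, not meaning.
print(unicodedata.name(vulcan))

for mod in MODIFIERS:
    seq = with_skin_tone(vulcan, mod)
    # Each sequence is two code points that a capable font draws as one glyph.
    print("U+%04X" % ord(mod), unicodedata.name(mod), len(seq))

# "WHITE"/"BLACK" in older symbol names mean outlined versus filled.
print(unicodedata.name("\u261C"))
print(unicodedata.name("\u261A"))
```

      The loop prints names and code points rather than the emoji themselves, so it behaves the same on terminals that would otherwise show tofu.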

    5. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      Well, there are ads embedded in every glyph...

    6. Re:"Now available to download" link by Qzukk · · Score: 4, Informative

      Way back when Unicode decided to unify all the CJK glyphs they made several screwups in unifying characters that were not actually the same in each of the languages. Aside from the character looking wrong in Chinese or Japanese (whichever language you don't have installed as default) they may sort differently in different languages so collation is wrong too. More information (note that you'll need a full CJK font and a browser supporting language selection to see the differences).

      Noto's solution was to create a font with every possible glyph, then for systems which can't support identifying the correct glyph based on language, they made versions of the fonts where the default characters are the Japanese versions or the Chinese versions or so on, then for embedded stuff they made versions of the fonts with just one language's characters. Noto's explanation of their CJK fonts. In other words, you only need one of the 110MB font files.

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    7. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      It's still extremely large. And I can't find any indication of what makes those fonts different than the alternatives, for single languages.

    8. Re:"Now available to download" link by _merlin · · Score: 1

      It's for situations where you allow user input, and don't want to limit them to entering text in a single language. Or if you want to display filenames, or the contents of e-mails, or whatever.

    9. Re:"Now available to download" link by ptaff · · Score: 3, Insightful

      Google Web Fonts is still the way to go.

      And helps Google track users one more way. Please be a good hacker and serve fonts from your own domain. Thank you.

    10. Re:"Now available to download" link by Travis+Mansbridge · · Score: 2

      In HTML5 you can serve fonts, so it's just a matter of including Noto on sites where tofu might be a problem.

    11. Re: "Now available to download" link by Anonymous Coward · · Score: 0

      Thank you!

      So many lazy fucking "web designers" building pages pulling in 5+ fonts from Google Fonts drives me crazy. It takes 15 seconds to download them and serve them from your own domain name.

    12. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      You want task kill, you wanna go BIG n it.Nobody needs 4k screens no but people do want 6Gb ram. If your pyone becomes your computer people wanna run all appp, so 6gb is min! or 8. Bloat is bloat but people need to browse, need to go social,, for the app and the common task.

      don't think that because memry ever increase we are at the end. some day like the 640k we will look back with comic and funny about how we did not need the 6gb.

      "640k is enoguh for anybody" -- Bill Gate

    13. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      how about you talk the REAL reason,why you no talk tracking system for Goog?

    14. Re:"Now available to download" link by aneroid · · Score: 1

      All fonts = 472.6 MB.

      That's for all of them. Individual fonts are reasonably and typically sized. Bear in mind, having this many more glyphs for so many languages does require them to be bigger.

      Noto Sans: 657 KB (4 styles, 581 languages)
      Noto Serif: 838 KB (4 styles, 581 languages)
      Noto Mono: 69.5 KB (1 style, 209 languages) # this should have had 581 langs

    15. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      i loooooooled! noone no 800 langs.

    16. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      Gates NEVER SAID that. It's a myth. Urban legend.

    17. Re:"Now available to download" link by _merlin · · Score: 3, Interesting

      Yeah, but it's like "90% of people use 10% of features" - everyone uses a different 10%, so 100% of features are used. Similarly, everyone needs a different combination of languages, so if you're going to use one family of fonts, you want to have massive coverage.

    18. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      (worthless moon language balloons the whole thing to hundreds of megs, go figure)

      ching ching chong haha ^_^ multiculturalism ftw

    19. Re:"Now available to download" link by jrumney · · Score: 1

      I'd hazard a guess that the color emoji are taking up considerably more room than the fairly standard CJK glyphs that have been shipping in fonts around 3-4MB in size for the last 20 years.

    20. Re: "Now available to download" link by TheRaven64 · · Score: 4, Insightful

      It's not always laziness (or tracking, from Google's perspective). Google sets a long cache value for most of these resources. If 10 different sites all host them individually, then someone visiting the site will have to download the fonts 10 times. Alternatively, if they all point to Google then they'll download once and cache the copy locally for the other 9 sites.

      There was a proposal a couple of years ago to embed a cryptographic hash of the resource in the link. This would allow you to specify a download location, but if you've already downloaded the file from another source then you could still use it (it would also make caches more efficient, because you could set an infinite timeout and make clients redownload by having a different hash in the link - clients would keep their copy potentially forever, until you updated the version). I don't know of any browsers that implemented it though.
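
      A minimal Python sketch of the hash-in-the-link idea, to make it concrete. The value format (algorithm name, a dash, then the base64 digest) follows the shape used by the W3C Subresource Integrity integrity attribute; the SharedFontCache class, the example URLs, and the fake download function are all hypothetical. It shows the mechanism only and says nothing about what any browser actually does.

```python
import base64
import hashlib


def integrity_value(data, algorithm="sha384"):
    """Format a digest as '<algorithm>-<base64 digest>'."""
    digest = hashlib.new(algorithm, data).digest()
    return algorithm + "-" + base64.b64encode(digest).decode("ascii")


def matches(data, expected):
    """Check data against a hash string taken from the link."""
    algorithm = expected.split("-", 1)[0]
    return integrity_value(data, algorithm) == expected


class SharedFontCache:
    """Hypothetical cache keyed by content hash rather than by URL."""

    def __init__(self):
        self._by_hash = {}
        self.downloads = 0

    def fetch(self, url, expected, download):
        if expected in self._by_hash:
            # Same bytes already fetched from some other site: reuse them.
            return self._by_hash[expected]
        data = download(url)
        self.downloads += 1
        if not matches(data, expected):
            raise ValueError("hash mismatch for " + url)
        self._by_hash[expected] = data
        return data


font_bytes = b"pretend this is a font file"
link_hash = integrity_value(font_bytes)


def fake_download(url):
    return font_bytes


cache = SharedFontCache()
cache.fetch("https://site-a.example/noto.woff2", link_hash, fake_download)
cache.fetch("https://site-b.example/noto.woff2", link_hash, fake_download)
print(cache.downloads)  # 1
```

      Two sites self-hosting the same file cost one download, and updating the font just means publishing a link with a new hash.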

      --
      I am TheRaven on Soylent News
    21. Re:"Now available to download" link by TheRaven64 · · Score: 1

      Aside from the character looking wrong in Chinese or Japanese (whichever language you don't have installed as default) they may sort differently in different languages so collation is wrong too.

      Collation shouldn't be broken. Collation is always locale-specific. German, English, and French all have different collation orders, even though they're using the same character set (how you sort capitals vs lower case vs accented variants is different in each). The only reason that this would break collation would be if, for example, Japanese sorts Chinese characters differently from the equivalent Kanji (does it? I have no idea).

      --
      I am TheRaven on Soylent News
    22. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      Except that half of my browsers are configured to not load fonts (which can be malicious) from random web pages.
      So in fact, these are actually what CAUSE the problem for me.

    23. Re:"Now available to download" link by Anonymous Coward · · Score: 2, Informative

      German and Swedish might be a better example.
      They both have ö and ä, but German orders ö like o and ä like a, while Swedish puts them after z.
      And those very much ARE the same characters.
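
      A toy Python illustration of that difference, sorting the same list two ways. The two key functions are simplified stand-ins written for this example (German dictionary-style order folding the umlauts into their base vowels, Swedish treating å, ä, ö as the last three letters); real software would use the platform's locale or ICU collation rather than hand-rolled keys.

```python
# German dictionary-style order: umlauts sort with their base vowel.
GERMAN_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})

# Swedish order: å, ä, ö are separate letters placed after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def german_key(word):
    lowered = word.lower()
    # Folded form decides the order; the original spelling breaks ties.
    return (lowered.translate(GERMAN_FOLD), lowered)


def swedish_key(word):
    # Characters outside the alphabet sort after everything else.
    return [SWEDISH_RANK.get(ch, len(SWEDISH_ALPHABET)) for ch in word.lower()]


words = ["zebra", "öl", "apa", "ost", "ärm"]
print(sorted(words, key=german_key))   # ['apa', 'ärm', 'öl', 'ost', 'zebra']
print(sorted(words, key=swedish_key))  # ['apa', 'ost', 'zebra', 'ärm', 'öl']
```

      Same code points, same strings, two different orders: the sort order comes from the locale, not from the character set.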

    24. Re:"Now available to download" link by AmiMoJo · · Score: 1

      Imagine you were writing software for an airline that operates in East Asia. Naturally you have customers from Japan, China and Korea, and naturally they expect their names to be rendered correctly on your web site and on printed material like tickets and boarding passes. They expect to be able to book online. Note that HTML doesn't allow mixing Japanese and Chinese in the same page, the most you can do is Unicode and the browser is guaranteed to render some characters incorrectly for your international customers.

      Now imagine your customer gets to the airport with their passport, name printed correctly, and boarding pass, name printed incorrectly. Better hope that the security staff understand this, especially the ones in Europe or the US who can't actually read CJK but can see that the squiggles don't match.

      Imagine you are trying to publish music by a Japanese artist in Korea, something that happens quite often in fact. Imagine you are Korean and married a Japanese person, but the marriage certificate was prepared on a computer that couldn't render their name correctly.

      This is why Unicode hasn't taken off in East Asia - it simply doesn't work. Instead their systems tend to use locally developed encodings, like Shift-JIS for Japanese and IIRC Big 5 for Chinese. They then include metadata to indicate which one to use, and some hacks like rendering an image on the server for mixed-language material. There is also TRON encoding, which is somewhat compatible with Unicode but fixes the broken stuff and is a lot more sane.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    25. Re:"Now available to download" link by hcs_$reboot · · Score: 1

      Big. But with some luck, they will be integrated into Chrome, at least the main ones, regular / bold / italics. The size would go down 75+%.

      --
      Slashdot, fix the reply notifications... You won't get away with it...
    26. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      "Note that HTML doesn't allow mixing Japanese and Chinese in the same page, "

      English usage note: "Note that" might seem like a cool way to just make shit up, but actually you're supposed to say something true, not just make it up! Weird huh? Let's try your way

      "Note that 1 + 1 = 5 so actually adding together two odd numbers does give another odd number. If AmiMoJo eats enough of his own shit he'll be able to fly"

      And then the way it's actually supposed to work in English

      "Note that AmiMoJo has no idea what they're talking about. What a buffoon".

    27. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      Actual linguists understand this. But the sort of "linguists" who'll show up on Slashdot to tell you that Han unification means Unicode is broken and that a "new font" will somehow "fix" that don't know anything.

      The existence of language / cultural preference for different glyphs for the same character isn't CJK-only either. It happens in Cyrillic systems very often, and isn't even unheard of in Latin. The unification approach was selected in each case, without a fuss.

    28. Re:"Now available to download" link by Rockoon · · Score: 1

      Note that HTML doesn't allow mixing Japanese and Chinese in the same page

      Note that the font is half a gigabyte and any web page that attempts to send it off to your browser, because a character might look slightly different otherwise, should be removed from the internet.

      --
      "His name was James Damore."
    29. Re:"Now available to download" link by AmiMoJo · · Score: 1

      All modern operating systems come with Japanese and Chinese fonts. The issue is that each HTML page can only specify one character encoding. If it says "Unicode" it can also specify a language to give the computer a hint as to which font to use, but again only one.

      If you look at pages like Chinese language lessons for Japanese readers they often use images or Flash to render the Chinese text correctly, because the browser can't do it. More recently it became possible to hack it with CSS and font stacks, but such pages only tend to look right on Windows and maybe MacOS because the fonts have to be named.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    30. Re:"Now available to download" link by hattig · · Score: 1

      I thought web fonts only downloaded the glyphs required to render the page?

    31. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      An airline in Asia? All passports from China, Hong Kong, and Japan (I can't speak for the others, but seems likely they do too) use Latin characters only in the name portion, so that's what gets printed on boarding pass, passport, and travel documents. The name is a romanization of the official system in the country (for example, China is Pinyin, no idea about the others but they have their own systems too). So no need for complicated fonts at all.

    32. Re:"Now available to download" link by omnichad · · Score: 1

      Fonts generally don't have support for color. It's just lines, fills, and ligature instructions. There are just a LOT of languages out there.

    33. Re: "Now available to download" link by Anonymous Coward · · Score: 0

      it wouldn't be very hard to implement this in HTTP.
      GET document --->
      <---- HASH MD5 jsgjsiojeiososghudghdiughishg...
      (if hash is unknown) CONTINUE ----->
      <------ receive data
      (if hash is known) KTHXBAI ----->
      [end]
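
      A minimal sketch of the handshake above, simulated in-process rather than over a real connection. The Server class, the get function and the cache dict are hypothetical names for illustration, and SHA-256 is swapped in for the MD5 in the post. HTTP's existing conditional requests (ETag with If-None-Match) already cover a similar need per URL.

```python
import hashlib


class Server:
    """Hypothetical server that can report a content hash before sending data."""

    def __init__(self, body: bytes):
        self.body = body

    def digest(self) -> str:
        # Step 1: the server answers the GET with a hash of the document.
        return hashlib.sha256(self.body).hexdigest()

    def fetch(self) -> bytes:
        # Step 2 (only if the client says CONTINUE): send the data itself.
        return self.body


def get(server: Server, cache: dict):
    """Return (body, transferred), where transferred is False on a cache hit."""
    digest = server.digest()
    if digest in cache:
        # Hash is known: stop here and reuse the cached copy.
        return cache[digest], False
    # Hash is unknown: continue and receive the data.
    body = server.fetch()
    cache[digest] = body
    return body, True


cache = {}
font_server = Server(b"pretend this is a large font file")
first = get(font_server, cache)
second = get(font_server, cache)
print(first[1], second[1])
```

      Keying the cache on the hash rather than the URL means the same font served from two different sites would only be downloaded once.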

    34. Re:"Now available to download" link by jrumney · · Score: 1

      You might like to update your knowledge of the topic.

    35. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      You know, xhtml had this crap right (because xml had it right and had a namespace for the language of any elements *content* -- which you can just fucking assume to also apply to its attributes). It was never implemented properly, and it might have ended up lost when it moved from xhtml to html5.

    36. Re:"Now available to download" link by Hognoxious · · Score: 1

      pyone

      Is that a combination of pyrotechnic + phone?

      I think Samsung have a patent on that.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    37. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      Actual linguists understand this.

      Ah well, that's the problem right there. Actual linguists don't write software, so you're stuck with us programmers who have a series of bytes in a utf8mb4 database field and are exasperated by the "actual linguists" suggesting that obviously the computer should use the color of the bits to determine the original language.

      Oh, but of course, we're expected to rewrite every application where every box a user could type an international name or address or text has a separate drop down to select a language. That's totally less exasperating.

      and that a "new font" will somehow "fix" that don't know anything.

      I'm pretty sure most of the people complaining about Han unification are complaining because it can't be fixed by just a "new font". It has to be fixed by adding metadata to every string ever with the correct language to display in. The best Noto can do is hope that Japanese people mostly look at Japanese text and install the font where all unlabeled characters default to Japanese.

      It happens in Cyrillic systems very often, and isn't even unheard of in Latin

      Examples?

      The unification approach was selected in each case

      Except the characters that weren't unified, which is how we ended up with https://en.wikipedia.org/wiki/...

    38. Re:"Now available to download" link by Qzukk · · Score: 1

      I can't find any indication of what makes those fonts different than the alternatives

      The CJK Unified Ideographs block has 20950 assigned code points most of which are significantly more complicated than Latin script. Add to that katakana, hiragana, hangul, radicals, and so on and there are a lot of characters, making the font significantly larger than fonts for latin-1.

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    39. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      [[Note that HTML doesn't allow mixing Japanese and Chinese in the same page]]
      That is incorrect; the "lang" attribute is a global attribute, which means you can set it on "body", "div", "span", and other tags.
      You can literally have them right next to each other using the span tag.
      https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/lang
      https://www.w3.org/TR/html5/dom.html#the-lang-and-xml:lang-attributes
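
      A minimal sketch of what that looks like, generating the markup from Python. The lang_span helper is a hypothetical name; U+76F4 is used because it is a commonly cited unified character whose preferred glyph differs between Japanese and Chinese typography. Whether the two spans actually render differently depends on the fonts the browser has available.

```python
from html import escape


def lang_span(text: str, lang: str) -> str:
    """Wrap text in a <span> whose lang attribute carries a language tag."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


# The same code point, tagged once as Japanese and once as Simplified Chinese.
fragment = "<p>" + lang_span("直", "ja") + " " + lang_span("直", "zh-Hans") + "</p>"
print(fragment)
```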

    40. Re:"Now available to download" link by doom · · Score: 1

      utf8mb4

      Real programmers avoid using MySQL.

      The "Han Unification" hack does have its problems (often exaggerated, but still there), but I wouldn't say that that's the real problem: I think you're right about needing metadata for every string, and the real question in my mind is why isn't that part of unicode itself? There used to be a way to embed locale hints in the text, but that was deprecated with Unicode 5. WTF? What exactly were they thinking?

      There's another issue I don't get at all, which is why doesn't someone out there (like say, google's web fonts?) index fonts according to the codepoints they cover? Then you could do things like check the content you need to display, and make sure you've specified fonts that cover the entire range you're working with. (Or perhaps even better: wouldn't it be cool if the *browser* automatically supplied default fonts if the specified fonts couldn't handle it? Then no more tofu!).
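
      A rough sketch of the kind of check I mean, assuming such an index existed. FONT_COVERAGE and both font names are made up for illustration; a real index would be built from each font's character map.

```python
# Hypothetical coverage index: font name -> set of code points it has glyphs for.
FONT_COVERAGE = {
    "ExampleLatin": set(range(0x20, 0x7F)),
    "ExampleCJK": set(range(0x4E00, 0xA000)),
}


def uncovered(text: str, fonts: list) -> list:
    """Return the characters in text that none of the given fonts can render."""
    covered = set().union(*(FONT_COVERAGE[name] for name in fonts))
    return sorted({ch for ch in text if ord(ch) not in covered})


content = "abc 中"
print(uncovered(content, ["ExampleLatin"]))
print(uncovered(content, ["ExampleLatin", "ExampleCJK"]))
```

      Anything the first call returns is what would show up as tofu with that font stack.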

    41. Re:"Now available to download" link by omnichad · · Score: 1

      That doesn't negate anything I've said. In fact, what you linked to says that support is rare or even difficult to get working. And that's even on Linux, so that's likely some sort of non-standard extension.

      But you can see on the download page that Noto color emoji is only 2.8MB.

    42. Re:"Now available to download" link by Anonymous Coward · · Score: 0

      Collation is always locale-specific

      For Chinese and Japanese, characters are sorted by number of strokes. For some of the unified code points, the glyph is different enough that the number of strokes varies by language.

      Locale-specific sorting is fine for the most part, as long as you're generally looking at data in your own language and accepting that data from other languages will be wrong. For an analogy of how these two problems fit together, consider if you wanted to buy a Van Halen album in Germany. There, the shelf says "von Halen" because someone thought "Van" and "von" were the same and their albums are in the "H" section because that's how Germans sort names with "von".

    43. Re:"Now available to download" link by stephows · · Score: 1

      An HTML page encoded in Unicode is perfectly capable of displaying Chinese, Japanese and Korean (CJK) on the same page.

      The problem is that most Chinese, Japanese and Korean pages are not encoded in Unicode.
      Hong Kong and Taiwan tend to encode using BIG-5, Mainland China tends to use GB (means national standard), Japanese tends to use Shift-JIS (Japanese Industrial Standard) and I can't remember what Korea uses.
      Note that each of these is a double-byte encoding (one or two bytes per character) that copies 7-bit ASCII into the lowest 128 characters.
      These different encodings are incompatible with each other, similar to the way that French and Russian 8-bit extensions to ASCII were incompatible with each other (I had ex-Russian colleagues who often asked me to display emails for them that could be in any one of 5 different 8-bit encoding schemes, depending on who sent them; KOI was the most common).
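
      A small sketch of that incompatibility, using the codecs that ship with Python's standard library. The sample strings are arbitrary, and the variable names are only for illustration.

```python
# The same two characters produce different bytes in each legacy encoding.
text = "中文"
encoded = {codec: text.encode(codec) for codec in ("big5", "gb2312", "shift_jis", "utf-8")}
for codec, raw in encoded.items():
    print(codec, raw.hex())

# Reading Big5 bytes as if they were GB2312 does not give the text back.
misread = encoded["big5"].decode("gb2312", errors="replace")
print(misread == text)

# The 8-bit Cyrillic encodings disagree with each other in the same way.
greeting = "Привет"
koi = greeting.encode("koi8_r")
win = greeting.encode("cp1251")
print(koi == win)
```

      This is why a page has to declare its encoding: the bytes alone do not say which table to use.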

      A second problem is that Mainland China redefined the way they write characters in the 1950s.
      The rest of the world continued using the traditional characters.
      The traditional way uses many more strokes but in well defined patterns, the new "simplified" way uses fewer strokes but with less pattern to them.
      Kind of like how some Latin based fonts have an "a" with just the plain circle with a straight line on the right and others add a fancy curve on top.
      Except much more so and readers of one type have trouble reading the other without lots of practice.
      Unicode doesn't solve this.
      Partial solution is to have 2 fonts.
      Both fonts display all characters but in the appropriate writing style - traditional or simplified.

      Normally users in Mainland China have a font that encodes GB with the simplified glyphs.
      Normally users in Hong Kong and Taiwan have a font that encodes BIG-5 with the traditional glyphs.
      But there are some font files out there that do them in the opposite way so that a reader in Hong Kong can read a page from Mainland China (encoded in GB but displayed with Traditional glyphs) and vice-versa.

      I spent a number of years in Hong Kong, China and Taiwan writing software for EFTPOS credit card terminals that had to automatically display English, Traditional Chinese or Simplified Chinese depending on information found on the user's credit card (choosing the language was a black art but my code got it right for each of the card types that we had to cater for).

    44. Re:"Now available to download" link by TheRaven64 · · Score: 1

      Oh, but of course, we're expected to rewrite every application where every box a user could type an international name or address or text has a separate drop down to select a language. That's totally less exasperating.

      No you're not. If it's a desktop application, you get the locale from the local user's settings. If it's a web application, you get it from the Accept-Language HTTP request header field. And then you just use that. Since POSIX 2008, even libc has contained thread-safe interfaces for locale-aware sorting. If you're using a database that doesn't support locale-aware collation, then I suggest that you find one that doesn't suck: PostgreSQL has had support for it for well over a decade and can use either libc or ICU, so if your libc implementation is too slow it will still perform well (though with an extra dependency).
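
      A minimal sketch of the web half of that, assuming you only need the ordered list of preferred language tags. The parse_accept_language name is hypothetical, and the parsing is simplified: it ignores the rule that q=0 means "not acceptable" and does no wildcard matching.

```python
def parse_accept_language(header: str) -> list:
    """Return the language tags from an Accept-Language header, most preferred first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by quality (descending), then by original position for ties.
        weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


preferred = parse_accept_language("ja,en-US;q=0.8,zh-CN;q=0.9")
print(preferred)
```

      The first tag in the result is the one you would pass on as the language hint when rendering or sorting that user's text.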

      --
      I am TheRaven on Soylent News
  5. My tool was recently used by Anonymous Coward · · Score: 0, Funny

    Cannot be the best at just one thing.

    My penis-tool clearly served the purpose at being the best to pleasure your mom's holes. All of them.

    Although if you ever try double-penetration where you pick the anus and the other dude picks the vagina, or vice versa, well there's a thin membrane separating them. It's pretty weird feeling your dick rubbing the other dude's dick through that bit of flesh. It will teach you who your REAL buddies are! But both dicks are in a female's hole, so it's definitely not gay or anything.

    Even though the desire for anal sex with a woman isn't too distant from gay sex with a man. I mean, let's be objective here. The vagina is TOTALLY different from the penis! But an anus is an anus is an anus. Man or woman, they both make your dick stink a bit. But if she's really kinky she will lick it off!! Or "he" if that's what floats your boat and tickles your pickle. Your mother is kinky. More than you could believe. Very much more.

    1. Re:My tool was recently used by Anonymous Coward · · Score: 0

      first off-topic AC post that actually made me laugh

    2. Re:My tool was recently used by Anonymous Coward · · Score: 0

      first off-topic AC post that actually made me laugh

      But it will never get the +5 Funny it deserves. Meanwhile that award goes to ... regurgitated memes about hot grits, sharks with lasers on their heads, combinations on briefcases and other tired, old, worn-out SHIT. This is a very up-tight and straitlaced readership that feels a strong unconscious desire to "be included, be part of the group". So tired old shit gets modded up simply because it's familiar, and anything actually edgy and amusing is deemed offensive even though we're all adults here.

      - Yes, I am the same AC who wrote that. Glad you enjoyed it! Trolling isn't supposed to be parasitic. Ideally everyone can enjoy a good troll post!

    3. Re:My tool was recently used by Anonymous Coward · · Score: 0

      A lot of humour is in the timing, not new material. I enjoyed your post. It drags the usual 'your mom' posts into an almost uncomfortably intimate place. It's clever; it's amusing; it's even challenging and I'd like to see more of this calibre (and I'd have thrown it a mod point if I'd had any, today) but it's not particularly funny.

      I read at -1, modding or not. Those others who do will read your original post. Those who don't are more likely to be those more comfortable with groupthink and aren't going to like it, modded or not.

    4. Re: My tool was recently used by Anonymous Coward · · Score: 1

      I'm not convinced about the "it's not gay" bit.

    5. Re:My tool was recently used by trabby · · Score: 1

      I for one welcome our sharks with lasers on their heads, eating hot grits, suitcase cracking overlords! From Soviet Russia, in the name of longcat.

    6. Re:My tool was recently used by the_B0fh · · Score: 1

      I for one welcome our sharks with lasers on their heads, eating hot grits, suitcase cracking overlords! From Soviet Russia, in the name of longcat.

      Whatever happened to Natalie Portman...?

  6. This should have been put together by Unicode by complete+loony · · Score: 5, Insightful

    The Unicode consortium should have published glyphs like these as part of the effort of defining the standard.

    Why did it take a separate private company to do this?

    --
    09F91102 no, 455FE104 nope, F190A1E8 uh-uh, 7A5F8A09 that's not it, C87294CE no. Ah! 452F6E403CDF10714E41DFAA257D313F.
    1. Re:This should have been put together by Unicode by Anonymous Coward · · Score: 0

      Because "private" is the only thing that ever works.

      When you choose to allocate your time to an open source project, you are choosing to allocate your "private" capital to that project.

    2. Re:This should have been put together by Unicode by speedplane · · Score: 2

      The Unicode consortium should have published glyphs like these as part of the effort of defining the standard.

      Why did it take a separate private company to do this?

      Probably because building a consortium to even define the characters is hard enough and expensive. Getting buy-in from everyone in the consortium to develop high quality glyphs for orphan languages would have reduced overall support. I agree they should have, but I don't think most companies are as generous as Google.

      --
      Fast Federal Court and I.T.C. updates
    3. Re:This should have been put together by Unicode by AmiMoJo · · Score: 1

      Unicode doesn't consider renderings, that's why. A lot of characters can be rendered in multiple ways, but there is only one code point for all of them and it's up to the font designer which one they want to use. It's actually a huge problem in Chinese, Japanese and Korean, as well as other languages.

      It's time Unicode was deprecated and we moved on to something better. There is the TRON system that fixes or avoids most of the problems with Unicode, for example. Wouldn't be much of a change for applications and it has Unicode backwards compatibility built in.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    4. Re:This should have been put together by Unicode by TheRaven64 · · Score: 2
      The entire point of unicode is that the glyphs are separate from the codepoints. The codepoints (defined by the unicode spec) convey semantics, not presentation. There are lots of different (valid) ways of representing each codepoint (if there weren't, then you wouldn't need fonts at all).

      Then along came emojis and the entire clusterfuck that led to.

      --
      I am TheRaven on Soylent News
    5. Re:This should have been put together by Unicode by ColdWetDog · · Score: 1

      It's time Unicode was deprecated and we moved on to something better.

      So we can be even further behind the curve here?

      --
      Faster! Faster! Faster would be better!
    6. Re:This should have been put together by Unicode by praxis · · Score: 1

      When you choose to allocate your time to an open source project, you are choosing to allocate your "private" capital to that project.

      That's true but there are also people whose public time is allocated to an open source project.

  7. No programmers' typeface by tdelaney · · Score: 4, Insightful

    They have a monospaced typeface, but it's not useable for programming - doesn't even have a significant distinction between zero and O, let alone any other programmer-friendly features.

    Since I presume they're going to want people at Google to use Noto as standard, it seems sensible to me that they create a programmers' version.

    1. Re:No programmers' typeface by Anonymous Coward · · Score: 2, Insightful

      I don't see why distinguishing between the zero digit and the letter O is more important for programmers than for anyone else. Sure, programmers might make mistakes when writing code and want to fix them; but that's true for other people writing text that might contain digits and letters, too.

      If anything, distinguishing between the characters is less important for programmers than other people because programmers will already notice the problem when their code won't compile. I think it is very probable not distinguishing the zero digit and the letter O was a deliberate design decision, and I doubt distinguishing between letters is as important as programmers seem to think it is.

    2. Re:No programmers' typeface by Anonymous Coward · · Score: 0

      Double-O Seven!

      Double-Naught Seven?

      See, the users are better off not knowing there's a difference!

    3. Re:No programmers' typeface by Hypoon · · Score: 4, Insightful

      ...because programmers will already notice the problem when their code won't compile.

      Substitutions of the letter 'O' for the number zero in numeric literals, function names, variable names, and other similar constructs will usually generate syntax errors, yes. (This makes me want to create a library called "Input0utput", just for headaches.)

      However, the compiler probably won't notice if you make the substitution within a string or character literal (if the user types "Outbound", but the software is expecting "0utbound", this might be a hard problem to debug). I've only done this once or twice, but it was infuriating. It's one of those few times when commenting out the line and retyping it verbatim will actually fix the problem.
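
      A tiny illustration of that failure mode. The two strings below are hypothetical values; in a font without a distinct zero they look the same on screen, but the comparison fails and nothing warns you about it.

```python
import unicodedata

expected = "0utbound"  # starts with the digit zero
typed = "Outbound"     # starts with the capital letter O

# No syntax error and no warning: the comparison is simply False.
print(typed == expected)

# Printing the character names is one way to see what is really in the string.
first_expected = unicodedata.name(expected[0])
first_typed = unicodedata.name(typed[0])
print(first_expected, "vs", first_typed)
```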

      The fact that the keys are adjacent on QWERTY keyboards doesn't help anything.

      ...but that's true for other people writing text that might contain digits and letters, too.

      I misunderstood this at first. I was picturing something like, "Mr. Orville's appointment is at 1O:OO.", where the substitution is harmless, so I didn't understand. In something like a model number, "MSO001" might be the first (001) release of a Mixed Signal Oscilloscope (MSO). Writing it as "MSOOO1" definitely obfuscates the meaning behind the model number. Of course, "MSO-001" would probably be best, but it's preferable to match the label on the hardware itself. So yes, I see your point.

      But no, I'm firmly of the belief that the average programmer has a greater need (than the average typist) for easily distinguishable characters.

    4. Re:No programmers' typeface by Nethead · · Score: 3, Insightful

      Where I find the problem is in randomly generated passwords. I have a large spreadsheet of VPN passwords for users at work that I had to change the password column to an OCR font just to make sure I was giving out the correct code.

      The original C64 had this issue which was worse on the SX64 with its 5" screen. I went as far as to design a custom font and burn it into the font EPROM.

      --
      -- I have a private email server in my basement.
    5. Re:No programmers' typeface by johannesg · · Score: 1

      Since I presume they're going to want people at Google to use Noto as standard, it seems sensible to me that they create a programmers' version.

      What kind of madness makes you presume Google wants all its employees to use this font?

      Tell, I'm genuinely curious. Do you also believe they do all their programming on phones running Android? Or do you suppose they might be allowed to use, I don't know, laptops or normal desktops?

    6. Re:No programmers' typeface by locofungus · · Score: 1

      Saying oh for zero is common in (British) English.

      Dialing code for London:
      020 - Oh two oh.

      Start of a telephone number:
      700 - seven double oh.

      International dialing code for the US:
      00 1 - oh oh one. (Don't know why we don't say double oh but I've never heard it said that way.)

      Bus number:
      205 - two oh five

      In normal spoken or written English you can usually determine whether it's a zero or a letter-o from the context and where you can't it rarely matters.

      --
      God said, "div D = rho, div B = 0, curl E = -@B/@t, curl H = J + @D/@t," and there was light.
    7. Re:No programmers' typeface by UberVegeta · · Score: 2

      00 1 - oh oh one. (Don't know why we don't say double oh but I've never heard it said that way.)

      You mean in the same way that nobody says "double oh seven?"

      --
      I knew I needed to stop reading Slashdot and finish my PhD when I started to miss articles by Bennett Haselton.
    8. Re:No programmers' typeface by AmiMoJo · · Score: 1

      I see this mistake a lot with my girlfriend's handwritten text entries. She writes in Chinese and occasionally inserts Arabic numerals (0123456789). The zero is often interpreted either as a capital O or as a Chinese character that seems to have been adopted from Japanese that is just a perfect circle, used as a substitute for censored characters. It's similar to how newspapers write "sh*t" in English (maybe it's a British thing).

      She knows my Chinese is crap so sometimes writes '9' in Chinese and then selects an Arabic 9 from the list of suggestions offered by the IME, except that it's some kind of emoji character in a box... The iOS Chinese IME seems a bit strange.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    9. Re:No programmers' typeface by TheRaven64 · · Score: 1

      This font is intended as the fallback font. When the currently selected font doesn't have a glyph for the desired codepoint, your font engine will provide a substitute. It will start with similar styles (e.g. sans serif, monospace) and if that fails it will fall back to a generic font that has large coverage. That's the point of this font. If you're using it for most of the glyphs you're rendering, then you're doing it wrong.
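
      A simplified sketch of that fallback order. The font names and code point ranges are invented for illustration, and a real font engine consults each font's character map and style metadata rather than a hard-coded list.

```python
# Hypothetical fallback chain, in the order the engine would try the fonts.
FALLBACK_CHAIN = [
    ("PreferredMono", [(0x20, 0x7E)]),                    # the font you selected
    ("SimilarStyleSans", [(0x20, 0x7E), (0x400, 0x4FF)]),  # similar style, more scripts
    ("BroadCoverageFallback", [(0x20, 0xFFFF)]),           # generic large-coverage font
]


def pick_font(ch: str):
    """Return the first font in the chain that covers ch, or None for tofu."""
    code_point = ord(ch)
    for name, ranges in FALLBACK_CHAIN:
        if any(low <= code_point <= high for low, high in ranges):
            return name
    return None  # nothing covers it, so a blank box is drawn


for sample in ("m", "Ж", "中", "\U0001F4A9"):
    print(f"U+{ord(sample):04X}", pick_font(sample))
```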

      If you want a good font for programming, Adobe released Source Code Pro a couple of years ago under the SIL Open Font License, and it's the nicest that I've found so far.

      --
      I am TheRaven on Soylent News
    10. Re:No programmers' typeface by hattig · · Score: 1

      Every programmer has their own favourite font, from 8x13 bitmap through Courier (why oh why!), old school mono fonts like Andale Mono, Monaco and Consolas, and more modern ones like Firacode, Hasklig, Iosevka, Monoid and SourceCode Pro.

      Whilst it's nice to have another option, I don't think this mono font's main aim is to satisfy programmers.

    11. Re:No programmers' typeface by Anonymous Coward · · Score: 0

      I think you mean Western numerals. Zero in Arabic numerals is a dot (.) and would not be confused with an O or 0.

    12. Re:No programmers' typeface by Ken+D · · Score: 1

      I once transcribed a program from a magazine into my first computer... as a hex dump.
      The magazine chose a font where 0, 8, and B were practically identical. That's ~20% of the hexadecimal digit space that's confusing.

      I guess I was a glutton for punishment, because I did get the program to run.

    13. Re:No programmers' typeface by locofungus · · Score: 1

      The international dialing code for Kazakhstan from the UK would be 00 7 (I've just looked it up). I've never heard anyone quote a Kazakhstan telephone number to call from the UK but I would expect them to say oh oh seven, not double oh seven. Apart from anything else, if you did try to tell someone a Kazakhstan telephone number and started double oh seven I'd expect them to not hear the rest of the number while they were laughing.

      --
      God said, "div D = rho, div B = 0, curl E = -@B/@t, curl H = J + @D/@t," and there was light.
    14. Re:No programmers' typeface by bigal123 · · Score: 1

      I was also disappointed they did not make a strong monospace, or even proportional, font that distinguishes zero from O and, as some fonts do, one from the letter I, and so on. Even for proportional writing it would be helpful.

      The two fonts I have used in the past, though they are only good for normal Latin characters, are listed below. I still can't decide which one I like better in an editor, though.

      * Hack -- Open Source Coding Font (Free, Open Source) http://sourcefoundry.org/hack/ or https://github.com/chrissimpki...

      * Monoid -- Open Source Coding Font, and my current favorite (Free, Open Source)
        http://larsenwork.com/monoid/ or https://github.com/larsenwork/...

    15. Re:No programmers' typeface by hackertourist · · Score: 1

      Except Source Code Pro only contains English glyphs, so it's useless for e.g. debugging exotic-language XML files. I keep switching between Source Code Pro and Arial Unicode MS, which has pretty good language support.

    16. Re:No programmers' typeface by Anonymous Coward · · Score: 0

      Either you're trolling or you've never seen a James Bond movie in your entire life. But I would take the second answer as being a troll too.

    17. Re:No programmers' typeface by Yvan256 · · Score: 1

      I'm pretty sure AmiMoJo meant arabic numerals.

    18. Re: No programmers' typeface by Anonymous Coward · · Score: 0

      Idiot, he's saying that people don't say "double-oh" IN THE SPECIFIC CONTEXT OF THE INTERNATIONAL DIALLING CODE FOR NORTH AMERICA, not that people don't say it ever. You can even look at the example IMMEDIATELY BEFORE that one in his post, for a context where people do say that.

    19. Re:No programmers' typeface by PRMan · · Score: 1

      I use Verdana. Everyone hates the fact that I use a proportional font, but we haven't laid out text in decades...

      --
      Peter predicted that you would "deliberately forget" creation 2000 years ago...
    20. Re:No programmers' typeface by lars_stefan_axelsson · · Score: 1

      Where I find the problem is in randomly generated passwords.

      Yes. KeePassX's "exclude lookalike characters" option when generating is really useful. It doesn't drop that many bits, and for most situations I can just make the PW a bit longer if it's a concern.

      Trying to type a "random" generated password with lookalikes is an exercise in futility.
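
      A rough sketch of the trade-off. The lookalike set below is my own guess for illustration, not necessarily the exact set KeePassX excludes, and generate and entropy_bits are hypothetical helper names.

```python
import math
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters
LOOKALIKES = set("0O1lI")                         # assumed set of confusable characters
SAFE = "".join(c for c in ALPHABET if c not in LOOKALIKES)  # 57 characters


def generate(length: int, alphabet: str = SAFE) -> str:
    """Pick each character independently with a cryptographically secure RNG."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


def entropy_bits(length: int, alphabet: str) -> float:
    """Bits of entropy for a uniformly random password over the alphabet."""
    return length * math.log2(len(alphabet))


print(generate(16))
print(round(entropy_bits(16, ALPHABET), 1), "bits with the full alphabet")
print(round(entropy_bits(16, SAFE), 1), "bits without lookalikes")
print(round(entropy_bits(17, SAFE), 1), "bits with one extra character")
```

      With these assumptions, dropping the lookalikes costs about two bits at 16 characters, and one extra character more than makes up for it.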

      --
      Stefan Axelsson
    21. Re:No programmers' typeface by Anonymous Coward · · Score: 0

      And you have never heard of one Jethro Bodine, self-proclaimed double-naught spy!

      Don't bully, troll, or other-wise have any sort of abusive fun.

      Your M,
      Skankhunt42

  8. Slashdot solved the tofu problem long ago by hcs_$reboot · · Score: 0

    fu

    --
    Slashdot, fix the reply notifications... You won't get away with it...
  9. app store? by Anonymous Coward · · Score: 0

    cannot find on app store.

    1. Re:app store? by hcs_$reboot · · Score: 1

      Try this one instead.

      --
      Slashdot, fix the reply notifications... You won't get away with it...
  10. BLOAT! by Anonymous Coward · · Score: 0

    I never knew I wanted that, or needed that, or even dreamt such a font were possible.

    Thank you, Teh G, for making my life even fatter!

  11. Horrible Mono Font by brianerst · · Score: 1

    That lowercase 'm' is a horror show. Simply awful.

    It's also no good as a coding font (lack of distinction between various problematic glyphs) but that's probably not its audience.

    1. Re:Horrible Mono Font by KozmoStevnNaut · · Score: 1

      Yeah, it's a bit naff, and obviously not their main focus. Luckily, there are tons of awesome monospaced fonts out there, and coding rarely needs full Unicode coverage.

      --
      Eat the rich.
    2. Re:Horrible Mono Font by omnichad · · Score: 1

      I don't think it's intended for use as a general-purpose font at all. Just for filling in gaps if the font you're reading in is missing a glyph for a particular codepoint. As an English reader/writer, it's unlikely you'll be seeing an 'm' substituted in.

      Anywhere you would see a square box now for missing characters, this font would render in. Will be really useful for viewing Wikipedia (where I see this the most).

  12. Repairing the Unicode Consortium Clusterfuck by Anonymous Coward · · Score: 5, Interesting

    Thank you Google! This is badly needed because the Unicode Consortium screwed up Asian language support badly. The problem started when a bunch of Silicon Valley WASPS got together and formed the Unicode Consortium. Their experts were a joke. They had a foreign language expert who by his own admission couldn't speak the language he was supposedly expert in.

    Then without consulting Asian language speakers they decided to combine all the Asian language characters - including those that were physically different. The result was like some elitist looking at the Greek and Roman alphabets and deciding 'a' is a lot like alpha, 'b' a lot like beta, so why not combine the two of them into a single alphabet, then tell you your name isn't Sam, it's "S". (Slashdot probably won't display this but you get the idea.) This affected East, Central and Southeast Asian languages.

    This created the absurd situation where some people couldn't even write their names or enter them into databases, prompting the famous "I Can Text You A Pile of Poo, But I Can't Write My Name" https://modelviewculture.com/p...

    When it was pointed out did the Unicode Consortium admit they fucked up and fix it? Nope. They dug in their heels and insisted each country produce their own font which would display each Unicode character differently to suit their own language. Given the original goals of Unicode this was an amazing backflip. https://en.wikipedia.org/wiki/... https://books.google.com/books... https://plus.google.com/+LizHa... There are other problems too: The encoding the consortium expected makes asian codepages use more space than the standards they were supposed to replace. This was stupid since ASCII was already super efficient for English language, so what was the point?
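
    A quick sketch of the size complaint, using codecs from Python's standard library. The sample string is arbitrary, and the exact overhead depends on the text and on which Unicode encoding form is used.

```python
# Eight Japanese characters: kanji, hiragana and katakana.
text = "日本語のテキスト"

sizes = {codec: len(text.encode(codec)) for codec in ("shift_jis", "utf-8", "utf-16-le")}
for codec, size in sizes.items():
    print(codec, size, "bytes")

# Plain ASCII text costs one byte per character in UTF-8.
ascii_size = len("Hello".encode("utf-8"))
print("ascii in utf-8:", ascii_size, "bytes")
```

    For this sample, UTF-8 needs three bytes per character where Shift-JIS needs two, while UTF-16 comes out the same size as Shift-JIS.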

    If you only write English language software and ASCII is good enough you won't notice any of this but if you have to write International software it's a nightmare. Yes, you might think adding Unicode support allows your app to run in any language, but it doesn't work like that because of this clusterfuck. You still have to provide different fonts for different countries, and you often have to provide support for old codepages (the various BIG5 variants) for fallback which Unicode was supposed to replace. It also makes translation very hard.

    But Unicode fixed it eventually? Nope. The Unicode consortium continued to ignore it to this very day and instead started churning out stupid emoji: a steaming pile of poo, a taco, and farcical 'equality' emoticons. https://www.theguardian.com/te... https://www.theguardian.com/ar...

    I hope this new font gives us one font which can display all languages and fuck the Unicode Consortium

    1. Re:Repairing the Unicode Consortium Clusterfuck by UnknownSoldier · · Score: 1

      Mod Parent +1 Informative !

      I've been running into my own problems with Unicode's shortsightedness.

      Two common glyphs are:

      * mouse pointer (see fa-mouse-pointer)
      * cardinal 4 direction arrows (such as used on Windows, Move) (see fa-arrows)

      Yet are nowhere to be found in Unicode.

      You're definitely right - the Unicode Consortium is more interested in fluff crap like emoji than practical stuff.

      If the Unicode Consortium didn't have their heads up their asses we wouldn't even need fonts like Font Awesome

      The funny thing is that it is open source on GitHub:
      * https://github.com/FortAwesome...

    2. Re:Repairing the Unicode Consortium Clusterfuck by Grady+Martin · · Score: 1

      I was looking for a glyph of directional arrows no more than two days ago but gave up, convinced that I had overlooked it and would be more likely to find it on a fresh search some other day. In other words, directional arrows are so basic a glyph that I was more cognitively pleased with personal incompetence than with the incompetence of an entire consortium of supposed experts.

      Thank you for confirming it does not exist. However, I'm not sure whether to be pleased or even more disappointed... though I do find solace in Unicode enabling with U+1F4A9 the expression of my feelings concerning consortiums.

    3. Re:Repairing the Unicode Consortium Clusterfuck by KozmoStevnNaut · · Score: 2

      I've been using the Noto font(s) for a while, they're installed by default in Linux Mint (probably Ubuntu and others, too), so I assume this is an incremental release, where they've finally achieved some semblance of full(ish) coverage.

      While I have a couple of minor issues with the font's design (the lowercase 'm' and 0/O distinction in Noto Mono are atrocious), the font is quite nice on the whole. And while I will never personally use all of the myriad scripts included, I whole-heartedly applaud the effort taken to produce a font family that finally covers East Asian languages in a sensible way. I have many colleagues from India (specifically Bengal) and China. It has been a real shitshow for them how the Unicode Consortium first completely neglected and then mishandled their languages.

      We can blame Google for a great many things, but Noto is one thing they definitely got right, and I hope they continue to evolve and refine it, perhaps fix the small font design annoyances, even though they're relatively minor for what is an absolutely huge project.

      --
      Eat the rich.
    4. Re:Repairing the Unicode Consortium Clusterfuck by Anonymous Coward · · Score: 0

      This is not a fair assessment.
      The character issue is not the same as "deciding 'a' is a lot like alpha". There are two issues here.
      1. There are characters that are exactly the same used across Asian languages, and Unicode decided to treat them as single characters, despite the efforts of some people to create a separate Chinese section which would have contained thousands of duplicates with the Japanese section.
      So the real equivalent would be having the French insist that their "a" is somehow different from an "a" written in English, and forcing hundreds of duplicate "a" in the Standard, one for each language that uses the letter.
      2. There are variants, which are also often common across languages, and are used in the same way. The real equivalent is not "a" versus alpha, it is like "ae" versus the archaic ligature "æ". As an extreme example, Japanese gives you the choice of or , the latter being the post-war version of the former, and which should be handled as a glyph issue, not as a character definition issue. Most examples are far more subtle in their differences and are more obviously just written variants.

      Understanding that there are many grey areas here, particularly with people's names, the Unicode Consortium chose not to create a chaotic structure of duplicate characters, and to err in the other direction, knowing that there were entire blocks reserved for special characters not yet covered by the Standard, which could act as a short-term solution.

      The best way to think of a Unicode point is as a kind of Platonic ideal of a character, with the implementation being left up to the various standards such as UTF-8, UTF-16, etc., and the display being properly the domain of fonts.
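      As a minimal sketch of that separation, the snippet below takes one code point (U+1F4A9, mentioned earlier in the thread) and encodes it three ways with Python's built-in codecs. The variable names are illustrative only.

```python
# One abstract code point, several concrete byte encodings.
cp = 0x1F4A9          # the code point itself, independent of any encoding
ch = chr(cp)

encoded = {
    enc: ch.encode(enc).hex()
    for enc in ("utf-8", "utf-16-be", "utf-32-be")
}
for enc, hx in encoded.items():
    print(f"U+{cp:04X} as {enc}: {hx}")
```

      The number stays the same in every case; only the byte sequence changes, and none of it says anything about which glyph a font will draw.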

    5. Re:Repairing the Unicode Consortium Clusterfuck by alexhs · · Score: 1

      Arrows are definitely present in Unicode.
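      For anyone who wants to check what is there, here is a small sketch that lists the names in the Arrows block (U+2190 to U+21FF) using Python's `unicodedata` module. It only covers that one block, so it does not by itself settle whether a particular arrow exists elsewhere in the standard.

```python
import unicodedata

# Collect the character names in the Arrows block (U+2190..U+21FF).
arrows = {}
for cp in range(0x2190, 0x2200):
    name = unicodedata.name(chr(cp), None)  # None if unassigned
    if name:
        arrows[cp] = name

for cp, name in arrows.items():
    print(f"U+{cp:04X} {chr(cp)} {name}")
```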

      --
      I have discovered a truly marvelous proof of killer sig, which this margin is too narrow to contain.
    6. Re:Repairing the Unicode Consortium Clusterfuck by AmiMoJo · · Score: 4, Informative

      It's even worse than that. On many systems, e.g. Windows, wchar_t is defined as 16 bits, meaning it can only ever support the Unicode Basic Multilingual Plane without hacks. Since a lot of the fixed CJK characters are outside this plane, software that uses wchar_t usually doesn't support them. Some of this is baked into hardware, for example Unicode uses UTF16,

      I'm seriously thinking about writing an open source library to support TRON encoding. The lack of a good alternative seems to be what is preventing Unicode from being deprecated in favour of something better.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    7. Re:Repairing the Unicode Consortium Clusterfuck by hackertourist · · Score: 1

      Note that this new font doesn't fix the 'Han unification' problem. It just provides 3 versions of the font, one for C, one for J and one for K. This sidesteps the clusterfuck (and forces you to select a different font for each language), but does not fix it.

    8. Re:Repairing the Unicode Consortium Clusterfuck by hattig · · Score: 1

      And not one of them is the cardinal 4 directions arrow, as mentioned.

    9. Re:Repairing the Unicode Consortium Clusterfuck by hattig · · Score: 1

      Has anyone come up with a reasonable, backward-compatible manner in which this could be achieved (technically) within the current Unicode standard?

    10. Re:Repairing the Unicode Consortium Clusterfuck by omnichad · · Score: 1

      This is a substitution/fallback font - and shouldn't be used for design or UI except where the chosen font is missing a character. If your native language is Chinese, you won't be using this font to view any glyphs that are already included in your Chinese font.

    11. Re:Repairing the Unicode Consortium Clusterfuck by Anonymous Coward · · Score: 0

      Note that the other Anonymous Coward's rant pops up whenever Unicode comes up and is a straight-up lie; even his linked Wikipedia article contradicts him. The Han unification was done by a local consortium of native speakers trying to fit everything into the remaining code points that could still be encoded in 16 bits for Unicode 1.0. Of course that became moot with Unicode 2.0, so hindsight is 20/20. However, it has nothing to do with ignorant westerners.

    12. Re:Repairing the Unicode Consortium Clusterfuck by UnknownSoldier · · Score: 1

      And the cardinal 4 direction arrows are _where_ again in Unicode ??

      We're not talking about general arrows, we are talking about a specific arrow. If you don't want to look like a fool, learn to read before replying, please.

    13. Re: Repairing the Unicode Consortium Clusterfuck by Anonymous Coward · · Score: 0

      Asians should have a separate internet. Each race should have a separate internet.

    14. Re:Repairing the Unicode Consortium Clusterfuck by legRoom · · Score: 1

      On many systems, e.g. Windows, wchar_t is defined as 16 bits, meaning it can only ever support the Unicode Basic Multilingual Plane without hacks.

      True UTF-16 supports non-BMP code points just fine, and is not a "hack". In fact, it's actually slightly easier to do so in UTF-16 than with UTF-8 (the only other common Unicode encoding).
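      A rough sketch of how UTF-16 reaches beyond the BMP: a code point above U+FFFF is split into a pair of 16-bit surrogates. The function names below are made up for illustration; the arithmetic follows the standard UTF-16 definition and is checked against Python's own codec.

```python
def to_surrogates(cp):
    """Split a code point above U+FFFF into a (high, low) UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need surrogates")
    v = cp - 0x10000                      # 20 bits remain
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def from_surrogates(high, low):
    """Recombine a surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogates(0x20000)        # first code point of plane 2
print(f"U+20000 -> {high:04X} {low:04X}")

# Cross-check against the built-in UTF-16 encoder.
codec_bytes = chr(0x20000).encode("utf-16-be")
```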

      The real problem is that there is no single concept in Unicode that maps to the "character" of the old, simple ASCII standard with which most programmers are familiar. Depending on the task at hand, the correct substitute under Unicode may be code units, code points, or graphemes. Ignorant and/or lazy programmers who make incorrect selections between those three are the cause of many Unicode-related bugs.
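      To make the three notions concrete, here is a small sketch that counts each of them for the same short string. The grapheme count is a deliberately crude approximation (it just skips combining marks) and is an assumption of this example; real grapheme segmentation follows the Unicode text segmentation rules and needs a dedicated library.

```python
import unicodedata

# 'e' + COMBINING ACUTE ACCENT + U+1F4A9: two things a user would call "characters".
s = "e\u0301\U0001F4A9"

utf8_units = len(s.encode("utf-8"))            # 8-bit code units
utf16_units = len(s.encode("utf-16-le")) // 2  # 16-bit code units
code_points = len(s)                           # Python str is indexed by code point

# Crude grapheme estimate: count only code points that are not combining marks.
approx_graphemes = sum(1 for c in s if unicodedata.combining(c) == 0)

print(utf8_units, utf16_units, code_points, approx_graphemes)
```

      Four different answers to "how long is this string?" is exactly the kind of mismatch that turns into truncation and cursor-movement bugs.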

      Also, some important "Unicode" APIs were stabilized before the standard evolved into its present complex form, and cannot be completely fixed for backwards compatibility reasons: notably, the Java standard library and the Win32 API.

      These problems persist for the same reason that commercial software usually doesn't even try to support Linux: the additional market share (in dollars, not just users) which can be captured is perceived to be worth less than the cost of properly writing, optimizing, and testing the considerably more complex and slower code required for full Unicode support.

    15. Re:Repairing the Unicode Consortium Clusterfuck by AmiMoJo · · Score: 1

      A lot of developers just throw in Unicode support and assume their software supports all languages. We need something better that actually does that, rather than Unicode.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    16. Re:Repairing the Unicode Consortium Clusterfuck by Anonymous Coward · · Score: 0

      I always think it's funny that the PC crowd uses the term "WASP" (White Anglo-Saxon Protestant). How is this not racist? You are painting an entire demographic with one brush by using this term, and yet feel no shame?

    17. Re:Repairing the Unicode Consortium Clusterfuck by legRoom · · Score: 1

      I get the feeling that you don't understand how numerous, complex, arbitrary, diverse, ambiguous, etc. natural languages are. That phrase, "all languages", doesn't even have a knowable, well-defined meaning, either in theory or in practice.

      It would certainly be possible to improve upon Unicode, if you're willing to sacrifice backwards compatibility. However, it will never achieve your stated goal of guaranteeing support for "all languages" just by "throwing in" a new text processing library.

      Projects that refuse to invest in internationalization will continue to fail badly at it, regardless of whether they use Unicode, or a "lessons learned" successor encoding.

    18. Re:Repairing the Unicode Consortium Clusterfuck by AmiMoJo · · Score: 1

      That was my point. Developers who aren't experts in languages just assume that if they tick the Unicode box in the compiler options their software supports everything, but in reality that's far from the case.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
  13. That's great .. there's nothing more annoying by Chrisq · · Score: 1

    That's great .. there's nothing more annoying than having little rectangles on a web page instead of the proper glyph that you wouldn't understand anyway!

    1. Re:That's great .. there's nothing more annoying by KozmoStevnNaut · · Score: 1

      I'm sure a lot of East Asian people share your annoyance

      --
      Eat the rich.
    2. Re:That's great .. there's nothing more annoying by baka_toroi · · Score: 1

      Currently the world is focused on more demographics than monolingual English speakers.

  14. Accept headers schmaccept schmeaders. by Hognoxious · · Score: 1

    I can see why this is important to Google, since they seem to like showing me ads in the wrong language.

    --
    Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    1. Re:Accept headers schmaccept schmeaders. by ColdWetDog · · Score: 1

      Feature, not a bug. Be quiet.

      --
      Faster! Faster! Faster would be better!
  15. Why not just a single font? by alfino · · Score: 1

    So now I have a couple dozen "Noto" fonts on my machine, but wasn't the idea of Unicode to approach it "one size fits all"? I.e. why is there not a single "Noto serif" font that combines them all? Or how else is one supposed to configure the browser now to give access to all those symbols?

    --
    echo mailto: !#^."<*>"|tr "<*> mailto:" net@madduck
    1. Re:Why not just a single font? by fendragon · · Score: 1

      why is there not a single "Noto serif" font that combines them all? Or how else is one supposed to configure the browser now to give access to all those symbols?

      A single font for all of them, as has been mentioned above, is possible but would be over 400MB, which is a problem for some of us.
      Browsers will search other available fonts for a code point that's not in the current font, so you can install a collection of subset fonts that includes all the characters you are likely to need.
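      A toy sketch of that per-character fallback, under loud assumptions: the font names and coverage tables below are invented for illustration, and real browsers use a more involved algorithm (CSS font-family order, language hints, system fallback lists).

```python
# Hypothetical coverage tables: font name -> set of code points it has glyphs for.
FONTS = {
    "Latin Subset": set(range(0x20, 0x7F)),
    "CJK Subset": set(range(0x4E00, 0xA000)),
}


def pick_font(ch, preferred, fonts=FONTS):
    """Return the first font covering ch, trying the preferred font first."""
    order = [preferred] + [name for name in fonts if name != preferred]
    for name in order:
        if ord(ch) in fonts.get(name, ()):
            return name
    return None  # nothing covers it: the renderer would draw "tofu"


# Resolve a font for each character of a mixed string.
runs = [(ch, pick_font(ch, "Latin Subset")) for ch in "a漢\u2bd0"]
for ch, font in runs:
    print(f"U+{ord(ch):04X} -> {font}")
```

      The last character has no covering font in this toy setup, which is the case that ends up as a blank box.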

    2. Re:Why not just a single font? by ledow · · Score: 1

      That's not what Unicode is for.

      If you want Serif or Sans Serif, those are entirely different typefaces.

      If you want monospaced or not, again those are entirely different typefaces.

      All Unicode does - especially when you combine it with TrueType semantics or want a font that works everywhere - is provide characters for everything you might need.

  16. hells teeth by johnjones · · Score: 3, Interesting

    honestly

    where are the mathematical fonts and symbols for science?

    STIX goes some way, but why is this not in Noto?

    why would you send a mathematical explanation into the stars when we can't express those notations on machines we use every day?

    thanks

    John Jones

  17. Ongoing effort? by Anonymous Coward · · Score: 0

    I have Noto Sans; I have the habit of selecting the fonts I like best from Google Fonts and have downloaded it.

    On this particular computer, though, it seems it came with Mint 17.x. Maybe it's not complete as offered by Linux Mint (I personally find that improbable).

    The download page has an ironic first comment which includes a char which still gets converted to tofu (I'm forcing Noto Sans with Firefox): it's a square target mark ( http://www.fileformat.info/info/unicode/char/2bd0/index.htm ).

    Noto looks nice and probably a good bet for the future; for my tastes, PT Serif / Sans / Mono are better (but probably not useful for many Eastern languages).

  18. "Reelelsed"? When? by Zanadou · · Score: 1

    http://www.google.com/get/noto/updates

    Last entry: "September 29, 2015"

    Yeah... so it's the same thing I downloaded and installed last year.

    I'm so glad Slashdot is catching up...

  19. Google management is becoming more and more messy. by Futurepower(R) · · Score: 1

    Thanks for the explanation.

    I notice that Noto Serif is a well-designed font. There is an italic and a bold, but no semi-bold. The Google Noto font download web page is a mess. How is NotoSansMandaic-unhinted different from NotoSans? When I look at the font in Windows font preview, I see no difference.

    I see many examples of Google management becoming more and more messy.

  20. This is dumb. by jimbob6 · · Score: 1

    So what are they using besides "tofu", nothing? Blank spaces?
    If I don't have the character set installed that the page is written in, then it's probably because I can't read that language anyway.
    And I damn sure don't want to load every character set that exists on the web into my browser. It would run like balls.
    If you go to a page that has characters that your browser doesn't understand and you need to get to the information, use Google translate.

  21. Bugger ! by Anonymous Coward · · Score: 0

    from wiki: The Noto Color Emoji font only works under Android and Linux, and cannot be installed under macOS or Microsoft Windows.

    So I'm still going to get a clusterfuck of "tofu" on my non-Android mobile phone?

    Brilliant. That was clever.

  22. tofu is faster by OrangeTide · · Score: 1

    Don't show me glyphs that I am not trained to read. I'd really rather see square boxes in situations where foreign text was displayed with the wrong font. Wrong font being the font that I'm using.

    --
    “Common sense is not so common.” — Voltaire
  23. Don't favor minor cache savings over tracking. by jbn-o · · Score: 1

    Storage is cheap and plentiful these days; the caching argument doesn't convince me and minor improvements strike me as possibly nice conveniences but nothing significant. I'd rather promote not centralizing the web and not encouraging doing work with known trackers including Google.

    1. Re:Don't favor minor cache savings over tracking. by thegarbz · · Score: 1

      Storage is cheap and plentiful these days;

      Tell that to my mobile phone contract struggling under the weight of yet another multi-megabyte website that does not need to be.

    2. Re:Don't favor minor cache savings over tracking. by TheRaven64 · · Score: 1

      Storage is cheap, bandwidth is often not. When you're downloading 500KB of a JavaScript library for multiple different sites, that adds up quickly on mobile devices. It also adds to the page load times - the odds are your users will have cached the thing from Google already, so it doesn't add anything to their load times. Additionally, for JavaScript, it's possible to store commonly-used libraries in pre-parsed form (Safari will keep the bytecode for cached JavaScript libraries for a while), which also improves performance.

      --
      I am TheRaven on Soylent News