Apple Updates All of Its Operating Systems To Fix App-crashing Bug (engadget.com)

Posted by msmash on Monday February 19, 2018 @08:00PM from the fixing-things dept.

It took a few days, but Apple already has a fix out for a bug that caused crashes on each of its platforms. From a report: The company pushed new versions of iOS, macOS and watchOS to fix the issue, which was caused when someone pasted in or received a single Indian-language character in select communications apps -- most notably in iMessages, Safari and the App Store. Using a specific character in the Telugu language native to India was enough to crash a variety of chat apps, including iMessage, WhatsApp, Twitter, Facebook Messenger, Gmail and Outlook, though Telegram and Skype were seemingly immune.

Re:The Source Code by tlhIngan · 2018-02-19 21:39 · Score: 5, Informative

It really is time to replace Unicode with something more robust. These errors, caused by things like combining characters and tricks like using the text-direction control characters to mask file extensions, keep coming up.
Programmers aren't language experts, there are no good libraries for handling Unicode, and nobody can even agree on one sane encoding for it... It's so bad that it's avoided in east Asia for the most part, or just some incompatible subset is used.
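To make the file-extension trick concrete, here is a minimal Python sketch. The function name `has_bidi_controls` and the sample filename are made up for illustration, and the set of characters checked is an assumption about what a filter might want to flag, not an exhaustive or official list.

```python
# Bidirectional control characters that change the order in which
# a renderer displays the text that follows them.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
    "\u200e", "\u200f",                                # LRM, RLM
}


def has_bidi_controls(name: str) -> bool:
    """Return True if the string contains any bidi control character."""
    return any(ch in BIDI_CONTROLS for ch in name)


# Hypothetical filename: the stored name ends in ".exe", but the
# RIGHT-TO-LEFT OVERRIDE (U+202E) makes a bidi-aware renderer draw the
# tail reversed, so it can look like it ends in ".png".
disguised = "photo\u202egnp.exe"

print(disguised.endswith(".exe"))      # the real extension
print(has_bidi_controls(disguised))    # flagged
print(has_bidi_controls("photo.png"))  # a plain name is not flagged
```

Whether to reject, strip, or just visibly escape such characters is a policy choice; the sketch only shows detection.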
The problem is, text is hard. The rules for text make no sense. Western text is easy - we're used to it, and it has a manageable number of characters. We can choose to encode it as individual letters (so accented characters are stored as individual characters) because there are a limited number of them.
But other cultures, not so much. Arabic can be hard, and many of its characters are decorations that affect a base character. Plus, character pairings don't behave the way you'd expect - adding a character can make the entire word being displayed shorter and more compact than without that character (instead of longer).
It's bad enough that people keep wondering when /. will support Unicode. Internally, it already does, and has for over a decade (and probably since the turn of the millennium). Problem was, people realized the potential for chaos, and trolls spent absurd amounts of time crafting Unicode text bombs that would cause the comment section to be displayed incorrectly or overwritten by characters that were thousands of pixels tall and unreadable. In the end it got so bad the only solution was an approved character whitelist - the only characters accepted in comments were the ones on it, which was basically what you could represent in ASCII. Eventually they added a display filter that killed the crap comments in affected articles as well, so the archives were usable.
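A whitelist like that can be sketched in a few lines of Python. This is not Slashdot's actual code; `ALLOWED` and `filter_comment` are hypothetical names, and using `string.printable` as the whitelist is an assumption standing in for "basically ASCII".

```python
import string

# Assumed whitelist: printable ASCII, including ordinary whitespace.
ALLOWED = set(string.printable)


def filter_comment(text: str) -> str:
    """Drop every character that is not on the whitelist."""
    return "".join(ch for ch in text if ch in ALLOWED)


# Combining marks and other non-ASCII characters are silently removed.
print(filter_comment("plain text stays"))
print(filter_comment("e\u0301\u0301\u0301 loses its stacked accents"))
```

The cost is obvious: legitimate non-ASCII text gets stripped along with the text bombs, which is the trade-off described above.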
Unicode is composed of codepoints. A character may be composed of one or more codepoints. Trolls have managed to generate characters that are composed of thousands of codepoints (imagine using 10kB of data which represents one character - how will you program that?).
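Here is a small Python sketch of the codepoint-versus-character gap, using the standard `unicodedata` module. The helper `strip_excess_marks` and its `max_marks` limit are hypothetical, one possible mitigation rather than anything a particular site is known to use, and a limit this crude could damage legitimate text in scripts that rely on several marks per base character.

```python
import unicodedata


def strip_excess_marks(text: str, max_marks: int = 2) -> str:
    """Keep at most max_marks consecutive nonspacing/enclosing marks."""
    out = []
    run = 0  # length of the current run of marks
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me"):
            run += 1
            if run > max_marks:
                continue  # drop marks beyond the limit
        else:
            run = 0
        out.append(ch)
    return "".join(out)


# One visible "character": a base letter plus 1000 combining acute accents.
bomb = "e" + "\u0301" * 1000
print(len(bomb))                      # codepoints in the string
print(len(strip_excess_marks(bomb)))  # codepoints after capping the marks

# Normalization can merge some sequences into one codepoint, but not all.
print(unicodedata.normalize("NFC", "e\u0301") == "\u00e9")
```

Note that `len()` in Python counts codepoints, not what a reader would call characters, which is exactly the mismatch that makes this hard to program around.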
Of course, I suppose it disappoints lots of people who were hoping to embed the character everywhere to crash iOS devices...